CJK keep-all
CSS gives you a single switch for "don't split a word mid-compound" in CJK scripts: word-break: keep-all. Browsers implement it; native readers feel it. The paragraph stops fracturing at every character and breaks only where punctuation gives it permission.
Two panes, one passage — the opening of the Dao De Jing. Same width, same font, same line-height. Left pane uses the browser default. Right pane sets word-break: keep-all. Drag the slider to feel the difference at the right-hand edge.
道可道,非常道。名可名,非常名。无名,天地之始;有名,万物之母。故常无欲,以观其妙;常有欲,以观其徼。此两者同出而异名,同谓之玄。玄之又玄,众妙之门。
Mechanism
In its default CJK mode, Pretext's segmenter treats each Han character as its own unit and every boundary between two Han characters as a legal break opportunity, which is what the browser does under word-break: normal. That lets lines pack tightly. It also means a two- or three-character compound like 万物 can land with half of itself on one line and the other half on the next. To a Western eye that reads as a tight wrap; to a reader of classical Chinese it reads as a typesetting bug.
Under CSS word-break: keep-all, the browser refuses those mid-word opportunities. Break points collapse to the punctuation marks that are semantically allowed to end a phrase — 。 、 , ; — plus explicit spaces. The line fills less tightly on the right, but every line starts on a phrase boundary the author would recognize.
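The two policies can be sketched as a function that lists the legal break positions in a string. This is a deliberately simplified model, not the Unicode line-breaking algorithm and not Pretext's segmenter; the name breakOpportunities and the CLOSING set are assumptions made for illustration.

```javascript
// Simplified model of break opportunities. NOT the full Unicode
// line-breaking algorithm (UAX #14) and NOT Pretext's segmenter.
const HAN = /\p{Script=Han}/u;

// Assumed subset of punctuation that may end a line but must not start one.
const CLOSING = new Set(["。", "、", ",", ";", ",", ";"]);

// Returns the character indices at which a new line may begin.
function breakOpportunities(text, policy) {
  const chars = Array.from(text); // iterate by code point, not UTF-16 unit
  const breaks = [];
  for (let i = 0; i < chars.length - 1; i++) {
    const cur = chars[i];
    const next = chars[i + 1];
    if (CLOSING.has(next)) continue; // never break before closing punctuation
    if (cur === " " || CLOSING.has(cur)) {
      breaks.push(i + 1); // allowed under both policies
      continue;
    }
    if (policy === "normal" && HAN.test(cur) && HAN.test(next)) {
      breaks.push(i + 1); // word-break: normal allows every Han/Han boundary
    }
  }
  return breaks;
}
```

For the clause 有名,万物之母。 this model gives [1, 3, 4, 5, 6] under "normal", including a break inside 万物, and only [3] under "keep-all", the position right after the comma.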
Both panes run through the same prepare()/layout() call; the displayed difference is CSS-driven at render time. Whether Pretext exposes keep-all at the library level is an open item on the engine's roadmap. For now the pragmatic pattern is: let CSS do the visual break, and use Pretext for the measurement you already trust.
Application
Typography that feels native, not machine-translated:
- Editorial prose in Chinese, Japanese, or Korean where the line breaks land on phrase boundaries — the way a print editor would set them.
- A bilingual reader app that renders the English side with normal and the Chinese side with keep-all, from a single shared component.
- Pull-quotes and chapter epigraphs in CJK scripts that look deliberate, not fractured.
- Pre-computed heights for virtualized feeds of Chinese articles, correct on the first render wherever Pretext and the browser agree on break points.
The application is small and specific: serve an East Asian reader a page whose line endings don't advertise, in every paragraph, that a Western developer built it.
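The shared bilingual component could pick its break policy from the paragraph's language tag. A minimal sketch follows; wordBreakFor and paragraphProps are hypothetical names, and the returned object only loosely mirrors React-style props, which is an assumption about the surrounding app rather than anything Pretext provides.

```javascript
// Hypothetical helpers, not part of Pretext or any framework.
const CJK_LANGS = new Set(["zh", "ja", "ko"]);

// Map a BCP 47 language tag (e.g. "zh-Hans", "en-US") to a word-break value.
function wordBreakFor(lang) {
  const primary = String(lang).toLowerCase().split("-")[0];
  return CJK_LANGS.has(primary) ? "keep-all" : "normal";
}

// Build the attributes for one paragraph; the same code path serves both
// the English and the Chinese side of the reader.
function paragraphProps(text, lang) {
  return { lang, style: { wordBreak: wordBreakFor(lang) }, children: text };
}
```

Setting lang alongside the style keeps the markup honest for font selection and assistive technology, not just for line breaking.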
"The Tao that can be trodden is not the enduring and unchanging Tao. The name that can be named is not the enduring and unchanging name. (Conceived of as) having no name, it is the Originator of heaven and earth; (conceived of as) having a name, it is the Mother of all things."
Laozi, Dao De Jing, opening of Ch. 1 (6th c. BCE); trans. James Legge (1891)
Direct Claude
word-break: keep-all on the CJK container
"don't fracture compounds" → CSS word-break: keep-all; break only on punctuation
"one component, two languages" → swap the CSS rule per paragraph; one prepare()/layout() path
"classical Chinese, proper line endings" → set lang="zh" and word-break: keep-all; measure with Pretext
import { prepare, layout } from '@chenglou/pretext';
// 道德经 · opening of Chapter 1. Public domain (6th c. BCE).
const TEXT = "道可道,非常道。名可名,非常名。无名,天地之始;有名,万物之母。";
const FONT = "400 21px 'Noto Serif SC', serif";
const LINE_HEIGHT = 21 * 1.75;
// One prepare() covers measurement for both modes: the glyph widths Pretext
// caches do not depend on the container width or on which break policy the
// browser applies at render time.
const handle = prepare(TEXT, FONT);
// Apply the break policy in CSS. Right pane toggles:
// .cjk.keep-all { word-break: keep-all; }
// and the line endings line up on phrase punctuation instead of every glyph.
function relayout(width) {
  const { height, lineCount } = layout(handle, width, LINE_HEIGHT);
  // height matches the browser's default (word-break: normal) wrapping.
  // Under CSS word-break: keep-all the browser has fewer break opportunities,
  // so lines fill less tightly and the paragraph can run taller than this.
  // Pretext stays the source of truth for the default-mode measurement,
  // which is the one your virtualizer caches.
  return { height, lineCount };
}
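Because the keep-all pane can render taller than the default-mode estimate, a virtualizer may want to treat the Pretext height as a seed and overwrite it once the paragraph has actually been rendered. The cache below is a hypothetical sketch of that reconciliation, not part of Pretext; createHeightCache and its methods are names invented here, and the rendered height is assumed to come from the DOM (for example getBoundingClientRect()).

```javascript
// Hypothetical height cache for a virtualized list (not a Pretext API).
function createHeightCache() {
  const entries = new Map(); // id -> { height, measured }
  return {
    // Seed with an estimate, e.g. the height returned by layout().
    // Never overwrites a height that was already measured in the DOM.
    seed(id, estimatedHeight) {
      const entry = entries.get(id);
      if (!entry || !entry.measured) {
        entries.set(id, { height: estimatedHeight, measured: false });
      }
    },
    // Record the rendered height and return the delta from the previous
    // value, so the caller can adjust scroll offsets.
    measure(id, renderedHeight) {
      const previous = entries.has(id) ? entries.get(id).height : 0;
      entries.set(id, { height: renderedHeight, measured: true });
      return renderedHeight - previous;
    },
    // Best known height for an item, or undefined if never seen.
    get(id) {
      const entry = entries.get(id);
      return entry ? entry.height : undefined;
    },
  };
}
```

With this shape, the first paint uses the default-mode estimate, and any keep-all item that turns out taller corrects itself once after it has been measured.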