Word-break keep-all
In Chinese, Japanese, and Korean, every character is a word-like unit. The browser's default behavior is to break between any two CJK characters — which is fine for filling lines, but which can shatter compounds that a native reader expects to stay together. A second option on prepare(), wordBreak: 'keep-all', tells Pretext to refuse those mid-compound breaks. Latin words in the same text continue to break normally.
The same passage rendered twice. Narrow the width and watch the two panes diverge.
Mechanism
When Pretext segments the prepared text, it tags each segment with a break kind. Under the default (normal) word-break, a CJK run gets an implicit zero-width break opportunity between every pair of adjacent graphemes, the same rule the browser applies under word-break: normal. A wrap can therefore fall anywhere in the run.
Under wordBreak: 'keep-all', those inter-character break opportunities are suppressed. The only places the engine will break a CJK run are the ones it would always have honored: explicit whitespace (rare in Chinese), punctuation that permits a break, or a boundary with a non-CJK script. Latin words embedded in the passage still break on spaces the way they always did — keep-all only changes CJK behavior.
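The two rule sets described above can be sketched as a function that lists the offsets where a line may end. This is a simplified illustration, not Pretext's actual segmenter: the name breakOpportunities, the BMP-only character ranges, and the short punctuation list are all assumptions made for the example, and real engines consult full Unicode line-breaking data.

```typescript
type WordBreak = 'normal' | 'keep-all';

// Assumption: a rough BMP-only CJK test (kana, Han, Hangul syllables).
const CJK = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/;
// Assumption: whitespace or common CJK punctuation permits a break after it.
const BREAK_AFTER = /[\s、。,;:!?]/;

// Returns UTF-16 offsets i such that a line may end just before text[i].
function breakOpportunities(text: string, mode: WordBreak): number[] {
  const out: number[] = [];
  for (let i = 1; i < text.length; i++) {
    const prev = text[i - 1];
    const next = text[i];
    if (BREAK_AFTER.test(prev)) { out.push(i); continue; } // after space or punctuation
    if (BREAK_AFTER.test(next)) continue;                  // never strand punctuation at a line start
    const prevCjk = CJK.test(prev);
    const nextCjk = CJK.test(next);
    if (prevCjk !== nextCjk) { out.push(i); continue; }    // boundary between CJK and another script
    // Inside a CJK run: allowed under 'normal', suppressed under 'keep-all'.
    if (prevCjk && nextCjk && mode === 'normal') out.push(i);
  }
  return out;
}
```

For the clause pair "道可道,非常道", the normal mode in this sketch allows a break at nearly every offset, while keep-all leaves only the one after the comma.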
Because both runs share the same measured widths, the split between the two panes is a pure arithmetic replay of different break rules over the same canvas measurements.
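To make the "arithmetic replay" concrete, here is a minimal greedy line filler that takes one array of per-character widths and a set of allowed break offsets. It is a sketch under stated assumptions, not Pretext's layout code: wrapOffsets is a hypothetical name, the widths are invented round numbers rather than canvas measurements, and a run with no allowed break is simply left to overflow.

```typescript
// Greedy wrap over fixed widths. `breaks` holds offsets where a line may end.
// Returns the offset at which each line starts.
function wrapOffsets(widths: number[], breaks: Set<number>, maxWidth: number): number[] {
  const lineStarts = [0];
  let lineStart = 0;
  let lastBreak = 0; // latest allowed break on the current line (== lineStart means none yet)
  let lineWidth = 0;
  for (let i = 0; i < widths.length; i++) {
    if (breaks.has(i) && i > lineStart) lastBreak = i;
    if (lineWidth + widths[i] > maxWidth && lastBreak > lineStart) {
      // Wrap at the last allowed break and carry the characters after it to the new line.
      lineWidth = 0;
      for (let j = lastBreak; j < i; j++) lineWidth += widths[j];
      lineStart = lastBreak;
      lineStarts.push(lineStart);
    }
    lineWidth += widths[i];
  }
  return lineStarts;
}

// Same widths, two rule sets (seven characters, 20 units each, illustrative values).
const widths = [20, 20, 20, 20, 20, 20, 20];
const normalStarts = wrapOffsets(widths, new Set([1, 2, 4, 5, 6]), 100);
const keepAllStarts = wrapOffsets(widths, new Set([4]), 100);
```

Nothing is re-measured between the two calls; only the break set changes, and the second line starts at a different offset as a result.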
Application
Serving CJK typography that doesn't shatter takes one option. The immediate unlocks:
- A Chinese headline that breaks between clauses, never inside a compound.
- A Japanese body block that reads the way the author punctuated it, even at tight column widths.
- A Korean article that co-mingles English terms and still breaks both scripts correctly.
- A single prose pipeline that switches mode by locale detection, with no other changes downstream.
The pre-measured heights are still exact, so virtualization, masonry, and pre-mount sizing work as before. Keep-all costs you nothing but an option.
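Switching by locale can be as small as one helper that maps a language tag to the option value. The sketch below is an assumption about how a pipeline might do it: wordBreakForLocale is a hypothetical name, and treating every zh, ja, and ko locale as keep-all is a policy choice for the example, not something Pretext prescribes.

```typescript
type WordBreak = 'normal' | 'keep-all';

// Picks a wordBreak value from a BCP 47 style tag such as 'zh-Hans-CN' or 'en-US'.
function wordBreakForLocale(locale: string): WordBreak {
  // The primary language subtag is the part before the first '-' or '_'.
  const lang = locale.toLowerCase().split(/[-_]/)[0];
  return lang === 'zh' || lang === 'ja' || lang === 'ko' ? 'keep-all' : 'normal';
}

// Intended use, following the call shape shown in this section:
//   prepare(text, font, { wordBreak: wordBreakForLocale(locale) });
```

Keeping the decision in one function means the rest of the pipeline only ever sees the option value.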
"道可道,非常道。名可名,非常名。"
The way that can be spoken is not the eternal way; the name that can be named is not the eternal name.
Direct Claude
wordBreak: 'keep-all' on prepare()
"CJK that doesn't shatter" → keep-all handle for every CJK paragraph
"mixed Chinese and English, wrap each correctly" → keep-all; Latin still breaks on spaces
"switch behavior by locale" → pick the option when you call prepare()
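import { prepare, layout } from '@chenglou/pretext';
// Example inputs (illustrative values; substitute your own text and column width).
const cjkText = "道可道,非常道。名可名,非常名。";
const width = 240;
const font = "400 20px 'Songti SC', 'Source Han Serif SC', serif";
const lh = 20 * 1.8; // line height: font size times 1.8
// Normal: break opportunity between any two CJK characters.
const normalHandle = prepare(cjkText, font);
// keep-all: CJK runs stay intact; Latin words still break on spaces.
const keepAllHandle = prepare(cjkText, font, { wordBreak: 'keep-all' });
// Lay out both handles at the same width and line height to compare the results.
const a = layout(normalHandle, width, lh);
const b = layout(keepAllHandle, width, lh);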