[css-text-3] Clarify Segment Break Transformation Rules when multiple segment breaks are involved #836

@upsuper

The first rule for collapsing segment breaks is:

If the character immediately before or immediately after the segment break is the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.

It is not clear to me what should happen if multiple consecutive segment breaks are involved. For example, if I have ZWSP LF LF LF x, would this rule produce:

  1. ZWSP LF LF x (with only the first LF removed), or
  2. ZWSP x (with all LF removed because of recursively applying this rule)?

(In the first case, the remaining LFs would be converted to spaces by the last rule there, and the second space would be removed by step 4 of Phase I, so the final result would be ZWSP WS x.)
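To make the two readings concrete, here is a minimal Python sketch. The function names `collapse_per_break` and `collapse_per_run` are hypothetical, only the ZWSP rule and the fallback "convert to space" rule are modelled, and the final `re.sub` stands in for the space collapsing in step 4 of Phase I. It is an illustration of the question, not the spec's algorithm.

```python
import re

ZWSP = "\u200b"  # zero-width space
LF = "\n"        # a segment break, for the purposes of this sketch

def collapse_per_break(text):
    """Reading 1: each break is judged on its own original neighbours."""
    out = []
    for i, ch in enumerate(text):
        if ch != LF:
            out.append(ch)
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before == ZWSP or after == ZWSP:
            continue          # break removed, ZWSP left behind
        out.append(" ")       # fallback rule: break becomes a space
    # Stand-in for step 4 of Phase I: collapse consecutive spaces.
    return re.sub(" +", " ", "".join(out))

def collapse_per_run(text):
    """Reading 2: a run of consecutive breaks is treated as one unit."""
    def replace_run(match):
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        if before == ZWSP or after == ZWSP:
            return ""         # the whole run is removed
        return " "
    return re.sub(" +", " ", re.sub(LF + "+", replace_run, text))

sample = ZWSP + LF + LF + LF + "x"
print(repr(collapse_per_break(sample)))  # ZWSP, one space, x
print(repr(collapse_per_run(sample)))    # ZWSP, x
```

Under these assumptions the first reading yields ZWSP WS x and the second yields ZWSP x, which is the difference the question is about.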

This may also affect the second rule:

Otherwise, if the East Asian Width property of both the character before and after the line feed is F, W, or H (not A), and neither side is Hangul, then the segment break is removed.

If I have W LF LF W, should the two LFs be removed by this rule?
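The same ambiguity can be sketched for the East Asian Width rule. The helper names below are hypothetical, `unicodedata.east_asian_width` supplies the width class, and detecting Hangul by looking for "HANGUL" in the character name is an assumption made for brevity (the spec's definition may differ). The `per_run` flag switches between judging each break separately and judging a run of breaks as one unit.

```python
import re
import unicodedata

def is_wide_non_hangul(ch):
    """True if ch has East Asian Width F, W, or H and is not Hangul.

    Assumption: Hangul is approximated via the Unicode character name.
    """
    if not ch:
        return False
    return (unicodedata.east_asian_width(ch) in ("F", "W", "H")
            and "HANGUL" not in unicodedata.name(ch, ""))

def transform_breaks(text, per_run):
    """Apply only the East Asian Width rule plus the space fallback."""
    pattern = "\n+" if per_run else "\n"

    def replace(match):
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        if is_wide_non_hangul(before) and is_wide_non_hangul(after):
            return ""   # break (or run of breaks) removed
        return " "      # fallback rule: convert to a space

    # Final substitution stands in for collapsing consecutive spaces.
    return re.sub(" +", " ", re.sub(pattern, replace, text))

sample = "漢\n\n字"  # W LF LF W
print(repr(transform_breaks(sample, per_run=False)))  # one space kept
print(repr(transform_breaks(sample, per_run=True)))   # no space
```

With per-break evaluation neither LF has wide characters on both sides, so a space survives between the two ideographs; with per-run evaluation both LFs are removed.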

It seems to me that removing all consecutive segment breaks together would be easier to implement, so I would propose defining the rules that way if there are no other concerns.
