Case conversion malfunctions when the string length changes #11463

@benrg

To reproduce:

  • Paste the text ȺRGH into a UTF-8 encoded document.
  • Select it and press Ctrl+U to lower-case it.

Expected result: ⱥrgh

Actual result: ⱥrg

This happens because Ⱥ (U+023A) encodes to two bytes in UTF-8 and its lower-case form ⱥ (U+2C65) encodes to three bytes, and the case-conversion code assumes that the length in bytes doesn't change. If the length increases (as above) then you lose bytes from the end. If it decreases (e.g. upper-casing ⱥrgh) then you get NULs and other garbage at the end.
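
For anyone who wants to check the byte counts, here is a minimal Python sketch. It does not use the editor's actual code: `convert_in_fixed_buffer` is a hypothetical helper that only imitates the assumed behaviour (output forced into a buffer the same size as the input), and it pads with NULs where the real code may leave other garbage.

```python
def utf8_len(s: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def convert_in_fixed_buffer(text: str, convert) -> bytes:
    """Hypothetical model of the bug: the converted text is written into a
    buffer exactly as long as the original UTF-8 input."""
    src = text.encode("utf-8")
    dst = convert(text).encode("utf-8")
    # Too long: trailing bytes are cut off. Too short: pad with NULs
    # (an assumption; real code may leave arbitrary bytes there instead).
    return dst[:len(src)].ljust(len(src), b"\x00")


upper = "\u023ARGH"  # ȺRGH
lower = "\u2C65rgh"  # ⱥrgh

print(utf8_len(upper), utf8_len(lower))
print(convert_in_fixed_buffer(upper, str.lower))  # lower-casing: output grows
print(convert_in_fixed_buffer(lower, str.upper))  # upper-casing: output shrinks
```

Under these assumptions the lower-casing case loses the final h and the upper-casing case gains a trailing NUL, which matches the symptoms described above.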
