
Boost replacement case modifiers do not work as expected #9636

@guy038

Description of the Issue

The Boost case modifiers ( \U, \L, \u and \l ), when used in a replacement, do not change the case of any accented character, that is, any letter whose Unicode code point is above U+007F !

Steps to Reproduce the Issue

  • Paste the simple French sentence below into a new tab :
C'est là, près de la forêt, dans un gîte où régnait un grand capharnaüm, que l'aïeul ôta sa flûte et son bâton de son canoë.
  • Open the Replace dialog ( Ctrl + H )

    • SEARCH \w

    • REPLACE \U$0 or \u$0

    • Tick the Wrap around option

    • Click on the Replace All button

Expected Behavior

After the replacement, the text should read as follows :

C'EST LÀ, PRÈS DE LA FORÊT, DANS UN GÎTE OÙ RÉGNAIT UN GRAND CAPHARNAÜM, QUE L'AÏEUL ÔTA SA FLÛTE ET SON BÂTON DE SON CANOË.

Of course, the same expected result can be obtained by selecting the text and using the default Shift + Ctrl + U shortcut.
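
For comparison, the expected result matches what a Unicode-aware case conversion produces. The snippet below is only a sketch in Python, not Notepad++ or Boost code: Python's `re` module has no `\U` modifier, so a callback does the uppercasing, and the helper name `replace_all_upper` is hypothetical.

```python
import re

SENTENCE = (
    "C'est là, près de la forêt, dans un gîte où régnait un grand "
    "capharnaüm, que l'aïeul ôta sa flûte et son bâton de son canoë."
)


def replace_all_upper(text: str) -> str:
    # Approximates SEARCH \w / REPLACE \U$0 : every word character is
    # replaced by its uppercase form, using Unicode-aware str.upper().
    return re.sub(r"\w", lambda m: m.group(0).upper(), text)


if __name__ == "__main__":
    # Accented letters such as "à" and "ê" are uppercased along with a-z.
    print(replace_all_upper(SENTENCE))
```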

Actual Behavior

After the replacement, we get this output :

C'EST Là, PRèS DE LA FORêT, DANS UN GîTE Où RéGNAIT UN GRAND CAPHARNAüM, QUE L'AïEUL ôTA SA FLûTE ET SON BâTON DE SON CANOë.

Obviously, none of the accented characters has been converted to its uppercase form by the regex S/R !

The same reasoning applies to the \L and \l case modifiers, when run against an initially uppercase text.
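
The observed output is consistent with a case conversion that only handles the ASCII letters. The sketch below, in Python, merely models that behavior in order to reproduce the symptom. Both helper names are hypothetical, and whether Boost or Notepad++ actually works this way internally is an assumption, not something established in this report.

```python
def ascii_only_upper(text: str) -> str:
    # Assumed model of the faulty \U : only a-z are shifted to A-Z,
    # every other character (including accented letters) is left as is.
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)


def ascii_only_lower(text: str) -> str:
    # Assumed model of the faulty \L : only A-Z are shifted to a-z.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


if __name__ == "__main__":
    print(ascii_only_upper("près de la forêt"))  # PRèS DE LA FORêT
    print(ascii_only_lower("PRÈS DE LA FORÊT"))  # prÈs de la forÊt
```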

Notes

  • This issue occurs in both ANSI and Unicode encoded files, such as UTF-8

  • This issue has existed since the Boost regex library was introduced, in N++ v6.0.0 !

  • The latest N++ version used for the tests was the v7.9.2 release


Best Regards,

guy038
