UTF-8 section symbol (0xC2A7) invokes TIS-620 decoding #940

I noticed some strange behavior today in how Notepad++ handles encoding detection: if a file contains one or more section symbols (`§`, encoded in UTF-8 as the bytes 0xC2 0xA7) and no other non-ASCII characters, Notepad++ decodes the file using the [TIS-620](https://en.wikipedia.org/wiki/Thai_Industrial_Standard_620-2533) charset, so each section symbol is displayed as `ยง`. This does not seem to happen for any other non-ASCII Latin character, and it does not happen if other non-ASCII symbols are present in the file.

Edit: I should mention this occurs on multiple machines, all with an English locale active.
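
For reference, here is a minimal Python sketch of the byte-level effect described above. It only illustrates what happens when the two UTF-8 bytes of the section symbol are read as TIS-620; it is not Notepad++'s detection code, and the variable names are my own.

```python
# The section symbol U+00A7 is encoded in UTF-8 as the two bytes 0xC2 0xA7.
raw = "\u00a7".encode("utf-8")

# Correct interpretation: one character, the section symbol.
as_utf8 = raw.decode("utf-8")

# Misdetected interpretation: TIS-620 is a single-byte charset, so each
# byte becomes its own Thai character.
as_tis620 = raw.decode("tis_620")

# Print code points rather than glyphs so the output is console-safe.
print(raw.hex())                                # c2a7
print([f"U+{ord(c):04X}" for c in as_utf8])     # ['U+00A7']
print([f"U+{ord(c):04X}" for c in as_tis620])   # ['U+0E22', 'U+0E07']
```

The two code points in the last line are the Thai characters `ย` and `ง`, which matches what Notepad++ shows in place of each `§`.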