UTF-8 section symbol (0xC2A7) invokes TIS-620 decoding #940

@caseif

I noticed some strange behavior today in how Notepad++ handles encoding detection: if a file contains one or more section symbols (§, encoded in UTF-8 as the bytes 0xC2 0xA7) and no other non-ASCII characters, Notepad++ decodes the file using the TIS-620 charset, so each section symbol is displayed as ยง. This is inconsistent: it does not seem to occur for any other non-ASCII Latin character, and it does not occur if other non-ASCII symbols are present in the file.

Edit: I should mention that this occurs on multiple machines, all with an English locale active.
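
For anyone wanting to check the byte-level mapping, here is a minimal Python sketch. It does not reproduce Notepad++'s detection logic (which is not shown in this report); it only illustrates that the same two bytes read as § under UTF-8 and as ยง under TIS-620. The helper name `code_points` is made up for this illustration, and the sketch assumes Python's built-in `tis_620` codec.

```python
def code_points(text):
    """Return the code points of `text` as 'U+XXXX' strings (ASCII-safe to print)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# UTF-8 encoding of the section sign (U+00A7): the two bytes 0xC2 0xA7.
raw = "\u00a7".encode("utf-8")

# Interpret the same bytes under both charsets.
as_utf8 = raw.decode("utf-8")       # one character: the section sign
as_tis620 = raw.decode("tis_620")   # two Thai characters, one per byte

print("bytes:   ", raw.hex())
print("UTF-8:   ", code_points(as_utf8))
print("TIS-620: ", code_points(as_tis620))
```

Because both interpretations are valid for this byte pair, a detector that sees only these bytes has to guess between them, which would be consistent with the behavior described above.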

Labels: verified (issues verified to be valid and reproducible; PRs that have been tested thoroughly)
