UTF-8 encoding option goes wrong with file with misidentified encoding #3188

@johnmreynolds

Description

I have a file (attached below; it reproduces this issue) that is almost entirely ASCII but contains a UTF-8 encoded character. The file does not have a leading BOM.
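For anyone who wants to check whether another file has the same shape, here is a minimal Python sketch. The helper name `looks_like_bomless_utf8` and the sample text are hypothetical and are not taken from test.zip. It only tests the three properties described above: no BOM, valid UTF-8, and at least one non-ASCII character.

```python
import codecs


def looks_like_bomless_utf8(data: bytes) -> bool:
    """Return True if data has no UTF-8 BOM, is valid UTF-8, and is not pure ASCII."""
    if data.startswith(codecs.BOM_UTF8):
        return False  # a leading BOM would normally let an editor detect UTF-8
    try:
        text = data.decode("utf-8")  # strict decoding: invalid sequences raise
    except UnicodeDecodeError:
        return False
    return not text.isascii()  # at least one multi-byte character is present


# Hypothetical stand-in for the attached file: ASCII plus one UTF-8 character.
sample = "plain ASCII text with one caf\u00e9 in it\n".encode("utf-8")
print(looks_like_bomless_utf8(sample))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper.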

Notepad++ misidentifies the encoding as Thai TIS-620. By itself this is a minor inconvenience, but once it has happened, the options in the Encoding menu appear to stop working.

Steps to Reproduce the Issue

  1. Open the file.
  2. Notepad++ misidentifies the encoding as TIS-620.
  3. Select Encode in UTF-8 from the Encoding menu.

Expected Behavior

The encoding changes to UTF-8 and the character is displayed correctly.

Actual Behavior

The encoding indicator at the bottom of the screen flickers, then reverts to TIS-620, and the character is not displayed correctly.

Selecting a different encoding from the Character Sets submenu (e.g. ISO 8859-1) does change the encoding, but as expected the UTF-8 character is then displayed as multiple characters. There doesn't seem to be any option under the Character Sets submenu for forcing 'Unicode' as the character set.
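The "multiple characters" effect can be illustrated outside Notepad++ with a short Python sketch. The character `é` is an assumption chosen for illustration and is not necessarily the one in the attached file. The sketch shows only how the same bytes decode under each character set, not how Notepad++ performs its detection.

```python
# One UTF-8 encoded character; "é" is a hypothetical stand-in for the real one.
raw = "\u00e9".encode("utf-8")  # two bytes: 0xC3 0xA9

as_utf8 = raw.decode("utf-8")         # the intended single character
as_latin1 = raw.decode("iso-8859-1")  # each byte becomes its own character
as_thai = raw.decode("tis_620")       # each byte becomes a Thai character

for name, text in [("UTF-8", as_utf8), ("ISO 8859-1", as_latin1), ("TIS-620", as_thai)]:
    # Print the code points so the result is readable in any terminal.
    print(name, len(text), [f"U+{ord(ch):04X}" for ch in text])
```

Both single-byte character sets accept these two bytes as valid text, which is why only decoding as UTF-8 displays the character correctly.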

Debug Information

Notepad++ v7.3.3 (32-bit)
Build time : Mar 8 2017 - 03:37:37
Path : C:\Program Files (x86)\Notepad++\notepad++.exe
Admin mode : OFF
Local Conf mode : OFF
OS : Windows 10 (64-bit)
Plugins : mimeTools.dll NppConverter.dll NppExport.dll PluginManager.dll

test.zip
