[BUG] Detects ISO-8859-1 incorrectly as ANSI; then, empties document when converting to UTF-8 #15271

@darthstark1138

Is there an existing issue for this?

  • I have searched the existing issues

Description of the Issue

I downloaded an SQL script (link below), but my SQL tool complained about non-Unicode characters. So I opened it in Notepad++ and told it to convert to UTF-8 -- and it erased the contents of the document. Digging deeper (using Python and chardet.universaldetector), I found out that the document was ISO-8859-1. Selecting the correct encoding before converting worked.
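
For reference, the workaround described above (decode with the correct source encoding, then write UTF-8) can be sketched outside Notepad++ with the Python standard library alone. This is only a minimal sketch, not what Notepad++ does internally: the helper names `to_utf8` and `convert_file` are made up for illustration, and the ISO-8859-1 fallback is an assumption based on what chardet reported for this particular file.

```python
from pathlib import Path


def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return `raw` re-encoded as UTF-8 (hypothetical helper).

    Valid UTF-8 input is passed through unchanged; anything else is
    decoded with `fallback`, which is assumed to be the real encoding.
    """
    try:
        # Strict decoding raises on byte sequences that are not valid UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte value to a character, so this cannot fail.
        text = raw.decode(fallback)
    return text.encode("utf-8")


def convert_file(src: str, dst: str, fallback: str = "iso-8859-1") -> None:
    """Read `src` as bytes and write a UTF-8 copy to `dst` (hypothetical helper)."""
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), fallback))


if __name__ == "__main__":
    sample = "café".encode("iso-8859-1")  # the single byte 0xE9 is not valid UTF-8
    print(to_utf8(sample))
```

Reading and writing raw bytes keeps the line endings of the script untouched; only the character encoding changes.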

Steps To Reproduce

  1. Download the SQL script from https://sqlsharp.com/free/
  2. Open in Notepad++ - it is detected as ANSI
  3. Select "Encoding > Convert to UTF-8"

Current Behavior

The contents of the document are erased.

Expected Behavior

The document is converted from its original encoding to UTF-8.

Debug Information

Notepad++ v8.6.8   (64-bit)
Build time : Jun  4 2024 - 00:30:00
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line : "C:\temp\SQLsharp_SETUP.sql" 
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
Periodic Backup : ON
OS Name : Windows 11 Pro (64-bit)
OS Version : 23H2
OS Build : 22631.3672
Current ANSI codepage : 65001
Plugins : 
    mimeTools (3.1)
    NppConverter (4.6)
    NppExport (0.4)

Anything else?

No response
