Specifying -p on command line can lead to data corruption #9142

@sasumner

Description of the Issue

Similar to the data corruption that can happen with the Goto dialog's Offset option (see #9101 and #9129 (comment)), the -p command line parameter can set the current position between the bytes of a multibyte-encoded UTF-8 character, or between the CR and LF of a Windows CRLF line ending. This should not be allowed to occur.

Multibyte UTF-8 characters should be considered "atomic" (I feel strongly about this), and so should Windows line endings (I feel less strongly about this, but still fairly strongly).
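
To make the "atomic" idea concrete, here is a minimal sketch of how a requested byte offset could be moved back to a safe boundary. It is not Notepad++ or Scintilla code; `snap_to_boundary` and `is_utf8_continuation` are hypothetical names, and the sketch assumes the buffer holds valid UTF-8.

```python
def is_utf8_continuation(byte: int) -> bool:
    """True for bytes of the form 10xxxxxx, which never start a character."""
    return (byte & 0xC0) == 0x80


def snap_to_boundary(data: bytes, pos: int) -> int:
    """Clamp pos to [0, len(data)] and move it back to the nearest safe boundary.

    Hypothetical helper: assumes data is valid UTF-8.
    """
    pos = max(0, min(pos, len(data)))
    # Step back over UTF-8 continuation bytes so pos lands on a lead byte.
    while 0 < pos < len(data) and is_utf8_continuation(data[pos]):
        pos -= 1
    # Do not stop between the CR and the LF of a CRLF pair.
    if 0 < pos < len(data) and data[pos - 1] == 0x0D and data[pos] == 0x0A:
        pos -= 1
    return pos
```

With a helper like this, a request such as -p1 on a file that starts with a 4-byte character would resolve to offset 0. Whether to snap backward or forward is a design choice; backward is used here only because it is the simpler case to show.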

Steps to Reproduce the Issue

  1. Turn visible line-endings on via View menu > Show symbol > Show End of Line
  2. Open test_4byte_utf8.txt file (attached, below), observe the UTF-8 character (after zooming):
    image
  3. Optional, using HexEditor plugin, look at hex view, observe:
    image
  4. Open test_crlf.txt file (attached, below), observe:
    image
  5. Close all files; quit Notepad++
  6. Run the command line: notepad++.exe -p1 test_4byte_utf8.txt using the attached file of the same name
  7. After the file loads but before doing anything else, type the letter a
  8. Observe data is corrupted as the 4-byte UTF-8 character has been split:
    image
  9. Repeat steps 5 through 7 using the test_crlf.txt file instead of the test_4byte_utf8.txt file in step 6.
  10. Observe line-endings, which should be CRLF, are "corrupted"; one line-ending is CR, the other is LF:
    image
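
The effect of the steps above can be sketched at the byte level, outside of Notepad++. The contents of the attached files are not reproduced here; the sketch assumes a stand-in 4-byte character (U+1F600) and a stand-in buffer that starts with a CRLF pair, and `insert_at_byte_offset` and `is_valid_utf8` are hypothetical helpers.

```python
def insert_at_byte_offset(data: bytes, offset: int, typed: bytes) -> bytes:
    """Insert typed bytes at a raw byte offset, with no boundary checks."""
    return data[:offset] + typed + data[offset:]


def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumption: stand-ins for the attached test files, not their real contents.
four_byte_char = "\U0001F600".encode("utf-8")  # bytes F0 9F 98 80
crlf_text = b"\r\n"

# Typing "a" at byte offset 1 (the position that -p1 requests).
split_char = insert_at_byte_offset(four_byte_char, 1, b"a")
split_crlf = insert_at_byte_offset(crlf_text, 1, b"a")

print(is_valid_utf8(four_byte_char))  # original bytes decode cleanly
print(is_valid_utf8(split_char))      # lead byte is now followed by "a"
print(split_crlf)                     # CR and LF are now separate line endings
```

In the first case the lead byte F0 is no longer followed by its continuation bytes, so the sequence is no longer valid UTF-8. In the second case the single CRLF becomes a lone CR and a lone LF with the typed character between them.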

Expected Behavior

No data corruption.

Actual Behavior

The data corruption shown in steps 8 and 10.

Debug Information

Notepad++ v7.9.1 (64-bit)
Build time : Nov 2 2020 - 01:07:46
Path : C:\........\npp.7.9.1.portable.x64\notepad++.exe
Admin mode : OFF
Local Conf mode : ON
OS Name : Windows 10 Enterprise (64-bit)
OS Version : 1809
OS Build : 17763.1518
Current ANSI codepage : 1252
Plugins : mimeTools.dll NppConverter.dll NppExport.dll

Test files

test_4byte_utf8.txt
test_crlf.txt
