Enhance large files with syntax highlighting performance #15926

Closed
donho wants to merge 1 commit into notepad-plus-plus:master from donho:enhance_large_file_with_syntax_highlighting_performance

donho commented Dec 9, 2024

Fix #15653

donho commented Dec 9, 2024

@softmgr
Thank you for the code you've provided.
Could you review this PR and test the binary please?

I tested with a 400 MB XML file: I put the caret in the middle of the file, then quit Notepad++ and relaunched it.
The content loads quickly at the previously saved position, without styling at first. A few seconds later the document is fully styled, which is good.

If I switch from this large file to a small file (keeping the large file's position), close Notepad++, and relaunch it, the small file is displayed as expected. When I then switch back to the large XML file, the same thing happens as described above (the content loads immediately, scrolled to the right position, and the styling comes afterwards). So far so good.

However, without quitting this Notepad++ instance, when I switch away from the large XML file and then switch back, Notepad++ freezes for a few seconds before the styled content is shown.

I couldn't find what I've done wrong, so you might have an idea about it.

Note that the PR is in its first stage; it could be modified during the code review.

softmgr commented Dec 10, 2024

I have a better approach. I’ll organize it and send it to you later.

softmgr commented Dec 10, 2024

According to my personal analysis, the factors affecting the performance of large files are mainly the following three:

  1. Syntax highlighting;
  2. Excessive number of folded lines;
  3. Auto-completion.

Additionally, I suggest removing the Clickable Link feature, as it slows down the overall performance of the editor. (We can use the right-click menu to copy and open links. If you’re interested, I can provide my improved code for reference.)

The following code is based on the 2024-12-10 development version and is for reference only:

  1. Improvements to Syntax Highlighting: Solves issues such as lag when loading large files, switching between large files, and moving the cursor within large files.

// ScintillaEditView.h

#define MODEVENTMASK_ON 3  //SC_MOD_INSERTTEXT | SC_MOD_DELETETEXT
#define MAX_FOLD_LINES_MORE_THAN 99  // When the number of rows collapsed in bulk is greater than this value, the WM_SETREDRAW message is blocked

// ...
class ScintillaEditView : public Window
{
friend class Finder;
public:
	// ...
	virtual void destroy()
	{
		if (_blankDocument != 0) // added
		{
			execute(SCI_RELEASEDOCUMENT, 0, _blankDocument); // added
			_blankDocument = 0; // added
		}
		::DestroyWindow(_hSelf);
		_hSelf = NULL;
		_pScintillaFunc = NULL;
	};
	// ...
	std::pair<size_t, size_t> getSelectionLinesRange(intptr_t selectionNumber = -1) const;
	std::pair<Sci_Position, Sci_Position> getSelectionPosition(intptr_t selectionNumber = -1, int sort = 1) const; // added

	// ...
	Document getBlankDocument(/*bool reset = false*/); // added

protected:
	// ...
	Buffer* _prevBuffer = nullptr; // added
	Document _blankDocument = 0; // added
	// ...
};

// ScintillaEditView.cpp

void ScintillaEditView::init(HINSTANCE hInst, HWND hPere)
{
	// ...
	if (!_pScintillaPtr)
	{
		throw std::runtime_error("ScintillaEditView::init : SCI_GETDIRECTPOINTER message failed");
	}

	execute(SCI_SETMODEVENTMASK, 0); // added
	execute(SCI_SETIDLESTYLING, SC_IDLESTYLING_ALL, 0); // added
	execute(SCI_SETMARGINMASKN, _SC_MARGE_FOLDER, SC_MASK_FOLDERS);
	
	// ...
	if (_defaultCharList.empty())
	{
		// ...
	}
	execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
	//Get the startup document and make a buffer for it so it can be accessed like a file
	attachDefaultDoc();
}

void ScintillaEditView::activateBuffer(BufferID buffer, bool force)
{
	// ...

	// put the state into the future ex buffer
	_currentBuffer->setHeaderLineState(lineStateVector, this);

	_prevBuffer = _currentBuffer; // added
	
	_currentBufferID = buffer;	//the magical switch happens here
	_currentBuffer = newBuf;
	
	// ...

	////////////////////////////////////////////////////////////////////
	const bool isSameLangType = _prevBuffer != nullptr && ((_prevBuffer == _currentBuffer) || (_prevBuffer->getLangType() == _currentBuffer->getLangType()));
	const int currentLangInt = static_cast<int>(_currentBuffer->getLangType());
	const bool isFirstActiveBuffer = (_currentBuffer->getLastLangType() != currentLangInt) || (_currentBuffer->isUntitled());
	if (!isSameLangType && !isFirstActiveBuffer)  // When entering the tab for the second or more times
	{
		execute(SCI_SETMODEVENTMASK, 0);  // Turn OFF the notifications
		execute(SCI_SETDOCPOINTER, 0, getBlankDocument());
		execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON);  // Turn ON the notifications
		defineDocType(_currentBuffer->getLangType());
	}
	////////////////////////////////////////////////////////////////////

	// change the doc, this operation will decrease
	// the ref count of old current doc and increase the one of the new doc. FileManager should manage the rest
	// Note that the actual reference in the Buffer itself is NOT decreased, Notepad_plus does that if necessary
	execute(SCI_SETMODEVENTMASK, 0);  // Turn OFF the notifications
	execute(SCI_SETDOCPOINTER, 0, _currentBuffer->getDocument());
	execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON);  // Turn ON the notifications

	// Due to execute(SCI_CLEARDOCUMENTSTYLE); in defineDocType() function
	// defineDocType() function should be called here, but not be after the fold info loop
	if (isFirstActiveBuffer)  // This is necessary when entering the tab for the first time
		defineDocType(_currentBuffer->getLangType());
	
	_currentBuffer->setLastLangType(currentLangInt); // added

	setWordChars();

	//deleted code (It doesn't seem necessary to call it here?)
	//if (_currentBuffer->getNeedsLexing())
	//{
	//	restyleBuffer();
	//}

	// ...
	
	//deleted code (It doesn't seem necessary to call it here?)
	//runMarkers(true, 0, true, false);

	// ...
}

void ScintillaEditView::defineDocType(LangType typeDoc)
{
	// ...
	if (svp._indentGuideLineShow)
	{
		// ...
	}
	
	execute(SCI_SETLAYOUTCACHE, SC_CACHE_DOCUMENT, 0); // added
	execute(SCI_STARTSTYLING, 0, 0); // added
}

Document ScintillaEditView::getBlankDocument(/*bool reset*/)
{
	//if (reset)
	//{
	//	if (_blankDocument != 0)
	//	{
	//		execute(SCI_RELEASEDOCUMENT, 0, _blankDocument);
	//		_blankDocument = 0;
	//	}
	//}
	if (_blankDocument == 0)
	{
		_blankDocument = static_cast<Document>(execute(SCI_CREATEDOCUMENT, 0, SC_DOCUMENTOPTION_TEXT_LARGE));
		execute(SCI_ADDREFDOCUMENT, 0, _blankDocument);
	}
	return _blankDocument;
}

void ScintillaEditView::saveCurrentPos()
{
	//Save data so, that the current topline becomes visible again after restoring.
	size_t displayedLine = execute(SCI_GETFIRSTVISIBLELINE);
	size_t docLine = execute(SCI_DOCLINEFROMVISIBLE, displayedLine);		//linenumber of the line displayed in the top
	size_t offset = displayedLine - execute(SCI_VISIBLEFROMDOCLINE, docLine);		//use this to calc offset of wrap. If no wrap this should be zero
	size_t wrapCount = execute(SCI_WRAPCOUNT, docLine);

	Buffer * buf = MainFileManager.getBufferByID(_currentBufferID);

	Position pos;
	// the correct visible line number
	pos._firstVisibleLine = docLine;
	//deleted code:  //pos._startPos = execute(SCI_GETANCHOR);
	//deleted code:  //pos._endPos = execute(SCI_GETCURRENTPOS);
	pos._xOffset = execute(SCI_GETXOFFSET);
	pos._selMode = execute(SCI_GETSELECTIONMODE);
	pos._scrollWidth = execute(SCI_GETSCROLLWIDTH);
	pos._offset = offset;
	pos._wrapCount = wrapCount;

	const intptr_t nbSelections = execute(SCI_GETSELECTIONS);
	if (pos._selMode == SC_SEL_STREAM && nbSelections > 1)
	{
		const intptr_t startSelection = (nbSelections <= 99) ? 0 : nbSelections - 99;
		for (intptr_t i = startSelection; i < nbSelections; ++i)
		{
			pos._selections.emplace_back(getSelectionPosition(i, 0));
		}
	}
	else
	{
		std::pair<Sci_Position, Sci_Position> v = std::pair<Sci_Position, Sci_Position>(execute(SCI_GETANCHOR), execute(SCI_GETCURRENTPOS));
		pos._selections.emplace_back(v);
	}

	buf->setPosition(pos, this);
}

void ScintillaEditView::restoreCurrentPosPreStep()
{
	Buffer * buf = MainFileManager.getBufferByID(_currentBufferID);
	const Position & pos = buf->getPosition(this);

	execute(SCI_SETSELECTIONMODE, pos._selMode);	//enable
	if (pos._selMode == SC_SEL_STREAM && pos._selections.size() > 1)
	{
		execute(SCI_CANCEL);							//disable
		for (size_t i = 0; i < pos._selections.size(); ++i)
		{
			const Sci_Position startPos = pos._selections[i].first;
			const Sci_Position endPos = pos._selections[i].second;
			if (i == 0)
				execute(SCI_SETSELECTION, endPos, startPos);
			else
				execute(SCI_ADDSELECTION, endPos, startPos);
		}
		execute(SCI_SETMAINSELECTION, pos._selections.size() - 1);
	}
	else
	{
		if (pos._selections.size() > 0)
		{
			const Sci_Position startPos = pos._selections[0].first;
			const Sci_Position endPos = pos._selections[0].second;
			execute(SCI_SETANCHOR, startPos);
			execute(SCI_SETCURRENTPOS, endPos);
		}
		execute(SCI_CANCEL);							//disable
	}
	if (!isWrap()) //only offset if not wrapping, otherwise the offset isn't needed at all
	{
		execute(SCI_SETSCROLLWIDTH, pos._scrollWidth);
		execute(SCI_SETXOFFSET, pos._xOffset);
	}
	execute(SCI_CHOOSECARETX); // choose current x position
	intptr_t lineToShow = execute(SCI_VISIBLEFROMDOCLINE, pos._firstVisibleLine);
	execute(SCI_SETFIRSTVISIBLELINE, lineToShow);
	if (isWrap())
	{
		// Enable flag 'positionRestoreNeeded' so that function restoreCurrentPosPostStep gets called
		// once Scintilla sends the SCN_PAINTED notification
		_positionRestoreNeeded = true;
	}
	_restorePositionRetryCount = 0;
}

void ScintillaEditView::styleChange()
{
	const bool isSameLangType = _prevBuffer != nullptr && ((_prevBuffer == _currentBuffer) || (_prevBuffer->getLangType() == _currentBuffer->getLangType())); // added
	const int currentLangInt = static_cast<int>(_currentBuffer->getLangType()); // added
	const bool isFirstActiveBuffer = (_currentBuffer->getLastLangType() != currentLangInt) || (_currentBuffer->isUntitled()); // added
	if (!isSameLangType && !isFirstActiveBuffer)  // When entering the tab for the second or more times
	{
		Document prevDoc = execute(SCI_GETDOCPOINTER); // added
		execute(SCI_SETMODEVENTMASK, 0); // added
		execute(SCI_SETDOCPOINTER, 0, getBlankDocument()); // added
		execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
		defineDocType(_currentBuffer->getLangType());

		execute(SCI_SETMODEVENTMASK, 0); // added
		execute(SCI_SETDOCPOINTER, 0, prevDoc); // added
		execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
	}
	if (isFirstActiveBuffer) // added
	{
		defineDocType(_currentBuffer->getLangType()); // added
	}
	_currentBuffer->setLastLangType(currentLangInt); // added
	restyleBuffer();
}

std::pair<Sci_Position, Sci_Position> ScintillaEditView::getSelectionPosition(intptr_t selectionNumber /* = -1 */, int sort/* = 1*/) const
{
	Sci_Position start_pos, end_pos;

	if (selectionNumber < 0)
	{
		start_pos = execute(SCI_GETSELECTIONSTART);
		end_pos = execute(SCI_GETSELECTIONEND);
	}
	else
	{
		start_pos = execute(SCI_GETSELECTIONNSTART, selectionNumber);
		end_pos = execute(SCI_GETSELECTIONNEND, selectionNumber);
	}

	if (sort >= 1)
	{
		if (start_pos > end_pos)
			std::swap(start_pos, end_pos);
		return std::pair<Sci_Position, Sci_Position>(start_pos, end_pos);
	}

	if (sort == 0)
	{
		Sci_Position caret_pos = execute(SCI_GETSELECTIONNCARET, selectionNumber < 0 ? 0 : selectionNumber);
		if (caret_pos == start_pos)
		{
			std::swap(start_pos, end_pos);
		}
	}

	//If sort <= -1, return directly

	return std::pair<Sci_Position, Sci_Position>(start_pos, end_pos);
}

// Parameters.h

struct Position
{
	intptr_t _firstVisibleLine = 0;
	//deleted code:  //intptr_t _startPos = 0;
	//deleted code:  //intptr_t _endPos = 0;
	intptr_t _xOffset = 0;
	intptr_t _selMode = 0;
	intptr_t _scrollWidth = 1;
	intptr_t _offset = 0;
	intptr_t _wrapCount = 0;
	std::vector< std::pair <Sci_Position, Sci_Position> > _selections;  // Save one or more selections
};

// Parameters.cpp

std::vector<std::pair<Sci_Position, Sci_Position>> getSelections(const wchar_t* posStr) // added
{
	std::vector<std::pair<Sci_Position, Sci_Position>> selections;
	if (posStr == nullptr || *posStr == L'\0')
		return selections;

	const wchar_t* ptr = posStr;

	while (*ptr)
	{
		wchar_t* endPtr = nullptr;
		long long start = std::wcstoll(ptr, &endPtr, 10);
		if (ptr == endPtr)
			break;
		if (*endPtr == L',')
			++endPtr;

		ptr = endPtr;
		long long end = std::wcstoll(ptr, &endPtr, 10);
		if (ptr == endPtr)
			break;
		selections.emplace_back(start, end);

		if (*endPtr == L';')
			++endPtr;
		ptr = endPtr;
	}
	return selections;
}

void setSelections(const std::vector<std::pair<Sci_Position, Sci_Position>>& selections, wchar_t* posStr) // added
{
	if (!posStr)
		return;

	wchar_t* ptr = posStr;

	const size_t selections_size = selections.size();

	if (selections_size == 0)
		return;

	const size_t startSelection = (selections_size <= 99) ? 0 : selections_size - 99;
	for (size_t i = startSelection; i < selections_size; ++i)
	{
		ptr += std::swprintf(ptr, 50, L"%lld,%lld", static_cast<long long>(selections[i].first), static_cast<long long>(selections[i].second));

		if (i != selections_size - 1)
		{
			*ptr++ = L';';
		}
	}
	*ptr = L'\0';
}

bool NppParameters::getSessionFromXmlTree(TiXmlDocument *pSessionDoc, Session& session)
{
	// ...
	for (size_t k = 0; k < nbView; ++k)
	{
		// ...
				if (fileName)
				{
					// ...
					//deleted code:  //posStr = (childNode->ToElement())->Attribute(L"startPos");
					//deleted code:  //if (posStr)
					//deleted code:  //	position._startPos = static_cast<intptr_t>(_ttoi64(posStr));
					//deleted code:  //posStr = (childNode->ToElement())->Attribute(L"endPos");
					//deleted code:  //if (posStr)
					//deleted code:  //	position._endPos = static_cast<intptr_t>(_ttoi64(posStr));
					// ...
					if (posStr)
						position._wrapCount = static_cast<intptr_t>(_ttoi64(posStr));

					posStr = (childNode->ToElement())->Attribute(L"selections"); // added
					if (posStr) // added
						position._selections = getSelections(posStr); // added

					// ...
				}
	}
	// ...
}

void NppParameters::writeSession(const Session & session, const wchar_t *fileName)
{
	// ...
		for (size_t k = 0; k < nbElem ; ++k)
		{
			// ...
			for (size_t i = 0, len = viewElems[k].viewFiles->size(); i < len ; ++i)
			{
				// ...
				//deleted code:  //(fileNameNode->ToElement())->SetAttribute(L"startPos", _i64tot(static_cast<LONGLONG>(viewSessionFiles[i]._startPos), szInt64, 10));
				//deleted code:  //(fileNameNode->ToElement())->SetAttribute(L"endPos", _i64tot(static_cast<LONGLONG>(viewSessionFiles[i]._endPos), szInt64, 10));
				// ...
				(fileNameNode->ToElement())->SetAttribute(L"wrapCount", _i64tot(static_cast<LONGLONG>(viewSessionFiles[i]._wrapCount), szInt64, 10));

				wchar_t* selections = new wchar_t[64 * viewSessionFiles[i]._selections.size() + 1] {}; // added
				setSelections(viewSessionFiles[i]._selections, selections); // added
				(fileNameNode->ToElement())->SetAttribute(L"selections", selections); // added
				delete[] selections; // added
				(fileNameNode->ToElement())->SetAttribute(L"lang", viewSessionFiles[i]._langName.c_str());
				// ...
			}
		}
}
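
The `selections` session attribute written above stores each selection as an `anchor,caret` pair, with pairs separated by `;`. As a rough, self-contained sketch of that round trip (not the code proposed in this comment): the names `Pos`, `Selections`, `serializeSelections` and `parseSelections` are hypothetical, `Pos` stands in for `Sci_Position`, and `std::wstring` replaces the caller-allocated `wchar_t` buffer used above.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <utility>
#include <vector>

using Pos = long long; // hypothetical stand-in for Sci_Position
using Selections = std::vector<std::pair<Pos, Pos>>;

// Serialize selections as "first,second;first,second;...".
std::wstring serializeSelections(const Selections& selections)
{
	std::wstring out;
	for (std::size_t i = 0; i < selections.size(); ++i)
	{
		if (i != 0)
			out += L';';
		out += std::to_wstring(selections[i].first);
		out += L',';
		out += std::to_wstring(selections[i].second);
	}
	return out;
}

// Parse the same format; parsing stops at the first malformed pair.
Selections parseSelections(const std::wstring& text)
{
	Selections selections;
	const wchar_t* ptr = text.c_str();
	while (*ptr)
	{
		wchar_t* endPtr = nullptr;
		const Pos first = std::wcstoll(ptr, &endPtr, 10);
		if (ptr == endPtr || *endPtr != L',')
			break; // no number, or no separator after it
		ptr = endPtr + 1;
		const Pos second = std::wcstoll(ptr, &endPtr, 10);
		if (ptr == endPtr)
			break; // second number missing
		selections.emplace_back(first, second);
		ptr = (*endPtr == L';') ? endPtr + 1 : endPtr;
	}
	return selections;
}
```

Returning a `std::wstring` avoids having to size a buffer up front; whether that trade-off suits the session writer is a separate question.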

// Buffer.h

class Buffer final {
	friend class FileManager;
public:
	// ...
	void setLangType(LangType lang, const wchar_t * userLangName = L"");

	int getLastLangType() const { return _lastLangType; } // added

	void setLastLangType(int val) { _lastLangType = val; } // added

	UniMode getUnicodeMode() const { return _unicodeMode; }
	// ...
private:
	// ...
	LangType _lang = L_TEXT;
	int _lastLangType = -1; // added
	// ...
};

// GoToLineDlg.cpp

intptr_t CALLBACK GoToLineDlg::run_dlgProc(UINT message, WPARAM wParam, LPARAM lParam)
{
	// ...
		case WM_COMMAND:
		{
			switch (LOWORD(wParam))
			{
				// ...
				case IDOK :
				{
					(*_ppEditView)->execute(SCI_SETMODEVENTMASK, 0); // added
					long long line = getLine();
					if (line != -1)
					{
						display(false);
						if (_mode == go2line)
						{
							// ...
						}
						else
						{
							size_t posToGoto = 0;
							line--; // Subtract 1 from the number on the interface
							if (line > 0)
							// ...
						}
					}
					(*_ppEditView)->execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
					// ...
				}
			}
		}
	// ...
}

void GoToLineDlg::updateLinesNumbers() const
{
	// ...
	if (_mode == go2line)
	{
		// ...
	}
	else
	{
		current = (*_ppEditView)->execute(SCI_GETCURRENTPOS) + 1; // Fixed inconsistency with status bar display, need+1 here  //current = (*_ppEditView)->execute(SCI_GETCURRENTPOS);
		size_t currentDocLength = (*_ppEditView)->getCurrentDocLen();
		limit = currentDocLength + 1; // Fixed inconsistency with status bar display, need+1 here  //limit = (currentDocLength > 0 ? currentDocLength - 1 : 0);
	}

	//Resolve the issue of window flickering during the loading process of large files
	char buffer[22] = { 0 }; // added
	::GetDlgItemTextA(_hSelf, ID_CURRLINE, buffer, 21); // added
	if (static_cast<size_t>(std::strtoull(buffer, nullptr, 10)) != current) // added; strtoull avoids truncation above 4 GB where unsigned long is 32-bit
		::SetDlgItemTextA(_hSelf, ID_CURRLINE, std::to_string(current).c_str());
	::GetDlgItemTextA(_hSelf, ID_LASTLINE, buffer, 21); // added
	if (static_cast<size_t>(std::strtoull(buffer, nullptr, 10)) != limit) // added
		::SetDlgItemTextA(_hSelf, ID_LASTLINE, std::to_string(limit).c_str());
}

// NppIO.cpp

bool Notepad_plus::doReload(BufferID id, bool alert /* = true */) // a default argument may only be specified once, normally on the declaration
{
	// ...
	if (mainVisisble)
	{
		_mainEditView.saveCurrentPos();
		_mainEditView.execute(SCI_SETMODEVENTMASK, 0); // added
		_mainEditView.execute(SCI_SETDOCPOINTER, 0, 0);
		_mainEditView.execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
	}

	if (subVisisble)
	{
		_subEditView.saveCurrentPos();
		_subEditView.execute(SCI_SETMODEVENTMASK, 0); // added
		_subEditView.execute(SCI_SETDOCPOINTER, 0, 0);
		_subEditView.execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
	}

	// ...
	Buffer * pBuf = MainFileManager.getBufferByID(id);
	pBuf->setLastLangType(-1); // For reopened file, the last used language should be reset to its initial value here so that the language can be reloaded later in the activateBuffer() function
	if (mainVisisble)
	{
		_mainEditView.execute(SCI_SETMODEVENTMASK, 0); // added
		_mainEditView.execute(SCI_SETDOCPOINTER, 0, pBuf->getDocument());
		_mainEditView.execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
		_mainEditView.restoreCurrentPosPreStep();
	}

	if (subVisisble)
	{
		_subEditView.execute(SCI_SETMODEVENTMASK, 0); // added
		_subEditView.execute(SCI_SETDOCPOINTER, 0, pBuf->getDocument());
		_subEditView.execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON); // added
		_subEditView.restoreCurrentPosPreStep();
	}

	// ...
}

// NppCommands.cpp

void Notepad_plus::command(int id)
{
	switch (id)
	{
		// ...
		case IDM_LANG_USER :
		{
			LangType lang = menuID2LangType(id); // added
			setLanguage(lang);  //setLanguage(menuID2LangType(id));
			// Manually set language, don't change language even file extension changes.
			Buffer *buffer = _pEditView->getCurrentBuffer();
			buffer->langHasBeenSetFromMenu();
			buffer->setLastLangType(static_cast<int>(lang)); // After manually resetting the language, the last used language should be directly set to the current language.

			if (_pDocMap)
			{
				_pDocMap->setSyntaxHiliting();
			}
		}
		break;
		// ...
	}
}

// NppNotification.cpp

BOOL Notepad_plus::notify(SCNotification *notification)
{
	// ...
	switch (notification->nmhdr.code)
	{
		// ...
		case SCN_MODIFIED:
		{
			// ...
			if (notification->modificationType & (SC_MOD_DELETETEXT | SC_MOD_INSERTTEXT))
			{
				// ...
				//::InvalidateRect(notifyView->getHSelf(), NULL, TRUE);  //deleted code

				// for the backup system
				_pEditView->getCurrentBuffer()->setModifiedStatus(true); // Move here from the code below
			}

			//deleted code:
			//if (notification->modificationType & (SC_MOD_DELETETEXT | SC_MOD_INSERTTEXT | SC_PERFORMED_UNDO | SC_PERFORMED_REDO))
			//{
			//	// for the backup system
			//	_pEditView->getCurrentBuffer()->setModifiedStatus(true);
			//}

			//deleted code:
			//if (notification->modificationType & SC_MOD_CHANGEINDICATOR)
			//{
			//	::InvalidateRect(notifyView->getHSelf(), NULL, FALSE);
			//}
			break;
		}
		case NM_DBLCLK :
		{
			// ...
				if (lpnm->dwItemSpec == DWORD(STATUSBAR_CUR_POS))
				{
					//deleted code:
					//bool isFirstTime = !_goToLineDlg.isCreated();
					//_goToLineDlg.doDialog(_nativeLangSpeaker.isRTL());
					//if (isFirstTime)
					//	_nativeLangSpeaker.changeDlgLang(_goToLineDlg.getHSelf(), "GoToLine");

					command(IDM_SEARCH_GOTOLINE); // added
				}
			// ...
		}
		case SCN_CHARADDED:
		{
			// ...
				Buffer* currentBuf = _pEditView->getCurrentBuffer();
				if (currentBuf->allowAutoCompletion() && (!currentBuf->isReadOnly()))  //if (currentBuf->allowAutoCompletion())
				{
					// ...
				}
		}
		// ...
	}
}

// Notepad_plus.cpp

void Notepad_plus::loadBufferIntoView(BufferID id, int whichOne, bool dontClose)
{
	// ...
	//Check if the tab has a single clean buffer. Close it if so
	if (!dontClose && tabToOpen->nbItem() == 1)
	{
		idToClose = tabToOpen->getBufferByIndex(0);
		Buffer * buf = MainFileManager.getBufferByID(idToClose);
		if (buf->isDirty() || !buf->isUntitled())
		{
			idToClose = BUFFER_INVALID;
		}
		else
		{
			buf->setLastLangType(-1); // When replacing the "new" tab with an opened file, the last used language should be reset to its initial value so that the language can be reloaded later in the activateBuffer() function.
		}
	}

	// ...
}

bool Notepad_plus::braceMatch()
{
	// ...

	intptr_t braceAtCaret = -1;
	intptr_t braceOpposite = -1;
	if (currentBuf->getLangType() != L_TEXT) // added - Text files do not open "BraceMatch"
		findMatchingBracePos(braceAtCaret, braceOpposite);

	// ...
}

// scintilla\src\Document.cxx

Sci::Position Document::BraceMatch(Sci::Position position, Sci::Position /*maxReStyle*/, Sci::Position startPos, bool useStartPos) noexcept {
	// ...
	unsigned int charsCount = 0; // added - Fixed in large files where moving the cursor near unmatched parentheses would make the editor unresponsive
	while ((position >= 0) && (position < LengthNoExcept())) {
		// ...

		//After testing, it can traverse about 32MB of data in 0.5 seconds
		if (++charsCount > 32 * 1024 * 1024) // added
			break; // added
	}
	return -1;
}
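
The capped scan in `Document::BraceMatch` above can be illustrated in isolation. The following is only a simplified sketch: `boundedBraceMatch` and its `maxChars` parameter are hypothetical names, the text is a plain `std::string`, and the style check that Scintilla also performs is omitted.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the brace matching text[position], or -1 if there is
// none or if more than maxChars characters were scanned without finding it.
long long boundedBraceMatch(const std::string& text, std::size_t position, std::size_t maxChars)
{
	if (position >= text.size())
		return -1;

	const char open = text[position];
	char close = 0;
	int direction = 1;
	switch (open)
	{
		case '(': close = ')'; break;
		case '[': close = ']'; break;
		case '{': close = '}'; break;
		case ')': close = '('; direction = -1; break;
		case ']': close = '['; direction = -1; break;
		case '}': close = '{'; direction = -1; break;
		default: return -1; // not a brace
	}

	int depth = 1;
	std::size_t scanned = 0;
	const long long length = static_cast<long long>(text.size());
	long long i = static_cast<long long>(position) + direction;
	while (i >= 0 && i < length)
	{
		const char c = text[static_cast<std::size_t>(i)];
		if (c == open)
			++depth;
		else if (c == close && --depth == 0)
			return i;

		if (++scanned > maxChars)
			break; // give up instead of walking the rest of a huge document
		i += direction;
	}
	return -1;
}
```

The cost of the cap is that a genuine match lying beyond the budget is reported as unmatched, which is the trade-off the 32 MB limit above accepts.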

Finally, every call to SCI_SETDOCPOINTER in the code needs to be modified. The changes are extensive; the call sites can be found by searching the project for the keyword SCI_SETDOCPOINTER. Every occurrence except the one in the ScintillaEditView::init() function needs the change. I didn’t encapsulate it into a separate function because an additional layer of function calls would also affect performance. For example:

_invisibleEditView.execute(SCI_SETMODEVENTMASK, 0);  // added
_invisibleEditView.execute(SCI_SETDOCPOINTER, 0, buf->getDocument());
_invisibleEditView.execute(SCI_SETMODEVENTMASK, MODEVENTMASK_ON);  // added
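
For comparison only, the same off/switch/on sequence could be expressed as a small RAII guard. This is a sketch under assumptions, not part of the proposed change: `MockEditView`, `ModEventMaskGuard` and `switchDocumentQuietly` are hypothetical names, and the mock merely records calls instead of sending Scintilla messages. Whether the extra call layer costs anything measurable once inlined would have to be profiled.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for ScintillaEditView: it only logs what it is asked to do.
struct MockEditView
{
	std::vector<std::string> log;
	int modEventMask = 3;

	void setModEventMask(int mask)
	{
		modEventMask = mask;
		log.push_back("mask=" + std::to_string(mask));
	}
	void setDocPointer(int doc) { log.push_back("doc=" + std::to_string(doc)); }
};

// Turns modification notifications off on construction and restores them on destruction.
class ModEventMaskGuard
{
public:
	ModEventMaskGuard(MockEditView& view, int restoreMask)
		: _view(view), _restoreMask(restoreMask)
	{
		_view.setModEventMask(0);
	}
	~ModEventMaskGuard() { _view.setModEventMask(_restoreMask); }

	ModEventMaskGuard(const ModEventMaskGuard&) = delete;
	ModEventMaskGuard& operator=(const ModEventMaskGuard&) = delete;

private:
	MockEditView& _view;
	int _restoreMask;
};

inline void switchDocumentQuietly(MockEditView& view, int doc)
{
	ModEventMaskGuard guard(view, 3); // 3 mirrors MODEVENTMASK_ON
	view.setDocPointer(doc);
} // guard goes out of scope here and the mask is restored
```

The guard also restores the mask on early returns, which the hand-written three-line pattern does not guarantee.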

  2. Optimization for Excessive Number of Folded Lines:
    (Note: I have more improvements in this area, but due to the complexity of logical changes, I have not provided all the code here. The optimizations below are based on the existing code to simply enhance performance.)

// ScintillaEditView.cpp

void ScintillaEditView::syncFoldStateWith(const std::vector<size_t> & lineStateVectorNew)
{
	size_t nbLineState = lineStateVectorNew.size();
	if (nbLineState > 0) // added
	{
		if (nbLineState > MAX_FOLD_LINES_MORE_THAN) // added
			::SendMessage(_hSelf, WM_SETREDRAW, FALSE, 0); // added

		for (size_t i = 0; i < nbLineState; ++i)
		{
			// ...
		}

		if (nbLineState > MAX_FOLD_LINES_MORE_THAN) // added
		{
			::SendMessage(_hSelf, WM_SETREDRAW, TRUE, 0); // added
			execute(SCI_SCROLLCARET, 0, 0); // added
			::InvalidateRect(_hSelf, nullptr, TRUE); // added
		}
	}
}

void ScintillaEditView::collapseFoldIndentationBased(int level2Collapse, bool mode)
{
	execute(SCI_COLOURISE, 0, -1);

	FoldLevelStack levelStack;
	++level2Collapse; // 1-based level number

	const intptr_t maxLine = execute(SCI_GETLINECOUNT);
	intptr_t line = 0;

	if (maxLine > MAX_FOLD_LINES_MORE_THAN) // added
		::SendMessage(_hSelf, WM_SETREDRAW, FALSE, 0); // added

	// ...

	runMarkers(true, 0, true, false);

	if (maxLine > MAX_FOLD_LINES_MORE_THAN) // added
	{
		::SendMessage(_hSelf, WM_SETREDRAW, TRUE, 0); // added
		execute(SCI_SCROLLCARET, 0, 0); // added
		::InvalidateRect(_hSelf, nullptr, TRUE); // added
	}
}

void ScintillaEditView::collapse(int level2Collapse, bool mode)
{
	if (isFoldIndentationBased())
	{
		collapseFoldIndentationBased(level2Collapse, mode);
		return;
	}

	execute(SCI_COLOURISE, 0, -1);

	intptr_t maxLine = execute(SCI_GETLINECOUNT);
	if (maxLine > MAX_FOLD_LINES_MORE_THAN) // added
		::SendMessage(_hSelf, WM_SETREDRAW, FALSE, 0); // added

	// ...

	runMarkers(true, 0, true, false);

	if (maxLine > MAX_FOLD_LINES_MORE_THAN) // added
	{
		::SendMessage(_hSelf, WM_SETREDRAW, TRUE, 0); // added
		execute(SCI_SCROLLCARET, 0, 0); // added
		::InvalidateRect(_hSelf, nullptr, TRUE); // added
	}
}
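
The pattern shared by the three functions above is: when many lines are about to change, suspend redrawing, apply all changes, then repaint once. A platform-neutral sketch of that idea follows; `MockWindow`, `applyFolds` and `kBatchThreshold` are hypothetical names, and the mock counts repaints instead of sending `WM_SETREDRAW` or calling `InvalidateRect`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the editor window: counts how often it repaints.
struct MockWindow
{
	bool redrawEnabled = true;
	int repaintCount = 0;

	void setRedraw(bool enabled) { redrawEnabled = enabled; }
	void invalidate()
	{
		if (redrawEnabled)
			++repaintCount;
	}
	// In this mock, every fold change asks for a repaint.
	void toggleFold(std::size_t /*line*/) { invalidate(); }
};

constexpr std::size_t kBatchThreshold = 99; // mirrors MAX_FOLD_LINES_MORE_THAN

void applyFolds(MockWindow& window, const std::vector<std::size_t>& lines)
{
	const bool batch = lines.size() > kBatchThreshold;
	if (batch)
		window.setRedraw(false); // suppress intermediate repaints

	for (std::size_t line : lines)
		window.toggleFold(line);

	if (batch)
	{
		window.setRedraw(true);
		window.invalidate(); // one repaint for the whole batch
	}
}
```

Small batches keep the normal repaint behavior, so the suppression only applies where it can pay off.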

  3. Improvements to Auto-Completion: Resolves lag when typing a single character in large files.

// AutoCompletion.h

class AutoCompletion {
public:
	explicit AutoCompletion(ScintillaEditView * pEditView): _pEditView(pEditView) {  //explicit AutoCompletion(ScintillaEditView * pEditView): _pEditView(pEditView), _funcCalltip(pEditView) {
		//Do not load any language yet
		_funcCalltip = nullptr; // added
		_insertedMatchedChars.init(_pEditView);
	};

	~AutoCompletion(){
		//delete _pXmlFile; // deleted code
		delete _funcCalltip; // added
		_map_funcCalltip.clear(); // added
		for (std::map<LangType, TiXmlElement*>::iterator it = _map_pXmlKeyword.begin(); it != _map_pXmlKeyword.end(); ++it) // added
		{
			delete it->second;
		}
		_map_pXmlKeyword.clear(); // added
	};

private:
	// ...
	//TiXmlDocument *_pXmlFile = nullptr;  // deleted
	// ...
	FunctionCallTip* _funcCalltip;  //FunctionCallTip _funcCalltip;
	std::map<LangType, TiXmlElement*> _map_pXmlKeyword; // Create a cache
	std::map<LangType, FunctionCallTip*> _map_funcCalltip; // Create a cache
	// ...
};

// AutoCompletion.cpp

void AutoCompletion::getWordArray(vector<wstring> & wordArray, const wchar_t *beginChars, const wchar_t *allChars)
{
	// ...
	expr += L"[^ \\t\\n\\r.,;:\"(){}=<>'+!?\\[\\]]+";
	
	size_t fromPos = 0; // added
	size_t docLength = _pEditView->execute(SCI_GETLENGTH);

	static size_t fileSizeGreaterThan = 5; // Size threshold in MB: above it, wordArray is built only from a window around the caret (default 5 MB)
	if (docLength > fileSizeGreaterThan * 1024 * 1024) // The file is too large to search in full
	{
		intptr_t currentPos = static_cast<intptr_t>(_pEditView->execute(SCI_GETCURRENTPOS));
		intptr_t startPos = currentPos - ((fileSizeGreaterThan * 1024 * 1024) / 2); // Half the window (2.5 MB by default) before the caret
		intptr_t endPos = currentPos + ((fileSizeGreaterThan * 1024 * 1024) / 2); // Half the window (2.5 MB by default) after the caret
		if (startPos < 0)
		{
			intptr_t num_offset = -startPos;
			startPos = 0;
			endPos += num_offset;
		}
		if (endPos > static_cast<intptr_t>(docLength))
		{
			intptr_t num_offset2 = endPos - static_cast<intptr_t>(docLength);
			endPos = static_cast<intptr_t>(docLength);
			startPos -= num_offset2;
		}
		int max_while = 0;
		UCHAR startChar = (UCHAR)_pEditView->execute(SCI_GETCHARAT, startPos);
		if ((startChar >= 'A' && startChar <= 'Z') || (startChar >= 'a' && startChar <= 'z'))
		{
			while (startPos > 0)
			{
				UCHAR c = (UCHAR)_pEditView->execute(SCI_GETCHARAT, startPos - 1);
				if (++max_while < 50 && ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')))
					--startPos;
				else
					break;
			}
		}
		max_while = 0;
		UCHAR endChar = (UCHAR)_pEditView->execute(SCI_GETCHARAT, endPos);
		if ((endChar >= 'A' && endChar <= 'Z') || (endChar >= 'a' && endChar <= 'z'))
		{
			while (endPos < static_cast<intptr_t>(docLength))
			{
				UCHAR c = (UCHAR)_pEditView->execute(SCI_GETCHARAT, endPos + 1);
				if (++max_while < 50 && ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')))
					++endPos;
				else
					break;
			}
		}
		fromPos = static_cast<size_t>(startPos);
		docLength = static_cast<size_t>(endPos);
	}
	auto time_start = std::chrono::high_resolution_clock::now(); // added

	// ...
	intptr_t posFind = _pEditView->searchInTarget(expr.c_str(), expr.length(), fromPos, docLength); //intptr_t posFind = _pEditView->searchInTarget(expr.c_str(), expr.length(), 0, docLength);

	wstring boxId = L"\x1E" + intToString(BOX_IMG_ID);
	wstring funcId = L"\x1E" + intToString(FUNC_IMG_ID);
	while (posFind >= 0)
	{
		// ...
		if (foundTextLen < bufSize)
		{
			// ...
		}
		auto time_end = std::chrono::high_resolution_clock::now(); // added
		auto time_duration = std::chrono::duration_cast<std::chrono::milliseconds>(time_end - time_start).count(); // added
		if (time_duration > 400) // 0.4 seconds
		{
			if (fileSizeGreaterThan > 1)
				fileSizeGreaterThan--;
			break;
		}
		posFind = _pEditView->searchInTarget(expr.c_str(), expr.length(), wordEnd, docLength);
	}
}
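
// Illustration only, not part of the proposed change: the window arithmetic at
// the top of getWordArray() above, reduced to a standalone function. The name
// clampSearchWindow and its parameters are hypothetical, and the extension of
// the window edges to word boundaries is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>

// Returns a [start, end) range of at most windowSize bytes centered on the
// caret, shifted as needed so that it stays inside [0, docLength].
std::pair<std::int64_t, std::int64_t> clampSearchWindow(std::int64_t caret, std::int64_t docLength, std::int64_t windowSize)
{
	if (docLength <= windowSize)
		return {0, docLength}; // small document: search all of it

	std::int64_t start = caret - windowSize / 2;
	std::int64_t end = caret + windowSize / 2;
	if (start < 0)
	{
		end -= start; // start is negative, so this moves end to the right
		start = 0;
	}
	if (end > docLength)
	{
		start -= end - docLength; // shift the window left by the overshoot
		end = docLength;
	}
	return {std::max<std::int64_t>(start, 0), end};
}
```

// Near either end of the document the window is shifted rather than cut, so
// the amount of text searched stays roughly constant.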

bool AutoCompletion::showFunctionComplete()
{
	if (!_funcCompletionActive)
		return false;

	if (_funcCalltip->updateCalltip(0, true)) // if (_funcCalltip.updateCalltip(0, true))
	{
		return true;
	}
	return false;
}

void AutoCompletion::update(int character)
{
	// ...
	if (nppGUI._funcParams || _funcCalltip->isVisible())  //if (nppGUI._funcParams || _funcCalltip.isVisible())
	{
		if (_funcCalltip->updateCalltip(character))  //if (_funcCalltip.updateCalltip(character)) //calltip visible because triggered by autocomplete, set mode
		{
			return;	//only return in case of success, else autocomplete
		}
	}
	// ...
}

void AutoCompletion::callTipClick(size_t direction)
{
	if (!_funcCompletionActive)
		return;

	if (direction == 1)
	{
		_funcCalltip->showPrevOverload();  //_funcCalltip.showPrevOverload();
	}
	else if (direction == 2)
	{
		_funcCalltip->showNextOverload();  //_funcCalltip.showNextOverload();
	}
}

bool AutoCompletion::setLanguage(LangType language)
{
	if (_curLang == language && _map_pXmlKeyword.find(language) != _map_pXmlKeyword.end())  //if (_curLang == language && _pXmlFile)
		return true;

	_curLang = language;

	_keyWords.clear(); // Move the code to the front of the function
	_keyWordArray.clear(); // Move the code to the front of the function

	if (_map_pXmlKeyword.find(language) != _map_pXmlKeyword.end()) // Read from cache
	{
		_pXmlKeyword = _map_pXmlKeyword[language]; // added
		_funcCalltip = _map_funcCalltip[language]; // added
		_funcCompletionActive = (_pXmlKeyword != NULL); // added
		_ignoreCase = _funcCalltip->_ignoreCase; // added
	}
	else
	{
		TiXmlDocument* pXmlFile = nullptr; // added

		wchar_t path[MAX_PATH];
		::GetModuleFileName(NULL, path, MAX_PATH);
		PathRemoveFileSpec(path);
		wcscat_s(path, L"\\autoCompletion\\");
		wcscat_s(path, getApiFileName());
		wcscat_s(path, L".xml");

		//if (_pXmlFile)
		//	delete _pXmlFile;

		if (doesFileExist(path)) // added
		{
			pXmlFile = new TiXmlDocument(path);  //_pXmlFile = new TiXmlDocument(path);
			_funcCompletionActive = pXmlFile->LoadFile();  //_funcCompletionActive = _pXmlFile->LoadFile();
		}
		else
			_funcCompletionActive = false; // added


		TiXmlElement* pXmlKeyword = nullptr; // added
		FunctionCallTip* funcCalltip = new FunctionCallTip(_pEditView); // added
		TiXmlNode* pAutoNode = NULL;
		if (_funcCompletionActive)
		{
			_funcCompletionActive = false;	//safety
			TiXmlNode* pNode = pXmlFile->FirstChild(L"NotepadPlus");  //TiXmlNode * pNode = _pXmlFile->FirstChild(L"NotepadPlus");
			if (!pNode)
			{
				_map_pXmlKeyword[language] = pXmlKeyword; // added
				_map_funcCalltip[language] = funcCalltip; // added
				return false;
			}

			pAutoNode = pNode = pNode->FirstChildElement(L"AutoComplete");
			if (!pNode)
			{
				_map_pXmlKeyword[language] = pXmlKeyword; // added
				_map_funcCalltip[language] = funcCalltip; // added
				return false;
			}

			pNode = pNode->FirstChildElement(L"KeyWord");
			if (!pNode)
			{
				_map_pXmlKeyword[language] = pXmlKeyword; // added
				_map_funcCalltip[language] = funcCalltip; // added
				return false;
			}

			pXmlKeyword = reinterpret_cast<TiXmlElement*>(pNode);  //_pXmlKeyword = reinterpret_cast<TiXmlElement*>(pNode);
			_funcCompletionActive = true;
		}

		if (_funcCompletionActive) //try setting up environment
		{
			//setup defaults
			_ignoreCase = true;
			//deleted  //_funcCalltip._start = '(';
			//deleted  //_funcCalltip._stop = ')';
			//deleted  //_funcCalltip._param = ',';
			//deleted  //_funcCalltip._terminal = ';';
			//deleted  //_funcCalltip._ignoreCase = true;
			//deleted  //_funcCalltip._additionalWordChar.clear();

			TiXmlElement* pElem = pAutoNode->FirstChildElement(L"Environment");
			if (pElem)
			{
				const wchar_t* val = 0;
				val = pElem->Attribute(L"ignoreCase");
				if (val && !lstrcmp(val, L"no"))
				{
					_ignoreCase = false;
					funcCalltip->_ignoreCase = false;  //_funcCalltip._ignoreCase = false;
				}
				val = pElem->Attribute(L"startFunc");
				if (val && val[0])
					funcCalltip->_start = val[0];  //_funcCalltip._start = val[0];
				val = pElem->Attribute(L"stopFunc");
				if (val && val[0])
					funcCalltip->_stop = val[0];  //_funcCalltip._stop = val[0];
				val = pElem->Attribute(L"paramSeparator");
				if (val && val[0])
					funcCalltip->_param = val[0];  //_funcCalltip._param = val[0];
				val = pElem->Attribute(L"terminal");
				if (val && val[0])
					funcCalltip->_terminal = val[0];  //_funcCalltip._terminal = val[0];
				val = pElem->Attribute(L"additionalWordChar");
				if (val && val[0])
					funcCalltip->_additionalWordChar = val;  //_funcCalltip._additionalWordChar = val;

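				// Do not delete pElem here: it is owned by the TiXmlDocument tree, and deleting it would leave dangling pointers in its parent node.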
			}
		}

		if (_funcCompletionActive)
		{
			funcCalltip->setLanguageXML(pXmlKeyword);  //_funcCalltip.setLanguageXML(_pXmlKeyword);
		}
		else
		{
			funcCalltip->setLanguageXML(NULL);  //_funcCalltip.setLanguageXML(NULL);
		}

		_pXmlKeyword = pXmlKeyword; // added
		_funcCalltip = funcCalltip; // added
		_map_pXmlKeyword[language] = pXmlKeyword; // added
		_map_funcCalltip[language] = funcCalltip; // added
	}

	//_keyWords.clear();
	//_keyWordArray.clear();
	
	// ...
}

@donho
The above code has been thoroughly tested and works as expected. Since the code is extensive, it is possible that some parts were not fully copied (as I have made many other modifications). If you encounter any issues, please feel free to contact me promptly.
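
For readers skimming the paste above, the core idea of the `setLanguage` change is a per-language cache: the auto-completion XML is parsed once per language and reused on later tab switches. The following is only a minimal, self-contained sketch of that pattern, not the Notepad++ code. `LanguageData`, `LanguageDataCache` and `Loader` are hypothetical names introduced for illustration, and `std::unique_ptr` is used here instead of the raw pointers in the paste so that ownership is explicit. As in the paste, a failed load is cached too, so a missing file is not retried on every switch.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-language data that setLanguage() builds
// (the keyword list plus the call-tip settings).
struct LanguageData
{
	std::string apiFileName;
	bool ignoreCase = true;
};

// Caches one LanguageData per language id so the expensive load runs only once.
class LanguageDataCache
{
public:
	using Loader = std::function<std::unique_ptr<LanguageData>(int)>;

	explicit LanguageDataCache(Loader loader) : _loader(std::move(loader)) {}

	// Returns the cached entry, loading it on first use.
	// A failed load (nullptr) is stored as well, so it is not retried.
	const LanguageData* get(int language)
	{
		auto it = _cache.find(language);
		if (it == _cache.end())
			it = _cache.emplace(language, _loader(language)).first;
		return it->second.get();
	}

	std::size_t size() const { return _cache.size(); }

private:
	Loader _loader;
	std::map<int, std::unique_ptr<LanguageData>> _cache;
};
```

The loader is passed in as a callback so that the caching logic stays independent of how the XML file is located and parsed.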

Member Author

donho commented Dec 14, 2024

@softmgr
Is the above code based on master or on this PR?

@donho donho added the enhancement Proposed enhancements of existing features label Dec 14, 2024

softmgr commented Dec 14, 2024

@softmgr

Is the above code based on master or on this PR?

Based on master.

Member Author

donho commented Dec 14, 2024

@softmgr

Based on master.

So this PR is not re-usable?


softmgr commented Dec 14, 2024

@softmgr

Based on master.

So this PR is not re-usable?

As of now, the above code can be used both in the master branch and in pull requests. Simply copy and paste it into the corresponding function. Based on my personal testing, the performance for handling large files is already excellent. Whether it’s opening files, jumping to specific positions, switching tabs, moving the cursor, or typing characters, everything runs very smoothly and quickly.

Member Author

donho commented Dec 15, 2024

@softmgr
It's impossible for me to copy / paste / merge this large amount of code into the Notepad++ code base.

According to my personal analysis, the factors affecting the performance of large files are mainly the following three:

1. Syntax highlighting;
2. Excessive number of folded lines;
3. Auto-completion.

Let's do it one by one, starting with Syntax highlighting. Please open an issue with your code for improving Syntax highlighting performance. I will create a PR so you can review it. Once it's merged, then we move on to Excessive number of folded lines, then Auto-completion, with the same process.

Is it doable for you?


softmgr commented Dec 15, 2024

@softmgr

It's impossible for me to copy / paste / merge this large amount of code into the Notepad++ code base.

According to my personal analysis, the factors affecting the performance of large files are mainly the following three:

1. Syntax highlighting;
2. Excessive number of folded lines;
3. Auto-completion.

Let's do it one by one, starting with Syntax highlighting. Please open an issue with your code for improving Syntax highlighting performance. I will create a PR so you can review it. Once it's merged, then we move on to Excessive number of folded lines, then Auto-completion, with the same process.

Is it doable for you?

Sure, can you tell me how to do it?

Member Author

donho commented Dec 15, 2024

Sure, can you tell me how to do it?

Please open an issue with your modified code for improving Syntax highlighting performance. Only for improving Syntax highlighting performance, and nothing more.

And provide the scenario to reproduce the bad performance, so we can compare with the PR.


softmgr commented Dec 15, 2024

Sure, can you tell me how to do it?

Please open an issue with your modified code for improving Syntax highlighting performance. Only for improving Syntax highlighting performance, and nothing more.

And provide the scenario to reproduce the bad performance, so we can compare with the PR.

Here is the re-written feature request, with the complete solution provided in the attachments:
#15952

Member Author

donho commented Jan 5, 2025

@softmgr

Let's do it one by one, starting with Syntax highlighting. Please open an issue with your code for improving Syntax highlighting performance. I will create a PR so you can review it. Once it's merged, then we move on to Excessive number of folded lines

I do believe #15952 is completely done (latest improvement: #16021).
We can move on to Excessive number of folded lines if you want.


softmgr commented Jan 6, 2025

@softmgr

Let's do it one by one, starting with Syntax highlighting. Please open an issue with your code for improving Syntax highlighting performance. I will create a PR so you can review it. Once it's merged, then we move on to Excessive number of folded lines

I do believe #15952 is completely done (latest improvement: #16021).

We can move on to Excessive number of folded lines if you want.

I tested the latest master branch with a 200MB HTML file as a benchmark. After completely removing the clickable link feature, the loading time difference compared to disabling the setting in preferences is around 0.1-0.2 seconds. The impact is relatively small. However, personally, I still prefer to remove this feature since the context menu already provides the option to open links.

(In my own build, I have already added an "Open File" option to the context menu to handle opening links.)

Additionally, it seems a new bug has appeared in the latest development version: even after disabling the "Clickable Link Settings" option in preferences, links in the document remain clickable (the underline for link text does not disappear).

donho added a commit to donho/notepad-plus-plus that referenced this pull request Jan 7, 2025
Member Author

donho commented Jan 7, 2025

@softmgr

Additionally, it seems a new bug has appeared in the latest development version: even after disabling the "Clickable Link Settings" option in preferences, links in the document remain clickable (the underline for link text does not disappear).

The regression has been fixed by PR #16025

I tested the latest master branch with a 200MB HTML file as a benchmark. After completely removing the clickable link feature, the loading time difference compared to disabling the setting in preferences is around 0.1-0.2 seconds. The impact is relatively small. However, personally, I still prefer to remove this feature since the context menu already provides the option to open links.

Removing the clickable link feature completely should give exactly the same performance. OTOH, a lot of people use this feature, so I see no advantage in removing it.

donho added a commit that referenced this pull request Jan 7, 2025
The regression was introduced by commit 71be434

Ref: #15926 (comment)

Close #16025
@donho donho closed this Jan 9, 2025
@donho donho deleted the enhance_large_file_with_syntax_highlighting_performance branch January 9, 2025 10:39

softmgr commented Jan 13, 2025

@donho

I believe we can now start the second and third phases of optimization, namely:
2. Excessive number of folded lines;
3. Auto-completion.

Member Author

donho commented Jan 13, 2025

@softmgr Yes, sounds good.
Let's start with the simpler one (in your judgment).
Please create an issue with the zip of the modified master, just as you've done before, with the info for reproducing the bad (the current master) and good (your modified master) performance.


softmgr commented Jan 14, 2025

@softmgr Yes, sounds good. Let's start with the simpler one (in your judgment). Please create an issue with the zip of the modified master, just as you've done before, with the info for reproducing the bad (the current master) and good (your modified master) performance.

#16064

Labels

enhancement Proposed enhancements of existing features

Development

Successfully merging this pull request may close these issues.

[BUG] "Smart highlighting" leads to inefficiency in reading and writing large files

2 participants