14 changes: 14 additions & 0 deletions Lib/test/test_pyexpat.py
@@ -712,6 +712,20 @@ def test_change_size_2(self):
        parser.Parse(xml2, True)
        self.assertEqual(self.n, 4)

    @support.requires_resource('cpu')
    @support.requires_resource('walltime')
    @support.bigmemtest(size=2**31, memuse=4, dry_run=False)
    def test_large_character_data_no_buffer_overflow(self):
        # See https://github.com/python/cpython/issues/148441
        parser = expat.ParserCreate()
        parser.buffer_text = True
        parser.buffer_size = 2**31 - 1  # INT_MAX
        N = 2049 * (1 << 20) - 3  # Character data greater than INT_MAX
        self.assertGreater(N, parser.buffer_size)
        parser.CharacterDataHandler = lambda text: None
        xml_data = b"<r>" + b"A" * N + b"</r>"
        self.assertEqual(parser.Parse(xml_data, True), 1)

class ElementDeclHandlerTest(unittest.TestCase):
    def test_trigger_leak(self):
        # Unfixed, this test would leak the memory of the so-called
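
For readers checking the sizes used in the new test, the arithmetic can be reproduced as below. This is only a sketch: the names `INT_MAX`, `xml_size`, and `estimated_bytes` are introduced here for illustration, and the memory estimate assumes that `bigmemtest` budgets roughly `size * memuse` bytes.

```python
# Reproduce the sizes used by test_large_character_data_no_buffer_overflow.
INT_MAX = 2**31 - 1                 # value assigned to parser.buffer_size

N = 2049 * (1 << 20) - 3            # number of b"A" bytes of character data
xml_size = len(b"<r>") + N + len(b"</r>")   # total document length in bytes

# Assumption: bigmemtest budgets about size * memuse bytes for the test.
estimated_bytes = 2**31 * 4

print(N > INT_MAX)                  # the character data exceeds INT_MAX
print(N - INT_MAX)                  # by a little under 1 MiB
print(xml_size)
print(estimated_bytes // 2**30)     # estimate expressed in GiB
```

The character data is therefore only slightly larger than the configured buffer, which is enough to push the running buffer offset past what a C `int` can hold.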
@@ -0,0 +1,4 @@
:mod:`xml.parsers.expat`: Fix a heap buffer overflow in
:meth:`~xml.parsers.expat.xmlparser.CharacterDataHandler`
when the character data size exceeds the parser's
:attr:`buffer size <xml.parsers.expat.xmlparser.buffer_size>`.
2 changes: 1 addition & 1 deletion Modules/pyexpat.c
@@ -393,7 +393,7 @@ my_CharacterDataHandler(void *userData, const XML_Char *data, int len)
     if (self->buffer == NULL)
         call_character_handler(self, data, len);
     else {
-        if ((self->buffer_used + len) > self->buffer_size) {
+        if (len > (self->buffer_size - self->buffer_used)) {
Contributor

This looks like a fix to an integer overflow rather than a buffer overflow. Am I missing something?

Member

The cause is indeed an integer overflow, but I think the ASAN report says "heap-buffer overflow" in this case (so the result is a buffer overflow at the end). I'll just write "a crash" (people don't really care about the cause/exact result: if it crashes, it's bad; they can read the issue for more details).
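
To illustrate the distinction drawn above, here is a minimal Python sketch of the two comparisons from the diff. It assumes that the C `int` addition wraps around in two's complement (signed overflow is formally undefined behavior in C, so this is only the typical outcome), and the helpers `wrap_int32`, `needs_flush_old`, and `needs_flush_new` are hypothetical names that do not appear in `pyexpat.c`.

```python
INT_MAX = 2**31 - 1


def wrap_int32(value):
    """Simulate two's-complement wraparound of a 32-bit signed int."""
    return (value + 2**31) % 2**32 - 2**31


def needs_flush_old(buffer_used, length, buffer_size):
    # Old check: the sum is formed first and may wrap to a negative value.
    return wrap_int32(buffer_used + length) > buffer_size


def needs_flush_new(buffer_used, length, buffer_size):
    # New check: with 0 <= buffer_used <= buffer_size the difference
    # stays within [0, INT_MAX], so nothing can overflow.
    return length > buffer_size - buffer_used


# A nearly full INT_MAX-sized buffer receiving 100 more bytes.
buffer_size = INT_MAX
buffer_used = INT_MAX - 10
length = 100

print(needs_flush_old(buffer_used, length, buffer_size))  # False: flush skipped
print(needs_flush_new(buffer_used, length, buffer_size))  # True: flush happens
```

Under these assumptions, the old comparison skips the flush because the wrapped sum is negative, so the incoming data would then be copied past the end of the buffer. That is how an integer overflow can surface as a heap-buffer-overflow report.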

Member

I've updated the title to reflect what the PR did but I'll stay vague in the NEWS.

             if (flush_character_buffer(self) < 0)
                 return;
             /* handler might have changed; drop the rest on the floor