Skip to content

gh-148441: Avoid integer overflow in Expat's CharacterDataHandler#148904

Open
ByteFlowing1337 wants to merge 11 commits intopython:mainfrom
ByteFlowing1337:fix-148441
Open

gh-148441: Avoid integer overflow in Expat's CharacterDataHandler#148904
ByteFlowing1337 wants to merge 11 commits intopython:mainfrom
ByteFlowing1337:fix-148441

Conversation

@ByteFlowing1337
Copy link
Copy Markdown
Contributor

@ByteFlowing1337 ByteFlowing1337 commented Apr 23, 2026

Fix the buffer overflow issue in #148441 , which will cause a core dump.

Comment thread Lib/test/test_pyexpat.py Outdated
Comment thread Lib/test/test_pyexpat.py Outdated
Comment thread Lib/test/test_pyexpat.py Outdated
Comment thread Misc/NEWS.d/next/Library/2026-04-23-12-50-15.gh-issue-148441.zvpCkR.rst Outdated
ByteFlowing1337 and others added 3 commits April 25, 2026 20:38
…vpCkR.rst

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
@ByteFlowing1337
Copy link
Copy Markdown
Contributor Author

Done.

@ByteFlowing1337
Copy link
Copy Markdown
Contributor Author

@picnixz Thanks for the review. Please let me know if you have any further suggestions.

@picnixz
Copy link
Copy Markdown
Member

picnixz commented Apr 25, 2026

The EMScripten tests fail. I suspect that we don't have enough memory on those machines. So you need to also ensure that you have enough memory (I don't remembner the resourdces to request, something likie big_memory)

@ByteFlowing1337
Copy link
Copy Markdown
Contributor Author

Now all CI tests pass!

@picnixz
Copy link
Copy Markdown
Member

picnixz commented Apr 26, 2026

Thanks, I'll merge it in a few days to let others have a look if they want to. Until then, you don't need to update your branch (in general, don't update your branch unless main has changed in a non-compatible way (e.g., new CI failures)).

Comment thread Modules/pyexpat.c
call_character_handler(self, data, len);
else {
if ((self->buffer_used + len) > self->buffer_size) {
if (len > (self->buffer_size - self->buffer_used)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a fix to an integer overflow rather than a buffer overflow. Am missing something?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cause is indeed an integer overflow, but I think the ASAN report says "heap-buffer overflow" in this case (so the result is a buffer overflow at the end). I'll just write "a crash" (people don't really care about the cause/exact result: if it crashes, it's bad; they can read the issue for more details).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the title to reflect what the PR did but I'll stay vague in the NEWS.

@picnixz picnixz changed the title gh-148441: Fix heap buffer overflow in pyexpat CharacterDataHandler gh-148441: Fix crash in Expat's CharacterDataHandler with large character data Apr 26, 2026
Comment thread Misc/NEWS.d/next/Library/2026-04-23-12-50-15.gh-issue-148441.zvpCkR.rst Outdated
Comment thread Lib/test/test_pyexpat.py Outdated
@picnixz picnixz changed the title gh-148441: Fix crash in Expat's CharacterDataHandler with large character data gh-148441: Avoid integer overflow in Expat's CharacterDataHandler Apr 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants