
[SPARK-57271][PYTHON] Add capability to print locals in traceback for Python UDF #56336

Open
gaogaotiantian wants to merge 1 commit into apache:master from gaogaotiantian:exception-with-locals

Conversation

@gaogaotiantian
Contributor

What changes were proposed in this pull request?

This PR introduces a new configuration, spark.sql.execution.pyspark.udf.tracebackWithLocals.enabled. When it is enabled, Python UDF tracebacks include the local variable values of each frame, for example:

pyspark.errors.exceptions.captured.PythonException: An exception was thrown from the Python worker:
Traceback (most recent call last):
  File "/Users/tian.gao/programs/spark/python/example.py", line 10, in f
    return g(y)
           ^^^^
    x = 1
    y = 0
  File "/Users/tian.gao/programs/spark/python/example.py", line 6, in g
    return 1 / y
           ~~^~~
    y = 0
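
For readers who want to see this style of output without Spark, here is a minimal standalone sketch. The per-frame `name = value` lines above resemble what the standard library's `traceback.TracebackException` produces with `capture_locals=True`; whether this PR uses that mechanism internally is an assumption, not something stated in the description. The helper names `format_with_locals` and `demo` are hypothetical, and `f` and `g` are only modeled on the frames shown in the traceback.

```python
import traceback


def g(y):
    # Fails with ZeroDivisionError when y == 0, like frame g above.
    return 1 / y


def f(x):
    y = x - 1
    return g(y)


def format_with_locals(exc):
    # Hypothetical helper: render the exception with each frame's locals.
    # capture_locals=True is a documented stdlib option; it stores the
    # repr() of every local variable for each frame in the traceback.
    te = traceback.TracebackException.from_exception(exc, capture_locals=True)
    return "".join(te.format())


def demo():
    try:
        f(1)
    except ZeroDivisionError as exc:
        return format_with_locals(exc)


report = demo()
print(report)
```

Since the locals are rendered with `repr()`, large objects or sensitive values can end up in the error message, which is one plausible reason to keep such a feature opt-in.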

Why are the changes needed?

This is very helpful for debugging UDFs: the traceback includes the local variables of each frame, so users can see the state at the crash site. The configuration is off by default, so the existing behavior is unchanged.

Does this PR introduce any user-facing change?

Yes. An opt-in configuration enables extra information in the UDF traceback.

How was this patch tested?

Some local manual tests were done. Two test cases were added. Pending CI.

Was this patch authored or co-authored using generative AI tooling?

Tests were generated by Claude Code (Opus 4.8).
