fix: extract Docling async markdown result #3031

Open
he-yufeng wants to merge 1 commit into HKUDS:dev from he-yufeng:fix/docling-markdown-extraction

@he-yufeng
Contributor

Summary

  • align the Docling async parser defaults with docling-serve v1: port 5001 examples, files upload field, task_status polling, and /v1/status/poll/{task_id}
  • add a result URL template fallback for services that return only task status while exposing results at /v1/result/{task_id}
  • extract document.md_content from the Docling result JSON instead of returning the whole response envelope as raw document text
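
The last two bullets can be illustrated with a minimal sketch. It is not the code in `lightrag/pipeline.py`: the helper names `build_result_url` and `extract_markdown`, the default template argument, and the error handling are assumptions made for illustration. Only the `/v1/result/{task_id}` path and the `document.md_content` field come from the summary above.

```python
from typing import Any, Mapping


def build_result_url(
    base_url: str,
    task_id: str,
    template: str = "/v1/result/{task_id}",  # path named in the summary
) -> str:
    """Hypothetical helper: derive the result URL when the status
    response carries only a task status and no result link."""
    return base_url.rstrip("/") + template.format(task_id=task_id)


def extract_markdown(result: Mapping[str, Any]) -> str:
    """Hypothetical helper: return document.md_content instead of the
    whole response envelope."""
    document = result.get("document")
    if not isinstance(document, Mapping):
        raise ValueError("Docling result has no 'document' object")
    markdown = document.get("md_content")
    if not isinstance(markdown, str) or not markdown.strip():
        raise ValueError("Docling result has no non-empty 'document.md_content'")
    return markdown


if __name__ == "__main__":
    # Illustrative payload; a real response carries additional fields.
    payload = {"document": {"md_content": "# Title\n\nBody text."}, "status": "success"}
    print(build_result_url("http://localhost:5001/", "abc123"))
    print(extract_markdown(payload))
```

Raising on a missing or empty `md_content` is one possible design choice; it keeps an empty service result from being passed on silently as document text.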

Verified locally

  • python -m pytest tests\test_pipeline_release_closure.py -q -k "docling or mineru_empty_service_result"
  • python -m ruff check lightrag\pipeline.py tests\test_pipeline_release_closure.py
  • python -m ruff format --check lightrag\pipeline.py
  • python -m py_compile lightrag\pipeline.py tests\test_pipeline_release_closure.py
  • git diff --check

Note: running the full tests/test_pipeline_release_closure.py file on Windows still hits two existing path-separator assertions unrelated to this change.

Closes #2996
