Merged
13 changes: 7 additions & 6 deletions docs/guides/chat.mdx
@@ -102,23 +102,24 @@ gaia chat --query "Hello" --show-stats
## Document Q&A (RAG)

<Note>
-RAG (Retrieval-Augmented Generation) enables chatting with PDF documents using semantic search and context retrieval.
+RAG (Retrieval-Augmented Generation) enables chatting with PDF and PowerPoint (.pptx) documents using semantic search and context retrieval.
</Note>

### CLI with RAG

<Tabs>
<Tab title="Single Document">
```bash
-# Chat with single document
+# Chat with a PDF or PowerPoint document
gaia chat --index manual.pdf
+gaia chat --index slides.pptx
```
</Tab>

<Tab title="Multiple Documents">
```bash
-# Chat with multiple documents
-gaia chat --index doc1.pdf doc2.pdf doc3.pdf
+# Chat with multiple documents (PDF and PPTX supported)
+gaia chat --index doc1.pdf doc2.pdf slides.pptx
```
</Tab>

@@ -131,7 +132,7 @@ RAG (Retrieval-Augmented Generation) enables chatting with PDF documents using s

<Tab title="Watch Folder">
```bash
-# Auto-index every PDF in a folder, and any new ones dropped in later
+# Auto-index every PDF/PPTX in a folder, and any new ones dropped in later
gaia chat --watch ./docs
```
</Tab>
@@ -152,7 +153,7 @@ RAG (Retrieval-Augmented Generation) enables chatting with PDF documents using s
</Tabs>

<Note>
-**PDF Indexing Requirements:** Processing PDFs with images requires a Vision Language Model (VLM). GAIA uses `Qwen3-VL-4B-Instruct-GGUF` by default for extracting text from images in PDFs.
+**Document Indexing Requirements:** Processing PDFs and PPTX files with images requires a Vision Language Model (VLM). GAIA uses `Qwen3-VL-4B-Instruct-GGUF` by default for extracting text from images in documents.

To download all models needed for chat (including VLM):
```bash
2 changes: 1 addition & 1 deletion docs/sdk/sdks/rag.mdx
@@ -62,7 +62,7 @@ This happens once when you index documents:
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'secondaryColor':'#2d2d2d', 'tertiaryColor':'#f5f5f5', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
-A(["PDF Document"]) --> B(["EXTRACT TEXT"])
+A(["PDF / PPTX Document"]) --> B(["EXTRACT TEXT"])
B --> C(["SPLIT INTO CHUNKS"])
C --> D(["GENERATE EMBEDDINGS"])
D --> E[("STORE IN FAISS")]
2 changes: 2 additions & 0 deletions setup.py
@@ -155,6 +155,7 @@
"numpy>=1.24.0",
"pymupdf>=1.24.0",
"pypdf",
+"python-pptx>=0.6.21",
"sentence-transformers",
"safetensors",
# torch is pinned lower-bound only. The "audio" extra caps
@@ -229,6 +230,7 @@
"numpy>=1.24.0",
"pymupdf>=1.24.0",
"pypdf",
+"python-pptx>=0.6.21",
"sentence-transformers",
],
"lint": [
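
Note on the "EXTRACT TEXT" step for PPTX: the diff adds `python-pptx` as a dependency but does not show the extraction code itself, so the following is only a rough sketch of what slide text extraction involves, not GAIA's implementation. It deliberately avoids `python-pptx` and uses the standard library alone, relying on the fact that a `.pptx` file is a ZIP archive whose slides live at `ppt/slides/slideN.xml` with text runs stored in DrawingML `<a:t>` elements. The helper name `extract_pptx_text` is hypothetical, and it does not handle text inside images, which the docs above route through the VLM.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs on a slide are stored in <a:t> elements.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
# Matches slide parts only, not their relationship files under _rels/.
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml")


def extract_pptx_text(source):
    """Return one string of text per slide (hypothetical helper, not GAIA API).

    `source` is a filesystem path or a binary file-like object.
    Slides are ordered by the number in their file name, which usually,
    but not always, matches the order shown in the presentation.
    """
    slides = []
    with zipfile.ZipFile(source) as archive:
        numbered = []
        for name in archive.namelist():
            match = SLIDE_RE.fullmatch(name)
            if match:
                numbered.append((int(match.group(1)), name))
        for _, name in sorted(numbered):
            root = ET.fromstring(archive.read(name))
            # Collect every non-empty text run on the slide.
            runs = [node.text for node in root.iter(A_NS + "t") if node.text]
            slides.append(" ".join(runs))
    return slides
```

Each returned string could then feed the same chunking and embedding steps as PDF text. Sorting on the parsed integer rather than the file name keeps `slide10.xml` after `slide2.xml`.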