4 changes: 2 additions & 2 deletions .claude/skills/lemonade-client-patterns.md
@@ -6,7 +6,7 @@ description: Patterns, gotchas, and conventions for threading changes through Le
# Lemonade Client Patterns

## Context
`src/gaia/llm/lemonade_client.py` is GAIA's primary Lemonade HTTP client (~3900 lines). Changes to its interface ripple through `LemonadeProvider`, `VLMClient`, Agent UI routers, `_chat_helpers.py`, `server.py`, and `agents/base/agent.py`. The test suite mixes `responses` (for `requests`-based calls) and `mocker.patch` on `httpx` (for async calls).
`src/gaia/llm/lemonade_client.py` is GAIA's primary Lemonade HTTP client (~4000 lines). Changes to its interface ripple through `LemonadeProvider`, `VLMClient`, Agent UI routers, `_chat_helpers.py`, `server.py`, and `agents/base/agent.py`. The test suite mixes `responses` (for `requests`-based calls) and `mocker.patch` on `httpx` (for async calls).

## Key Patterns

@@ -33,7 +33,7 @@ Misconfigured reverse proxies can reflect the `Authorization` header back in a 4

## Important Files & Locations

- `src/gaia/llm/lemonade_client.py` — Primary client (~3900 lines); `_send_request` is the central chokepoint but 4 bypass sites exist
- `src/gaia/llm/lemonade_client.py` — Primary client (~4000 lines); `_send_request` is the central chokepoint but 4 bypass sites exist
- `src/gaia/llm/providers/lemonade.py` — `LemonadeProvider.__init__` uses `backend_kwargs` dict to forward to `LemonadeClient`; add new params with `if param is not None: backend_kwargs["param"] = param`
- `src/gaia/llm/vlm_client.py` — `VLMClient.__init__` uses deferred import of `LemonadeClient`
- `src/gaia/ui/routers/system.py` — already imports `DEFAULT_CONTEXT_SIZE` from `lemonade_client` (established cross-package import precedent)
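
The `backend_kwargs` forwarding convention noted for `LemonadeProvider.__init__` can be sketched as follows. This is a minimal illustration, not the real GAIA code: `LemonadeClientStub`, `ProviderSketch`, and the `api_key` parameter are hypothetical stand-ins, and only the `if param is not None` pattern comes from the notes above.

```python
from typing import Any, Dict, Optional


class LemonadeClientStub:
    """Hypothetical stand-in for LemonadeClient; records the kwargs it received."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class ProviderSketch:
    """Illustrative provider that forwards only explicitly set parameters."""

    def __init__(
        self,
        base_url: Optional[str] = None,
        api_key: Optional[str] = None,  # hypothetical new parameter
    ) -> None:
        backend_kwargs: Dict[str, Any] = {}
        # Forward a value only when the caller supplied one, so the
        # client's own defaults stay in effect otherwise.
        if base_url is not None:
            backend_kwargs["base_url"] = base_url
        if api_key is not None:
            backend_kwargs["api_key"] = api_key
        self.client = LemonadeClientStub(**backend_kwargs)
```

Skipping `None` values means a new provider parameter does not override whatever default the client defines for itself.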
2 changes: 0 additions & 2 deletions AGENTS.md
@@ -52,8 +52,6 @@ Before starting implementation on a `consumer-critical` issue, confirm it has th
### Stop-the-line (foundational PRs in flight)
When a foundational PR (large, cross-cutting, with many downstream dependencies) is in flight, it carries the `stop-the-line` label and pins a comment listing **frozen file paths**. No PRs may merge changes to those paths until the stop-the-line PR lands.
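
A pre-merge check for the frozen-path rule might look like the sketch below. It assumes the frozen paths are written as shell-style glob patterns, which the text above does not specify; the function name and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def frozen_conflicts(changed: Iterable[str], frozen: Iterable[str]) -> List[str]:
    """Return the changed paths that match any frozen glob pattern."""
    patterns = list(frozen)
    return sorted(
        path
        for path in changed
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    )


# Hypothetical example: one changed file falls under a frozen tree.
conflicts = frozen_conflicts(
    ["src/gaia/memory/store.py", "docs/index.mdx"],
    ["src/gaia/memory/*"],
)
```

An empty result would mean the PR touches no frozen path and is not blocked by the stop-the-line PR.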

**Currently active:** PR #606 (memory v2). See pinned comment for frozen paths.

### Parallel agent work
- **Parallelize when** file trees are disjoint, no architectural decision is shared, no sequential dependency exists
- **Serialize when** the same file tree is touched, when a design-system pattern needs to be pinned first, when one PR's output is the next PR's input
79 changes: 53 additions & 26 deletions CLAUDE.md
@@ -386,21 +386,31 @@ gaia/
│ │ └── _shared/ # Shared assets for apps
│ ├── audio/ # Audio processing (Whisper ASR, Kokoro TTS)
│ ├── chat/ # Agent SDK (AgentSDK class, prompts, app entry)
│ ├── code_index/ # Semantic code search (CodeIndexSDK)
│ ├── connectors/ # OAuth-bound external API access (OAuth2 PKCE, keychain token storage, per-agent grants)
│ ├── database/ # DatabaseMixin and DatabaseAgent
│ ├── electron/ # Electron app integration
│ ├── eval/ # Evaluation framework
│ ├── filesystem/ # Filesystem indexing/categorization (FileSystemIndexService)
│ ├── governance/ # Optional action-level governance layer (GovernedAgentMixin)
│ ├── img/ # Shared image assets
│ ├── installer/ # Install/init commands (gaia init, lemonade installer)
│ ├── llm/ # LLM backend clients (Lemonade, Claude, OpenAI) + providers/
│ ├── mcp/ # Model Context Protocol servers/clients
│ ├── messaging/ # Messaging adapters (Telegram) + media→VLM/RAG ingest
│ ├── rag/ # Document retrieval (RAG)
│ ├── scratchpad/ # SQLite scratchpad service for structured data analysis
│ ├── sd/ # Stable Diffusion tool mixin (SDToolsMixin)
│ ├── shell/ # Shell integration
│ ├── talk/ # Voice interaction SDK
│ ├── testing/ # Test utilities and fixtures
│ ├── ui/ # Agent UI backend (FastAPI server, routers, SSE, database)
│ ├── utils/ # Utility modules (FileWatcher, parsing)
│ ├── vlm/ # Vision LLM tool mixin (VLMToolsMixin, structured extraction)
│ ├── web/ # Web client utilities (WebClient, SSRF-guarded fetch)
│ ├── device.py # Device compatibility detection (Strix Halo / Radeon VRAM)
│ ├── perf_analysis.py # Perf-log parsing/plotting
│ ├── security.py # Path validation, allow-lists, write guardrails, audit logging
│ └── cli.py # Main CLI entry point (all `gaia <command>` subparsers)
├── tests/ # Test suite
│ ├── unit/ # Unit tests
@@ -454,15 +464,20 @@ Defined in [`setup.py`](setup.py) under `console_scripts`:

| Agent | Location | Description | Default Model |
|-------|----------|-------------|---------------|
| **ChatAgent** | `agents/chat/agent.py` | Document Q&A with RAG | Qwen3.5-35B |
| **CodeAgent** | `agents/code/agent.py` | Code generation with orchestration | Qwen3.5-35B |
| **BuilderAgent** | `agents/builder/agent.py` | Scaffolds new agents from templates | Qwen3.5-35B |
| **SummarizeAgent** | `agents/summarize/agent.py` | Document/text summarization | Qwen3.5-35B |
| **JiraAgent** | `agents/jira/agent.py` | Jira issue management | Qwen3.5-35B |
| **BlenderAgent** | `agents/blender/agent.py` | 3D scene automation | Qwen3.5-35B |
| **DockerAgent** | `agents/docker/agent.py` | Container management | Qwen3.5-35B |
| **MedicalIntakeAgent** | `agents/emr/agent.py` | Medical form processing | Qwen3-VL-4B (VLM) |
| **RoutingAgent** | `agents/routing/agent.py` | Intelligent agent selection | Qwen3.5-35B |
| **ChatAgent** | `agents/chat/agent.py` | Chat with RAG, file search, shell | Gemma-4-E4B-it-GGUF (global default) |
| **CodeAgent** | `agents/code/agent.py` | Code generation with orchestration | Qwen3.5-35B-A3B-GGUF |
| **DocumentQAAgent** | `agents/docqa/agent.py` | Document Q&A with RAG | Qwen3.5-35B-A3B-GGUF |
| **BuilderAgent** | `agents/builder/agent.py` | Scaffolds new agents from templates | Qwen3.5-35B-A3B-GGUF |
| **SummarizeAgent** | `agents/summarize/agent.py` | Document/text summarization | Qwen3-4B-Instruct-2507-GGUF |
| **AnalystAgent** | `agents/analyst/agent.py` | Structured data analysis (scratchpad tables) | Qwen3.5-35B-A3B-GGUF (base default) |
| **BrowserAgent** | `agents/browser/agent.py` | Web research (search, fetch, download) | Qwen3.5-35B-A3B-GGUF (base default) |
| **EmailTriageAgent** | `agents/email/agent.py` | Gmail triage, organize, reply | Gemma-4-E4B-it-GGUF (global default) |
| **FileIOAgent** | `agents/fileio/agent.py` | File read/write/edit operations | Qwen3.5-35B-A3B-GGUF (base default) |
| **JiraAgent** | `agents/jira/agent.py` | Jira issue management | Qwen3.5-35B-A3B-GGUF |
| **BlenderAgent** | `agents/blender/agent.py` | 3D scene automation | Qwen3.5-35B-A3B-GGUF (base default) |
| **DockerAgent** | `agents/docker/agent.py` | Container management | Qwen3.5-35B-A3B-GGUF |
| **MedicalIntakeAgent** | `agents/emr/agent.py` | Medical form processing (VLM) | Gemma-4-E4B-it-GGUF |
| **RoutingAgent** | `agents/routing/agent.py` | Intelligent agent selection | Qwen3.5-35B-A3B-GGUF |
| **SDAgent** | `agents/sd/agent.py` | Stable Diffusion image generation | SDXL-Turbo |

### Agent Registry & Tool Mixins
@@ -472,58 +487,68 @@ New agents are Python classes inheriting from `Agent` (see [`src/gaia/agents/bas
| Tool name | Mixin | Purpose |
|-----------|-------|---------|
| `rag` | `gaia.agents.chat.tools.rag_tools.RAGToolsMixin` | Document retrieval |
| `code_index` | `gaia.agents.code_index.tools.mixin.CodeIndexToolsMixin` | Semantic code search |
| `file_search` | `gaia.agents.tools.file_tools.FileSearchToolsMixin` | Fuzzy/glob file search |
| `file_io` | `gaia.agents.code.tools.file_io.FileIOToolsMixin` | Read/write/edit files |
| `shell` | `gaia.agents.chat.tools.shell_tools.ShellToolsMixin` | Sandboxed shell commands |
| `screenshot` | `gaia.agents.tools.screenshot_tools.ScreenshotToolsMixin` | Screen capture |
| `filesystem` | `gaia.agents.tools.filesystem_tools.FileSystemToolsMixin` | Filesystem indexing/categorization |
| `scratchpad` | `gaia.agents.tools.scratchpad_tools.ScratchpadToolsMixin` | SQLite scratchpad for data analysis |
| `browser` | `gaia.agents.tools.browser_tools.BrowserToolsMixin` | Web search/fetch/download |
| `sd` | `gaia.sd.mixin.SDToolsMixin` | Stable Diffusion image generation |
| `vlm` | `gaia.vlm.mixin.VLMToolsMixin` | Vision LLM / structured extraction |

When adding a new tool mixin, register it in `KNOWN_TOOLS` so other agents can compose it by name.
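
Composing by name can be pictured as a lookup from tool name to a dotted class path, as in the table above. The sketch below is an assumption about the shape of such a registry, not the actual `KNOWN_TOOLS` implementation: `KNOWN_TOOLS_SKETCH` and `resolve_tool` are hypothetical, and standard-library classes stand in for the mixin paths so the example runs on its own.

```python
import importlib
from typing import Dict

# Hypothetical registry: tool name -> dotted path of the class to load.
# Standard-library classes stand in for real mixin paths.
KNOWN_TOOLS_SKETCH: Dict[str, str] = {
    "counter": "collections.Counter",
    "ordered": "collections.OrderedDict",
}


def resolve_tool(name: str) -> type:
    """Import and return the class registered under `name`."""
    if name not in KNOWN_TOOLS_SKETCH:
        raise KeyError(
            f"Unknown tool {name!r}; known tools: {sorted(KNOWN_TOOLS_SKETCH)}"
        )
    module_path, _, class_name = KNOWN_TOOLS_SKETCH[name].rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Storing dotted paths rather than class objects would defer each import until the tool is requested, though whether GAIA does this is not stated here.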

### Default Models
- General tasks: `Qwen3-0.6B-GGUF`
- Code/Agents: `Qwen3.5-35B-A3B-GGUF`
- Vision tasks: `Qwen3-VL-4B-Instruct-GGUF`
- Global default: `Gemma-4-E4B-it-GGUF` (`DEFAULT_MODEL_NAME` in [`src/gaia/llm/lemonade_client.py`](src/gaia/llm/lemonade_client.py))
- Code/Agents: `Qwen3.5-35B-A3B-GGUF` (CodeAgent, DocumentQAAgent, Jira, Docker, and the base-Agent fallback when `model_id` is unset)
- Vision tasks: `Qwen3-VL-4B-Instruct-GGUF` is a supported vision model, but the EMR/MedicalIntakeAgent's default VLM is currently `Gemma-4-E4B-it-GGUF`

## CLI Commands

All commands are registered in [`src/gaia/cli.py`](src/gaia/cli.py). Run `gaia -h` for the authoritative list.

**Agents & chat:**
- `gaia chat` - Interactive chat with RAG
- `gaia chat` - Interactive chat with RAG, file search, and shell execution
- `gaia chat --ui` - Launch Agent UI (browser-based, requires `[ui]` extras)
- `gaia chat --ui --ui-port 8080` - Agent UI on custom port
- `gaia browse` - Web research with search, page fetch, and download tools
- `gaia analyze` - Structured data analysis with scratchpad tables
- `gaia talk` - Voice interaction
- `gaia prompt "<text>"` - Single prompt to LLM (with system-prompt support)
- `gaia llm "<text>"` - Simple LLM queries
- `gaia summarize` - Document summarization
- `gaia summarize` - Document/transcript/email summarization
- `gaia blender` - Blender 3D agent
- `gaia sd` - Stable Diffusion image generation
- `gaia jira` - Jira integration
- `gaia jira` - Jira / Atlassian integration
- `gaia email` - Email Triage Agent (Gmail, requires Google connector)
- `gaia docker` - Docker management

**Servers & infrastructure:**
- `gaia api` - OpenAI-compatible API server
- `gaia mcp {start|stop|status|test|agent|docker|serve|add|list|remove|tools|test-client}` - MCP bridge
- `gaia telegram {start|stop|status}` - Telegram messaging adapter
- `gaia connectors` - Manage external connectors (OAuth, MCP servers) and per-agent grants
- `gaia cache {status|clear}` - Cache management
- `gaia memory` - Manage agent memory (bootstrap onboarding, view status)

**Setup & utilities:**
- `gaia init` - Setup Lemonade Server and download models
- `gaia install` - Install helper (e.g. Lemonade on first run)
- `gaia download` - Download a model
- `gaia install` - Install GAIA components (e.g. Lemonade on first run)
- `gaia uninstall` - Uninstall GAIA components (tiered cleanup of `~/.gaia` and caches)
- `gaia download` - Download all models required for GAIA agents
- `gaia agent` - Manage custom agents (export/import bundles)
- `gaia kill` - Kill stray GAIA / Lemonade processes
- `gaia diagnostics` - Bundle logs and system info into a tarball for bug reports
- `gaia stats` - Show GAIA statistics from the most recent run
- `gaia test` - Smoke tests
- `gaia yt` - YouTube transcript ingest
- `gaia template` - Scaffold agent templates
- `gaia youtube` - YouTube transcript utilities

**Evaluation & analysis** (see [`docs/reference/eval.mdx`](docs/reference/eval.mdx)):
- `gaia eval {fix-code|agent}` - Run evaluation harness
- `gaia gt` - Generate ground truth
- `gaia generate` - Dataset/response generation
- `gaia batch-exp` - Batch experiments
- `gaia report` - Render eval reports
- `gaia visualize` / `gaia perf-vis` - Visualize results
- `gaia eval agent` - Run agent eval benchmark scenarios
- `gaia report` - Generate summary report from an evaluation results directory
- `gaia perf-vis` - Visualize llama.cpp performance metrics from log files

**Standalone binaries** (separate `console_scripts`, not subcommands):
- `gaia-code` - CodeAgent entry (`src/gaia/agents/code/cli.py`)
@@ -593,7 +618,7 @@ The roadmap is at [`docs/roadmap.mdx`](docs/roadmap.mdx) ([live site](https://am
- [`docs/plans/docker-containers.mdx`](docs/plans/docker-containers.mdx) - Docker deployment

**Key architectural decisions (April 2026):**
- ChatAgent renamed to **GaiaAgent** in v0.20.0 (#696)
- ChatAgent → **GaiaAgent** rename **planned** for v0.20.0 (#696); not yet in code (class is still `ChatAgent`)
- Voice-first is P0 enabling technology (#702)
- No context compaction — memory + RAG handles long conversations
- Configuration dashboard + Observability dashboard as separate Agent UI panels
@@ -835,3 +860,5 @@ When a task fits a Superpowers skill (e.g. `superpowers:brainstorming`, `superpo
**Read these before starting related tasks:**

- `.claude/skills/lemonade-client-patterns.md` - Patterns, gotchas, and conventions for modifying LemonadeClient and threading changes through its callers (providers, VLM, UI routers, agent base). Covers: deferred import patch targets, assertLogs child logger levels, SSE test hang prevention, 401 error safety, openai.AuthenticationError ordering. (tags: lemonade, authentication, env-vars, testing, httpx, openai-sdk, tdd)

Directory-based skills also live under `.claude/skills/` (e.g. `.claude/skills/gaia-release/` — invoked via the `gaia-release` Skill).
3 changes: 2 additions & 1 deletion docs/docs.json
@@ -90,7 +90,8 @@
"connectors/index",
"connectors/google",
"connectors/github",
"security/connections"
"security/connections",
"security/connectors"
]
},
{
4 changes: 2 additions & 2 deletions docs/guides/agent-ui.mdx
@@ -317,7 +317,7 @@ The agent supports the **Model Context Protocol** in both directions — connect

Then restart `gaia chat --ui`.

**pip/PyPI installs (without gaia-ui):** Use the [npm install path](#install-and-launch) — the pip package does not include frontend source files.
**pip/PyPI installs (without gaia-ui):** Use the [npm install path](#install) — the pip package does not include frontend source files.
</Accordion>

<Accordion title="LLM response times out or fails">
@@ -356,7 +356,7 @@ The agent supports the **Model Context Protocol** in both directions — connect
flowchart TD
A(["Agent UI (Browser or Electron)"]) --> B(["FastAPI Backend · port 4200"])
B --> C(["GAIA Core SDKs"])
C --> D(["Lemonade Server · port 8000"])
C --> D(["Lemonade Server · port 13305"])

B -.- E(["REST API + SSE Streaming"])
B -.- F(["SQLite Database"])
2 changes: 1 addition & 1 deletion docs/guides/code.mdx
@@ -11,7 +11,7 @@ icon: "code"
<Badge text="development" color="orange" />

<Note>
The Code Agent now focuses on generating full-stack TypeScript web applications (Next.js + Prisma + Tailwind). Python code generation is no longer supported.
The Code Agent is optimized for full-stack TypeScript web apps (Next.js + Prisma + Tailwind), but Python code generation remains fully supported (and is the default for non-TypeScript requests).
</Note>

<Info>
12 changes: 6 additions & 6 deletions docs/guides/emr.mdx
@@ -41,7 +41,7 @@ flowchart TD
linkStyle 0,1,2,3 stroke:#ED1C24,stroke-width:2px
```

1. **Vision Language Model (VLM)** - The Qwen3-VL-4B model "sees" the intake form image and extracts text using a carefully crafted prompt that guides it to identify specific fields (name, DOB, allergies, medications, etc.). Unlike traditional OCR, the VLM understands context—it knows that "DOB" means date of birth and can handle handwritten entries, checkboxes, and varied form layouts.
1. **Vision Language Model (VLM)** - The Gemma-4-E4B-it model "sees" the intake form image and extracts text using a carefully crafted prompt that guides it to identify specific fields (name, DOB, allergies, medications, etc.). Unlike traditional OCR, the VLM understands context—it knows that "DOB" means date of birth and can handle handwritten entries, checkboxes, and varied form layouts.

2. **LLM Validation & Querying** - The Qwen3.5-35B model (a Mixture-of-Experts architecture that activates only 3B parameters per inference) validates extracted data, handles natural language queries, and generates SQL to search the patient database. When you ask "Which patients have penicillin allergies?", the LLM translates this to proper SQL.

@@ -55,7 +55,7 @@ flowchart TD

- **Automatic file watching** - Monitors a directory for new intake forms
- **Drag-and-drop upload** - Drop files directly into the Watch Folder panel
- **VLM-powered extraction** - Uses Qwen3-VL-4B-Instruct for OCR and data extraction
- **VLM-powered extraction** - Uses Gemma-4-E4B-it for OCR and data extraction
- **Local database storage** - SQLite with full patient record schema
- **New/returning detection** - Identifies returning patients and flags changes
- **Critical alerts** - Automatic alerts for allergies and missing fields
@@ -69,7 +69,7 @@ The EMR agent uses three models, downloaded automatically on first run via `gaia
| Model | Size | Purpose |
|-------|------|---------|
| Qwen3.5-35B-A3B-GGUF | 18.6 GB | LLM for chat queries and patient search |
| Qwen3-VL-4B-Instruct-GGUF | 3.3 GB | Vision language model for form extraction |
| Gemma-4-E4B-it-GGUF | ~3 GB | Vision language model for form extraction |
| nomic-embed-text-v2-moe-GGUF | 522 MB | Embedding model for similarity search |

<Note>
@@ -303,7 +303,7 @@ gaia-emr init
This command:
- Checks Lemonade server is running and context size is configured
- Downloads and loads all required models:
- **VLM**: Qwen3-VL-4B-Instruct-GGUF (form extraction)
- **VLM**: Gemma-4-E4B-it-GGUF (form extraction)
- **LLM**: Qwen3.5-35B-A3B-GGUF (chat/query processing)
- **Embedding**: nomic-embed-text-v2-moe-GGUF (similarity search)
- Verifies all models are loaded and ready
@@ -421,7 +421,7 @@ This command:

| Command | Description |
|---------|-------------|
| `init` | Download required models (VLM, optional LLM/embedding) |
| `init` | Download all required models (VLM, LLM, embedding) |
| `watch` | Watch folder and process forms |
| `process` | Process a single form file and exit |
| `dashboard` | Launch web dashboard (Electron or browser) |
@@ -660,7 +660,7 @@ If `gaia-emr init` fails repeatedly or the agent won't start due to model errors
2. Navigate to the model cache directory:
- Windows: `%LOCALAPPDATA%\AMD\LemonadeModels\`
- Linux: `~/.local/share/lemonade/models/`
3. Delete the corrupted model folder (e.g., `Qwen3-VL-4B-Instruct-GGUF/`)
3. Delete the corrupted model folder (e.g., `Gemma-4-E4B-it-GGUF/`)
4. Restart Lemonade Server
5. Run `gaia-emr init` again to re-download

19 changes: 0 additions & 19 deletions docs/guides/telegram.mdx

This file was deleted.
