amd · kovtcharov · May 29, 2026
@@ -6,7 +6,7 @@ description: Patterns, gotchas, and conventions for threading changes through Le
 # Lemonade Client Patterns
 
 ## Context
-`src/gaia/llm/lemonade_client.py` is GAIA's primary Lemonade HTTP client (~3900 lines). Changes to its interface ripple through `LemonadeProvider`, `VLMClient`, Agent UI routers, `_chat_helpers.py`, `server.py`, and `agents/base/agent.py`. The test suite mixes `responses` (for `requests`-based calls) and `mocker.patch` on `httpx` (for async calls).
+`src/gaia/llm/lemonade_client.py` is GAIA's primary Lemonade HTTP client (~4000 lines). Changes to its interface ripple through `LemonadeProvider`, `VLMClient`, Agent UI routers, `_chat_helpers.py`, `server.py`, and `agents/base/agent.py`. The test suite mixes `responses` (for `requests`-based calls) and `mocker.patch` on `httpx` (for async calls).
 
 ## Key Patterns
 
@@ -33,7 +33,7 @@ Misconfigured reverse proxies can reflect the `Authorization` header back in a 4
 
 ## Important Files & Locations
 
-- `src/gaia/llm/lemonade_client.py` — Primary client (~3900 lines); `_send_request` is the central chokepoint but 4 bypass sites exist
+- `src/gaia/llm/lemonade_client.py` — Primary client (~4000 lines); `_send_request` is the central chokepoint but 4 bypass sites exist
 - `src/gaia/llm/providers/lemonade.py` — `LemonadeProvider.__init__` uses `backend_kwargs` dict to forward to `LemonadeClient`; add new params with `if param is not None: backend_kwargs["param"] = param`
 - `src/gaia/llm/vlm_client.py` — `VLMClient.__init__` uses deferred import of `LemonadeClient`
 - `src/gaia/ui/routers/system.py` — already imports `DEFAULT_CONTEXT_SIZE` from `lemonade_client` (established cross-package import precedent)

@@ -52,8 +52,6 @@ Before starting implementation on a `consumer-critical` issue, confirm it has th
 ### Stop-the-line (foundational PRs in flight)
 When a foundational PR (large, cross-cutting, with many downstream dependencies) is in flight, it carries the `stop-the-line` label and pins a comment listing **frozen file paths**. No PRs may merge changes to those paths until the stop-the-line PR lands.
 
-**Currently active:** PR #606 (memory v2). See pinned comment for frozen paths.
-
 ### Parallel agent work
 - **Parallelize when** file trees are disjoint, no architectural decision is shared, no sequential dependency exists
 - **Serialize when** the same file tree is touched, when a design-system pattern needs to be pinned first, when one PR's output is the next PR's input

@@ -386,21 +386,31 @@ gaia/
 │   │   └── _shared/    # Shared assets for apps
 │   ├── audio/          # Audio processing (Whisper ASR, Kokoro TTS)
 │   ├── chat/           # Agent SDK (AgentSDK class, prompts, app entry)
+│   ├── code_index/     # Semantic code search (CodeIndexSDK)
+│   ├── connectors/     # OAuth-bound external API access (OAuth2 PKCE, keychain token storage, per-agent grants)
 │   ├── database/       # DatabaseMixin and DatabaseAgent
 │   ├── electron/       # Electron app integration
 │   ├── eval/           # Evaluation framework
+│   ├── filesystem/     # Filesystem indexing/categorization (FileSystemIndexService)
+│   ├── governance/     # Optional action-level governance layer (GovernedAgentMixin)
 │   ├── img/            # Shared image assets
 │   ├── installer/      # Install/init commands (gaia init, lemonade installer)
 │   ├── llm/            # LLM backend clients (Lemonade, Claude, OpenAI) + providers/
 │   ├── mcp/            # Model Context Protocol servers/clients
+│   ├── messaging/      # Messaging adapters (Telegram) + media→VLM/RAG ingest
 │   ├── rag/            # Document retrieval (RAG)
+│   ├── scratchpad/     # SQLite scratchpad service for structured data analysis
 │   ├── sd/             # Stable Diffusion tool mixin (SDToolsMixin)
 │   ├── shell/          # Shell integration
 │   ├── talk/           # Voice interaction SDK
 │   ├── testing/        # Test utilities and fixtures
 │   ├── ui/             # Agent UI backend (FastAPI server, routers, SSE, database)
 │   ├── utils/          # Utility modules (FileWatcher, parsing)
 │   ├── vlm/            # Vision LLM tool mixin (VLMToolsMixin, structured extraction)
+│   ├── web/            # Web client utilities (WebClient, SSRF-guarded fetch)
+│   ├── device.py       # Device compatibility detection (Strix Halo / Radeon VRAM)
+│   ├── perf_analysis.py # Perf-log parsing/plotting
+│   ├── security.py     # Path validation, allow-lists, write guardrails, audit logging
 │   └── cli.py          # Main CLI entry point (all `gaia <command>` subparsers)
 ├── tests/              # Test suite
 │   ├── unit/           # Unit tests
@@ -454,15 +464,20 @@ Defined in [`setup.py`](setup.py) under `console_scripts`:
 
 | Agent | Location | Description | Default Model |
 |-------|----------|-------------|---------------|
-| **ChatAgent** | `agents/chat/agent.py` | Document Q&A with RAG | Qwen3.5-35B |
-| **CodeAgent** | `agents/code/agent.py` | Code generation with orchestration | Qwen3.5-35B |
-| **BuilderAgent** | `agents/builder/agent.py` | Scaffolds new agents from templates | Qwen3.5-35B |
-| **SummarizeAgent** | `agents/summarize/agent.py` | Document/text summarization | Qwen3.5-35B |
-| **JiraAgent** | `agents/jira/agent.py` | Jira issue management | Qwen3.5-35B |
-| **BlenderAgent** | `agents/blender/agent.py` | 3D scene automation | Qwen3.5-35B |
-| **DockerAgent** | `agents/docker/agent.py` | Container management | Qwen3.5-35B |
-| **MedicalIntakeAgent** | `agents/emr/agent.py` | Medical form processing | Qwen3-VL-4B (VLM) |
-| **RoutingAgent** | `agents/routing/agent.py` | Intelligent agent selection | Qwen3.5-35B |
+| **ChatAgent** | `agents/chat/agent.py` | Chat with RAG, file search, shell | Gemma-4-E4B-it-GGUF (global default) |
+| **CodeAgent** | `agents/code/agent.py` | Code generation with orchestration | Qwen3.5-35B-A3B-GGUF |
+| **DocumentQAAgent** | `agents/docqa/agent.py` | Document Q&A with RAG | Qwen3.5-35B-A3B-GGUF |
+| **BuilderAgent** | `agents/builder/agent.py` | Scaffolds new agents from templates | Qwen3.5-35B-A3B-GGUF |
+| **SummarizeAgent** | `agents/summarize/agent.py` | Document/text summarization | Qwen3-4B-Instruct-2507-GGUF |
+| **AnalystAgent** | `agents/analyst/agent.py` | Structured data analysis (scratchpad tables) | Qwen3.5-35B-A3B-GGUF (base default) |
+| **BrowserAgent** | `agents/browser/agent.py` | Web research (search, fetch, download) | Qwen3.5-35B-A3B-GGUF (base default) |
+| **EmailTriageAgent** | `agents/email/agent.py` | Gmail triage, organize, reply | Gemma-4-E4B-it-GGUF (global default) |
+| **FileIOAgent** | `agents/fileio/agent.py` | File read/write/edit operations | Qwen3.5-35B-A3B-GGUF (base default) |
+| **JiraAgent** | `agents/jira/agent.py` | Jira issue management | Qwen3.5-35B-A3B-GGUF |
+| **BlenderAgent** | `agents/blender/agent.py` | 3D scene automation | Qwen3.5-35B-A3B-GGUF (base default) |
+| **DockerAgent** | `agents/docker/agent.py` | Container management | Qwen3.5-35B-A3B-GGUF |
+| **MedicalIntakeAgent** | `agents/emr/agent.py` | Medical form processing (VLM) | Gemma-4-E4B-it-GGUF |
+| **RoutingAgent** | `agents/routing/agent.py` | Intelligent agent selection | Qwen3.5-35B-A3B-GGUF |
 | **SDAgent** | `agents/sd/agent.py` | Stable Diffusion image generation | SDXL-Turbo |
 
 ### Agent Registry & Tool Mixins
@@ -472,58 +487,68 @@ New agents are Python classes inheriting from `Agent` (see [`src/gaia/agents/bas
 | Tool name | Mixin | Purpose |
 |-----------|-------|---------|
 | `rag` | `gaia.agents.chat.tools.rag_tools.RAGToolsMixin` | Document retrieval |
+| `code_index` | `gaia.agents.code_index.tools.mixin.CodeIndexToolsMixin` | Semantic code search |
 | `file_search` | `gaia.agents.tools.file_tools.FileSearchToolsMixin` | Fuzzy/glob file search |
 | `file_io` | `gaia.agents.code.tools.file_io.FileIOToolsMixin` | Read/write/edit files |
 | `shell` | `gaia.agents.chat.tools.shell_tools.ShellToolsMixin` | Sandboxed shell commands |
 | `screenshot` | `gaia.agents.tools.screenshot_tools.ScreenshotToolsMixin` | Screen capture |
+| `filesystem` | `gaia.agents.tools.filesystem_tools.FileSystemToolsMixin` | Filesystem indexing/categorization |
+| `scratchpad` | `gaia.agents.tools.scratchpad_tools.ScratchpadToolsMixin` | SQLite scratchpad for data analysis |
+| `browser` | `gaia.agents.tools.browser_tools.BrowserToolsMixin` | Web search/fetch/download |
 | `sd` | `gaia.sd.mixin.SDToolsMixin` | Stable Diffusion image generation |
 | `vlm` | `gaia.vlm.mixin.VLMToolsMixin` | Vision LLM / structured extraction |
 
 When adding a new tool mixin, register it in `KNOWN_TOOLS` so other agents can compose it by name.
 
 ### Default Models
-- General tasks: `Qwen3-0.6B-GGUF`
-- Code/Agents: `Qwen3.5-35B-A3B-GGUF`
-- Vision tasks: `Qwen3-VL-4B-Instruct-GGUF`
+- Global default: `Gemma-4-E4B-it-GGUF` (`DEFAULT_MODEL_NAME` in [`src/gaia/llm/lemonade_client.py`](src/gaia/llm/lemonade_client.py))
+- Code/Agents: `Qwen3.5-35B-A3B-GGUF` (CodeAgent, DocumentQAAgent, Jira, Docker, and the base-Agent fallback when `model_id` is unset)
+- Vision tasks: `Qwen3-VL-4B-Instruct-GGUF` is a supported vision model, but the EMR/MedicalIntakeAgent's default VLM is currently `Gemma-4-E4B-it-GGUF`
 
 ## CLI Commands
 
 All commands are registered in [`src/gaia/cli.py`](src/gaia/cli.py). Run `gaia -h` for the authoritative list.
 
 **Agents & chat:**
-- `gaia chat` - Interactive chat with RAG
+- `gaia chat` - Interactive chat with RAG, file search, and shell execution
 - `gaia chat --ui` - Launch Agent UI (browser-based, requires `[ui]` extras)
 - `gaia chat --ui --ui-port 8080` - Agent UI on custom port
+- `gaia browse` - Web research with search, page fetch, and download tools
+- `gaia analyze` - Structured data analysis with scratchpad tables
 - `gaia talk` - Voice interaction
 - `gaia prompt "<text>"` - Single prompt to LLM (with system-prompt support)
 - `gaia llm "<text>"` - Simple LLM queries
-- `gaia summarize` - Document summarization
+- `gaia summarize` - Document/transcript/email summarization
 - `gaia blender` - Blender 3D agent
 - `gaia sd` - Stable Diffusion image generation
-- `gaia jira` - Jira integration
+- `gaia jira` - Jira / Atlassian integration
+- `gaia email` - Email Triage Agent (Gmail, requires Google connector)
 - `gaia docker` - Docker management
 
 **Servers & infrastructure:**
 - `gaia api` - OpenAI-compatible API server
 - `gaia mcp {start|stop|status|test|agent|docker|serve|add|list|remove|tools|test-client}` - MCP bridge
+- `gaia telegram {start|stop|status}` - Telegram messaging adapter
+- `gaia connectors` - Manage external connectors (OAuth, MCP servers) and per-agent grants
 - `gaia cache {status|clear}` - Cache management
+- `gaia memory` - Manage agent memory (bootstrap onboarding, view status)
 
 **Setup & utilities:**
 - `gaia init` - Setup Lemonade Server and download models
-- `gaia install` - Install helper (e.g. Lemonade on first run)
-- `gaia download` - Download a model
+- `gaia install` - Install GAIA components (e.g. Lemonade on first run)
+- `gaia uninstall` - Uninstall GAIA components (tiered cleanup of `~/.gaia` and caches)
+- `gaia download` - Download all models required for GAIA agents
+- `gaia agent` - Manage custom agents (export/import bundles)
 - `gaia kill` - Kill stray GAIA / Lemonade processes
+- `gaia diagnostics` - Bundle logs and system info into a tarball for bug reports
+- `gaia stats` - Show GAIA statistics from the most recent run
 - `gaia test` - Smoke tests
-- `gaia yt` - YouTube transcript ingest
-- `gaia template` - Scaffold agent templates
+- `gaia youtube` - YouTube transcript utilities
 
 **Evaluation & analysis** (see [`docs/reference/eval.mdx`](docs/reference/eval.mdx)):
-- `gaia eval {fix-code|agent}` - Run evaluation harness
-- `gaia gt` - Generate ground truth
-- `gaia generate` - Dataset/response generation
-- `gaia batch-exp` - Batch experiments
-- `gaia report` - Render eval reports
-- `gaia visualize` / `gaia perf-vis` - Visualize results
+- `gaia eval agent` - Run agent eval benchmark scenarios
+- `gaia report` - Generate summary report from an evaluation results directory
+- `gaia perf-vis` - Visualize llama.cpp performance metrics from log files
 
 **Standalone binaries** (separate `console_scripts`, not subcommands):
 - `gaia-code` - CodeAgent entry (`src/gaia/agents/code/cli.py`)
@@ -593,7 +618,7 @@ The roadmap is at [`docs/roadmap.mdx`](docs/roadmap.mdx) ([live site](https://am
 - [`docs/plans/docker-containers.mdx`](docs/plans/docker-containers.mdx) - Docker deployment
 
 **Key architectural decisions (April 2026):**
-- ChatAgent renamed to **GaiaAgent** in v0.20.0 (#696)
+- ChatAgent → **GaiaAgent** rename **planned** for v0.20.0 (#696) — not yet in code (class is still `ChatAgent`)
 - Voice-first is P0 enabling technology (#702)
 - No context compaction — memory + RAG handles long conversations
 - Configuration dashboard + Observability dashboard as separate Agent UI panels
@@ -835,3 +860,5 @@ When a task fits a Superpowers skill (e.g. `superpowers:brainstorming`, `superpo
 **Read these before starting related tasks:**
 
 - `.claude/skills/lemonade-client-patterns.md` - Patterns, gotchas, and conventions for modifying LemonadeClient and threading changes through its callers (providers, VLM, UI routers, agent base). Covers: deferred import patch targets, assertLogs child logger levels, SSE test hang prevention, 401 error safety, openai.AuthenticationError ordering. (tags: lemonade, authentication, env-vars, testing, httpx, openai-sdk, tdd)
+
+Directory-based skills also live under `.claude/skills/` (e.g. `.claude/skills/gaia-release/` — invoked via the `gaia-release` Skill).
@@ -90,7 +90,8 @@
                   "connectors/index",
                   "connectors/google",
                   "connectors/github",
-                  "security/connections"
+                  "security/connections",
+                  "security/connectors"
                 ]
               },
               {

@@ -317,7 +317,7 @@ The agent supports the **Model Context Protocol** in both directions — connect
 
     Then restart `gaia chat --ui`.
 
-    **pip/PyPI installs (without gaia-ui):** Use the [npm install path](#install-and-launch) — the pip package does not include frontend source files.
+    **pip/PyPI installs (without gaia-ui):** Use the [npm install path](#install) — the pip package does not include frontend source files.
   </Accordion>
 
   <Accordion title="LLM response times out or fails">
@@ -356,7 +356,7 @@ The agent supports the **Model Context Protocol** in both directions — connect
 flowchart TD
     A(["Agent UI (Browser or Electron)"]) --> B(["FastAPI Backend · port 4200"])
     B --> C(["GAIA Core SDKs"])
-    C --> D(["Lemonade Server · port 8000"])
+    C --> D(["Lemonade Server · port 13305"])
 
     B -.- E(["REST API + SSE Streaming"])
     B -.- F(["SQLite Database"])

@@ -11,7 +11,7 @@ icon: "code"
 <Badge text="development" color="orange" />
 
 <Note>
-  The Code Agent now focuses on generating full-stack TypeScript web applications (Next.js + Prisma + Tailwind). Python code generation is no longer supported.
+  The Code Agent is optimized for full-stack TypeScript web apps (Next.js + Prisma + Tailwind), but Python code generation remains fully supported (and is the default for non-TypeScript requests).
 </Note>
 
 <Info>

@@ -41,7 +41,7 @@ flowchart TD
     linkStyle 0,1,2,3 stroke:#ED1C24,stroke-width:2px
 ```
 
-1. **Vision Language Model (VLM)** - The Qwen3-VL-4B model "sees" the intake form image and extracts text using a carefully crafted prompt that guides it to identify specific fields (name, DOB, allergies, medications, etc.). Unlike traditional OCR, the VLM understands context—it knows that "DOB" means date of birth and can handle handwritten entries, checkboxes, and varied form layouts.
+1. **Vision Language Model (VLM)** - The Gemma-4-E4B-it model "sees" the intake form image and extracts text using a carefully crafted prompt that guides it to identify specific fields (name, DOB, allergies, medications, etc.). Unlike traditional OCR, the VLM understands context—it knows that "DOB" means date of birth and can handle handwritten entries, checkboxes, and varied form layouts.
 
 2. **LLM Validation & Querying** - The Qwen3.5-35B model (a Mixture-of-Experts architecture that activates only 3B parameters per inference) validates extracted data, handles natural language queries, and generates SQL to search the patient database. When you ask "Which patients have penicillin allergies?", the LLM translates this to proper SQL.
 
@@ -55,7 +55,7 @@ flowchart TD
 
 - **Automatic file watching** - Monitors a directory for new intake forms
 - **Drag-and-drop upload** - Drop files directly into the Watch Folder panel
-- **VLM-powered extraction** - Uses Qwen3-VL-4B-Instruct for OCR and data extraction
+- **VLM-powered extraction** - Uses Gemma-4-E4B-it for OCR and data extraction
 - **Local database storage** - SQLite with full patient record schema
 - **New/returning detection** - Identifies returning patients and flags changes
 - **Critical alerts** - Automatic alerts for allergies and missing fields
@@ -69,7 +69,7 @@ The EMR agent uses three models, downloaded automatically on first run via `gaia
 | Model | Size | Purpose |
 |-------|------|---------|
 | Qwen3.5-35B-A3B-GGUF | 18.6 GB | LLM for chat queries and patient search |
-| Qwen3-VL-4B-Instruct-GGUF | 3.3 GB | Vision language model for form extraction |
+| Gemma-4-E4B-it-GGUF | ~3 GB | Vision language model for form extraction |
 | nomic-embed-text-v2-moe-GGUF | 522 MB | Embedding model for similarity search |
 
 <Note>
@@ -303,7 +303,7 @@ gaia-emr init
 This command:
 - Checks Lemonade server is running and context size is configured
 - Downloads and loads all required models:
-  - **VLM**: Qwen3-VL-4B-Instruct-GGUF (form extraction)
+  - **VLM**: Gemma-4-E4B-it-GGUF (form extraction)
   - **LLM**: Qwen3.5-35B-A3B-GGUF (chat/query processing)
   - **Embedding**: nomic-embed-text-v2-moe-GGUF (similarity search)
 - Verifies all models are loaded and ready
@@ -421,7 +421,7 @@ This command:
 
     | Command | Description |
     |---------|-------------|
-    | `init` | Download required models (VLM, optional LLM/embedding) |
+    | `init` | Download all required models (VLM, LLM, embedding) |
     | `watch` | Watch folder and process forms |
     | `process` | Process a single form file and exit |
     | `dashboard` | Launch web dashboard (Electron or browser) |
@@ -660,7 +660,7 @@ If `gaia-emr init` fails repeatedly or the agent won't start due to model errors
 2. Navigate to the model cache directory:
    - Windows: `%LOCALAPPDATA%\AMD\LemonadeModels\`
    - Linux: `~/.local/share/lemonade/models/`
-3. Delete the corrupted model folder (e.g., `Qwen3-VL-4B-Instruct-GGUF/`)
+3. Delete the corrupted model folder (e.g., `Gemma-4-E4B-it-GGUF/`)
 4. Restart Lemonade Server
 5. Run `gaia-emr init` again to re-download