Add API architecture documentation for Agent, Skill, and Tool pipeline

2026-05-14 14:28:17 +02:00
parent 2adf17493a
commit 0634e7400a
1 changed files with 0 additions and 36 deletions
@@ -183,39 +183,3 @@ When the LLM responds with a tool call, the loop:
 4. Tool executes the request → ✅ Success
 ```
 ---
 ## File Map
 ```
 main.py                          # FastAPI app entry point, creates singletons
 ├── core/
 │   ├── config.py                # .env loader, config constants
 │   └── llm.py                   # create_client() factory for OpenAI client
 ├── api/
 │   ├── dependencies.py          # FastAPI Depends: get_llm_client()
 │   └── v1/
 │       └── chat.py              # APIRouter, endpoints, tool-calling loop
 ├── agents/
 │   ├── __init__.py              # Agent dataclass, registry, load_all_agents()
 │   ├── naked.py                 # Agent: barebone LLM, no skills
 │   └── media_agent.py           # Agent: media assistant with Seerr skills
 └── skills/
    ├── __init__.py              # Skill dataclass, ToolResult, registry, execution
    ├── media_info.py            # Skill: base media assistant persona (prompt-only)
    ├── seerr.py                 # Skill: Seerr API tools (7 tools, real API calls)
    └── triage.py                # Skill: fallback for unsupported actions (prompt-only)
 ```
 ## Key Design Decisions
 1. **Full multi-turn history**: `req.messages` passes through unchanged. The LLM has access to its own previous responses (including `[tmdb:IDs]`). No external state management needed.
 2. **No deterministic pre-processing**: No affirmation detectors, reference resolvers, or hardcoded rules. The LLM interprets user intent naturally from full conversation context.
 3. **Agent selection via `model` field**: OpenWebUI sends `model` in the request. `_resolve_agent()` maps it to a registered agent. The `/v1/models` endpoint lists all agents as selectable models.
 4. **Skills = prompts + tools**: Skills inject prompt fragments AND optionally expose OpenAI function-calling tools. Prompt-only skills (like `triage`) just shape behavior. Tool-enabled skills (like `seerr`) let the LLM take real actions.
 5. **Singleton LLM client**: Created once in `main.py`, stored on `app.state.llm_client`, accessed via FastAPI `Depends(get_llm_client)`.