# API Architecture — Agent + Skill + Tool Pipeline

This document explains how the API routes user messages through the agent/skill/tool pipeline to produce responses.

---

## Overview

```
┌──────────────────────────────────────────────────────────────────┐
│  OpenWebUI / Client                                              │
│  POST /v1/chat/completions  { model, messages, stream }          │
└──────────────────────────────┬───────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────────┐
│  api/v1/chat.py — chat_completions()                             │
│                                                                  │
│  1. _resolve_agent(req.model) → Agent                            │
│  2. agent.build_system_prompt() → system prompt                  │
│  3. Build full_messages = [system] + req.messages                │
│  4. run_agent_with_tools(client, messages, agent_id)             │
└──────────────────────────────┬───────────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────────┐
│  Tool-Calling Loop (run_agent_with_tools / run_agent_stream)     │
│                                                                  │
│  while turns < max_turns:                                        │
│      response = LLM.chat(messages, tools=agent_tools)            │
│      if response has tool_calls:                                 │
│          for each tool_call:                                     │
│              result = execute_tool(skills, name, args)           │
│              append result to messages                           │
│      else:                                                       │
│          return response.text  (stream tokens if streaming)      │
└──────────────────────────────────────────────────────────────────┘
```

---

## Key Concepts

### 1. Agent

An **Agent** is a persona + skill bundle. Defined in `agents/`.

```python
# agents/media_agent.py
Agent(
    agent_id="media-agent",
    description="Media assistant with Seerr integration",
    skills=["media_info", "seerr", "triage"],
    base_prompt="You are a media assistant...",
)
```

- `agent_id` — unique name, exposed as a model in OpenWebUI
- `skills` — list of skill names to load
- `base_prompt` — starting system prompt, combined with skill fragments
- `build_system_prompt()` — merges base_prompt + all skill prompt fragments

Agents self-register at import time via `agents/__init__.py`'s `register()`. `main.py` calls `load_all_agents()` at startup to import all agent/skill modules.

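The registration and prompt-merging behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `_AGENTS` and `_SKILL_FRAGMENTS` dictionaries, the dataclass layout, and the fragment-joining format are all assumptions made for the example; only the names `Agent`, `register()`, `get_agent()`, and `build_system_prompt()` come from this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical in-memory registries (assumed for illustration only).
_SKILL_FRAGMENTS: Dict[str, str] = {
    "media_info": "## Media Info\n...",
    "seerr": "## Seerr Media Tools\n...",
    "triage": "## Triage\n...",
}
_AGENTS: Dict[str, "Agent"] = {}


@dataclass
class Agent:
    agent_id: str
    description: str
    skills: List[str] = field(default_factory=list)
    base_prompt: str = ""

    def build_system_prompt(self) -> str:
        # Merge the base prompt with each known skill's fragment, in skill order.
        fragments = [_SKILL_FRAGMENTS[s] for s in self.skills if s in _SKILL_FRAGMENTS]
        return "\n\n".join([self.base_prompt, *fragments])


def register(agent: Agent) -> Agent:
    # Meant to be called at import time by each agent module.
    _AGENTS[agent.agent_id] = agent
    return agent


def get_agent(agent_id: str) -> Optional[Agent]:
    # Returns None when no agent with that id has been registered.
    return _AGENTS.get(agent_id)


register(Agent(
    agent_id="media-agent",
    description="Media assistant with Seerr integration",
    skills=["media_info", "seerr", "triage"],
    base_prompt="You are a media assistant...",
))
```

Keeping the registry keyed by `agent_id` is what would let the `model` field of an incoming request be resolved directly to an agent.
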
### 2. Skill

A **Skill** is a capability bundle. Defined in `skills/`.

```python
# skills/seerr.py
Skill(
    name="seerr",
    description="Seerr integration — trending, discover, request media, submit issues",
    prompt_fragment="## Seerr Media Tools\n...",
    tools=[...],          # OpenAI function-calling schema
    execute=_execute,     # async handler: tool_name + args → ToolResult
)
```

- `prompt_fragment` — injected into the agent's system prompt. Teaches the LLM what tools are available and when to use them.
- `tools` — list of OpenAI function definitions (name, description, parameters).
- `execute` — async callable that routes tool calls to API handlers.

### 3. Tool

A **Tool** is a single function the LLM can call. Defined as part of a skill's `tools` list.

```python
{
    "type": "function",
    "function": {
        "name": "seerr_trending",
        "description": "Get trending movies and TV shows from Seerr...",
        "parameters": {
            "type": "object",
            "properties": {
                "kind": {"type": "string", "enum": ["movie", "tv", "all"]},
                "language": {"type": "string"},
            },
            "required": ["kind"],
        },
    },
}
```

When the LLM responds with a tool call, the loop:

1. Extracts `function.name` (e.g. `"seerr_trending"`) and `function.arguments` (e.g. `{"kind": "movie"}`)
2. Calls `execute_tool(agent.skills, name, args)`, which finds the owning skill and runs it
3. Appends the result text to the message history
4. Sends the updated history back to the LLM for a follow-up response

---

## Full Request Flow

### Step-by-step: "What are trending movies?"

```
1. OpenWebUI sends:
   POST /v1/chat/completions
   {
     "model": "media-agent",
     "messages": [
       {"role": "user", "content": "What are trending movies?"}
     ],
     "stream": false
   }

2. chat_completions():
   → _resolve_agent(model="media-agent")
     → get_agent("media-agent") → Agent(skills=["media_info", "seerr", "triage"])
     → tools = get_all_tools(["media_info", "seerr", "triage"])
       → Returns 7 tool definitions from seerr.py
   → system_prompt = agent.build_system_prompt()
     → base_prompt + media_info fragment + seerr fragment + triage fragment

3. run_agent_with_tools() — Turn 1:
   → LLM receives: [system prompt with tools] + [user: "What are trending movies?"]
   → LLM responds: tool_calls = [{"function": {"name": "seerr_trending", "arguments": {"kind": "movie"}}}]

4. Execute tool:
   → execute_tool(["media_info", "seerr", "triage"], "seerr_trending", {"kind": "movie"})
   → Finds seerr skill → calls _execute("seerr_trending", ...)
   → _trending(args) → GET /api/v1/discover/trending?mediaType=movie
   → Returns formatted list with [tmdb:IDs]

5. run_agent_with_tools() — Turn 2:
   → LLM receives: previous messages + [tool: "Found 20 trending movies..."]
   → LLM responds: text = "Here are the top trending movies! 🎬 ..."
   → finish_reason="stop" → return the text

6. chat_completions() returns:
   { "choices": [{"message": {"content": "Here are the top trending movies!..."}}] }
```

### Step-by-step: "Request the 2026 one" (multi-turn context)

```
1. OpenWebUI sends the FULL history:
   {
     "model": "media-agent",
     "messages": [
       {"role": "user", "content": "What are trending movies?"},
       {"role": "assistant", "content": "Here are the top 10 trending movies! 1. **Mortal Kombat II** (2026) [tmdb:931285] — ..."},
       {"role": "user", "content": "could request the mortal kombat one?"},
       {"role": "assistant", "content": "There are several Mortal Kombat entries! ..."},
       {"role": "user", "content": "the 2026 one"}
     ]
   }

2. chat_completions():
   → req.messages contains the ENTIRE conversation history
   → System prompt prepended → full_messages = [system] + 5 history messages
   → LLM sees everything: the trending list with [tmdb:931285], the disambiguation, "the 2026 one"

3. LLM reasons:
   - I previously listed Mortal Kombat II (2026) with [tmdb:931285]
   - The user said "request the mortal kombat one" → I searched and showed 4 options
   - Now they say "the 2026 one" → that matches Mortal Kombat II (2026) [tmdb:931285]
   - I should call seerr_request_media(kind="movie", title="Mortal Kombat II", tmdb_id=931285)

4. Tool executes the request → ✅ Success
```
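
The two walkthroughs above follow the same loop: call the model, run any requested tools, append the results, and repeat until the model answers in plain text. The sketch below shows that control flow in a self-contained form. It is an illustration under stated assumptions, not the project's implementation: `fake_llm` is a scripted stand-in for the real LLM client, the `Skill` dataclass and the handler return a plain string rather than a `ToolResult`, and the signatures of `run_agent_with_tools` and `execute_tool` are simplified compared with the ones named in this document.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, List


@dataclass
class Skill:
    name: str
    tools: List[Dict[str, Any]]
    execute: Callable[[str, Dict[str, Any]], Awaitable[str]]


async def _seerr_execute(tool_name: str, args: Dict[str, Any]) -> str:
    # Stand-in handler; a real one would call the Seerr HTTP API.
    if tool_name == "seerr_trending":
        return f"Found 2 trending {args['kind']} results: A [tmdb:1], B [tmdb:2]"
    return f"Unknown tool: {tool_name}"


SEERR = Skill(
    name="seerr",
    tools=[{"type": "function", "function": {"name": "seerr_trending"}}],
    execute=_seerr_execute,
)


async def execute_tool(skills: List[Skill], name: str, args: Dict[str, Any]) -> str:
    # Find the skill that declares this tool and delegate to its handler.
    for skill in skills:
        if any(t["function"]["name"] == name for t in skill.tools):
            return await skill.execute(name, args)
    return f"No skill provides tool {name!r}"


async def fake_llm(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Scripted model: request a tool first, answer once a tool result is present.
    if any(m["role"] == "tool" for m in messages):
        return {"content": "Here are the trending movies: " + messages[-1]["content"]}
    return {"tool_calls": [{"function": {"name": "seerr_trending",
                                         "arguments": {"kind": "movie"}}}]}


async def run_agent_with_tools(messages, skills, max_turns: int = 5) -> str:
    messages = list(messages)  # work on a copy so the caller's history is untouched
    for _ in range(max_turns):
        response = await fake_llm(messages)
        tool_calls = response.get("tool_calls")
        if not tool_calls:
            return response["content"]  # plain text means the loop is done
        messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
        for call in tool_calls:
            fn = call["function"]
            args = fn["arguments"]
            if isinstance(args, str):  # some backends send arguments as a JSON string
                args = json.loads(args)
            result = await execute_tool(skills, fn["name"], args)
            messages.append({"role": "tool", "content": result})
    return "Stopped: max_turns reached without a final answer."


history = [
    {"role": "system", "content": "You are a media assistant..."},
    {"role": "user", "content": "What are trending movies?"},
]
answer = asyncio.run(run_agent_with_tools(history, [SEERR]))
print(answer)
```

The `max_turns` bound matters because a model that keeps requesting tools would otherwise loop forever; the fallback string returned here is a placeholder, and a real service would likely surface an error instead.
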