Add API architecture documentation for Agent, Skill, and Tool pipeline
2026-05-14 14:28:17 +02:00
parent 2adf17493a
commit 0634e7400a
@@ -0,0 +1,185 @@
# API Architecture — Agent + Skill + Tool Pipeline
This document explains how the API routes user messages through the agent/skill/tool pipeline to produce responses.

---
## Overview
```
┌──────────────────────────────────────────────────────────────────┐
│  OpenWebUI / Client                                              │
│  POST /v1/chat/completions { model, messages, stream }           │
└────────────────────────────────┬─────────────────────────────────┘
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  api/v1/chat.py — chat_completions()                             │
│                                                                  │
│  1. _resolve_agent(req.model)   → Agent                          │
│  2. agent.build_system_prompt() → system prompt                  │
│  3. Build full_messages = [system] + req.messages                │
│  4. run_agent_with_tools(client, messages, agent_id)             │
└────────────────────────────────┬─────────────────────────────────┘
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  Tool-Calling Loop (run_agent_with_tools / run_agent_stream)     │
│                                                                  │
│  while turns < max_turns:                                        │
│      response = LLM.chat(messages, tools=agent_tools)            │
│      if response has tool_calls:                                 │
│          for each tool_call:                                     │
│              result = execute_tool(skills, name, args)           │
│              append result to messages                           │
│      else:                                                       │
│          return response.text (stream tokens if streaming)       │
└──────────────────────────────────────────────────────────────────┘
```
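The loop in the last box can be sketched in a few lines of Python. This is a minimal, synchronous illustration under stated assumptions, not the project's actual implementation: `run_tool_loop`, `fake_llm`, and `fake_execute` are hypothetical names, the LLM is replaced by a scripted stub, and the message shapes assume the OpenAI chat format that the tool schemas in this document follow.

```python
import json


def run_tool_loop(llm, execute, messages, max_turns=5):
    """Call the LLM until it answers in plain text or max_turns is reached."""
    messages = list(messages)  # copy so the caller's list is not mutated
    for _ in range(max_turns):
        response = llm(messages)
        tool_calls = response.get("tool_calls") or []
        if not tool_calls:
            return response["content"]  # plain text: the loop is done
        # Record the assistant's tool request, then one tool message per call.
        messages.append({"role": "assistant", "content": None, "tool_calls": tool_calls})
        for call in tool_calls:
            name = call["function"]["name"]
            args = json.loads(call["function"]["arguments"])
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": execute(name, args),
            })
    return "Stopped: tool-call limit reached."


def fake_llm(messages):
    """Scripted stand-in for the LLM: request a tool once, then answer."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"content": "Summary: " + last["content"]}
    return {
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "seerr_trending", "arguments": '{"kind": "movie"}'},
        }],
    }


def fake_execute(name, args):
    """Stand-in for execute_tool(); returns a canned result string."""
    return f"{name} returned 20 results for kind={args['kind']}"


answer = run_tool_loop(
    fake_llm, fake_execute, [{"role": "user", "content": "What are trending movies?"}]
)
print(answer)
```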
---
## Key Concepts
### 1. Agent
An **Agent** bundles a persona with a set of skills. Agents are defined in `agents/`.
```python
# agents/media_agent.py
Agent(
    agent_id="media-agent",
    description="Media assistant with Seerr integration",
    skills=["media_info", "seerr", "triage"],
    base_prompt="You are a media assistant...",
)
```
- `agent_id` — unique name, exposed as a model in OpenWebUI
- `skills` — list of skill names to load
- `base_prompt` — starting system prompt, combined with skill fragments
- `build_system_prompt()` — merges `base_prompt` with the prompt fragments of all listed skills

Agents self-register at import time by calling `register()` from `agents/__init__.py`.
At startup, `main.py` calls `load_all_agents()`, which imports every agent and skill module so that these registrations run.
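
The registration and prompt-merging behaviour described above might look roughly like the sketch below. It is an assumption-laden illustration: the `Agent` fields mirror the example, but `SKILL_FRAGMENTS`, `_AGENTS`, the placeholder fragment texts, and the choice to join fragments with blank lines are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the skill registry: skill name -> prompt fragment.
SKILL_FRAGMENTS = {
    "media_info": "## Media Info\n...",
    "seerr": "## Seerr Media Tools\n...",
    "triage": "## Triage\n...",
}

_AGENTS = {}  # hypothetical registry: agent_id -> Agent


@dataclass
class Agent:
    agent_id: str
    description: str
    skills: list
    base_prompt: str

    def build_system_prompt(self):
        """Join the base prompt and each skill's fragment with blank lines."""
        fragments = [SKILL_FRAGMENTS[s] for s in self.skills if s in SKILL_FRAGMENTS]
        return "\n\n".join([self.base_prompt, *fragments])


def register(agent):
    """Add an agent to the registry; meant to be called at module import time."""
    _AGENTS[agent.agent_id] = agent
    return agent


def get_agent(agent_id):
    """Return the registered agent, or None if the id is unknown."""
    return _AGENTS.get(agent_id)


media_agent = register(Agent(
    agent_id="media-agent",
    description="Media assistant with Seerr integration",
    skills=["media_info", "seerr", "triage"],
    base_prompt="You are a media assistant...",
))
prompt = get_agent("media-agent").build_system_prompt()
```

Because `register()` runs as a side effect of importing the module, nothing is registered until something imports it, which is why a startup step that imports every agent module is needed.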
### 2. Skill
A **Skill** bundles one capability: a prompt fragment, a set of tool definitions, and the handler that executes them. Skills are defined in `skills/`.
```python
# skills/seerr.py
Skill(
    name="seerr",
    description="Seerr integration — trending, discover, request media, submit issues",
    prompt_fragment="## Seerr Media Tools\n...",
    tools=[...],        # OpenAI function-calling schema
    execute=_execute,   # async handler: tool_name + args → ToolResult
)
```
- `prompt_fragment` — injected into the agent's system prompt; it tells the LLM which tools are available and when to use them.
- `tools` — list of OpenAI function definitions (name, description, parameters).
- `execute` — async callable that routes tool calls to API handlers.
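
As an illustration of the `execute` contract, the sketch below routes a tool name to a per-tool handler. The `Skill` fields follow the example above; the `ToolResult` fields (`text`, `ok`), the `_HANDLERS` table, and the handler body are assumptions made for this sketch, and a real handler would call the Seerr HTTP API instead of returning a canned string.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class ToolResult:
    text: str        # assumed field: text handed back to the LLM
    ok: bool = True  # assumed field: False when the call failed


async def _trending(args):
    # Placeholder handler; a real one would query Seerr here.
    return ToolResult(text=f"Trending {args['kind']} titles: ...")


# Hypothetical dispatch table: tool name -> async handler.
_HANDLERS = {"seerr_trending": _trending}


async def _execute(tool_name, args):
    """Route a tool call to its handler, or report an unknown tool."""
    handler = _HANDLERS.get(tool_name)
    if handler is None:
        return ToolResult(text=f"Unknown tool: {tool_name}", ok=False)
    return await handler(args)


@dataclass
class Skill:
    name: str
    description: str
    prompt_fragment: str
    tools: list
    execute: Callable[[str, dict], Awaitable[ToolResult]]


seerr = Skill(
    name="seerr",
    description="Seerr integration",
    prompt_fragment="## Seerr Media Tools\n...",
    tools=[{"type": "function", "function": {"name": "seerr_trending"}}],
    execute=_execute,
)
result = asyncio.run(seerr.execute("seerr_trending", {"kind": "movie"}))
unknown = asyncio.run(seerr.execute("seerr_missing", {}))
```

Returning a failed `ToolResult` for an unknown tool, rather than raising, lets the error text flow back to the LLM as an ordinary tool result.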
### 3. Tool
A **Tool** is a single function the LLM can call, defined as one entry in a skill's `tools` list.
```python
{
    "type": "function",
    "function": {
        "name": "seerr_trending",
        "description": "Get trending movies and TV shows from Seerr...",
        "parameters": {
            "type": "object",
            "properties": {
                "kind": {"type": "string", "enum": ["movie", "tv", "all"]},
                "language": {"type": "string"},
            },
            "required": ["kind"],
        },
    },
}
```
When the LLM responds with a tool call, the loop:
1. Extracts `function.name` (e.g. `"seerr_trending"`) and `function.arguments` (e.g. `{"kind": "movie"}`; OpenAI-compatible backends typically send this as a JSON-encoded string, which must be parsed first)
2. Calls `execute_tool(agent.skills, name, args)`, which finds the skill that owns the tool and runs its `execute` handler
3. Appends the result text to the message history as a tool message
4. Sends the updated history back to the LLM for a follow-up response
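
Step 2, locating the owning skill, could be sketched as follows. This is a hedged illustration only: the `SKILLS` registry, its dict layout, and the error string are hypothetical, and the real `execute_tool` may resolve skills differently.

```python
import asyncio
import json


async def _seerr_execute(tool_name, args):
    # Placeholder for the seerr skill's async handler.
    return f"seerr handled {tool_name} with {args}"


# Hypothetical registry: skill name -> tool schemas plus an async handler.
SKILLS = {
    "media_info": {"tools": [], "execute": None},
    "seerr": {
        "tools": [{"type": "function", "function": {"name": "seerr_trending"}}],
        "execute": _seerr_execute,
    },
}


async def execute_tool(skill_names, name, raw_args):
    """Find the skill that declares `name` and run its handler."""
    # Accept either a JSON string or an already-decoded dict.
    args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
    for skill_name in skill_names:
        skill = SKILLS.get(skill_name)
        if skill is None:
            continue  # skill not loaded; skip it
        declared = {tool["function"]["name"] for tool in skill["tools"]}
        if name in declared:
            return await skill["execute"](name, args)
    return f"Error: no loaded skill provides tool '{name}'"


result = asyncio.run(
    execute_tool(["media_info", "seerr", "triage"], "seerr_trending", '{"kind": "movie"}')
)
missing = asyncio.run(execute_tool(["media_info"], "seerr_trending", {}))
```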
---
## Full Request Flow
### Step-by-step: "What are trending movies?"
```
1. OpenWebUI sends:
   POST /v1/chat/completions
   {
     "model": "media-agent",
     "messages": [
       {"role": "user", "content": "What are trending movies?"}
     ],
     "stream": false
   }

2. chat_completions():
   → _resolve_agent(model="media-agent")
     → get_agent("media-agent") → Agent(skills=["media_info", "seerr", "triage"])
   → tools = get_all_tools(["media_info", "seerr", "triage"])
     → Returns 7 tool definitions from seerr.py
   → system_prompt = agent.build_system_prompt()
     → base_prompt + media_info fragment + seerr fragment + triage fragment

3. run_agent_with_tools() — Turn 1:
   → LLM receives: [system prompt with tools] + [user: "What are trending movies?"]
   → LLM responds: tool_calls = [{"function": {"name": "seerr_trending", "arguments": {"kind": "movie"}}}]

4. Execute tool:
   → execute_tool(["media_info", "seerr", "triage"], "seerr_trending", {"kind": "movie"})
   → Finds seerr skill → calls _execute("seerr_trending", ...) → _trending(args)
   → GET /api/v1/discover/trending?mediaType=movie
   → Returns formatted list with [tmdb:IDs]

5. run_agent_with_tools() — Turn 2:
   → LLM receives: previous messages + [tool: "Found 20 trending movies..."]
   → LLM responds: text = "Here are the top trending movies! 🎬 ..."
   → finish_reason="stop" → return the text

6. chat_completions() returns:
   { "choices": [{"message": {"content": "Here are the top trending movies!..."}}] }
```
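Step 6 abbreviates the response body. For a client such as OpenWebUI, a non-streaming reply usually carries the standard OpenAI `chat.completion` envelope, sketched below. The helper name `build_chat_response` is hypothetical, and whether the project fills every field this way is an assumption.

```python
import time
import uuid


def build_chat_response(model, text):
    """Wrap the final assistant text in an OpenAI-style chat.completion body."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),  # Unix timestamp in seconds
        "model": model,               # echoes the agent id the client asked for
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": "stop",
        }],
    }


body = build_chat_response("media-agent", "Here are the top trending movies!...")
```

The intermediate tool calls and tool results stay inside the server-side loop; in this sketch only the final assistant text is returned to the client.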
### Step-by-step: "Request the 2026 one" (multi-turn context)
```
1. OpenWebUI sends the FULL history:
   {
     "model": "media-agent",
     "messages": [
       {"role": "user", "content": "What are trending movies?"},
       {"role": "assistant", "content": "Here are the top 10 trending movies!\n1. **Mortal Kombat II** (2026) [tmdb:931285] — ..."},
       {"role": "user", "content": "could request the mortal kombat one?"},
       {"role": "assistant", "content": "There are several Mortal Kombat entries! ..."},
       {"role": "user", "content": "the 2026 one"}
     ]
   }

2. chat_completions():
   → req.messages contains the ENTIRE conversation history
   → System prompt prepended → full_messages = [system] + 5 history messages
   → LLM sees everything: the trending list with [tmdb:931285], the disambiguation, "the 2026 one"

3. LLM reasons:
   - I previously listed Mortal Kombat II (2026) with [tmdb:931285]
   - The user said "request the mortal kombat one" → I searched and showed 4 options
   - Now they say "the 2026 one" → that matches Mortal Kombat II (2026) [tmdb:931285]
   - I should call seerr_request_media(kind="movie", title="Mortal Kombat II", tmdb_id=931285)

4. Tool executes the request → ✅ Success
```
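The multi-turn example works because the client resends the whole conversation on every request and the server only prepends the system prompt. A minimal sketch of that step, assuming a hypothetical helper named `build_full_messages`:

```python
def build_full_messages(system_prompt, history):
    """Prepend the system prompt to the client-supplied history."""
    return [{"role": "system", "content": system_prompt}, *history]


history = [
    {"role": "user", "content": "What are trending movies?"},
    {"role": "assistant", "content": "1. **Mortal Kombat II** (2026) [tmdb:931285]"},
    {"role": "user", "content": "could request the mortal kombat one?"},
    {"role": "assistant", "content": "There are several Mortal Kombat entries! ..."},
    {"role": "user", "content": "the 2026 one"},
]

full_messages = build_full_messages("You are a media assistant...", history)

# The TMDB id from the earlier answer is still in the context window,
# so the LLM can resolve "the 2026 one" without any lookup.
id_in_context = any("[tmdb:931285]" in m["content"] for m in full_messages)
```

In this sketch the server keeps no per-conversation state; the reference to "the 2026 one" is resolvable only because the earlier assistant message, including its `[tmdb:931285]` tag, is part of `full_messages`.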