Implement LangGraph integration: refactor agent-tool interaction, add graph compilation, and enhance state management
Build and Push Agent API / build (push) Successful in 22s

2026-05-24 10:18:59 +02:00
parent 1d821d18fe
commit 2f7f94f1ce
8 changed files with 534 additions and 242 deletions
+114 -64
@@ -1,6 +1,7 @@
# API Architecture — Agent + Skill + Tool Pipeline
# API Architecture — Agent + Skill + Graph Pipeline
This document explains how the API routes user messages through the agent/skill/tool pipeline to produce responses.
This document explains how the API routes user messages through the
agent / skill / LangGraph pipeline to produce responses.
---
@@ -17,27 +18,22 @@ This document explains how the API routes user messages through the agent/skill/
│ api/v1/chat.py — chat_completions() │
│ │
│ 1. _resolve_agent(req.model) → Agent │
│ 2. agent.build_system_prompt() → system prompt
│ 3. Build full_messages = [system] + req.messages
│ 4. run_agent_with_tools(client, messages, agent_id) │
│ 2. get_agent_graph(agent_id) → compiled StateGraph
│ 3. graph.ainvoke(state) or _stream_graph(graph, messages)
└──────────────────────────────┬───────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
Tool-Calling Loop (run_agent_with_tools / run_agent_stream)
LangGraph StateGraph (core/graph.py)
│ │
while turns < max_turns:
response = LLM.chat(messages, tools=agent_tools)
if response has tool_calls:
for each tool_call:
result = execute_tool(skills, name, args)
append result to messages
else:
│ return response.text (stream tokens if streaming) │
┌──────────────┐ tool_calls? ┌──────────────┐
│ agent_node │ ───────────────▶ │ tool_node │
│ (LLM call) │ ◀─────────────── │ (skill exec) │
└──────┬───────┘ └──────────────┘
│ no tool_calls
[END]
└──────────────────────────────────────────────────────────────────┘
```
---
## Key Concepts
@@ -61,7 +57,8 @@ Agent(
- `build_system_prompt()` — merges base_prompt + all skill prompt fragments
Agents self-register at import time via `agents/__init__.py`'s `register()`.
`main.py` calls `load_all_agents()` at startup to import all agent/skill modules.
`main.py` calls `load_all_agents()` at startup to import every agent and skill
module.
### 2. Skill
@@ -78,37 +75,47 @@ Skill(
)
```
- `prompt_fragment` — injected into the agent's system prompt. Teaches the LLM what tools are available and when to use them.
- `prompt_fragment` — injected into the agent's system prompt.
- `tools` — list of OpenAI function definitions (name, description, parameters).
- `execute` — async callable that routes tool calls to API handlers.
### 3. Tool
### 3. Graph
A **Tool** is a single function the LLM can call. Defined as part of a skill's `tools` list.
Each agent gets a **compiled LangGraph StateGraph** built by
`core/graph.py:create_agent_graph()`. The graph is compiled lazily on the
first request and cached on `app.state.agent_graphs` for the lifetime of the
process.
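The lazy compile-and-cache behaviour described above can be sketched in plain Python. This is only an illustration under assumptions: `get_or_compile` and `fake_compile` are hypothetical names, and a plain dict stands in for `app.state.agent_graphs`; the real `get_agent_graph()` may be structured differently.

```python
from typing import Any, Callable


def get_or_compile(
    cache: dict[str, Any],
    agent_id: str,
    compile_graph: Callable[[str], Any],
) -> Any:
    """Return the cached graph for agent_id, compiling it on first use."""
    if agent_id not in cache:
        # First request for this agent: pay the compilation cost once.
        cache[agent_id] = compile_graph(agent_id)
    return cache[agent_id]


# Hypothetical stand-in for create_agent_graph(); records each compilation.
compiled_for: list[str] = []


def fake_compile(agent_id: str) -> object:
    compiled_for.append(agent_id)
    return object()


graphs: dict[str, Any] = {}  # stands in for app.state.agent_graphs
first = get_or_compile(graphs, "media-agent", fake_compile)
second = get_or_compile(graphs, "media-agent", fake_compile)
```

The second lookup returns the same object without calling the compile function again, which is the property the cache is there to provide.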
| Graph node / edge | What it does |
|---|---|
| `agent_node` | Converts state messages to OpenAI dicts, calls the LLM with the agent's system prompt + tool definitions, returns an `AIMessage` |
| `tool_node` | Reads `tool_calls` from the last AI message, calls `execute_tool()` from the skill system, returns `ToolMessage` results |
| `_should_continue` | Conditional edge — returns `"tool_node"` if the AI message has `tool_calls`, else `END` |
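To make the control flow in the table concrete, here is a dependency-free sketch of the agent/tool cycle. It does not use LangGraph itself: `run_graph`, `fake_agent`, and `fake_tools` are hypothetical names, messages are plain dicts, and the `END` string is a stand-in for LangGraph's sentinel.

```python
END = "__end__"  # stand-in for LangGraph's END sentinel


def should_continue(state: dict) -> str:
    """Route to the tool node when the last message carries tool calls."""
    last = state["messages"][-1]
    return "tool_node" if last.get("tool_calls") else END


def run_graph(state: dict, agent_node, tool_node, max_steps: int = 10) -> dict:
    """Alternate agent and tool nodes until the agent stops calling tools."""
    for _ in range(max_steps):
        state = {"messages": state["messages"] + agent_node(state)["messages"]}
        if should_continue(state) == END:
            return state
        state = {"messages": state["messages"] + tool_node(state)["messages"]}
    raise RuntimeError("step limit reached without a final answer")


def fake_agent(state: dict) -> dict:
    # Pretend LLM: request one tool call, then answer in plain text.
    if any(m["role"] == "tool" for m in state["messages"]):
        reply = {"role": "assistant", "content": "Here are the trending movies."}
    else:
        call = {"id": "call_1", "name": "seerr_trending", "args": {"kind": "movie"}}
        reply = {"role": "assistant", "content": "", "tool_calls": [call]}
    return {"messages": [reply]}


def fake_tools(state: dict) -> dict:
    # Pretend skill executor: one tool message per requested call.
    calls = state["messages"][-1]["tool_calls"]
    return {
        "messages": [
            {"role": "tool", "tool_call_id": c["id"], "content": f"result of {c['name']}"}
            for c in calls
        ]
    }


start = {"messages": [{"role": "user", "content": "What are trending movies?"}]}
final = run_graph(start, fake_agent, fake_tools)
```

In the real graph the loop is expressed declaratively through nodes and a conditional edge rather than a `for` loop, but the sequence of messages is the same: user, assistant with tool calls, tool result, final assistant text.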
### 4. State
Defined in `core/state.py`:
```python
{
"type": "function",
"function": {
"name": "seerr_trending",
"description": "Get trending movies and TV shows from Seerr...",
"parameters": {
"type": "object",
"properties": {
"kind": {"type": "string", "enum": ["movie", "tv", "all"]},
"language": {"type": "string"},
},
"required": ["kind"],
},
},
}
class AgentState(TypedDict):
messages: Annotated[list, add_messages]
```
When the LLM responds with a tool call, the loop:
1. Extracts `function.name` (e.g. `"seerr_trending"`) and `function.arguments` (e.g. `{"kind": "movie"}`)
2. Calls `execute_tool(agent.skills, name, args)` which finds the owning skill and runs it
3. Appends the result text to the message history
4. Sends back to the LLM for a follow-up response
LangGraph's `add_messages` reducer appends new messages and replaces any existing
message whose ID matches an incoming one. Tool results arrive as new
`ToolMessage`s, so they are appended rather than overwriting anything.
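The append-or-replace-by-ID behaviour of the reducer can be illustrated with a small plain-Python function. This is a simplified sketch, not LangGraph's implementation: `merge_messages` is a hypothetical name, messages are plain dicts, and the real reducer does more (for example, it assigns IDs to messages that lack one).

```python
def merge_messages(existing: list[dict], incoming: list[dict]) -> list[dict]:
    """Append incoming messages, replacing any existing one with the same id."""
    merged = list(existing)
    index_by_id = {
        m["id"]: i for i, m in enumerate(merged) if m.get("id") is not None
    }
    for msg in incoming:
        msg_id = msg.get("id")
        if msg_id is not None and msg_id in index_by_id:
            # Same id as an earlier message: replace it in place.
            merged[index_by_id[msg_id]] = msg
        else:
            # Unknown (or missing) id: append at the end.
            if msg_id is not None:
                index_by_id[msg_id] = len(merged)
            merged.append(msg)
    return merged


history = [
    {"id": "m1", "role": "user", "content": "hi"},
    {"id": "m2", "role": "assistant", "content": "draft"},
]
update = [
    {"id": "m2", "role": "assistant", "content": "final"},
    {"id": "m3", "role": "assistant", "content": "follow-up"},
]
result = merge_messages(history, update)
```

Here `m2` is replaced where it stands and `m3` is appended, so the history keeps its order while growing by one message.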
### 5. Message Conversion
Because we use the raw `openai` client (not `langchain-openai`), messages must
be converted between LangChain and OpenAI formats at every LLM call:
- **LangChain → OpenAI** (`_lc_role_to_openai`, `_langchain_tc_to_openai`):
Maps `type` → `role` and converts top-level `name`/`args` tool-calls into
the nested `function` sub-object that the OpenAI API expects.
- **OpenAI → LangChain** (inside `agent_node`):
Converts the `ChatCompletionMessage` response into an `AIMessage` with
LangChain-format `tool_calls` (top-level `name`/`args`/`id`).
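The two directions of tool-call conversion can be sketched as follows. The function names here are hypothetical and operate on plain dicts; the helpers in `core/graph.py` may differ in shape. The one detail worth noting is that the OpenAI format carries `arguments` as a JSON string, while the LangChain format carries `args` as a dict.

```python
import json

# LangChain message type -> OpenAI role.
ROLE_BY_TYPE = {"human": "user", "ai": "assistant", "system": "system", "tool": "tool"}


def lc_tool_call_to_openai(tc: dict) -> dict:
    """Nest name/args under `function` and serialise args to a JSON string."""
    return {
        "id": tc["id"],
        "type": "function",
        "function": {"name": tc["name"], "arguments": json.dumps(tc["args"])},
    }


def openai_tool_call_to_lc(tc: dict) -> dict:
    """Flatten the `function` sub-object and parse arguments back into a dict."""
    fn = tc["function"]
    return {
        "id": tc["id"],
        "name": fn["name"],
        "args": json.loads(fn["arguments"] or "{}"),
    }


lc_call = {"id": "call_1", "name": "seerr_trending", "args": {"kind": "movie"}}
openai_call = lc_tool_call_to_openai(lc_call)
round_trip = openai_tool_call_to_lc(openai_call)
```

A round trip through both functions returns the original LangChain-format call, which is a convenient property to assert in a unit test for the real converters.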
---
@@ -130,28 +137,36 @@ When the LLM responds with a tool call, the loop:
2. chat_completions():
→ _resolve_agent(model="media-agent")
→ get_agent("media-agent") → Agent(skills=["media_info", "seerr", "triage"])
tools = get_all_tools(["media_info", "seerr", "triage"])
Returns 7 tool definitions from seerr.py
→ system_prompt = agent.build_system_prompt()
→ base_prompt + media_info fragment + seerr fragment + triage fragment
get_agent_graph("media-agent", request)
looks up app.state.agent_graphs["media-agent"]
→ first call → create_agent_graph() compiles the graph with 7 Seerr tools
→ run_agent_with_tools(request, messages, agent_id)
→ _invoke_graph(graph, messages)
3. run_agent_with_tools() — Turn 1:
→ LLM receives: [system prompt with tools] + [user: "What are trending movies?"]
→ LLM responds: tool_calls = [{"function": {"name": "seerr_trending", "arguments": {"kind": "movie"}}}]
3. Graph — Pass 1 (agent_node):
→ LLM receives: [system prompt] + [user: "What are trending movies?"]
→ LLM responds with tool_calls: seerr_trending(kind="movie")
→ agent_node returns AIMessage with tool_calls in LangChain format
4. Execute tool:
execute_tool(["media_info", "seerr", "triage"], "seerr_trending", {"kind": "movie"})
→ Finds seerr skill → calls _execute("seerr_trending", ...) → _trending(args)
→ GET /api/v1/discover/trending?mediaType=movie
→ Returns formatted list with [tmdb:IDs]
4. Graph — _should_continue:
AIMessage has tool_calls → route to "tool_node"
5. run_agent_with_tools() — Turn 2:
LLM receives: previous messages + [tool: "Found 20 trending movies..."]
LLM responds: text = "Here are the top trending movies! 🎬 ..."
finish_reason="stop" → return the text
5. Graph — tool_node:
Reads tool_call: name="seerr_trending", args={"kind": "movie"}
execute_tool(["media_info", "seerr", "triage"], "seerr_trending", ...)
Seerr API → GET /api/v1/discover/trending?mediaType=movie
→ Returns ToolMessage with formatted results including [tmdb:IDs]
6. chat_completions() returns:
{ "choices": [{"message": {"content": "Here are the top trending movies!..."}}] }
6. Graph — Pass 2 (agent_node):
→ LLM receives previous exchange + tool result
→ LLM responds with text only (no tool_calls)
→ agent_node returns AIMessage(content="Here are the top trending movies!...")
7. Graph — _should_continue:
→ No tool_calls → route to END
8. chat_completions() returns:
{ "choices": [{"message": {"role": "assistant", "content": "Here are the top..."}}] }
```
### Step-by-step: "Request the 2026 one" (multi-turn context)
@@ -172,14 +187,49 @@ When the LLM responds with a tool call, the loop:
2. chat_completions():
→ req.messages contains the ENTIRE conversation history
System prompt prepended → full_messages = [system] + 5 history messages
LLM sees everything: the trending list with [tmdb:931285], the disambiguation, "the 2026 one"
graph.ainvoke({"messages": all_messages})
agent_node prepends system prompt and sends everything to the LLM
3. LLM reasons:
- I previously listed Mortal Kombat II (2026) with [tmdb:931285]
3. LLM reasons from full context:
- Previously listed Mortal Kombat II (2026) with [tmdb:931285]
- The user said "request the mortal kombat one" → I searched and showed 4 options
- Now they say "the 2026 one" → that matches Mortal Kombat II (2026) [tmdb:931285]
- I should call seerr_request_media(kind="movie", title="Mortal Kombat II", tmdb_id=931285)
4. Tool executes the request → ✅ Success
4. tool_node executes the request → ✅ Success
```
---
## Streaming
Streaming works slightly differently from the non-streaming path:
```
chat_completions(stream=True)
→ _stream_graph(graph, messages)
→ graph.ainvoke(state) # runs graph to completion (tools execute silently)
→ yields content character-by-character via SSE
```
For true token-level streaming (tokens appear as the LLM generates them),
`agent_node` would have to emit tokens as they arrive, for example by switching
to `langchain-openai`'s `ChatOpenAI` instead of the raw `openai` client. The
current approach is a pragmatic middle ground: it avoids adding another
dependency while still giving the SSE client incremental output, although
nothing is sent until the graph has finished running.
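The "run to completion, then emit character by character" approach can be sketched with an async generator. This is an illustration under assumptions: `stream_text_as_sse` and `collect` are hypothetical names, and the chunk layout assumes an OpenAI-style `chat.completion.chunk` payload terminated by a `[DONE]` event; the actual `_stream_graph()` output may contain other fields.

```python
import asyncio
import json
from typing import AsyncIterator


async def stream_text_as_sse(text: str, model: str) -> AsyncIterator[str]:
    """Yield an already-complete answer as one SSE event per character."""
    for ch in text:
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": ch}, "finish_reason": None}],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
        await asyncio.sleep(0)  # give other tasks a chance to run between events
    yield "data: [DONE]\n\n"


async def collect(text: str, model: str) -> list[str]:
    """Gather all events into a list (for demonstration only)."""
    return [event async for event in stream_text_as_sse(text, model)]


events = asyncio.run(collect("Hi!", "media-agent"))
```

In a FastAPI endpoint a generator like this would typically be wrapped in a `StreamingResponse` with the `text/event-stream` media type.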
---
## File Map
| File | Responsibility |
|---|---|
| `main.py` | FastAPI app, singleton creation, router mounting |
| `api/v1/chat.py` | Endpoints — resolves agent, invokes graph, formats responses |
| `api/dependencies.py` | `get_llm_client()`, `get_agent_graph()` — FastAPI `Depends` |
| `core/graph.py` | `create_agent_graph()` — builds the StateGraph |
| `core/state.py` | `AgentState` TypedDict |
| `core/llm.py` | `create_client()` — OpenAI client factory |
| `core/config.py` | Environment variable loader |
| `agents/` | Agent definitions (dataclass + self-registration) |
| `skills/` | Skill definitions (prompt fragments + tools + executors) |