Implement LangGraph integration: refactor agent-tool interaction, add graph compilation, and enhance state management
Build and Push Agent API / build (push) Successful in 22s
+114
-64
@@ -1,6 +1,7 @@
|
||||
# API Architecture — Agent + Skill + Tool Pipeline
|
||||
# API Architecture — Agent + Skill + Graph Pipeline
|
||||
|
||||
This document explains how the API routes user messages through the agent/skill/tool pipeline to produce responses.
|
||||
This document explains how the API routes user messages through the
|
||||
agent / skill / LangGraph pipeline to produce responses.
|
||||
|
||||
---
|
||||
|
||||
@@ -17,27 +18,22 @@ This document explains how the API routes user messages through the agent/skill/
|
||||
│ api/v1/chat.py — chat_completions() │
|
||||
│ │
|
||||
│ 1. _resolve_agent(req.model) → Agent │
|
||||
│ 2. agent.build_system_prompt() → system prompt │
|
||||
│ 3. Build full_messages = [system] + req.messages │
|
||||
│ 4. run_agent_with_tools(client, messages, agent_id) │
|
||||
│ 2. get_agent_graph(agent_id) → compiled StateGraph │
|
||||
│ 3. graph.ainvoke(state) or _stream_graph(graph, messages) │
|
||||
└──────────────────────────────┬───────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌──────────────────────────────────────────────────────────────────┐
|
||||
│ Tool-Calling Loop (run_agent_with_tools / run_agent_stream) │
|
||||
│ LangGraph StateGraph (core/graph.py) │
|
||||
│ │
|
||||
│ while turns < max_turns: │
|
||||
│ response = LLM.chat(messages, tools=agent_tools) │
|
||||
│ if response has tool_calls: │
|
||||
│ for each tool_call: │
|
||||
│ result = execute_tool(skills, name, args) │
|
||||
│ append result to messages │
|
||||
│ else: │
|
||||
│ return response.text (stream tokens if streaming) │
|
||||
│ ┌──────────────┐ tool_calls? ┌──────────────┐ │
|
||||
│ │ agent_node │ ───────────────▶ │ tool_node │ │
|
||||
│ │ (LLM call) │ ◀─────────────── │ (skill exec) │ │
|
||||
│ └──────┬───────┘ └──────────────┘ │
|
||||
│ │ no tool_calls │
|
||||
│ ▼ │
|
||||
│ [END] │
|
||||
└──────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Concepts
|
||||
|
||||
@@ -61,7 +57,8 @@ Agent(
|
||||
- `build_system_prompt()` — merges base_prompt + all skill prompt fragments
|
||||
|
||||
Agents self-register at import time via `agents/__init__.py`'s `register()`.
|
||||
`main.py` calls `load_all_agents()` at startup to import all agent/skill modules.
|
||||
`main.py` calls `load_all_agents()` at startup to import every agent and skill
|
||||
module.
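
The self-registration pattern might look roughly like the sketch below. Only `register()`, `get_agent()`, `build_system_prompt()`, `base_prompt`, and `skills` are names taken from this document; the `_REGISTRY` dict, the `id` field, and the `skill_fragments` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical module-level registry; the real agents/__init__.py may differ.
_REGISTRY: dict[str, "Agent"] = {}


@dataclass
class Agent:
    id: str
    base_prompt: str
    skills: list[str] = field(default_factory=list)

    def build_system_prompt(self, skill_fragments: dict[str, str]) -> str:
        # Merge the base prompt with the prompt fragment of each attached skill.
        parts = [self.base_prompt]
        parts += [skill_fragments[name] for name in self.skills if name in skill_fragments]
        return "\n\n".join(parts)


def register(agent: Agent) -> Agent:
    """Add an agent to the registry; each agent module calls this at import time."""
    _REGISTRY[agent.id] = agent
    return agent


def get_agent(agent_id: str) -> Agent:
    return _REGISTRY[agent_id]


# A module-level call like this runs as a side effect of importing the module.
register(Agent(id="media-agent", base_prompt="You are a media assistant.", skills=["seerr"]))
```

Because registration happens as an import side effect, an agent module that is never imported is never registered, which is why a startup loader such as `load_all_agents()` is needed.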
|
||||
|
||||
### 2. Skill
|
||||
|
||||
@@ -78,37 +75,47 @@ Skill(
|
||||
)
|
||||
```
|
||||
|
||||
- `prompt_fragment` — injected into the agent's system prompt. Teaches the LLM what tools are available and when to use them.
|
||||
- `prompt_fragment` — injected into the agent's system prompt.
|
||||
- `tools` — list of OpenAI function definitions (name, description, parameters).
|
||||
- `execute` — async callable that routes tool calls to API handlers.
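
A minimal sketch of how a tool call could be routed to its owning skill. The field names follow the bullets above, but the exact `Skill` dataclass is an assumption, and this version of `execute_tool()` takes `Skill` objects directly rather than skill names, purely to keep the example self-contained.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class Skill:
    # Hypothetical shape; only the field names come from the document.
    name: str
    prompt_fragment: str
    tools: list[dict[str, Any]]
    execute: Callable[[str, dict[str, Any]], Awaitable[str]]


async def execute_tool(skills: list[Skill], name: str, args: dict[str, Any]) -> str:
    """Find the skill that declares `name` and delegate the call to it."""
    for skill in skills:
        declared = {tool["function"]["name"] for tool in skill.tools}
        if name in declared:
            return await skill.execute(name, args)
    return f"Unknown tool: {name}"


async def _demo_execute(name: str, args: dict[str, Any]) -> str:
    # Stand-in executor; a real one would call the backing API.
    return f"{name} called with {args}"


demo_skill = Skill(
    name="seerr",
    prompt_fragment="Use seerr_trending to list trending titles.",
    tools=[{
        "type": "function",
        "function": {
            "name": "seerr_trending",
            "description": "Get trending titles.",
            "parameters": {"type": "object", "properties": {}},
        },
    }],
    execute=_demo_execute,
)

result = asyncio.run(execute_tool([demo_skill], "seerr_trending", {"kind": "movie"}))
```

Returning an error string for an unknown tool, rather than raising, lets the LLM see the failure and recover on its next turn; whether the real implementation does this is not stated here.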
|
||||
|
||||
### 3. Tool
|
||||
### 3. Graph
|
||||
|
||||
A **Tool** is a single function the LLM can call. Defined as part of a skill's `tools` list.
|
||||
Each agent gets a **compiled LangGraph StateGraph** built by
|
||||
`core/graph.py:create_agent_graph()`. The graph is compiled lazily on the
|
||||
first request and cached on `app.state.agent_graphs` for the lifetime of the
|
||||
process.
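
The lazy compile-and-cache behaviour can be sketched as follows. This is an illustration under assumptions: the real `get_agent_graph()` receives the request object, whereas this version takes the state holder and a `build` callable as explicit parameters, and `SimpleNamespace` stands in for FastAPI's `app.state`.

```python
from types import SimpleNamespace
from typing import Any, Callable


def get_agent_graph(app_state: Any, agent_id: str, build: Callable[[str], Any]) -> Any:
    """Return the cached graph for agent_id, building it on first use."""
    cache = app_state.agent_graphs
    if agent_id not in cache:
        cache[agent_id] = build(agent_id)
    return cache[agent_id]


# SimpleNamespace stands in for app.state in this sketch.
state = SimpleNamespace(agent_graphs={})
build_calls: list[str] = []


def fake_build(agent_id: str) -> object:
    # Records each build so the caching behaviour is observable.
    build_calls.append(agent_id)
    return object()


first = get_agent_graph(state, "media-agent", fake_build)
second = get_agent_graph(state, "media-agent", fake_build)
```

The second lookup returns the same object without rebuilding, so compilation cost is paid once per agent per process.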
|
||||
|
||||
| Graph node / edge | What it does |
|
||||
|---|---|
|
||||
| `agent_node` | Converts state messages to OpenAI dicts, calls the LLM with the agent's system prompt + tool definitions, returns an `AIMessage` |
|
||||
| `tool_node` | Reads `tool_calls` from the last AI message, calls `execute_tool()` from the skill system, returns `ToolMessage` results |
|
||||
| `_should_continue` | Conditional edge — returns `"tool_node"` if the AI message has `tool_calls`, else `END` |
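
The control flow in the table can be simulated in plain Python, without LangGraph, to make the routing explicit. Everything here is a stand-in: `run_graph()`, the scripted `fake_agent_node()`, and `fake_tool_node()` are hypothetical helpers, and messages are plain dicts rather than LangChain message objects.

```python
from typing import Any, Callable

END = "__end__"  # sentinel used by this sketch to mean "stop"

State = dict[str, Any]
Node = Callable[[State], State]


def should_continue(state: State) -> str:
    """Route to tool_node when the last AI message requested tools."""
    last = state["messages"][-1]
    return "tool_node" if last.get("tool_calls") else END


def run_graph(state: State, agent_node: Node, tool_node: Node, max_steps: int = 10) -> State:
    """Alternate agent_node and tool_node until the agent stops calling tools."""
    for _ in range(max_steps):
        state["messages"] += agent_node(state)["messages"]
        if should_continue(state) == END:
            return state
        state["messages"] += tool_node(state)["messages"]
    raise RuntimeError("step limit reached")


def fake_agent_node(state: State) -> State:
    # Scripted stand-in for the LLM: call a tool once, then answer.
    if any(m["role"] == "tool" for m in state["messages"]):
        return {"messages": [{"role": "assistant", "content": "Here are the trending movies."}]}
    call = {"id": "call_1", "name": "seerr_trending", "args": {"kind": "movie"}}
    return {"messages": [{"role": "assistant", "content": "", "tool_calls": [call]}]}


def fake_tool_node(state: State) -> State:
    # Produce one tool result per requested call.
    calls = state["messages"][-1]["tool_calls"]
    results = [
        {"role": "tool", "tool_call_id": c["id"], "content": "Found 20 trending movies"}
        for c in calls
    ]
    return {"messages": results}


final = run_graph(
    {"messages": [{"role": "user", "content": "What are trending movies?"}]},
    fake_agent_node,
    fake_tool_node,
)
```

The `max_steps` guard is an addition of this sketch; it plays the role that `max_turns` played in the previous hand-written loop.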
|
||||
|
||||
### 4. State
|
||||
|
||||
Defined in `core/state.py`:
|
||||
|
||||
```python
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "seerr_trending",
|
||||
"description": "Get trending movies and TV shows from Seerr...",
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"kind": {"type": "string", "enum": ["movie", "tv", "all"]},
|
||||
"language": {"type": "string"},
|
||||
},
|
||||
"required": ["kind"],
|
||||
},
|
||||
},
|
||||
}
|
||||
class AgentState(TypedDict):
|
||||
messages: Annotated[list, add_messages]
|
||||
```
|
||||
|
||||
When the LLM responds with a tool call, the loop:
|
||||
1. Extracts `function.name` (e.g. `"seerr_trending"`) and `function.arguments` (e.g. `{"kind": "movie"}`)
|
||||
2. Calls `execute_tool(agent.skills, name, args)` which finds the owning skill and runs it
|
||||
3. Appends the result text to the message history
|
||||
4. Sends back to the LLM for a follow-up response
|
||||
LangGraph's `add_messages` reducer appends new messages and replaces existing
|
||||
messages that share an ID; `ToolMessage` results are appended as new entries.
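
The append-or-replace-by-ID behaviour can be imitated in a few lines of plain Python. This is a simplified illustration of the semantics, not LangGraph's implementation; `merge_messages()` is a hypothetical name and messages are plain dicts.

```python
from typing import Any

Message = dict[str, Any]


def merge_messages(existing: list[Message], new: list[Message]) -> list[Message]:
    """Append new messages, replacing any existing message with the same id."""
    merged = list(existing)
    index_by_id = {m["id"]: i for i, m in enumerate(merged) if m.get("id") is not None}
    for message in new:
        message_id = message.get("id")
        if message_id is not None and message_id in index_by_id:
            # Same id: overwrite in place, keeping the original position.
            merged[index_by_id[message_id]] = message
        else:
            if message_id is not None:
                index_by_id[message_id] = len(merged)
            merged.append(message)
    return merged


history = [{"id": "m1", "role": "user", "content": "hi"}]
history = merge_messages(history, [{"id": "m2", "role": "assistant", "content": "draft"}])
history = merge_messages(history, [{"id": "m2", "role": "assistant", "content": "final"}])
```

After the third call the history still holds two messages, with the second one updated in place.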
|
||||
|
||||
### 5. Message Conversion
|
||||
|
||||
Because we use the raw `openai` client (not `langchain-openai`), messages must
|
||||
be converted between LangChain and OpenAI formats at every LLM call:
|
||||
|
||||
- **LangChain → OpenAI** (`_lc_role_to_openai`, `_langchain_tc_to_openai`):
|
||||
Maps `type` → `role` and converts top-level `name`/`args` tool-calls into
|
||||
the nested `function` sub-object that the OpenAI API expects.
|
||||
|
||||
- **OpenAI → LangChain** (inside `agent_node`):
|
||||
Converts the `ChatCompletionMessage` response into an `AIMessage` with
|
||||
LangChain-format `tool_calls` (top-level `name`/`args`/`id`).
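
A sketch of both conversion directions using plain dicts. The helper names mirror those mentioned above without the leading underscore, and `openai_tc_to_langchain()` is a hypothetical name for logic the document places inside `agent_node`. The one detail that is easy to miss is that OpenAI carries `arguments` as a JSON string, while LangChain carries `args` as a dict.

```python
import json
from typing import Any

# LangChain message `type` mapped to OpenAI `role`.
_ROLE_BY_TYPE = {"human": "user", "ai": "assistant", "system": "system", "tool": "tool"}


def lc_role_to_openai(message_type: str) -> str:
    return _ROLE_BY_TYPE[message_type]


def langchain_tc_to_openai(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Nest name/args under `function`; arguments become a JSON string."""
    return {
        "id": tool_call["id"],
        "type": "function",
        "function": {
            "name": tool_call["name"],
            "arguments": json.dumps(tool_call["args"]),
        },
    }


def openai_tc_to_langchain(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Inverse direction: parse the JSON arguments back into a dict."""
    fn = tool_call["function"]
    return {"id": tool_call["id"], "name": fn["name"], "args": json.loads(fn["arguments"])}


lc_call = {"id": "call_1", "name": "seerr_trending", "args": {"kind": "movie"}}
openai_call = langchain_tc_to_openai(lc_call)
```

Round-tripping a call through both helpers should return the original dict, which makes a convenient unit test for the conversion layer.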
|
||||
|
||||
---
|
||||
|
||||
@@ -130,28 +137,36 @@ When the LLM responds with a tool call, the loop:
|
||||
2. chat_completions():
|
||||
→ _resolve_agent(model="media-agent")
|
||||
→ get_agent("media-agent") → Agent(skills=["media_info", "seerr", "triage"])
|
||||
→ tools = get_all_tools(["media_info", "seerr", "triage"])
|
||||
→ Returns 7 tool definitions from seerr.py
|
||||
→ system_prompt = agent.build_system_prompt()
|
||||
→ base_prompt + media_info fragment + seerr fragment + triage fragment
|
||||
→ get_agent_graph("media-agent", request)
|
||||
→ looks up app.state.agent_graphs["media-agent"]
|
||||
→ first call → create_agent_graph() compiles the graph with 7 Seerr tools
|
||||
→ run_agent_with_tools(request, messages, agent_id)
|
||||
→ _invoke_graph(graph, messages)
|
||||
|
||||
3. run_agent_with_tools() — Turn 1:
|
||||
→ LLM receives: [system prompt with tools] + [user: "What are trending movies?"]
|
||||
→ LLM responds: tool_calls = [{"function": {"name": "seerr_trending", "arguments": {"kind": "movie"}}}]
|
||||
3. Graph — Pass 1 (agent_node):
|
||||
→ LLM receives: [system prompt] + [user: "What are trending movies?"]
|
||||
→ LLM responds with tool_calls: seerr_trending(kind="movie")
|
||||
→ agent_node returns AIMessage with tool_calls in LangChain format
|
||||
|
||||
4. Execute tool:
|
||||
→ execute_tool(["media_info", "seerr", "triage"], "seerr_trending", {"kind": "movie"})
|
||||
→ Finds seerr skill → calls _execute("seerr_trending", ...) → _trending(args)
|
||||
→ GET /api/v1/discover/trending?mediaType=movie
|
||||
→ Returns formatted list with [tmdb:IDs]
|
||||
4. Graph — _should_continue:
|
||||
→ AIMessage has tool_calls → route to "tool_node"
|
||||
|
||||
5. run_agent_with_tools() — Turn 2:
|
||||
→ LLM receives: previous messages + [tool: "Found 20 trending movies..."]
|
||||
→ LLM responds: text = "Here are the top trending movies! 🎬 ..."
|
||||
→ finish_reason="stop" → return the text
|
||||
5. Graph — tool_node:
|
||||
→ Reads tool_call: name="seerr_trending", args={"kind": "movie"}
|
||||
→ execute_tool(["media_info", "seerr", "triage"], "seerr_trending", ...)
|
||||
→ Seerr API → GET /api/v1/discover/trending?mediaType=movie
|
||||
→ Returns ToolMessage with formatted results including [tmdb:IDs]
|
||||
|
||||
6. chat_completions() returns:
|
||||
{ "choices": [{"message": {"content": "Here are the top trending movies!..."}}] }
|
||||
6. Graph — Pass 2 (agent_node):
|
||||
→ LLM receives previous exchange + tool result
|
||||
→ LLM responds with text only (no tool_calls)
|
||||
→ agent_node returns AIMessage(content="Here are the top trending movies!...")
|
||||
|
||||
7. Graph — _should_continue:
|
||||
→ No tool_calls → route to END
|
||||
|
||||
8. chat_completions() returns:
|
||||
{ "choices": [{"message": {"role": "assistant", "content": "Here are the top..."}}] }
|
||||
```
|
||||
|
||||
### Step-by-step: "Request the 2026 one" (multi-turn context)
|
||||
@@ -172,14 +187,49 @@ When the LLM responds with a tool call, the loop:
|
||||
|
||||
2. chat_completions():
|
||||
→ req.messages contains the ENTIRE conversation history
|
||||
→ System prompt prepended → full_messages = [system] + 5 history messages
|
||||
→ LLM sees everything: the trending list with [tmdb:931285], the disambiguation, "the 2026 one"
|
||||
→ graph.ainvoke({"messages": all_messages})
|
||||
→ agent_node prepends system prompt and sends everything to the LLM
|
||||
|
||||
3. LLM reasons:
|
||||
- I previously listed Mortal Kombat II (2026) with [tmdb:931285]
|
||||
3. LLM reasons from full context:
|
||||
- Previously listed Mortal Kombat II (2026) with [tmdb:931285]
|
||||
- The user said "request the mortal kombat one" → I searched and showed 4 options
|
||||
- Now they say "the 2026 one" → that matches Mortal Kombat II (2026) [tmdb:931285]
|
||||
- I should call seerr_request_media(kind="movie", title="Mortal Kombat II", tmdb_id=931285)
|
||||
|
||||
4. Tool executes the request → ✅ Success
|
||||
4. tool_node executes the request → ✅ Success
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Streaming
|
||||
|
||||
Streaming differs from the non-streaming path only in how the final content is delivered:
|
||||
|
||||
```
|
||||
chat_completions(stream=True)
|
||||
→ _stream_graph(graph, messages)
|
||||
→ graph.ainvoke(state) # runs graph to completion (tools execute silently)
|
||||
→ yields content character-by-character via SSE
|
||||
```
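
The character-by-character step could be sketched as a generator that wraps each character in an SSE event. The payload shape below is an assumption modelled on OpenAI's `chat.completion.chunk` format, trimmed to the fields a client needs to read the delta; `sse_chunks()` is a hypothetical name.

```python
import json
from typing import Iterator


def sse_chunks(content: str, model: str = "media-agent") -> Iterator[str]:
    """Yield one SSE event per character, then the terminating [DONE] event."""
    for char in content:
        payload = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": char}, "finish_reason": None}],
        }
        # Each SSE event is a `data:` line followed by a blank line.
        yield f"data: {json.dumps(payload)}\n\n"
    yield "data: [DONE]\n\n"


events = list(sse_chunks("Hi!"))
```

Since the graph has already run to completion by this point, the generator only paces delivery; it does not reduce time to first token.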
|
||||
|
||||
For true token-level streaming (tokens appear as the LLM generates them),
|
||||
the agent_node would need to stream from the LLM itself, for example via
|
||||
`langchain-openai`'s `ChatOpenAI` instead of the raw `openai` client. The
|
||||
current approach is a pragmatic middle ground that avoids adding another
|
||||
dependency while still giving the SSE client incremental output.
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
| File | Responsibility |
|
||||
|---|---|
|
||||
| `main.py` | FastAPI app, singleton creation, router mounting |
|
||||
| `api/v1/chat.py` | Endpoints — resolves agent, invokes graph, formats responses |
|
||||
| `api/dependencies.py` | `get_llm_client()`, `get_agent_graph()` — FastAPI `Depends` |
|
||||
| `core/graph.py` | `create_agent_graph()` — builds the StateGraph |
|
||||
| `core/state.py` | `AgentState` TypedDict |
|
||||
| `core/llm.py` | `create_client()` — OpenAI client factory |
|
||||
| `core/config.py` | Environment variable loader |
|
||||
| `agents/` | Agent definitions (dataclass + self-registration) |
|
||||
| `skills/` | Skill definitions (prompt fragments + tools + executors) |
|
||||
|
||||