added some quick .md files

2026-05-25 12:24:53 +02:00
parent b0f10b6bb1
commit 4b87b817a8
4 changed files with 385 additions and 214 deletions
@@ -0,0 +1,106 @@
+# V1 — Chat & Agent API Endpoints
+
+This is the primary HTTP API surface for the chatbot agent system. It exposes
+both a custom streaming chat endpoint and an OpenAI-compatible
+`/chat/completions` endpoint so it works as a drop-in backend for OpenWebUI,
+LibreChat, or any OpenAI-compatible client.
+
+---
+
+## Endpoints
+
+| Method | Path | Description |
+|---|---|---|
+| `GET ` | `/v1/` | Health check — returns `{"status": "ok"}` |
+| `GET ` | `/v1/agents` | List all registered agents (id + description) |
+| `GET ` | `/v1/models` | OpenAI-compatible model list (one entry per agent) |
+| `POST` | `/v1/chat` | Chat with an agent — streaming (SSE) |
+| `POST` | `/v1/chat/sync` | Chat with an agent — non-streaming |
+| `POST` | `/v1/chat/completions` | OpenAI-compatible chat completions (supports `stream: true`) |
+
+All `/v1/*` endpoints are mounted by `main.py` via:
+
+```python
+app.include_router(v1_router, prefix="/v1")
+```
+
+---
+
+## Agent Resolution
+
+Each request can target a specific agent. The resolution order is:
+
+1. **Explicit `agent_id`** field in the request body
+2. **OpenAI `model` field** (OpenWebUI sends this — mapped to `agent_id` if a matching agent is registered)
+3. **Fallback** to the `"naked"` agent (a plain LLM with no tools)
+
+This means an OpenWebUI client can simply set `model: "media-agent"` and get
+the full Media Agent with Seerr tools.
+
+---
+
+## Request Flow
+
+```
+Client (OpenWebUI / HTTP)
+  │  POST /v1/chat/completions
+  │  { model: "media-agent", messages: [...], stream: true/false }
+  ▼
+chat_completions()
+  │  1. _resolve_agent(req.model) → Agent(id="media-agent", skills=[...])
+  │  2. get_agent_graph("media-agent", request)
+  │     → lazy-compiled LangGraph StateGraph, cached on app.state
+  │  3. stream=True  → _stream_graph(graph, messages)  → SSE token stream
+  │     stream=False → _invoke_graph(graph, messages)   → plain response
+  ▼
+LangGraph StateGraph  (src/graph.py)
+  │
+  ├── agent_node: calls LLM with system prompt + tool definitions
+  │   └── LLM returns text OR tool_calls
+  │
+  ├── _should_continue: if tool_calls → tool_node, else → END
+  │
+  └── tool_node: executes tool via agents/skills system → ToolMessage
+      └── loops back to agent_node with the result
+```
+
+For a detailed walkthrough, see [api.md](../api.md).
+
+---
+
+## Streaming
+
+Two streaming modes exist:
+
+### SSE (Server-Sent Events) — `/v1/chat`
+```
+data: {"token": "Here"}
+data: {"token": " are"}
+data: {"token": " the"}
+...
+data: [DONE]
+```
+
+The graph runs to completion (tools execute silently), then the final text is
+yielded token-by-token as SSE events.
+
+### OpenAI-compatible — `/v1/chat/completions` with `stream: true`
+```
+data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}
+data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"}}]}
+data: [DONE]
+```
+
+> **Future improvement:** true token-level streaming (tokens appear as the LLM
+> generates them) would require using `langchain-openai`'s `ChatOpenAI` in
+> place of the raw `openai` client. The current approach avoids adding that
+> dependency.
+
+---
+
+## Dependencies
+
+Endpoints receive shared singletons via FastAPI `Depends`:
+
+- **`get_llm_client(request)`** → returns `request.app.state.llm_client` (OpenAI client singleton, created once in `main.py`)
+- **`get_agent_graph(agent_id, request)`** → returns a lazy-compiled LangGraph from `request.app.state.agent_graphs`