added some quick .md files
This commit is contained in:
@@ -0,0 +1,106 @@
|
||||
# V1 — Chat & Agent API Endpoints
|
||||
|
||||
This is the primary HTTP API surface for the chatbot agent system. It exposes
|
||||
both a custom streaming chat endpoint and an OpenAI-compatible
|
||||
`/chat/completions` endpoint so it works as a drop-in backend for OpenWebUI,
|
||||
LibreChat, or any OpenAI-compatible client.
|
||||
|
||||
---
|
||||
|
||||
## Endpoints
|
||||
|
||||
| Method | Path | Description |
|
||||
|---|---|---|
|
||||
| `GET ` | `/v1/` | Health check — returns `{"status": "ok"}` |
|
||||
| `GET ` | `/v1/agents` | List all registered agents (id + description) |
|
||||
| `GET ` | `/v1/models` | OpenAI-compatible model list (one entry per agent) |
|
||||
| `POST` | `/v1/chat` | Chat with an agent — streaming (SSE) |
|
||||
| `POST` | `/v1/chat/sync` | Chat with an agent — non-streaming |
|
||||
| `POST` | `/v1/chat/completions` | OpenAI-compatible chat completions (supports `stream: true`) |
|
||||
|
||||
All `/v1/*` endpoints are mounted by `main.py` via:
|
||||
|
||||
```python
|
||||
app.include_router(v1_router, prefix="/v1")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Agent Resolution
|
||||
|
||||
Each request can target a specific agent. The resolution order is:
|
||||
|
||||
1. **Explicit `agent_id`** field in the request body
|
||||
2. **OpenAI `model` field** (OpenWebUI sends this — mapped to `agent_id` if a matching agent is registered)
|
||||
3. **Fallback** to the `"naked"` agent (a plain LLM with no tools)
|
||||
|
||||
This means an OpenWebUI client can simply set `model: "media-agent"` and get
|
||||
the full Media Agent with Seerr tools.
|
||||
|
||||
---
|
||||
|
||||
## Request Flow
|
||||
|
||||
```
|
||||
Client (OpenWebUI / HTTP)
|
||||
│ POST /v1/chat/completions
|
||||
│ { model: "media-agent", messages: [...], stream: true/false }
|
||||
▼
|
||||
chat_completions()
|
||||
│ 1. _resolve_agent(req.model) → Agent(id="media-agent", skills=[...])
|
||||
│ 2. get_agent_graph("media-agent", request)
|
||||
│ → lazy-compiled LangGraph StateGraph, cached on app.state
|
||||
│ 3. stream=True → _stream_graph(graph, messages) → SSE token stream
|
||||
│ stream=False → _invoke_graph(graph, messages) → plain response
|
||||
▼
|
||||
LangGraph StateGraph (src/graph.py)
|
||||
│
|
||||
├── agent_node: calls LLM with system prompt + tool definitions
|
||||
│ └── LLM returns text OR tool_calls
|
||||
│
|
||||
├── _should_continue: if tool_calls → tool_node, else → END
|
||||
│
|
||||
└── tool_node: executes tool via agents/skills system → ToolMessage
|
||||
└── loops back to agent_node with the result
|
||||
```
|
||||
|
||||
For a detailed walkthrough, see [api.md](../api.md).
|
||||
|
||||
---
|
||||
|
||||
## Streaming
|
||||
|
||||
Two streaming modes exist:
|
||||
|
||||
### SSE (Server-Sent Events) — `/v1/chat`
|
||||
```
|
||||
data: {"token": "Here"}
|
||||
data: {"token": " are"}
|
||||
data: {"token": " the"}
|
||||
...
|
||||
data: [DONE]
|
||||
```
|
||||
|
||||
The graph runs to completion (tools execute silently), then the final text is
|
||||
yielded token-by-token as SSE events.
|
||||
|
||||
### OpenAI-compatible — `/v1/chat/completions` with `stream: true`
|
||||
```
|
||||
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}
|
||||
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"}}]}
|
||||
data: [DONE]
|
||||
```
|
||||
|
||||
> **Future improvement:** true token-level streaming (tokens appear as the LLM
|
||||
> generates them) would require using `langchain-openai`'s `ChatOpenAI` in
|
||||
> place of the raw `openai` client. The current approach avoids adding that
|
||||
> dependency.
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
Endpoints receive shared singletons via FastAPI `Depends`:
|
||||
|
||||
- **`get_llm_client(request)`** → returns `request.app.state.llm_client` (OpenAI client singleton, created once in `main.py`)
|
||||
- **`get_agent_graph(agent_id, request)`** → returns a lazy-compiled LangGraph from `request.app.state.agent_graphs`
|
||||
Reference in New Issue
Block a user