This document provides comprehensive documentation for the PocketLLM backend API, built with NestJS. The API follows RESTful principles and provides endpoints for authentication, user management, chat functionality, background jobs, model and provider management, and agent workflows.
The backend is structured as a NestJS application with the following key modules:
- Auth Module: User authentication and authorization
- Users Module: User profile management
- Chats Module: Chat sessions and message handling
- Jobs Module: Background task processing (image generation)
- Providers Module: Integration with AI providers (OpenAI, Anthropic, Ollama)
- Models Module: Model catalog aggregation and saved model management
- Prompt Enhancer Module: Task-specific prompt improvement
- Agents Module: LangChain/LangGraph agent execution
Base URL (managed deployment): https://pocket-llm-api.vercel.app/v1
For self-hosted deployments, replace the base URL with your server address.
Swagger UI for the managed deployment is available at https://pocket-llm-api.vercel.app/docs (legacy path https://pocket-llm-api.vercel.app/api/docs) unless disabled via ENABLE_SWAGGER_DOCS. The root endpoint (GET /) echoes
the active docs path for quick verification.
All protected endpoints require a valid JWT token in the Authorization header:
Authorization: Bearer <access_token>
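For clients assembling requests by hand, the required headers can be built with a small helper (a sketch; the token value itself comes from the signin response):

```python
def auth_headers(access_token: str) -> dict:
    """Build the headers required by protected PocketLLM endpoints."""
    if not access_token:
        raise ValueError("access_token must be a non-empty JWT")
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
```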
POST /auth/signup
Register a new user account.
Request Body:
{
"email": "[email protected]",
"password": "securepassword123"
}

Response:
{
"success": true,
"data": {
"user": {
"id": "uuid",
"email": "[email protected]",
"created_at": "2023-10-27T10:00:00.000Z",
"aud": "authenticated",
"role": "authenticated"
},
"session": null,
"message": "User created successfully. Please sign in to get a session."
}
}

POST /auth/signin
Authenticate a user and obtain an access token.
Request Body:
{
"email": "[email protected]",
"password": "securepassword123"
}

Response:
{
"success": true,
"data": {
"user": {
"id": "uuid",
"email": "[email protected]",
"created_at": "2023-10-27T10:00:00.000Z"
},
"session": {
"access_token": "jwt_token_here",
"refresh_token": "refresh_token_here",
"expires_in": 3600,
"token_type": "bearer"
}
}
}

POST /auth/refresh
Exchange a valid refresh token for a new access token without requiring the user to re-enter credentials.
Request Body:
{
"refresh_token": "refresh_token_here"
}

Response:
{
"success": true,
"data": {
"tokens": {
"access_token": "new_jwt_token",
"refresh_token": "refresh_token_here",
"expires_in": 3600,
"token_type": "bearer"
},
"session": {
"session_id": "uuid",
"expires_at": "2023-10-27T11:00:00.000Z"
}
}
}

User profile operations require authentication.
GET /users/profile
Retrieve the authenticated user's profile information.
Response:
{
"success": true,
"data": {
"id": "uuid",
"full_name": "John Doe",
"username": "johndoe",
"bio": "Software developer",
"date_of_birth": "1990-01-01",
"profession": "Developer",
"avatar_url": "https://example.com/avatar.jpg",
"survey_completed": true,
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T10:00:00.000Z"
}
}

PUT /users/profile
Update the authenticated user's profile information.
Request Body:
{
"full_name": "John Doe Updated",
"username": "johndoe_new",
"bio": "Senior Software Developer",
"date_of_birth": "1990-01-01",
"profession": "Senior Developer",
"avatar_url": "https://example.com/new_avatar.jpg",
"survey_completed": true
}

Response:
{
"success": true,
"data": {
"id": "uuid",
"full_name": "John Doe Updated",
"username": "johndoe_new",
"bio": "Senior Software Developer",
"date_of_birth": "1990-01-01",
"profession": "Senior Developer",
"avatar_url": "https://example.com/new_avatar.jpg",
"survey_completed": true,
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T11:00:00.000Z"
}
}

POST /users/profile/onboarding
Capture the user's onboarding answers, mark their profile as complete, and persist structured survey data.
Request Body:
{
"full_name": "Ada Lovelace",
"username": "adalovelace",
"bio": "Working on AI productivity workflows",
"date_of_birth": "1995-06-15",
"profession": "Software Engineer",
"heard_from": "Friend recommendation",
"avatar_url": "https://example.com/avatar.png",
"age": 28,
"onboarding": {
"primary_goal": "Improve daily productivity with AI assistance",
"interests": ["Productivity", "Coding"],
"experience_level": "Intermediate",
"usage_frequency": "Daily",
"other_notes": "Prefers concise answers and actionable steps."
}
}

Response:
{
"success": true,
"data": {
"id": "uuid",
"email": "[email protected]",
"full_name": "Ada Lovelace",
"username": "adalovelace",
"bio": "Working on AI productivity workflows",
"date_of_birth": "1995-06-15",
"profession": "Software Engineer",
"heard_from": "Friend recommendation",
"avatar_url": "https://example.com/avatar.png",
"survey_completed": true,
"age": 28,
"onboarding": {
"primary_goal": "Improve daily productivity with AI assistance",
"interests": ["Productivity", "Coding"],
"experience_level": "Intermediate",
"usage_frequency": "Daily",
"other_notes": "Prefers concise answers and actionable steps."
},
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T10:00:00.000Z"
}
}

DELETE /users/profile
Permanently delete the authenticated user's account and all associated data.
Response:
{
"success": true,
"data": {
"message": "User account deleted successfully"
}
}

Chat operations require authentication and are scoped to the authenticated user.
GET /chats
Retrieve all chats for the authenticated user.
Response:
{
"success": true,
"data": [
{
"id": "uuid",
"user_id": "uuid",
"title": "My Chat",
"model_config": {
"provider": "openai",
"model": "gpt-4",
"apiKey": "sk-...",
"systemPrompt": "You are a helpful assistant",
"temperature": 0.7,
"maxTokens": 1000
},
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T10:00:00.000Z"
}
]
}

POST /chats
Create a new chat session.
Request Body:
{
"title": "New Chat",
"model_config": {
"provider": "openai",
"model": "gpt-4",
"apiKey": "sk-...",
"systemPrompt": "You are a helpful assistant",
"temperature": 0.7,
"maxTokens": 1000
}
}

Response:
{
"success": true,
"data": {
"id": "uuid",
"user_id": "uuid",
"title": "New Chat",
"model_config": {
"provider": "openai",
"model": "gpt-4",
"apiKey": "sk-...",
"systemPrompt": "You are a helpful assistant",
"temperature": 0.7,
"maxTokens": 1000
},
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T10:00:00.000Z"
}
}

GET /chats/{chatId}
Retrieve details of a specific chat.
Response:
{
"success": true,
"data": {
"id": "uuid",
"user_id": "uuid",
"title": "My Chat",
"model_config": {
"provider": "openai",
"model": "gpt-4"
},
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T10:00:00.000Z"
}
}

PUT /chats/{chatId}
Update a chat's title or model configuration.
Request Body:
{
"title": "Updated Chat Title",
"model_config": {
"provider": "anthropic",
"model": "claude-3-sonnet",
"apiKey": "sk-...",
"temperature": 0.5
}
}

Response:
{
"success": true,
"data": {
"id": "uuid",
"user_id": "uuid",
"title": "Updated Chat Title",
"model_config": {
"provider": "anthropic",
"model": "claude-3-sonnet",
"temperature": 0.5
},
"created_at": "2023-10-27T10:00:00.000Z",
"updated_at": "2023-10-27T11:00:00.000Z"
}
}

DELETE /chats/{chatId}
Permanently delete a chat and all its messages.
Response:
{
"success": true,
"data": {
"message": "Chat deleted successfully"
}
}

POST /chats/{chatId}/messages
Send a message in a chat and receive the AI response.
Request Body:
{
"content": "Hello, how are you?",
"model_config": {
"provider": "openai",
"model": "gpt-4",
"apiKey": "sk-...",
"systemPrompt": "You are a helpful assistant",
"temperature": 0.7,
"maxTokens": 1000
}
}

Response:
{
"success": true,
"data": {
"userMessage": {
"id": "uuid",
"chat_id": "uuid",
"content": "Hello, how are you?",
"role": "user",
"created_at": "2023-10-27T10:00:00.000Z"
},
"assistantMessage": {
"id": "uuid",
"chat_id": "uuid",
"content": "Hello! I'm doing well, thank you for asking. How can I help you today?",
"role": "assistant",
"created_at": "2023-10-27T10:00:01.000Z"
}
}
}

GET /chats/{chatId}/messages?limit=50&offset=0
Retrieve messages from a specific chat.
Query Parameters:
- limit (optional): maximum number of messages (default: 50, max: 100)
- offset (optional): number of messages to skip (default: 0)
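The limit/offset parameters support straightforward page iteration. A hedged sketch of the offset sequence a client would request when fetching a known message count (the HTTP call itself is left abstract):

```python
def page_offsets(total: int, limit: int = 50) -> list[int]:
    """Offsets needed to fetch `total` messages in pages of `limit` (capped at 100)."""
    limit = min(limit, 100)
    return list(range(0, total, limit))
```

For example, 120 messages at the default page size yields offsets 0, 50, and 100.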
Response:
{
"success": true,
"data": [
{
"id": "uuid",
"chat_id": "uuid",
"content": "Hello, how are you?",
"role": "user",
"created_at": "2023-10-27T10:00:00.000Z"
},
{
"id": "uuid",
"chat_id": "uuid",
"content": "Hello! I'm doing well, thank you for asking.",
"role": "assistant",
"created_at": "2023-10-27T10:00:01.000Z"
}
]
}

Job operations require authentication and are scoped to the authenticated user.
GET /jobs?status=completed&limit=10
Retrieve background jobs for the authenticated user.
Query Parameters:
- status (optional): pending, processing, completed, failed, cancelled
- type (optional): image_generation
- limit (optional): max 100, default 50
- offset (optional): default 0
Response:
{
"success": true,
"data": [
{
"id": "uuid",
"user_id": "uuid",
"type": "image_generation",
"status": "completed",
"parameters": {
"prompt": "A beautiful sunset",
"model": "dall-e-3",
"size": "1024x1024",
"quality": "hd"
},
"result": {
"imageUrl": "https://example.com/image.png",
"revisedPrompt": "A beautiful sunset over mountains"
},
"estimated_cost": 0.04,
"actual_cost": 0.04,
"created_at": "2023-10-27T10:00:00.000Z",
"completed_at": "2023-10-27T10:01:00.000Z"
}
]
}

POST /jobs/image-generation
Create a new image generation job.
Request Body:
{
"prompt": "A beautiful sunset over mountains",
"model": "dall-e-3",
"size": "1024x1024",
"quality": "hd",
"style": "vivid",
"n": 1
}

Response:
{
"success": true,
"data": {
"id": "uuid",
"user_id": "uuid",
"type": "image_generation",
"status": "pending",
"parameters": {
"prompt": "A beautiful sunset over mountains",
"model": "dall-e-3",
"size": "1024x1024",
"quality": "hd",
"style": "vivid",
"n": 1
},
"estimated_cost": 0.04,
"created_at": "2023-10-27T10:00:00.000Z"
}
}

GET /jobs/{jobId}
Retrieve details of a specific job.
Response:
{
"success": true,
"data": {
"id": "uuid",
"user_id": "uuid",
"type": "image_generation",
"status": "completed",
"parameters": {
"prompt": "A beautiful sunset over mountains",
"model": "dall-e-3",
"size": "1024x1024"
},
"result": {
"imageUrl": "https://example.com/image.png",
"revisedPrompt": "A beautiful sunset over mountains with golden light"
},
"estimated_cost": 0.04,
"actual_cost": 0.04,
"created_at": "2023-10-27T10:00:00.000Z",
"completed_at": "2023-10-27T10:01:00.000Z"
}
}

DELETE /jobs/{jobId}
Cancel or delete a job.
Response:
{
"success": true,
"data": {
"message": "Job cancelled/deleted successfully"
}
}

POST /jobs/{jobId}/retry
Retry a failed job.
Response:
{
"success": true,
"data": {
"id": "uuid",
"status": "pending",
"message": "Job queued for retry"
}
}

GET /jobs/image-generation/models
Retrieve available image generation models and their capabilities.
Response:
{
"success": true,
"data": [
{
"name": "dall-e-3",
"provider": "openai",
"sizes": ["1024x1024", "1792x1024", "1024x1792"],
"quality": ["standard", "hd"],
"pricing": {
"1024x1024_standard": 0.04,
"1024x1024_hd": 0.08,
"1792x1024_standard": 0.08,
"1792x1024_hd": 0.12
}
},
{
"name": "dall-e-2",
"provider": "openai",
"sizes": ["256x256", "512x512", "1024x1024"],
"quality": ["standard"],
"pricing": {
"256x256": 0.016,
"512x512": 0.018,
"1024x1024": 0.02
}
}
]
}

POST /jobs/image-generation/estimate-cost
Estimate the cost of an image generation request.
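The estimate can be reproduced client-side from the pricing table returned by GET /jobs/image-generation/models. The prices below mirror the documented dall-e-3 table; treat them as illustrative, since the server response is authoritative:

```python
# dall-e-3 pricing as documented by GET /jobs/image-generation/models
DALL_E_3_PRICING = {
    ("1024x1024", "standard"): 0.04,
    ("1024x1024", "hd"): 0.08,
    ("1792x1024", "standard"): 0.08,
    ("1792x1024", "hd"): 0.12,
}

def estimate_cost(size: str, quality: str, n: int = 1) -> float:
    """Unit price times quantity, matching the estimate-cost breakdown."""
    unit = DALL_E_3_PRICING[(size, quality)]
    return round(unit * n, 4)
```

For the sample request above (1024x1024, hd, n=2), this gives 0.08 × 2 = 0.16, matching the documented response.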
Request Body:
{
"model": "dall-e-3",
"size": "1024x1024",
"quality": "hd",
"n": 2
}

Response:
{
"success": true,
"data": {
"estimatedCost": 0.16,
"currency": "USD",
"breakdown": {
"model": "dall-e-3",
"size": "1024x1024",
"quality": "hd",
"quantity": 2,
"unitCost": 0.08
}
}
}

All API responses follow a standardized format:
Success Response:
{
"success": true,
"data": { ... },
"error": null,
"metadata": {
"timestamp": "2023-10-27T10:00:00.000Z",
"requestId": "uuid-v4-string",
"processingTime": 123.45
}
}

Error Response:
{
"success": false,
"data": null,
"error": {
"message": "Error description here"
},
"metadata": {
"timestamp": "2023-10-27T10:00:00.000Z",
"requestId": "uuid",
"processingTime": 123.45
}
}

Model operations require authentication. All responses follow the standard envelope documented above.
GET /models
Aggregate live models from every configured provider. The backend validates that API keys exist before invoking each official SDK and returns human-friendly guidance when configuration is missing.
Query Parameters:
| Parameter | Description |
|---|---|
| provider | Optional provider identifier (openai, groq, openrouter, imagerouter). |
| name | Case-insensitive substring filter applied to model names. |
| modelId | Case-insensitive substring filter applied to model identifiers. |
| query | Free-text search across id, name, description, and metadata. |
Response:
{
"models": [
{
"provider": "openai",
"id": "gpt-4o",
"name": "GPT-4 Omni",
"description": "General purpose assistant",
"context_window": 128000,
"max_output_tokens": null,
"metadata": {
"owned_by": "openai"
}
}
],
"message": null,
"configured_providers": ["openai"],
"missing_providers": ["groq", "openrouter", "imagerouter"]
}

If the user has not configured any API keys, the models array is empty and the message field clarifies which providers still need credentials.
GET /models/saved
Return the models that the user has explicitly imported into their workspace.
Response:
[
{
"id": "uuid",
"name": "GPT-4 Preview",
"provider": "openrouter",
"model": "openrouter/gpt-4o-mini",
"isDefault": true,
"description": "Balanced reasoning + speed",
"createdAt": "2024-02-15T10:00:00.000Z",
"updatedAt": "2024-02-18T12:00:00.000Z"
}
]

POST /models/import
Import one or more provider models into the user's workspace.
Request Body:
{
"provider": "openrouter",
"providerId": "uuid",
"sharedSettings": {
"temperature": 0.7,
"systemPrompt": "You are a helpful assistant."
},
"selections": [
{
"id": "openrouter/gpt-4o-mini",
"name": "GPT-4o Mini",
"description": "Fast reasoning tuned for chat"
}
]
}

Response:
{
"success": true,
"data": [
{
"id": "uuid",
"name": "GPT-4o Mini",
"provider": "openrouter",
"model": "openrouter/gpt-4o-mini",
"isDefault": false,
"createdAt": "2024-02-18T12:05:00.000Z",
"updatedAt": "2024-02-18T12:05:00.000Z"
}
]
}

GET /models/{modelId}
Return a single saved model configuration.
DELETE /models/{modelId}
Remove a saved model. The backend will automatically choose a fallback default if the deleted model was marked as default.
POST /models/{modelId}/default
Mark the supplied model as the workspace default. The backend ensures only one default exists at a time and updates cached model metadata accordingly.
Provider configuration endpoints manage API credentials and fetch remote catalog metadata. All routes require authentication and resolve the provider code via the path parameter.
GET /providers
Return every provider configured for the authenticated user. Each record indicates whether credentials are active and whether an API key preview is available.
POST /providers/activate
Request Body:
{
"provider": "openrouter",
"apiKey": "sk-or-xxx",
"baseUrl": "https://openrouter.ai/api",
"metadata": {
"defaultModel": "gpt-4o-mini"
}
}

The backend encrypts the API key, stores a secure hash preview, and upserts the provider record.
PATCH /providers/{provider}
Supply any subset of fields (apiKey, baseUrl, metadata, displayName, isActive). Passing apiKey: null removes stored credentials.
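Building such a partial-update body requires distinguishing "field not supplied" from "field explicitly set to null". A sketch using a sentinel (the sentinel and helper name are illustrative):

```python
_UNSET = object()  # sentinel: distinguishes "omitted" from "explicitly None"

def provider_patch(api_key=_UNSET, base_url=_UNSET, display_name=_UNSET,
                   is_active=_UNSET, metadata=_UNSET) -> dict:
    """Include only fields the caller supplied; None is a valid apiKey value
    (it clears stored credentials, per the PATCH semantics above)."""
    fields = {
        "apiKey": api_key,
        "baseUrl": base_url,
        "displayName": display_name,
        "isActive": is_active,
        "metadata": metadata,
    }
    return {k: v for k, v in fields.items() if v is not _UNSET}
```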
DELETE /providers/{provider}
Marks the provider inactive and erases stored API key material.
GET /providers/{provider}/models
Fetch the catalog of available models from a specific provider. The response format mirrors GET /models so the UI can treat both endpoints uniformly.
Response:
{
"models": [
{
"provider": "groq",
"id": "llama-guard",
"name": "LLaMA Guard",
"metadata": {
"capabilities": ["moderation"]
}
}
],
"message": null,
"configured_providers": ["groq"],
"missing_providers": []
}

To exercise the NestJS backend from a physical phone or tablet:
- Ensure both devices share a network. Connect the development machine running `npm run start:dev` and the mobile device to the same Wi‑Fi/LAN.
- Expose the backend on the LAN. The NestJS server already listens on `0.0.0.0`; confirm it is reachable by visiting `http://<your-computer-ip>:8000/api/docs` from another device.
- Run Flutter with the LAN URL. When launching the app, override the backend base URL so API calls target your desktop instead of `localhost`:

      flutter run \
        --dart-define=BACKEND_BASE_URL=http://<your-computer-ip>:8000 \
        --dart-define=BACKEND_API_SUFFIX=v1

  Add `--dart-define=FALLBACK_BACKEND_URL=<optional-backup-url>` if you want to supply a secondary server. Replace `<your-computer-ip>` with the IPv4 address shown by `ipconfig` (Windows) or `ifconfig`/`ip addr` (macOS/Linux). The legacy `POCKETLLM_BACKEND_URL` flag still works and can be used if you prefer providing the full URL (for example `http://<your-computer-ip>:8000/v1`).
- Verify authentication calls. After the app starts, the Sign In and Sign Up flows will send requests to the configured backend URL. Monitor the NestJS console logs to confirm the `/v1/auth/signup` and `/v1/auth/signin` handlers execute when you tap the respective buttons in the mobile UI.
If the device cannot reach the backend, double-check VPNs or firewalls, and ensure the mobile network does not isolate clients (some public hotspots do).
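The URL the app ultimately calls is the base URL joined with the API suffix. A small sketch of that composition, using the dart-define names above (helper name is illustrative):

```python
def backend_url(host: str, port: int = 8000, suffix: str = "v1") -> str:
    """Compose the base URL the Flutter app will call, mirroring
    BACKEND_BASE_URL + BACKEND_API_SUFFIX."""
    return f"http://{host}:{port}/{suffix.strip('/')}"
```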
POST /prompt-enhancer/improve
Enhances a user-supplied prompt using Groq's openai/gpt-oss-120b model with task-specific instructions. Requests require authentication and are rate limited (10 per minute per user).
Request Body:
{
"prompt": "Draw a dragon in the mountains",
"task": "image_generation",
"session_id": "chat-session-123"
}

Response:
{
"task": "image_generation",
"enhanced_prompt": "Cinematic illustration of a crimson dragon circling snow-capped alpine peaks, dramatic golden-hour lighting, ultra-wide 16:9 aspect ratio, volumetric mist in the valley, painted in the style of Studio Ghibli and Ruan Jia.",
"guidance": "Emphasize lighting, include foreground terrain, reference painterly texture.",
"raw_response": "{...full model JSON...}"
}

GET /agents/list
Returns all registered LangChain/LangGraph agents along with any custom components discovered in docs/langchain.
Response:
{
"agents": [
{
"name": "prompt_enhancer",
"description": "Enhances user prompts according to the requested task category.",
"capabilities": ["prompt_optimization", "task_routing", "contextual_memory"]
},
{
"name": "workflow",
"description": "Multi-step coordinator that enhances prompts and dispatches to specialist agents.",
"capabilities": ["prompt_enhancement", "task_routing", "multi_agent_execution"]
}
],
"discovered_components": {
"agents": ["Tool calling agents", "Structured output chains"],
"tools": ["Python REPL tool", "Search tool"]
}
}

POST /agents/run
Executes a registered agent. The payload accepts optional metadata and session identifiers to persist memory across calls.
Request Body:
{
"agent": "workflow",
"prompt": "Summarize yesterday's meeting notes",
"task": "summarization",
"session_id": "chat-session-123",
"metadata": {
"audience": "product_team"
}
}

ℹ️ Python execution is disabled. Any `metadata.tests` values are ignored to prevent untrusted code from running inside the API environment.
Response:
{
"agent": "workflow",
"output": "Knowledge-grounded response: ...",
"data": {
"task": "summarization",
"enhanced_prompt": "...",
"result": "...",
"metadata": {
"enhancement": {"enhanced_prompt": "..."},
"execution": {"sources": ["..."]}
}
}
}

The API uses standard HTTP status codes:
- 200: Success
- 201: Created
- 400: Bad Request (validation errors)
- 401: Unauthorized (missing/invalid token)
- 403: Forbidden
- 404: Not Found
- 500: Internal Server Error
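Status codes combine with the response envelope documented earlier. A hedged sketch of client-side error handling (the exception type is illustrative, not part of any SDK):

```python
class ApiError(Exception):
    """Raised when the envelope reports failure or the status is non-2xx."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status, self.message = status, message

def unwrap(status: int, body: dict):
    """Return body['data'] on success; raise ApiError with the envelope's message."""
    if 200 <= status < 300 and body.get("success"):
        return body["data"]
    message = (body.get("error") or {}).get("message", "Unknown error")
    raise ApiError(status, message)
```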
- All API communication should use HTTPS in production
- Authentication tokens expire after 1 hour
- API keys are encrypted before storage
- Rate limiting is implemented to prevent abuse
- Input validation is performed on all endpoints
For testing the API, you can use:
- Postman: Import the collection from POSTMAN_API_GUIDE.md
- curl: Command-line HTTP client
- Swagger UI: available at https://pocket-llm-api.vercel.app/docs (legacy path /api/docs) when the server is running