Up-to-date documentation for LLMs and AI code editors
Contexto is a documentation aggregation platform inspired by Context7 from Upstash. It helps developers provide fresh, accurate documentation to AI assistants like Claude and GPT, and to AI code editors like Cursor and Windsurf.
- 🕷️ Intelligent Crawler: Automatically crawl entire documentation sites with GitHub integration
- 📚 Documentation Aggregation: Index documentation from any library or framework
- 🔍 Semantic Search: Vector-based search using OpenAI embeddings
- 🤖 LLM Enrichment: Automatically enhance documentation with explanations
- ⚡ Lightning Fast: Redis caching for optimal performance
- 📄 llms.txt Generation: Export documentation in LLM-friendly format
- 🗺️ Sitemap Support: Automatic sitemap.xml detection and parsing for efficient crawling
- ⏱️ Rate Limiting: Built-in rate limiting to respect server resources (1 req/sec)
- 📊 Analytics Dashboard: Real-time statistics and activity tracking
- 🔌 MCP Server: Model Context Protocol for AI editor integration
- 📋 Job Queue: Real-time progress tracking for crawl operations
- 🎨 Modern UI: Clean, responsive interface built with Next.js 15 and Tailwind CSS
- 🐳 Docker Ready: Complete Docker setup for easy deployment
Contexto uses a 5-stage processing pipeline:
- Parse: Extract code snippets and examples from documentation
- Enrich: Add short explanations and metadata using LLMs
- Vectorize: Embed content for semantic search
- Rerank: Score results for relevance using a custom algorithm
- Cache: Serve repeated requests from Redis for best performance
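The five stages above can be sketched as a composed pipeline. This is a minimal illustration only: the function names and the `Chunk` shape are assumptions, not Contexto's actual internals, and the cache stage is omitted since it lives outside the per-document flow.

```typescript
// Illustrative 5-stage pipeline sketch; names and shapes are assumptions.
interface Chunk {
  text: string;
  code: string[];        // extracted code snippets (parse stage)
  explanation?: string;  // added by the enrich stage (LLM in the real system)
  embedding?: number[];  // added by the vectorize stage (OpenAI in the real system)
  score?: number;        // added by the rerank stage
}

// 1. Parse: pull fenced code blocks out of raw markdown.
function parse(markdown: string): Chunk {
  const code = [...markdown.matchAll(/```[\s\S]*?```/g)].map((m) => m[0]);
  return { text: markdown.replace(/```[\s\S]*?```/g, "").trim(), code };
}

// 2. Enrich: stand-in for an LLM-written explanation.
function enrich(chunk: Chunk): Chunk {
  return { ...chunk, explanation: `Contains ${chunk.code.length} code snippet(s).` };
}

// 3. Vectorize: stand-in for an OpenAI embedding call.
function vectorize(chunk: Chunk): Chunk {
  return { ...chunk, embedding: [chunk.text.length, chunk.code.length] };
}

// 4. Rerank: toy relevance score (the real system uses a custom algorithm).
function rerank(chunk: Chunk, query: string): Chunk {
  const hits = query.split(/\s+/).filter((w) => chunk.text.includes(w)).length;
  return { ...chunk, score: hits };
}

const doc =
  "useState returns a stateful value.\n```js\nconst [n, setN] = useState(0);\n```";
const result = rerank(vectorize(enrich(parse(doc))), "useState value");
console.log(result.code.length, result.score); // 1 2
```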
- Frontend: Next.js 15 (App Router), React 19, TypeScript, Tailwind CSS
- Backend: Next.js API Routes
- Database: Upstash Redis (caching), Upstash Vector (embeddings)
- AI: OpenAI API (embeddings + enrichment)
- Parsing: Cheerio, Markdown-it
- Icons: Lucide React
- Node.js 18+ and npm
- Upstash Redis database
- Upstash Vector database
- OpenAI API key
- Clone the repository:
```bash
git clone <repository-url>
cd contexto
```
- Install dependencies:
```bash
npm install
```
- Set up environment variables:
```bash
cp .env.example .env
```
Edit `.env` and add your credentials:
```env
UPSTASH_REDIS_REST_URL=your_redis_url
UPSTASH_REDIS_REST_TOKEN=your_redis_token
UPSTASH_VECTOR_REST_URL=your_vector_url
UPSTASH_VECTOR_REST_TOKEN=your_vector_token
OPENAI_API_KEY=your_openai_api_key
```
- Run the development server:
```bash
npm run dev
```
- Open http://localhost:3000 in your browser
- Clone the repository and navigate to the directory
- Create a `.env` file with your credentials:

```bash
cp .env.example .env
# Edit .env with your credentials
```

- Build and run with Docker Compose:

```bash
docker-compose up -d
```

- Access the application at http://localhost:3000
See DEPLOYMENT.md for detailed deployment options.
- Click "Add Library" in the navigation
- Fill in the library details:
- Name (e.g., "React")
- Description
- Version
- Documentation URL
- Category
- Click "Add Library" to create the library
After adding a library, you have two indexing options:
The crawler automatically discovers and indexes all documentation pages:
- Navigate to your library's detail page
- Click "Crawl Site"
- Configure crawler options:
  - Max Pages: Maximum number of pages to crawl (default: 100)
  - Max Depth: How deep to follow links (default: 5)
  - Require Code: Only index pages with code snippets
  - Follow External Links: Whether to crawl external domains
  - Include/Exclude Patterns: Filter URLs (e.g., `/docs/`, `/api/`)
- Click "Start Crawling" and monitor the progress
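The options above amount to a small configuration object. A hypothetical sketch (field names are assumptions, not Contexto's actual API) showing how include/exclude patterns might gate which URLs get crawled:

```typescript
// Hypothetical crawler options mirroring the UI fields above.
// Field names are illustrative assumptions, not Contexto's actual API.
interface CrawlerOptions {
  maxPages: number;
  maxDepth: number;
  requireCode: boolean;
  followExternalLinks: boolean;
  includePatterns: string[]; // a URL must contain one of these (when any are set)
  excludePatterns: string[]; // a URL containing any of these is skipped
}

const defaults: CrawlerOptions = {
  maxPages: 100,
  maxDepth: 5,
  requireCode: false,
  followExternalLinks: false,
  includePatterns: ["/docs/"],
  excludePatterns: ["/api/legacy/"],
};

function shouldCrawl(url: string, opts: CrawlerOptions): boolean {
  if (opts.excludePatterns.some((p) => url.includes(p))) return false;
  if (opts.includePatterns.length === 0) return true;
  return opts.includePatterns.some((p) => url.includes(p));
}

console.log(shouldCrawl("https://nextjs.org/docs/routing", defaults)); // true
console.log(shouldCrawl("https://nextjs.org/blog/launch", defaults));  // false
```

Exclude patterns win over include patterns here, which is the conservative choice when the two overlap.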
Supported File Types:
- `.md`, `.mdx`: Markdown files
- `.html`, `.htm`: HTML documentation
- `.rst`: reStructuredText
- `.ipynb`: Jupyter notebooks
GitHub Integration:
- Automatically detects GitHub repository URLs
- Uses GitHub API for faster and more reliable crawling
- Supports branch and path selection
- Ideal for open-source project documentation
Example URLs:
- `https://nextjs.org/docs`: Regular site crawl
- `https://github.com/facebook/react/tree/main/docs`: GitHub repo crawl
- `https://docs.python.org/3/`: Python documentation
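GitHub detection boils down to parsing the repository URL into owner, repo, branch, and path. A minimal sketch of that parsing (not Contexto's actual implementation):

```typescript
// Minimal GitHub tree-URL parser; illustrative only.
interface GitHubTarget {
  owner: string;
  repo: string;
  branch: string; // assume "main" when the URL has no /tree/<branch> segment
  path: string;
}

function parseGitHubUrl(url: string): GitHubTarget | null {
  const u = new URL(url);
  if (u.hostname !== "github.com") return null; // not a GitHub URL
  const parts = u.pathname.split("/").filter(Boolean);
  if (parts.length < 2) return null; // need at least owner/repo
  const [owner, repo, tree, branch, ...rest] = parts;
  if (tree === "tree" && branch) {
    return { owner, repo, branch, path: rest.join("/") };
  }
  return { owner, repo, branch: "main", path: "" };
}

const t = parseGitHubUrl("https://github.com/facebook/react/tree/main/docs");
console.log(t); // owner "facebook", repo "react", branch "main", path "docs"
```

A non-GitHub URL like `https://nextjs.org/docs` returns `null`, which is where a crawler would fall back to regular site crawling.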
For single-page documentation or quick updates:
- Click "Quick Reindex" on the library detail page
- The system will re-process only the main documentation URL
Monitor your platform's performance:
- Navigate to the Dashboard page
- View statistics:
- Total libraries and documentation chunks
- Search analytics
- Crawl job success/failure rates
- Library-specific statistics
- Recent activity feed
- Use the search bar on the homepage or navigate to the Search page
- Enter your query (e.g., "useState hook", "routing in Next.js")
- View results with enriched explanations and code examples
- Copy code snippets directly to your editor
Each library can be exported as an `llms.txt` file:
- Visit a library detail page
- Click "Download llms.txt"
- Paste the content into your AI editor's context
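The llms.txt convention is a plain-text index an LLM can ingest: an H1 title, a blockquote summary, and sections of annotated links. A sketch of what the export might assemble (the exact layout Contexto emits is an assumption):

```typescript
// Illustrative llms.txt assembly following the llmstxt.org convention;
// the exact format Contexto emits may differ.
interface DocEntry {
  title: string;
  url: string;
  summary: string;
}

function buildLlmsTxt(library: string, description: string, docs: DocEntry[]): string {
  const lines = [
    `# ${library}`,
    ``,
    `> ${description}`,
    ``,
    `## Docs`,
    ...docs.map((d) => `- [${d.title}](${d.url}): ${d.summary}`),
  ];
  return lines.join("\n");
}

const txt = buildLlmsTxt("React", "A library for web and native user interfaces", [
  { title: "useState", url: "https://react.dev/reference/react/useState", summary: "State hook" },
]);
console.log(txt.split("\n")[0]); // "# React"
```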
Contexto includes an MCP (Model Context Protocol) server for integration with AI editors like Cursor.
```bash
npx tsx mcp-server.ts
```
- `search_documentation`: Search across all libraries
- `get_library`: Get details about a specific library
- `list_libraries`: List all available libraries
Add to your Cursor configuration:
```json
{
  "mcpServers": {
    "contexto": {
      "command": "npx",
      "args": ["tsx", "/path/to/contexto/mcp-server.ts"]
    }
  }
}
```
The crawler automatically detects and parses sitemap.xml files for efficient URL discovery:
- Checks multiple sitemap locations (`/sitemap.xml`, `/sitemap_index.xml`, etc.)
- Parses sitemap indexes recursively
- Falls back to regular crawling if no sitemap found
- Respects sitemap priorities and update frequencies
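Sitemap discovery typically means probing a few well-known paths and then extracting `<loc>` entries, recursing when the file is a sitemap index. A minimal sketch (the candidate path list and the regex-based parsing are simplifying assumptions):

```typescript
// Illustrative sitemap discovery helpers; not Contexto's actual implementation.
const SITEMAP_CANDIDATES = ["/sitemap.xml", "/sitemap_index.xml", "/sitemap-index.xml"];

// Build the list of well-known sitemap URLs to probe for a given site.
function sitemapCandidates(origin: string): string[] {
  return SITEMAP_CANDIDATES.map((p) => new URL(p, origin).toString());
}

// Extract <loc> entries from sitemap XML. The same helper works for both
// <urlset> files (page URLs) and <sitemapindex> files (child sitemap URLs),
// so recursion is just: if the document is an index, call this again on each child.
function extractLocs(xml: string): string[] {
  return [...xml.matchAll(/<loc>([^<]+)<\/loc>/g)].map((m) => m[1].trim());
}

console.log(sitemapCandidates("https://docs.python.org")[0]);
// "https://docs.python.org/sitemap.xml"
console.log(extractLocs("<urlset><url><loc>https://a.dev/docs</loc></url></urlset>"));
// ["https://a.dev/docs"]
```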
Built-in rate limiting ensures respectful crawling:
- 1 second delay between requests (default)
- Prevents server overload
- Configurable per crawler instance
- Applies to both web and sitemap-based crawling
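A fixed delay between requests is the simplest form of rate limiting. A sketch with a configurable per-instance interval (Contexto's internals may differ):

```typescript
// Minimal per-instance rate limiter: enforces a fixed delay between calls.
// Illustrative only; Contexto's crawler may implement this differently.
class RateLimiter {
  private last = 0;
  constructor(private readonly intervalMs: number = 1000) {} // default: 1 req/sec

  // Resolves once at least `intervalMs` has elapsed since the previous call.
  async wait(): Promise<void> {
    const remaining = this.last + this.intervalMs - Date.now();
    if (remaining > 0) await new Promise((r) => setTimeout(r, remaining));
    this.last = Date.now();
  }
}

// Usage: await the limiter before each page fetch.
async function crawl(urls: string[], fetchPage: (u: string) => Promise<string>) {
  const limiter = new RateLimiter(1000);
  for (const url of urls) {
    await limiter.wait();
    await fetchPage(url);
  }
}
```

Because the limiter is constructed per crawler run, two concurrent crawls each get their own 1 req/sec budget rather than sharing one.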
Comprehensive analytics tracking:
- Search query tracking
- Library statistics
- Crawl job monitoring
- Activity feed
- Real-time dashboard updates
```
contexto/
├── app/                     # Next.js app directory
│   ├── api/                 # API routes
│   │   ├── analytics/       # Analytics endpoint
│   │   ├── jobs/            # Job status endpoints
│   │   ├── libraries/       # Library CRUD operations
│   │   ├── search/          # Search endpoint
│   │   └── llmstxt/         # llms.txt generation
│   ├── dashboard/           # Analytics dashboard
│   ├── libraries/           # Library pages
│   ├── search/              # Search page
│   ├── add/                 # Add library page
│   ├── layout.tsx           # Root layout
│   ├── page.tsx             # Homepage
│   └── globals.css          # Global styles
├── components/              # React components
│   ├── Header.tsx
│   ├── CrawlerConfig.tsx    # Crawler configuration UI
│   ├── JobProgress.tsx      # Job progress tracking
│   ├── ...
│   ├── LibraryCard.tsx
│   └── SearchBar.tsx
├── lib/                     # Utility libraries
│   ├── analytics.ts         # Analytics service
│   ├── crawler.ts           # Web crawler
│   ├── github-crawler.ts    # GitHub API crawler
│   ├── job-queue.ts         # Job queue management
│   ├── sitemap.ts           # Sitemap parser
│   ├── redis.ts             # Redis client
│   ├── vector.ts            # Vector store client
│   ├── openai.ts            # OpenAI client
│   └── parser.ts            # Documentation parser
├── types/                   # TypeScript types
│   ├── index.ts
│   ├── crawler.ts
│   └── analytics.ts
├── examples/                # Example data
│   └── example-libraries.json
├── mcp-server.ts            # MCP server
├── Dockerfile               # Docker configuration
├── docker-compose.yml       # Docker Compose setup
├── .dockerignore
├── DEPLOYMENT.md            # Deployment guide
├── package.json
├── tsconfig.json
├── tailwind.config.ts
└── README.md
```
- `npm run dev`: Start development server
- `npm run build`: Build for production
- `npm run start`: Start production server
- `npm run lint`: Run ESLint
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - feel free to use this project for personal or commercial purposes.
For questions or issues, please open an issue on GitHub.
Built with ❤️ for the AI development community