A hands-on project exploring modern LangChain patterns using LCEL (LangChain Expression Language) for building composable chains and LangGraph for intelligent routing decisions.
- Pipe Operator (`|`): Chain components together explicitly: `prompt | llm | parser`
- Composability: Build complex workflows from simple, reusable pieces
- Transparency: Every step is visible and debuggable (no "black box" magic)
- Dictionary Mapping: Use `itemgetter` and dict literals for explicit data flow
- State Management: Define a `TypedDict` as shared memory between nodes
- Conditional Routing: Route user queries to specialized chains based on intent
- Node Functions: Pure functions that receive state and return updates
- Dependency Injection: Use `functools.partial` for clean, testable nodes
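The last two ideas can be sketched together in a few lines. This is a minimal, dependency-free illustration (the `answer_node` function and the fake LLM are hypothetical stand-ins, not code from this repo):

```python
from functools import partial
from operator import itemgetter

# Dictionary mapping: itemgetter pulls a named key out of the input dict
get_question = itemgetter("question")

# Dependency injection: the node stays a pure function; the llm is bound once
def answer_node(state: dict, llm) -> dict:
    return {"answer": llm(get_question(state))}

fake_llm = lambda text: f"echo: {text}"  # stand-in for a real LLM client
node = partial(answer_node, llm=fake_llm)

print(node({"question": "hi"}))  # {'answer': 'echo: hi'}
```

Because the LLM is injected rather than imported inside the node, tests can pass a trivial fake and assert on the node's output alone.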
```
├── chain.py           # LCEL chain composition and execution
├── chatbot.py         # Memory-based chatbot with session management
├── chunking.py        # Document text splitting
├── clients.py         # LLM client factories (Gemini, Groq, Perplexity)
├── document_reader.py # PDF and TXT document loaders
├── embeddings.py      # ChromaDB vector store with hash-based deduplication
├── lang_graph.py      # LangGraph routing implementation
├── main.py            # RAG pipeline execution
├── main_graph.py      # LangGraph execution demo
├── model.py           # Pydantic models and State definition
├── prompts.py         # All prompt templates
├── utils.py           # Helper functions (API keys, formatting, hashing)
├── documentos/        # Source documents for RAG
└── db_chroma/         # Persistent vector store
```
```python
# Simple chain composition
chain = prompt | llm | parser

# Complex chain with state accumulation
composed_chain = (
    RunnablePassthrough.assign(city_obj=chain_city)
    | RunnablePassthrough.assign(city=lambda x: x["city_obj"].city)
    | RunnablePassthrough.assign(restaurants=chain_restaurants)
)
```

```python
# Define state as shared memory
class State(TypedDict):
    input: str
    route: str
    answer: str

# Router decides which specialist to call
def choose_node(state: State):
    return state["route"].strip().lower()  # "beach", "city", or "mountain"
```

```python
# Hash-based deduplication saves API credits
ids = [generate_id(chunk.page_content) for chunk in chunks]
existing_ids = set(vector_store.get(ids=ids)["ids"])
new_chunks = [c for i, c in enumerate(chunks) if ids[i] not in existing_ids]
```

- Groq (Llama 3.3 70B) - Fast inference
- Google Gemini (Gemma 3 27B) - Balanced performance
- Perplexity (Sonar) - Web-augmented responses
- Python 3.10+
- API keys for at least one LLM provider
```shell
# Clone the repository
git clone <repo-url>
cd <repo-name>

# Create virtual environment
python -m venv venv
source venv/Scripts/activate   # Windows
# source venv/bin/activate     # Linux/Mac

# Install dependencies
pip install -r requirements.txt
```

Create a `.env` file based on `.env.example`:

```
GEMINI_API_KEY=your_gemini_key
GROQ_API_KEY=your_groq_key
PPLX_API_KEY=your_perplexity_key
```

```shell
# Run LangGraph routing demo
python main_graph.py

# Run RAG pipeline
python main.py
```

Input Dict → itemgetter("key") → retriever → format_docs → prompt → llm → parser → Output
The pipe operator passes the output of each step as the input to the next. When using dictionary literals, each key's value is computed independently and merged into a single dict.
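That merge behavior can be sketched with plain functions (hypothetical stand-ins for Runnables; in real LCEL, `RunnableParallel` handles this for you):

```python
from operator import itemgetter

# Each key's value is computed independently from the same input, then the
# results are merged into a single dict — what a dict literal does in LCEL
def run_mapping(mapping: dict, inputs: dict) -> dict:
    return {key: fn(inputs) for key, fn in mapping.items()}

mapping = {
    "question": itemgetter("question"),
    "context": lambda x: x["question"].upper(),  # stand-in for a retriever
}

print(run_mapping(mapping, {"question": "what is lcel?"}))
# {'question': 'what is lcel?', 'context': 'WHAT IS LCEL?'}
```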
START → router_node → [writes route to State] → choose_node → [reads route] → specialist_node → END
The State acts as shared memory. Each node reads from it, processes, and writes back. The router sets the route key, and conditional edges read it to decide the next node.
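A dependency-free sketch of that loop (toy router logic and destinations; the real project uses LangGraph's `StateGraph` with conditional edges):

```python
from typing import TypedDict

class State(TypedDict, total=False):
    input: str
    route: str
    answer: str

def router_node(state: State) -> State:
    # Writes the route key to shared state (toy intent detection)
    route = "beach" if "beach" in state["input"].lower() else "city"
    return {**state, "route": route}

def beach_node(state: State) -> State:
    return {**state, "answer": "Try Copacabana!"}

def city_node(state: State) -> State:
    return {**state, "answer": "Try Lisbon!"}

def choose_node(state: State) -> str:
    # Reads the route key to pick the next node
    return state["route"].strip().lower()

edges = {"beach": beach_node, "city": city_node}

def run(state: State) -> State:
    state = router_node(state)              # router writes route
    specialist = edges[choose_node(state)]  # conditional edge reads it
    return specialist(state)                # specialist writes answer

print(run({"input": "I want a beach holiday"})["answer"])
```

Each node receives the full state and returns an updated copy, so the data flow stays explicit and every hop is inspectable.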
| Category | Technology |
|---|---|
| Framework | LangChain, LangGraph |
| LLMs | Groq, Google Gemini, Perplexity |
| Vector Store | ChromaDB |
| Embeddings | Google Generative AI Embeddings |
| Validation | Pydantic |
Free to use