Skip to content

anurag965/OdiaGenAI_JanaSathi

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🗣️ JanaSathi: Odia E-Governance Chatbot

JanaSathi is a bilingual AI chatbot that helps users understand and access Odisha Government schemes. It uses a Retrieval-Augmented Generation (RAG) architecture to retrieve relevant content from official documents and generate responses in English and Odia.


🤖 Features

  • ✅ Answers queries about government schemes like KALIA Yojana, Mission Shakti, and Biju Swasthya Kalyan Yojana
  • 📄 Reads and processes official government PDF documents
  • 🌐 Uses Large Language Models via OpenRouter API
  • 🔍 Retrieves relevant information using semantic search (MiniLM)
  • 🧠 Generates responses in English and translates to Odia
  • 🖥️ Streamlit frontend for interactive usage

🛠️ Tech Stack

  • Python 3.10+
  • Streamlit
  • Sentence-Transformers (all-MiniLM-L6-v2)
  • OpenAI-compatible LLMs via OpenRouter
  • Cohere models for Odia translation
  • PyPDF2 for PDF parsing
  • scikit-learn for cosine similarity

🚀 Getting Started

1. Clone the repository

git clone https://github.com/your-username/odia-e-gov-chatbot.git
cd odia-e-gov-chatbot

2. Install dependencies

pip install -r requirements.txt

3. Add government scheme PDFs

Place your scheme-related PDFs in the project root directory.

4. Run the chatbot

streamlit run streamlit_app.py

💡 How It Works

  • PDF documents are processed and chunked into smaller sections.
  • Embeddings are generated using all-MiniLM-L6-v2.
  • The chatbot retrieves top relevant chunks using cosine similarity.
  • Uses a 49B LLM to generate responses in English.
  • Translates the responses into Odia using Cohere’s LLM.
  • Presents both responses through a clean Streamlit interface.

📷 Demo

Streamlit UI


📌 To Do

  • Improve document chunking for regional formatting
  • Add document upload feature in the UI
  • Support voice-based queries and responses
  • Log chat history for audit and learning

🙋‍♂️ Author

Anurag Pradhan
📧 [email protected]
🔗 LinkedInGitHub


📄 License

This project is licensed under the MIT License.

About

Built a bilingual Retrieval-Augmented Generation (RAG) chatbot for Odisha government schemes using OpenRouter-hosted LLMs. Embedded chunked PDF data using all-MiniLM-L6 v2, performed cosine similarity-based retrieval, and integrated a 49B instruction-tuned model (thedrummer/valkyrie-49b-v1) for English and cohere/command-r-plus for odia responses

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages