GitHub - mehikmat/ollama-rag-chatbot: Streamlit and Ollama based Agentic chatbot using Langchain

Building a Local RAG-Based Chatbot Using

ChromaDB as a vector database
LangChain4j or LangChain (Python) for RAG orchestration in the backend
Streamlit for an interactive chatbot UI
Ollama for serving models locally
Llama 3.2 as the inference model inside Ollama
LangChain Community for document parsing

Ollama

Download and install Ollama from https://ollama.com/download/Ollama-darwin.zip
ollama pull llama3.2 # for inferencing
ollama pull mxbai-embed-large # for embedding generation
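
After pulling the models, it can help to confirm that the local Ollama server actually has them. The sketch below is a minimal check, not part of this repository: it assumes Ollama is listening on its default local address (`http://localhost:11434`) and reads the model list from the `/api/tags` endpoint. The helper names `missing_models` and `fetch_tags` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"
REQUIRED_MODELS = ("llama3.2", "mxbai-embed-large")


def missing_models(tags_payload, required=REQUIRED_MODELS):
    """Return the required models that are absent from an /api/tags payload."""
    # Model names look like "llama3.2:latest"; compare on the part before the tag.
    installed = {m["name"].split(":")[0] for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Fetch the list of locally available models from the Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        print("Missing models:", missing_models(fetch_tags()))
    except (OSError, ValueError) as exc:
        # Connection errors and timeouts are OSError subclasses.
        print("Could not query Ollama:", exc)
```

An empty list means both models are available; any name printed still needs an `ollama pull`.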

RAG

- Retrieval Phase: The AI first searches for relevant data in an external knowledge source such as ChromaDB, the web, etc.
- Augmentation Phase: The retrieved information is injected into the AI's context along with the user's question before a response is generated.
- Generation Phase: The AI model uses both its pre-trained knowledge and the retrieved data to generate a context-aware response.
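
The three phases can be sketched in a few lines of plain Python. This is an illustration only, not the code in this repository: retrieval here uses a toy word-overlap score instead of embedding similarity against ChromaDB, and `llm` is a placeholder callable standing in for a call to Llama 3.2 through Ollama. All function names are hypothetical.

```python
def retrieve(question, documents, k=2):
    """Retrieval: rank documents by how many words they share with the question.

    Toy scoring for illustration; a real pipeline would compare embeddings.
    """
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def augment(question, context_docs):
    """Augmentation: inject the retrieved text into the prompt with the question."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def generate(prompt, llm):
    """Generation: hand the augmented prompt to the model (placeholder callable)."""
    return llm(prompt)


docs = [
    "ChromaDB stores embeddings",
    "Streamlit renders the chat UI",
    "Ollama serves models locally",
]
question = "what stores embeddings"
prompt = augment(question, retrieve(question, docs, k=1))
answer = generate(prompt, llm=lambda p: f"(model reply to {len(p)} chars of prompt)")
```

Swapping the lambda for a real model call and `retrieve` for a vector search keeps the same three-step shape.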

LangChain

Used to orchestrate the RAG pipeline.

LangChain Text Splitters

CharacterTextSplitter:
- It splits the text on a single separator (by default a blank line, "\n\n") and merges the pieces into chunks of up to the given chunk size.
- It does not fall back to finer natural breaks such as sentences or words, so a piece that contains no separator can exceed the chunk size.
- Use it if your text is loosely related words with no semantic linking across lines, such as web page text.
RecursiveCharacterTextSplitter:
- It splits text into chunks based on the given chunk size, but it also considers natural breaks (paragraphs, then lines, then words), not just the chunk size.
- So the actual chunk size may differ from the given chunk size.
- Use it if the document has well-defined semantic linking across lines, such as documents with long paragraphs.
ParagraphTextSplitter:
- Use it if the document has well-defined paragraphs and the desired chunk size equals the paragraph size.
RegexTextSplitter:
- This splitter allows you to split text based on a regular expression pattern.
SentenceTextSplitter:
- This splitter splits text by sentences, preserving the structure of the document.
- It's useful when you want to ensure that chunks do not break sentences or disrupt the flow of the text.
LineTextSplitter:
- This splitter splits text based on lines, making it useful when you're processing text where each line should be preserved (e.g., CSV or log files).
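
To make the splitting strategies concrete, here is a plain-Python sketch of the ideas. It is not LangChain's implementation and uses none of its APIs: `recursive_split` and `regex_split` are hypothetical helpers, and chunk overlap is left out for brevity. The recursive version tries coarse separators first (paragraphs, lines, words) and only cuts at a fixed character count as a last resort.

```python
import re


def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters,
    preferring coarse natural breaks before finer ones."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut at the character count.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging pieces while they fit
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


def regex_split(text, pattern):
    """Split text wherever the regular expression matches, dropping empty pieces."""
    return [part for part in re.split(pattern, text) if part]


text = "First paragraph here.\n\nSecond paragraph is a bit longer than the first."
chunks = recursive_split(text, chunk_size=40)
sentences = regex_split("One. Two! Three?", r"(?<=[.!?])\s+")  # sentence-style split
lines = "a,b\nc,d".splitlines()                                # line-style split
```

With a chunk size of 40, the first paragraph stays whole and the second is broken at a word boundary rather than mid-word, which is the behavior that distinguishes recursive splitting from a fixed-width cut.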

ChromaDB

Used as the vector database that stores the split documents (chunks) and their embeddings.
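
What a vector database does with those chunks can be shown with a toy in-memory store. This sketch is not the ChromaDB API: `TinyVectorStore` is a hypothetical class, and the two-dimensional vectors are hand-written stand-ins for the embeddings that mxbai-embed-large would produce. It stores each chunk next to its vector and returns the chunks whose vectors are closest to the query by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Toy in-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self._items = []  # list of (text, embedding) pairs

    def add(self, text, embedding):
        self._items.append((text, list(embedding)))

    def query(self, embedding, n_results=1):
        """Return the n_results stored texts most similar to the query vector."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(embedding, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:n_results]]


store = TinyVectorStore()
store.add("chunk about cats", [1.0, 0.0])  # made-up embedding
store.add("chunk about dogs", [0.0, 1.0])  # made-up embedding
nearest = store.query([0.9, 0.1], n_results=1)
```

A real vector database adds persistence and approximate nearest-neighbor indexing on top of this basic store-and-rank idea.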

Folders and files

bin
chroma_db
src
.env
.gitignore
README.md
config.toml
requirements.txt
