DEV Community: Vishnu Sivan The latest articles on DEV Community by Vishnu Sivan (@codemaker2015). https://dev.to/codemaker2015 https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F846070%2F2bde4dc9-8fd1-4357-906a-c09d9ae3787a.jpeg DEV Community: Vishnu Sivan https://dev.to/codemaker2015 en Introducing uv: Next-Gen Python Package Manager Vishnu Sivan Mon, 16 Dec 2024 16:08:23 +0000 https://dev.to/codemaker2015/introducing-uv-next-gen-python-package-manager-4jbc https://dev.to/codemaker2015/introducing-uv-next-gen-python-package-manager-4jbc <p>Python's evolution has been closely tied to advancements in package management, from manual installations to tools like pip and poetry. However, as projects grow in complexity, traditional tools often fall short in speed and efficiency.</p> <p>uv, a cutting-edge Python package and project manager written in Rust, aims to change that. Combining the functionality of tools like pip, poetry, and virtualenv, uv streamlines tasks like dependency management, script execution, and project building, all with exceptional performance. It is also seamlessly compatible with pip commands, so there is no additional learning curve.</p> <p>In this tutorial, we will explore how to install uv and make the most of its features, from setting up a project and managing dependencies to running scripts and leveraging its enhanced pip interface.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>pip limitations</li> <li>What is uv</li> <li>Key features of uv</li> <li>Benchmarks</li> <li>Installation</li> <li>Creating virtual environments</li> <li>Building a Flask app using uv</li> <li>Installing Python with uv</li> <li>Tools</li> <li>Cheatsheet</li> <li>Current Limitations</li> </ul> <h3> pip limitations </h3> <p>Pip is a widely used package management system written in Python, designed to install and manage software packages. 
However, despite its popularity, it is often criticized for being one of the slowest package management tools for Python. Complaints about “pip install being slow” are so common that they frequently appear in developer forums and threads.</p> <p>One significant drawback of pip is its susceptibility to dependency smells, which occur when dependency configuration files are poorly written or maintained. These issues can lead to serious consequences, such as increased complexity and reduced maintainability of projects.</p> <p>Another limitation of pip is its inability to consistently match Python code accurately when restoring runtime environments. This mismatch can result in a low success rate for dependency inference, making it challenging to reliably recreate project environments.</p> <h3> What is uv </h3> <p>uv is a modern, high-performance Python package manager, developed by the creators of ruff and written in Rust. Designed as a drop-in replacement for pip and pip-tools, it delivers exceptional speed and compatibility with existing tools.</p> <p>Key features include support for editable installs, Git and URL dependencies, constraint files, custom indexes, and more. uv’s standards-compliant virtual environments work seamlessly with other tools, avoiding lock-in or customization. 
It is cross-platform, supporting Linux, Windows, and macOS, and has been tested extensively against the PyPI index.</p> <p>Focusing on simplicity, speed, and reliability, uv addresses common developer pain points like slow installations, version conflicts, and complex dependency management, offering an intuitive solution for modern Python development.</p> <h4> Key features of uv </h4> <ul> <li> <strong>⚖️ Drop-in Replacement</strong>: Seamlessly replaces pip, pip-tools, virtualenv, and other tools with full compatibility.</li> <li> <strong>⚡ Blazing Speed</strong>: 10–100x faster than traditional tools like pip, pip-compile, and pip-sync.</li> <li> <strong>💾 Disk-Space Efficient</strong>: Utilizes a global cache for dependency deduplication, saving storage.</li> <li> <strong>🐍 Flexible Installation</strong>: Installable via curl, pip, or pipx without requiring Rust or Python.</li> <li> <strong>🧪 Thoroughly Tested</strong>: Proven performance at scale with the top 10,000 PyPI packages.</li> <li> <strong>🖥️ Cross-Platform Support</strong>: Fully compatible with macOS, Linux, and Windows.</li> <li> <strong>🔩 Advanced Dependency Management</strong>: Features include dependency version overrides, alternative resolution strategies, and a conflict-tracking resolver.</li> <li> <strong>⁉️ Clear Error Messaging</strong>: Best-in-class error handling ensures developers can resolve conflicts efficiently.</li> <li> <strong>🤝 Modern Python Features</strong>: Supports editable installs, Git dependencies, direct URLs, local dependencies, constraint files, and more.</li> <li> <strong>🚀 Unified Tooling</strong>: Combines the functionality of tools like pip, pipx, poetry, pyenv, twine, and more into a single solution.</li> <li> <strong>🛠️ Application and Script Management</strong>: Installs and manages Python versions, runs scripts with inline dependency metadata, and supports comprehensive project workflows.</li> <li> <strong>🗂️ Universal Lockfile</strong>: Simplifies project 
management with consistent and portable lockfiles.</li> <li> <strong>🏢 Workspace Support</strong>: Handles scalable projects with Cargo-style workspace management.</li> </ul> <h3> Benchmarks </h3> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fge55ay5lf3kkp9s1sj3e.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fge55ay5lf3kkp9s1sj3e.png" alt="benchmarks" width="800" height="135"></a><br> source: <a href="proxy.php?url=https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager" rel="noopener noreferrer">https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager</a><br> Resolving (left) and installing (right) dependencies using a warm cache, simulating the process of recreating a virtual environment or adding a new dependency to an existing project.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focbnw85t7w6cryp22zcx.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focbnw85t7w6cryp22zcx.png" alt="benchmarks" width="800" height="117"></a><br> source: <a href="proxy.php?url=https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager" rel="noopener noreferrer">https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager</a><br> Resolving (left) and installing (right) dependencies with a cold cache simulate execution in a clean environment. 
Without caching, uv is 8–10x faster than pip and pip-tools, and with a warm cache, it achieves speeds 80–115x faster.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F44bv9t4utb8i4c5y6ykn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F44bv9t4utb8i4c5y6ykn.png" alt="benchmarks" width="800" height="115"></a><br> source: <a href="proxy.php?url=https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager" rel="noopener noreferrer">https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager</a><br> Creating a virtual environment with (left) and without (right) seed packages like pip and setuptools. uv is approximately 80x faster than python -m venv and 7x faster than virtualenv, all while operating independently of Python.</p> <h3> Installation </h3> <p>Installing uv is quick and straightforward. You can opt for standalone installers or install it directly from PyPI.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># On macOS and Linux.
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows.
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# With pip.
pip install uv

# With pipx.
pipx install uv

# With Homebrew.
brew install uv

# With Pacman.
pacman -S uv </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifk2oz5geyofxlnzdm1c.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifk2oz5geyofxlnzdm1c.png" alt="Installation" width="800" height="161"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspl8faprxki0p1idsra3.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspl8faprxki0p1idsra3.png" alt="Installation" width="640" height="187"></a></p> <p>Before using uv, add uv's install directory to the PATH environment variable.<br> On Linux and macOS, update PATH with the following command in the terminal:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>export PATH="$HOME/.local/bin:$PATH" </code></pre> </div> <p>On Windows, to add the directory to the Path variable for either the current user or the whole system, search for Environment Variables in the search panel. 
Under User variables / System variables, select the Path variable, click Edit, then click New and add the desired path.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>%USERPROFILE%\.local\bin </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghjbgrbw2k90t4u3n1vl.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghjbgrbw2k90t4u3n1vl.png" alt="env" width="800" height="289"></a><br> After the installation, run the uv command in the terminal to verify that it has been installed correctly.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fan9wqwnjms6zfay017kk.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fan9wqwnjms6zfay017kk.png" alt="Installation" width="800" height="329"></a></p> <h3> Creating virtual environments </h3> <p>Creating a virtual environment with uv is simple and straightforward. Use the following command to create one; by default the environment is created in .venv, and you can pass a custom name instead (for example, uv venv myenv).<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv venv </code></pre> </div> <ul> <li>Run the following command to activate the virtual environment. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># On macOS and Linux.
source .venv/bin/activate

# On Windows.
.venv\Scripts\activate </code></pre> </div> <h3> Installing packages </h3> <p>Installing packages into the virtual environment follows a familiar process. The various installation methods are given below.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv pip install flask                # Install Flask.
uv pip install -r requirements.txt  # Install from a requirements.txt file.
uv pip install -e .                 # Install the current project in editable mode.
uv pip install "package @ ."        # Install the current project from disk.
uv pip install "flask[dotenv]"      # Install Flask with the "dotenv" extra.
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1f0aceyovlioilgq9es.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1f0aceyovlioilgq9es.png" alt="Installation" width="800" height="344"></a></p> <p>To synchronize the locked dependencies with the virtual environment, use the following command:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv pip sync requirements.txt  # Install dependencies from a requirements.txt file.
</code></pre> </div> <p>uv supports a variety of command-line arguments similar to those of existing tools, including -r requirements.txt, -c constraints.txt, -e ., --index-url, and more.</p> <h3> Building a Flask app using uv </h3> <p>Let’s explore some project-related commands with uv. Start by initializing a Python project named “sample-project.”<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv init sample-project </code></pre> </div> <p>Navigate to the sample-project directory. 
uv initializes the project with essential files such as hello.py, pyproject.toml, README.md, and more.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4p6uiqbuprpo3xtzu25.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4p6uiqbuprpo3xtzu25.png" alt="Project structure" width="800" height="349"></a></p> <p>Use the run command to execute the sample Python file. This process first creates the virtual environment folder and then runs the Python file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv run hello.py </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxdzpstj1nkjwsj1upmx.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxdzpstj1nkjwsj1upmx.png" alt="project1" width="800" height="234"></a></p> <h3> Install Flask </h3> <p>Add Flask to your project dependencies.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv add flask </code></pre> </div> <h3> Create the Flask Application </h3> <p>Create a new file named app.py and add the following code.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from flask import Flask

app = Flask(__name__)

@app.route('/', methods=['GET'])
def hello_world():
    return {"message": "Hello, World!"}, 200

if __name__ == '__main__':
    app.run(debug=True) </code></pre> </div> <h3> Run the app </h3> <p>Use the uv run command to execute the application.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv run app.py </code></pre> </div> <p>Open a browser or use a tool like curl or Postman to send a GET request.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2332mt4arfsc7i17c4j.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2332mt4arfsc7i17c4j.png" alt="Output1" width="640" height="443"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w5uc7o8uo5c8hlpbl1h.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w5uc7o8uo5c8hlpbl1h.png" alt="Output2" width="537" height="238"></a></p> <h3> Installing Python with uv </h3> <p>Using uv to install Python is optional, as it works seamlessly with existing Python installations. 
However, if installing Python through uv is preferred, it can be done with a straightforward command:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv python install 3.12 </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flo8me1u4uo4xyc2xc5vg.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flo8me1u4uo4xyc2xc5vg.png" alt="python installation" width="800" height="136"></a></p> <p>This approach is often more convenient and reliable compared to traditional methods, as it avoids the need for managing repositories or downloading installers. Simply execute the command, and the setup is ready to use.</p> <h3> Tools </h3> <p>CLI tools can be installed and used with the uv command. For example, the huggingface_hub tools can be installed to enable pulling and pushing files to Hugging Face repositories.</p> <ul> <li>Use the following command to install huggingface_hub using uv. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv tool install huggingface_hub </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3upnqo2qmk72nmey98k.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3upnqo2qmk72nmey98k.png" alt="tool" width="800" height="431"></a></p> <ul> <li>The following command displays all the installed tools: </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv tool list </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu22ezef181wr9ubuadnb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu22ezef181wr9ubuadnb.png" alt="tool" width="800" height="136"></a></p> <h3> Cheatsheet </h3> <p>Here is a quick cheatsheet for performing common operations with uv:</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpbq7ebbofc6jvfraiqb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpbq7ebbofc6jvfraiqb.png" alt="Cheatsheet" width="800" 
height="749"></a></p> <h3> Current Limitations </h3> <p>Even though uv offers a fast and efficient solution for Python package management, it has some limitations:</p> <ul> <li> <strong>Incomplete pip Compatibility</strong>: Although uv supports a substantial portion of the pip interface, it does not yet cover the entire feature set. Some of these differences are intentional design choices, while others stem from uv still being in its early stages of development. For a detailed comparison, consult the pip compatibility guide.</li> <li> <strong>Platform-Specific requirements.txt</strong>: Like pip-compile, uv generates platform-specific requirements.txt files. This contrasts with tools such as Poetry and PDM, which create platform-agnostic poetry.lock and pdm.lock files. Consequently, uv's requirements.txt files may lack portability across different platforms and Python versions.</li> </ul> <p>Thanks for reading this article !!</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <h3> Resources </h3> <p><a href="proxy.php?url=https://docs.astral.sh/uv" rel="noopener noreferrer">uv - An extremely fast Python package and project manager, written in Rust | docs.astral.sh</a></p> python The ultimate guide to Retrieval-Augmented Generation (RAG) Vishnu Sivan Mon, 16 Dec 2024 09:56:09 +0000 https://dev.to/codemaker2015/the-ultimate-guide-to-retrieval-augmented-generation-rag-5e6e https://dev.to/codemaker2015/the-ultimate-guide-to-retrieval-augmented-generation-rag-5e6e <p>The rapid evolution of generative AI models like OpenAI’s ChatGPT has revolutionized natural language processing, enabling these systems to generate coherent and contextually relevant responses. However, even state-of-the-art models face limitations when tackling domain-specific queries or providing highly accurate information. 
This often leads to challenges like hallucinations — instances where models produce inaccurate or fabricated details.</p> <p>Retrieval-Augmented Generation (RAG) is an innovative framework designed to bridge this gap. By seamlessly integrating external data sources, RAG empowers generative models to retrieve real-time, niche information, significantly enhancing their accuracy and reliability.</p> <p>In this article, we will dive into the mechanics of RAG, explore its architecture, and discuss the limitations of traditional generative models that inspired its creation. We will also highlight practical implementations, advanced techniques, and evaluation methods, showcasing how RAG is transforming the way AI interacts with specialized data.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is RAG</li> <li>Architecture of RAG</li> <li>RAG Process flow</li> <li>RAG vs Fine tuning</li> <li>Types of RAG</li> <li>Applications of RAG</li> <li>Building a PDF chat system with RAG</li> <li>Resources</li> </ul> <h3> What is RAG </h3> <p>Retrieval-Augmented Generation (RAG) is an advanced framework that enhances the capabilities of generative AI models by integrating real-time retrieval of external data. While generative models excel at producing coherent, human-like text, they can falter when asked to provide accurate, up-to-date, or domain-specific information. This is where RAG steps in, ensuring that the responses are not only creative but also grounded in reliable and relevant sources.</p> <p>RAG operates by connecting a generative model with a retrieval mechanism, typically powered by vector databases or search systems. When a query is received, the retrieval component searches through vast external datasets to fetch relevant information. 
The generative model then synthesizes this data, producing an output that is both accurate and contextually insightful.</p> <p>By addressing key challenges like hallucinations and limited domain knowledge, RAG unlocks the potential of generative models to excel in specialized fields. Its applications span diverse industries, from automating customer support with precise answers to enabling researchers to access curated knowledge on demand. RAG represents a significant step forward in making AI systems more intelligent, trustworthy, and useful in real-world scenarios.</p> <h3> Architecture of RAG </h3> <p>A clear understanding of RAG architecture is essential for unlocking its full potential and benefits. At its core, the framework is built on two primary components: the Retriever and the Generator, working together in a seamless flow of information processing.</p> <p>This overall process is illustrated below:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwmugmgcs2ybhg1qm7ck.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwmugmgcs2ybhg1qm7ck.png" alt="architecture" width="800" height="400"></a><br> source: <a href="proxy.php?url=https://weaviate.io/blog/introduction-to-rag" rel="noopener noreferrer">https://weaviate.io/blog/introduction-to-rag</a></p> <ul> <li> <strong>Retrieval</strong> — The inference stage in RAG begins with retrieval, where data relevant to a user query is fetched from an external knowledge source. In a basic RAG setup, similarity search is commonly used, embedding the query and external data into the same vector space to identify the closest matches. 
The Retriever plays a key role in fetching documents, employing methods like Sparse Retrieval and Dense Retrieval. Sparse Retrieval, using techniques like TF-IDF or BM25, relies on exact word matches but struggles with synonyms and paraphrasing, whereas Dense Retrieval leverages transformer models like BERT or RoBERTa to create semantic vector representations, enabling more accurate and nuanced matches.</li> <li> <strong>Augmentation</strong> — After retrieving the most relevant data points from the external source, the augmentation process incorporates this information by embedding it into a predefined prompt template.</li> <li> <strong>Generation</strong> — In the generation phase, the model uses the augmented prompt to craft a coherent, contextually accurate response by combining its internal language understanding with the retrieved external data. While augmentation integrates external facts, generation transforms this enriched information into natural, human-like text tailored to the user’s query.</li> </ul> <h3> RAG Process flow </h3> <p>All the stages and essential components of the RAG process flow are illustrated in the figure below.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9fcmc02s1b9mg55jgs1b.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9fcmc02s1b9mg55jgs1b.png" alt="process flow" width="800" height="384"></a><br> source: <a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">https://www.griddynamics.com/blog/retrieval-augmented-generation-llm</a></p> <ul> <li> <strong>Document Loading</strong>: The first step in the RAG process involves data 
preparation, which includes loading documents from storage, extracting, parsing, cleaning, and formatting text for document splitting. Text data can come in various formats, such as plain text, PDFs, Word documents, CSV, JSON, HTML, Markdown, or programming code. Preparing these diverse sources for LLMs typically requires converting them to plain text through extraction, parsing, and cleaning.</li> <li> <strong>Document Splitting</strong>: Documents are divided into smaller, manageable segments through text splitting or chunking, which is essential for handling large documents and adhering to token limits in LLMs (e.g., GPT-3’s 2048 tokens). Strategies include fixed-size or content-aware chunking, with the approach depending on the structure and requirements of the data. <img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6s8w3g7fzxxy1o2x1x6z.png" alt="document chunking" width="461" height="371"> </li> </ul> <p>Dividing documents into smaller chunks may seem simple, but it requires careful consideration of semantics to avoid splitting sentences inappropriately, which can affect subsequent steps like question answering. A naive fixed-size chunking approach can result in incomplete information in each chunk. Most document segmentation algorithms use chunk size and overlap, where chunk size is determined by character, word, or token count, and overlaps ensure continuity by sharing text between adjacent chunks. This strategy preserves the semantic context across chunks.</p> <ul> <li> <strong>Text Embedding</strong>: The text chunks are transformed into vector embeddings, which represent each chunk in a way that allows for easy comparison of semantic similarity. Vector embeddings map complex data, like text, into a mathematical space where similar data points cluster together. 
This process captures the semantic meaning of the text, so sentences with similar meaning, even if worded differently, are mapped close together in the vector space. For instance, “The cat chases the mouse” and “The feline pursues the rodent” would be mapped to nearby points despite their different wording.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2d19vx4zvhp279au7u5t.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2d19vx4zvhp279au7u5t.png" alt="embedding" width="800" height="340"></a><br> source: <a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">https://www.griddynamics.com/blog/retrieval-augmented-generation-llm</a></p> <ul> <li> <strong>Vector store</strong>: After documents are segmented and converted into vector embeddings, they are stored in a vector store, a specialized database for storing and managing vectors. A vector store enables efficient searches for similar vectors, which is crucial for the execution of a RAG model. The selection of a vector store depends on factors like data scale and available computational resources.</li> </ul> <p>Some of the important vector databases are:</p> <ul> <li> <strong>FAISS</strong>: Developed by Facebook AI, FAISS efficiently manages large collections of high-dimensional vectors and performs similarity searches and clustering in high-dimensional environments. 
It optimizes memory usage and query duration, making it suitable for handling billions of vectors.</li> <li> <strong>Chroma</strong>: An open-source, in-memory vector database, Chroma is designed for LLM applications, offering a scalable platform for vector storage, search, and retrieval. It supports both cloud and on-premise deployment and is versatile in handling various data types and formats.</li> <li> <strong>Weaviate</strong>: An open-source vector database that can be self-hosted or fully managed. It focuses on high performance, scalability, and flexibility, supporting a wide range of data types and applications. It allows for the storage of both vectors and objects, enabling the combination of vector-based and keyword-based search techniques.</li> <li> <strong>Pinecone</strong>: A cloud-based, managed vector database designed to simplify the development and deployment of large-scale ML applications. Unlike many vector databases, Pinecone uses proprietary, closed-source code. It excels in handling high-dimensional vectors and is suitable for applications like similarity search, recommendation systems, personalization, and semantic search. Pinecone also features a single-stage filtering capability.</li> <li> <strong>Document retrieval</strong>: The retrieval process in information retrieval systems, such as document searching or question answering, begins when a query is received and transformed into a vector using the same embedding model as the document indexing. The goal is to return relevant document chunks by comparing the query vector with stored chunk vectors in the index (vector store). The retriever’s role is to identify and return the IDs of relevant document chunks, without storing documents. Various search methods can be used, such as similarity search (based on cosine similarity) and threshold-based retrieval, which only returns documents exceeding a certain similarity score. 
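To make the cosine-similarity and threshold-based retrieval just described concrete, here is a small self-contained sketch. The 4-dimensional vectors and chunk IDs are made up for illustration; real embeddings have hundreds of dimensions and would come from an embedding model:

```python
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity between two vectors: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hypothetical embeddings for three stored chunks and one query (toy 4-dim vectors).
chunk_vectors = {
    "chunk-1": [1.0, 0.0, 0.0, 0.0],
    "chunk-2": [0.9, 0.1, 0.0, 0.0],
    "chunk-3": [0.0, 0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0, 0.0]

# Threshold-based retrieval: keep only chunk IDs whose similarity clears a cutoff,
# ordered from most to least similar.
threshold = 0.8
scored = sorted(((cosine_sim(query, v), cid) for cid, v in chunk_vectors.items()), reverse=True)
relevant_ids = [cid for score, cid in scored if score >= threshold]
print(relevant_ids)  # → ['chunk-1', 'chunk-2']
```

Note that, as described above, the retriever returns only the IDs of relevant chunks; the chunks themselves live in the vector store.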
Additionally, LLM-aided retrieval is useful for queries involving both content and metadata filtering.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfke1j1792ik71c9h47m.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfke1j1792ik71c9h47m.png" alt="vector database" width="800" height="310"></a><br> source: <a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">https://www.griddynamics.com/blog/retrieval-augmented-generation-llm</a></p> <ul> <li> <strong>Answer generation</strong>: In the retrieval process, relevant document chunks are combined with the user query to generate a context and prompt for the LLM. The simplest approach, called the “Stuff” method in LangChain, involves funneling all chunks into the same context window for a direct, straightforward answer. However, this method struggles with large document volumes and complex queries due to context window limitations. To address this, alternative methods like Map-reduce, Refine, and Map-rerank are used. Map-reduce sends documents separately to the LLM, then combines the responses. 
Refine iteratively updates the answer as each additional document is processed, while Map-rerank scores each document’s answer by relevance and returns the highest-ranked one, which is useful when several plausible answers exist.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hkyh11izk5ef812o5yy.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hkyh11izk5ef812o5yy.png" alt="answer generation" width="800" height="92"></a></p> <h3> RAG vs Fine tuning </h3> <p>RAG (Retrieval-Augmented Generation) and fine-tuning are two key methods to extend LLM capabilities, each suited to different scenarios. Fine-tuning involves retraining LLMs on domain-specific data to perform specialized tasks, ideal for static, narrow use cases like branding or creative writing that require a specific tone or style. However, it is costly, time-consuming, and unsuitable for dynamic, frequently updated data.</p> <p>On the other hand, RAG enhances LLMs by retrieving external data dynamically without modifying model weights, making it cost-effective and ideal for real-time, data-driven environments like legal, financial, or customer service applications. RAG enables LLMs to handle large, unstructured internal document corpora, offering significant advantages over traditional methods for navigating messy data repositories.</p> <p>Fine-tuning excels at creating nuanced, consistent outputs, whereas RAG provides up-to-date, accurate information by leveraging external knowledge bases.
In practice, RAG is often the preferred choice for applications requiring real-time, adaptable responses, especially in enterprises managing vast, unstructured data.</p> <h3> Types of RAG </h3> <p>There are several types of Retrieval-Augmented Generation (RAG) approaches, each tailored to specific use cases and objectives. The primary types include:</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrw207pb7ka8nik8nqdq.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrw207pb7ka8nik8nqdq.png" alt="rag types" width="800" height="995"></a><br> source: <a href="proxy.php?url=https://x.com/weaviate_io/status/1866528335884325070" rel="noopener noreferrer">https://x.com/weaviate_io/status/1866528335884325070</a></p> <ul> <li> <strong>Native RAG</strong>: Refers to a tightly integrated approach where the retrieval and generative components of a Retrieval-Augmented Generation system are designed to work seamlessly within the same architecture. Unlike traditional implementations that rely on external tools or APIs, native RAG optimizes the interaction between retrieval mechanisms and generative models, enabling faster processing and improved contextual relevance. This approach often uses in-memory processing or highly optimized local databases, reducing latency and resource overhead. Native RAG systems are typically tailored for specific use cases, providing enhanced efficiency, accuracy, and cost-effectiveness by eliminating dependencies on third-party services.</li> <li> <strong>Retrieve and Rerank RAG</strong>: Focuses on refining the retrieval process to improve accuracy and relevance. 
In this method, an initial set of documents or chunks is retrieved based on the query’s semantic similarity, usually determined by cosine similarity in the embedding space. Subsequently, a reranking model reorders the retrieved documents based on their contextual relevance to the query. This reranking step often leverages deep learning models or transformers, allowing more nuanced ranking beyond basic similarity metrics. By prioritizing the most relevant documents, this approach ensures the generative model receives contextually enriched input, significantly enhancing response quality.</li> <li> <strong>Multimodal RAG</strong>: Extends the traditional RAG paradigm by incorporating multiple data modalities, such as text, images, audio, or video, into the retrieval-augmented generation pipeline. It allows the system to retrieve and generate responses that integrate diverse forms of data. For instance, in a scenario involving image-based queries, the system might retrieve relevant images alongside textual content to create a more comprehensive answer. Multimodal RAG is particularly useful in domains like e-commerce, medical imaging, and multimedia content analysis, where insights often rely on a combination of textual and visual information.</li> <li> <strong>Graph RAG</strong>: Leverages graph-based data structures to model and retrieve information based on relationships and connections between entities. In this approach, knowledge is organized as a graph where nodes represent entities (e.g., concepts, documents, or objects), and edges capture their relationships (e.g., semantic, hierarchical, or temporal). Queries are processed to identify subgraphs or paths relevant to the input, and these subgraphs are then fed into the generative model. 
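As a toy illustration of the graph lookup just described, consider a knowledge graph stored as an adjacency map. The entities, relations, and the `retrieve_subgraph` helper below are all hypothetical, chosen only to show how a query entity expands into a relevant subgraph:

```python
# Hypothetical knowledge graph: entities as nodes, relations as adjacency lists.
graph = {
    "aspirin": ["pain relief", "blood thinning"],
    "ibuprofen": ["pain relief", "anti-inflammatory"],
    "pain relief": ["aspirin", "ibuprofen"],
}

def retrieve_subgraph(entity, depth=1):
    """Collect all nodes within `depth` hops of the query entity."""
    frontier, seen = {entity}, {entity}
    for _ in range(depth):
        # Expand the frontier one hop, skipping nodes we have already visited.
        frontier = {n for node in frontier for n in graph.get(node, [])} - seen
        seen |= frontier
    return seen

print(sorted(retrieve_subgraph("aspirin")))  # → ['aspirin', 'blood thinning', 'pain relief']
```

The returned subgraph (rather than isolated chunks) would then be serialized into the prompt, letting the generative model reason over relationships instead of flat text.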
This method is especially valuable in domains like scientific research, social networks, and knowledge management, where relational insights are critical.</li> <li> <strong>Hybrid RAG</strong>: Combines multiple retrieval techniques, such as dense and sparse retrieval, to enhance performance across diverse query types. Dense retrieval uses vector embeddings to capture semantic similarities, while sparse retrieval relies on keyword-based methods, like BM25, for precise matches. By integrating these methods, Hybrid RAG balances precision and recall, making it versatile across scenarios where queries may be highly specific or abstract. It is particularly effective in environments with heterogeneous data, ensuring that both high-level semantics and specific keywords are considered during retrieval.</li> <li> <strong>Agentic RAG (Router)</strong>: Employs a decision-making layer to dynamically route queries to appropriate retrieval and generative modules based on their characteristics. The router analyzes incoming queries to determine the optimal processing path, which may involve different retrieval methods, data sources, or even specialized generative models. This approach ensures that the system tailors its operations to the specific needs of each query, enhancing efficiency and accuracy in diverse applications.</li> <li> <strong>Agentic RAG (Multi-Agent RAG)</strong>: Multi-Agent RAG involves a collaborative framework where multiple specialized agents handle distinct aspects of the retrieval and generation process. Each agent is responsible for a specific task, such as retrieving data from a particular domain, reranking results, or generating responses in a specific style. These agents communicate and collaborate to deliver cohesive outputs. 
Multi-Agent RAG is particularly powerful for complex, multi-domain queries, as it enables the system to leverage the expertise of different agents to provide comprehensive and nuanced responses.</li> </ul> <h3> Applications of RAG </h3> <p>The Retrieval-Augmented Generation (RAG) framework has diverse applications across various industries due to its ability to dynamically integrate external knowledge into generative language models. Here are some prominent applications:</p> <ul> <li> <strong>Customer Support and Service</strong>: RAG systems are widely used in customer support to create intelligent chatbots capable of answering complex queries by retrieving relevant data from product manuals, knowledge bases, and company policy documents. This ensures that customers receive accurate and up-to-date information, enhancing their experience.</li> <li> <strong>Legal Document Analysis</strong>: In the legal field, RAG can parse, retrieve, and generate summaries or answers from vast corpora of case law, contracts, and legal documents. It is particularly useful for conducting legal research, drafting contracts, and ensuring compliance with regulations.</li> <li> <strong>Financial Analysis</strong>: RAG is employed in financial services to analyze earnings reports, market trends, and regulatory documents. By retrieving relevant financial data, it can help analysts generate insights, reports, or even real-time answers to queries about market performance.</li> <li> <strong>Healthcare and Medical Diagnostics</strong>: In healthcare, RAG is utilized to retrieve and synthesize information from medical literature, patient records, and treatment guidelines. 
It aids in diagnostic support, drug discovery, and personalized treatment recommendations, ensuring clinicians have access to the latest and most relevant data.</li> <li> <strong>Education and E-Learning</strong>: RAG-powered tools assist in personalized education by retrieving course material and generating tailored answers or study guides. They can enhance learning platforms by providing contextual explanations and dynamic content based on user queries.</li> <li> <strong>E-Commerce and Retail</strong>: In e-commerce, RAG systems improve product search and recommendation engines by retrieving data from catalogs and customer reviews. They also enable conversational shopping assistants that provide personalized product suggestions based on user preferences.</li> <li> <strong>Intelligent Virtual Assistants</strong>: RAG enhances virtual assistants like Alexa or Siri by providing accurate and contextually relevant responses, especially for queries requiring external knowledge, such as real-time weather updates or local business information.</li> </ul> <h2> Building a PDF chat system using RAG </h2> <p>In this section, we will develop a streamlit application capable of understanding the contents of a PDF and responding to user queries based on that content using the Retrieval-Augmented Generation (RAG). The implementation leverages the LangChain platform to facilitate interactions with LLMs and vector stores. We will utilize OpenAI’s LLM and its embedding models to construct a FAISS vector store for efficient information retrieval.</p> <h3> Installing dependencies </h3> <ul> <li>Create and activate a virtual environment by executing the following command. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate   # for ubuntu
venv/Scripts/activate      # for windows
</code></pre> </div> <ul> <li>Install langchain, langchain_community, openai, faiss-cpu, PyPDF2, streamlit, python-dotenv, tiktoken libraries using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install langchain langchain_community openai faiss-cpu PyPDF2 streamlit python-dotenv tiktoken
</code></pre> </div> <h3> Setting up environment and credentials </h3> <ul> <li>Create a file named <code>.env</code>. This file will store your environment variables, including the OpenAI key, model and embeddings.</li> <li>Open the .env file and add the following code to specify your OpenAI credentials: </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>OPENAI_API_KEY=sk-proj-xcQxBf5LslO62At...
OPENAI_MODEL_NAME=gpt-3.5-turbo
OPENAI_EMBEDDING_MODEL_NAME=text-embedding-3-small
</code></pre> </div> <h3> Importing environment variables </h3> <ul> <li>Create a file named <code>app.py</code>.</li> <li>Add OpenAI credentials to the environment variables.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from dotenv import load_dotenv
import os

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME")
OPENAI_EMBEDDING_MODEL_NAME = os.getenv("OPENAI_EMBEDDING_MODEL_NAME")
</code></pre> </div> <h3> Importing required libraries </h3> <p>Import the libraries needed to build the app and handle PDFs, such as LangChain, Streamlit, and PyPDF2.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_community.chat_models import ChatOpenAI
from htmlTemplates import bot_template, user_template, css
</code></pre> </div> <h3> Defining a function to extract text from PDFs </h3> <ul> <li>Use PyPDF2 to extract text from uploaded PDF files.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_pdf_text(pdf_files):
    text = ""
    for pdf_file in pdf_files:
        reader = PdfReader(pdf_file)
        for page in reader.pages:
            text += page.extract_text()
    return text
</code></pre> </div> <h3> Splitting extracted text into chunks </h3> <p>Divide large text into smaller, manageable chunks using LangChain’s <code>CharacterTextSplitter</code>.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_chunk_text(text):
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text)
    return chunks
</code></pre> </div> <h3> Creating a vector store for text embeddings </h3> <p>Generate embeddings for text chunks and store them in a vector database using FAISS.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_vector_store(text_chunks):
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, model=OPENAI_EMBEDDING_MODEL_NAME)
    vectorstore = FAISS.from_texts(texts=text_chunks, embedding=embeddings)
    return vectorstore
</code></pre> </div> <h3> Building a conversational retrieval chain </h3> <p>Define a chain that retrieves information from the vector store and interacts with the user via an LLM.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_conversation_chain(vector_store):
    llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model_name=OPENAI_MODEL_NAME, temperature=0)
    memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)
    system_template = """
    Use the following pieces of context and chat history to answer the
    question at the end. If you don't know the answer, just say that you
    don't know, don't try to make up an answer.

    Context: {context}

    Chat history: {chat_history}

    Question: {question}
    Helpful Answer:
    """
    prompt = PromptTemplate(
        template=system_template,
        input_variables=["context", "question", "chat_history"],
    )
    conversation_chain = ConversationalRetrievalChain.from_llm(
        verbose=True,
        llm=llm,
        retriever=vector_store.as_retriever(),
        memory=memory,
        combine_docs_chain_kwargs={"prompt": prompt}
    )
    return conversation_chain
</code></pre> </div> <h3> Handling user queries </h3> <p>Process user input, pass it to the conversation chain, and update the chat history.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def handle_user_input(question):
    try:
        response = st.session_state.conversation({'question': question})
        st.session_state.chat_history = response['chat_history']
    except Exception as e:
        st.error('Please select PDF and click on Process.')
</code></pre> </div> <h3> Creating custom HTML template for streamlit chat </h3> <p>To create a custom chat interface for both user and bot messages, design custom HTML templates and style them with CSS.</p> <ul> <li>Create a file named htmlTemplates.py and add the following code to it.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>css = '''
&lt;style&gt;
.chat-message {
    padding: 1rem; border-radius: 0.5rem; margin-bottom: 1rem; display: flex
}
.chat-message.user {
    background-color: #2b313e
}
.chat-message.bot {
    background-color: #475063
}
.chat-message .avatar {
    width: 10%;
}
.chat-message .avatar img {
    max-width: 30px; max-height: 30px; border-radius: 50%; object-fit: cover;
}
.chat-message .message {
    width: 90%; padding: 0 1rem; color: #fff;
}
&lt;/style&gt;
'''

bot_template = '''
&lt;div class="chat-message bot"&gt;
    &lt;div class="avatar"&gt;
        &lt;img src="https://cdn-icons-png.flaticon.com/128/773/773330.png"&gt;
    &lt;/div&gt;
    &lt;div class="message"&gt;{{MSG}}&lt;/div&gt;
&lt;/div&gt;
'''

user_template = '''
&lt;div class="chat-message user"&gt;
    &lt;div class="avatar"&gt;
        &lt;img src="https://cdn-icons-png.flaticon.com/128/6997/6997674.png"&gt;
    &lt;/div&gt;
    &lt;div class="message"&gt;{{MSG}}&lt;/div&gt;
&lt;/div&gt;
'''
</code></pre> </div> <h3> Displaying chat history </h3> <p>Show the user and AI conversation history in reverse order with HTML templates for formatting.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def display_chat_history():
    if st.session_state.chat_history:
        reversed_history = st.session_state.chat_history[::-1]
        formatted_history = []
        for i in range(0, len(reversed_history), 2):
            chat_pair = {
                "AIMessage": reversed_history[i].content,
                "HumanMessage": reversed_history[i + 1].content
            }
            formatted_history.append(chat_pair)
        for i, message in enumerate(formatted_history):
            st.write(user_template.replace("{{MSG}}", message['HumanMessage']), unsafe_allow_html=True)
            st.write(bot_template.replace("{{MSG}}", message['AIMessage']), unsafe_allow_html=True)
</code></pre> </div> <h3> Building Streamlit app interface </h3> <p>Set up the main app interface for file uploads, question input, and chat history display.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def main():
    st.set_page_config(page_title='Chat with PDFs', page_icon=':books:')
    st.write(css, unsafe_allow_html=True)

    if "conversation" not in st.session_state:
        st.session_state.conversation = None
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = None

    st.header('Chat with PDFs :books:')
    question = st.text_input("Ask anything to your PDF:")
    if question:
        handle_user_input(question)
    if st.session_state.chat_history is not None:
        display_chat_history()

    with st.sidebar:
        st.subheader("Upload your Documents Here: ")
        pdf_files = st.file_uploader("Choose your PDF Files and Press Process button", type=['pdf'], accept_multiple_files=True)
        if pdf_files and st.button("Process"):
            with st.spinner("Processing your PDFs..."):
                try:
                    # Get PDF Text
                    raw_text = get_pdf_text(pdf_files)
                    # Get Text Chunks
                    text_chunks = get_chunk_text(raw_text)
                    # Create Vector Store
                    vector_store = get_vector_store(text_chunks)
                    st.success("Your PDFs have been processed successfully. You can ask questions now.")
                    # Create conversation chain
                    st.session_state.conversation = get_conversation_chain(vector_store)
                except Exception as e:
                    st.error(f"An error occurred: {e}")

if __name__ == '__main__':
    main()
</code></pre> </div> <h3> Complete Code for the PDF Chat Application </h3> <p>The following is the complete code implementation for the PDF Chat Application.
It integrates environment variable setup, text extraction, vector storage, and RAG features into a streamlined solution:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from dotenv import load_dotenv
import os

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME")
OPENAI_EMBEDDING_MODEL_NAME = os.getenv("OPENAI_EMBEDDING_MODEL_NAME")

import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_community.chat_models import ChatOpenAI
from htmlTemplates import bot_template, user_template, css

def get_pdf_text(pdf_files):
    text = ""
    for pdf_file in pdf_files:
        reader = PdfReader(pdf_file)
        for page in reader.pages:
            text += page.extract_text()
    return text

def get_chunk_text(text):
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text)
    return chunks

def get_vector_store(text_chunks):
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, model=OPENAI_EMBEDDING_MODEL_NAME)
    vectorstore = FAISS.from_texts(texts=text_chunks, embedding=embeddings)
    return vectorstore

def get_conversation_chain(vector_store):
    llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model_name=OPENAI_MODEL_NAME, temperature=0)
    memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)
    system_template = """
    Use the following pieces of context and chat history to answer the
    question at the end. If you don't know the answer, just say that you
    don't know, don't try to make up an answer.

    Context: {context}

    Chat history: {chat_history}

    Question: {question}
    Helpful Answer:
    """
    prompt = PromptTemplate(
        template=system_template,
        input_variables=["context", "question", "chat_history"],
    )
    conversation_chain = ConversationalRetrievalChain.from_llm(
        verbose=True,
        llm=llm,
        retriever=vector_store.as_retriever(),
        memory=memory,
        combine_docs_chain_kwargs={"prompt": prompt}
    )
    return conversation_chain

def handle_user_input(question):
    try:
        response = st.session_state.conversation({'question': question})
        st.session_state.chat_history = response['chat_history']
    except Exception as e:
        st.error('Please select PDF and click on Process.')

def display_chat_history():
    if st.session_state.chat_history:
        reversed_history = st.session_state.chat_history[::-1]
        formatted_history = []
        for i in range(0, len(reversed_history), 2):
            chat_pair = {
                "AIMessage": reversed_history[i].content,
                "HumanMessage": reversed_history[i + 1].content
            }
            formatted_history.append(chat_pair)
        for i, message in enumerate(formatted_history):
            st.write(user_template.replace("{{MSG}}", message['HumanMessage']), unsafe_allow_html=True)
            st.write(bot_template.replace("{{MSG}}", message['AIMessage']), unsafe_allow_html=True)

def main():
    st.set_page_config(page_title='Chat with PDFs', page_icon=':books:')
    st.write(css, unsafe_allow_html=True)

    if "conversation" not in st.session_state:
        st.session_state.conversation = None
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = None

    st.header('Chat with PDFs :books:')
    question = st.text_input("Ask anything to your PDF:")
    if question:
        handle_user_input(question)
    if st.session_state.chat_history is not None:
        display_chat_history()

    with st.sidebar:
        st.subheader("Upload your Documents Here: ")
        pdf_files = st.file_uploader("Choose your PDF Files and Press Process button", type=['pdf'], accept_multiple_files=True)
        if pdf_files and st.button("Process"):
            with st.spinner("Processing your PDFs..."):
                try:
                    # Get PDF Text
                    raw_text = get_pdf_text(pdf_files)
                    # Get Text Chunks
                    text_chunks = get_chunk_text(raw_text)
                    # Create Vector Store
                    vector_store = get_vector_store(text_chunks)
                    st.success("Your PDFs have been processed successfully. You can ask questions now.")
                    # Create conversation chain
                    st.session_state.conversation = get_conversation_chain(vector_store)
                except Exception as e:
                    st.error(f"An error occurred: {e}")

if __name__ == '__main__':
    main()
</code></pre> </div> <h3> Run the Application </h3> <p>Execute the app with Streamlit using the following command.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run app.py
</code></pre> </div> <p>You will get output as follows:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzl8ge22vzqwgrwzctieb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzl8ge22vzqwgrwzctieb.png" alt="output" width="800" height="450"></a></p> <p>Thanks for reading this article !!</p> <p>Thanks to Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/pdf-chat-using-RAG" rel="noopener noreferrer">codemaker2015/pdf-chat-using-RAG | github.com</a></p> <h3> Resources </h3> <ul> <li><a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">RAG and LLM business process automation: A technical strategy - Grid Dynamics | www.griddynamics.com</a></li> <li><a href="proxy.php?url=https://weaviate.io/blog/introduction-to-rag" rel="noopener
noreferrer">Introduction to Retrieval Augmented Generation (RAG) | Weaviate</a></li> <li><a href="proxy.php?url=https://gradientflow.com/techniques-challenges-and-future-of-augmented-language-models" rel="noopener noreferrer">Techniques, Challenges, and Future of Augmented Language Models - Gradient Flow</a></li> <li><a href="proxy.php?url=https://www.arcus.co/blog/rag-at-planet-scale" rel="noopener noreferrer">Retrieval Augmented Generation at Planet Scale | Arcus</a></li> </ul> python rag genai beginners Building a symptoms-based diagnosis system using all-MiniLM-L6-V2 Vishnu Sivan Mon, 16 Dec 2024 09:02:18 +0000 https://dev.to/codemaker2015/building-a-symptoms-based-diagnosis-system-using-all-minilm-l6-v2-2efb https://dev.to/codemaker2015/building-a-symptoms-based-diagnosis-system-using-all-minilm-l6-v2-2efb <p>Small Language Models (SLMs) are compact neural models designed for efficiency, balancing lightweight architecture with effective performance on tasks like sentiment analysis and embedding generation. MiniLM, developed by Microsoft, exemplifies this with its optimized speed and accuracy for natural language understanding while using minimal resources.
all-MiniLM-L6-v2 is a specialized version of MiniLM, fine-tuned for sentence embeddings.</p> <p>In this article, we will explore SLMs and demonstrate how to create a symptoms-based diagnosis system using all-MiniLM-L6-V2.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Small Language Model (SLM)</li> <li>What is all-MiniLM-L6-V2</li> <li>Experimenting with all-MiniLM-L6-V2</li> <li>Sentence similarity using all-MiniLM-L6-V2</li> <li>Building a symptoms-based diagnosis system</li> <li>Importing necessary libraries</li> <li>Importing dataset</li> <li>Initializing sentence transformers</li> <li>Finding conditions by symptoms</li> <li>Testing with sample input</li> <li>Resources</li> </ul> <h3> What is Small Language Model (SLM) </h3> <p>Small Language Models (SLMs) are lightweight versions of large language models (LLMs) designed to be computationally efficient while retaining robust language processing capabilities. Unlike LLMs, which require substantial hardware resources and often operate in cloud-based environments, SLMs can run on less powerful devices, making them suitable for edge applications or scenarios with limited resources.</p> <h4> Key Characteristics of SLMs: </h4> <ul> <li> <strong>Compact Size</strong>: SLMs have fewer parameters, making them smaller in storage and faster in inference time compared to their larger counterparts.</li> <li> <strong>Efficiency</strong>: Optimized for resource-constrained environments without significant loss of functionality for common tasks.</li> <li> <strong>Specific Use Cases</strong>: Often tailored for particular tasks, such as classification, summarization, or recommendation systems, to maximize efficiency and relevance.</li> <li> <strong>Transfer Learning</strong>: Many SLMs are pre-trained on large datasets and fine-tuned for specific tasks, similar to LLMs, ensuring task-specific performance.
</li> </ul> <h4> Examples of SLMs: </h4> <ul> <li> <strong>MiniLM</strong>: Known for its efficiency, MiniLM achieves near state-of-the-art performance in tasks like semantic similarity and text classification with fewer computational resources.</li> <li> <strong>DistilBERT</strong>: A smaller, faster, and cheaper variant of BERT, designed for general-purpose tasks while maintaining strong accuracy.</li> <li> <strong>TinyBERT</strong>: Focused on low-latency applications and mobile device compatibility.</li> <li> <strong>ALBERT</strong>: A lite version of BERT that achieves compactness through parameter sharing and factorization techniques.</li> </ul> <h4> Applications: </h4> <p>SLMs are widely used in:</p> <ul> <li>Mobile and embedded systems for on-device processing.</li> <li>Real-time applications, such as chatbots or recommendation systems.</li> <li>Domains where low latency and privacy are critical (e.g., healthcare or financial systems).</li> </ul> <h3> What is all-MiniLM-L6-V2 </h3> <p>MiniLM (Minimal Language Model) is a family of lightweight transformer-based models designed for natural language understanding and retrieval tasks. Developed by Microsoft Research, it focuses on achieving high performance similar to large models like BERT while being computationally efficient. MiniLM is particularly useful for scenarios requiring real-time processing or where resources are limited, such as mobile or edge devices.</p> <p>all-MiniLM-L6-v2 is a specialized version of MiniLM, fine-tuned for sentence embeddings. It is part of the Sentence Transformers library and is widely used for generating high-quality sentence embeddings in tasks requiring semantic textual similarity.</p> <h4> Key Characteristics: </h4> <ul> <li> <strong>Architecture</strong>: MiniLM-L6 refers to a 6-layer version of MiniLM. V2 signifies an updated and optimized version.</li> <li> <strong>Optimization</strong>: Fine-tuned on large-scale datasets for sentence similarity tasks.
Pre-trained on the MS MARCO dataset for information retrieval and question answering, ensuring strong semantic understanding.</li> <li> <strong>Output</strong>: Produces 384-dimensional sentence embeddings, balancing quality and efficiency.</li> <li> <strong>Applications</strong>: Semantic search, text clustering, question answering systems, recommendation engines.</li> </ul> <h3> Experimenting with all-MiniLM-L6-V2 </h3> <p>Let's get started exploring all-MiniLM-L6-V2 by installing the sentence-transformers library.</p> <h4> Installing dependencies </h4> <ul> <li>Create and activate a virtual environment by executing the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  #for ubuntu
venv/Scripts/activate     #for windows
</code></pre> </div> <ul> <li>Install the sentence-transformers and pandas libraries using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install -U sentence-transformers pandas
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau16szl3gpusqjee3cok.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau16szl3gpusqjee3cok.png" alt="Installation" width="800" height="221"></a></p> <h4> Sentence similarity using all-MiniLM-L6-V2 </h4> <p>Let’s create embeddings for an array of sentences and compute the similarities between them.</p> <ul> <li>Create a file named app.py and add the following code to it.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from sentence_transformers import SentenceTransformer, util

# Load the MiniLM model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Define an array of sentences
sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "A fast dark fox leaps across a sleepy canine.",
    "The weather is sunny and warm today.",
    "The forecast predicts a bright and hot day."
]

# Create embeddings for each sentence
embeddings = model.encode(sentences, convert_to_tensor=True)

# Calculate pairwise cosine similarity
similarity_matrix = util.cos_sim(embeddings, embeddings)

# Display the similarity scores
print("Sentence Similarity Scores:")
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        print(f"Similarity between \"{sentences[i]}\" and \"{sentences[j]}\": {similarity_matrix[i][j]:.4f}")
</code></pre> </div> <ul> <li>Run the code using the following command to see the output. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python app.py
</code></pre> </div> <p>The expected output is as follows:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgkygq9i0c9jzybsiap8.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgkygq9i0c9jzybsiap8.png" alt="Output" width="800" height="385"></a></p> <h3> Building a symptoms-based diagnosis system </h3> <p>A symptoms-based diagnosis system using all-MiniLM-L6-V2 converts medical text, such as symptoms or treatments, into embeddings that capture context.
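</p>
<p>The comparison used throughout this system is cosine similarity between embedding vectors. A minimal sketch of the metric itself, shown here in plain Python on hypothetical 3-dimensional toy vectors rather than real model output:</p>

```python
import math

def cos_sim(a, b):
    # Cosine similarity: dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing in the same direction score 1.0; orthogonal vectors score 0.0.
print(round(cos_sim([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 4))  # prints 1.0
print(round(cos_sim([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]), 4))  # prints 0.0
```

<p>The <code>util.cos_sim</code> helper from sentence-transformers computes the same quantity over whole batches of embeddings at once.</p>
<p>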
These embeddings enable effective comparison of symptoms, providing accurate condition or treatment recommendations and helping users discover relevant care options.</p> <h4> Importing necessary libraries </h4> <p>Import sentence-transformers to use the all-MiniLM-L6-v2 model and pandas for loading the dataset.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import pandas as pd
from sentence_transformers import SentenceTransformer, util

pd.set_option('display.max_columns', None)
</code></pre> </div> <h4> Importing dataset </h4> <p>Kaggle provides a dataset with information on symptoms and treatments for over 400 medical conditions.</p> <p><a href="proxy.php?url=https://www.kaggle.com/datasets/aadyasingh55/disease-and-symptoms" rel="noopener noreferrer">Disease and Symptoms | Explore Symptoms and Treatments for 400+ Medical Conditions! | www.kaggle.com</a></p> <p>This dataset is loaded into a Pandas DataFrame named df, and the first few entries are displayed to understand its structure and content.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>df = pd.read_csv('Diseases_Symptoms.csv')
print(df.head())
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7xpmk5sapnzof0byxhi.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7xpmk5sapnzof0byxhi.png" alt="Data" width="800" height="358"></a></p> <h4> Initializing sentence transformers </h4> <p>The Sentence Transformer model all-MiniLM-L6-v2 is initialized to convert the symptom descriptions in the dataset's <code>Symptoms</code> column into vector embeddings.
A new column, Symptom_Embedding, is added to the DataFrame to store the embeddings for each disease's symptoms.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>model = SentenceTransformer('all-MiniLM-L6-v2')

df['Symptom_Embedding'] = df['Symptoms'].apply(lambda x: model.encode(x))
</code></pre> </div> <h4> Finding conditions by symptoms </h4> <p>Define a function <code>find_condition_by_symptoms()</code> that identifies the best-matching medical condition based on user-provided symptoms. It generates an embedding for the input symptoms and calculates cosine similarity with pre-computed embeddings of diseases in the dataset. The similarity scores are stored in the Similarity column, and the condition with the highest score is identified as the best match using <code>.idxmax()</code>. The function then retrieves and returns the Name of the disease and its corresponding Treatments.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def find_condition_by_symptoms(input_symptoms):
    input_embedding = model.encode(input_symptoms)
    df['Similarity'] = df['Symptom_Embedding'].apply(lambda x: util.cos_sim(input_embedding, x).item())
    best_match = df.loc[df['Similarity'].idxmax()]
    return best_match['Name'], best_match['Treatments']
</code></pre> </div> <h4> Testing with sample input </h4> <p>Provide an example input for symptoms to pass to the <code>find_condition_by_symptoms()</code> function.
The function will return and print the name of the matching condition along with the recommended treatments.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>symptoms = "Fever, sore throat, and fatigue"

condition, treatments = find_condition_by_symptoms(symptoms)

print("Symptoms:", symptoms)
print("Condition:", condition)
print("Recommended Treatments:", treatments)
</code></pre> </div> <h3> Final code </h3> <p>Below is the complete code for the app.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import pandas as pd
from sentence_transformers import SentenceTransformer, util

pd.set_option('display.max_columns', None)

# Load the data
df = pd.read_csv('Diseases_Symptoms.csv')
# print(df.head())

# Initialize a Sentence Transformer model to generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

# Generate embeddings for each condition's symptoms
df['Symptom_Embedding'] = df['Symptoms'].apply(lambda x: model.encode(x))

# Function to find matching condition based on input symptoms
def find_condition_by_symptoms(input_symptoms):
    input_embedding = model.encode(input_symptoms)
    df['Similarity'] = df['Symptom_Embedding'].apply(lambda x: util.cos_sim(input_embedding, x).item())
    best_match = df.loc[df['Similarity'].idxmax()]
    return best_match['Name'], best_match['Treatments']

# Sample input and output
symptoms = "Fever, sore throat, and fatigue"
condition, treatments = find_condition_by_symptoms(symptoms)
print("Symptoms:", symptoms)
print("Condition:", condition)
print("Recommended Treatments:", treatments)
</code></pre> </div> <p>If you run the app, the expected output is as follows:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fum3auswwek2d4x401y3b.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fum3auswwek2d4x401y3b.png" alt="Output" width="800" height="85"></a></p> <blockquote> <p>MiniLM-L6-V2 helps to improve healthcare accessibility and efficiency through symptom-based disease diagnosis. By generating embeddings for user-provided symptoms, the system can accurately identify conditions and offer treatment recommendations. However, challenges such as incomplete data, symptom variability, and data security need to be addressed to enhance accuracy and user experience.</p> </blockquote> <p>Thanks for reading this article !!</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/diagnosis-system-using-MiniLM" rel="noopener noreferrer">GitHub - codemaker2015/diagnosis-system-using-MiniLM: Building a symptoms-based diagnosis system : github.com</a></p> <h3> Resources </h3> <ul> <li><a href="proxy.php?url=https://sbert.net/docs/quickstart.html" rel="noopener noreferrer">Quickstart — Sentence Transformers documentation</a></li> <li><a href="proxy.php?url=https://www.kaggle.com/datasets/aadyasingh55/disease-and-symptoms" rel="noopener noreferrer">https://www.kaggle.com/datasets/aadyasingh55/disease-and-symptoms</a></li> <li><a href="proxy.php?url=https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" rel="noopener noreferrer">https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2</a></li> </ul> AISuite: Simplifying GenAI integration across multiple LLM providers Vishnu Sivan Mon, 16 Dec 2024 08:50:06 +0000 https://dev.to/codemaker2015/aisuite-simplifying-genai-integration-across-multiple-llm-providers-4hmm 
https://dev.to/codemaker2015/aisuite-simplifying-genai-integration-across-multiple-llm-providers-4hmm <p>Generative AI (Gen AI) is reshaping industries with its potential for creativity, problem-solving, and automation. However, developers often face significant challenges when integrating large language models (LLMs) from different providers due to fragmented APIs and configurations. This lack of interoperability complicates workflows, extends development timelines, and hampers the creation of effective Gen AI applications.</p> <p>To address this, Andrew Ng’s team has introduced AISuite, an open-source Python library that streamlines the integration of LLMs across providers like OpenAI, Anthropic, and Ollama. AISuite enables developers to switch between models with a simple “<code>provider:model</code>” string (e.g., openai:gpt-4o or anthropic:claude-3-5), eliminating the need for extensive code rewrites. By providing a unified interface, AISuite significantly reduces complexity, accelerates development, and opens new possibilities for building versatile Gen AI applications.</p> <p>In this article, we will explore how AISuite works, its practical applications, and its effectiveness in addressing the challenges of working with diverse LLMs.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is AISuite</li> <li>Why is AISuite important</li> <li>Experimenting with AISuite</li> <li>Creating a Chat Completion</li> <li>Creating a generic function for querying</li> </ul> <h3> What is AISuite </h3> <p>AISuite is an open-source Python library developed by Andrew Ng’s team to simplify the integration and management of large language models (LLMs) from multiple providers. 
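</p>
<p>The core idea, one call signature routed to different providers by a "provider:model" string, can be sketched in plain Python. This is an illustration of the routing pattern only, with hypothetical stub handlers, not AISuite's actual internals:</p>

```python
# Illustration of "provider:model" routing; the handlers are stand-in stubs,
# not real provider clients.
def openai_handler(model, messages):
    return f"[openai/{model}] would answer: {messages[-1]['content']}"

def anthropic_handler(model, messages):
    return f"[anthropic/{model}] would answer: {messages[-1]['content']}"

HANDLERS = {"openai": openai_handler, "anthropic": anthropic_handler}

def create(model, messages):
    # Split on the first colon only: model ids may themselves contain
    # colons (e.g. "ollama:llama3.1:8b").
    provider, model_id = model.split(":", 1)
    return HANDLERS[provider](model_id, messages)

msgs = [{"role": "user", "content": "Tell a joke in 1 line."}]
print(create("openai:gpt-4o", msgs))
print(create("anthropic:claude-3-5-sonnet-20241022", msgs))
```

<p>Swapping providers then comes down to editing one string, which is the convenience AISuite provides for real model back ends.</p>
<p>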
It abstracts the complexities of working with diverse APIs, configurations, and data formats, providing developers with a unified framework to streamline their workflows.</p> <h4> Key Features of AISuite: </h4> <ul> <li> <strong>Straightforward Interface</strong>: AISuite offers a simple and consistent interface for managing various LLMs. Developers can integrate models into their applications with just a few lines of code, significantly lowering the barriers to entry for Gen AI projects.</li> <li> <strong>Unified Framework</strong>: By abstracting the differences between multiple APIs, AISuite handles different types of requests and responses seamlessly. This reduces development overhead and accelerates prototyping and deployment.</li> <li> <strong>Easy Model Switching</strong>: With AISuite, switching between models is as easy as changing a single string in the code. For example, developers can specify a “provider:model” combination like openai:gpt-4o or anthropic:claude-3-5 without rewriting significant parts of their application.</li> <li> <strong>Extensibility</strong>: AISuite is designed to adapt to the evolving Gen AI landscape. Developers can add new models and providers as they become available, ensuring applications remain up-to-date with the latest AI capabilities.</li> </ul> <h3> Why is AISuite Important? </h3> <p>AISuite addresses a critical pain point in the Gen AI ecosystem: the lack of interoperability between LLMs from different providers. By providing a unified interface, it simplifies the development process, saving time and reducing costs. This flexibility allows teams to optimize performance by selecting the best model for specific tasks.</p> <p>Early benchmarks and community feedback highlight AISuite’s ability to reduce integration time for multi-model applications, improving developer efficiency and productivity. 
As the Gen AI ecosystem grows, AISuite lowers barriers for experimenting, building, and scaling AI-powered solutions.</p> <h2> Experimenting with AISuite </h2> <p>Let's get started exploring AISuite by installing the necessary dependencies.</p> <h3> Installing dependencies </h3> <ul> <li>Create and activate a virtual environment by executing the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  #for ubuntu
venv/Scripts/activate     #for windows
</code></pre> </div> <ul> <li>Install the aisuite, openai, and python-dotenv libraries using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install aisuite[all] openai python-dotenv
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjsxrmuun4v7226n39ju.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjsxrmuun4v7226n39ju.png" alt="installation" width="800" height="186"></a></p> <h3> Setting up environment and credentials </h3> <p>Create a file named <code>.env</code>. This file will store your environment variables, including your API keys.</p> <ul> <li>Open the .env file and add the following code to specify your OpenAI and Groq API keys: </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>OPENAI_API_KEY=sk-proj-7XyPjkdaG_gDl0_...
GROQ_API_KEY=gsk_8NIgj24k2P0J5RwrwoOBW...
</code></pre> </div> <ul> <li>Add API keys to the environment variables.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from getpass import getpass
from dotenv import load_dotenv

load_dotenv()

os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')
os.environ['ANTHROPIC_API_KEY'] = getpass('Enter your ANTHROPIC API key: ')
</code></pre> </div> <h3> Initialize the AISuite Client </h3> <p>Create an instance of the AISuite client, enabling standardized interaction with multiple LLMs.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import aisuite as ai

client = ai.Client()
</code></pre> </div> <h3> Defining the prompt </h3> <p>The prompt syntax closely resembles OpenAI’s structure, incorporating roles and content.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell a joke in 1 line."}
]
</code></pre> </div> <h3> Querying the model </h3> <p>Users can query the model using AISuite as follows.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># openai model
response = client.chat.completions.create(model="openai:gpt-4o", messages=messages, temperature=0.75)

# ollama model
response = client.chat.completions.create(model="ollama:llama3.1:8b", messages=messages, temperature=0.75)

# anthropic model
response = client.chat.completions.create(model="anthropic:claude-3-5-sonnet-20241022", messages=messages, temperature=0.75)

# groq model
response = client.chat.completions.create(model="groq:llama-3.2-3b-preview", messages=messages, temperature=0.75)

print(response.choices[0].message.content)
</code></pre> </div> <ul> <li> <strong>model="openai:gpt-4o"</strong>: Specifies the type and version of the model.</li> <li> <strong>messages=messages</strong>: Sends the previously defined prompt to the model.</li> <li> <strong>temperature=0.75</strong>: Adjusts the randomness of the response.
Higher values encourage creative outputs, while lower values produce more deterministic results.</li> <li> <strong>response.choices[0].message.content</strong>: Retrieves the text content from the model's response.</li> </ul> <h3> Creating a Chat Completion </h3> <p>Let's create a chat completion using the OpenAI model.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from dotenv import load_dotenv

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')

import aisuite as ai

client = ai.Client()

provider = "openai"
model_id = "gpt-4o"

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Provide an overview of the latest trends in AI"},
]

response = client.chat.completions.create(
    model=f"{provider}:{model_id}",
    messages=messages,
)

print(response.choices[0].message.content)
</code></pre> </div> <ul> <li>Run the app using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python app.py
</code></pre> </div> <p>You will get output as follows:</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fann9d0nooxqirdi7wj8w.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fann9d0nooxqirdi7wj8w.png" alt="output" width="800" height="263"></a></p> <h3> Creating a generic function for querying </h3> <p>Instead of writing separate code for calling different models, let’s create a generic function to eliminate code repetition and improve efficiency.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def ask(message, sys_message="You are a helpful assistant", model="openai:gpt-4o"):
    client = ai.Client()
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message}
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

print(ask("Provide an overview of the latest trends in AI"))
</code></pre> </div> <p>The ask function is a reusable utility designed for sending queries to an AI model. It accepts the following parameters:</p> <ul> <li> <strong>message</strong>: The user's query or prompt.</li> <li> <strong>sys_message</strong> (optional): A system-level instruction to guide the model's behavior.</li> <li> <strong>model</strong>: Specifies the AI model to be used.</li> </ul> <p>The function processes the input parameters, sends them to the specified model, and returns the AI’s response, making it a versatile tool for interacting with various models.</p> <p>Below is the complete code for interacting with the OpenAI model using the generic ask function.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from dotenv import load_dotenv

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')

import aisuite as ai

def ask(message, sys_message="You are a helpful assistant", model="openai:gpt-4o"):
    client = ai.Client()
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message}
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

print(ask("Provide an overview of the latest trends in AI"))
</code></pre> </div> <p>Running the code will produce the following output.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdd4z6xribum46i5mjgok.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdd4z6xribum46i5mjgok.png" alt="output" width="800" height="235"></a></p> <h3> Interacting with multiple APIs </h3> <p>Let’s explore interacting with multiple models using AISuite through the following code.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from dotenv import load_dotenv

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')
os.environ['GROQ_API_KEY'] = os.getenv('GROQ_API_KEY')

import aisuite as ai

def ask(message, sys_message="You are a helpful assistant", model="openai:gpt-4o"):
    client = ai.Client()
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message}
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

print(ask("Who is your creator?"))
print(ask('Who is your creator?', model='ollama:qwen2:1.5b'))
print(ask('Who is your creator?', model='groq:llama-3.1-8b-instant'))
print(ask('Who is your creator?', model='anthropic:claude-3-5-sonnet-20241022'))
</code></pre> </div> <p>There may still be challenges when interacting with providers like Anthropic or Groq, but the AISuite team is actively addressing these issues to ensure seamless integration and functionality.</p> <p>AISuite is a powerful tool for navigating the landscape of large language models. It enables users to leverage the strengths of multiple AI providers while streamlining development and encouraging innovation.
With its open-source foundation and intuitive design, AISuite stands out as a cornerstone for modern AI application development.</p> <p>Thanks for reading this article !!</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/aisuite-examples" rel="noopener noreferrer">GitHub - codemaker2015/aisuite-examples : github.com</a></p> <h3> Resources </h3> <p><a href="proxy.php?url=https://github.com/andrewyng/aisuite" rel="noopener noreferrer">GitHub - andrewyng/aisuite: Simple, unified interface to multiple Generative AI providers : github.com</a></p> python genai beginners Building a video insights generator using Gemini Flash Vishnu Sivan Tue, 19 Nov 2024 03:32:41 +0000 https://dev.to/codemaker2015/building-a-video-insights-generator-using-gemini-flash-1aho https://dev.to/codemaker2015/building-a-video-insights-generator-using-gemini-flash-1aho <p>Video understanding or video insights are crucial across various industries and applications due to their multifaceted benefits. They enhance content analysis and management by automatically generating metadata, categorizing content, and making videos more searchable. Moreover, video insights provide critical data that drive decision-making, enhance user experiences, and improve operational efficiencies across diverse sectors.</p> <p>Google’s Gemini 1.5 model brings significant advancements to this field. Beyond its impressive improvements in language processing, this model can handle an enormous input context of up to 1 million tokens. To further its capabilities, Gemini 1.5 is trained as a multimodal model, natively processing text, images, audio, and video. 
This powerful combination of varied input types and extensive context size opens up new possibilities for processing long videos effectively.</p> <p>In this article, we will dive into how Gemini 1.5 can be leveraged for generating valuable video insights, transforming the way we understand and utilize video content across different domains.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Gemini 1.5</li> <li>Prerequisites</li> <li>Installing dependencies</li> <li>Setting up the Gemini API key</li> <li>Setting up the environment variables</li> <li>Importing the libraries</li> <li>Initializing the project</li> <li>Saving uploaded files</li> <li>Generating insights from videos</li> <li>Upload a video to the Files API</li> <li>Get File</li> <li>Response Generation</li> <li>Delete File</li> <li>Combining the stages</li> <li>Creating the interface</li> <li>Creating the streamlit app</li> </ul> <h2> What is Gemini 1.5 </h2> <p>Google’s Gemini 1.5 represents a significant leap forward in AI performance and efficiency. Building upon extensive research and engineering innovations, this model features a new Mixture-of-Experts (MoE) architecture, enhancing both training and serving efficiency. 
Available in public preview, Gemini 1.5 Pro and 1.5 Flash offer an impressive 1 million token context window through Google AI Studio and Vertex AI.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8doyqria9si9j6xz9m4.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8doyqria9si9j6xz9m4.png" alt="gemini" width="800" height="536"></a></p> <p>Google Gemini updates: Flash 1.5, Gemma 2 and Project Astra (blog.google)</p> <p>The 1.5 Flash model, the newest addition to the Gemini family, is the fastest and most optimized for high-volume, high-frequency tasks. It is designed for cost-efficiency and excels in applications such as summarization, chat, image and video captioning, and extracting data from extensive documents and tables. With these advancements, Gemini 1.5 sets a new standard for performance and versatility in AI models.</p> <h3> Prerequisites </h3> <ul> <li>Python 3.9+ (<a href="proxy.php?url=https://www.python.org/downloads" rel="noopener noreferrer">https://www.python.org/downloads</a>)</li> <li><a href="proxy.php?url=https://pypi.org/project/google-generativeai" rel="noopener noreferrer">google-generativeai</a></li> <li><a href="proxy.php?url=https://streamlit.io" rel="noopener noreferrer">streamlit</a></li> </ul> <h3> Installing dependencies </h3> <ul> <li>Create and activate a virtual environment by executing the following command.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  #for ubuntu
venv/Scripts/activate     #for windows
</code></pre> </div> <ul> <li>Install the google-generativeai, streamlit, and python-dotenv libraries using pip. Note that google-generativeai requires Python 3.9+ to work. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install google-generativeai streamlit python-dotenv
</code></pre> </div> <h3> Setting up the Gemini API key </h3> <p>To access the Gemini API and begin working with its functionalities, you can acquire a free Google API Key by registering with Google AI Studio. Google AI Studio, offered by Google, provides a user-friendly, visual-based interface for interacting with the Gemini API. Within Google AI Studio, you can seamlessly engage with Generative Models through its intuitive UI, and if desired, generate an API Token for enhanced control and customization.</p> <p>Follow the steps to generate a Gemini API key:</p> <ul> <li>To initiate the process, you can either click the link (<a href="proxy.php?url=https://aistudio.google.com/app" rel="noopener noreferrer">https://aistudio.google.com/app</a>) to be redirected to Google AI Studio or perform a quick search on Google to locate it.</li> <li>Accept the terms of service and click on continue.</li> <li>Click on the Get API key link in the sidebar, then click the Create API key in new project button to generate the key.</li> <li>Copy the generated API key.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0tbegeuj1oc9zpbl7sxh.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0tbegeuj1oc9zpbl7sxh.png" alt="ai studio" width="800" height="140"></a></p> <h3> Setting up the environment variables </h3> <p>Begin by creating a new folder for your project. Choose a name that reflects the purpose of your project.<br> Inside your new project folder, create a file named .env. This file will store your environment variables, including your Gemini API key.<br> Open the .env file and add the following code to specify your Gemini API key:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>GOOGLE_API_KEY=AIzaSy......
</code></pre> </div> <h3> Importing the libraries </h3> <p>To get started with your project and ensure you have all the necessary tools, you need to import several key libraries as follows.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import time

import google.generativeai as genai
import streamlit as st
from dotenv import load_dotenv
</code></pre> </div> <ul> <li> <code>google.generativeai as genai</code>: Imports the Google Generative AI library for interacting with the Gemini API.</li> <li> <code>streamlit as st</code>: Imports Streamlit for creating web apps.</li> <li> <code>from dotenv import load_dotenv</code>: Loads environment variables from a .env file.</li> </ul> <h3> Initializing the project </h3> <p>To set up your project, you need to configure the API key and create a directory for temporary file storage for uploaded files.</p> <p>Define the media folder and configure the Gemini API key by initializing the necessary settings.
Add the following code to your script:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>MEDIA_FOLDER = 'medias'

def __init__():
    # Create the media directory if it doesn't exist
    if not os.path.exists(MEDIA_FOLDER):
        os.makedirs(MEDIA_FOLDER)
    # Load environment variables from the .env file
    load_dotenv()
    # Retrieve the API key from the environment variables
    api_key = os.getenv("GEMINI_API_KEY")
    # Configure the Gemini API with your API key
    genai.configure(api_key=api_key)
</code></pre> </div> <h3> Saving uploaded files </h3> <p>To store uploaded files in the media folder and return their paths, define a method called <code>save_uploaded_file</code> and add the following code to it.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def save_uploaded_file(uploaded_file):
    """Save the uploaded file to the media folder and return the file path."""
    file_path = os.path.join(MEDIA_FOLDER, uploaded_file.name)
    with open(file_path, 'wb') as f:
        f.write(uploaded_file.read())
    return file_path
</code></pre> </div> <h3> Generating insights from videos </h3> <p>Generating insights from videos involves several crucial stages, including uploading, processing, and response generation.</p> <h4> 1. Upload a video to the Files API </h4> <p>The Gemini API directly accepts video file formats. The File API supports files up to 2GB in size and allows storage of up to 20GB per project. Uploaded files remain available for 2 days and cannot be downloaded from the API.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>video_file = genai.upload_file(path=video_path)
</code></pre> </div> <h4> 2. Get File </h4> <p>After uploading a file, you can verify that the API has successfully received it by using the files.get method. This method allows you to view the files uploaded to the File API that are associated with the Cloud project linked to your API key. 
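<p>The polling loop shown in the next step waits for the file to leave the <code>PROCESSING</code> state but has no upper bound. A generic wrapper with a timeout is sketched below; <code>wait_until_done</code> is a hypothetical helper, not part of the google-generativeai API, and the simulated <code>get_state</code> callable stands in for repeated <code>genai.get_file(...).state.name</code> lookups.</p>

```python
import time

def wait_until_done(get_state, interval_s=10, timeout_s=600):
    """Poll get_state() until it returns something other than "PROCESSING".

    Returns the final state string, or raises TimeoutError once timeout_s
    has elapsed. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    state = get_state()
    while state == "PROCESSING":
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still PROCESSING after {timeout_s}s")
        time.sleep(interval_s)
        state = get_state()
    return state

# Simulated poll: the "file" becomes ACTIVE on the third check.
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
final = wait_until_done(lambda: next(states), interval_s=0)
print(final)  # ACTIVE
```

<p>In the real app, <code>get_state</code> could wrap <code>genai.get_file(video_file.name).state.name</code> (re-fetching the file object on each pass, as the loop below does), so a stuck upload cannot hang the page forever.</p>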
Only the file name and the URI are unique identifiers.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import time

while video_file.state.name == "PROCESSING":
    print('Waiting for video to be processed.')
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

if video_file.state.name == "FAILED":
    raise ValueError(video_file.state.name)
</code></pre> </div> <h4> 3. Response Generation </h4> <p>After the video has been uploaded, you can make <code>GenerateContent</code> requests that reference the File API URI.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># Create the prompt.
prompt = "Describe the video. Provide insights from the video."

# Set the model to Gemini 1.5 Flash.
model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

# Make the LLM request.
print("Making LLM inference request...")
response = model.generate_content([prompt, video_file], request_options={"timeout": 600})
print(response.text)
</code></pre> </div> <h4> 4. Delete File </h4> <p>Files are automatically deleted after 2 days, or you can manually delete them using <code>files.delete()</code>.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>genai.delete_file(video_file.name)
</code></pre> </div> <h4> 5. Combining the stages </h4> <p>Create a method called <code>get_insights</code> and add the following code to it. 
Instead of <code>print()</code>, use Streamlit's <code>st.write()</code> method so the messages appear on the web page.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_insights(video_path):
    """Extract insights from the video using Gemini Flash."""
    st.write(f"Processing video: {video_path}")

    st.write("Uploading file...")
    video_file = genai.upload_file(path=video_path)
    st.write(f"Completed upload: {video_file.uri}")

    while video_file.state.name == "PROCESSING":
        st.write('Waiting for video to be processed.')
        time.sleep(10)
        video_file = genai.get_file(video_file.name)

    if video_file.state.name == "FAILED":
        raise ValueError(video_file.state.name)

    prompt = "Describe the video. Provide insights from the video."
    model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

    st.write("Making LLM inference request...")
    response = model.generate_content([prompt, video_file], request_options={"timeout": 600})

    st.write('Video processing complete')
    st.subheader("Insights")
    st.write(response.text)

    genai.delete_file(video_file.name)
</code></pre> </div> <h3> Creating the interface </h3> <p>To streamline the process of uploading videos and generating insights within a Streamlit app, you can create a method named app. 
This method will provide an upload button, display the uploaded video, and generate insights from it.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def app():
    st.title("Video Insights Generator")

    uploaded_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mov", "mkv"])
    if uploaded_file is not None:
        file_path = save_uploaded_file(uploaded_file)
        st.video(file_path)
        get_insights(file_path)
        if os.path.exists(file_path):
            # Optional: remove uploaded files from the temporary location
            os.remove(file_path)
</code></pre> </div> <h2> Creating the Streamlit app </h2> <p>To create a complete and functional Streamlit application that allows users to upload videos and generate insights using the Gemini 1.5 Flash model, combine all the components into a single file named app.py.</p> <p>Here is the final code:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import time

import google.generativeai as genai
import streamlit as st
from dotenv import load_dotenv

MEDIA_FOLDER = 'medias'

def __init__():
    if not os.path.exists(MEDIA_FOLDER):
        os.makedirs(MEDIA_FOLDER)
    load_dotenv()  # load all the environment variables
    api_key = os.getenv("GEMINI_API_KEY")
    genai.configure(api_key=api_key)

def save_uploaded_file(uploaded_file):
    """Save the uploaded file to the media folder and return the file path."""
    file_path = os.path.join(MEDIA_FOLDER, uploaded_file.name)
    with open(file_path, 'wb') as f:
        f.write(uploaded_file.read())
    return file_path

def get_insights(video_path):
    """Extract insights from the video using Gemini Flash."""
    st.write(f"Processing video: {video_path}")

    st.write("Uploading file...")
    video_file = genai.upload_file(path=video_path)
    st.write(f"Completed upload: {video_file.uri}")

    while video_file.state.name == "PROCESSING":
        st.write('Waiting for video to be processed.')
        time.sleep(10)
        video_file = genai.get_file(video_file.name)

    if video_file.state.name == "FAILED":
        raise ValueError(video_file.state.name)

    prompt = "Describe the video. Provide insights from the video."
    model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

    st.write("Making LLM inference request...")
    response = model.generate_content([prompt, video_file], request_options={"timeout": 600})

    st.write('Video processing complete')
    st.subheader("Insights")
    st.write(response.text)

    genai.delete_file(video_file.name)

def app():
    st.title("Video Insights Generator")

    uploaded_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mov", "mkv"])
    if uploaded_file is not None:
        file_path = save_uploaded_file(uploaded_file)
        st.video(file_path)
        get_insights(file_path)
        if os.path.exists(file_path):
            # Optional: remove uploaded files from the temporary location
            os.remove(file_path)

__init__()
app()
</code></pre> </div> <h3> Running the application </h3> <p>Run the following command to start the application.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run app.py
</code></pre> </div> <p>You can open the link provided in the console to see the output.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F224jnp2jptp9rv3802uf.gif" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F224jnp2jptp9rv3802uf.gif" alt="final output" width="800" height="400"></a></p> <p>Thanks for reading this article!</p> <p>If you enjoyed this article, please click on the heart button ♥ and share it to help others find it!</p> <p>The full source code for this tutorial can be found here:</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/video-insights-generator" rel="noopener 
noreferrer">GitHub - codemaker2015/video-insights-generator</a></p> ai python gemini beginners Build your own ChatGPT using Google Gemini API Vishnu Sivan Tue, 02 Jan 2024 17:39:26 +0000 https://dev.to/codemaker2015/build-your-own-chatgpt-using-google-gemini-api-51bh https://dev.to/codemaker2015/build-your-own-chatgpt-using-google-gemini-api-51bh <p>While the AI landscape has been dominated by the likes of OpenAI and its collaboration with Microsoft, Gemini emerges as a formidable force, boasting increased size and versatility. Designed to seamlessly handle text, images, audio, and video, these foundational models redefine the boundaries of AI interactions. As Google makes a resounding comeback in the AI arena, learn how Gemini is set to redefine the landscape of human-computer interaction, offering a glimpse into the future of AI-driven innovation.</p> <p>In this article, we will look into the process of obtaining a free Google API Key, installing necessary dependencies, and crafting code to build intelligent chatbots that transcend conventional text-based interactions. More than a chatbot tutorial, this article explores how Gemini’s built-in vision and multimodality approach enable it to interpret images and generate text based on visual input.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Gemini</li> <li>Creating a Gemini API key</li> <li>Installing dependencies</li> <li>Experimenting with Gemini APIs</li> <li>Configuring API Key</li> <li>Generating text responses</li> <li>Safeguarding the responses</li> <li>Configuring Hyperparameters</li> <li>Interacting with image inputs</li> <li>Interacting with chat version of Gemini LLM</li> <li>Integrating Langchain with Gemini</li> <li>Creating a ChatGPT Clone with Gemini API</li> </ul> <h2> What is Gemini </h2> <p>Gemini AI is a set of large language models (LLMs) created by Google AI, known for its cutting-edge advancements in multimodal understanding and processing. 
It’s essentially a powerful AI tool that can handle various tasks involving different types of data, not just text.</p> <h3> Features </h3> <ul> <li>Multimodal capabilities: Unlike most LLMs focused primarily on text, Gemini can seamlessly handle text, images, audio, and even code. It can understand and respond to prompts involving different data combinations. For instance, you could give it an image and ask it to describe what’s happening, or provide text instructions and have it generate an image based on them.</li> <li>Reason across different data types: This allows Gemini to grasp complex concepts and situations that involve multiple modalities. Imagine showing it a scientific diagram and asking it to explain the underlying process — its multimodal abilities come in handy here.</li> <li>Faster processing with TPUs: Gemini leverages Google’s custom-designed Tensor Processing Units (TPUs) for significantly faster processing compared to earlier LLM models.</li> </ul> <p>Gemini comes in three flavors:</p> <ul> <li>Ultra: The most powerful and capable model, ideal for tackling highly complex tasks like scientific reasoning or code generation.</li> <li>Pro: A well-rounded model suitable for various tasks, balancing performance and efficiency.</li> <li>Nano: The most lightweight and efficient model, perfect for on-device applications where computational resources are limited.</li> </ul> <h3> Creating a Gemini API key </h3> <p>To access the Gemini API and begin working with its functionalities, you can acquire a free Google API Key by registering with MakerSuite at Google. MakerSuite, offered by Google, provides a user-friendly, visual-based interface for interacting with the Gemini API. 
Within MakerSuite, you can seamlessly engage with Generative Models through its intuitive UI, and if desired, generate an API Token for enhanced control and customization.</p> <p>Follow the steps to generate a Gemini API key:</p> <ul> <li>To initiate the process, you can either click the link (<a href="proxy.php?url=https://makersuite.google.com" rel="noopener noreferrer">https://makersuite.google.com</a>) to be redirected to MakerSuite or perform a quick search on Google to locate it.</li> <li>Accept the terms of service and click on continue.</li> <li>Click on the Get API key link in the sidebar, then click the Create API key in new project button to generate the key.</li> <li>Copy the generated API key.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1k5mk9teda99cl4cjy0.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1k5mk9teda99cl4cjy0.png" alt="api key" width="640" height="337"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5p1fyqtiqy0022hvauwn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5p1fyqtiqy0022hvauwn.png" alt="api key" width="640" height="340"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzod7iu35q2e7y2vvx0bs.png" 
class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzod7iu35q2e7y2vvx0bs.png" alt="api key" width="640" height="340"></a></p> <h3> Installing dependencies </h3> <p>Begin the exploration by installing the necessary dependencies listed below:</p> <ul> <li>Create and activate the virtual environment by executing the following commands. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  # for ubuntu
venv/Scripts/activate     # for windows
</code></pre> </div> <ul> <li>Install the dependencies using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install google-generativeai langchain-google-genai streamlit
</code></pre> </div> <ul> <li>The <code>google-generativeai</code> library, developed by Google, facilitates interaction with models such as PaLM and Gemini Pro.</li> <li>The <code>langchain-google-genai</code> library streamlines the process of working with various large language models, enabling the creation of applications with ease. In this instance, we are installing the langchain integration tailored to support the latest Google Gemini LLMs.</li> <li> <code>streamlit</code>: The framework used to craft a chat interface reminiscent of ChatGPT, seamlessly integrating Gemini and Streamlit.</li> </ul> <h2> Experimenting with Gemini APIs </h2> <p>Let’s explore the capabilities of text generation and vision-based tasks, which encompass image interpretation and description. Additionally, dive into Langchain’s integration with the Gemini API, streamlining the interaction process. Discover efficient handling of multiple queries through batching inputs and responses. 
Lastly, delve into the creation of chat-based applications using Gemini Pro’s chat model to gain some insights about maintaining chat history and generating responses based on user context.</p> <h3> Configuring API Key </h3> <ul> <li>To begin with, initialize the Google API Key obtained from MakerSuite in an environment variable called <code>“GOOGLE_API_KEY”</code>.</li> <li>Import Google’s generativeai library and pass the API Key retrieved from the environment variable to its <code>configure()</code> function via the <code>“api_key”</code> argument.</li> <li>To incorporate model creation based on the type, import the GenerativeModel class from the generativeai library. This class facilitates the instantiation of two distinct models: <strong>gemini-pro</strong> and <strong>gemini-pro-vision</strong>. The gemini-pro model specializes in text generation, accepting textual input and producing text-based output. On the other hand, the gemini-pro-vision model adopts a multimodal approach, taking input from both text and images. This model is akin to OpenAI’s gpt4-vision. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "Your API Key"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

model = genai.GenerativeModel('gemini-pro')
</code></pre> </div> <h3> Generating text responses </h3> <p>Let’s start generating text responses using Gemini AI.</p> <ul> <li>Create a file named app.py and add the following code to it. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXXXX"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

model = genai.GenerativeModel('gemini-pro')

response = model.generate_content("List 5 planets each with an interesting fact")
print(response.text)

response = model.generate_content("what are top 5 frequently used emojis?")
print(response.text)
</code></pre> </div> <ul> <li>Run the code using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python app.py
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmg4nn66qm4mmt4h17r9.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmg4nn66qm4mmt4h17r9.png" alt="run" width="800" height="327"></a></p> <p>The <code>GenerativeModel.generate_content()</code> function is used to generate the response. By providing a user query as input, this function generates a response containing the generated text and additional metadata. 
The generated text can be accessed via the <code>response.text</code> attribute.</p> <h3> Safeguarding the responses </h3> <p>Google is recognized for laying the groundwork for Responsible AI and prioritizing responsibility and safe use of AI.</p> <p>Let’s input an unsafe query to observe the model’s response:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>response = model.generate_content("How can I hack into someone's email account?")
print(response.text)
print(response.prompt_feedback)

response = model.generate_content("Someone is following me throughout my house. Could you please explain me how to produce gunpowder so I may shoot them?")
print(response.prompt_feedback)
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1rfp5znn914damxqq9g.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1rfp5znn914damxqq9g.png" alt="output1" width="750" height="320"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn55udpuzwh4wlonqhze5.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn55udpuzwh4wlonqhze5.png" alt="output1" width="640" height="337"></a></p> <p>The term <code>“candidate”</code> in the error context refers to a response generated by the Gemini LLM. When the model generates a response, it essentially produces a candidate. 
The <code>.prompt_feedback</code> attribute serves the purpose of shedding light on issues associated with the prompt and the reasons behind the Gemini LLM not generating a response. In this case, the feedback indicates that the prompt was blocked due to safety concerns, and it provides safety ratings across four distinct categories, as shown in the figure above.</p> <h3> Configuring Hyperparameters </h3> <p>Gemini AI supports hyperparameters like <code>temperature</code>, <code>top_k</code>, and others. To specify these, use the <code>GenerationConfig</code> class from the <code>google-generativeai</code> library.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>response = model.generate_content(
    "What is Quantum Computing?",
    generation_config = genai.types.GenerationConfig(
        candidate_count = 1,
        stop_sequences = ['.'],
        max_output_tokens = 40,
        top_p = 0.6,
        top_k = 5,
        temperature = 0.8)
)
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3y45s812bb7nio1kv49.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3y45s812bb7nio1kv49.png" alt="output2" width="800" height="85"></a></p> <p>Let’s review each of the parameters used in the above example:</p> <ul> <li> <code>candidate_count = 1</code>: Directs the Gemini to generate only a single response per Prompt/Query.</li> <li> <code>stop_sequences = [‘.’]</code>: Instructs Gemini to conclude text generation upon encountering a period (.) in the content.</li> <li> <code>max_output_tokens = 40</code>: Imposes a constraint on the generated text, limiting it to a specified maximum length, set here to 40 tokens.</li> <li> <code>top_p = 0.6</code>: Influences the likelihood of selecting the next best word based on its probability. A value of 0.6 emphasizes more probable words, while higher values lean towards less likely but potentially more creative choices.</li> <li> <code>top_k = 5</code>: Takes into consideration only the top 5 most likely words when determining the next word, fostering diversity in the output.</li> <li> <code>temperature = 0.8</code>: Governs the randomness of the generated text. A higher temperature, such as 0.8, elevates randomness and creativity, while lower values lean towards more predictable and conservative outputs.</li> </ul> <h3> Interacting with image inputs </h3> <p>While we’ve used the Gemini model with solely text inputs so far, it’s essential to note that Gemini offers a model named gemini-pro-vision. This particular model is equipped to handle both images and text inputs, generating text-based outputs.</p> <p>We use the PIL library to load the image located in the directory. Subsequently, we employ the gemini-pro-vision model, providing it with a list of inputs, including both the image and text, through the <code>GenerativeModel.generate_content()</code> function. It processes the input list, allowing the gemini-pro-vision model to generate the corresponding response.</p> <ul> <li>In the below code, we ask Gemini LLM to provide an explanation for the given picture. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXXXX"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

import PIL.Image

image = PIL.Image.open('assets/sample_image.jpg')
vision_model = genai.GenerativeModel('gemini-pro-vision')
response = vision_model.generate_content(["Explain the picture?",image])
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18tk5unfi1ii679m7v3h.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18tk5unfi1ii679m7v3h.png" alt="output" width="410" height="268"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcalx13se7cjco4f8wnzj.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcalx13se7cjco4f8wnzj.png" alt="output" width="800" height="134"></a></p> <ul> <li>In the below code, we ask Gemini LLM to generate a story from the given image. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>image = PIL.Image.open('assets/sample_image2.jpg')
vision_model = genai.GenerativeModel('gemini-pro-vision')
response = vision_model.generate_content(["Write a story from the picture",image])
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4bmltvsl2si8ln7cr6p.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4bmltvsl2si8ln7cr6p.png" alt="output ref" width="640" height="323"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdq4kqsm55mqf2bz6wzua.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdq4kqsm55mqf2bz6wzua.png" alt="output" width="720" height="320"></a></p> <ul> <li>In the below code, we ask Gemini Vision to count the objects in an image and provide the response in JSON format. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>image = PIL.Image.open('assets/sample_image3.jpg')
vision_model = genai.GenerativeModel('gemini-pro-vision')
response = vision_model.generate_content(["Generate a json of ingredients with their count present in the image",image])
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fleo0j906hy19y3ez9xsn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fleo0j906hy19y3ez9xsn.png" alt="output" width="640" height="793"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd447xhryfi69f77bp5k8.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd447xhryfi69f77bp5k8.png" alt="output" width="750" height="761"></a></p> <h2> Interacting with chat version of Gemini LLM </h2> <p>So far, we have explored the plain text generation model. Now, we will delve into the chat version of the model utilizing the same gemini-pro. 
Here, instead of the <code>GenerativeModel.generate_content()</code> function, the <code>GenerativeModel.start_chat()</code> function will be used.</p> <ul> <li>An empty list is provided as the history in the initiation of the chat.</li> <li>The <code>chat.send_message()</code> function is used to convey the chat message, and the generated chat response can be accessed via the response.text attribute. Additionally, Google offers the option to establish a chat with existing history. Let’s start our first conversation with Gemini LLM as below. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXXXX"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

chat_model = genai.GenerativeModel('gemini-pro')
chat = chat_model.start_chat(history=[])

response = chat.send_message("Which is one of the best place to visit in India during summer?")
print(response.text)

response = chat.send_message("Tell me more about that place in 50 words")
print(response.text)

print(chat.history)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo0t03ffedhtoz478pny.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo0t03ffedhtoz478pny.png" alt="output" width="800" height="309"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwwpoo0er4zrfunnkexb.png" class="article-body-image-wrapper"><img 
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwwpoo0er4zrfunnkexb.png" alt="output" width="640" height="377"></a></p> <h2> Integrating Langchain with Gemini </h2> <p>Langchain has successfully integrated the Gemini Model into its ecosystem using the <code>ChatGoogleGenerativeAI</code> class. To initiate the process, an llm object is created by providing the desired Gemini model to the <code>ChatGoogleGenerativeAI</code> class. We call its invoke() function and pass the user input. The resulting response can be obtained by reading <code>response.content</code>.</p> <ul> <li>In the below code, we provide a general query to the model. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro")
response = llm.invoke("Explain Quantum Computing in 50 words?")
print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faic34ymq78orwne3nj86.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faic34ymq78orwne3nj86.png" alt="output" width="800" height="144"></a></p> <ul> <li>In the below code, we provide multiple inputs to the model and get a response for each query. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>batch_responses = llm.batch(
    [
        "Who is the Prime Minister of India?",
        "What is the capital of India?",
    ]
)
for response in batch_responses:
    print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f4opf02z8753jxwfnbc.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f4opf02z8753jxwfnbc.png" alt="output" width="800" height="110"></a></p> <ul> <li>In the below code, we provide both text and image inputs and expect the model to generate a text response based on the given inputs. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from langchain_core.messages import HumanMessage

llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")

message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Describe the image",
        },
        {
            "type": "image_url",
            "image_url": "https://picsum.photos/id/237/200/300"
        },
    ]
)
response = llm.invoke([message])
print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftumspy8lxkvryrmfitt.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftumspy8lxkvryrmfitt.png" alt="output ref" width="436" height="288"></a><br> <a 
href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpc1vgcs7ldd6dyiiaoc.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpc1vgcs7ldd6dyiiaoc.png" alt="output" width="800" height="146"></a></p> <p>The <code>HumanMessage</code> class from the <code>langchain_core</code> library is used to structure the content as a list of dictionaries with the properties <code>“type”</code>, <code>“text”</code> and <code>“image_url”</code>. The list is passed to the <code>llm.invoke()</code> function, and the response content is accessed through <code>response.content</code>.</p> <ul> <li>In the below code, we ask the model to find the differences between the given images. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from langchain_core.messages import HumanMessage

llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")

message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Find the differences between the given images",
        },
        {
            "type": "image_url",
            "image_url": "https://picsum.photos/id/237/200/300"
        },
        {
            "type": "image_url",
            "image_url": "https://picsum.photos/id/219/5000/3333"
        }
    ]
)
response = llm.invoke([message])
print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph23sh61n9hlp1j0x04h.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph23sh61n9hlp1j0x04h.png" 
alt="output ref" width="458" height="302"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6nzrckadbiq0zkfja0d.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6nzrckadbiq0zkfja0d.png" alt="output ref" width="454" height="303"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h5mjmzwzeadwkb3amrh.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h5mjmzwzeadwkb3amrh.png" alt="output" width="750" height="207"></a></p> <h2> Creating a ChatGPT Clone with Gemini API </h2> <p>Following numerous experiments with Google’s Gemini API, in this article we will construct a straightforward application akin to ChatGPT using Streamlit and Gemini.</p> <ul> <li>Create a file named gemini-bot.py and add the following code to it. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st
import os
import google.generativeai as genai

st.title("Gemini Bot")

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXX"
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])

# Select the model
model = genai.GenerativeModel('gemini-pro')

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = [
        {
            "role": "assistant",
            "content": "Ask me Anything"
        }
    ]

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Process and store Query and Response
def llm_function(query):
    response = model.generate_content(query)

    # Displaying the Assistant Message
    with st.chat_message("assistant"):
        st.markdown(response.text)

    # Storing the User Message
    st.session_state.messages.append(
        {
            "role": "user",
            "content": query
        }
    )

    # Storing the Assistant Message
    st.session_state.messages.append(
        {
            "role": "assistant",
            "content": response.text
        }
    )

# Accept user input
query = st.chat_input("What's up?")

# Calling the Function when Input is Provided
if query:
    # Displaying the User Message
    with st.chat_message("user"):
        st.markdown(query)
    llm_function(query)
</code></pre> </div> <ul> <li>Run the app by executing the following command. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run gemini-bot.py </code></pre> </div> <ul> <li>Open the link which is displayed on the terminal to access the application.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55hjzi0few6er4jn1uje.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55hjzi0few6er4jn1uje.png" alt="output final" width="800" height="418"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/gemini-api-experiments" rel="noopener noreferrer">GitHub - codemaker2015/gemini-api-experiments: Explore how Gemini's built-in vision and multimodality approach enable it to interpret images and generate text based…</a><br> github.com</p> <p>The article is also available on <a href="proxy.php?url=https://codemaker2016.medium.com/build-your-own-chatgpt-using-google-gemini-api-1b079f6a8415" rel="noopener noreferrer">Medium</a>.</p> <h3> Useful Links: </h3> <ul> <li><a href="proxy.php?url=https://youtu.be/zqsTX8iFVr4" rel="noopener noreferrer">https://youtu.be/zqsTX8iFVr4</a></li> <li>makersuite.google.com</li> <li> <a href="proxy.php?url=https://ai.google.dev/tutorials/rest_quickstart" rel="noopener noreferrer">Quickstart: Get started with Gemini using the REST API | Google AI for Developers</a> ai.google.dev</li> </ul> beginners tutorial python ai Goodbye databases, it’s time to embrace Vector 
Databases! Vishnu Sivan Mon, 18 Dec 2023 05:31:34 +0000 https://dev.to/codemaker2015/goodbye-databases-its-time-to-embrace-vector-databases-190l https://dev.to/codemaker2015/goodbye-databases-its-time-to-embrace-vector-databases-190l <p>The AI revolution is reshaping industries, promising remarkable innovations while introducing new challenges. In this transformative landscape, efficient data processing has become paramount for applications relying on large language models, generative AI, and semantic search. At the heart of these breakthroughs lie vector embeddings: intricate data representations infused with critical semantic information. These embeddings, generated by LLMs, encompass numerous attributes or features, rendering their management a complex task.</p> <p>In the realm of AI and machine learning, these features represent different dimensions of data that are essential for discerning patterns, relationships, and underlying structures. To address the unique demands of handling these embeddings, a specialized database is essential. Vector databases are purpose-built to provide optimized storage and querying capabilities for embeddings, bridging the gap between traditional databases and standalone vector indexes as well as empowering AI systems with the tools they need to excel in this data-intensive environment.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>Introduction to Vector Databases</li> <li>Vector Embeddings</li> <li>Vector Search</li> <li>Approximate Nearest Neighbor Approach (ANN)</li> <li>Vector database vs Relational database</li> <li>Working of Vector Databases</li> <li>Importance of Vector Databases</li> <li>Top 7 Vector Databases</li> <li>Use Cases of Vector Databases</li> </ul> <h2> Introduction to Vector Databases </h2> <p>A vector database is a specialized type of database that stores data in the form of multi-dimensional vectors, each representing specific characteristics or qualities. 
These vectors can have varying dimensions, from a few to thousands, depending on data complexity. Various techniques like machine learning models or feature extraction are used to convert data, including text, images, audio, and video, into these vectors.</p> <p>The key advantage of a vector database is its ability to efficiently and accurately retrieve data based on vector proximity or similarity. This enables searches based on semantic and contextual relevance rather than relying solely on exact matches or predefined criteria, as seen in traditional databases.</p> <h3> Vector Embeddings </h3> <p>AI and ML have revolutionized the representation of unstructured data by using vector embeddings. These are essentially lists of numbers that capture the semantic meaning of data objects. For instance, colors in the RGB system are represented by numbers indicating their red, green, and blue components.</p> <p>However, representing more complex data, like words or text, in meaningful numerical sequences is challenging. This is where ML models come into play. ML models can represent the meaning of words as vectors by learning the relationships between words in a vector space. These models are often called embedding models or vectorizers.</p> <p>Vector embeddings encode the semantic meaning of objects relative to one another. Similar objects are grouped closely in the vector space, meaning that the closer two objects are, the more similar they are.</p> <p>For example, consider word vectors. In this case, words like “Wolf” and “Dog” are close to each other because dogs are descendants of wolves. “Cat” is also similar because it shares similarities with “Dog” as both are animals and common pets. 
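</p> <p>As a toy illustration of this grouping, consider the cosine similarity between a few hand-made vectors. The values below are invented for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.</p>

```python
import math

# Toy 3-dimensional "embeddings". The values are invented for
# illustration; real embedding models produce hundreds or
# thousands of dimensions.
embeddings = {
    "wolf":   [0.9, 0.8, 0.1],
    "dog":    [0.85, 0.75, 0.2],
    "cat":    [0.8, 0.6, 0.25],
    "apple":  [0.1, 0.2, 0.9],
    "banana": [0.15, 0.1, 0.85],
}

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by
    # the product of their lengths (1.0 = identical direction).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["dog"], embeddings["wolf"]))   # high: same cluster
print(cosine_similarity(embeddings["dog"], embeddings["apple"]))  # low: different cluster
```

<p>A higher cosine similarity means the objects point in more similar directions in the vector space; here “dog” scores far higher against “wolf” than against “apple”.</p> <p>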
On the other hand, words representing fruits like “Apple” and “Banana” are further away from animal terms, forming a distinct cluster in the vector space.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ioysvqy20ggbdo93dlo.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ioysvqy20ggbdo93dlo.png" alt="Vector Embeddings" width="735" height="751"></a><br> Image credits A Gentle Introduction to Vector Databases | Weaviate — vector database</p> <h3> Vector Search </h3> <p>Vector embeddings enable us to perform vector search, similarity search, or semantic search by finding and retrieving similar objects within a vector database. These processes involve locating objects that are close to each other in the vector space.</p> <p>Just as we can find similar vectors for a specific object (e.g., a dog), we can also find similar vectors to a search query. For example, to discover words similar to “Kitten,” we generate a vector embedding for “Kitten” and retrieve all items that are close to the query vector, like the word “Cat.”</p> <p>The numerical representation of data objects empowers us to apply mathematical operations, such as calculating the distance between two vector embeddings, to determine their similarity. This makes vector embeddings a powerful tool for searching and comparing data objects based on their semantic meaning.</p> <h3> Approximate Nearest Neighbor Approach (ANN) </h3> <p>Vector indexing streamlines data retrieval by efficiently organizing vector embeddings. 
It employs an approximate nearest neighbor (ANN) approach to pre-calculate distances between vector embeddings, cluster similar vectors, and store them in proximity. While this approach sacrifices some accuracy for speed, it allows for faster retrieval of approximate results.</p> <p>For instance, in a vector database, you can pre-calculate clusters like “animals” and “fruits.” When querying the database for “Kitten,” the search begins with the nearest animals, avoiding distance calculations between fruits and non-animal objects. The ANN algorithm initiates the search within a relevant region, such as four-legged animals, maintaining proximity to relevant results due to pre-organized similarity.</p> <h3> Vector database vs Relational database </h3> <p>The primary difference between traditional relational databases and modern vector databases lies in their optimization for different types of data. Relational databases excel at handling structured data stored in columns, relying on keyword matches for search. In contrast, vector databases are well-suited for structured and unstructured data, including text, images, and audio, along with their vector embeddings, which enable efficient semantic search. 
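</p> <p>This difference can be sketched with a toy in-memory store (the rows and vectors below are invented for illustration; real systems use trained embedding models and optimized indexes): a keyword filter finds only literal matches, while a vector query ranks items by distance to the query embedding.</p>

```python
import math

# Toy in-memory store: each row keeps both the raw text (for keyword
# search) and an invented 2-d "embedding" (for vector search).
items = [
    {"text": "Questions about dogs",  "vec": [0.9, 0.1]},
    {"text": "Questions about cats",  "vec": [0.8, 0.2]},
    {"text": "Questions about fruit", "vec": [0.1, 0.9]},
]

def keyword_search(word):
    # Relational-style search: only literal matches are found.
    return [item["text"] for item in items if word in item["text"].lower()]

def vector_search(query_vec):
    # Vector-style search: return the item whose embedding is
    # closest (by Euclidean distance) to the query embedding.
    return min(items, key=lambda item: math.dist(query_vec, item["vec"]))["text"]

print(keyword_search("animals"))    # [] -- no row literally contains "animals"
print(vector_search([0.88, 0.12]))  # an "animals"-like query still finds the dog row
```

<p>The keyword query for “animals” returns nothing, while a query embedding near the animal cluster still retrieves a relevant row.</p> <p>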
Many vector databases store vector embeddings alongside the original data, providing the flexibility to perform both vector-based and traditional keyword searches.</p> <p>For instance, when searching for jeopardy questions that involve animals, a traditional database necessitates a complex query with specific animal names, while a vector database simplifies the search by allowing a query for the general concept of “animals”.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy2r6v4s9195asc0nrovf.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy2r6v4s9195asc0nrovf.png" alt="Vector Search" width="800" height="267"></a></p> <h2> Working of Vector Databases </h2> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl46ef8ekq7ham6tg8iff.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl46ef8ekq7ham6tg8iff.png" alt="Working of Vector Databases" width="800" height="306"></a><br> Image credits What is a Vector Database &amp; How Does it Work? 
Use Cases + Examples | Pinecone</p> <p>In the context of an application like ChatGPT, which deals with extensive data, the process involves:</p> <ul> <li>User inputs a query into the application.</li> <li>Content to be indexed is converted into vector embeddings using the embedding model.</li> <li>The vector embedding, along with a reference to the original content, is stored in the vector database.</li> <li>When the application issues a query, the embedding model generates embeddings for the query. These query embeddings are used to search the database for similar vector embeddings. In traditional databases, queries typically require exact matches, while vector databases utilize similarity metrics to find the most similar vector to a query.</li> </ul> <p>Vector databases employ a combination of algorithms for Approximate Nearest Neighbor (ANN) search. These algorithms, organized into a pipeline, optimize search speed through techniques like hashing, quantization, and graph-based methods. Balancing accuracy and speed is a key consideration when using vector databases, which provide approximate results.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiv0fv6tvoe0v0mu9dp1i.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiv0fv6tvoe0v0mu9dp1i.png" alt="Working of Vector Databases" width="800" height="142"></a><br> Image credits What is a Vector Database &amp; How Does it Work? 
Use Cases + Examples | Pinecone</p> <p>A vector database query involves three main stages:</p> <ol> <li><p>Indexing: Vector embeddings are mapped to data structures using various algorithms within the vector database, thereby enhancing search speed.</p></li> <li><p>Querying: The database compares the queried vector to indexed vectors, employing a similarity metric to locate the nearest neighbor.</p></li> <li><p>Post Processing: The vector database performs post-processing on the nearest neighbor to generate the final query output, potentially re-ranking the nearest neighbors for future reference.</p></li> </ol> <h2> Importance of Vector Databases </h2> <p>Vector databases are pivotal for indexing vectors generated through embeddings. They enable searches for similar assets via neighboring vectors. Developers leverage these databases to create unique application experiences, including image searches based on user-taken photos. Automation of metadata extraction from content, coupled with hybrid keyword and vector-based searches, further enhances search capabilities. Vector databases also serve as external knowledge bases for generative AI models like ChatGPT. This ensures trustworthy information and reliable user interactions, particularly in mitigating issues like hallucinations.</p> <h2> Top 7 Vector Databases </h2> <p>The vector database landscape is dynamic and swiftly evolving, with numerous prominent players driving innovation. 
Each database presents distinctive features and functionalities, serving a variety of requirements and applications in the fields of machine learning and artificial intelligence.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cjl6evrzaxa97ftwnik.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cjl6evrzaxa97ftwnik.png" alt="Top 7 Vector Databases" width="800" height="475"></a><br> Image credits The 5 Best Vector Databases | A List With Examples | DataCamp</p> <ol> <li>Chroma</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci6vxiygbidu3trf8xeh.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci6vxiygbidu3trf8xeh.png" alt="Chroma" width="800" height="422"></a><br> Image credits 🏡 Home | Chroma (trychroma.com)<br> Chroma is an open-source embedding database designed to simplify the development of LLM (Large Language Model) applications by enabling the integration of knowledge, facts, and skills for these models. 
It offers features like managing text documents, converting text to embeddings, and conducting similarity searches.</p> <p>Key Features:</p> <ul> <li>Chroma provides a wide range of features, including queries, filtering, density estimates, and more.</li> <li>Supports LangChain (Python and JavaScript) and LlamaIndex.</li> <li>The API used in a Python notebook seamlessly scales to a production cluster.</li> </ul> <ol start="2"> <li>Pinecone</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueaks2mypnudgdexueqs.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueaks2mypnudgdexueqs.png" alt="Pinecone" width="800" height="262"></a><br> Image credits A Pinecone Alternative With Better Search Relevance and Lower Costs — Vectara</p> <p>Pinecone is a managed vector database platform specifically designed to address the complexities of high-dimensional data. 
With advanced indexing and search functionalities, Pinecone enables data engineers and data scientists to create and deploy large-scale machine learning applications for efficient processing and analysis of high-dimensional data.</p> <p>Key Features:</p> <ul> <li>Fully managed service.</li> <li>Highly scalable for handling large datasets.</li> <li>Real-time data ingestion for up-to-date information.</li> <li>Low-latency search capabilities.</li> <li>Integration with LangChain.</li> </ul> <ol start="3"> <li>Milvus</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59xf7gchtmfyurlx32xs.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59xf7gchtmfyurlx32xs.png" alt="Milvus" width="800" height="565"></a><br> Image credits What is Milvus Vector Database? — Zilliz</p> <p>Milvus is an open-source vector database with a focus on embedding similarity search and AI applications. It provides an easy-to-use, uniform user experience across deployment environments. 
The stateless architecture of Milvus 2.0 enhances elasticity and adaptability, making it a reliable choice for a range of use cases including image search, chatbots, and chemical structure search.</p> <p>Key Features:</p> <ul> <li>Capable of searching trillions of vector datasets in milliseconds.</li> <li>Offers straightforward management of unstructured data.</li> <li>Highly scalable and adaptable to diverse workloads.</li> <li>Supports hybrid search capabilities.</li> <li>Incorporates a unified Lambda structure for seamless performance.</li> </ul> <ol start="4"> <li>Weaviate</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9t6t0qoblduvjrz7y5i.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9t6t0qoblduvjrz7y5i.png" alt="Weaviate" width="640" height="339"></a><br> Image credits Learning to Retrieve Passages without Supervision | Weaviate</p> <p>Weaviate is an open-source vector database that enables the storage of data objects and vector embeddings from various machine learning models. 
It can seamlessly scale to accommodate billions of data objects.</p> <p>Key Features:</p> <ul> <li>Weaviate can rapidly retrieve the ten nearest neighbors from millions of objects in just milliseconds.</li> <li>Users can import or upload their vectorised data, as well as integrate with platforms like OpenAI, HuggingFace, and more.</li> <li>Weaviate is suitable for both prototypes and large-scale production, prioritizing scalability, replication, and security.</li> <li>Weaviate offers features like recommendations, summarizations, and integrations with neural search frameworks.</li> </ul> <ol start="5"> <li>Qdrant</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwm87nothzpci5qtafct0.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwm87nothzpci5qtafct0.png" alt="Qdrant" width="536" height="513"></a><br> Image credits qdrant/qdrant: Qdrant</p> <p>Qdrant is a versatile vector database and API service designed for conducting high-dimensional vector similarity searches. 
It transforms embeddings and neural network encoders into comprehensive applications suited for matching, searching, and recommendations.</p> <p>Key Features:</p> <ul> <li>Provides OpenAPI v3 specifications and pre-built clients for multiple programming languages.</li> <li>Utilizes a custom HNSW algorithm for rapid and accurate vector searches.</li> <li>Enables result filtering based on associated vector payloads.</li> <li>Supports various data types including string matching, numerical ranges, and geo-locations.</li> <li>Designed for cloud-native environments with horizontal scaling capabilities.</li> </ul> <ol start="6"> <li>Elasticsearch</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtujiirmg9xjvoimba3o.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtujiirmg9xjvoimba3o.png" alt="Elasticsearch" width="800" height="409"></a><br> Image credits Learn more about Elasticsearch — ITZone<br> Elasticsearch is an open-source analytics engine that offers versatile data handling capabilities, including textual, numerical, geographic, structured, and unstructured data. It is a key component of the Elastic Stack, a suite of open tools for data processing, storage, analysis, and visualization. 
Elasticsearch excels in various use cases, providing centralized data storage, lightning-fast search, fine-tuned relevance, and scalable analytics.</p> <p>Key Features:</p> <ul> <li>Supports cluster configurations and ensures high availability.</li> <li>Features automatic node recovery and data distribution.</li> <li>Scales horizontally to handle large workloads.</li> <li>Detects errors to maintain secure and accessible clusters and data.</li> <li>Designed for continuous peace of mind, with a distributed architecture that ensures reliability.</li> </ul> <ol start="7"> <li>Faiss</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fach4ny864p7j978n7jh8.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fach4ny864p7j978n7jh8.png" alt="Faiss" width="736" height="402"></a><br> Image credits Faiss: A library for efficient similarity search — Engineering at Meta (fb.com)</p> <p>Faiss, developed by Facebook AI Research, is an open-source library designed for fast and efficient dense vector similarity search and grouping. 
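</p> <p>Conceptually, the core operation resembles the following pure-Python sketch of a flat (exhaustive) L2 search over toy vectors; libraries like Faiss implement this idea, along with far more scalable approximate variants, in optimized C++.</p>

```python
import math

# A flat (exhaustive) index over toy 2-d vectors: every stored vector
# is compared against the query and the k closest labels are returned.
index = [
    ("doc-a", [1.0, 0.0]),
    ("doc-b", [0.9, 0.1]),
    ("doc-c", [0.0, 1.0]),
    ("doc-d", [0.2, 0.9]),
]

def knn(query, k=2):
    # Score every entry by L2 (Euclidean) distance, then keep the k best.
    scored = sorted((math.dist(query, vec), label) for label, vec in index)
    return [label for _, label in scored[:k]]

print(knn([1.0, 0.05], k=2))  # nearest neighbor first, then the second nearest
```

<p>Because the results are sorted by distance, asking for k neighbors naturally yields the nearest, second nearest, and so on, mirroring the k-nearest-neighbor behavior described below.</p> <p>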
It supports searching sets of vectors of various sizes, even those that may not fit in RAM, making it versatile for large datasets.</p> <p>Key Features:</p> <ul> <li>Besides returning the nearest neighbor, Faiss also returns the second nearest, third nearest, and k-th nearest neighbors.</li> <li>Allows searching multiple vectors simultaneously (batch processing).</li> <li>Supports maximum inner product search in addition to minimal Euclidean-distance search.</li> <li>Supports various distances, including L1, Linf, and more.</li> </ul> <h2> Use Cases of Vector Databases </h2> <p>Vector databases are making significant impacts across various industries by excelling in similarity search.</p> <h3> Retail Experiences </h3> <p>Vector databases transform retail by powering advanced recommendation systems that offer personalized shopping experiences based on product attributes and user preferences.</p> <h3> Natural Language Processing (NLP) </h3> <p>Vector databases enhance NLP applications, enabling chatbots and virtual assistants to better understand and respond to human language, improving customer-agent interactions.</p> <h3> Financial Data Analysis </h3> <p>In finance, vector databases analyze complex data to help analysts detect patterns, make informed investment decisions, and forecast market movements.</p> <h3> Anomaly Detection </h3> <p>Vector databases excel at spotting outliers, particularly in sectors like finance and security, making the detection process faster and more accurate, thus preventing fraud and security breaches.</p> <h3> Healthcare </h3> <p>Vector databases personalize medical treatments by analyzing genomic sequences, aligning solutions with individual genetic makeup.</p> <h3> Media Analysis </h3> <p>Vector databases simplify image analysis, aiding in tasks such as medical scans and surveillance footage interpretation for optimizing traffic flow and public safety.</p> <p>To experiment with vector databases such as Chromadb, Pinecone, Weaviate and 
Pgvector, follow the below link.</p> <p><a href="proxy.php?url=https://coinsbench.com/experimenting-with-vector-databases-chromadb-pinecone-weaviate-and-pgvector-0f35c0356540" rel="noopener noreferrer">Experimenting with Vector Databases: Chromadb, Pinecone, Weaviate and Pgvector | codemaker2016.medium.com</a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The article is also available on <a href="proxy.php?url=https://medium.com/@codemaker2016/goodbye-databases-its-time-to-embrace-vector-databases-0ffa7879980e" rel="noopener noreferrer">Medium</a>.</p> database vectordatabase programming beginners Introducing Vitest: the super fast testing framework Vishnu Sivan Sun, 29 Oct 2023 07:35:20 +0000 https://dev.to/codemaker2015/introducing-vitest-the-super-fast-testing-framework-g73 https://dev.to/codemaker2015/introducing-vitest-the-super-fast-testing-framework-g73 <p>In the ever-evolving landscape of software development, the significance of testing cannot be overstated. It plays a pivotal role in guaranteeing the dependability and functionality of applications. Uncovering and resolving bugs in the early stages of development not only conserves time and resources but also elevates the overall quality of the software.</p> <p>Enter Vitest, a robust testing framework that stands out as a compelling alternative to well-known tools like Jest, particularly for Vue.js projects. 
Fueled by Vite, Vitest distinguishes itself with its exceptional speed and simplicity, making it an invaluable resource for developers seeking to streamline their testing processes.</p> <p>In this article, we will walk you through the basics of Vitest and how to test your projects using the Vitest framework.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Vite</li> <li>What is Vitest</li> <li>Why Vitest</li> <li>Vitest vs other frameworks</li> <li>Create your first Vitest app</li> <li>Create a React project with Vitest</li> <li>Installing the dependencies</li> <li>Writing your first test script</li> <li>Running the test</li> </ul> <h3> What is Vite </h3> <p>Vite stands out as a cutting-edge, high-speed tool designed for scaffolding and constructing web projects. It is crafted by Evan You, the mind behind Vue.js. Vite supports a range of frameworks, including Vue, React, Preact, Lit, Svelte, and Solid. Its key strength lies in leveraging native ES modules, resulting in superior speed compared to conventional tools like webpack or Parcel.</p> <p>Vite employs a server that dynamically compiles and serves necessary dependencies through ES modules. This strategy enables Vite to process and deliver only the code essential at any given moment. Consequently, Vite deals with considerably less code during server startup and updates. Another factor contributing to Vite’s speed is its utilization of esbuild for pre-bundling dependencies in development. Esbuild, a remarkably fast JavaScript bundler implemented in the Go language, enhances the overall performance of Vite.</p> <h3> What is Vitest </h3> <p>Vitest, a testing framework, is constructed atop Vite, a tool dedicated to overseeing and constructing JavaScript-centric web applications. It stands out as a swift and minimalist testing solution, demanding minimal configuration. 
Vitest is largely compatible with Jest, a widely adopted JavaScript testing framework, and integrates smoothly into Vue applications. While it is purpose-built for use with Vite, Vitest can also operate independently, offering flexibility in its application.</p> <h3> Why Vitest </h3> <p>Vitest stands out as a powerful testing framework that offers unique advantages over other frameworks like Jest. It provides the following features, which make it an attractive choice among testing frameworks.</p> <h4> Simplified Setup and Configuration: </h4> <p>Vitest streamlines the setup and configuration process, enabling developers to dedicate more time to writing tests and less to intricate configuration. Its minimalistic approach makes it an ideal choice for small to medium-sized projects.</p> <h4> Concise and Legible Syntax: </h4> <p>Vitest offers a concise and easily understandable syntax, simplifying the task of writing and comprehending test cases. With its clean and intuitive API, you can articulate your test expectations in a natural, human-readable manner.</p> <h4> Outstanding Performance and Test Execution: </h4> <p>Vitest is renowned for its exceptional performance and rapid test execution speed. It optimizes test execution, ensuring the efficiency of your test suite, even as your codebase expands. This advantage proves especially valuable when tackling extensive test suites or large-scale projects.</p> <h4> Seamless TypeScript Integration: </h4> <p>Vitest seamlessly integrates with TypeScript, harnessing its type system to detect errors and furnish insightful feedback during testing. You have the ability to define and enforce type validations within your test cases, guaranteeing type correctness throughout your testing journey.</p> <h3> Vitest vs other frameworks </h3> <p>When it comes to testing your JavaScript code, the landscape offers a plethora of options.
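The expect-style assertion syntax that these frameworks share is worth demystifying before comparing them. The sketch below is a hypothetical illustration (the function name `expectValue` and its behavior are ours, not Vitest's or Jest's actual implementation) of what a chainable `toBe` matcher boils down to:

```typescript
// Hypothetical sketch of a Jest/Vitest-style `toBe` matcher.
// Not real framework code; it only shows the chainable pattern.
function expectValue<T>(received: T) {
  return {
    // toBe performs a strict identity comparison, the same
    // Object.is semantics that Jest-style matchers use.
    toBe(expected: T): void {
      if (!Object.is(received, expected)) {
        throw new Error(
          `expected ${String(received)} to be ${String(expected)}`
        );
      }
    },
  };
}

// Passing assertions return silently; failing ones throw.
expectValue(Math.sqrt(4)).toBe(2);
expectValue(1 + 1).toBe(2);
```

Real matchers layer rich diff output, asymmetric matchers, and async support on top of this basic pattern.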
Two standout choices among the most popular ones are Jest and Vitest.</p> <h4> Jest </h4> <p>Jest is a JavaScript testing framework that is designed to ensure the correctness of any JavaScript codebase. It has gained popularity among developers for its simplicity and user-friendly nature. Jest works with projects using Babel, TypeScript, Node, React, Angular, Vue, and more. It is a zero-config framework that aims to work out of the box on most JavaScript projects. Jest provides a wide range of features such as snapshots, isolated tests, a great API, speed and safety, code coverage, and easy mocking. Jest is well-documented, requires minimal configuration, and can be extended to meet specific project needs. Widely utilized by companies and individuals globally, it stands as a trusted and extensively adopted tool in the realm of JavaScript development.</p> <h3> Vitest vs Jest </h3> <p>Whether you opt for Jest or Vitest for JavaScript testing, you’ll find a contemporary, straightforward, and speedy testing experience. Both frameworks are well established and work reliably.</p> <h4> Speed </h4> <p>The choice between Jest and Vitest for faster tests depends on the specific circumstances. Vitest generally offers a safer bet for faster test execution, but the significance of this advantage varies based on factors such as the number of tests and the available resources. If you have a substantial number of tests and are running them on a resource-constrained local development environment, test speed becomes a more crucial concern. However, if you have only a few tests or are testing on a well-resourced infrastructure, the speed difference may be less critical.</p> <h4> Module management </h4> <p>Jest aligns with CommonJS for module management, offering simplicity in testing for projects using this traditional approach. In contrast, Vitest is tailored for ECMAScript Modules (ESM), a more contemporary module management system.
Choosing between Jest and Vitest may hinge on your project’s module strategy; Jest seamlessly integrates with CommonJS, while Vitest is the preferred option for ESM users. The module management system compatibility becomes a pivotal factor in deciding which testing framework aligns with your JavaScript project.</p> <h4> Documentation and community support </h4> <p>Jest, having been around for a decade, boasts a larger and more established community than the relatively newer Vitest. This results in better documentation and support for Jest, making it a more accessible choice. While Vitest is gaining popularity, it may take time to match Jest’s community strength. Jest currently enjoys the advantage of a more established ecosystem.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfnko5uypqozhi7x8ixd.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfnko5uypqozhi7x8ixd.png" alt="vitest comparison" width="800" height="622"></a><br> Image credits: Vitest: Blazing Fast Unit Test Framework (lo-victoria.com)</p> <h3> Creating your first Vitest app </h3> <p>In this section, we will try to create our first Vitest application.</p> <ul> <li>Open a command prompt / terminal on your machine.</li> <li>Create a folder named <code>first-vitest-app</code> and switch to the directory. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>mkdir first-vitest-app cd first-vitest-app </code></pre> </div> <ul> <li>Initialize the project using the following command and provide the necessary details (test command: vitest) when prompted.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm init </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpklgi50vgwu6cui7fu82.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpklgi50vgwu6cui7fu82.png" alt="npm init" width="800" height="543"></a></p> <ul> <li>Install the libraries <code>vite</code>, <code>vitest</code>, <code>@vitest/ui</code> using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm install vite vitest @vitest/ui </code></pre> </div> <p>Vite is the default development dependency for vitest. The vitest/ui provides a user interface for testing the application.</p> <ul> <li>Add the scripts attribute with the following contents in the <code>package.json</code> file. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>"scripts": { "test": "vitest", "test:ui": "vitest --ui", "test:run": "vitest run" } </code></pre> </div> <ul> <li>Create a file named vite.config.ts and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>/// &lt;reference types="vitest" /&gt; // Configure Vitest (https://vitest.dev/config/) import { defineConfig } from 'vite' export default defineConfig({ test: { /* for example, use global to avoid globals imports (describe, test, expect): */ // globals: true, }, }) </code></pre> </div> <ul> <li>Create a folder named <code>test</code> and create files named <code>basic.test.ts</code> and <code>suite.test.ts</code> inside it. 
The <code>basic.test.ts</code> file is used to write basic test cases using the <code>test()</code> method, and the <code>suite.test.ts</code> file is used to write test cases using the <code>describe()</code> method.</li> <li>Open the <code>basic.test.ts</code> file and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import { assert, expect, test } from 'vitest' test('Math.sqrt()', () =&gt; { expect(Math.sqrt(4)).toBe(2) expect(Math.sqrt(144)).toBe(12) expect(Math.sqrt(2)).toBe(Math.SQRT2) }) test('JSON', () =&gt; { const input = { foo: 'hello', bar: 'world', } const output = JSON.stringify(input) expect(output).eq('{"foo":"hello","bar":"world"}') assert.deepEqual(JSON.parse(output), input, 'matches original') }) </code></pre> </div> <ul> <li>Open the <code>suite.test.ts</code> file and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import { assert, describe, expect, it } from 'vitest' describe('suite name', () =&gt; { it('foo', () =&gt; { assert.equal(Math.sqrt(4), 2) }) it('bar', () =&gt; { expect(1 + 1).eq(2) }) it('snapshot', () =&gt; { expect({ foo: 'bar' }).toMatchSnapshot() }) }) </code></pre> </div> <h3> Running the test </h3> <p>Run the test using the following commands.</p> <ul> <li>Run the test in the normal command line mode.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm run test </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lbc3gw0pmnle4ri9lkk.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lbc3gw0pmnle4ri9lkk.png" alt="test" width="800" height="442"></a></p> <ul> <li>Run the test with UI. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm run test:ui </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazd12cdfoldmca1mrmt9.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazd12cdfoldmca1mrmt9.png" alt="output" width="640" height="278"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7phi0thewxj6h7n3yfjb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7phi0thewxj6h7n3yfjb.png" alt="output" width="800" height="239"></a></p> <h3> Creating a React project with Vitest </h3> <p>In this section, we will build a React application utilizing the Vite framework. 
Additionally, we will develop test cases using the Vitest framework and execute them.</p> <ul> <li>Create a React project using Vite by executing the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm create vite@latest </code></pre> </div> <ul> <li>Switch to the project folder and install the dependencies using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm install </code></pre> </div> <h4> Installing the dependencies </h4> <ul> <li>Install the vitest library and related testing packages using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm install -D vitest jsdom @testing-library/react @testing-library/jest-dom @types/testing-library__jest-dom </code></pre> </div> <ul> <li>Add the scripts attribute with the following contents in the <code>package.json</code> file. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>"scripts": { ... "test": "vitest" } </code></pre> </div> <ul> <li>Create a folder named <code>tests</code> and add a file named <code>setup.ts</code> to it with the following content. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>/// &lt;reference types="@testing-library/jest-dom" /&gt; import { expect, afterEach } from 'vitest'; import { cleanup } from '@testing-library/react'; import * as matchers from '@testing-library/jest-dom/matchers'; expect.extend(matchers); afterEach(() =&gt; { cleanup(); }); </code></pre> </div> <ul> <li>Open the <code>vite.config.js</code> file and add the following code to it.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>test: { environment: 'jsdom', setupFiles: ['./tests/setup.ts'], include: ['./tests/**/*.test.tsx'], globals: true } </code></pre> </div> <h3> Writing your first test script </h3> <p>We have configured Vitest for the app. We can create a basic test case to evaluate the App component.</p> <ul> <li>Create a file named <code>App.test.tsx</code> inside the <code>tests</code> folder and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import { render, screen } from '@testing-library/react'; import { describe, expect, it } from 'vitest' import App from "../src/App"; import React from 'react'; describe('App', () =&gt; { it('renders headline', () =&gt; { render(&lt;App /&gt;); const headline = screen.getByText("Vite + React"); expect(headline).toBeInTheDocument(); }); }); </code></pre> </div> <h3> Running the test </h3> <p>Run the test using the following command.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm run test </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla4okpkxuyprj9fs98k7.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla4okpkxuyprj9fs98k7.png" alt="output" width="800" height="374"></a></p> <ul> <li>Change the headline in the <code>App.tsx</code> file to <code>“Your first Vitest app”</code>.
Run the test again then you will get the output as follows.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhmog220b0g2n80yv8pn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhmog220b0g2n80yv8pn.png" alt="output1" width="720" height="379"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8uaj85n8ycrbgeda7tds.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8uaj85n8ycrbgeda7tds.png" alt="output2" width="640" height="376"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/vitest-examples" rel="noopener noreferrer">GitHub - codemaker2015/vitest-examples: vitest testing framework examples</a></p> <p>The article is also available on <a href="proxy.php?url=https://medium.com/@codemaker2016/introducing-vitest-the-super-fast-testing-framework-c4a86b431f8d" rel="noopener noreferrer">Medium</a>.</p> testing vite beginners Bun | The all-in-one JavaScript runtime Vishnu Sivan Sat, 23 Sep 2023 17:50:44 +0000 https://dev.to/codemaker2015/bun-the-all-in-one-javascript-runtime-4n2b 
https://dev.to/codemaker2015/bun-the-all-in-one-javascript-runtime-4n2b <p>In the ever-evolving realm of JavaScript development, a new entrant has made its debut — Bun 1.0. It’s not just any run-of-the-mill tool; it serves as a multifaceted JavaScript runtime and toolkit. The primary objective here is to simplify the development journey by removing unnecessary complexities. Bun is capable of handling a range of tasks, such as building, executing, testing, and debugging JavaScript and TypeScript. In essence, it offers a holistic solution for developers seeking efficiency.</p> <p>What sets Bun apart is its distinct role as a drop-in replacement for Node.js. Furthermore, it proudly boasts compatibility with TypeScript and TSX files, and notably, it delivers superior speed compared to Node.js. A pivotal feature of Bun is its adeptness in supporting both CommonJS and ES modules.</p> <p>Another remarkable feature of Bun is its capacity for hot reloading, allowing code to be refreshed without the need for process restarts. This feature proves exceptionally handy during the development phase, where iterative changes are the norm. Additionally, Bun offers a plugin API for crafting custom loaders and extends its versatility by supporting YAML imports. These traits collectively make Bun a valuable asset in the toolkit of modern developers.</p> <p>In this article, we will dive deeper into the world of Bun 1.0, exploring its features, benefits, and how it can revolutionize your JavaScript development journey.
Buckle up, because the future of JavaScript development just got a whole lot more exciting with Bun.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Bun</li> <li>Design goals</li> <li>Benchmarking</li> <li>Node vs Deno vs Bun</li> <li>Installation</li> <li>Experimenting with Bun</li> <li>Creating a simple HTTP server</li> <li>Creating a React application</li> <li>Bun for Next, Svelte, and Vue</li> <li>Bun Roadmap</li> <li>Useful Links</li> </ul> <h3> What is Bun </h3> <p>Bun is a comprehensive toolkit tailored for JavaScript and TypeScript applications, centered around its high-performance Bun runtime. This runtime, written in Zig and harnessing the capabilities of JavaScriptCore, serves as a seamless substitute for Node.js, effectively reducing startup time and memory consumption. Bun stands out as an inventive JavaScript runtime, encompassing an integrated native bundler, transpiler, task runner, and an npm client. It is ingeniously designed to supplant traditional JavaScript and TypeScript scripts or applications on local machines.
Notably, the version 0.5 release introduced exciting additions such as npm workspaces, Bun.dns, and node:readline support.</p> <p><a href="proxy.php?url=https://youtu.be/BsnCpESUEqM" rel="noopener noreferrer">What is Bun</a></p> <h3> Design goals </h3> <ul> <li> <strong>Performance</strong>: Bun boasts a 4x faster startup time compared to Node.js.</li> <li> <strong>TypeScript &amp; JSX Capabilities</strong>: Bun enables the direct execution of .jsx, .ts, and .tsx files, with its transpiler seamlessly converting them to vanilla JavaScript for execution.</li> <li> <strong>ESM &amp; CommonJS Compatibility</strong>: While advocating for ES modules (ESM), Bun also supports CommonJS.</li> <li> <strong>Web-Standard APIs</strong>: Bun incorporates standard Web APIs like fetch, WebSocket, and ReadableStream.</li> <li> <strong>Node.js Integration</strong>: Bun not only supports Node-style module resolution but also aims for comprehensive compatibility with core Node.js globals and modules.</li> </ul> <h3> Benchmarking </h3> <p>Bun distinguishes itself primarily through its exceptional speed, a minimum of 2.5 times faster than both Deno and Node.</p> <p>Instead of relying on the typically faster V8 engine, Bun utilizes JavaScriptCore from WebKit.
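Speed claims like these are normally produced by timing repeated runs of a workload. The helper below is a toy sketch (the `bench` function is our own illustration, not the methodology behind the published benchmarks); it relies only on the standard `performance.now()` timer, which is available in Bun, Node.js, and Deno alike:

```typescript
// Toy micro-benchmark: run a workload `iterations` times and return
// the elapsed wall-clock time in milliseconds. Illustrative only;
// real benchmarks also need warm-up runs and statistical treatment.
function bench(label: string, fn: () => void, iterations = 1_000): number {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsed = performance.now() - start;
  console.log(`${label}: ${elapsed.toFixed(2)} ms for ${iterations} iterations`);
  return elapsed;
}

// Example workload: JSON round-trip of a small object.
const sample = { runtime: "bun", fast: true };
bench("json-roundtrip", () => {
  JSON.parse(JSON.stringify(sample));
});
```

A serious comparison would also pin versions, add warm-up iterations, and report distributions rather than a single total.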
Furthermore, the creator of Bun has highlighted that Zig, a low-level programming language akin to C or Rust, lacks hidden control flow, which significantly simplifies the development of fast applications.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgfvgcbi0iydcnox0fpi.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgfvgcbi0iydcnox0fpi.png" alt="benchmarking" width="800" height="257"></a></p> <h3> Node vs Deno vs Bun </h3> <h4> Node.js </h4> <p>Node.js, developed by Ryan Dahl and introduced in 2009, stands as the preeminent JavaScript runtime. In 2023, it garnered the top position as the most popular web technology according to Stack Overflow developers. Node.js has been a transformative force, expanding the horizons of JavaScript applications by enabling the creation of sophisticated backend-driven solutions. Today, it anchors a sprawling ecosystem replete with abundant resources and libraries.</p> <h4> Deno </h4> <p>Deno is a JavaScript runtime written in Rust, introduced by Ryan Dahl. It emerged with the aim of enhancing the features offered by Node.js. Deno places a strong emphasis on bolstering security compared to Node.js. It achieves this by mandating explicit permission for file, network, and environment access, thereby reducing the likelihood of common security vulnerabilities in these domains. Additionally, Deno is tailored to offer improved support for JSX and TypeScript, aligning more closely with web standards.
Furthermore, it simplifies deployment by packaging applications as self-contained executables.</p> <h4> Bun </h4> <p>Bun, the latest contender in the runtime arena, is powered by Zig and positions itself as an all-inclusive runtime and toolkit, focusing on speed, bundling, testing, and compatibility with Node.js packages. Its standout feature lies in its exceptional performance, surpassing both Node.js and Deno.</p> <p>A performance benchmark, exemplified by running an HTTP handler rendering a server-side React page, demonstrated that Bun handles approximately 68,000 requests per second, whereas Deno and Node.js manage around 29,000 and 14,000, respectively, showcasing a significant performance differential. Bun goes beyond performance, encompassing bundling and task-running capabilities for projects built with JavaScript and TypeScript. Similar to Deno, it ships as a single binary and incorporates built-in support for Web APIs. Additionally, it extends support to select Node.js libraries, ensuring npm compatibility.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0zeo6vennlxarvvwhgs.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0zeo6vennlxarvvwhgs.png" alt="comparison" width="787" height="842"></a></p> <h3> Installation </h3> <p>You have the flexibility to install Bun as a native package on any operating system, or alternatively, you can opt for a global NPM package installation.
While it may seem unconventional to use NPM to install its replacement, this approach undeniably streamlines the installation process.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># with install script (recommended) curl -fsSL https://bun.sh/install | bash # with npm npm install -g bun # with Homebrew brew tap oven-sh/bun brew install bun # with Docker docker pull oven/bun docker run --rm --init --ulimit memlock=-1:-1 oven/bun </code></pre> </div> <p>Bun requires the unzip package to be installed if you are using the curl option for installation.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>sudo apt update sudo apt-get install unzip </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0typ94nokbscgofcjb55.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0typ94nokbscgofcjb55.png" alt="installation1" width="640" height="195"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmm9brulu70ska70j7a04.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmm9brulu70ska70j7a04.png" alt="installation2" width="720" height="211"></a><br> <a
href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzhl6ff03hu35b6vnw2v.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzhl6ff03hu35b6vnw2v.png" alt="installation3" width="800" height="245"></a></p> <p>If you are using a Windows machine, install WSL to run Bun.</p> <p>Refer to the following link to install WSL on your machine.</p> <p><a href="proxy.php?url=https://ubuntu.com/tutorials/install-ubuntu-on-wsl2-on-windows-11-with-gui-support#1-overview" rel="noopener noreferrer">Install Ubuntu on WSL2 and get started with graphical applications</a></p> <p>You can do the installation by opening a terminal with administrator privileges and executing the wsl --install command.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm3bwmvrm9xplfi14k59.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm3bwmvrm9xplfi14k59.png" alt="installation4" width="640" height="341"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqqcvaagein7xlnqnwhj.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqqcvaagein7xlnqnwhj.png" alt="Installation5" width="786" height="297"></a></p> <p>You can verify the Bun installation using the following command,<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun --help </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghoe6q6bjdvgiw4srfo6.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghoe6q6bjdvgiw4srfo6.png" alt="Installation" width="800" height="245"></a></p> <h3> Experimenting with Bun </h3> <p>Let’s try some experiments with Bun.</p> <h4> 1. Creating a simple HTTP server </h4> <p>Let’s create a simple HTTP server using the Bun.serve API.</p> <ul> <li>Create a project directory and switch to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>mkdir http-server-demo cd http-server-demo </code></pre> </div> <ul> <li>Run the following command on your terminal to scaffold a new project. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun init </code></pre> </div> <ul> <li>Open index.ts and add the following code snippet to create a simple HTTP server using Bun.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>const server = Bun.serve({ port: 3000, fetch(req) { return new Response("Hello World"); }, }); console.log(`Listening on http://localhost:${server.port}`); </code></pre> </div> <ul> <li>Execute the server using the following command, </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun run index.ts </code></pre> </div> <p>You can open the URL <a href="proxy.php?url=http://localhost:3000" rel="noopener noreferrer">http://localhost:3000</a> in your browser to see the server response.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgpmgioqvzu8atpkjvol.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgpmgioqvzu8atpkjvol.png" alt="output1" width="800" height="420"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4i2pt11m8dtnbwggcalj.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4i2pt11m8dtnbwggcalj.png" alt="output2" width="800" height="209"></a></p> <h4> 2. Creating a React application </h4> <p>Let’s create a React application using Bun’s Vite support.</p> <ul> <li>Execute the following command to create a boilerplate for your React application using Bun.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun create vite bun-react-demo
</code></pre> </div> <ul> <li>Switch to the project directory and execute the following commands, </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>cd bun-react-demo
bun install
bun run dev
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fia09v2dzo7t58hzt0yup.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fia09v2dzo7t58hzt0yup.png" alt="output3" width="800" height="709"></a></p> <p>You can view the output as follows,<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p9krtp3q1v80a0zv7vz.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p9krtp3q1v80a0zv7vz.png" alt="output4" width="800" height="392"></a></p> <h4> 3. Bun for Next, Svelte, and Vue </h4> <p>For a Next.js application, Bun offers comparable scaffolding, initiated with the command “bun create next ./app”. In scenarios where a built-in loader isn’t present, Bun incorporates customizable loaders. These loaders enable the handling of files associated with frameworks like Svelte or Vue, such as .svelte or .vue extensions.
Additionally, there’s an experimental SvelteKit adapter designed to run SvelteKit within the Bun environment.</p> <h3> Bun Roadmap </h3> <p>The Bun roadmap encompasses numerous open tasks, offering a glimpse into the project’s extensive scope and ambitious goals. Bun aspires to evolve into a comprehensive solution, serving as a versatile platform for various server-side JavaScript tasks.</p> <p><a href="proxy.php?url=https://github.com/oven-sh/bun/issues/159" rel="noopener noreferrer">Bun's Roadmap · Issue #159 · oven-sh/bun | github.com</a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/bun-demo" rel="noopener noreferrer">GitHub - codemaker2015/bun-demo: Experimenting with the new JavaScript framework | github.com</a></p> <p>The article is also available on <a href="proxy.php?url=https://codemaker2016.medium.com/bun-the-all-in-one-javascript-runtime-dc97721147a7" rel="noopener noreferrer">Medium</a>.</p> <h3> Useful Links </h3> <ul> <li><a href="proxy.php?url=https://byteofdev.com/posts/what-is-bun" rel="noopener noreferrer">What is Bun, and does it live up to the hype? 
| byteofdev.com</a></li> <li><a href="proxy.php?url=https://bestofjs.org/projects/bun" rel="noopener noreferrer">Bun - Incredibly fast JavaScript runtime | bestofjs.org</a></li> <li><a href="proxy.php?url=https://apps.microsoft.com/store/detail/ubuntu/9PDXGNCFSCZV?hl=en-in&amp;gl=in&amp;rtc=1" rel="noopener noreferrer">Get Ubuntu from the Microsoft Store | apps.microsoft.com</a></li> </ul> javascript webdev beginners tutorial Streamlit cheatsheet for beginners Vishnu Sivan Sun, 27 Aug 2023 18:23:58 +0000 https://dev.to/codemaker2015/streamlit-cheatsheet-for-beginners-706 https://dev.to/codemaker2015/streamlit-cheatsheet-for-beginners-706 <p>Streamlit is a widely used open-source Python framework which facilitates the creation and deployment of web apps for Machine Learning and Data Science. It enables a seamless process of developing and viewing results by allowing users to build apps just like writing Python code. This interactive loop between coding and web app visualization is a distinctive feature, making app development and result exploration efficient and effortless.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>Streamlit Methods</li> <li>Installation</li> <li>Start with hello world</li> <li>Text Elements Examples</li> <li>Widget Examples</li> <li>Input</li> <li>Button</li> <li>Checkbox</li> <li>Radio</li> <li>Slider</li> <li>Date and time</li> <li>Form</li> <li>Status</li> <li>Chart</li> <li>Data</li> <li>Chat</li> </ul> <h3> Streamlit Methods </h3> <h4> Data Presentation: </h4> <ul> <li>The <code>st.write()</code> function empowers you to exhibit various data formats as per the requirements.</li> <li>The <code>st.metric()</code> function supports you to showcase a singular metric.</li> <li>The <code>st.table()</code> function is used for rendering tabular information.</li> <li>The <code>st.dataframe()</code> function is engineered to elegantly showcase pandas dataframes.</li> <li>The <code>st.image()</code> function offers seamless 
image display for visual content.</li> <li> <p>The <code>st.audio()</code> function takes care of audio file playback.</p> <h4> Headers and Text Styling: </h4> </li> <li><p>The <code>st.subheader()</code> function serves as a valuable tool for generating subheadings within your application.</p></li> <li><p>The <code>st.markdown()</code> function enables seamless integration of Markdown-formatted content.</p></li> <li> <p>The <code>st.latex()</code> function stands as a powerful asset for expressing mathematical equations.</p> <h4> User Interaction: </h4> <p>To infuse your web application with interactive elements, Streamlit provides an array of widgets.</p> </li> <li><p>The <code>st.checkbox()</code> function is used for incorporating checkboxes.</p></li> <li><p>The <code>st.button()</code> function is used for buttons.</p></li> <li><p>The <code>st.selectbox()</code> function facilitates the implementation of dropdown menus.</p></li> <li><p>The <code>st.multiselect()</code> function is designed to meet multi-selection dropdown requirements.</p></li> <li> <p>The <code>st.file_uploader()</code> function is used for handling file uploads.</p> <h4> Progress Tracking: </h4> <p>Streamlit offers functions tailored for indicating progress.</p> </li> <li><p>The <code>st.progress()</code> function is used to create a dynamic progress bar.</p></li> <li> <p>The <code>st.spinner()</code> function allows users to incorporate a spinner animation to denote ongoing processes.</p> <h4> Sidebar and Form Integration: </h4> <p>Streamlit’s versatile capabilities extend to incorporating a sidebar to accommodate supplementary functionality.</p> </li> <li><p>The <code>st.sidebar</code> object is used to seamlessly integrate elements into a secondary panel.</p></li> <li> <p>The <code>st.form()</code> function establishes a framework for user interactions.</p> <h4> Custom HTML and CSS Integration: </h4> <p>Streamlit offers provisions for embedding custom HTML and CSS to tailor your
web application’s appearance and behavior.</p> </li> <li><p>The <code>st.markdown()</code> function enables Markdown styling.</p></li> <li><p>The <code>st.write()</code> function facilitates the integration of bespoke HTML components into your application.</p></li> </ul> <h3> Installation </h3> <p>To get started, the first step is to install Streamlit. Ensure that you have Python 3.7 to 3.10 installed, along with pip and your preferred Python Integrated Development Environment (IDE), on your machine. With these prerequisites in place, open your terminal and follow the steps below to install Streamlit.</p> <ul> <li>Create and activate a virtual environment by executing the following commands. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate #for ubuntu
venv/Scripts/activate #for windows
</code></pre> </div> <ul> <li>Install the streamlit library using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install streamlit
</code></pre> </div> <h3> Start with hello world </h3> <p>Initiate your Streamlit experience by delving into the pre-built “Hello World” application provided by the platform.
To confirm the successful installation, execute the following command in your terminal:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit hello
</code></pre> </div> <p>You can see the Streamlit Hello World app open in a new tab of your web browser.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F410xedqvk8gzxw02slgf.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F410xedqvk8gzxw02slgf.png" alt="output1" width="750" height="317"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsdmcsmqdnu3ueznf88c.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsdmcsmqdnu3ueznf88c.png" alt="output2" width="640" height="338"></a></p> <h3> Text Elements Examples </h3> <ul> <li> <strong>Title</strong>: Defines the page’s title.</li> <li> <strong>Header</strong>: Showcases text using header formatting.</li> <li> <strong>Subheader</strong>: Presents text in subheader formatting.</li> <li> <strong>Markdown</strong>: Applies markdown formatting to the text.</li> <li> <strong>Code</strong>: Exhibits text as code with suitable syntax highlighting.</li> <li> <strong>Latex</strong>: Utilizes LaTeX to present mathematical equations.</li> </ul> <p>Create a file named <code>text_example.py</code> and add the following code to it.</p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# set the app's title
st.title("Title in Streamlit")

# header
st.header("Header in Streamlit")

# subheader
st.subheader("Subheader in Streamlit")

# markdown
# display text in bold formatting
st.markdown("**Streamlit** is a widely used open-source Python framework which facilitates the creation and deployment of web apps for Machine Learning and Data Science.")
# display a markdown link
st.markdown("Visit [Streamlit](https://docs.streamlit.io) to learn more about Streamlit.")

# code block
code = '''
def add(a, b):
    print("a+b = ", a+b)
'''
st.code(code, language='python')

# latex
st.latex(''' (a+b)^2 = a^2 + b^2 + 2*a*b ''')
</code></pre> </div> <p>Run the text example using the following command.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run text_example.py
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dgywya5jbwy17itgr1e.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dgywya5jbwy17itgr1e.png" alt="output3" width="800" height="351"></a></p> <h3> Widget Examples </h3> <h4> Input </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# text input
name = st.text_input("Enter your name", "")
st.write("Your name is ", name)

age = st.number_input(label="Enter your age")
st.write("Your age is ", age)

address = st.text_area("Enter your address", "")
st.write("Your address is ", address)
</code></pre> </div> <p><a
href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojg3mmplonzbe67g1w3p.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojg3mmplonzbe67g1w3p.png" alt="input" width="800" height="370"></a></p> <h4> Button </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st #button if st.button('Click me', help="Click to see the text change"): st.write('Welcome to Streamlit!') else: st.write('Hi there!') </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv83h79ro505a2b7809na.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv83h79ro505a2b7809na.png" alt="button" width="800" height="201"></a></p> <h4> Checkbox </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st # check box checked = st.checkbox('Click me') if checked: st.write('You agreed the terms and conditions!') </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvf2w2ixqf7kr2b92gjml.png" class="article-body-image-wrapper"><img 
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvf2w2ixqf7kr2b92gjml.png" alt="checkbox" width="800" height="201"></a></p> <h4> Radio </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# radio button
lang = st.radio(
    "What's your favorite programming language?",
    ('C', 'C++', 'Java', 'Python'))

if lang == 'C':
    st.write('You selected C')
elif lang == 'C++':
    st.write('You selected C++')
elif lang == 'Java':
    st.write('You selected Java')
else:
    st.write('You selected Python')
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffloqgvzhvriaomy1707o.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffloqgvzhvriaomy1707o.png" alt="radio" width="800" height="206"></a></p> <h4> Slider </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# slider
age = st.slider('Please enter your age', min_value=0, max_value=100, value=10)
st.write("Your age is ", age)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2w419vwrxoznoi1tkyjp.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2w419vwrxoznoi1tkyjp.png" alt="slider" width="800"
height="201"></a></p> <h4> Date and Time </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import datetime import streamlit as st date = st.date_input("When's your birthday", datetime.date(2000, 1, 1), datetime.date(1990, 1, 1), datetime.datetime.now()) st.write("Your birthday is ", date) time = st.time_input("Which is your birth time", datetime.time(0, 0)) st.write("Your birth time is ", time) </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsqm3cudem4rnx1rfbqn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsqm3cudem4rnx1rfbqn.png" alt="date and time" width="800" height="330"></a></p> <h4> Form </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st with st.form("user_form"): st.header("User Registration") name = st.text_input("Enter your name", "") age = st.slider("Enter your age") gender = st.radio("Select your gender", ('Male', 'Female')) terms = st.checkbox("Accept terms and conditions") # Every form must have a submit button. 
submitted = st.form_submit_button("Submit") if submitted: if terms: st.write("Name: ", name, ", Age: ", age, ", Gender: ", gender) else: st.write("Accept terms and conditions") st.write("Thanks for visiting") </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq0tgx2mfv1e5yfcfn7f.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq0tgx2mfv1e5yfcfn7f.png" alt="form" width="800" height="426"></a></p> <h4> Status </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st import time # progress progress_text = "Operation in progress. Please wait." my_bar = st.progress(0, text=progress_text) for percent_complete in range(100): time.sleep(0.1) my_bar.progress(percent_complete + 1, text=progress_text) # spinner with st.spinner('Wait for it...'): time.sleep(5) st.success('Done!') # messages st.toast('Your edited image was saved!', icon='😍') st.error('This is an error', icon="🚨") st.info('This is a purely informational message', icon="ℹ️") st.warning('This is a warning', icon="⚠️") st.success('This is a success message!', icon="✅") e = RuntimeError('This is an exception of type RuntimeError') st.exception(e) </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpuo4jo8ehlllrj82wvk3.png" class="article-body-image-wrapper"><img 
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpuo4jo8ehlllrj82wvk3.png" alt="status" width="800" height="426"></a></p> <h4> Chart </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st import pandas as pd import numpy as np # chart chart_data = pd.DataFrame( np.random.randn(20, 3), columns=['a', 'b', 'c']) st.line_chart(chart_data) st.bar_chart(chart_data) st.area_chart(chart_data) df = pd.DataFrame( np.random.randn(1000, 2) / [50, 50] + [37.76, -122.4], columns=['lat', 'lon']) st.map(df) </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h43bpmb3ej00weeao1g.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h43bpmb3ej00weeao1g.png" alt="chart" width="800" height="426"></a></p> <h4> Data </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st import pandas as pd import numpy as np # data frame st.subheader("Data Frame") df = pd.DataFrame( np.random.randn(50, 20), columns=('col %d' % i for i in range(20))) st.dataframe(df) # Same as st.write(df) # table st.subheader("Data Table") df = pd.DataFrame( np.random.randn(10, 5), columns=('col %d' % i for i in range(5))) st.table(df) # data editor st.subheader("Data Editor") df = pd.DataFrame( [ {"command": "st.selectbox", "rating": 4, "is_widget": True}, {"command": "st.balloons", "rating": 5, "is_widget": False}, {"command": "st.time_input", "rating": 3, "is_widget": True}, ] ) st.data_editor(df) # metric st.subheader("Data 
Metric")
st.metric(label="Temperature", value="70 °F", delta="1.2 °F")

col1, col2, col3 = st.columns(3)
col1.metric("Temperature", "70 °F", "1.2 °F")
col2.metric("Wind", "9 mph", "-8%")
col3.metric("Humidity", "86%", "4%")

# json
st.subheader("Data JSON")
st.json({
    'foo': 'bar',
    'baz': 'boz',
    'stuff': [
        'stuff 1',
        'stuff 2',
        'stuff 3',
        'stuff 5',
    ],
})
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxz59gkptmel41xvvnz2.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxz59gkptmel41xvvnz2.png" alt="data" width="800" height="426"></a></p> <h4> Chat </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st
import numpy as np

prompt = st.chat_input("Enter the chart type (bar, area, line)")
print(prompt)

if prompt == "bar":
    with st.chat_message("user"):
        st.write("Bar Chart Demo 👋")
        st.bar_chart(np.random.randn(30, 3))
elif prompt == "area":
    with st.chat_message("user"):
        st.write("Area Chart Demo 👋")
        st.area_chart(np.random.randn(30, 3))
elif prompt == "line":
    with st.chat_message("user"):
        st.write("Line Chart Demo 👋")
        st.line_chart(np.random.randn(30, 3))
elif prompt is not None:
    with st.chat_message("user"):
        st.write("Wrong chart type")
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu59w26bgl6rt1znz6qeo.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu59w26bgl6rt1znz6qeo.png" alt="chat" width="800" height="426"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/streamlit-cheatsheet" rel="noopener noreferrer">GitHub - codemaker2015/streamlit-cheatsheet | github.com</a></p> <p>The article is also available on Medium.</p> <p>Here are some useful links,</p> <p><a href="proxy.php?url=https://docs.streamlit.io/library/get-started" rel="noopener noreferrer">Get started - Streamlit Docs | docs.streamlit.io</a></p> streamlit python cheatsheet beginners Talk with documents using LlamaIndex Vishnu Sivan Mon, 24 Jul 2023 19:08:33 +0000 https://dev.to/codemaker2015/talk-with-documents-using-llamaindex-24ln https://dev.to/codemaker2015/talk-with-documents-using-llamaindex-24ln <p>Discover the latest buzz in the tech world with LangChain and LlamaIndex! These open-source libraries offer developers the opportunity to harness the incredible power of Large Language Models (LLMs) in their applications. LlamaIndex acts as a central hub, seamlessly connecting LLMs with external data sources. Meanwhile, LangChain provides a robust framework for constructing and managing LLM-powered applications. 
Though still in development, these game-changing tools have the potential to revolutionize the way we build and integrate advanced language models.</p> <p>In this article, we will cover the basics of LlamaIndex and create a data extraction and analysis tool using LlamaIndex, LangChain and OpenAI.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What are Large Language Models (LLMs)</li> <li>What is LangChain</li> <li>What is Streamlit</li> <li>Introduction to LlamaIndex</li> <li>Basic workflow of LlamaIndex</li> <li>LlamaIndex indices</li> <li>Creating a document extractor / analyzer application using LlamaIndex, LangChain and OpenAI</li> <li>Installing the dependencies</li> <li>Setting up environment variables</li> <li>Importing the libraries</li> <li>Designing the sidebar</li> <li>Defining the get_response method</li> <li>Designing streamlit input field and submit button</li> <li>Complete code for the app</li> <li>Running the app</li> </ul> <h3> What are Large Language Models (LLMs) </h3> <p>Large Language Models (LLMs) refer to powerful AI models that are designed to understand and generate human language. LLMs are characterized by their ability to process and generate text that is coherent, contextually relevant, and often indistinguishable from human-written content. These models are pre-trained on diverse and extensive corpora of text, such as books, articles, websites, and other sources of written language. During pre-training, the models learn to predict the next word in a given sentence or fill in missing words in a paragraph, which helps them capture grammar, syntax, and semantic relationships between words and phrases.</p> <p>Large Language Models have gained significant attention and popularity due to their versatility and the impressive quality of their language generation capabilities. 
They have found applications in various domains, including natural language processing, content creation, chatbots, virtual assistants, and even creative writing. However, it’s important to note that LLMs are still machines and may occasionally produce inaccurate or biased outputs, highlighting the need for careful evaluation and human oversight when using them in real-world applications.</p> <h3> What is LangChain </h3> <p>LangChain is an open-source framework developed to streamline the creation of applications powered by large language models (LLMs). It provides a comprehensive set of tools, components, and interfaces that simplify the development process of LLM-centric applications. By leveraging LangChain, developers can effortlessly manage interactions with language models, seamlessly connect various components, and integrate resources like APIs and databases. The LangChain platform also offers a range of embedded APIs that empower developers to incorporate language processing capabilities without starting from scratch.</p> <p>As natural language processing continues to advance and gain wider adoption, the potential applications of this technology become virtually boundless. 
Here are some notable features of LangChain:</p> <ul> <li>LangChain allows developers to tailor prompts according to their specific requirements, enabling more precise and relevant language model outputs.</li> <li>LangChain enables developers to manipulate context to establish and guide the context for improved precision and user satisfaction, enhancing the overall user experience.</li> <li>With LangChain, developers can construct chain link components, which facilitate advanced usage scenarios and provide greater flexibility in the application design.</li> <li>The framework provides versatile components that can be mixed and matched to suit specific application needs, providing a modular approach to development.</li> <li>LangChain supports the integration of various models, including popular ones like GPT and HuggingFace Hub, allowing developers to leverage the cutting-edge capabilities of these language models.</li> </ul> <h3> What is Streamlit </h3> <p>Streamlit is a Python library that enables the effortless creation and sharing of interactive web applications and data visualizations. It provides a user-friendly interface for developing interactive charts and graphs using popular data visualization libraries such as matplotlib, pandas, and plotly. With Streamlit, you can build web apps that respond in real-time to user input, making it easy to create dynamic and engaging data-driven applications.</p> <h3> Introduction to LlamaIndex </h3> <p>The primary concept behind LlamaIndex is the capability to query documents, whether they consist of text or code, using a language model (LLM) such as ChatGPT. It is an open-source project that serves as a bridge between large language models (LLMs) and external data sources such as APIs, PDFs, and SQL databases. It offers a straightforward interface and facilitates the creation of indices for both structured and unstructured data, effectively handling the variations among different data sources. 
LlamaIndex can store the necessary context for prompt engineering, address challenges when dealing with large context windows, and assist in balancing cost and performance considerations when executing queries.</p> <p><a href="proxy.php?url=https://huggingface.co/llamaindex" rel="noopener noreferrer">llamaindex (LlamaIndex) | Org profile for LlamaIndex on Hugging Face, the AI community building the future. | huggingface.co</a></p> <h4> Basic workflow of LlamaIndex </h4> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwn2eqa0zdpxz2aminqgz.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwn2eqa0zdpxz2aminqgz.png" alt="Basic workflow of LlamaIndex" width="800" height="230"></a></p> <ul> <li>The document is loaded into LlamaIndex using pre-built readers for various sources, including databases, Discord, Slack, Google Docs, Notion, and GitHub repositories.</li> <li>LlamaIndex parses the documents, breaking them down into nodes or chunks of text.</li> <li>An index is created to efficiently retrieve relevant data when querying the documents. The index can be stored in different ways, with the Vector Store being a commonly used method.</li> <li>To perform a query, the document is searched using the index stored in the vector store. 
The response is then sent back to the user.</li> </ul> <h4> LlamaIndex indices </h4> <p>LlamaIndex provides specialized indices in the form of unique data structures.</p> <ul> <li> <strong>Vector store index</strong>: Widely used for answering queries across a large corpus of data.</li> <li> <strong>List index</strong>: Beneficial for synthesizing answers that combine information from multiple data sources.</li> <li> <strong>Keyword table index</strong>: Useful for routing queries to different unrelated data sources.</li> <li> <strong>Knowledge graph index</strong>: Effective for constructing and utilizing knowledge graphs.</li> <li> <strong>Structured store index</strong>: Well-suited for handling structured data, such as SQL queries.</li> <li> <strong>Tree index</strong>: Valuable for summarizing collections of documents.</li> </ul> <h3> Creating a document extractor / analyzer application using LlamaIndex, LangChain and OpenAI </h3> <p>In the previous sections, we discussed the basics of LLMs, LangChain and LlamaIndex. In this section, we will create a basic document extractor / analyzer application using these generative AI tools. The application takes openai key and the directory path as inputs and provides an interface to interact with the documents listed in the specified directory.</p> <h3> Installing the dependencies </h3> <h4> Create and activate a virtual environment by executing the following command. </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv source venv/bin/activate #for ubuntu venv/Scripts/activate #for windows </code></pre> </div> <h4> Install llama-index and streamlit libraries using pip. </h4> <p>Note that LlamaIndex requires python 3.8+ version to work. 
Use the pinned streamlit and llama-index versions below; the latest llama-index release raises a RateLimit error that has not yet been fixed by the LlamaIndex team.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install streamlit==1.24.0
pip install llama-index==0.5.27
</code></pre> </div> <h4> Setting up environment variables </h4> <p>An OpenAI API key is required because LlamaIndex queries OpenAI models under the hood. Follow these steps to create a new key.</p> <ul> <li>Open platform.openai.com.</li> <li>Click on your name or icon in the top right corner of the page and select “API Keys”, or open the link directly — Account API Keys — OpenAI API.</li> <li>Click the Create new secret key button to create a new OpenAI key. <img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F12jzhrgxq9hny5hbdlvc.png" alt="OpenAI key" width="800" height="232"> </li> </ul> <h4> Importing the libraries </h4> <p>Import the necessary libraries by creating a file named <code>app.py</code> and adding the following code to it.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os, streamlit as st
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor, PromptHelper, ServiceContext
from langchain.llms.openai import OpenAI
</code></pre> </div> <h4> Designing the sidebar </h4> <p>Create a sidebar using the streamlit sidebar API to collect the OpenAI key and the directory path from the user.
Add the following code to the <code>app.py</code> file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>openai_api_key = st.sidebar.text_input(
    label="#### Your OpenAI API key 👇",
    placeholder="Paste your openAI API key, sk-",
    type="password")

directory_path = st.sidebar.text_input(
    label="#### Your data directory path 👇",
    placeholder="C:\data",
    type="default")
</code></pre> </div> <h4> Defining the get_response method </h4> <p>Create a <code>get_response()</code> method which takes <code>query</code>, <code>directory_path</code> and <code>openai_api_key</code> as arguments and returns the query response.</p> <p>Add the following code to the <code>app.py</code> file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_response(query, directory_path, openai_api_key):
    # This example uses text-davinci-003 by default; feel free to change it if desired.
    # Skip the openai_api_key argument if you have already set it as an environment variable
    llm_predictor = LLMPredictor(llm=OpenAI(openai_api_key=openai_api_key, temperature=0, model_name="text-davinci-003"))

    # Configure prompt parameters and initialise the helper
    max_input_size = 4096
    num_output = 256
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    if os.path.isdir(directory_path):
        # Load documents from the given directory
        documents = SimpleDirectoryReader(directory_path).load_data()
        service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
        index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
        response = index.query(query)
        if response is None:
            st.error("Oops! No result found")
        else:
            st.success(response)
    else:
        st.error(f"Not a valid directory: {directory_path}")
</code></pre> </div> <h4> Understanding the code: </h4> <ul> <li>Create an object llm_predictor for the class LLMPredictor which accepts a parameter llm.
Specify the model text-davinci-003 from OpenAI’s API, along with the temperature and the OpenAI API key, as arguments.</li> <li>Create a PromptHelper by specifying the maximum input size (max_input_size), the number of output tokens (num_output), and the maximum chunk overlap (max_chunk_overlap).</li> <li>The SimpleDirectoryReader class reads data from a directory. It is given the directory path, and when its load_data method is invoked it loads the files from that directory and returns the successfully loaded documents.</li> <li>The GPTSimpleVectorIndex class establishes an index that enables efficient searching and retrieval of documents. To create this index, we use the from_documents method of the class, which requires two parameters: documents and service_context.</li> <li>The documents parameter represents the actual documents that will be indexed.</li> <li>The <code>service_context</code> parameter denotes the service context passed along with the documents.</li> <li>Query the documents by calling <code>index.query(query)</code>.</li> </ul> <h4> Designing streamlit input field and submit button </h4> <p>Create an input field and a submit button using streamlit to get the user queries.
Call the get_response() method inside the submit button handler to run llama-index and query the documents with the given input.</p> <p>Add the following code to the app.py file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># Define a simple Streamlit app
st.title("ChatMATE")
query = st.text_input("What would you like to ask?", "")

# If the 'Submit' button is clicked
if st.button("Submit"):
    if not query.strip():
        st.error("Please provide the search query.")
    else:
        try:
            if len(openai_api_key) &gt; 0:
                get_response(query, directory_path, openai_api_key)
            else:
                st.error("Enter a valid OpenAI key")
        except Exception as e:
            st.error(f"An error occurred: {e}")
</code></pre> </div> <h4> Complete code for the app </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os, streamlit as st
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor, PromptHelper, ServiceContext
from langchain.llms.openai import OpenAI

# Uncomment to specify your OpenAI API key here, or add the corresponding environment variable (recommended)
# os.environ['OPENAI_API_KEY'] = "sk-..."

# Provide the OpenAI key from the frontend if you are not using the line above to set the key
openai_api_key = st.sidebar.text_input(
    label="#### Your OpenAI API key 👇",
    placeholder="Paste your openAI API key, sk-",
    type="password")

directory_path = st.sidebar.text_input(
    label="#### Your data directory path 👇",
    placeholder="C:\data",
    type="default")

def get_response(query, directory_path, openai_api_key):
    # This example uses text-davinci-003 by default; feel free to change it if desired.
    # Skip the openai_api_key argument if you have already set it as an environment variable
    llm_predictor = LLMPredictor(llm=OpenAI(openai_api_key=openai_api_key, temperature=0, model_name="text-davinci-003"))

    # Configure prompt parameters and initialise the helper
    max_input_size = 4096
    num_output = 256
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    if os.path.isdir(directory_path):
        # Load documents from the given directory
        documents = SimpleDirectoryReader(directory_path).load_data()
        service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
        index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
        response = index.query(query)
        if response is None:
            st.error("Oops! No result found")
        else:
            st.success(response)
    else:
        st.error(f"Not a valid directory: {directory_path}")

# Define a simple Streamlit app
st.title("ChatMATE")
query = st.text_input("What would you like to ask?", "")

# If the 'Submit' button is clicked
if st.button("Submit"):
    if not query.strip():
        st.error("Please provide the search query.")
    else:
        try:
            if len(openai_api_key) &gt; 0:
                get_response(query, directory_path, openai_api_key)
            else:
                st.error("Enter a valid OpenAI key")
        except Exception as e:
            st.error(f"An error occurred: {e}")
</code></pre> </div> <h4> Running the app </h4> <p>To run the app, an OpenAI API key and a directory path are required. Create a few text files that contain your required content and place them in a directory. Specify that directory while running the app.
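For example, a small sample data directory can be created like this (the file names and contents are purely illustrative placeholders):

```python
import os

# Build a sample data directory with a couple of plain-text files
# (illustrative placeholder content only)
os.makedirs("data", exist_ok=True)

samples = {
    "quantum_physics.txt": "Quantum physics studies matter and energy at the smallest scales.",
    "quantum_computing.txt": "Quantum computers use qubits, which can exist in superposition.",
}
for name, text in samples.items():
    with open(os.path.join("data", name), "w", encoding="utf-8") as f:
        f.write(text)

print(sorted(os.listdir("data")))
```

Point the app's data directory path input at this folder when the sidebar asks for it.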
In this demo, text files containing information about quantum physics and quantum computing were used.</p> <p>Run the app using the following command,<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run app.py </code></pre> </div> <p>The output is as given below,</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08zyihf2aqxgnmoy7r6i.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08zyihf2aqxgnmoy7r6i.png" alt="output" width="800" height="358"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/llamaindex-based-document-extractor" rel="noopener noreferrer">GitHub - codemaker2015/llamaindex-based-document-extractor | github.com</a></p> <p>Here are some useful links,</p> <ul> <li><a href="proxy.php?url=https://huggingface.co/llamaindex" rel="noopener noreferrer">llamaindex (LlamaIndex) | Org profile for LlamaIndex on Hugging Face, the AI community building the future. | huggingface.co</a></li> <li><a href="proxy.php?url=https://docs.langchain.com/docs/" rel="noopener noreferrer">🦜️🔗 LangChain | LangChain is a framework for developing applications powered by language models. 
| docs.langchain.com</a></li> <li><a href="proxy.php?url=https://github.com/yvann-hub/Robby-chatbot" rel="noopener noreferrer">GitHub - yvann-hub/Robby-chatbot: AI chatbot 🤖 for chat with CSV, PDF, TXT files 📄 and YTB videos… | github.com</a></li> </ul> Say hello to DragGAN — The cutting-edge AI tool now available! Vishnu Sivan Sat, 01 Jul 2023 17:05:20 +0000 https://dev.to/codemaker2015/say-hello-to-draggan-the-cutting-edge-ai-tool-now-available-51k1 https://dev.to/codemaker2015/say-hello-to-draggan-the-cutting-edge-ai-tool-now-available-51k1 <p>Exciting news for all image editing enthusiasts! The highly anticipated DragGAN code has finally been released and is now available under the CC-BY-NC license. With DragGAN, gone are the days of complex editing processes and painstaking adjustments. This remarkable solution introduces a whole new level of simplicity by allowing you to effortlessly drag elements within an image to transform their appearance. Drawing inspiration from the powerful StyleGAN3 and StyleGAN-Human models, DragGAN empowers users to manipulate various aspects of an image, whether it’s altering the dimensions of a car, modifying facial expressions, or even rotating the image as if it were a 3D model.</p> <p>In this article, we will go through the basics of DragGAN and try it out using Google Colab.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>Introduction to GAN</li> <li>Types of GAN</li> <li>StyleGAN3 and StyleGAN-Human</li> <li>Applications of GAN</li> <li>DragGAN</li> <li>How to use it</li> </ul> <h3> Introduction to GAN </h3> <p>A generative adversarial network (GAN) is a special kind of machine learning model that uses two neural networks to compete with each other. These neural networks, called the generator and the discriminator, work together in a game-like manner to improve their skills.
The generator tries to create realistic data that looks like the real thing, while the discriminator tries to figure out which data is real and which is fake. They learn from each other’s successes and failures, making the generator better at creating convincing fake data and the discriminator better at spotting fakes. GANs are used to make computer-generated images, videos, and other types of data that look very similar to what humans create.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimp6y05fhfxfa4yfkt1a.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimp6y05fhfxfa4yfkt1a.png" alt="Overview of GAN Structure" width="577" height="254"></a><br> Image source: Overview of GAN Structure | Machine Learning | Google for Developers</p> <p>GANs can generate high-quality, realistic data that exhibits similar characteristics to the training dataset. GANs have found applications in various fields, including image synthesis, video generation, text-to-image translation, and more.</p> <h3> Types of GAN </h3> <p>GANs come in different types. Let’s explore some common GAN variants:</p> <ul> <li> <strong>Vanilla GAN</strong>: This is the simplest type of GAN, consisting of a generator and a discriminator. The generator creates images while the discriminator determines if an image is real or fake.</li> <li> <strong>Deep Convolutional GAN (DCGAN)</strong>: DCGAN employs deep convolutional neural networks to generate high-resolution and distinguishable images. 
Convolutional layers extract important details from the data, making it effective for image generation tasks.</li> <li> <strong>Progressive GAN</strong>: The generator starts by producing low-resolution images, and as the training progresses, it adds more details in subsequent layers. This approach enables faster training compared to non-progressive GANs and results in higher resolution images being generated.</li> <li> <strong>Conditional GAN</strong>: This type of GAN allows the network to be conditioned on specific information, such as class labels. It helps the GAN learn to differentiate between different classes by training with labeled images.</li> <li> <strong>CycleGAN</strong>: This type of GAN is often used for image style transfer, enabling the transformation between different image styles. For example, it can convert images from winter to summer or from a horse to a zebra. Applications like FaceApp utilize CycleGAN to alter facial appearances.</li> <li> <strong>Super Resolution GAN</strong>: This GAN type enhances low-resolution images by generating higher-resolution versions. It fills in missing details, improving the overall image quality.</li> <li> <strong>StyleGAN</strong>: Developed by Nvidia, StyleGAN generates high-quality, photorealistic images, especially focusing on realistic human faces. Users can manipulate the model to modify various aspects of the generated images.</li> </ul> <h3> StyleGAN3 and StyleGAN-Human </h3> <ul> <li> <strong>StyleGAN3</strong> — It is an evolution of the original StyleGAN that introduces several improvements and innovations to enhance the image generation process. It incorporates adaptive discriminator augmentation (ADA), a technique that dynamically adjusts the discriminator during training to improve the overall image quality. 
StyleGAN3 also introduces novel regularization methods, architectural modifications, and better optimization strategies, resulting in even more visually appealing and coherent face synthesis.</li> <li> <strong>StyleGAN-Human</strong> — It is a variant of StyleGAN3 that specifically focuses on generating realistic human faces. It leverages a large-scale dataset of human faces to learn intricate details, such as facial expressions, hair styles, and diverse characteristics.</li> </ul> <h3> Applications of GAN </h3> <p>GANs have gained popularity in online retail sales due to their ability to understand and recreate visual content accurately. They can fill in images from outlines, generate realistic images from text descriptions, and create photorealistic product prototypes. They learn from human movement patterns, predict future frames, and create deepfake videos in video production. Furthermore, GANs can generate realistic speech sounds and even generate text for various purposes like blogs, articles, and product descriptions.</p> <p>Let’s have a look at some of the use cases of GAN.</p> <ul> <li> <strong>Realistic 3D Object Generation</strong>: GANs have proven capable of generating three-dimensional objects, such as furniture models created by researchers at MIT that resemble designs crafted by humans. 
These models can be valuable for architectural visualization and video game production.</li> <li> <strong>Human Face Generation</strong>: GANs, such as Nvidia’s StyleGAN2, can generate highly realistic and believable human faces that appear to be genuine individuals.</li> <li> <strong>Video Game Character Creation</strong>: GANs have found applications in video game development, such as the use of GANs by Nvidia to generate new characters for the popular game Final Fantasy XV.</li> <li> <strong>Fashion Design Innovation</strong>: GANs have been utilized by clothing retailer H&amp;M to create fresh fashion designs inspired by existing styles, allowing for the development of unique apparel.</li> </ul> <h2> DragGAN </h2> <p>DragGAN is an exciting new AI application that revolutionizes photo and art adjustments with a simple drag-and-drop interface. It allows you to modify images across various categories like animals, cars, people, landscapes, and more. With DragGAN, you can reshape the image layout, adjust poses and shapes, and even change facial expressions of individuals in photos.</p> <p>According to the research team behind DragGAN, their aim is to provide users with the ability to “drag” any point in an image to their desired position.</p> <p>DragGAN comprises two key components. The first is feature-based motion supervision, which facilitates precise movement of points within the image. 
The second is a novel point tracking approach, ensuring accurate tracking of these points.</p> <h3> How to use it </h3> <p>In this section, we will try DragGAN using the official Git repository.</p> <p><a href="proxy.php?url=https://github.com/XingangPan/DragGAN" rel="noopener noreferrer">GitHub - XingangPan/DragGAN: Official Code for DragGAN (SIGGRAPH 2023)</a></p> <ul> <li><p>Open your Google Colab account using the link below.<br> <a href="proxy.php?url=https://colab.research.google.com/" rel="noopener noreferrer">Google Colaboratory | colab.research.google.com</a></p></li> <li><p>Click on the New notebook link to create a new notebook in Colab.<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkbzpudx6g6ed230jvn4.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkbzpudx6g6ed230jvn4.png" alt="New notebook" width="800" height="585"></a></p></li> <li><p>Clone the official DragGAN Git repository using the following command.<br> </p></li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!git clone https://github.com/XingangPan/DragGAN.git </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnnwelhw85rfdl4c38u2b.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnnwelhw85rfdl4c38u2b.png" alt="git clone" width="800" height="106"></a></p> <ul> <li>Click on the
play button to execute the cell.</li> <li>Switch the runtime type to GPU from the Runtime → Change runtime type option; otherwise it may take much longer to process the results.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupcj620hi7oy68pb13yz.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupcj620hi7oy68pb13yz.png" alt="Switch the runtime" width="640" height="536"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa3v9rakggl3edrd718lu.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa3v9rakggl3edrd718lu.png" alt="Switch the runtime" width="720" height="429"></a></p> <ul> <li>Click on the + Code button to add new cells.</li> <li>Switch to the DragGAN directory using the cd command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>cd /content/DragGAN </code></pre> </div> <ul> <li>Install the requirements from the requirements.txt file. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!pip install -r requirements.txt </code></pre> </div> <ul> <li>Download the pre-trained StyleGAN2 weights by executing the <code>download_model.sh</code> shell script using the command below.
If you want to try StyleGAN-Human and the Landscapes HQ (LHQ) dataset, download the weights from the following links: <a href="proxy.php?url=https://drive.google.com/file/d/1dlFEHbu-WzQWJl7nBBZYcTyo000H9hVm/view?usp=sharing" rel="noopener noreferrer">StyleGAN-Human</a>, <a href="proxy.php?url=https://drive.google.com/file/d/16twEf0T9QINAEoMsWefoWiyhcTd-aiWc/view?usp=sharing" rel="noopener noreferrer">LHQ</a>. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!sh scripts/download_model.sh </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctmkqfccis71oxe9f1ny.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctmkqfccis71oxe9f1ny.png" alt="download_model" width="800" height="114"></a></p> <ul> <li>Run the DragGAN visualizer, built with Gradio, using the following command. The system will provide a network URL once the visualizer is up and running. Click on that URL to try DragGAN.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!python /content/DragGAN/visualizer_drag_gradio.py </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3xhjd925hl9iqv02ysfk.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3xhjd925hl9iqv02ysfk.png" alt="visualizer_drag_gradio" width="800" height="160"></a></p> <p>You will get the output as below,</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyerkwu1btppa2ssesk9b.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyerkwu1btppa2ssesk9b.png" alt="output" width="800" height="424"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p>GitHub - codemaker2015/DragGAN-demo<br> Contribute to codemaker2015/DragGAN-demo development by creating an account on GitHub.<br> github.com</p> <p>The article is also available on <a href="proxy.php?url=https://codemaker2016.medium.com/say-hello-to-draggan-the-cutting-edge-ai-tool-now-available-e7be7ad2d635" rel="noopener noreferrer">Medium</a>.</p> <p>Here are some useful links,</p> <ul> <li><a 
href="proxy.php?url=https://www.unite.ai/what-is-a-generative-adversarial-network-gan/" rel="noopener noreferrer">What is a Generative Adversarial Network (GAN)? - Unite.AI</a></li> <li><a href="proxy.php?url=https://arxiv.org/pdf/2305.10973.pdf" rel="noopener noreferrer">https://arxiv.org/pdf/2305.10973.pdf</a></li> <li><a href="proxy.php?url=https://paperswithcode.com/method/stylegan" rel="noopener noreferrer">Papers with Code - StyleGAN Explained</a></li> <li><a href="proxy.php?url=https://github.com/NVlabs/stylegan3" rel="noopener noreferrer">GitHub - NVlabs/stylegan3: Official PyTorch implementation of StyleGAN3</a></li> </ul> ai python generativeai beginners