DEV Community: Vishnu Sivan The latest articles on DEV Community by Vishnu Sivan (@codemaker2015). https://dev.to/codemaker2015 https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F846070%2F2bde4dc9-8fd1-4357-906a-c09d9ae3787a.jpeg DEV Community: Vishnu Sivan https://dev.to/codemaker2015 en Introducing uv: Next-Gen Python Package Manager Vishnu Sivan Mon, 16 Dec 2024 16:08:23 +0000 https://dev.to/codemaker2015/introducing-uv-next-gen-python-package-manager-4jbc https://dev.to/codemaker2015/introducing-uv-next-gen-python-package-manager-4jbc <p>Python's evolution has been closely tied to advancements in package management, from manual installations to tools like pip and poetry. However, as projects grow in complexity, traditional tools often fall short in speed and efficiency.</p> <p>uv, a cutting-edge Python package and project manager written in Rust, aims to change that. Combining the functionality of tools like pip, poetry, and virtualenv, uv streamlines tasks like dependency management, script execution, and project building, all with exceptional performance. It is also seamlessly compatible with pip commands, so there is no additional learning curve.</p> <p>In this tutorial, we will explore how to install uv and make the most of its features, from setting up a project and managing dependencies to running scripts and leveraging its enhanced pip interface.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>pip limitations</li> <li>What is uv</li> <li>Key features of uv</li> <li>Benchmarks</li> <li>Installation</li> <li>Creating virtual environments</li> <li>Building a Flask app using uv</li> <li>Installing Python with uv</li> <li>Tools</li> <li>Cheatsheet</li> <li>Current Limitations</li> </ul> <h3> pip limitations </h3> <p>Pip is a widely used package management system written in Python, designed to install and manage software packages. 
However, despite its popularity, it is often criticized for being one of the slowest package management tools for Python. Complaints about “pip install being slow” are so common that they frequently appear in developer forums and threads.</p> <p>One significant drawback of pip is its susceptibility to dependency smells, which occur when dependency configuration files are poorly written or maintained. These issues can lead to serious consequences, such as increased complexity and reduced maintainability of projects.</p> <p>Another limitation of pip is its inability to consistently match Python code accurately when restoring runtime environments. This mismatch can result in a low success rate for dependency inference, making it challenging to reliably recreate project environments.</p> <h3> What is uv </h3> <p>uv is a modern, high-performance Python package manager, developed by the creators of ruff and written in Rust. Designed as a drop-in replacement for pip and pip-tools, it delivers exceptional speed and compatibility with existing tools.</p> <p>Key features include support for editable installs, Git and URL dependencies, constraint files, custom indexes, and more. uv’s standards-compliant virtual environments work seamlessly with other tools, avoiding lock-in or customization. 
It is cross-platform, supporting Linux, Windows, and macOS, and has been tested extensively against the PyPI index.</p> <p>Focusing on simplicity, speed, and reliability, uv addresses common developer pain points like slow installations, version conflicts, and complex dependency management, offering an intuitive solution for modern Python development.</p> <h4> Key features of uv </h4> <ul> <li> <strong>⚖️ Drop-in Replacement</strong>: Seamlessly replaces pip, pip-tools, virtualenv, and other tools with full compatibility.</li> <li> <strong>⚡ Blazing Speed</strong>: 10–100x faster than traditional tools like pip, pip-compile, and pip-sync.</li> <li> <strong>💾 Disk-Space Efficient</strong>: Utilizes a global cache for dependency deduplication, saving storage.</li> <li> <strong>🐍 Flexible Installation</strong>: Installable via curl, pip, or pipx without requiring Rust or Python.</li> <li> <strong>🧪 Thoroughly Tested</strong>: Proven performance at scale with the top 10,000 PyPI packages.</li> <li> <strong>🖥️ Cross-Platform Support</strong>: Fully compatible with macOS, Linux, and Windows.</li> <li> <strong>🔩 Advanced Dependency Management</strong>: Features include dependency version overrides, alternative resolution strategies, and a conflict-tracking resolver.</li> <li> <strong>⁉️ Clear Error Messaging</strong>: Best-in-class error handling ensures developers can resolve conflicts efficiently.</li> <li> <strong>🤝 Modern Python Features</strong>: Supports editable installs, Git dependencies, direct URLs, local dependencies, constraint files, and more.</li> <li> <strong>🚀 Unified Tooling</strong>: Combines the functionality of tools like pip, pipx, poetry, pyenv, twine, and more into a single solution.</li> <li> <strong>🛠️ Application and Script Management</strong>: Installs and manages Python versions, runs scripts with inline dependency metadata, and supports comprehensive project workflows.</li> <li> <strong>🗂️ Universal Lockfile</strong>: Simplifies project 
management with consistent and portable lockfiles.</li> <li> <strong>🏢 Workspace Support</strong>: Handles scalable projects with Cargo-style workspace management.</li> </ul> <h3> Benchmarks </h3> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fge55ay5lf3kkp9s1sj3e.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fge55ay5lf3kkp9s1sj3e.png" alt="benchmarks" width="800" height="135"></a><br> source: <a href="proxy.php?url=https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager" rel="noopener noreferrer">https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager</a><br> Resolving (left) and installing (right) dependencies using a warm cache, simulating the process of recreating a virtual environment or adding a new dependency to an existing project.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focbnw85t7w6cryp22zcx.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focbnw85t7w6cryp22zcx.png" alt="benchmarks" width="800" height="117"></a><br> source: <a href="proxy.php?url=https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager" rel="noopener noreferrer">https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager</a><br> Resolving (left) and installing (right) dependencies with a cold cache simulate execution in a clean environment. 
Without caching, uv is 8–10x faster than pip and pip-tools, and with a warm cache, it achieves speeds 80–115x faster.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F44bv9t4utb8i4c5y6ykn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F44bv9t4utb8i4c5y6ykn.png" alt="benchmarks" width="800" height="115"></a><br> source: <a href="proxy.php?url=https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager" rel="noopener noreferrer">https://blog.kusho.ai/uv-pip-killer-or-yet-another-package-manager</a><br> Creating a virtual environment with (left) and without (right) seed packages like pip and setuptools. uv is approximately 80x faster than python -m venv and 7x faster than virtualenv, all while operating independently of Python.</p> <h3> Installation </h3> <p>Installing uv is quick and straightforward. You can opt for standalone installers or install it directly from PyPI.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># On macOS and Linux.
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows.
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# With pip.
pip install uv

# With pipx.
pipx install uv

# With Homebrew.
brew install uv

# With Pacman.
pacman -S uv </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifk2oz5geyofxlnzdm1c.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifk2oz5geyofxlnzdm1c.png" alt="Installation" width="800" height="161"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspl8faprxki0p1idsra3.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspl8faprxki0p1idsra3.png" alt="Installation" width="640" height="187"></a></p> <p>Before using uv, add uv's install directory to the PATH environment variable.<br> On Linux and macOS, update PATH with the following command in the terminal:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>export PATH="$HOME/.local/bin:$PATH" </code></pre> </div> <p>On Windows, to add the directory to the Path variable for either the current user or the whole system, search for Environment Variables in the search panel. 
Under User variables / System variables, select the Path variable, click Edit, then click New and add the desired path.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>%USERPROFILE%\.local\bin </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghjbgrbw2k90t4u3n1vl.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghjbgrbw2k90t4u3n1vl.png" alt="env" width="800" height="289"></a><br> After the installation, run the uv command in the terminal to verify that it has been installed correctly.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fan9wqwnjms6zfay017kk.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fan9wqwnjms6zfay017kk.png" alt="Installation" width="800" height="329"></a></p> <h3> Creating virtual environments </h3> <p>Creating a virtual environment with uv is simple and straightforward. Use the following command to create one; by default the environment is created in .venv, and you can pass a custom name instead (for example, uv venv myenv).<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv venv </code></pre> </div> <ul> <li>Run the following command to activate the virtual environment. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># On macOS and Linux.
source .venv/bin/activate

# On Windows.
.venv\Scripts\activate </code></pre> </div> <h3> Installing packages </h3> <p>Installing packages into the virtual environment follows a familiar process. The various installation methods are given below.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv pip install flask                # Install Flask.
uv pip install -r requirements.txt  # Install from a requirements.txt file.
uv pip install -e .                 # Install the current project in editable mode.
uv pip install "package @ ."        # Install the current project from disk.
uv pip install "flask[dotenv]"      # Install Flask with the "dotenv" extra.
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1f0aceyovlioilgq9es.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi1f0aceyovlioilgq9es.png" alt="Installation" width="800" height="344"></a></p> <p>To synchronize the locked dependencies with the virtual environment, use the following command:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv pip sync requirements.txt  # Install dependencies from a requirements.txt file.
</code></pre> </div> <p>uv supports a variety of command-line arguments similar to those of existing tools, including -r requirements.txt, -c constraints.txt, -e ., --index-url, and more.</p> <h3> Building a Flask app using uv </h3> <p>Let’s explore some project-related commands with uv. Start by initializing a Python project named “sample-project.”<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv init sample-project </code></pre> </div> <p>Navigate to the sample-project directory. 
uv initializes the project with essential files such as hello.py, pyproject.toml, README.md, and more.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4p6uiqbuprpo3xtzu25.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4p6uiqbuprpo3xtzu25.png" alt="Project structure" width="800" height="349"></a></p> <p>Use the run command to execute the sample Python file. This process first creates the virtual environment folder and then runs the Python file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv run hello.py </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxdzpstj1nkjwsj1upmx.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxdzpstj1nkjwsj1upmx.png" alt="project1" width="800" height="234"></a></p> <h3> Install Flask </h3> <p>Add Flask to your project dependencies.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv add flask </code></pre> </div> <h3> Create the Flask Application </h3> <p>Create a new file named app.py and add the following code.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from flask import Flask

app = Flask(__name__)

@app.route('/', methods=['GET'])
def hello_world():
    return {"message": "Hello, World!"}, 200

if __name__ == '__main__':
    app.run(debug=True) </code></pre> </div> <h3> Run the app </h3> <p>Use the uv run command to execute the application.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv run app.py </code></pre> </div> <p>Open a browser or use a tool like curl or Postman to send a GET request.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2332mt4arfsc7i17c4j.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2332mt4arfsc7i17c4j.png" alt="Output1" width="640" height="443"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w5uc7o8uo5c8hlpbl1h.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3w5uc7o8uo5c8hlpbl1h.png" alt="Output2" width="537" height="238"></a></p> <h3> Installing Python with uv </h3> <p>Using uv to install Python is optional, as it works seamlessly with existing Python installations. 
However, if installing Python through uv is preferred, it can be done with a straightforward command:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv python install 3.12 </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flo8me1u4uo4xyc2xc5vg.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flo8me1u4uo4xyc2xc5vg.png" alt="python installation" width="800" height="136"></a></p> <p>This approach is often more convenient and reliable compared to traditional methods, as it avoids the need for managing repositories or downloading installers. Simply execute the command, and the setup is ready to use.</p> <h3> Tools </h3> <p>CLI tools can be installed and used with the uv command. For example, the huggingface_hub tools can be installed to enable pulling and pushing files to Hugging Face repositories.</p> <ul> <li>Use the following command to install huggingface_hub using uv. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv tool install huggingface_hub </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3upnqo2qmk72nmey98k.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3upnqo2qmk72nmey98k.png" alt="tool" width="800" height="431"></a></p> <ul> <li>The following command displays all the installed tools: </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>uv tool list </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu22ezef181wr9ubuadnb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu22ezef181wr9ubuadnb.png" alt="tool" width="800" height="136"></a></p> <h3> Cheatsheet </h3> <p>Here is a quick cheatsheet for performing common operations with uv:</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpbq7ebbofc6jvfraiqb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpbq7ebbofc6jvfraiqb.png" alt="Cheatsheet" width="800" 
height="749"></a></p> <h3> Current Limitations </h3> <p>Even though uv offers a fast and efficient solution for Python package management, it has some limitations:</p> <ul> <li> <strong>Incomplete pip Compatibility</strong>: Although uv supports a substantial portion of the pip interface, it does not yet cover the entire feature set. Some of these differences are intentional design choices, while others stem from uv still being in its early stages of development. For a detailed comparison, consult the pip compatibility guide.</li> <li> <strong>Platform-Specific requirements.txt</strong>: Like pip-compile, uv generates platform-specific requirements.txt files. This contrasts with tools such as Poetry and PDM, which create platform-agnostic poetry.lock and pdm.lock files. Consequently, uv's requirements.txt files may lack portability across different platforms and Python versions.</li> </ul> <p>Thanks for reading this article !!</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <h3> Resources </h3> <p><a href="proxy.php?url=https://docs.astral.sh/uv" rel="noopener noreferrer">uv - An extremely fast Python package and project manager, written in Rust | docs.astral.sh</a></p> python The ultimate guide to Retrieval-Augmented Generation (RAG) Vishnu Sivan Mon, 16 Dec 2024 09:56:09 +0000 https://dev.to/codemaker2015/the-ultimate-guide-to-retrieval-augmented-generation-rag-5e6e https://dev.to/codemaker2015/the-ultimate-guide-to-retrieval-augmented-generation-rag-5e6e <p>The rapid evolution of generative AI models like OpenAI’s ChatGPT has revolutionized natural language processing, enabling these systems to generate coherent and contextually relevant responses. However, even state-of-the-art models face limitations when tackling domain-specific queries or providing highly accurate information. 
This often leads to challenges like hallucinations — instances where models produce inaccurate or fabricated details.</p> <p>Retrieval-Augmented Generation (RAG) is an innovative framework designed to bridge this gap. By seamlessly integrating external data sources, RAG empowers generative models to retrieve real-time, niche information, significantly enhancing their accuracy and reliability.</p> <p>In this article, we will dive into the mechanics of RAG, explore its architecture, and discuss the limitations of traditional generative models that inspired its creation. We will also highlight practical implementations, advanced techniques, and evaluation methods, showcasing how RAG is transforming the way AI interacts with specialized data.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is RAG</li> <li>Architecture of RAG</li> <li>RAG Process flow</li> <li>RAG vs Fine tuning</li> <li>Types of RAG</li> <li>Applications of RAG</li> <li>Building a PDF chat system with RAG</li> <li>Resources</li> </ul> <h3> What is RAG </h3> <p>Retrieval-Augmented Generation (RAG) is an advanced framework that enhances the capabilities of generative AI models by integrating real-time retrieval of external data. While generative models excel at producing coherent, human-like text, they can falter when asked to provide accurate, up-to-date, or domain-specific information. This is where RAG steps in, ensuring that the responses are not only creative but also grounded in reliable and relevant sources.</p> <p>RAG operates by connecting a generative model with a retrieval mechanism, typically powered by vector databases or search systems. When a query is received, the retrieval component searches through vast external datasets to fetch relevant information. 
The generative model then synthesizes this data, producing an output that is both accurate and contextually insightful.</p> <p>By addressing key challenges like hallucinations and limited domain knowledge, RAG unlocks the potential of generative models to excel in specialized fields. Its applications span diverse industries, from automating customer support with precise answers to enabling researchers to access curated knowledge on demand. RAG represents a significant step forward in making AI systems more intelligent, trustworthy, and useful in real-world scenarios.</p> <h3> Architecture of RAG </h3> <p>A clear understanding of RAG architecture is essential for unlocking its full potential and benefits. At its core, the framework is built on two primary components: the Retriever and the Generator, working together in a seamless flow of information processing.</p> <p>This overall process is illustrated below:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwmugmgcs2ybhg1qm7ck.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwmugmgcs2ybhg1qm7ck.png" alt="architecture" width="800" height="400"></a><br> source: <a href="proxy.php?url=https://weaviate.io/blog/introduction-to-rag" rel="noopener noreferrer">https://weaviate.io/blog/introduction-to-rag</a></p> <ul> <li> <strong>Retrieval</strong> — The inference stage in RAG begins with retrieval, where data relevant to a user query is fetched from an external knowledge source. In a basic RAG setup, similarity search is commonly used, embedding the query and external data into the same vector space to identify the closest matches. 
The Retriever plays a key role in fetching documents, employing methods like Sparse Retrieval and Dense Retrieval. Sparse Retrieval, using techniques like TF-IDF or BM25, relies on exact word matches but struggles with synonyms and paraphrasing, whereas Dense Retrieval leverages transformer models like BERT or RoBERTa to create semantic vector representations, enabling more accurate and nuanced matches.</li> <li> <strong>Augmentation</strong> — After retrieving the most relevant data points from the external source, the augmentation process incorporates this information by embedding it into a predefined prompt template.</li> <li> <strong>Generation</strong> — In the generation phase, the model uses the augmented prompt to craft a coherent, contextually accurate response by combining its internal language understanding with the retrieved external data. While augmentation integrates external facts, generation transforms this enriched information into natural, human-like text tailored to the user’s query.</li> </ul> <h3> RAG Process flow </h3> <p>All the stages and essential components of the RAG process flow are illustrated in the figure below.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9fcmc02s1b9mg55jgs1b.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9fcmc02s1b9mg55jgs1b.png" alt="process flow" width="800" height="384"></a><br> source: <a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">https://www.griddynamics.com/blog/retrieval-augmented-generation-llm</a></p> <ul> <li> <strong>Document Loading</strong>: The first step in the RAG process involves data 
preparation, which includes loading documents from storage, extracting, parsing, cleaning, and formatting text for document splitting. Text data can come in various formats, such as plain text, PDFs, Word documents, CSV, JSON, HTML, Markdown, or programming code. Preparing these diverse sources for LLMs typically requires converting them to plain text through extraction, parsing, and cleaning.</li> <li> <strong>Document Splitting</strong>: Documents are divided into smaller, manageable segments through text splitting or chunking, which is essential for handling large documents and adhering to token limits in LLMs (e.g., GPT-3’s 2048 tokens). Strategies include fixed-size or content-aware chunking, with the approach depending on the structure and requirements of the data. <img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6s8w3g7fzxxy1o2x1x6z.png" alt="document chunking" width="461" height="371"> </li> </ul> <p>Dividing documents into smaller chunks may seem simple, but it requires careful consideration of semantics to avoid splitting sentences inappropriately, which can affect subsequent steps like question answering. A naive fixed-size chunking approach can result in incomplete information in each chunk. Most document segmentation algorithms use chunk size and overlap, where chunk size is determined by character, word, or token count, and overlaps ensure continuity by sharing text between adjacent chunks. This strategy preserves the semantic context across chunks.</p> <ul> <li> <strong>Text Embedding</strong>: The text chunks are transformed into vector embeddings, which represent each chunk in a way that allows for easy comparison of semantic similarity. Vector embeddings map complex data, like text, into a mathematical space where similar data points cluster together. 
This process captures the semantic meaning of the text, so sentences with similar meaning, even if worded differently, are mapped close together in the vector space. For instance, “The cat chases the mouse” and “The feline pursues the rodent” would be mapped to nearby points despite their different wording.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2d19vx4zvhp279au7u5t.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2d19vx4zvhp279au7u5t.png" alt="embedding" width="800" height="340"></a><br> source: <a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">https://www.griddynamics.com/blog/retrieval-augmented-generation-llm</a></p> <ul> <li> <strong>Vector store</strong>: After documents are segmented and converted into vector embeddings, they are stored in a vector store, a specialized database for storing and managing vectors. A vector store enables efficient searches for similar vectors, which is crucial for the execution of a RAG model. The selection of a vector store depends on factors like data scale and available computational resources.</li> </ul> <p>Some of the important vector databases are:</p> <ul> <li> <strong>FAISS</strong>: Developed by Facebook AI, FAISS efficiently manages large collections of high-dimensional vectors and performs similarity searches and clustering in high-dimensional environments. 
It optimizes memory usage and query duration, making it suitable for handling billions of vectors.</li> <li> <strong>Chroma</strong>: An open-source, in-memory vector database, Chroma is designed for LLM applications, offering a scalable platform for vector storage, search, and retrieval. It supports both cloud and on-premise deployment and is versatile in handling various data types and formats.</li> <li> <strong>Weaviate</strong>: An open-source vector database that can be self-hosted or fully managed. It focuses on high performance, scalability, and flexibility, supporting a wide range of data types and applications. It allows for the storage of both vectors and objects, enabling the combination of vector-based and keyword-based search techniques.</li> <li> <strong>Pinecone</strong>: A cloud-based, managed vector database designed to simplify the development and deployment of large-scale ML applications. Unlike many vector databases, Pinecone uses proprietary, closed-source code. It excels in handling high-dimensional vectors and is suitable for applications like similarity search, recommendation systems, personalization, and semantic search. Pinecone also features a single-stage filtering capability.</li> <li> <strong>Document retrieval</strong>: The retrieval process in information retrieval systems, such as document searching or question answering, begins when a query is received and transformed into a vector using the same embedding model as the document indexing. The goal is to return relevant document chunks by comparing the query vector with stored chunk vectors in the index (vector store). The retriever’s role is to identify and return the IDs of relevant document chunks, without storing documents. Various search methods can be used, such as similarity search (based on cosine similarity) and threshold-based retrieval, which only returns documents exceeding a certain similarity score. 
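To make the cosine-similarity and threshold-based retrieval just described concrete, here is a small self-contained sketch. The 4-dimensional vectors and chunk IDs are made up for illustration; real embeddings have hundreds of dimensions and would come from an embedding model:

```python
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity between two vectors: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hypothetical embeddings for three stored chunks and one query (toy 4-dim vectors).
chunk_vectors = {
    "chunk-1": [1.0, 0.0, 0.0, 0.0],
    "chunk-2": [0.9, 0.1, 0.0, 0.0],
    "chunk-3": [0.0, 0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0, 0.0]

# Threshold-based retrieval: keep only chunk IDs whose similarity clears a cutoff,
# ordered from most to least similar.
threshold = 0.8
scored = sorted(((cosine_sim(query, v), cid) for cid, v in chunk_vectors.items()), reverse=True)
relevant_ids = [cid for score, cid in scored if score >= threshold]
print(relevant_ids)  # → ['chunk-1', 'chunk-2']
```

Note that, as described above, the retriever returns only the IDs of relevant chunks; the chunks themselves live in the vector store.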
Additionally, LLM-aided retrieval is useful for queries involving both content and metadata filtering.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfke1j1792ik71c9h47m.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyfke1j1792ik71c9h47m.png" alt="vector database" width="800" height="310"></a><br> source: <a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">https://www.griddynamics.com/blog/retrieval-augmented-generation-llm</a></p> <ul> <li> <strong>Answer generation</strong>: In the retrieval process, relevant document chunks are combined with the user query to generate a context and prompt for the LLM. The simplest approach, called the “Stuff” method in LangChain, involves funneling all chunks into the same context window for a direct, straightforward answer. However, this method struggles with large document volumes and complex queries due to context window limitations. To address this, alternative methods like Map-reduce, Refine, and Map-rerank are used. Map-reduce sends documents separately to the LLM, then combines the responses. 
Refine iteratively updates the answer as each additional document is processed, while Map-rerank scores each document’s answer by relevance and returns the highest-ranked one, which is useful when several plausible answers exist.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hkyh11izk5ef812o5yy.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hkyh11izk5ef812o5yy.png" alt="answer generation" width="800" height="92"></a></p> <h3> RAG vs Fine tuning </h3> <p>RAG (Retrieval-Augmented Generation) and fine-tuning are two key methods to extend LLM capabilities, each suited to different scenarios. Fine-tuning involves retraining LLMs on domain-specific data to perform specialized tasks, ideal for static, narrow use cases like branding or creative writing that require a specific tone or style. However, it is costly, time-consuming, and unsuitable for dynamic, frequently updated data.</p> <p>On the other hand, RAG enhances LLMs by retrieving external data dynamically without modifying model weights, making it cost-effective and ideal for real-time, data-driven environments like legal, financial, or customer service applications. RAG enables LLMs to handle large, unstructured internal document corpora, offering significant advantages over traditional methods for navigating messy data repositories.</p> <p>Fine-tuning excels at creating nuanced, consistent outputs, whereas RAG provides up-to-date, accurate information by leveraging external knowledge bases.
In practice, RAG is often the preferred choice for applications requiring real-time, adaptable responses, especially in enterprises managing vast, unstructured data.</p> <h3> Types of RAG </h3> <p>There are several types of Retrieval-Augmented Generation (RAG) approaches, each tailored to specific use cases and objectives. The primary types include:</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrw207pb7ka8nik8nqdq.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrw207pb7ka8nik8nqdq.png" alt="rag types" width="800" height="995"></a><br> source: <a href="proxy.php?url=https://x.com/weaviate_io/status/1866528335884325070" rel="noopener noreferrer">https://x.com/weaviate_io/status/1866528335884325070</a></p> <ul> <li> <strong>Native RAG</strong>: Refers to a tightly integrated approach where the retrieval and generative components of a Retrieval-Augmented Generation system are designed to work seamlessly within the same architecture. Unlike traditional implementations that rely on external tools or APIs, native RAG optimizes the interaction between retrieval mechanisms and generative models, enabling faster processing and improved contextual relevance. This approach often uses in-memory processing or highly optimized local databases, reducing latency and resource overhead. Native RAG systems are typically tailored for specific use cases, providing enhanced efficiency, accuracy, and cost-effectiveness by eliminating dependencies on third-party services.</li> <li> <strong>Retrieve and Rerank RAG</strong>: Focuses on refining the retrieval process to improve accuracy and relevance. 
In this method, an initial set of documents or chunks is retrieved based on the query’s semantic similarity, usually determined by cosine similarity in the embedding space. Subsequently, a reranking model reorders the retrieved documents based on their contextual relevance to the query. This reranking step often leverages deep learning models or transformers, allowing more nuanced ranking beyond basic similarity metrics. By prioritizing the most relevant documents, this approach ensures the generative model receives contextually enriched input, significantly enhancing response quality.</li> <li> <strong>Multimodal RAG</strong>: Extends the traditional RAG paradigm by incorporating multiple data modalities, such as text, images, audio, or video, into the retrieval-augmented generation pipeline. It allows the system to retrieve and generate responses that integrate diverse forms of data. For instance, in a scenario involving image-based queries, the system might retrieve relevant images alongside textual content to create a more comprehensive answer. Multimodal RAG is particularly useful in domains like e-commerce, medical imaging, and multimedia content analysis, where insights often rely on a combination of textual and visual information.</li> <li> <strong>Graph RAG</strong>: Leverages graph-based data structures to model and retrieve information based on relationships and connections between entities. In this approach, knowledge is organized as a graph where nodes represent entities (e.g., concepts, documents, or objects), and edges capture their relationships (e.g., semantic, hierarchical, or temporal). Queries are processed to identify subgraphs or paths relevant to the input, and these subgraphs are then fed into the generative model. 
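As a toy illustration of the graph lookup just described, consider a knowledge graph stored as an adjacency map. The entities, relations, and the `retrieve_subgraph` helper below are all hypothetical, chosen only to show how a query entity expands into a relevant subgraph:

```python
# Hypothetical knowledge graph: entities as nodes, relations as adjacency lists.
graph = {
    "aspirin": ["pain relief", "blood thinning"],
    "ibuprofen": ["pain relief", "anti-inflammatory"],
    "pain relief": ["aspirin", "ibuprofen"],
}

def retrieve_subgraph(entity, depth=1):
    """Collect all nodes within `depth` hops of the query entity."""
    frontier, seen = {entity}, {entity}
    for _ in range(depth):
        # Expand the frontier one hop, skipping nodes we have already visited.
        frontier = {n for node in frontier for n in graph.get(node, [])} - seen
        seen |= frontier
    return seen

print(sorted(retrieve_subgraph("aspirin")))  # → ['aspirin', 'blood thinning', 'pain relief']
```

The returned subgraph (rather than isolated chunks) would then be serialized into the prompt, letting the generative model reason over relationships instead of flat text.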
This method is especially valuable in domains like scientific research, social networks, and knowledge management, where relational insights are critical.</li> <li> <strong>Hybrid RAG</strong>: Combines multiple retrieval techniques, such as dense and sparse retrieval, to enhance performance across diverse query types. Dense retrieval uses vector embeddings to capture semantic similarities, while sparse retrieval relies on keyword-based methods, like BM25, for precise matches. By integrating these methods, Hybrid RAG balances precision and recall, making it versatile across scenarios where queries may be highly specific or abstract. It is particularly effective in environments with heterogeneous data, ensuring that both high-level semantics and specific keywords are considered during retrieval.</li> <li> <strong>Agentic RAG (Router)</strong>: Employs a decision-making layer to dynamically route queries to appropriate retrieval and generative modules based on their characteristics. The router analyzes incoming queries to determine the optimal processing path, which may involve different retrieval methods, data sources, or even specialized generative models. This approach ensures that the system tailors its operations to the specific needs of each query, enhancing efficiency and accuracy in diverse applications.</li> <li> <strong>Agentic RAG (Multi-Agent RAG)</strong>: Multi-Agent RAG involves a collaborative framework where multiple specialized agents handle distinct aspects of the retrieval and generation process. Each agent is responsible for a specific task, such as retrieving data from a particular domain, reranking results, or generating responses in a specific style. These agents communicate and collaborate to deliver cohesive outputs. 
Multi-Agent RAG is particularly powerful for complex, multi-domain queries, as it enables the system to leverage the expertise of different agents to provide comprehensive and nuanced responses.</li> </ul> <h3> Applications of RAG </h3> <p>The Retrieval-Augmented Generation (RAG) framework has diverse applications across various industries due to its ability to dynamically integrate external knowledge into generative language models. Here are some prominent applications:</p> <ul> <li> <strong>Customer Support and Service</strong>: RAG systems are widely used in customer support to create intelligent chatbots capable of answering complex queries by retrieving relevant data from product manuals, knowledge bases, and company policy documents. This ensures that customers receive accurate and up-to-date information, enhancing their experience.</li> <li> <strong>Legal Document Analysis</strong>: In the legal field, RAG can parse, retrieve, and generate summaries or answers from vast corpora of case law, contracts, and legal documents. It is particularly useful for conducting legal research, drafting contracts, and ensuring compliance with regulations.</li> <li> <strong>Financial Analysis</strong>: RAG is employed in financial services to analyze earnings reports, market trends, and regulatory documents. By retrieving relevant financial data, it can help analysts generate insights, reports, or even real-time answers to queries about market performance.</li> <li> <strong>Healthcare and Medical Diagnostics</strong>: In healthcare, RAG is utilized to retrieve and synthesize information from medical literature, patient records, and treatment guidelines. 
It aids in diagnostic support, drug discovery, and personalized treatment recommendations, ensuring clinicians have access to the latest and most relevant data.</li> <li> <strong>Education and E-Learning</strong>: RAG-powered tools assist in personalized education by retrieving course material and generating tailored answers or study guides. They can enhance learning platforms by providing contextual explanations and dynamic content based on user queries.</li> <li> <strong>E-Commerce and Retail</strong>: In e-commerce, RAG systems improve product search and recommendation engines by retrieving data from catalogs and customer reviews. They also enable conversational shopping assistants that provide personalized product suggestions based on user preferences.</li> <li> <strong>Intelligent Virtual Assistants</strong>: RAG enhances virtual assistants like Alexa or Siri by providing accurate and contextually relevant responses, especially for queries requiring external knowledge, such as real-time weather updates or local business information.</li> </ul> <h2> Building a PDF chat system using RAG </h2> <p>In this section, we will develop a streamlit application capable of understanding the contents of a PDF and responding to user queries based on that content using the Retrieval-Augmented Generation (RAG). The implementation leverages the LangChain platform to facilitate interactions with LLMs and vector stores. We will utilize OpenAI’s LLM and its embedding models to construct a FAISS vector store for efficient information retrieval.</p> <h3> Installing dependencies </h3> <ul> <li>Create and activate a virtual environment by executing the following command. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate   # for ubuntu
venv/Scripts/activate      # for windows
</code></pre> </div> <ul> <li>Install langchain, langchain_community, openai, faiss-cpu, PyPDF2, streamlit, python-dotenv, tiktoken libraries using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install langchain langchain_community openai faiss-cpu PyPDF2 streamlit python-dotenv tiktoken
</code></pre> </div> <h3> Setting up environment and credentials </h3> <ul> <li>Create a file named <code>.env</code>. This file will store your environment variables, including the OpenAI key, model and embeddings.</li> <li>Open the .env file and add the following code to specify your OpenAI credentials: </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>OPENAI_API_KEY=sk-proj-xcQxBf5LslO62At...
OPENAI_MODEL_NAME=gpt-3.5-turbo
OPENAI_EMBEDDING_MODEL_NAME=text-embedding-3-small
</code></pre> </div> <h3> Importing environment variables </h3> <ul> <li>Create a file named <code>app.py</code>.</li> <li>Add OpenAI credentials to the environment variables.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from dotenv import load_dotenv
import os

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME")
OPENAI_EMBEDDING_MODEL_NAME = os.getenv("OPENAI_EMBEDDING_MODEL_NAME")
</code></pre> </div> <h3> Importing required libraries </h3> <p>Import the libraries needed to build the app and handle PDFs, such as LangChain, Streamlit, and PyPDF2.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_community.chat_models import ChatOpenAI
from htmlTemplates import bot_template, user_template, css
</code></pre> </div> <h3> Defining a function to extract text from PDFs </h3> <ul> <li>Use PyPDF2 to extract text from uploaded PDF files.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_pdf_text(pdf_files):
    text = ""
    for pdf_file in pdf_files:
        reader = PdfReader(pdf_file)
        for page in reader.pages:
            text += page.extract_text()
    return text
</code></pre> </div> <h3> Splitting extracted text into chunks </h3> <p>Divide large text into smaller, manageable chunks using LangChain’s <code>CharacterTextSplitter</code>.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_chunk_text(text):
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text)
    return chunks
</code></pre> </div> <h3> Creating a vector store for text embeddings </h3> <p>Generate embeddings for text chunks and store them in a vector database using FAISS.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_vector_store(text_chunks):
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, model=OPENAI_EMBEDDING_MODEL_NAME)
    vectorstore = FAISS.from_texts(texts=text_chunks, embedding=embeddings)
    return vectorstore
</code></pre> </div> <h3> Building a conversational retrieval chain </h3> <p>Define a chain that retrieves information from the vector store and interacts with the user via an LLM.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_conversation_chain(vector_store):
    llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model_name=OPENAI_MODEL_NAME, temperature=0)
    memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)
    system_template = """
    Use the following pieces of context and chat history to answer the
    question at the end. If you don't know the answer, just say that you
    don't know, don't try to make up an answer.

    Context: {context}

    Chat history: {chat_history}

    Question: {question}
    Helpful Answer:
    """
    prompt = PromptTemplate(
        template=system_template,
        input_variables=["context", "question", "chat_history"],
    )
    conversation_chain = ConversationalRetrievalChain.from_llm(
        verbose=True,
        llm=llm,
        retriever=vector_store.as_retriever(),
        memory=memory,
        combine_docs_chain_kwargs={"prompt": prompt}
    )
    return conversation_chain
</code></pre> </div> <h3> Handling user queries </h3> <p>Process user input, pass it to the conversation chain, and update the chat history.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def handle_user_input(question):
    try:
        response = st.session_state.conversation({'question': question})
        st.session_state.chat_history = response['chat_history']
    except Exception as e:
        st.error('Please select PDF and click on Process.')
</code></pre> </div> <h3> Creating custom HTML template for streamlit chat </h3> <p>To create a custom chat interface for both user and bot messages, design custom HTML templates and style them with CSS.</p> <ul> <li>Create a file named htmlTemplates.py and add the following code to it.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>css = '''
&lt;style&gt;
.chat-message {
    padding: 1rem; border-radius: 0.5rem; margin-bottom: 1rem; display: flex
}
.chat-message.user {
    background-color: #2b313e
}
.chat-message.bot {
    background-color: #475063
}
.chat-message .avatar {
    width: 10%;
}
.chat-message .avatar img {
    max-width: 30px; max-height: 30px; border-radius: 50%; object-fit: cover;
}
.chat-message .message {
    width: 90%; padding: 0 1rem; color: #fff;
}
&lt;/style&gt;
'''

bot_template = '''
&lt;div class="chat-message bot"&gt;
    &lt;div class="avatar"&gt;
        &lt;img src="https://cdn-icons-png.flaticon.com/128/773/773330.png"&gt;
    &lt;/div&gt;
    &lt;div class="message"&gt;{{MSG}}&lt;/div&gt;
&lt;/div&gt;
'''

user_template = '''
&lt;div class="chat-message user"&gt;
    &lt;div class="avatar"&gt;
        &lt;img src="https://cdn-icons-png.flaticon.com/128/6997/6997674.png"&gt;
    &lt;/div&gt;
    &lt;div class="message"&gt;{{MSG}}&lt;/div&gt;
&lt;/div&gt;
'''
</code></pre> </div> <h3> Displaying chat history </h3> <p>Show the user and AI conversation history in reverse order with HTML templates for formatting.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def display_chat_history():
    if st.session_state.chat_history:
        reversed_history = st.session_state.chat_history[::-1]
        formatted_history = []
        for i in range(0, len(reversed_history), 2):
            chat_pair = {
                "AIMessage": reversed_history[i].content,
                "HumanMessage": reversed_history[i + 1].content
            }
            formatted_history.append(chat_pair)
        for i, message in enumerate(formatted_history):
            st.write(user_template.replace("{{MSG}}", message['HumanMessage']), unsafe_allow_html=True)
            st.write(bot_template.replace("{{MSG}}", message['AIMessage']), unsafe_allow_html=True)
</code></pre> </div> <h3> Building Streamlit app interface </h3> <p>Set up the main app interface for file uploads, question input, and chat history display.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def main():
    st.set_page_config(page_title='Chat with PDFs', page_icon=':books:')
    st.write(css, unsafe_allow_html=True)

    if "conversation" not in st.session_state:
        st.session_state.conversation = None
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = None

    st.header('Chat with PDFs :books:')
    question = st.text_input("Ask anything to your PDF:")
    if question:
        handle_user_input(question)
    if st.session_state.chat_history is not None:
        display_chat_history()

    with st.sidebar:
        st.subheader("Upload your Documents Here: ")
        pdf_files = st.file_uploader("Choose your PDF Files and Press Process button", type=['pdf'], accept_multiple_files=True)
        if pdf_files and st.button("Process"):
            with st.spinner("Processing your PDFs..."):
                try:
                    # Get PDF Text
                    raw_text = get_pdf_text(pdf_files)
                    # Get Text Chunks
                    text_chunks = get_chunk_text(raw_text)
                    # Create Vector Store
                    vector_store = get_vector_store(text_chunks)
                    st.success("Your PDFs have been processed successfully. You can ask questions now.")
                    # Create conversation chain
                    st.session_state.conversation = get_conversation_chain(vector_store)
                except Exception as e:
                    st.error(f"An error occurred: {e}")

if __name__ == '__main__':
    main()
</code></pre> </div> <h3> Complete Code for the PDF Chat Application </h3> <p>The following is the complete code implementation for the PDF Chat Application.
It integrates environment variable setup, text extraction, vector storage, and RAG features into a streamlined solution:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from dotenv import load_dotenv
import os

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME")
OPENAI_EMBEDDING_MODEL_NAME = os.getenv("OPENAI_EMBEDDING_MODEL_NAME")

import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_community.chat_models import ChatOpenAI
from htmlTemplates import bot_template, user_template, css

def get_pdf_text(pdf_files):
    text = ""
    for pdf_file in pdf_files:
        reader = PdfReader(pdf_file)
        for page in reader.pages:
            text += page.extract_text()
    return text

def get_chunk_text(text):
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text)
    return chunks

def get_vector_store(text_chunks):
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, model=OPENAI_EMBEDDING_MODEL_NAME)
    vectorstore = FAISS.from_texts(texts=text_chunks, embedding=embeddings)
    return vectorstore

def get_conversation_chain(vector_store):
    llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model_name=OPENAI_MODEL_NAME, temperature=0)
    memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)
    system_template = """
    Use the following pieces of context and chat history to answer the
    question at the end. If you don't know the answer, just say that you
    don't know, don't try to make up an answer.

    Context: {context}

    Chat history: {chat_history}

    Question: {question}
    Helpful Answer:
    """
    prompt = PromptTemplate(
        template=system_template,
        input_variables=["context", "question", "chat_history"],
    )
    conversation_chain = ConversationalRetrievalChain.from_llm(
        verbose=True,
        llm=llm,
        retriever=vector_store.as_retriever(),
        memory=memory,
        combine_docs_chain_kwargs={"prompt": prompt}
    )
    return conversation_chain

def handle_user_input(question):
    try:
        response = st.session_state.conversation({'question': question})
        st.session_state.chat_history = response['chat_history']
    except Exception as e:
        st.error('Please select PDF and click on Process.')

def display_chat_history():
    if st.session_state.chat_history:
        reversed_history = st.session_state.chat_history[::-1]
        formatted_history = []
        for i in range(0, len(reversed_history), 2):
            chat_pair = {
                "AIMessage": reversed_history[i].content,
                "HumanMessage": reversed_history[i + 1].content
            }
            formatted_history.append(chat_pair)
        for i, message in enumerate(formatted_history):
            st.write(user_template.replace("{{MSG}}", message['HumanMessage']), unsafe_allow_html=True)
            st.write(bot_template.replace("{{MSG}}", message['AIMessage']), unsafe_allow_html=True)

def main():
    st.set_page_config(page_title='Chat with PDFs', page_icon=':books:')
    st.write(css, unsafe_allow_html=True)

    if "conversation" not in st.session_state:
        st.session_state.conversation = None
    if "chat_history" not in st.session_state:
        st.session_state.chat_history = None

    st.header('Chat with PDFs :books:')
    question = st.text_input("Ask anything to your PDF:")
    if question:
        handle_user_input(question)
    if st.session_state.chat_history is not None:
        display_chat_history()

    with st.sidebar:
        st.subheader("Upload your Documents Here: ")
        pdf_files = st.file_uploader("Choose your PDF Files and Press Process button", type=['pdf'], accept_multiple_files=True)
        if pdf_files and st.button("Process"):
            with st.spinner("Processing your PDFs..."):
                try:
                    # Get PDF Text
                    raw_text = get_pdf_text(pdf_files)
                    # Get Text Chunks
                    text_chunks = get_chunk_text(raw_text)
                    # Create Vector Store
                    vector_store = get_vector_store(text_chunks)
                    st.success("Your PDFs have been processed successfully. You can ask questions now.")
                    # Create conversation chain
                    st.session_state.conversation = get_conversation_chain(vector_store)
                except Exception as e:
                    st.error(f"An error occurred: {e}")

if __name__ == '__main__':
    main()
</code></pre> </div> <h3> Run the Application </h3> <p>Execute the app with Streamlit using the following command.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run app.py
</code></pre> </div> <p>You will get output as follows:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzl8ge22vzqwgrwzctieb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzl8ge22vzqwgrwzctieb.png" alt="output" width="800" height="450"></a></p> <p>Thanks for reading this article !!</p> <p>Thanks to Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/pdf-chat-using-RAG" rel="noopener noreferrer">codemaker2015/pdf-chat-using-RAG | github.com</a></p> <h3> Resources </h3> <ul> <li><a href="proxy.php?url=https://www.griddynamics.com/blog/retrieval-augmented-generation-llm" rel="noopener noreferrer">RAG and LLM business process automation: A technical strategy - Grid Dynamics | www.griddynamics.com</a></li> <li><a href="proxy.php?url=https://weaviate.io/blog/introduction-to-rag" rel="noopener
noreferrer">Introduction to Retrieval Augmented Generation (RAG) | Weaviate</a></li> <li><a href="proxy.php?url=https://gradientflow.com/techniques-challenges-and-future-of-augmented-language-models" rel="noopener noreferrer">Techniques, Challenges, and Future of Augmented Language Models - Gradient Flow</a></li> <li><a href="proxy.php?url=https://www.arcus.co/blog/rag-at-planet-scale" rel="noopener noreferrer">Retrieval Augmented Generation at Planet Scale | Arcus</a></li> </ul> python rag genai beginners Building a symptoms-based diagnosis system using all-MiniLM-L6-V2 Vishnu Sivan Mon, 16 Dec 2024 09:02:18 +0000 https://dev.to/codemaker2015/building-a-symptoms-based-diagnosis-system-using-all-minilm-l6-v2-2efb https://dev.to/codemaker2015/building-a-symptoms-based-diagnosis-system-using-all-minilm-l6-v2-2efb <p>Small Language Models (SLMs) are compact neural models designed for efficiency, balancing lightweight architecture with effective performance on tasks like sentiment analysis and embedding generation. MiniLM, developed by Microsoft, exemplifies this with its optimized speed and accuracy for natural language understanding while using minimal resources.
all-MiniLM-L6-v2 is a specialized version of MiniLM, fine-tuned for sentence embeddings.</p> <p>In this article, we will explore SLMs and demonstrate how to create a symptoms-based diagnosis system using all-MiniLM-L6-V2.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Small Language Model (SLM)</li> <li>What is all-MiniLM-L6-V2</li> <li>Experimenting with all-MiniLM-L6-V2</li> <li>Sentence similarity using all-MiniLM-L6-V2</li> <li>Building a symptoms-based diagnosis system</li> <li>Importing necessary libraries</li> <li>Importing dataset</li> <li>Initializing sentence transformers</li> <li>Finding conditions by symptoms</li> <li>Testing with sample input</li> <li>Resources</li> </ul> <h3> What is Small Language Model (SLM) </h3> <p>Small Language Models (SLMs) are lightweight versions of large language models (LLMs) designed to be computationally efficient while retaining robust language processing capabilities. Unlike LLMs, which require substantial hardware resources and often operate in cloud-based environments, SLMs can run on less powerful devices, making them suitable for edge applications or scenarios with limited resources.</p> <h4> Key Characteristics of SLMs: </h4> <ul> <li> <strong>Compact Size</strong>: SLMs have fewer parameters, making them smaller in storage and faster in inference time compared to their larger counterparts.</li> <li> <strong>Efficiency</strong>: Optimized for resource-constrained environments without significant loss of functionality for common tasks.</li> <li> <strong>Specific Use Cases</strong>: Often tailored for particular tasks, such as classification, summarization, or recommendation systems, to maximize efficiency and relevance.</li> <li> <strong>Transfer Learning</strong>: Many SLMs are pre-trained on large datasets and fine-tuned for specific tasks, similar to LLMs, ensuring task-specific performance.
</li> </ul> <h4> Examples of SLMs: </h4> <ul> <li> <strong>MiniLM</strong>: Known for its efficiency, MiniLM achieves near state-of-the-art performance in tasks like semantic similarity and text classification with fewer computational resources.</li> <li> <strong>DistilBERT</strong>: A smaller, faster, and cheaper variant of BERT, designed for general-purpose tasks while maintaining strong accuracy.</li> <li> <strong>TinyBERT</strong>: Focused on low-latency applications and mobile device compatibility.</li> <li> <strong>ALBERT</strong>: A lite version of BERT that achieves compactness through parameter sharing and factorization techniques.</li> </ul> <h4> Applications: </h4> <p>SLMs are widely used in:</p> <ul> <li>Mobile and embedded systems for on-device processing.</li> <li>Real-time applications, such as chatbots or recommendation systems.</li> <li>Domains where low latency and privacy are critical (e.g., healthcare or financial systems).</li> </ul> <h3> What is all-MiniLM-L6-V2 </h3> <p>MiniLM (Minimal Language Model) is a family of lightweight transformer-based models designed for natural language understanding and retrieval tasks. Developed by Microsoft Research, it focuses on achieving high performance similar to large models like BERT while being computationally efficient. MiniLM is particularly useful for scenarios requiring real-time processing or where resources are limited, such as mobile or edge devices.</p> <p>all-MiniLM-L6-v2 is a specialized version of MiniLM, fine-tuned for sentence embeddings. It is part of the Sentence Transformers library and is widely used for generating high-quality sentence embeddings in tasks requiring semantic textual similarity.</p> <h4> Key Characteristics: </h4> <ul> <li> <strong>Architecture</strong>: MiniLM-L6 refers to a 6-layer version of MiniLM. V2 signifies an updated and optimized version.</li> <li> <strong>Optimization</strong>: Fine-tuned on large-scale datasets for sentence similarity tasks.
Pre-trained on the MS MARCO dataset for information retrieval and question answering, ensuring strong semantic understanding.</li> <li> <strong>Output</strong>: Produces 384-dimensional sentence embeddings, balancing quality and efficiency.</li> <li> <strong>Applications</strong>: Semantic search, text clustering, question answering systems, recommendation engines.</li> </ul> <h3> Experimenting with all-MiniLM-L6-V2 </h3> <p>Let's get started exploring all-MiniLM-L6-V2 by installing the sentence-transformers library.</p> <h4> Installing dependencies </h4> <ul> <li>Create and activate a virtual environment by executing the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  #for ubuntu
venv/Scripts/activate     #for windows
</code></pre> </div> <ul> <li>Install the sentence-transformers and pandas libraries using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install -U sentence-transformers pandas
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau16szl3gpusqjee3cok.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau16szl3gpusqjee3cok.png" alt="Installation" width="800" height="221"></a></p> <h4> Sentence similarity using all-MiniLM-L6-V2 </h4> <p>Let’s create embeddings for an array of sentences and compute the similarities between them.</p> <ul> <li>Create a file named app.py and add the following code to it.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from sentence_transformers import SentenceTransformer, util

# Load the MiniLM model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Define an array of sentences
sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "A fast dark fox leaps across a sleepy canine.",
    "The weather is sunny and warm today.",
    "The forecast predicts a bright and hot day."
]

# Create embeddings for each sentence
embeddings = model.encode(sentences, convert_to_tensor=True)

# Calculate pairwise cosine similarity
similarity_matrix = util.cos_sim(embeddings, embeddings)

# Display the similarity scores
print("Sentence Similarity Scores:")
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        print(f"Similarity between \"{sentences[i]}\" and \"{sentences[j]}\": {similarity_matrix[i][j]:.4f}")
</code></pre> </div> <ul> <li>Run the code using the following command to see the output. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python app.py
</code></pre> </div> <p>The expected output is as follows:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgkygq9i0c9jzybsiap8.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgkygq9i0c9jzybsiap8.png" alt="Output" width="800" height="385"></a></p> <h3> Building a symptoms-based diagnosis system </h3> <p>A symptoms-based diagnosis system using all-MiniLM-L6-V2 converts medical text, such as symptoms or treatments, into embeddings that capture context.
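</p>
<p>The comparison used throughout this system is cosine similarity between embedding vectors. A minimal sketch of the metric itself, shown here in plain Python on hypothetical 3-dimensional toy vectors rather than real model output:</p>

```python
import math

def cos_sim(a, b):
    # Cosine similarity: dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing in the same direction score 1.0; orthogonal vectors score 0.0.
print(round(cos_sim([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 4))  # prints 1.0
print(round(cos_sim([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]), 4))  # prints 0.0
```

<p>The <code>util.cos_sim</code> helper from sentence-transformers computes the same quantity over whole batches of embeddings at once.</p>
<p>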
These embeddings enable effective comparison of symptoms, providing accurate condition or treatment recommendations and helping users discover relevant care options.</p> <h4> Importing necessary libraries </h4> <p>Import sentence-transformers to use the all-MiniLM-L6-v2 model and pandas for loading the dataset.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import pandas as pd
from sentence_transformers import SentenceTransformer, util

pd.set_option('display.max_columns', None)
</code></pre> </div> <h4> Importing dataset </h4> <p>Kaggle provides a dataset with information on symptoms and treatments for over 400 medical conditions.</p> <p><a href="proxy.php?url=https://www.kaggle.com/datasets/aadyasingh55/disease-and-symptoms" rel="noopener noreferrer">Disease and Symptoms | Explore Symptoms and Treatments for 400+ Medical Conditions! | www.kaggle.com</a></p> <p>This dataset is loaded into a Pandas DataFrame named df, and the first few entries are displayed to understand its structure and content.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>df = pd.read_csv('Diseases_Symptoms.csv')
print(df.head())
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7xpmk5sapnzof0byxhi.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7xpmk5sapnzof0byxhi.png" alt="Data" width="800" height="358"></a></p> <h4> Initializing sentence transformers </h4> <p>The Sentence Transformer model all-MiniLM-L6-v2 is initialized to convert the symptom descriptions in the dataset's <code>Symptoms</code> column into vector embeddings.
A new column, Symptom_Embedding, is added to the DataFrame to store the embeddings for each disease's symptoms.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>model = SentenceTransformer('all-MiniLM-L6-v2')

df['Symptom_Embedding'] = df['Symptoms'].apply(lambda x: model.encode(x))
</code></pre> </div> <h4> Finding conditions by symptoms </h4> <p>Define a function <code>find_condition_by_symptoms()</code> that identifies the best-matching medical condition based on user-provided symptoms. It generates an embedding for the input symptoms and calculates cosine similarity with pre-computed embeddings of diseases in the dataset. The similarity scores are stored in the Similarity column, and the condition with the highest score is identified as the best match using <code>.idxmax()</code>. The function then retrieves and returns the Name of the disease and its corresponding Treatments.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def find_condition_by_symptoms(input_symptoms):
    input_embedding = model.encode(input_symptoms)
    df['Similarity'] = df['Symptom_Embedding'].apply(lambda x: util.cos_sim(input_embedding, x).item())
    best_match = df.loc[df['Similarity'].idxmax()]
    return best_match['Name'], best_match['Treatments']
</code></pre> </div> <h4> Testing with sample input </h4> <p>Provide an example input for symptoms to pass to the <code>find_condition_by_symptoms()</code> function.
The function will return and print the name of the matching condition along with the recommended treatments.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>symptoms = "Fever, sore throat, and fatigue"

condition, treatments = find_condition_by_symptoms(symptoms)

print("Symptoms:", symptoms)
print("Condition:", condition)
print("Recommended Treatments:", treatments)
</code></pre> </div> <h3> Final code </h3> <p>Below is the complete code for the app.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import pandas as pd
from sentence_transformers import SentenceTransformer, util

pd.set_option('display.max_columns', None)

# Load the data
df = pd.read_csv('Diseases_Symptoms.csv')
# print(df.head())

# Initialize a Sentence Transformer model to generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

# Generate embeddings for each condition's symptoms
df['Symptom_Embedding'] = df['Symptoms'].apply(lambda x: model.encode(x))

# Function to find matching condition based on input symptoms
def find_condition_by_symptoms(input_symptoms):
    input_embedding = model.encode(input_symptoms)
    df['Similarity'] = df['Symptom_Embedding'].apply(lambda x: util.cos_sim(input_embedding, x).item())
    best_match = df.loc[df['Similarity'].idxmax()]
    return best_match['Name'], best_match['Treatments']

# Sample input and output
symptoms = "Fever, sore throat, and fatigue"
condition, treatments = find_condition_by_symptoms(symptoms)
print("Symptoms:", symptoms)
print("Condition:", condition)
print("Recommended Treatments:", treatments)
</code></pre> </div> <p>If you run the app, the expected output is as follows:<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fum3auswwek2d4x401y3b.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fum3auswwek2d4x401y3b.png" alt="Output" width="800" height="85"></a></p> <blockquote> <p>MiniLM-L6-V2 helps to improve healthcare accessibility and efficiency through symptom-based disease diagnosis. By generating embeddings for user-provided symptoms, the system can accurately identify conditions and offer treatment recommendations. However, challenges such as incomplete data, symptom variability, and data security need to be addressed to enhance accuracy and user experience.</p> </blockquote> <p>Thanks for reading this article !!</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/diagnosis-system-using-MiniLM" rel="noopener noreferrer">GitHub - codemaker2015/diagnosis-system-using-MiniLM: Building a symptoms-based diagnosis system : github.com</a></p> <h3> Resources </h3> <ul> <li><a href="proxy.php?url=https://sbert.net/docs/quickstart.html" rel="noopener noreferrer">Quickstart — Sentence Transformers documentation</a></li> <li><a href="proxy.php?url=https://www.kaggle.com/datasets/aadyasingh55/disease-and-symptoms" rel="noopener noreferrer">https://www.kaggle.com/datasets/aadyasingh55/disease-and-symptoms</a></li> <li><a href="proxy.php?url=https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" rel="noopener noreferrer">https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2</a></li> </ul> AISuite: Simplifying GenAI integration across multiple LLM providers Vishnu Sivan Mon, 16 Dec 2024 08:50:06 +0000 https://dev.to/codemaker2015/aisuite-simplifying-genai-integration-across-multiple-llm-providers-4hmm 
https://dev.to/codemaker2015/aisuite-simplifying-genai-integration-across-multiple-llm-providers-4hmm <p>Generative AI (Gen AI) is reshaping industries with its potential for creativity, problem-solving, and automation. However, developers often face significant challenges when integrating large language models (LLMs) from different providers due to fragmented APIs and configurations. This lack of interoperability complicates workflows, extends development timelines, and hampers the creation of effective Gen AI applications.</p> <p>To address this, Andrew Ng’s team has introduced AISuite, an open-source Python library that streamlines the integration of LLMs across providers like OpenAI, Anthropic, and Ollama. AISuite enables developers to switch between models with a simple “<code>provider:model</code>” string (e.g., openai:gpt-4o or anthropic:claude-3-5), eliminating the need for extensive code rewrites. By providing a unified interface, AISuite significantly reduces complexity, accelerates development, and opens new possibilities for building versatile Gen AI applications.</p> <p>In this article, we will explore how AISuite works, its practical applications, and its effectiveness in addressing the challenges of working with diverse LLMs.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is AISuite</li> <li>Why is AISuite important</li> <li>Experimenting with AISuite</li> <li>Creating a Chat Completion</li> <li>Creating a generic function for querying</li> </ul> <h3> What is AISuite </h3> <p>AISuite is an open-source Python library developed by Andrew Ng’s team to simplify the integration and management of large language models (LLMs) from multiple providers. 
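</p>
<p>The core idea, one call signature routed to different providers by a "provider:model" string, can be sketched in plain Python. This is an illustration of the routing pattern only, with hypothetical stub handlers, not AISuite's actual internals:</p>

```python
# Illustration of "provider:model" routing; the handlers are stand-in stubs,
# not real provider clients.
def openai_handler(model, messages):
    return f"[openai/{model}] would answer: {messages[-1]['content']}"

def anthropic_handler(model, messages):
    return f"[anthropic/{model}] would answer: {messages[-1]['content']}"

HANDLERS = {"openai": openai_handler, "anthropic": anthropic_handler}

def create(model, messages):
    # Split on the first colon only: model ids may themselves contain
    # colons (e.g. "ollama:llama3.1:8b").
    provider, model_id = model.split(":", 1)
    return HANDLERS[provider](model_id, messages)

msgs = [{"role": "user", "content": "Tell a joke in 1 line."}]
print(create("openai:gpt-4o", msgs))
print(create("anthropic:claude-3-5-sonnet-20241022", msgs))
```

<p>Swapping providers then comes down to editing one string, which is the convenience AISuite provides for real model back ends.</p>
<p>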
It abstracts the complexities of working with diverse APIs, configurations, and data formats, providing developers with a unified framework to streamline their workflows.</p> <h4> Key Features of AISuite: </h4> <ul> <li> <strong>Straightforward Interface</strong>: AISuite offers a simple and consistent interface for managing various LLMs. Developers can integrate models into their applications with just a few lines of code, significantly lowering the barriers to entry for Gen AI projects.</li> <li> <strong>Unified Framework</strong>: By abstracting the differences between multiple APIs, AISuite handles different types of requests and responses seamlessly. This reduces development overhead and accelerates prototyping and deployment.</li> <li> <strong>Easy Model Switching</strong>: With AISuite, switching between models is as easy as changing a single string in the code. For example, developers can specify a “provider:model” combination like openai:gpt-4o or anthropic:claude-3-5 without rewriting significant parts of their application.</li> <li> <strong>Extensibility</strong>: AISuite is designed to adapt to the evolving Gen AI landscape. Developers can add new models and providers as they become available, ensuring applications remain up-to-date with the latest AI capabilities.</li> </ul> <h3> Why is AISuite Important? </h3> <p>AISuite addresses a critical pain point in the Gen AI ecosystem: the lack of interoperability between LLMs from different providers. By providing a unified interface, it simplifies the development process, saving time and reducing costs. This flexibility allows teams to optimize performance by selecting the best model for specific tasks.</p> <p>Early benchmarks and community feedback highlight AISuite’s ability to reduce integration time for multi-model applications, improving developer efficiency and productivity. 
As the Gen AI ecosystem grows, AISuite lowers barriers for experimenting, building, and scaling AI-powered solutions.</p> <h2> Experimenting with AISuite </h2> <p>Let's get started exploring AISuite by installing the necessary dependencies.</p> <h3> Installing dependencies </h3> <ul> <li>Create and activate a virtual environment by executing the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  #for ubuntu
venv/Scripts/activate     #for windows
</code></pre> </div> <ul> <li>Install the aisuite, openai, and python-dotenv libraries using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install aisuite[all] openai python-dotenv
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjsxrmuun4v7226n39ju.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjsxrmuun4v7226n39ju.png" alt="installation" width="800" height="186"></a></p> <h3> Setting up environment and credentials </h3> <p>Create a file named <code>.env</code>. This file will store your environment variables, including your API keys.</p> <ul> <li>Open the .env file and add the following code to specify your OpenAI and Groq API keys: </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>OPENAI_API_KEY=sk-proj-7XyPjkdaG_gDl0_...
GROQ_API_KEY=gsk_8NIgj24k2P0J5RwrwoOBW...
</code></pre> </div> <ul> <li>Add API keys to the environment variables.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from getpass import getpass
from dotenv import load_dotenv

load_dotenv()

os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')
os.environ['ANTHROPIC_API_KEY'] = getpass('Enter your ANTHROPIC API key: ')
</code></pre> </div> <h3> Initialize the AISuite Client </h3> <p>Create an instance of the AISuite client, enabling standardized interaction with multiple LLMs.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import aisuite as ai

client = ai.Client()
</code></pre> </div> <h3> Defining the prompt </h3> <p>The prompt syntax closely resembles OpenAI’s structure, incorporating roles and content.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell a joke in 1 line."}
]
</code></pre> </div> <h3> Querying the model </h3> <p>Users can query the model using AISuite as follows.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># openai model
response = client.chat.completions.create(model="openai:gpt-4o", messages=messages, temperature=0.75)

# ollama model
response = client.chat.completions.create(model="ollama:llama3.1:8b", messages=messages, temperature=0.75)

# anthropic model
response = client.chat.completions.create(model="anthropic:claude-3-5-sonnet-20241022", messages=messages, temperature=0.75)

# groq model
response = client.chat.completions.create(model="groq:llama-3.2-3b-preview", messages=messages, temperature=0.75)

print(response.choices[0].message.content)
</code></pre> </div> <ul> <li> <strong>model="openai:gpt-4o"</strong>: Specifies the type and version of the model.</li> <li> <strong>messages=messages</strong>: Sends the previously defined prompt to the model.</li> <li> <strong>temperature=0.75</strong>: Adjusts the randomness of the response.
Higher values encourage creative outputs, while lower values produce more deterministic results.</li> <li> <strong>response.choices[0].message.content</strong>: Retrieves the text content from the model's response.</li> </ul> <h3> Creating a Chat Completion </h3> <p>Let's create a chat completion using the OpenAI model.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from dotenv import load_dotenv

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')

import aisuite as ai

client = ai.Client()

provider = "openai"
model_id = "gpt-4o"

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Provide an overview of the latest trends in AI"},
]

response = client.chat.completions.create(
    model=f"{provider}:{model_id}",
    messages=messages,
)

print(response.choices[0].message.content)
</code></pre> </div> <ul> <li>Run the app using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python app.py
</code></pre> </div> <p>You will get output as follows:</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fann9d0nooxqirdi7wj8w.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fann9d0nooxqirdi7wj8w.png" alt="output" width="800" height="263"></a></p> <h3> Creating a generic function for querying </h3> <p>Instead of writing separate code for calling different models, let’s create a generic function to eliminate code repetition and improve efficiency.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def ask(message, sys_message="You are a helpful assistant", model="openai:gpt-4o"):
    client = ai.Client()
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message}
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

print(ask("Provide an overview of the latest trends in AI"))
</code></pre> </div> <p>The ask function is a reusable utility designed for sending queries to an AI model. It accepts the following parameters:</p> <ul> <li> <strong>message</strong>: The user's query or prompt.</li> <li> <strong>sys_message</strong> (optional): A system-level instruction to guide the model's behavior.</li> <li> <strong>model</strong>: Specifies the AI model to be used.</li> </ul> <p>The function processes the input parameters, sends them to the specified model, and returns the AI’s response, making it a versatile tool for interacting with various models.</p> <p>Below is the complete code for interacting with the OpenAI model using the generic ask function.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from dotenv import load_dotenv

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')

import aisuite as ai

def ask(message, sys_message="You are a helpful assistant", model="openai:gpt-4o"):
    client = ai.Client()
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message}
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

print(ask("Provide an overview of the latest trends in AI"))
</code></pre> </div> <p>Running the code will produce the following output.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdd4z6xribum46i5mjgok.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdd4z6xribum46i5mjgok.png" alt="output" width="800" height="235"></a></p> <h3> Interacting with multiple APIs </h3> <p>Let’s explore interacting with multiple models using AISuite through the following code.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
from dotenv import load_dotenv

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY')
os.environ['GROQ_API_KEY'] = os.getenv('GROQ_API_KEY')

import aisuite as ai

def ask(message, sys_message="You are a helpful assistant", model="openai:gpt-4o"):
    client = ai.Client()
    messages = [
        {"role": "system", "content": sys_message},
        {"role": "user", "content": message}
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

print(ask("Who is your creator?"))
print(ask('Who is your creator?', model='ollama:qwen2:1.5b'))
print(ask('Who is your creator?', model='groq:llama-3.1-8b-instant'))
print(ask('Who is your creator?', model='anthropic:claude-3-5-sonnet-20241022'))
</code></pre> </div> <p>There may still be challenges when interacting with providers like Anthropic or Groq, but the AISuite team is actively addressing these issues to ensure seamless integration and functionality.</p> <p>AISuite is a powerful tool for navigating the landscape of large language models. It enables users to leverage the strengths of multiple AI providers while streamlining development and encouraging innovation.
With its open-source foundation and intuitive design, AISuite stands out as a cornerstone for modern AI application development.</p> <p>Thanks for reading this article !!</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/aisuite-examples" rel="noopener noreferrer">GitHub - codemaker2015/aisuite-examples : github.com</a></p> <h3> Resources </h3> <p><a href="proxy.php?url=https://github.com/andrewyng/aisuite" rel="noopener noreferrer">GitHub - andrewyng/aisuite: Simple, unified interface to multiple Generative AI providers : github.com</a></p> python genai beginners Building a video insights generator using Gemini Flash Vishnu Sivan Tue, 19 Nov 2024 03:32:41 +0000 https://dev.to/codemaker2015/building-a-video-insights-generator-using-gemini-flash-1aho https://dev.to/codemaker2015/building-a-video-insights-generator-using-gemini-flash-1aho <p>Video understanding or video insights are crucial across various industries and applications due to their multifaceted benefits. They enhance content analysis and management by automatically generating metadata, categorizing content, and making videos more searchable. Moreover, video insights provide critical data that drive decision-making, enhance user experiences, and improve operational efficiencies across diverse sectors.</p> <p>Google’s Gemini 1.5 model brings significant advancements to this field. Beyond its impressive improvements in language processing, this model can handle an enormous input context of up to 1 million tokens. To further its capabilities, Gemini 1.5 is trained as a multimodal model, natively processing text, images, audio, and video. 
This powerful combination of varied input types and extensive context size opens up new possibilities for processing long videos effectively.</p> <p>In this article, we will dive into how Gemini 1.5 can be leveraged for generating valuable video insights, transforming the way we understand and utilize video content across different domains.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Gemini 1.5</li> <li>Prerequisites</li> <li>Installing dependencies</li> <li>Setting up the Gemini API key</li> <li>Setting up the environment variables</li> <li>Importing the libraries</li> <li>Initializing the project</li> <li>Saving uploaded files</li> <li>Generating insights from videos</li> <li>Upload a video to the Files API</li> <li>Get File</li> <li>Response Generation</li> <li>Delete File</li> <li>Combining the stages</li> <li>Creating the interface</li> <li>Creating the streamlit app</li> </ul> <h2> What is Gemini 1.5 </h2> <p>Google’s Gemini 1.5 represents a significant leap forward in AI performance and efficiency. Building upon extensive research and engineering innovations, this model features a new Mixture-of-Experts (MoE) architecture, enhancing both training and serving efficiency. 
Available in public preview, Gemini 1.5 Pro and 1.5 Flash offer an impressive 1 million token context window through Google AI Studio and Vertex AI.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8doyqria9si9j6xz9m4.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8doyqria9si9j6xz9m4.png" alt="gemini" width="800" height="536"></a></p> <p>Google Gemini updates: Flash 1.5, Gemma 2 and Project Astra (blog.google)</p> <p>The 1.5 Flash model, the newest addition to the Gemini family, is the fastest and most optimized for high-volume, high-frequency tasks. It is designed for cost-efficiency and excels in applications such as summarization, chat, image and video captioning, and extracting data from extensive documents and tables. With these advancements, Gemini 1.5 sets a new standard for performance and versatility in AI models.</p> <h3> Prerequisites </h3> <ul> <li>Python 3.9+ (<a href="proxy.php?url=https://www.python.org/downloads" rel="noopener noreferrer">https://www.python.org/downloads</a>)</li> <li><a href="proxy.php?url=https://pypi.org/project/google-generativeai" rel="noopener noreferrer">google-generativeai</a></li> <li><a href="proxy.php?url=https://streamlit.io" rel="noopener noreferrer">streamlit</a></li> </ul> <h3> Installing dependencies </h3> <ul> <li>Create and activate a virtual environment by executing the following command.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  #for ubuntu
venv/Scripts/activate     #for windows
</code></pre> </div> <ul> <li>Install the google-generativeai, streamlit, and python-dotenv libraries using pip. Note that google-generativeai requires Python 3.9+ to work. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install google-generativeai streamlit python-dotenv
</code></pre> </div> <h3> Setting up the Gemini API key </h3> <p>To access the Gemini API and begin working with its functionalities, you can acquire a free Google API Key by registering with Google AI Studio. Google AI Studio, offered by Google, provides a user-friendly, visual-based interface for interacting with the Gemini API. Within Google AI Studio, you can seamlessly engage with Generative Models through its intuitive UI, and if desired, generate an API Token for enhanced control and customization.</p> <p>Follow the steps to generate a Gemini API key:</p> <ul> <li>To initiate the process, you can either click the link (<a href="proxy.php?url=https://aistudio.google.com/app" rel="noopener noreferrer">https://aistudio.google.com/app</a>) to be redirected to Google AI Studio or perform a quick search on Google to locate it.</li> <li>Accept the terms of service and click on continue.</li> <li>Click on the Get API key link in the sidebar, then click the Create API key in new project button to generate the key.</li> <li>Copy the generated API key.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0tbegeuj1oc9zpbl7sxh.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0tbegeuj1oc9zpbl7sxh.png" alt="ai studio" width="800" height="140"></a></p> <h3> Setting up the environment variables </h3> <p>Begin by creating a new folder for your project. Choose a name that reflects the purpose of your project.<br> Inside your new project folder, create a file named .env. This file will store your environment variables, including your Gemini API key.<br> Open the .env file and add the following code to specify your Gemini API key:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>GOOGLE_API_KEY=AIzaSy......
</code></pre> </div> <h3> Importing the libraries </h3> <p>To get started with your project and ensure you have all the necessary tools, you need to import several key libraries as follows.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import time

import google.generativeai as genai
import streamlit as st
from dotenv import load_dotenv
</code></pre> </div> <ul> <li> <code>google.generativeai as genai</code>: Imports the Google Generative AI library for interacting with the Gemini API.</li> <li> <code>streamlit as st</code>: Imports Streamlit for creating web apps.</li> <li> <code>from dotenv import load_dotenv</code>: Loads environment variables from a .env file.</li> </ul> <h3> Initializing the project </h3> <p>To set up your project, you need to configure the API key and create a directory for temporary file storage for uploaded files.</p> <p>Define the media folder and configure the Gemini API key by initializing the necessary settings.
Add the following code to your script:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>MEDIA_FOLDER = 'medias'

def __init__():
    # Create the media directory if it doesn't exist
    if not os.path.exists(MEDIA_FOLDER):
        os.makedirs(MEDIA_FOLDER)
    # Load environment variables from the .env file
    load_dotenv()
    # Retrieve the API key from the environment variables
    api_key = os.getenv("GEMINI_API_KEY")
    # Configure the Gemini API with your API key
    genai.configure(api_key=api_key)
</code></pre> </div> <h3> Saving uploaded files </h3> <p>To store uploaded files in the media folder and return their paths, define a method called <code>save_uploaded_file</code> and add the following code to it.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def save_uploaded_file(uploaded_file):
    """Save the uploaded file to the media folder and return the file path."""
    file_path = os.path.join(MEDIA_FOLDER, uploaded_file.name)
    with open(file_path, 'wb') as f:
        f.write(uploaded_file.read())
    return file_path
</code></pre> </div> <h3> Generating insights from videos </h3> <p>Generating insights from videos involves several crucial stages, including uploading, processing, and response generation.</p> <h4> 1. Upload a video to the Files API </h4> <p>The Gemini API directly accepts video file formats. The File API supports files up to 2GB in size and allows storage of up to 20GB per project. Uploaded files remain available for 2 days and cannot be downloaded from the API.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>video_file = genai.upload_file(path=video_path)
</code></pre> </div> <h4> 2. Get File </h4> <p>After uploading a file, you can verify that the API has successfully received it by using the files.get method. This method allows you to view the files uploaded to the File API that are associated with the Cloud project linked to your API key. 
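<p>The polling loop shown in the next step waits for the file to leave the <code>PROCESSING</code> state but has no upper bound. A generic wrapper with a timeout is sketched below; <code>wait_until_done</code> is a hypothetical helper, not part of the google-generativeai API, and the simulated <code>get_state</code> callable stands in for repeated <code>genai.get_file(...).state.name</code> lookups.</p>

```python
import time

def wait_until_done(get_state, interval_s=10, timeout_s=600):
    """Poll get_state() until it returns something other than "PROCESSING".

    Returns the final state string, or raises TimeoutError once timeout_s
    has elapsed. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    state = get_state()
    while state == "PROCESSING":
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still PROCESSING after {timeout_s}s")
        time.sleep(interval_s)
        state = get_state()
    return state

# Simulated poll: the "file" becomes ACTIVE on the third check.
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
final = wait_until_done(lambda: next(states), interval_s=0)
print(final)  # ACTIVE
```

<p>In the real app, <code>get_state</code> could wrap <code>genai.get_file(video_file.name).state.name</code> (re-fetching the file object on each pass, as the loop below does), so a stuck upload cannot hang the page forever.</p>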
Only the file name and the URI are unique identifiers.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import time

while video_file.state.name == "PROCESSING":
    print('Waiting for video to be processed.')
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

if video_file.state.name == "FAILED":
    raise ValueError(video_file.state.name)
</code></pre> </div> <h4> 3. Response Generation </h4> <p>After the video has been uploaded, you can make <code>GenerateContent</code> requests that reference the File API URI.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># Create the prompt.
prompt = "Describe the video. Provide insights from the video."

# Set the model to Gemini 1.5 Flash.
model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

# Make the LLM request.
print("Making LLM inference request...")
response = model.generate_content([prompt, video_file], request_options={"timeout": 600})
print(response.text)
</code></pre> </div> <h4> 4. Delete File </h4> <p>Files are automatically deleted after 2 days, or you can manually delete them using <code>files.delete()</code>.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>genai.delete_file(video_file.name)
</code></pre> </div> <h4> 5. Combining the stages </h4> <p>Create a method called <code>get_insights</code> and add the following code to it. 
Instead of <code>print()</code>, use Streamlit's <code>st.write()</code> method so the messages appear on the web page.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_insights(video_path):
    """Extract insights from the video using Gemini Flash."""
    st.write(f"Processing video: {video_path}")

    st.write("Uploading file...")
    video_file = genai.upload_file(path=video_path)
    st.write(f"Completed upload: {video_file.uri}")

    while video_file.state.name == "PROCESSING":
        st.write('Waiting for video to be processed.')
        time.sleep(10)
        video_file = genai.get_file(video_file.name)

    if video_file.state.name == "FAILED":
        raise ValueError(video_file.state.name)

    prompt = "Describe the video. Provide insights from the video."
    model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

    st.write("Making LLM inference request...")
    response = model.generate_content([prompt, video_file], request_options={"timeout": 600})

    st.write('Video processing complete')
    st.subheader("Insights")
    st.write(response.text)

    genai.delete_file(video_file.name)
</code></pre> </div> <h3> Creating the interface </h3> <p>To streamline the process of uploading videos and generating insights within a Streamlit app, you can create a method named app. 
This method will provide an upload button, display the uploaded video, and generate insights from it.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def app():
    st.title("Video Insights Generator")

    uploaded_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mov", "mkv"])
    if uploaded_file is not None:
        file_path = save_uploaded_file(uploaded_file)
        st.video(file_path)
        get_insights(file_path)
        if os.path.exists(file_path):
            # Optional: remove uploaded files from the temporary location
            os.remove(file_path)
</code></pre> </div> <h2> Creating the Streamlit app </h2> <p>To create a complete and functional Streamlit application that allows users to upload videos and generate insights using the Gemini 1.5 Flash model, combine all the components into a single file named app.py.</p> <p>Here is the final code:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import time

import google.generativeai as genai
import streamlit as st
from dotenv import load_dotenv

MEDIA_FOLDER = 'medias'

def __init__():
    if not os.path.exists(MEDIA_FOLDER):
        os.makedirs(MEDIA_FOLDER)
    load_dotenv()  # load all the environment variables
    api_key = os.getenv("GEMINI_API_KEY")
    genai.configure(api_key=api_key)

def save_uploaded_file(uploaded_file):
    """Save the uploaded file to the media folder and return the file path."""
    file_path = os.path.join(MEDIA_FOLDER, uploaded_file.name)
    with open(file_path, 'wb') as f:
        f.write(uploaded_file.read())
    return file_path

def get_insights(video_path):
    """Extract insights from the video using Gemini Flash."""
    st.write(f"Processing video: {video_path}")

    st.write("Uploading file...")
    video_file = genai.upload_file(path=video_path)
    st.write(f"Completed upload: {video_file.uri}")

    while video_file.state.name == "PROCESSING":
        st.write('Waiting for video to be processed.')
        time.sleep(10)
        video_file = genai.get_file(video_file.name)

    if video_file.state.name == "FAILED":
        raise ValueError(video_file.state.name)

    prompt = "Describe the video. Provide insights from the video."
    model = genai.GenerativeModel(model_name="models/gemini-1.5-flash")

    st.write("Making LLM inference request...")
    response = model.generate_content([prompt, video_file], request_options={"timeout": 600})

    st.write('Video processing complete')
    st.subheader("Insights")
    st.write(response.text)

    genai.delete_file(video_file.name)

def app():
    st.title("Video Insights Generator")

    uploaded_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mov", "mkv"])
    if uploaded_file is not None:
        file_path = save_uploaded_file(uploaded_file)
        st.video(file_path)
        get_insights(file_path)
        if os.path.exists(file_path):
            # Optional: remove uploaded files from the temporary location
            os.remove(file_path)

__init__()
app()
</code></pre> </div> <h3> Running the application </h3> <p>Run the following command to start the application.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run app.py
</code></pre> </div> <p>You can open the link provided in the console to see the output.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F224jnp2jptp9rv3802uf.gif" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F224jnp2jptp9rv3802uf.gif" alt="final output" width="800" height="400"></a></p> <p>Thanks for reading this article!</p> <p>If you enjoyed this article, please click on the heart button ♥ and share it to help others find it!</p> <p>The full source code for this tutorial can be found here:</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/video-insights-generator" rel="noopener 
noreferrer">GitHub - codemaker2015/video-insights-generator</a></p> ai python gemini beginners Build your own ChatGPT using Google Gemini API Vishnu Sivan Tue, 02 Jan 2024 17:39:26 +0000 https://dev.to/codemaker2015/build-your-own-chatgpt-using-google-gemini-api-51bh https://dev.to/codemaker2015/build-your-own-chatgpt-using-google-gemini-api-51bh <p>While the AI landscape has been dominated by the likes of OpenAI and its collaboration with Microsoft, Gemini emerges as a formidable force, boasting increased size and versatility. Designed to seamlessly handle text, images, audio, and video, these foundational models redefine the boundaries of AI interactions. As Google makes a resounding comeback in the AI arena, learn how Gemini is set to redefine the landscape of human-computer interaction, offering a glimpse into the future of AI-driven innovation.</p> <p>In this article, we will look into the process of obtaining a free Google API Key, installing necessary dependencies, and crafting code to build intelligent chatbots that transcend conventional text-based interactions. More than a chatbot tutorial, this article explores how Gemini’s built-in vision and multimodality approach enable it to interpret images and generate text based on visual input.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Gemini</li> <li>Creating a Gemini API key</li> <li>Installing dependencies</li> <li>Experimenting with Gemini APIs</li> <li>Configuring API Key</li> <li>Generating text responses</li> <li>Safeguarding the responses</li> <li>Configuring Hyperparameters</li> <li>Interacting with image inputs</li> <li>Interacting with chat version of Gemini LLM</li> <li>Integrating Langchain with Gemini</li> <li>Creating a ChatGPT Clone with Gemini API</li> </ul> <h2> What is Gemini </h2> <p>Gemini AI is a set of large language models (LLMs) created by Google AI, known for its cutting-edge advancements in multimodal understanding and processing. 
It’s essentially a powerful AI tool that can handle various tasks involving different types of data, not just text.</p> <h3> Features </h3> <ul> <li>Multimodal capabilities: Unlike most LLMs focused primarily on text, Gemini can seamlessly handle text, images, audio, and even code. It can understand and respond to prompts involving different data combinations. For instance, you could give it an image and ask it to describe what’s happening, or provide text instructions and have it generate an image based on them.</li> <li>Reason across different data types: This allows Gemini to grasp complex concepts and situations that involve multiple modalities. Imagine showing it a scientific diagram and asking it to explain the underlying process — its multimodal abilities come in handy here.</li> <li>Faster processing with TPUs: Gemini leverages Google’s custom-designed Tensor Processing Units (TPUs) for significantly faster processing compared to earlier LLM models.</li> </ul> <p>Gemini comes in three flavors:</p> <ul> <li>Ultra: The most powerful and capable model, ideal for tackling highly complex tasks like scientific reasoning or code generation.</li> <li>Pro: A well-rounded model suitable for various tasks, balancing performance and efficiency.</li> <li>Nano: The most lightweight and efficient model, perfect for on-device applications where computational resources are limited.</li> </ul> <h3> Creating a Gemini API key </h3> <p>To access the Gemini API and begin working with its functionalities, you can acquire a free Google API Key by registering with MakerSuite at Google. MakerSuite, offered by Google, provides a user-friendly, visual-based interface for interacting with the Gemini API. 
Within MakerSuite, you can seamlessly engage with Generative Models through its intuitive UI, and if desired, generate an API Token for enhanced control and customization.</p> <p>Follow the steps to generate a Gemini API key:</p> <ul> <li>To initiate the process, you can either click the link (<a href="proxy.php?url=https://makersuite.google.com" rel="noopener noreferrer">https://makersuite.google.com</a>) to be redirected to MakerSuite or perform a quick search on Google to locate it.</li> <li>Accept the terms of service and click on continue.</li> <li>Click on the Get API key link in the sidebar, then click the Create API key in new project button to generate the key.</li> <li>Copy the generated API key.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1k5mk9teda99cl4cjy0.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1k5mk9teda99cl4cjy0.png" alt="api key" width="640" height="337"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5p1fyqtiqy0022hvauwn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5p1fyqtiqy0022hvauwn.png" alt="api key" width="640" height="340"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzod7iu35q2e7y2vvx0bs.png" 
class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzod7iu35q2e7y2vvx0bs.png" alt="api key" width="640" height="340"></a></p> <h3> Installing dependencies </h3> <p>Begin the exploration by installing the necessary dependencies listed below:</p> <ul> <li>Create and activate the virtual environment by executing the following commands. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate  # for ubuntu
venv/Scripts/activate     # for windows
</code></pre> </div> <ul> <li>Install the dependencies using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install google-generativeai langchain-google-genai streamlit
</code></pre> </div> <ul> <li>The <code>google-generativeai</code> library, developed by Google, facilitates interaction with models such as PaLM and Gemini Pro.</li> <li>The <code>langchain-google-genai</code> library streamlines the process of working with various large language models, enabling the creation of applications with ease. In this instance, we are installing the langchain integration tailored to support the latest Google Gemini LLMs.</li> <li> <code>streamlit</code>: The framework used to craft a chat interface reminiscent of ChatGPT, seamlessly integrating Gemini and Streamlit.</li> </ul> <h2> Experimenting with Gemini APIs </h2> <p>Let’s explore the capabilities of text generation and vision-based tasks, which encompass image interpretation and description. Additionally, dive into Langchain’s integration with the Gemini API, streamlining the interaction process. Discover efficient handling of multiple queries through batching inputs and responses. 
Lastly, delve into the creation of chat-based applications using Gemini Pro’s chat model to gain some insights about maintaining chat history and generating responses based on user context.</p> <h3> Configuring API Key </h3> <ul> <li>To begin with, initialize the Google API Key obtained from MakerSuite in an environment variable called <code>“GOOGLE_API_KEY”</code>.</li> <li>Import Google’s generativeai library and pass the API Key retrieved from the environment variable to its <code>configure()</code> function via the <code>“api_key”</code> argument.</li> <li>To incorporate model creation based on the type, import the GenerativeModel class from the generativeai library. This class facilitates the instantiation of two distinct models: <strong>gemini-pro</strong> and <strong>gemini-pro-vision</strong>. The gemini-pro model specializes in text generation, accepting textual input and producing text-based output. On the other hand, the gemini-pro-vision model adopts a multimodal approach, taking input from both text and images. This model is akin to OpenAI’s gpt4-vision. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "Your API Key"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

model = genai.GenerativeModel('gemini-pro')
</code></pre> </div> <h3> Generating text responses </h3> <p>Let’s start generating text responses using Gemini AI.</p> <ul> <li>Create a file named app.py and add the following code to it. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXXXX"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

model = genai.GenerativeModel('gemini-pro')

response = model.generate_content("List 5 planets each with an interesting fact")
print(response.text)

response = model.generate_content("what are top 5 frequently used emojis?")
print(response.text)
</code></pre> </div> <ul> <li>Run the code using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python app.py
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmg4nn66qm4mmt4h17r9.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmg4nn66qm4mmt4h17r9.png" alt="run" width="800" height="327"></a></p> <p>The <code>GenerativeModel.generate_content()</code> function is used to generate the response. By providing a user query as input, this function generates a response containing the generated text and additional metadata. 
The generated text can be accessed via the <code>response.text</code> attribute.</p> <h3> Safeguarding the responses </h3> <p>Google is recognized for laying the groundwork for Responsible AI and prioritizing responsibility and safe use of AI.</p> <p>Let’s input an unsafe query to observe the model’s response:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>response = model.generate_content("How can I hack into someone's email account?")
print(response.text)
print(response.prompt_feedback)

response = model.generate_content("Someone is following me throughout my house. Could you please explain me how to produce gunpowder so I may shoot them?")
print(response.prompt_feedback)
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1rfp5znn914damxqq9g.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1rfp5znn914damxqq9g.png" alt="output1" width="750" height="320"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn55udpuzwh4wlonqhze5.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn55udpuzwh4wlonqhze5.png" alt="output1" width="640" height="337"></a></p> <p>The term <code>“candidate”</code> in the error context refers to a response generated by the Gemini LLM. When the model generates a response, it essentially produces a candidate. 
The <code>.prompt_feedback</code> attribute serves the purpose of shedding light on issues associated with the prompt and the reasons behind the Gemini LLM not generating a response. In this case, the feedback indicates that the prompt was blocked due to safety concerns, and it provides safety ratings across four distinct categories, as shown in the figure above.</p> <h3> Configuring Hyperparameters </h3> <p>Gemini AI supports hyperparameters like <code>temperature</code>, <code>top_k</code>, and others. To specify these, use the <code>GenerationConfig</code> class from the <code>google-generativeai</code> library.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>response = model.generate_content(
    "What is Quantum Computing?",
    generation_config = genai.types.GenerationConfig(
        candidate_count = 1,
        stop_sequences = ['.'],
        max_output_tokens = 40,
        top_p = 0.6,
        top_k = 5,
        temperature = 0.8)
)
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3y45s812bb7nio1kv49.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3y45s812bb7nio1kv49.png" alt="output2" width="800" height="85"></a></p> <p>Let’s review each of the parameters used in the above example:</p> <ul> <li> <code>candidate_count = 1</code>: Directs the Gemini to generate only a single response per Prompt/Query.</li> <li> <code>stop_sequences = [‘.’]</code>: Instructs Gemini to conclude text generation upon encountering a period (.) in the content.</li> <li> <code>max_output_tokens = 40</code>: Imposes a constraint on the generated text, limiting it to a specified maximum length, set here to 40 tokens.</li> <li> <code>top_p = 0.6</code>: Influences the likelihood of selecting the next best word based on its probability. A value of 0.6 emphasizes more probable words, while higher values lean towards less likely but potentially more creative choices.</li> <li> <code>top_k = 5</code>: Takes into consideration only the top 5 most likely words when determining the next word, fostering diversity in the output.</li> <li> <code>temperature = 0.8</code>: Governs the randomness of the generated text. A higher temperature, such as 0.8, elevates randomness and creativity, while lower values lean towards more predictable and conservative outputs.</li> </ul> <h3> Interacting with image inputs </h3> <p>While we’ve used the Gemini model with solely text inputs so far, it’s essential to note that Gemini offers a model named gemini-pro-vision. This particular model is equipped to handle both images and text inputs, generating text-based outputs.</p> <p>We use the PIL library to load the image located in the directory. Subsequently, we employ the gemini-pro-vision model, providing it with a list of inputs, including both the image and text, through the <code>GenerativeModel.generate_content()</code> function. It processes the input list, allowing the gemini-pro-vision model to generate the corresponding response.</p> <ul> <li>In the below code, we ask Gemini LLM to provide an explanation for the given picture. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXXXX"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

import PIL.Image

image = PIL.Image.open('assets/sample_image.jpg')
vision_model = genai.GenerativeModel('gemini-pro-vision')
response = vision_model.generate_content(["Explain the picture?",image])
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18tk5unfi1ii679m7v3h.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18tk5unfi1ii679m7v3h.png" alt="output" width="410" height="268"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcalx13se7cjco4f8wnzj.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcalx13se7cjco4f8wnzj.png" alt="output" width="800" height="134"></a></p> <ul> <li>In the below code, we ask Gemini LLM to generate a story from the given image. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>image = PIL.Image.open('assets/sample_image2.jpg')
vision_model = genai.GenerativeModel('gemini-pro-vision')
response = vision_model.generate_content(["Write a story from the picture",image])
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4bmltvsl2si8ln7cr6p.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4bmltvsl2si8ln7cr6p.png" alt="output ref" width="640" height="323"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdq4kqsm55mqf2bz6wzua.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdq4kqsm55mqf2bz6wzua.png" alt="output" width="720" height="320"></a></p> <ul> <li>In the below code, we ask Gemini Vision to count the objects in an image and provide the response in JSON format. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>image = PIL.Image.open('assets/sample_image3.jpg')
vision_model = genai.GenerativeModel('gemini-pro-vision')
response = vision_model.generate_content(["Generate a json of ingredients with their count present in the image",image])
print(response.text)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fleo0j906hy19y3ez9xsn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fleo0j906hy19y3ez9xsn.png" alt="output" width="640" height="793"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd447xhryfi69f77bp5k8.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd447xhryfi69f77bp5k8.png" alt="output" width="750" height="761"></a></p> <h2> Interacting with chat version of Gemini LLM </h2> <p>So far, we have explored the plain text generation model. Now, we will delve into the chat version of the model utilizing the same gemini-pro. 
Here, instead of the <code>GenerativeModel.generate_content()</code> function, the <code>GenerativeModel.start_chat()</code> function will be used.</p> <ul> <li>An empty list is provided as the history in the initiation of the chat.</li> <li>The <code>chat.send_message()</code> function is used to convey the chat message, and the generated chat response can be accessed via the response.text attribute. Additionally, Google offers the option to establish a chat with existing history. Let’s start our first conversation with Gemini LLM as below. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os
import google.generativeai as genai

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXXXX"
genai.configure(api_key = os.environ['GOOGLE_API_KEY'])

chat_model = genai.GenerativeModel('gemini-pro')
chat = chat_model.start_chat(history=[])

response = chat.send_message("Which is one of the best place to visit in India during summer?")
print(response.text)

response = chat.send_message("Tell me more about that place in 50 words")
print(response.text)

print(chat.history)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo0t03ffedhtoz478pny.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo0t03ffedhtoz478pny.png" alt="output" width="800" height="309"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwwpoo0er4zrfunnkexb.png" class="article-body-image-wrapper"><img 
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwwpoo0er4zrfunnkexb.png" alt="output" width="640" height="377"></a></p> <h2> Integrating Langchain with Gemini </h2> <p>Langchain has successfully integrated the Gemini Model into its ecosystem using the <code>ChatGoogleGenerativeAI</code> class. To initiate the process, an llm object is created by providing the desired Gemini model to the <code>ChatGoogleGenerativeAI</code> class. We call its invoke() function and pass the user input. The resulting response can be obtained by reading <code>response.content</code>.</p> <ul> <li>In the below code, we provide a general query to the model. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro")
response = llm.invoke("Explain Quantum Computing in 50 words?")
print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faic34ymq78orwne3nj86.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faic34ymq78orwne3nj86.png" alt="output" width="800" height="144"></a></p> <ul> <li>In the below code, we provide multiple inputs to the model and get a response for each query. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>batch_responses = llm.batch(
    [
        "Who is the Prime Minister of India?",
        "What is the capital of India?",
    ]
)
for response in batch_responses:
    print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f4opf02z8753jxwfnbc.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f4opf02z8753jxwfnbc.png" alt="output" width="800" height="110"></a></p> <ul> <li>In the below code, we provide both text and image inputs and expect the model to generate a text response based on the given inputs. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from langchain_core.messages import HumanMessage

llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")

message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Describe the image",
        },
        {
            "type": "image_url",
            "image_url": "https://picsum.photos/id/237/200/300"
        },
    ]
)
response = llm.invoke([message])
print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftumspy8lxkvryrmfitt.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftumspy8lxkvryrmfitt.png" alt="output ref" width="436" height="288"></a><br> <a 
href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpc1vgcs7ldd6dyiiaoc.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpc1vgcs7ldd6dyiiaoc.png" alt="output" width="800" height="146"></a></p> <p>The <code>HumanMessage</code> class from the <code>langchain_core</code> library is used to structure the content as a list of dictionaries with the properties <code>“type”</code>, <code>“text”</code> and <code>“image_url”</code>. The list is passed to the <code>llm.invoke()</code> function, and the response content is accessed through <code>response.content</code>.</p> <ul> <li>In the below code, we ask the model to find the differences between the given images. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>from langchain_core.messages import HumanMessage

llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")

message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Find the differences between the given images",
        },
        {
            "type": "image_url",
            "image_url": "https://picsum.photos/id/237/200/300"
        },
        {
            "type": "image_url",
            "image_url": "https://picsum.photos/id/219/5000/3333"
        }
    ]
)
response = llm.invoke([message])
print(response.content)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph23sh61n9hlp1j0x04h.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph23sh61n9hlp1j0x04h.png" 
alt="output ref" width="458" height="302"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6nzrckadbiq0zkfja0d.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6nzrckadbiq0zkfja0d.png" alt="output ref" width="454" height="303"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h5mjmzwzeadwkb3amrh.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h5mjmzwzeadwkb3amrh.png" alt="output" width="750" height="207"></a></p> <h2> Creating a ChatGPT Clone with Gemini API </h2> <p>Following numerous experiments with Google’s Gemini API, in this article we will construct a straightforward application akin to ChatGPT using Streamlit and Gemini.</p> <ul> <li>Create a file named gemini-bot.py and add the following code to it. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st
import os
import google.generativeai as genai

st.title("Gemini Bot")

os.environ['GOOGLE_API_KEY'] = "AIzaSyAjsDpD-XXXXXXXXXXXXX"
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])

# Select the model
model = genai.GenerativeModel('gemini-pro')

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = [
        {
            "role": "assistant",
            "content": "Ask me Anything"
        }
    ]

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Process and store Query and Response
def llm_function(query):
    response = model.generate_content(query)

    # Displaying the Assistant Message
    with st.chat_message("assistant"):
        st.markdown(response.text)

    # Storing the User Message
    st.session_state.messages.append(
        {
            "role": "user",
            "content": query
        }
    )

    # Storing the Assistant Message
    st.session_state.messages.append(
        {
            "role": "assistant",
            "content": response.text
        }
    )

# Accept user input
query = st.chat_input("What's up?")

# Calling the Function when Input is Provided
if query:
    # Displaying the User Message
    with st.chat_message("user"):
        st.markdown(query)
    llm_function(query)
</code></pre> </div> <ul> <li>Run the app by executing the following command. 
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run gemini-bot.py </code></pre> </div> <ul> <li>Open the link which is displayed on the terminal to access the application.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55hjzi0few6er4jn1uje.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55hjzi0few6er4jn1uje.png" alt="output final" width="800" height="418"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/gemini-api-experiments" rel="noopener noreferrer">GitHub - codemaker2015/gemini-api-experiments: Explore how Gemini's built-in vision and multimodality approach enable it to interpret images and generate text based…</a><br> github.com</p> <p>The article is also available on <a href="proxy.php?url=https://codemaker2016.medium.com/build-your-own-chatgpt-using-google-gemini-api-1b079f6a8415" rel="noopener noreferrer">Medium</a>.</p> <h3> Useful Links: </h3> <ul> <li><a href="proxy.php?url=https://youtu.be/zqsTX8iFVr4" rel="noopener noreferrer">https://youtu.be/zqsTX8iFVr4</a></li> <li>makersuite.google.com</li> <li> <a href="proxy.php?url=https://ai.google.dev/tutorials/rest_quickstart" rel="noopener noreferrer">Quickstart: Get started with Gemini using the REST API | Google AI for Developers</a> ai.google.dev</li> </ul> beginners tutorial python ai Goodbye databases, it’s time to embrace Vector 
Databases! Vishnu Sivan Mon, 18 Dec 2023 05:31:34 +0000 https://dev.to/codemaker2015/goodbye-databases-its-time-to-embrace-vector-databases-190l https://dev.to/codemaker2015/goodbye-databases-its-time-to-embrace-vector-databases-190l <p>The AI revolution is reshaping industries, promising remarkable innovations while introducing new challenges. In this transformative landscape, efficient data processing has become paramount for applications relying on large language models, generative AI, and semantic search. At the heart of these breakthroughs lie vector embeddings: intricate data representations infused with critical semantic information. These embeddings, generated by LLMs, encompass numerous attributes or features, rendering their management a complex task.</p> <p>In the realm of AI and machine learning, these features represent different dimensions of data that are essential for discerning patterns, relationships, and underlying structures. To address the unique demands of handling these embeddings, a specialized database is essential. Vector databases are purpose-built to provide optimized storage and querying capabilities for embeddings, bridging the gap between traditional databases and standalone vector indexes as well as empowering AI systems with the tools they need to excel in this data-intensive environment.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>Introduction to Vector Databases</li> <li>Vector Embeddings</li> <li>Vector Search</li> <li>Approximate Nearest Neighbor Approach (ANN)</li> <li>Vector database vs Relational database</li> <li>Working of Vector Databases</li> <li>Importance of Vector Databases</li> <li>Top 7 Vector Databases</li> <li>Use Cases of Vector Databases</li> </ul> <h2> Introduction to Vector Databases </h2> <p>A vector database is a specialized type of database that stores data in the form of multi-dimensional vectors, each representing specific characteristics or qualities. 
These vectors can have varying dimensions, from a few to thousands, depending on data complexity. Various techniques like machine learning models or feature extraction are used to convert data, including text, images, audio, and video, into these vectors.</p> <p>The key advantage of a vector database is its ability to efficiently and accurately retrieve data based on vector proximity or similarity. This enables searches based on semantic and contextual relevance rather than relying solely on exact matches or predefined criteria, as seen in traditional databases.</p> <h3> Vector Embeddings </h3> <p>AI and ML have revolutionized the representation of unstructured data by using vector embeddings. These are essentially lists of numbers that capture the semantic meaning of data objects. For instance, colors in the RGB system are represented by numbers indicating their red, green, and blue components.</p> <p>However, representing more complex data, like words or text, in meaningful numerical sequences is challenging. This is where ML models come into play. ML models can represent the meaning of words as vectors by learning the relationships between words in a vector space. These models are often called embedding models or vectorizers.</p> <p>Vector embeddings encode the semantic meaning of objects relative to one another. Similar objects are grouped closely in the vector space, meaning that the closer two objects are, the more similar they are.</p> <p>For example, consider word vectors. In this case, words like “Wolf” and “Dog” are close to each other because dogs are descendants of wolves. “Cat” is also similar because it shares similarities with “Dog” as both are animals and common pets. 
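</p> <p>As a toy illustration of this grouping, consider the cosine similarity between a few hand-made vectors. The values below are invented for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.</p>

```python
import math

# Toy 3-dimensional "embeddings". The values are invented for
# illustration; real embedding models produce hundreds or
# thousands of dimensions.
embeddings = {
    "wolf":   [0.9, 0.8, 0.1],
    "dog":    [0.85, 0.75, 0.2],
    "cat":    [0.8, 0.6, 0.25],
    "apple":  [0.1, 0.2, 0.9],
    "banana": [0.15, 0.1, 0.85],
}

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by
    # the product of their lengths (1.0 = identical direction).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["dog"], embeddings["wolf"]))   # high: same cluster
print(cosine_similarity(embeddings["dog"], embeddings["apple"]))  # low: different cluster
```

<p>A higher cosine similarity means the objects point in more similar directions in the vector space; here “dog” scores far higher against “wolf” than against “apple”.</p> <p>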
On the other hand, words representing fruits like “Apple” and “Banana” are further away from animal terms, forming a distinct cluster in the vector space.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ioysvqy20ggbdo93dlo.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ioysvqy20ggbdo93dlo.png" alt="Vector Embeddings" width="735" height="751"></a><br> Image credits A Gentle Introduction to Vector Databases | Weaviate — vector database</p> <h3> Vector Search </h3> <p>Vector embeddings enable us to perform vector search, similarity search, or semantic search by finding and retrieving similar objects within a vector database. These processes involve locating objects that are close to each other in the vector space.</p> <p>Just as we can find similar vectors for a specific object (e.g., a dog), we can also find similar vectors to a search query. For example, to discover words similar to “Kitten,” we generate a vector embedding for “Kitten” and retrieve all items that are close to the query vector, like the word “Cat.”</p> <p>The numerical representation of data objects empowers us to apply mathematical operations, such as calculating the distance between two vector embeddings, to determine their similarity. This makes vector embeddings a powerful tool for searching and comparing data objects based on their semantic meaning.</p> <h3> Approximate Nearest Neighbor Approach (ANN) </h3> <p>Vector indexing streamlines data retrieval by efficiently organizing vector embeddings. 
It employs an approximate nearest neighbor (ANN) approach to pre-calculate distances between vector embeddings, cluster similar vectors, and store them in proximity. While this approach sacrifices some accuracy for speed, it allows for faster retrieval of approximate results.</p> <p>For instance, in a vector database, you can pre-calculate clusters like “animals” and “fruits.” When querying the database for “Kitten,” the search begins with the nearest animals, avoiding distance calculations between fruits and non-animal objects. The ANN algorithm initiates the search within a relevant region, such as four-legged animals, maintaining proximity to relevant results due to pre-organized similarity.</p> <h3> Vector database vs Relational database </h3> <p>The primary difference between traditional relational databases and modern vector databases lies in their optimization for different types of data. Relational databases excel at handling structured data stored in columns, relying on keyword matches for search. In contrast, vector databases are well-suited for structured and unstructured data, including text, images, and audio, along with their vector embeddings, which enable efficient semantic search. 
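</p> <p>This difference can be sketched with a toy in-memory store (the rows and vectors below are invented for illustration; real systems use trained embedding models and optimized indexes): a keyword filter finds only literal matches, while a vector query ranks items by distance to the query embedding.</p>

```python
import math

# Toy in-memory store: each row keeps both the raw text (for keyword
# search) and an invented 2-d "embedding" (for vector search).
items = [
    {"text": "Questions about dogs",  "vec": [0.9, 0.1]},
    {"text": "Questions about cats",  "vec": [0.8, 0.2]},
    {"text": "Questions about fruit", "vec": [0.1, 0.9]},
]

def keyword_search(word):
    # Relational-style search: only literal matches are found.
    return [item["text"] for item in items if word in item["text"].lower()]

def vector_search(query_vec):
    # Vector-style search: return the item whose embedding is
    # closest (by Euclidean distance) to the query embedding.
    return min(items, key=lambda item: math.dist(query_vec, item["vec"]))["text"]

print(keyword_search("animals"))    # [] -- no row literally contains "animals"
print(vector_search([0.88, 0.12]))  # an "animals"-like query still finds the dog row
```

<p>The keyword query for “animals” returns nothing, while a query embedding near the animal cluster still retrieves a relevant row.</p> <p>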
Many vector databases store vector embeddings alongside the original data, providing the flexibility to perform both vector-based and traditional keyword searches.</p> <p>For instance, when searching for jeopardy questions that involve animals, a traditional database necessitates a complex query with specific animal names, while a vector database simplifies the search by allowing a query for the general concept of “animals”.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy2r6v4s9195asc0nrovf.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy2r6v4s9195asc0nrovf.png" alt="Vector Search" width="800" height="267"></a></p> <h2> Working of Vector Databases </h2> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl46ef8ekq7ham6tg8iff.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl46ef8ekq7ham6tg8iff.png" alt="Working of Vector Databases" width="800" height="306"></a><br> Image credits What is a Vector Database &amp; How Does it Work? 
Use Cases + Examples | Pinecone</p> <p>In the context of an application like ChatGPT, which deals with extensive data, the process involves:</p> <ul> <li>User inputs a query into the application.</li> <li>Content to be indexed is converted into vector embeddings using the embedding model.</li> <li>The vector embedding, along with a reference to the original content, is stored in the vector database.</li> <li>When the application issues a query, the embedding model generates embeddings for the query. These query embeddings are used to search the database for similar vector embeddings. In traditional databases, queries typically require exact matches, while vector databases utilize similarity metrics to find the most similar vector to a query.</li> </ul> <p>Vector databases employ a combination of algorithms for Approximate Nearest Neighbor (ANN) search. These algorithms, organized into a pipeline, optimize search speed through techniques like hashing, quantization, and graph-based methods. Balancing accuracy and speed is a key consideration when using vector databases, which provide approximate results.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiv0fv6tvoe0v0mu9dp1i.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiv0fv6tvoe0v0mu9dp1i.png" alt="Working of Vector Databases" width="800" height="142"></a><br> Image credits What is a Vector Database &amp; How Does it Work? 
Use Cases + Examples | Pinecone</p> <p>A vector database query involves three main stages:</p> <ol> <li><p>Indexing: Vector embeddings are mapped to data structures using various algorithms within the vector database, thereby enhancing search speed.</p></li> <li><p>Querying: The database compares the queried vector to indexed vectors, employing a similarity metric to locate the nearest neighbor.</p></li> <li><p>Post Processing: The vector database performs post-processing on the nearest neighbor to generate the final query output, potentially re-ranking the nearest neighbors for future reference.</p></li> </ol> <h2> Importance of Vector Databases </h2> <p>Vector databases are pivotal for indexing vectors generated through embeddings. They enable searches for similar assets via neighboring vectors. Developers leverage these databases to create unique application experiences, including image searches based on user-taken photos. Automation of metadata extraction from content, coupled with hybrid keyword and vector-based searches, further enhances search capabilities. Vector databases also serve as external knowledge bases for generative AI models like ChatGPT. This ensures trustworthy information and reliable user interactions, particularly in mitigating issues like hallucinations.</p> <h2> Top 7 Vector Databases </h2> <p>The vector database landscape is dynamic and swiftly evolving, with numerous prominent players driving innovation. 
Each database presents distinctive features and functionalities, serving a variety of requirements and applications in the fields of machine learning and artificial intelligence.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cjl6evrzaxa97ftwnik.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cjl6evrzaxa97ftwnik.png" alt="Top 7 Vector Databases" width="800" height="475"></a><br> Image credits The 5 Best Vector Databases | A List With Examples | DataCamp</p> <ol> <li>Chroma</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci6vxiygbidu3trf8xeh.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fci6vxiygbidu3trf8xeh.png" alt="Chroma" width="800" height="422"></a><br> Image credits 🏡 Home | Chroma (trychroma.com)<br> Chroma is an open-source embedding database designed to simplify the development of LLM (Large Language Model) applications by enabling the integration of knowledge, facts, and skills for these models. 
It offers features like managing text documents, converting text to embeddings, and conducting similarity searches.</p> <p>Key Features:</p> <ul> <li>Chroma provides a wide range of features, including queries, filtering, density estimates, and more.</li> <li>Supports LangChain (Python and JavaScript) and LlamaIndex.</li> <li>The API used in a Python notebook seamlessly scales to a production cluster.</li> </ul> <ol start="2"> <li>Pinecone</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueaks2mypnudgdexueqs.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueaks2mypnudgdexueqs.png" alt="Pinecone" width="800" height="262"></a><br> Image credits A Pinecone Alternative With Better Search Relevance and Lower Costs — Vectara</p> <p>Pinecone is a managed vector database platform specifically designed to address the complexities of high-dimensional data. 
With advanced indexing and search functionalities, Pinecone enables data engineers and data scientists to create and deploy large-scale machine learning applications for efficient processing and analysis of high-dimensional data.</p> <p>Key Features:</p> <ul> <li>Fully managed service.</li> <li>Highly scalable for handling large datasets.</li> <li>Real-time data ingestion for up-to-date information.</li> <li>Low-latency search capabilities.</li> <li>Integration with LangChain.</li> </ul> <ol start="3"> <li>Milvus</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59xf7gchtmfyurlx32xs.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59xf7gchtmfyurlx32xs.png" alt="Milvus" width="800" height="565"></a><br> Image credits What is Milvus Vector Database? — Zilliz</p> <p>Milvus is an open-source vector database with a focus on embedding similarity search and AI applications. It provides an easy-to-use, uniform user experience across deployment environments. 
The stateless architecture of Milvus 2.0 enhances elasticity and adaptability, making it a reliable choice for a range of use cases including image search, chatbots, and chemical structure search.</p> <p>Key Features:</p> <ul> <li>Capable of searching trillions of vector datasets in milliseconds.</li> <li>Offers straightforward management of unstructured data.</li> <li>Highly scalable and adaptable to diverse workloads.</li> <li>Supports hybrid search capabilities.</li> <li>Incorporates a unified Lambda structure for seamless performance.</li> </ul> <ol start="4"> <li>Weaviate</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9t6t0qoblduvjrz7y5i.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw9t6t0qoblduvjrz7y5i.png" alt="Weaviate" width="640" height="339"></a><br> Image credits Learning to Retrieve Passages without Supervision | Weaviate</p> <p>Weaviate is an open-source vector database that enables the storage of data objects and vector embeddings from various machine learning models. 
It can seamlessly scale to accommodate billions of data objects.</p> <p>Key Features:</p> <ul> <li>Weaviate can rapidly retrieve the ten nearest neighbors from millions of objects in just milliseconds.</li> <li>Users can import or upload their vectorised data, as well as integrate with platforms like OpenAI, HuggingFace, and more.</li> <li>Weaviate is suitable for both prototypes and large-scale production, prioritizing scalability, replication, and security.</li> <li>Weaviate offers features like recommendations, summarizations, and integrations with neural search frameworks.</li> </ul> <ol start="5"> <li>Qdrant</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwm87nothzpci5qtafct0.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwm87nothzpci5qtafct0.png" alt="Qdrant" width="536" height="513"></a><br> Image credits qdrant/qdrant: Qdrant</p> <p>Qdrant is a versatile vector database and API service designed for conducting high-dimensional vector similarity searches. 
It transforms embeddings and neural network encoders into comprehensive applications suited for matching, searching, and recommendations.</p> <p>Key Features:</p> <ul> <li>Provides OpenAPI v3 specifications and pre-built clients for multiple programming languages.</li> <li>Utilizes a custom HNSW algorithm for rapid and accurate vector searches.</li> <li>Enables result filtering based on associated vector payloads.</li> <li>Supports various data types including string matching, numerical ranges, and geo-locations.</li> <li>Designed for cloud-native environments with horizontal scaling capabilities.</li> </ul> <ol start="6"> <li>Elasticsearch</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtujiirmg9xjvoimba3o.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frtujiirmg9xjvoimba3o.png" alt="Elasticsearch" width="800" height="409"></a><br> Image credits Learn more about Elasticsearch — ITZone<br> Elasticsearch is an open-source analytics engine that offers versatile data handling capabilities, including textual, numerical, geographic, structured, and unstructured data. It is a key component of the Elastic Stack, a suite of open tools for data processing, storage, analysis, and visualization. 
Elasticsearch excels in various use cases, providing centralized data storage, lightning-fast search, fine-tuned relevance, and scalable analytics.</p> <p>Key Features:</p> <ul> <li>Supports cluster configurations and ensures high availability.</li> <li>Features automatic node recovery and data distribution.</li> <li>Scales horizontally to handle large workloads.</li> <li>Detects errors to maintain secure and accessible clusters and data.</li> <li>Designed for continuous peace of mind, with a distributed architecture that ensures reliability.</li> </ul> <ol start="7"> <li>Faiss</li> </ol> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fach4ny864p7j978n7jh8.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fach4ny864p7j978n7jh8.png" alt="Faiss" width="736" height="402"></a><br> Image credits Faiss: A library for efficient similarity search — Engineering at Meta (fb.com)</p> <p>Faiss, developed by Facebook AI Research, is an open-source library designed for fast and efficient dense vector similarity search and grouping. 
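</p> <p>Conceptually, the core operation resembles the following pure-Python sketch of a flat (exhaustive) L2 search over toy vectors; libraries like Faiss implement this idea, along with far more scalable approximate variants, in optimized C++.</p>

```python
import math

# A flat (exhaustive) index over toy 2-d vectors: every stored vector
# is compared against the query and the k closest labels are returned.
index = [
    ("doc-a", [1.0, 0.0]),
    ("doc-b", [0.9, 0.1]),
    ("doc-c", [0.0, 1.0]),
    ("doc-d", [0.2, 0.9]),
]

def knn(query, k=2):
    # Score every entry by L2 (Euclidean) distance, then keep the k best.
    scored = sorted((math.dist(query, vec), label) for label, vec in index)
    return [label for _, label in scored[:k]]

print(knn([1.0, 0.05], k=2))  # nearest neighbor first, then the second nearest
```

<p>Because the results are sorted by distance, asking for k neighbors naturally yields the nearest, second nearest, and so on, mirroring the k-nearest-neighbor behavior described below.</p> <p>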
It supports searching sets of vectors of various sizes, even those that may not fit in RAM, making it versatile for large datasets.</p> <p>Key Features:</p> <ul> <li>Besides returning the nearest neighbor, Faiss also returns the second nearest, third nearest, and k-th nearest neighbors.</li> <li>Allows searching multiple vectors simultaneously (batch processing).</li> <li>Supports maximum inner product search in addition to minimal Euclidean-distance search.</li> <li>Supports various distances, including L1, Linf, and more.</li> </ul> <h2> Use Cases of Vector Databases </h2> <p>Vector databases are making significant impacts across various industries by excelling in similarity search.</p> <h3> Retail Experiences </h3> <p>Vector databases transform retail by powering advanced recommendation systems that offer personalized shopping experiences based on product attributes and user preferences.</p> <h3> Natural Language Processing (NLP) </h3> <p>Vector databases enhance NLP applications, enabling chatbots and virtual assistants to better understand and respond to human language, improving customer-agent interactions.</p> <h3> Financial Data Analysis </h3> <p>In finance, vector databases analyze complex data to help analysts detect patterns, make informed investment decisions, and forecast market movements.</p> <h3> Anomaly Detection </h3> <p>Vector databases excel at spotting outliers, particularly in sectors like finance and security, making the detection process faster and more accurate, thus preventing fraud and security breaches.</p> <h3> Healthcare </h3> <p>Vector databases personalize medical treatments by analyzing genomic sequences, aligning solutions with individual genetic makeup.</p> <h3> Media Analysis </h3> <p>Vector databases simplify image analysis, aiding in tasks such as medical scans and surveillance footage interpretation for optimizing traffic flow and public safety.</p> <p>To experiment with vector databases such as Chromadb, Pinecone, Weaviate and 
Pgvector, follow the below link.</p> <p><a href="proxy.php?url=https://coinsbench.com/experimenting-with-vector-databases-chromadb-pinecone-weaviate-and-pgvector-0f35c0356540" rel="noopener noreferrer">Experimenting with Vector Databases: Chromadb, Pinecone, Weaviate and Pgvector | codemaker2016.medium.com</a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The article is also available on <a href="proxy.php?url=https://medium.com/@codemaker2016/goodbye-databases-its-time-to-embrace-vector-databases-0ffa7879980e" rel="noopener noreferrer">Medium</a>.</p> database vectordatabase programming beginners Introducing Vitest: the super fast testing framework Vishnu Sivan Sun, 29 Oct 2023 07:35:20 +0000 https://dev.to/codemaker2015/introducing-vitest-the-super-fast-testing-framework-g73 https://dev.to/codemaker2015/introducing-vitest-the-super-fast-testing-framework-g73 <p>In the ever-evolving landscape of software development, the significance of testing cannot be overstated. It plays a pivotal role in guaranteeing the dependability and functionality of applications. Uncovering and resolving bugs in the early stages of development not only conserves time and resources but also elevates the overall quality of the software.</p> <p>Enter Vitest, a robust testing framework that stands out as a compelling alternative to well-known tools like Jest, particularly for Vue.js projects. 
Fueled by Vite, Vitest distinguishes itself with its exceptional speed and simplicity, making it an invaluable resource for developers seeking to streamline their testing processes.</p> <p>In this article, we will walk you through the basics of Vitest and how to test your projects using the Vitest framework.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Vite</li> <li>What is Vitest</li> <li>Why Vitest</li> <li>Vitest vs other frameworks</li> <li>Create your first Vitest app</li> <li>Create a React project with Vitest</li> <li>Installing the dependencies</li> <li>Writing your first test script</li> <li>Running the test</li> </ul> <h3> What is Vite </h3> <p>Vite stands out as a cutting-edge, high-speed tool designed for scaffolding and constructing web projects. It is crafted by Evan You, the mind behind Vue.js. Vite supports a range of frameworks, including Vue, React, Preact, Lit, Svelte, and Solid. Its key strength lies in leveraging native ES modules, resulting in superior speed compared to conventional tools like webpack or Parcel.</p> <p>Vite employs a server that dynamically compiles and serves necessary dependencies through ES modules. This strategy enables Vite to process and deliver only the code essential at any given moment. Consequently, Vite deals with considerably less code during server startup and updates. Another factor contributing to Vite’s speed is its utilization of esbuild for pre-bundling dependencies in development. Esbuild, a remarkably fast JavaScript bundler implemented in the Go language, enhances the overall performance of Vite.</p> <h3> What is Vitest </h3> <p>Vitest, a testing framework, is constructed atop Vite, a tool dedicated to overseeing and constructing JavaScript-centric web applications. It stands out as a swift and minimalist testing solution, demanding minimal configuration. 
Vitest is largely compatible with Jest, a widely adopted JavaScript testing framework, and integrates smoothly into Vue applications. While it is purpose-built for use with Vite, Vitest can also operate independently, offering flexibility in its application.</p> <h3> Why Vitest </h3> <p>Vitest stands out as a powerful testing framework that offers unique advantages over other frameworks like Jest. It provides the following features, which make it an attractive choice among testing frameworks.</p> <h4> Simplified Setup and Configuration: </h4> <p>Vitest streamlines the setup and configuration process, enabling developers to dedicate more time to writing tests and less to intricate configuration. Its minimalistic approach makes it an ideal choice for small to medium-sized projects.</p> <h4> Concise and Legible Syntax: </h4> <p>Vitest offers a concise and easily understandable syntax, simplifying the task of writing and comprehending test cases. With its clean and intuitive API, you can articulate your test expectations in a natural, human-readable manner.</p> <h4> Outstanding Performance and Test Execution: </h4> <p>Vitest is renowned for its exceptional performance and rapid test execution speed. It optimizes test execution, ensuring the efficiency of your test suite, even as your codebase expands. This advantage proves especially valuable when tackling extensive test suites or large-scale projects.</p> <h4> Seamless TypeScript Integration: </h4> <p>Vitest seamlessly integrates with TypeScript, harnessing its type system to detect errors and furnish insightful feedback during testing. You have the ability to define and enforce type validations within your test cases, guaranteeing type correctness throughout your testing journey.</p> <h3> Vitest vs other frameworks </h3> <p>When it comes to testing your JavaScript code, the landscape offers a plethora of options.
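The expect-style assertion syntax that these frameworks share is worth demystifying before comparing them. The sketch below is a hypothetical illustration (the function name `expectValue` and its behavior are ours, not Vitest's or Jest's actual implementation) of what a chainable `toBe` matcher boils down to:

```typescript
// Hypothetical sketch of a Jest/Vitest-style `toBe` matcher.
// Not real framework code; it only shows the chainable pattern.
function expectValue<T>(received: T) {
  return {
    // toBe performs a strict identity comparison, the same
    // Object.is semantics that Jest-style matchers use.
    toBe(expected: T): void {
      if (!Object.is(received, expected)) {
        throw new Error(
          `expected ${String(received)} to be ${String(expected)}`
        );
      }
    },
  };
}

// Passing assertions return silently; failing ones throw.
expectValue(Math.sqrt(4)).toBe(2);
expectValue(1 + 1).toBe(2);
```

Real matchers layer rich diff output, asymmetric matchers, and async support on top of this basic pattern.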
Two standout choices among the most popular ones are Jest and Vitest.</p> <h4> Jest </h4> <p>Jest is a JavaScript testing framework that is designed to ensure the correctness of any JavaScript codebase. It has gained popularity among developers for its simplicity and user-friendly nature. Jest works with projects using Babel, TypeScript, Node, React, Angular, Vue, and more. It is a zero-config framework that aims to work out of the box on most JavaScript projects. Jest provides a wide range of features such as snapshots, isolated tests, a great API, speed and safety, code coverage, and easy mocking. Jest is well-documented, requires minimal configuration, and can be extended to meet specific project needs. Widely utilized by companies and individuals globally, it stands as a trusted and extensively adopted tool in the realm of JavaScript development.</p> <h3> Vitest vs Jest </h3> <p>Whether you opt for Jest or Vitest for JavaScript testing, you’ll find a contemporary, straightforward, and speedy testing experience. Both frameworks are well established and work reliably.</p> <h4> Speed </h4> <p>The choice between Jest and Vitest for faster tests depends on the specific circumstances. Vitest generally offers a safer bet for faster test execution, but the significance of this advantage varies based on factors such as the number of tests and the available resources. If you have a substantial number of tests and are running them on a resource-constrained local development environment, test speed becomes a more crucial concern. However, if you have only a few tests or are testing on a well-resourced infrastructure, the speed difference may be less critical.</p> <h4> Module management </h4> <p>Jest aligns with CommonJS for module management, offering simplicity in testing for projects using this traditional approach. In contrast, Vitest is tailored for ECMAScript Modules (ESM), a more contemporary module management system.
Choosing between Jest and Vitest may hinge on your project’s module strategy; Jest seamlessly integrates with CommonJS, while Vitest is the preferred option for ESM users. The module management system compatibility becomes a pivotal factor in deciding which testing framework aligns with your JavaScript project.</p> <h4> Documentation and community support </h4> <p>Jest, having been around for a decade, boasts a larger and more established community than the relatively newer Vitest. This results in better documentation and support for Jest, making it a more accessible choice. While Vitest is gaining popularity, it may take time to match Jest’s community strength. Jest currently enjoys the advantage of a more established ecosystem.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfnko5uypqozhi7x8ixd.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfnko5uypqozhi7x8ixd.png" alt="vitest comparison" width="800" height="622"></a><br> Image credits: Vitest: Blazing Fast Unit Test Framework (lo-victoria.com)</p> <h3> Creating your first Vitest app </h3> <p>In this section, we will try to create our first Vitest application.</p> <ul> <li>Open a command prompt / terminal on your machine.</li> <li>Create a folder named <code>first-vitest-app</code> and switch to the directory. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>mkdir first-vitest-app cd first-vitest-app </code></pre> </div> <ul> <li>Initialize the project using the following command and provide the necessary details (test command: vitest) when prompted.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm init </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpklgi50vgwu6cui7fu82.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpklgi50vgwu6cui7fu82.png" alt="npm init" width="800" height="543"></a></p> <ul> <li>Install the libraries <code>vite</code>, <code>vitest</code>, <code>@vitest/ui</code> using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm install vite vitest @vitest/ui </code></pre> </div> <p>Vite is the default development dependency for vitest. The vitest/ui provides a user interface for testing the application.</p> <ul> <li>Add the scripts attribute with the following contents in the <code>package.json</code> file. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>"scripts": { "test": "vitest", "test:ui": "vitest --ui", "test:run": "vitest run" } </code></pre> </div> <ul> <li>Create a file named vite.config.ts and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>/// &lt;reference types="vitest" /&gt; // Configure Vitest (https://vitest.dev/config/) import { defineConfig } from 'vite' export default defineConfig({ test: { /* for example, use global to avoid globals imports (describe, test, expect): */ // globals: true, }, }) </code></pre> </div> <ul> <li>Create a folder named <code>test</code> and create files named <code>basic.test.ts</code> and <code>suite.test.ts</code> inside it. 
The <code>basic.test.ts</code> file is used to write basic test cases using the <code>test()</code> method, and the <code>suite.test.ts</code> file is used to write test cases using the <code>describe()</code> method.</li> <li>Open the <code>basic.test.ts</code> file and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import { assert, expect, test } from 'vitest' test('Math.sqrt()', () =&gt; { expect(Math.sqrt(4)).toBe(2) expect(Math.sqrt(144)).toBe(12) expect(Math.sqrt(2)).toBe(Math.SQRT2) }) test('JSON', () =&gt; { const input = { foo: 'hello', bar: 'world', } const output = JSON.stringify(input) expect(output).eq('{"foo":"hello","bar":"world"}') assert.deepEqual(JSON.parse(output), input, 'matches original') }) </code></pre> </div> <ul> <li>Open the <code>suite.test.ts</code> file and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import { assert, describe, expect, it } from 'vitest' describe('suite name', () =&gt; { it('foo', () =&gt; { assert.equal(Math.sqrt(4), 2) }) it('bar', () =&gt; { expect(1 + 1).eq(2) }) it('snapshot', () =&gt; { expect({ foo: 'bar' }).toMatchSnapshot() }) }) </code></pre> </div> <h3> Running the test </h3> <p>Run the test using the following commands.</p> <ul> <li>Run the test in the normal command line mode.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm run test </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lbc3gw0pmnle4ri9lkk.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lbc3gw0pmnle4ri9lkk.png" alt="test" width="800" height="442"></a></p> <ul> <li>Run the test with UI. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm run test:ui </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazd12cdfoldmca1mrmt9.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazd12cdfoldmca1mrmt9.png" alt="output" width="640" height="278"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7phi0thewxj6h7n3yfjb.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7phi0thewxj6h7n3yfjb.png" alt="output" width="800" height="239"></a></p> <h3> Creating a React project with Vitest </h3> <p>In this section, we will build a React application utilizing the Vite framework. 
Additionally, we will develop test cases using the Vitest framework and execute them.</p> <ul> <li>Create a React project using Vite by executing the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm create vite@latest </code></pre> </div> <ul> <li>Switch to the project folder and install the dependencies using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm install </code></pre> </div> <h4> Installing the dependencies </h4> <ul> <li>Install the vitest library and related testing packages using the following command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm install -D vitest jsdom @testing-library/react @testing-library/jest-dom @types/testing-library__jest-dom </code></pre> </div> <ul> <li>Add the scripts attribute with the following contents in the <code>package.json</code> file. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>"scripts": { ... "test": "vitest" } </code></pre> </div> <ul> <li>Create a folder named <code>tests</code> and add a file named <code>setup.ts</code> to it with the following content. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>/// &lt;reference types="@testing-library/jest-dom" /&gt; import { expect, afterEach } from 'vitest'; import { cleanup } from '@testing-library/react'; import * as matchers from '@testing-library/jest-dom/matchers'; expect.extend(matchers); afterEach(() =&gt; { cleanup(); }); </code></pre> </div> <ul> <li>Open the <code>vite.config.js</code> file and add the following code to it.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>test: { environment: 'jsdom', setupFiles: ['./tests/setup.ts'], include: ['./tests/**/*.test.tsx'], globals: true } </code></pre> </div> <h3> Writing your first test script </h3> <p>We have configured Vitest for the app. We can create a basic test case to evaluate the App component.</p> <ul> <li>Create a file named <code>App.test.tsx</code> inside the <code>tests</code> folder and add the following code to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import { render, screen } from '@testing-library/react'; import { describe, expect, it } from 'vitest' import App from "../src/App"; import React from 'react'; describe('App', () =&gt; { it('renders headline', () =&gt; { render(&lt;App /&gt;); const headline = screen.getByText("Vite + React"); expect(headline).toBeInTheDocument(); }); }); </code></pre> </div> <h3> Running the test </h3> <p>Run the test using the following command.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>npm run test </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla4okpkxuyprj9fs98k7.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla4okpkxuyprj9fs98k7.png" alt="output" width="800" height="374"></a></p> <ul> <li>Change the headline in the <code>App.tsx</code> file to <code>“Your first Vitest app”</code>.
Run the test again then you will get the output as follows.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhmog220b0g2n80yv8pn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhmog220b0g2n80yv8pn.png" alt="output1" width="720" height="379"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8uaj85n8ycrbgeda7tds.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8uaj85n8ycrbgeda7tds.png" alt="output2" width="640" height="376"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/vitest-examples" rel="noopener noreferrer">GitHub - codemaker2015/vitest-examples: vitest testing framework examples</a></p> <p>The article is also available on <a href="proxy.php?url=https://medium.com/@codemaker2016/introducing-vitest-the-super-fast-testing-framework-c4a86b431f8d" rel="noopener noreferrer">Medium</a>.</p> testing vite beginners Bun | The all-in-one JavaScript runtime Vishnu Sivan Sat, 23 Sep 2023 17:50:44 +0000 https://dev.to/codemaker2015/bun-the-all-in-one-javascript-runtime-4n2b 
https://dev.to/codemaker2015/bun-the-all-in-one-javascript-runtime-4n2b <p>In the ever-evolving realm of JavaScript development, a new entrant has made its debut — Bun 1.0. It’s not just any run-of-the-mill tool; it serves as a multifaceted JavaScript runtime and toolkit. The primary objective here is to simplify the development journey by removing unnecessary complexities. Bun is capable of handling a range of tasks, such as building, executing, testing, and debugging JavaScript and TypeScript. In essence, it offers a holistic solution for developers seeking efficiency.</p> <p>What sets Bun apart is its distinct role as a drop-in replacement for Node.js. Furthermore, it proudly boasts compatibility with TypeScript and TSX files, and notably, it delivers superior speed compared to Node.js. A pivotal feature of Bun is its adeptness in supporting both CommonJS and ES modules.</p> <p>Another remarkable feature of Bun is its capacity for hot reloading, allowing code to be refreshed without the need for process restarts. This feature proves exceptionally handy during the development phase, where iterative changes are the norm. Additionally, Bun offers a plugin API for crafting custom loaders and extends its versatility by supporting YAML imports. These traits collectively make Bun a valuable asset in the toolkit of modern developers.</p> <p>In this article, we will dive deeper into the world of Bun 1.0, exploring its features, benefits, and how it can revolutionize your JavaScript development journey.
Buckle up, because the future of JavaScript development just got a whole lot more exciting with Bun.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What is Bun</li> <li>Design goals</li> <li>Benchmarking</li> <li>Node vs Deno vs Bun</li> <li>Installation</li> <li>Experimenting with Bun</li> <li>Creating a simple HTTP server</li> <li>Creating a React application</li> <li>Bun for Next, Svelte, and Vue</li> <li>Bun Roadmap</li> <li>Useful Links</li> </ul> <h3> What is Bun </h3> <p>Bun is a comprehensive toolkit tailored for JavaScript and TypeScript applications, centered around its high-performance Bun runtime. This runtime, written in Zig and harnessing the capabilities of JavaScriptCore, serves as a seamless substitute for Node.js, effectively reducing startup time and memory consumption. Bun stands out as an inventive JavaScript runtime, encompassing an integrated native bundler, transpiler, task runner, and an npm client. It is ingeniously designed to supplant traditional JavaScript and TypeScript scripts or applications on local machines.
Notably, the version 0.5 release introduced exciting additions such as npm workspaces, Bun.dns, and node:readline support.</p> <p><a href="proxy.php?url=https://youtu.be/BsnCpESUEqM" rel="noopener noreferrer">What is Bun</a></p> <h3> Design goals </h3> <ul> <li> <strong>Performance</strong>: Bun boasts a 4x faster startup time compared to Node.js.</li> <li> <strong>TypeScript &amp; JSX Capabilities</strong>: Bun enables the direct execution of .jsx, .ts, and .tsx files, with its transpiler seamlessly converting them to vanilla JavaScript for execution.</li> <li> <strong>ESM &amp; CommonJS Compatibility</strong>: While advocating for ES modules (ESM), Bun also supports CommonJS.</li> <li> <strong>Web-Standard APIs</strong>: Bun incorporates standard Web APIs like fetch, WebSocket, and ReadableStream.</li> <li> <strong>Node.js Integration</strong>: Bun not only supports Node-style module resolution but also aims for comprehensive compatibility with core Node.js globals and modules.</li> </ul> <h3> Benchmarking </h3> <p>Bun distinguishes itself primarily through its exceptional speed, a minimum of 2.5 times faster than both Deno and Node.</p> <p>Instead of relying on the typically faster V8 engine, Bun utilizes JavaScriptCore from WebKit.
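Speed claims like these are normally produced by timing repeated runs of a workload. The helper below is a toy sketch (the `bench` function is our own illustration, not the methodology behind the published benchmarks); it relies only on the standard `performance.now()` timer, which is available in Bun, Node.js, and Deno alike:

```typescript
// Toy micro-benchmark: run a workload `iterations` times and return
// the elapsed wall-clock time in milliseconds. Illustrative only;
// real benchmarks also need warm-up runs and statistical treatment.
function bench(label: string, fn: () => void, iterations = 1_000): number {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsed = performance.now() - start;
  console.log(`${label}: ${elapsed.toFixed(2)} ms for ${iterations} iterations`);
  return elapsed;
}

// Example workload: JSON round-trip of a small object.
const sample = { runtime: "bun", fast: true };
bench("json-roundtrip", () => {
  JSON.parse(JSON.stringify(sample));
});
```

A serious comparison would also pin versions, add warm-up iterations, and report distributions rather than a single total.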
Furthermore, the creator of Bun has highlighted that Zig, a low-level programming language akin to C or Rust, lacks hidden control flow, which significantly simplifies the development of fast applications.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgfvgcbi0iydcnox0fpi.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgfvgcbi0iydcnox0fpi.png" alt="benchmarking" width="800" height="257"></a></p> <h3> Node vs Deno vs Bun </h3> <h4> Node.js </h4> <p>Node.js, developed by Ryan Dahl and introduced in 2009, stands as the preeminent JavaScript runtime. In 2023, it garnered the top position as the most popular web technology according to Stack Overflow developers. Node.js has been a transformative force, expanding the horizons of JavaScript applications by enabling the creation of sophisticated backend-driven solutions. Today, it anchors a sprawling ecosystem replete with abundant resources and libraries.</p> <h4> Deno </h4> <p>Deno is a JavaScript runtime written in Rust, introduced by Ryan Dahl. It emerged with the aim of enhancing the features offered by Node.js. Deno places a strong emphasis on bolstering security compared to Node.js. It achieves this by mandating explicit permission for file, network, and environment access, thereby reducing the likelihood of common security vulnerabilities in these domains. Additionally, Deno is tailored to offer improved support for JSX and TypeScript, aligning more closely with web standards.
Furthermore, it simplifies deployment by packaging applications as self-contained executables.</p> <h4> Bun </h4> <p>Bun, the latest contender in the runtime arena, is powered by Zig and positions itself as an all-inclusive runtime and toolkit, focusing on speed, bundling, testing, and compatibility with Node.js packages. Its standout feature lies in its exceptional performance, surpassing both Node.js and Deno.</p> <p>A performance benchmark, exemplified by running an HTTP handler rendering a server-side React page, demonstrated that Bun handles approximately 68,000 requests per second, whereas Deno and Node.js manage around 29,000 and 14,000, respectively, showcasing a significant performance differential. Bun goes beyond performance, encompassing bundling and task-running capabilities for projects built with JavaScript and TypeScript. Similar to Deno, it ships as a single binary and incorporates built-in support for Web APIs. Additionally, it extends support to select Node.js libraries, ensuring npm compatibility.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0zeo6vennlxarvvwhgs.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0zeo6vennlxarvvwhgs.png" alt="comparison" width="787" height="842"></a></p> <h3> Installation </h3> <p>You have the flexibility to install Bun as a native package on any operating system, or alternatively, you can opt for a global NPM package installation.
While it may seem unconventional to use NPM to install its replacement, this approach undeniably streamlines the installation process.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># with install script (recommended) curl -fsSL https://bun.sh/install | bash # with npm npm install -g bun # with Homebrew brew tap oven-sh/bun brew install bun # with Docker docker pull oven/bun docker run --rm --init --ulimit memlock=-1:-1 oven/bun </code></pre> </div> <p>Bun requires the unzip package to be installed if you are using the curl option for installation.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>sudo apt update sudo apt-get install unzip </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0typ94nokbscgofcjb55.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0typ94nokbscgofcjb55.png" alt="installation1" width="640" height="195"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmm9brulu70ska70j7a04.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmm9brulu70ska70j7a04.png" alt="installation2" width="720" height="211"></a><br> <a
href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzhl6ff03hu35b6vnw2v.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzhl6ff03hu35b6vnw2v.png" alt="installation3" width="800" height="245"></a></p> <p>If you are using a Windows machine, install WSL to run Bun.</p> <p>Refer to the following link to install WSL on your machine.</p> <p><a href="proxy.php?url=https://ubuntu.com/tutorials/install-ubuntu-on-wsl2-on-windows-11-with-gui-support#1-overview" rel="noopener noreferrer">Install Ubuntu on WSL2 and get started with graphical applications</a></p> <p>You can do the installation by opening a terminal with administrator privileges and executing the wsl --install command.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm3bwmvrm9xplfi14k59.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm3bwmvrm9xplfi14k59.png" alt="installation4" width="640" height="341"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqqcvaagein7xlnqnwhj.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqqcvaagein7xlnqnwhj.png" alt="Installation5" width="786" height="297"></a></p> <p>You can verify the Bun installation using the following command,<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun --help </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghoe6q6bjdvgiw4srfo6.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fghoe6q6bjdvgiw4srfo6.png" alt="Installation" width="800" height="245"></a></p> <h3> Experimenting with Bun </h3> <p>Let’s try some experiments with Bun.</p> <h4> 1. Creating a simple HTTP server </h4> <p>Let’s create a simple HTTP server using the Bun.serve API.</p> <ul> <li>Create a project directory and switch to it. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>mkdir http-server-demo cd http-server-demo </code></pre> </div> <ul> <li>Run the following command on your terminal to scaffold a new project. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun init </code></pre> </div> <ul> <li>Open index.ts and add the following code snippet to create a simple HTTP server using Bun.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>const server = Bun.serve({ port: 3000, fetch(req) { return new Response("Hello World"); }, }); console.log(`Listening on http://localhost:${server.port}`); </code></pre> </div> <ul> <li>Execute the server using the following command, </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun run index.ts </code></pre> </div> <p>You can open the URL <a href="proxy.php?url=http://localhost:3000" rel="noopener noreferrer">http://localhost:3000</a> in your browser to see the server response.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgpmgioqvzu8atpkjvol.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgpmgioqvzu8atpkjvol.png" alt="output1" width="800" height="420"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4i2pt11m8dtnbwggcalj.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4i2pt11m8dtnbwggcalj.png" alt="output2" width="800" height="209"></a></p> <h4> 2. Creating a React application </h4> <p>Let’s create a React application using Bun’s Vite support.</p> <ul> <li>Execute the following command to create a boilerplate for your React application using Bun.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>bun create vite bun-react-demo
</code></pre> </div> <ul> <li>Switch to the project directory and execute the following commands, </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>cd bun-react-demo
bun install
bun run dev
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fia09v2dzo7t58hzt0yup.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fia09v2dzo7t58hzt0yup.png" alt="output3" width="800" height="709"></a></p> <p>You can view the output as follows,<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p9krtp3q1v80a0zv7vz.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p9krtp3q1v80a0zv7vz.png" alt="output4" width="800" height="392"></a></p> <h4> 3. Bun for Next, Svelte, and Vue </h4> <p>For a Next.js application, Bun offers comparable scaffolding, initiated with the command “bun create next ./app”. In scenarios where a built-in loader isn’t present, Bun incorporates customizable loaders. These loaders enable the handling of files associated with frameworks like Svelte or Vue, such as .svelte or .vue extensions.
Additionally, there’s an experimental SvelteKit adapter designed to run SvelteKit within the Bun environment.</p> <h3> Bun Roadmap </h3> <p>The Bun roadmap encompasses numerous open tasks, offering a glimpse into the project’s extensive scope and ambitious goals. Bun aspires to evolve into a comprehensive solution, serving as a versatile platform for various server-side JavaScript tasks.</p> <p><a href="proxy.php?url=https://github.com/oven-sh/bun/issues/159" rel="noopener noreferrer">Bun's Roadmap · Issue #159 · oven-sh/bun | github.com</a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/bun-demo" rel="noopener noreferrer">GitHub - codemaker2015/bun-demo: Experimenting with the new JavaScript framework | github.com</a></p> <p>The article is also available on <a href="proxy.php?url=https://codemaker2016.medium.com/bun-the-all-in-one-javascript-runtime-dc97721147a7" rel="noopener noreferrer">Medium</a>.</p> <h3> Useful Links </h3> <ul> <li><a href="proxy.php?url=https://byteofdev.com/posts/what-is-bun" rel="noopener noreferrer">What is Bun, and does it live up to the hype? 
| byteofdev.com</a></li> <li><a href="proxy.php?url=https://bestofjs.org/projects/bun" rel="noopener noreferrer">Bun - Incredibly fast JavaScript runtime | bestofjs.org</a></li> <li><a href="proxy.php?url=https://apps.microsoft.com/store/detail/ubuntu/9PDXGNCFSCZV?hl=en-in&amp;gl=in&amp;rtc=1" rel="noopener noreferrer">Get Ubuntu from the Microsoft Store | apps.microsoft.com</a></li> </ul> javascript webdev beginners tutorial Streamlit cheatsheet for beginners Vishnu Sivan Sun, 27 Aug 2023 18:23:58 +0000 https://dev.to/codemaker2015/streamlit-cheatsheet-for-beginners-706 https://dev.to/codemaker2015/streamlit-cheatsheet-for-beginners-706 <p>Streamlit is a widely used open-source Python framework which facilitates the creation and deployment of web apps for Machine Learning and Data Science. It enables a seamless process of developing and viewing results by allowing users to build apps just like writing Python code. This interactive loop between coding and web app visualization is a distinctive feature, making app development and result exploration efficient and effortless.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>Streamlit Methods</li> <li>Installation</li> <li>Start with hello world</li> <li>Text Elements Examples</li> <li>Widget Examples</li> <li>Input</li> <li>Button</li> <li>Checkbox</li> <li>Radio</li> <li>Slider</li> <li>Date and time</li> <li>Form</li> <li>Status</li> <li>Chart</li> <li>Data</li> <li>Chat</li> </ul> <h3> Streamlit Methods </h3> <h4> Data Presentation: </h4> <ul> <li>The <code>st.write()</code> function empowers you to exhibit various data formats as per the requirements.</li> <li>The <code>st.metric()</code> function supports you to showcase a singular metric.</li> <li>The <code>st.table()</code> function is used for rendering tabular information.</li> <li>The <code>st.dataframe()</code> function is engineered to elegantly showcase pandas dataframes.</li> <li>The <code>st.image()</code> function offers seamless 
image display for visual content.</li> <li> <p>The <code>st.audio()</code> function takes care of audio file playback.</p> <h4> Headers and Text Styling: </h4> </li> <li><p>The <code>st.subheader()</code> function serves as a valuable tool for generating subheadings within your application.</p></li> <li><p>The <code>st.markdown()</code> function enables seamless integration of Markdown-formatted content.</p></li> <li> <p>The <code>st.latex()</code> function stands as a powerful asset for expressing mathematical equations.</p> <h4> User Interaction: </h4> <p>To infuse your web application with interactive elements, Streamlit provides an array of widgets.</p> </li> <li><p>The <code>st.checkbox()</code> function is used for incorporating checkboxes.</p></li> <li><p>The <code>st.button()</code> function is used for buttons.</p></li> <li><p>The <code>st.selectbox()</code> function facilitates the implementation of dropdown menus.</p></li> <li><p>The <code>st.multiselect()</code> function is designed to meet multi-selection dropdown requirements.</p></li> <li> <p>The <code>st.file_uploader()</code> function is used for handling file uploads.</p> <h4> Progress Tracking: </h4> <p>Streamlit offers functions tailored for indicating progress.</p> </li> <li><p>The <code>st.progress()</code> function is used to create a dynamic progress bar.</p></li> <li> <p>The <code>st.spinner()</code> function allows users to incorporate a spinner animation to denote ongoing processes.</p> <h4> Sidebar and Form Integration: </h4> <p>Streamlit’s versatile capabilities extend to incorporating a sidebar to accommodate supplementary functionality.</p> </li> <li><p>The <code>st.sidebar</code> object is used to seamlessly integrate elements into a secondary panel.</p></li> <li> <p>The <code>st.form()</code> function establishes a framework for user interactions.</p> <h4> Custom HTML and CSS Integration: </h4> <p>Streamlit offers provisions for embedding custom HTML and CSS to tailor your
web application’s appearance and behavior.</p> </li> <li><p>The <code>st.markdown()</code> function enables Markdown styling.</p></li> <li><p>The <code>st.write()</code> function facilitates the integration of bespoke HTML components into your application.</p></li> </ul> <h3> Installation </h3> <p>To get started, the first step is to install Streamlit. Ensure that you have Python 3.7 to 3.10 installed, along with pip and your preferred Python Integrated Development Environment (IDE), on your machine. With these prerequisites in place, open your terminal and follow the steps below to install Streamlit.</p> <ul> <li>Create and activate a virtual environment by executing the following commands. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv
source venv/bin/activate #for ubuntu
venv/Scripts/activate #for windows
</code></pre> </div> <ul> <li>Install the streamlit library using pip. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install streamlit
</code></pre> </div> <h3> Start with hello world </h3> <p>Initiate your Streamlit experience by delving into the pre-built “Hello World” application provided by the platform.
To confirm the successful installation, execute the following command in your terminal:<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit hello
</code></pre> </div> <p>You can see the Streamlit Hello World app open in a new tab of your web browser.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F410xedqvk8gzxw02slgf.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F410xedqvk8gzxw02slgf.png" alt="output1" width="750" height="317"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsdmcsmqdnu3ueznf88c.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnsdmcsmqdnu3ueznf88c.png" alt="output2" width="640" height="338"></a></p> <h3> Text Elements Examples </h3> <ul> <li> <strong>Title</strong>: Defines the page’s title.</li> <li> <strong>Header</strong>: Showcases text using header formatting.</li> <li> <strong>Subheader</strong>: Presents text in subheader formatting.</li> <li> <strong>Markdown</strong>: Applies markdown formatting to the text.</li> <li> <strong>Code</strong>: Exhibits text as code with suitable syntax highlighting.</li> <li> <strong>Latex</strong>: Utilizes LaTeX to present mathematical equations.</li> </ul> <p>Create a file named <code>text_example.py</code> and add the following code to it.</p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# set the app's title
st.title("Title in Streamlit")

# header
st.header("Header in Streamlit")

# subheader
st.subheader("Subheader in Streamlit")

# markdown
# display text in bold formatting
st.markdown("**Streamlit** is a widely used open-source Python framework which facilitates the creation and deployment of web apps for Machine Learning and Data Science.")
# display a markdown link
st.markdown("Visit [Streamlit](https://docs.streamlit.io) to learn more about Streamlit.")

# code block
code = '''
def add(a, b):
    print("a+b = ", a+b)
'''
st.code(code, language='python')

# latex
st.latex(''' (a+b)^2 = a^2 + b^2 + 2*a*b ''')
</code></pre> </div> <p>Run the text example using the following command.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run text_example.py
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dgywya5jbwy17itgr1e.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dgywya5jbwy17itgr1e.png" alt="output3" width="800" height="351"></a></p> <h3> Widget Examples </h3> <h4> Input </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# text input
name = st.text_input("Enter your name", "")
st.write("Your name is ", name)

age = st.number_input(label="Enter your age")
st.write("Your age is ", age)

address = st.text_area("Enter your address", "")
st.write("Your address is ", address)
</code></pre> </div> <p><a
href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojg3mmplonzbe67g1w3p.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojg3mmplonzbe67g1w3p.png" alt="input" width="800" height="370"></a></p> <h4> Button </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st #button if st.button('Click me', help="Click to see the text change"): st.write('Welcome to Streamlit!') else: st.write('Hi there!') </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv83h79ro505a2b7809na.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv83h79ro505a2b7809na.png" alt="button" width="800" height="201"></a></p> <h4> Checkbox </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st # check box checked = st.checkbox('Click me') if checked: st.write('You agreed the terms and conditions!') </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvf2w2ixqf7kr2b92gjml.png" class="article-body-image-wrapper"><img 
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvf2w2ixqf7kr2b92gjml.png" alt="checkbox" width="800" height="201"></a></p> <h4> Radio </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# radio button
lang = st.radio(
    "What's your favorite programming language?",
    ('C', 'C++', 'Java', 'Python'))

if lang == 'C':
    st.write('You selected C')
elif lang == 'C++':
    st.write('You selected C++')
elif lang == 'Java':
    st.write('You selected Java')
else:
    st.write('You selected Python')
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffloqgvzhvriaomy1707o.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffloqgvzhvriaomy1707o.png" alt="radio" width="800" height="206"></a></p> <h4> Slider </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st

# slider
age = st.slider('Please enter your age', min_value=0, max_value=100, value=10)
st.write("Your age is ", age)
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2w419vwrxoznoi1tkyjp.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2w419vwrxoznoi1tkyjp.png" alt="slider" width="800"
height="201"></a></p> <h4> Date and Time </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import datetime import streamlit as st date = st.date_input("When's your birthday", datetime.date(2000, 1, 1), datetime.date(1990, 1, 1), datetime.datetime.now()) st.write("Your birthday is ", date) time = st.time_input("Which is your birth time", datetime.time(0, 0)) st.write("Your birth time is ", time) </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsqm3cudem4rnx1rfbqn.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxsqm3cudem4rnx1rfbqn.png" alt="date and time" width="800" height="330"></a></p> <h4> Form </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st with st.form("user_form"): st.header("User Registration") name = st.text_input("Enter your name", "") age = st.slider("Enter your age") gender = st.radio("Select your gender", ('Male', 'Female')) terms = st.checkbox("Accept terms and conditions") # Every form must have a submit button. 
submitted = st.form_submit_button("Submit") if submitted: if terms: st.write("Name: ", name, ", Age: ", age, ", Gender: ", gender) else: st.write("Accept terms and conditions") st.write("Thanks for visiting") </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq0tgx2mfv1e5yfcfn7f.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq0tgx2mfv1e5yfcfn7f.png" alt="form" width="800" height="426"></a></p> <h4> Status </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st import time # progress progress_text = "Operation in progress. Please wait." my_bar = st.progress(0, text=progress_text) for percent_complete in range(100): time.sleep(0.1) my_bar.progress(percent_complete + 1, text=progress_text) # spinner with st.spinner('Wait for it...'): time.sleep(5) st.success('Done!') # messages st.toast('Your edited image was saved!', icon='😍') st.error('This is an error', icon="🚨") st.info('This is a purely informational message', icon="ℹ️") st.warning('This is a warning', icon="⚠️") st.success('This is a success message!', icon="✅") e = RuntimeError('This is an exception of type RuntimeError') st.exception(e) </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpuo4jo8ehlllrj82wvk3.png" class="article-body-image-wrapper"><img 
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpuo4jo8ehlllrj82wvk3.png" alt="status" width="800" height="426"></a></p> <h4> Chart </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st import pandas as pd import numpy as np # chart chart_data = pd.DataFrame( np.random.randn(20, 3), columns=['a', 'b', 'c']) st.line_chart(chart_data) st.bar_chart(chart_data) st.area_chart(chart_data) df = pd.DataFrame( np.random.randn(1000, 2) / [50, 50] + [37.76, -122.4], columns=['lat', 'lon']) st.map(df) </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h43bpmb3ej00weeao1g.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h43bpmb3ej00weeao1g.png" alt="chart" width="800" height="426"></a></p> <h4> Data </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st import pandas as pd import numpy as np # data frame st.subheader("Data Frame") df = pd.DataFrame( np.random.randn(50, 20), columns=('col %d' % i for i in range(20))) st.dataframe(df) # Same as st.write(df) # table st.subheader("Data Table") df = pd.DataFrame( np.random.randn(10, 5), columns=('col %d' % i for i in range(5))) st.table(df) # data editor st.subheader("Data Editor") df = pd.DataFrame( [ {"command": "st.selectbox", "rating": 4, "is_widget": True}, {"command": "st.balloons", "rating": 5, "is_widget": False}, {"command": "st.time_input", "rating": 3, "is_widget": True}, ] ) st.data_editor(df) # metric st.subheader("Data 
Metric")
st.metric(label="Temperature", value="70 °F", delta="1.2 °F")

col1, col2, col3 = st.columns(3)
col1.metric("Temperature", "70 °F", "1.2 °F")
col2.metric("Wind", "9 mph", "-8%")
col3.metric("Humidity", "86%", "4%")

# json
st.subheader("Data JSON")
st.json({
    'foo': 'bar',
    'baz': 'boz',
    'stuff': [
        'stuff 1',
        'stuff 2',
        'stuff 3',
        'stuff 5',
    ],
})
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxz59gkptmel41xvvnz2.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxz59gkptmel41xvvnz2.png" alt="data" width="800" height="426"></a></p> <h4> Chat </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import streamlit as st
import numpy as np

prompt = st.chat_input("Enter the chart type (bar, area, line)")
print(prompt)

if prompt == "bar":
    with st.chat_message("user"):
        st.write("Bar Chart Demo 👋")
        st.bar_chart(np.random.randn(30, 3))
elif prompt == "area":
    with st.chat_message("user"):
        st.write("Area Chart Demo 👋")
        st.area_chart(np.random.randn(30, 3))
elif prompt == "line":
    with st.chat_message("user"):
        st.write("Line Chart Demo 👋")
        st.line_chart(np.random.randn(30, 3))
elif prompt is not None:
    with st.chat_message("user"):
        st.write("Wrong chart type")
</code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu59w26bgl6rt1znz6qeo.png" class="article-body-image-wrapper"><img
src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu59w26bgl6rt1znz6qeo.png" alt="chat" width="800" height="426"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/streamlit-cheatsheet" rel="noopener noreferrer">GitHub - codemaker2015/streamlit-cheatsheet | github.com</a></p> <p>The article is also available on Medium.</p> <p>Here are some useful links,</p> <p><a href="proxy.php?url=https://docs.streamlit.io/library/get-started" rel="noopener noreferrer">Get started - Streamlit Docs | docs.streamlit.io</a></p> streamlit python cheatsheet beginners Talk with documents using LlamaIndex Vishnu Sivan Mon, 24 Jul 2023 19:08:33 +0000 https://dev.to/codemaker2015/talk-with-documents-using-llamaindex-24ln https://dev.to/codemaker2015/talk-with-documents-using-llamaindex-24ln <p>Discover the latest buzz in the tech world with LangChain and LlamaIndex! These open-source libraries offer developers the opportunity to harness the incredible power of Large Language Models (LLMs) in their applications. LlamaIndex acts as a central hub, seamlessly connecting LLMs with external data sources. Meanwhile, LangChain provides a robust framework for constructing and managing LLM-powered applications. 
Though still in development, these game-changing tools have the potential to revolutionize the way we build and integrate advanced language models.</p> <p>In this article, we will cover the basics of LlamaIndex and create a data extraction and analysis tool using LlamaIndex, LangChain and OpenAI.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>What are Large Language Models (LLMs)</li> <li>What is LangChain</li> <li>What is Streamlit</li> <li>Introduction to LlamaIndex</li> <li>Basic workflow of LlamaIndex</li> <li>LlamaIndex indices</li> <li>Creating a document extractor / analyzer application using LlamaIndex, LangChain and OpenAI</li> <li>Installing the dependencies</li> <li>Setting up environment variables</li> <li>Importing the libraries</li> <li>Designing the sidebar</li> <li>Defining the get_response method</li> <li>Designing streamlit input field and submit button</li> <li>Complete code for the app</li> <li>Running the app</li> </ul> <h3> What are Large Language Models (LLMs) </h3> <p>Large Language Models (LLMs) refer to powerful AI models that are designed to understand and generate human language. LLMs are characterized by their ability to process and generate text that is coherent, contextually relevant, and often indistinguishable from human-written content. These models are pre-trained on diverse and extensive corpora of text, such as books, articles, websites, and other sources of written language. During pre-training, the models learn to predict the next word in a given sentence or fill in missing words in a paragraph, which helps them capture grammar, syntax, and semantic relationships between words and phrases.</p> <p>Large Language Models have gained significant attention and popularity due to their versatility and the impressive quality of their language generation capabilities. 
They have found applications in various domains, including natural language processing, content creation, chatbots, virtual assistants, and even creative writing. However, it’s important to note that LLMs are still machines and may occasionally produce inaccurate or biased outputs, highlighting the need for careful evaluation and human oversight when using them in real-world applications.</p> <h3> What is LangChain </h3> <p>LangChain is an open-source framework developed to streamline the creation of applications powered by large language models (LLMs). It provides a comprehensive set of tools, components, and interfaces that simplify the development process of LLM-centric applications. By leveraging LangChain, developers can effortlessly manage interactions with language models, seamlessly connect various components, and integrate resources like APIs and databases. The LangChain platform also offers a range of embedded APIs that empower developers to incorporate language processing capabilities without starting from scratch.</p> <p>As natural language processing continues to advance and gain wider adoption, the potential applications of this technology become virtually boundless. 
Here are some notable features of LangChain:</p> <ul> <li>LangChain allows developers to tailor prompts according to their specific requirements, enabling more precise and relevant language model outputs.</li> <li>LangChain enables developers to manipulate context to establish and guide the context for improved precision and user satisfaction, enhancing the overall user experience.</li> <li>With LangChain, developers can construct chain link components, which facilitate advanced usage scenarios and provide greater flexibility in the application design.</li> <li>The framework provides versatile components that can be mixed and matched to suit specific application needs, providing a modular approach to development.</li> <li>LangChain supports the integration of various models, including popular ones like GPT and HuggingFace Hub, allowing developers to leverage the cutting-edge capabilities of these language models.</li> </ul> <h3> What is Streamlit </h3> <p>Streamlit is a Python library that enables the effortless creation and sharing of interactive web applications and data visualizations. It provides a user-friendly interface for developing interactive charts and graphs using popular data visualization libraries such as matplotlib, pandas, and plotly. With Streamlit, you can build web apps that respond in real-time to user input, making it easy to create dynamic and engaging data-driven applications.</p> <h3> Introduction to LlamaIndex </h3> <p>The primary concept behind LlamaIndex is the capability to query documents, whether they consist of text or code, using a language model (LLM) such as ChatGPT. It is an open-source project that serves as a bridge between large language models (LLMs) and external data sources such as APIs, PDFs, and SQL databases. It offers a straightforward interface and facilitates the creation of indices for both structured and unstructured data, effectively handling the variations among different data sources. 
LlamaIndex can store the necessary context for prompt engineering, address challenges when dealing with large context windows, and assist in balancing cost and performance considerations when executing queries.</p> <p><a href="proxy.php?url=https://huggingface.co/llamaindex" rel="noopener noreferrer">llamaindex (LlamaIndex) | Org profile for LlamaIndex on Hugging Face, the AI community building the future. | huggingface.co</a></p> <h4> Basic workflow of LlamaIndex </h4> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwn2eqa0zdpxz2aminqgz.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwn2eqa0zdpxz2aminqgz.png" alt="Basic workflow of LlamaIndex" width="800" height="230"></a></p> <ul> <li>The document is loaded into LlamaIndex using pre-built readers for various sources, including databases, Discord, Slack, Google Docs, Notion, and GitHub repositories.</li> <li>LlamaIndex parses the documents, breaking them down into nodes or chunks of text.</li> <li>An index is created to efficiently retrieve relevant data when querying the documents. The index can be stored in different ways, with the Vector Store being a commonly used method.</li> <li>To perform a query, the document is searched using the index stored in the vector store. 
The response is then sent back to the user.</li> </ul> <h4> LlamaIndex indices </h4> <p>LlamaIndex provides specialized indices in the form of unique data structures.</p> <ul> <li> <strong>Vector store index</strong>: Widely used for answering queries across a large corpus of data.</li> <li> <strong>List index</strong>: Beneficial for synthesizing answers that combine information from multiple data sources.</li> <li> <strong>Keyword table index</strong>: Useful for routing queries to different unrelated data sources.</li> <li> <strong>Knowledge graph index</strong>: Effective for constructing and utilizing knowledge graphs.</li> <li> <strong>Structured store index</strong>: Well-suited for handling structured data, such as SQL queries.</li> <li> <strong>Tree index</strong>: Valuable for summarizing collections of documents.</li> </ul> <h3> Creating a document extractor / analyzer application using LlamaIndex, LangChain and OpenAI </h3> <p>In the previous sections, we discussed the basics of LLMs, LangChain and LlamaIndex. In this section, we will create a basic document extractor / analyzer application using these generative AI tools. The application takes openai key and the directory path as inputs and provides an interface to interact with the documents listed in the specified directory.</p> <h3> Installing the dependencies </h3> <h4> Create and activate a virtual environment by executing the following command. </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>python -m venv venv source venv/bin/activate #for ubuntu venv/Scripts/activate #for windows </code></pre> </div> <h4> Install llama-index and streamlit libraries using pip. </h4> <p>Note that LlamaIndex requires python 3.8+ version to work. 
Use the pinned streamlit and llama-index versions below; the latest llama-index release raises a RateLimit error that has not yet been fixed by the LlamaIndex team.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>pip install streamlit==1.24.0
pip install llama-index==0.5.27
</code></pre> </div> <h4> Setting up environment variables </h4> <p>An OpenAI API key is required because LlamaIndex queries OpenAI models under the hood. Follow these steps to create a new key.</p> <ul> <li>Open platform.openai.com.</li> <li>Click on your name or icon in the top right corner of the page and select “API Keys”, or open the link directly — Account API Keys — OpenAI API.</li> <li>Click the Create new secret key button to create a new OpenAI key. <img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F12jzhrgxq9hny5hbdlvc.png" alt="OpenAI key" width="800" height="232"> </li> </ul> <h4> Importing the libraries </h4> <p>Import the necessary libraries by creating a file named <code>app.py</code> and adding the following code to it.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os, streamlit as st
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor, PromptHelper, ServiceContext
from langchain.llms.openai import OpenAI
</code></pre> </div> <h4> Designing the sidebar </h4> <p>Create a sidebar using the streamlit sidebar API to collect the OpenAI key and the directory path from the user.
Add the following code to the <code>app.py</code> file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>openai_api_key = st.sidebar.text_input(
    label="#### Your OpenAI API key 👇",
    placeholder="Paste your openAI API key, sk-",
    type="password")

directory_path = st.sidebar.text_input(
    label="#### Your data directory path 👇",
    placeholder="C:\data",
    type="default")
</code></pre> </div> <h4> Defining the get_response method </h4> <p>Create a <code>get_response()</code> method which takes <code>query</code>, <code>directory_path</code> and <code>openai_api_key</code> as arguments and returns the query response.</p> <p>Add the following code to the <code>app.py</code> file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>def get_response(query, directory_path, openai_api_key):
    # This example uses text-davinci-003 by default; feel free to change it if desired.
    # Skip the openai_api_key argument if you have already set it as an environment variable
    llm_predictor = LLMPredictor(llm=OpenAI(openai_api_key=openai_api_key, temperature=0, model_name="text-davinci-003"))

    # Configure prompt parameters and initialise the helper
    max_input_size = 4096
    num_output = 256
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    if os.path.isdir(directory_path):
        # Load documents from the given directory
        documents = SimpleDirectoryReader(directory_path).load_data()
        service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
        index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
        response = index.query(query)
        if response is None:
            st.error("Oops! No result found")
        else:
            st.success(response)
    else:
        st.error(f"Not a valid directory: {directory_path}")
</code></pre> </div> <h4> Understanding the code: </h4> <ul> <li>Create an object llm_predictor for the class LLMPredictor which accepts a parameter llm.
Specify the model text-davinci-003 from OpenAI’s API, along with the temperature and the OpenAI API key, as arguments.</li> <li>Create a PromptHelper by specifying the maximum input size (max_input_size), the number of output tokens (num_output), and the maximum chunk overlap (max_chunk_overlap).</li> <li>The SimpleDirectoryReader class reads data from a directory. It is given the directory path, and when its load_data method is invoked it loads the files from that directory and returns the successfully loaded documents.</li> <li>The GPTSimpleVectorIndex class establishes an index that enables efficient searching and retrieval of documents. To create this index, we use the from_documents method of the class, which requires two parameters: documents and service_context.</li> <li>The documents parameter represents the actual documents that will be indexed.</li> <li>The <code>service_context</code> parameter denotes the service context passed along with the documents.</li> <li>Query the documents by calling <code>index.query(query)</code>.</li> </ul> <h4> Designing streamlit input field and submit button </h4> <p>Create an input field and a submit button using streamlit to get the user queries.
Call the get_response() method inside the submit button handler to run llama-index and query the documents with the given input.</p> <p>Add the following code to the app.py file.<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code># Define a simple Streamlit app
st.title("ChatMATE")
query = st.text_input("What would you like to ask?", "")

# If the 'Submit' button is clicked
if st.button("Submit"):
    if not query.strip():
        st.error("Please provide the search query.")
    else:
        try:
            if len(openai_api_key) &gt; 0:
                get_response(query, directory_path, openai_api_key)
            else:
                st.error("Enter a valid OpenAI key")
        except Exception as e:
            st.error(f"An error occurred: {e}")
</code></pre> </div> <h4> Complete code for the app </h4> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>import os, streamlit as st
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor, PromptHelper, ServiceContext
from langchain.llms.openai import OpenAI

# Uncomment to specify your OpenAI API key here, or add the corresponding environment variable (recommended)
# os.environ['OPENAI_API_KEY'] = "sk-..."

# Provide the OpenAI key from the frontend if you are not using the line above to set the key
openai_api_key = st.sidebar.text_input(
    label="#### Your OpenAI API key 👇",
    placeholder="Paste your openAI API key, sk-",
    type="password")

directory_path = st.sidebar.text_input(
    label="#### Your data directory path 👇",
    placeholder="C:\data",
    type="default")

def get_response(query, directory_path, openai_api_key):
    # This example uses text-davinci-003 by default; feel free to change it if desired.
    # Skip the openai_api_key argument if you have already set it as an environment variable
    llm_predictor = LLMPredictor(llm=OpenAI(openai_api_key=openai_api_key, temperature=0, model_name="text-davinci-003"))

    # Configure prompt parameters and initialise the helper
    max_input_size = 4096
    num_output = 256
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    if os.path.isdir(directory_path):
        # Load documents from the given directory
        documents = SimpleDirectoryReader(directory_path).load_data()
        service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
        index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
        response = index.query(query)
        if response is None:
            st.error("Oops! No result found")
        else:
            st.success(response)
    else:
        st.error(f"Not a valid directory: {directory_path}")

# Define a simple Streamlit app
st.title("ChatMATE")
query = st.text_input("What would you like to ask?", "")

# If the 'Submit' button is clicked
if st.button("Submit"):
    if not query.strip():
        st.error("Please provide the search query.")
    else:
        try:
            if len(openai_api_key) &gt; 0:
                get_response(query, directory_path, openai_api_key)
            else:
                st.error("Enter a valid OpenAI key")
        except Exception as e:
            st.error(f"An error occurred: {e}")
</code></pre> </div> <h4> Running the app </h4> <p>To run the app, an OpenAI API key and a directory path are required. Create a few text files that contain your required content and place them in a directory. Specify that directory while running the app.
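For example, a small sample data directory can be created like this (the file names and contents are purely illustrative placeholders):

```python
import os

# Build a sample data directory with a couple of plain-text files
# (illustrative placeholder content only)
os.makedirs("data", exist_ok=True)

samples = {
    "quantum_physics.txt": "Quantum physics studies matter and energy at the smallest scales.",
    "quantum_computing.txt": "Quantum computers use qubits, which can exist in superposition.",
}
for name, text in samples.items():
    with open(os.path.join("data", name), "w", encoding="utf-8") as f:
        f.write(text)

print(sorted(os.listdir("data")))
```

Point the app's data directory path input at this folder when the sidebar asks for it.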
In this demo, text files containing information about quantum physics and quantum computing were used.</p> <p>Run the app using the following command,<br> </p> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>streamlit run app.py </code></pre> </div> <p>The output is as given below,</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08zyihf2aqxgnmoy7r6i.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08zyihf2aqxgnmoy7r6i.png" alt="output" width="800" height="358"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p><a href="proxy.php?url=https://github.com/codemaker2015/llamaindex-based-document-extractor" rel="noopener noreferrer">GitHub - codemaker2015/llamaindex-based-document-extractor | github.com</a></p> <p>Here are some useful links,</p> <ul> <li><a href="proxy.php?url=https://huggingface.co/llamaindex" rel="noopener noreferrer">llamaindex (LlamaIndex) | Org profile for LlamaIndex on Hugging Face, the AI community building the future. | huggingface.co</a></li> <li><a href="proxy.php?url=https://docs.langchain.com/docs/" rel="noopener noreferrer">🦜️🔗 LangChain | LangChain is a framework for developing applications powered by language models. 
| docs.langchain.com</a></li> <li><a href="proxy.php?url=https://github.com/yvann-hub/Robby-chatbot" rel="noopener noreferrer">GitHub - yvann-hub/Robby-chatbot: AI chatbot 🤖 for chat with CSV, PDF, TXT files 📄 and YTB videos… | github.com</a></li> </ul> Say hello to DragGAN — The cutting-edge AI tool now available! Vishnu Sivan Sat, 01 Jul 2023 17:05:20 +0000 https://dev.to/codemaker2015/say-hello-to-draggan-the-cutting-edge-ai-tool-now-available-51k1 https://dev.to/codemaker2015/say-hello-to-draggan-the-cutting-edge-ai-tool-now-available-51k1 <p>Exciting news for all image editing enthusiasts! The highly anticipated DragGAN code has finally been released and is now available under the CC-BY-NC license. With DragGAN, gone are the days of complex editing processes and painstaking adjustments. This remarkable solution introduces a whole new level of simplicity by allowing you to effortlessly drag elements within an image to transform their appearance. Drawing inspiration from the powerful StyleGAN3 and StyleGAN-Human models, DragGAN empowers users to manipulate various aspects of an image, whether it’s altering the dimensions of a car, modifying facial expressions, or even rotating the image as if it were a 3D model.</p> <p>In this article, we will go through the basics of DragGAN and try it out using Google Colab.</p> <h2> Getting Started </h2> <h3> Table of contents </h3> <ul> <li>Introduction to GAN</li> <li>Types of GAN</li> <li>StyleGAN3 and StyleGAN-Human</li> <li>Applications of GAN</li> <li>DragGAN</li> <li>How to use it</li> </ul> <h3> Introduction to GAN </h3> <p>A generative adversarial network (GAN) is a special kind of machine learning model that uses two neural networks to compete with each other. These neural networks, called the generator and the discriminator, work together in a game-like manner to improve their skills.
The generator tries to create realistic data that looks like the real thing, while the discriminator tries to figure out which data is real and which is fake. They learn from each other’s successes and failures, making the generator better at creating convincing fake data and the discriminator better at spotting fakes. GANs are used to make computer-generated images, videos, and other types of data that look very similar to what humans create.</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimp6y05fhfxfa4yfkt1a.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimp6y05fhfxfa4yfkt1a.png" alt="Overview of GAN Structure" width="577" height="254"></a><br> Image source: Overview of GAN Structure | Machine Learning | Google for Developers</p> <p>GANs can generate high-quality, realistic data that exhibits similar characteristics to the training dataset. GANs have found applications in various fields, including image synthesis, video generation, text-to-image translation, and more.</p> <h3> Types of GAN </h3> <p>GANs come in different types. Let’s explore some common GAN variants:</p> <ul> <li> <strong>Vanilla GAN</strong>: This is the simplest type of GAN, consisting of a generator and a discriminator. The generator creates images while the discriminator determines if an image is real or fake.</li> <li> <strong>Deep Convolutional GAN (DCGAN)</strong>: DCGAN employs deep convolutional neural networks to generate high-resolution and distinguishable images. 
Convolutional layers extract important details from the data, making it effective for image generation tasks.</li> <li> <strong>Progressive GAN</strong>: The generator starts by producing low-resolution images, and as the training progresses, it adds more details in subsequent layers. This approach enables faster training compared to non-progressive GANs and results in higher resolution images being generated.</li> <li> <strong>Conditional GAN</strong>: This type of GAN allows the network to be conditioned on specific information, such as class labels. It helps the GAN learn to differentiate between different classes by training with labeled images.</li> <li> <strong>CycleGAN</strong>: This type of GAN is often used for image style transfer, enabling the transformation between different image styles. For example, it can convert images from winter to summer or from a horse to a zebra. Applications like FaceApp utilize CycleGAN to alter facial appearances.</li> <li> <strong>Super Resolution GAN</strong>: This GAN type enhances low-resolution images by generating higher-resolution versions. It fills in missing details, improving the overall image quality.</li> <li> <strong>StyleGAN</strong>: Developed by Nvidia, StyleGAN generates high-quality, photorealistic images, especially focusing on realistic human faces. Users can manipulate the model to modify various aspects of the generated images.</li> </ul> <h3> StyleGAN3 and StyleGAN-Human </h3> <ul> <li> <strong>StyleGAN3</strong> — It is an evolution of the original StyleGAN that introduces several improvements and innovations to enhance the image generation process. It incorporates adaptive discriminator augmentation (ADA), a technique that dynamically adjusts the discriminator during training to improve the overall image quality. 
StyleGAN3 also introduces novel regularization methods, architectural modifications, and better optimization strategies, resulting in even more visually appealing and coherent face synthesis.</li> <li> <strong>StyleGAN-Human</strong> — It is a variant of StyleGAN3 that specifically focuses on generating realistic human faces. It leverages a large-scale dataset of human faces to learn intricate details, such as facial expressions, hair styles, and diverse characteristics.</li> </ul> <h3> Applications of GAN </h3> <p>GANs have gained popularity in online retail sales due to their ability to understand and recreate visual content accurately. They can fill in images from outlines, generate realistic images from text descriptions, and create photorealistic product prototypes. They learn from human movement patterns, predict future frames, and create deepfake videos in video production. Furthermore, GANs can generate realistic speech sounds and even generate text for various purposes like blogs, articles, and product descriptions.</p> <p>Let’s have a look at some of the use cases of GAN.</p> <ul> <li> <strong>Realistic 3D Object Generation</strong>: GANs have proven capable of generating three-dimensional objects, such as furniture models created by researchers at MIT that resemble designs crafted by humans. 
These models can be valuable for architectural visualization and video game production.</li> <li> <strong>Human Face Generation</strong>: GANs, such as Nvidia’s StyleGAN2, can generate highly realistic and believable human faces that appear to be genuine individuals.</li> <li> <strong>Video Game Character Creation</strong>: GANs have found applications in video game development, such as the use of GANs by Nvidia to generate new characters for the popular game Final Fantasy XV.</li> <li> <strong>Fashion Design Innovation</strong>: GANs have been utilized by clothing retailer H&amp;M to create fresh fashion designs inspired by existing styles, allowing for the development of unique apparel.</li> </ul> <h2> DragGAN </h2> <p>DragGAN is an exciting new AI application that revolutionizes photo and art adjustments with a simple drag-and-drop interface. It allows you to modify images across various categories like animals, cars, people, landscapes, and more. With DragGAN, you can reshape the image layout, adjust poses and shapes, and even change facial expressions of individuals in photos.</p> <p>According to the research team behind DragGAN, their aim is to provide users with the ability to “drag” any point in an image to their desired position.</p> <p>DragGAN comprises two key components. The first is feature-based motion supervision, which facilitates precise movement of points within the image. 
The second is a novel point tracking approach, ensuring accurate tracking of these points.</p> <h3> How to use it </h3> <p>In this section, we will try DragGAN using the official Git repository.</p> <p><a href="proxy.php?url=https://github.com/XingangPan/DragGAN" rel="noopener noreferrer">GitHub - XingangPan/DragGAN: Official Code for DragGAN (SIGGRAPH 2023)</a></p> <ul> <li><p>Open your Google Colab account using the link below.<br> <a href="proxy.php?url=https://colab.research.google.com/" rel="noopener noreferrer">Google Colaboratory | colab.research.google.com</a></p></li> <li><p>Click on the New notebook link to create a new notebook in Colab.<br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkbzpudx6g6ed230jvn4.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkbzpudx6g6ed230jvn4.png" alt="New notebook" width="800" height="585"></a></p></li> <li><p>Clone the official DragGAN Git repository using the following command.<br> </p></li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!git clone https://github.com/XingangPan/DragGAN.git </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnnwelhw85rfdl4c38u2b.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnnwelhw85rfdl4c38u2b.png" alt="git clone" width="800" height="106"></a></p> <ul> <li>Click on the
play button to execute the cell.</li> <li>Switch the runtime type to GPU from the Runtime → Change runtime type option; otherwise it may take much longer to process the results.</li> </ul> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupcj620hi7oy68pb13yz.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupcj620hi7oy68pb13yz.png" alt="Switch the runtime" width="640" height="536"></a><br> <a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa3v9rakggl3edrd718lu.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa3v9rakggl3edrd718lu.png" alt="Switch the runtime" width="720" height="429"></a></p> <ul> <li>Click on the + Code button to add new cells.</li> <li>Switch to the DragGAN directory using the cd command. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>cd /content/DragGAN </code></pre> </div> <ul> <li>Install the requirements from the requirements.txt file. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!pip install -r requirements.txt </code></pre> </div> <ul> <li>Download the pre-trained StyleGAN2 weights by executing the <code>download_model.sh</code> shell script using the command below.
If you want to try StyleGAN-Human and the Landscapes HQ (LHQ) dataset, download the weights from the following links: <a href="proxy.php?url=https://drive.google.com/file/d/1dlFEHbu-WzQWJl7nBBZYcTyo000H9hVm/view?usp=sharing" rel="noopener noreferrer">StyleGAN-Human</a>, <a href="proxy.php?url=https://drive.google.com/file/d/16twEf0T9QINAEoMsWefoWiyhcTd-aiWc/view?usp=sharing" rel="noopener noreferrer">LHQ</a>. </li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!sh scripts/download_model.sh </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctmkqfccis71oxe9f1ny.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctmkqfccis71oxe9f1ny.png" alt="download_model" width="800" height="114"></a></p> <ul> <li>Run the DragGAN visualizer, built with Gradio, using the following command. The system will provide a network URL once the visualizer is up and running. Click on that URL to try DragGAN.
</li> </ul> <div class="highlight js-code-highlight"> <pre class="highlight plaintext"><code>!python /content/DragGAN/visualizer_drag_gradio.py </code></pre> </div> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3xhjd925hl9iqv02ysfk.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3xhjd925hl9iqv02ysfk.png" alt="visualizer_drag_gradio" width="800" height="160"></a></p> <p>You will get the output as below,</p> <p><a href="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyerkwu1btppa2ssesk9b.png" class="article-body-image-wrapper"><img src="proxy.php?url=https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyerkwu1btppa2ssesk9b.png" alt="output" width="800" height="424"></a></p> <p>Thanks for reading this article.</p> <p>Thanks Gowri M Bhatt for reviewing the content.</p> <p>If you enjoyed this article, please click on the heart button ♥ and share to help others find it!</p> <p>The full source code for this tutorial can be found here,</p> <p>GitHub - codemaker2015/DragGAN-demo<br> Contribute to codemaker2015/DragGAN-demo development by creating an account on GitHub.<br> github.com</p> <p>The article is also available on <a href="proxy.php?url=https://codemaker2016.medium.com/say-hello-to-draggan-the-cutting-edge-ai-tool-now-available-e7be7ad2d635" rel="noopener noreferrer">Medium</a>.</p> <p>Here are some useful links,</p> <ul> <li><a 
href="proxy.php?url=https://www.unite.ai/what-is-a-generative-adversarial-network-gan/" rel="noopener noreferrer">What is a Generative Adversarial Network (GAN)? - Unite.AI</a></li> <li><a href="proxy.php?url=https://arxiv.org/pdf/2305.10973.pdf" rel="noopener noreferrer">https://arxiv.org/pdf/2305.10973.pdf</a></li> <li><a href="proxy.php?url=https://paperswithcode.com/method/stylegan" rel="noopener noreferrer">Papers with Code - StyleGAN Explained</a></li> <li><a href="proxy.php?url=https://github.com/NVlabs/stylegan3" rel="noopener noreferrer">GitHub - NVlabs/stylegan3: Official PyTorch implementation of StyleGAN3</a></li> </ul> ai python generativeai beginners