tag:github.com,2008:https://github.com/NVIDIA/GenerativeAIExamples/releases Release notes from GenerativeAIExamples 2024-08-20T20:59:36Z tag:github.com,2008:Repository/707237272/v0.8.0 2024-08-21T03:15:23Z v0.8.0 <p>This release completely refactors the directory structure of the repository for a more seamless and intuitive developer journey. It also adds support for deploying the latest accelerated embedding and reranking models across the cloud, data center, and workstation using <a href="https://docs.nvidia.com/nim/index.html#nemo-retriever" rel="nofollow">NVIDIA NeMo Retriever NIM microservices</a>.</p> <h3>Added</h3> <ul> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/examples">End-to-end RAG examples</a> enhancements <ul> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/README.md#try-it-now">Single-command deployment</a> for all the examples using Docker Compose.</li> <li>Each end-to-end RAG example is now self-contained, with documentation, code, and deployment assets residing in a dedicated example-specific directory.</li> <li>Segregated examples into <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/examples">basic and advanced RAG</a> with dedicated READMEs.</li> <li>Added reranker model support to the <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/examples/advanced_rag/multi_turn_rag">multi-turn RAG example</a>.</li> <li>Added a <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/docs/prompt-customization.md">dedicated prompt configuration file for every example</a>.</li> <li>Removed Python dev packages from containers to enhance security.</li> <li>Updated to the latest version of <a href="https://python.langchain.com/v0.2/docs/integrations/providers/nvidia/" rel="nofollow">langchain-nvidia-ai-endpoints</a>.</li> </ul> </li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/docs/riva-asr-tts.md">Speech support using RAG Playground</a> <ul> <li>Added support for accessing <a href="https://build.nvidia.com/explore/speech" rel="nofollow">RIVA 
speech models from NVIDIA API Catalog</a>.</li> <li>Speech support in RAG Playground is opt-in.</li> </ul> </li> <li>Documentation enhancements <ul> <li>Added more comprehensive <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/README.md#how-to-guides">how-to guides</a> for end-to-end RAG examples.</li> <li>Added <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/examples/basic_rag/langchain">example-specific architecture diagrams</a> in each example directory.</li> </ul> </li> <li>Added a new industry-specific <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/industries">top-level directory</a> <ul> <li>Added a <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/industries/healthcare/medical-device-training-assistant">healthcare-specific Medical Device Training Assistant RAG</a>.</li> </ul> </li> <li>Added notebooks showcasing new use cases <ul> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/notebooks/langchain/langchain_basic_RAG.ipynb">Basic LangChain-based RAG pipeline</a> using the latest NVIDIA API Catalog connectors.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/notebooks/llamaindex/llamaindex_basic_RAG.ipynb">Basic LlamaIndex-based RAG pipeline</a> using the latest NVIDIA API Catalog connectors.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/notebooks/langchain/NeMo_Guardrails_with_LangChain_RAG">NeMo Guardrails with a basic LangChain RAG</a>.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/notebooks/langchain/Using_NVIDIA_NIMs_with_NeMo_Guardrails">NVIDIA NIM microservices with NeMo Guardrails in a RAG pipeline</a>.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/notebooks/nemo/Nemo%20Evaluator%20Llama%203.1%20Workbook">Using NeMo Evaluator with Llama 3.1 8B Instruct</a>.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/RAG/notebooks/langchain/agentic_rag_with_nemo_retriever_nim.ipynb">Agentic RAG pipeline with NeMo Retriever and NIM for LLMs</a>.</li> </ul> </li> <li>Added a new 
<code>community</code> (formerly <code>experimental</code>) example <ul> <li>A simple web interface to interact with different <a href="/NVIDIA/GenerativeAIExamples/blob/v0.8.0/community/llm-prompt-design-helper">selectable NIM endpoints</a>; the interface supports designing a system prompt for calling the LLM.</li> </ul> </li> </ul> <h3>Changed</h3> <ul> <li>Major restructuring and reorganisation of the assets within the repository <ul> <li>The top-level <code>experimental</code> directory has been renamed to <code>community</code>.</li> <li>The top-level <code>RetrievalAugmentedGeneration</code> directory has been renamed to <code>RAG</code>.</li> <li>The Docker Compose files inside the top-level <code>deploy</code> directory have been migrated to example-specific directories under <code>RAG/examples</code>. The vector database and on-prem NIM microservices deployment files are under <code>RAG/examples/local_deploy</code>.</li> <li>The top-level <code>models</code> directory has been renamed to <code>finetuning</code>.</li> <li>The top-level <code>notebooks</code> directory has been moved under <code>RAG/notebooks</code> and organised by framework.</li> <li>The top-level <code>tools</code> directory has been migrated to <code>RAG/tools</code>.</li> <li>The top-level <code>integrations</code> directory has been moved into <code>RAG/src</code>.</li> <li><code>RetrievalAugmentedGeneration/common</code> now resides under <code>RAG/src/chain_server</code>.</li> <li><code>RetrievalAugmentedGeneration/frontend</code> now resides under <code>RAG/src/rag_playground/default</code>.</li> <li>The <code>5 mins RAG No GPU</code> example, previously under the top-level <code>examples</code> directory, is now under <code>community</code>.</li> </ul> </li> </ul> <h3>Deprecated</h3> <ul> <li>GitHub Pages-based documentation has been replaced with Markdown-based documentation.</li> <li>The top-level <code>examples</code> directory has been removed.</li> <li>The following notebooks were removed: 
<ul> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks/02_Option(1)_NVIDIA_AI_endpoint_simple.ipynb">02_Option(1)_NVIDIA_AI_endpoint_simple.ipynb</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks/02_Option(2)_minimalistic_RAG_with_langchain_local_HF_LLM.ipynb">02_Option(2)_minimalistic_RAG_with_langchain_local_HF_LLM.ipynb</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks/03_Option(1)_llama_index_with_NVIDIA_AI_endpoint.ipynb">03_Option(1)_llama_index_with_NVIDIA_AI_endpoint.ipynb</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks/03_Option(2)_llama_index_with_HF_local_LLM.ipynb">03_Option(2)_llama_index_with_HF_local_LLM.ipynb</a></li> </ul> </li> </ul> shubhadeepd tag:github.com,2008:Repository/707237272/v0.7.0 2024-06-18T15:52:25Z v0.7.0 <p>This release switches all examples to use cloud-hosted, GPU-accelerated LLM and embedding models from the <a href="https://build.nvidia.com" rel="nofollow">NVIDIA API Catalog</a> by default. It also deprecates support for deploying on-prem models using the NeMo Inference Framework Container and adds support for deploying accelerated generative AI models across the cloud, data center, and workstation using the <a href="https://docs.nvidia.com/nim/large-language-models/latest/introduction.html" rel="nofollow">latest NVIDIA NIM-LLM</a>.</p> <h3>Added</h3> <ul> <li>Added <a href="/NVIDIA/GenerativeAIExamples/blob/v0.7.0/deploy/compose/docker-compose-nim-ms.yaml">automatic model download and caching support for <code>nemo-retriever-embedding-microservice</code> and <code>nemo-retriever-reranking-microservice</code></a>. 
Updated steps to deploy these services can be found <a href="https://nvidia.github.io/GenerativeAIExamples/latest/nim-llms.html" rel="nofollow">here</a>.</li> <li><a href="https://nvidia.github.io/GenerativeAIExamples/latest/multimodal-data.html" rel="nofollow">Multimodal RAG Example enhancements</a> <ul> <li>Moved to the <a href="https://pypi.org/project/pdfplumber/" rel="nofollow">PDF Plumber library</a> for parsing text and images.</li> <li>Added <code>pgvector</code> vector DB support.</li> <li>Added support for ingesting files with the .pptx extension.</li> <li>Improved the accuracy of image parsing by using <a href="https://pypi.org/project/tesseract-ocr/" rel="nofollow">tesseract-ocr</a>.</li> </ul> </li> <li>Added a <a href="/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks/08_RAG_Langchain_with_Local_NIM.ipynb">new notebook showcasing a RAG use case with accelerated, NIM-based on-prem deployed models</a></li> <li>Added a <a href="/NVIDIA/GenerativeAIExamples/blob/v0.7.0/experimental/rag-developer-chatbot">new experimental example</a> showcasing how to create a developer-focused RAG chatbot using RAPIDS cuDF source code and API documentation.</li> <li>Added a <a href="/NVIDIA/GenerativeAIExamples/blob/v0.7.0/experimental/event-driven-rag-cve-analysis">new experimental example</a> demonstrating how NVIDIA Morpheus, NIMs, and RAG pipelines can be integrated to create LLM-based agent pipelines.</li> </ul> <h3>Changed</h3> <ul> <li>All examples now use Llama 3 models from the <a href="https://build.nvidia.com/search?term=llama3" rel="nofollow">NVIDIA API Catalog</a> by default. 
A summary of the updated examples and the models they use is available <a href="https://nvidia.github.io/GenerativeAIExamples/latest/index.html#developer-rag-examples" rel="nofollow">here</a>.</li> <li>Switched the default embedding model of all examples to the <a href="https://build.nvidia.com/snowflake/arctic-embed-l" rel="nofollow">Snowflake arctic-embed-l model</a>.</li> <li>Added more verbose logs and support to configure the <a href="https://nvidia.github.io/GenerativeAIExamples/latest/configuration.html#chain-server" rel="nofollow">log level for the chain server using the LOG_LEVEL environment variable</a>.</li> <li>Bumped the versions of the <code>langchain-nvidia-ai-endpoints</code> and <code>sentence-transformers</code> packages and the <code>milvus</code> containers.</li> <li>Updated base containers to use the Ubuntu 22.04 image <code>nvcr.io/nvidia/base/ubuntu:22.04_20240212</code>.</li> <li>Added <code>llama-index-readers-file</code> as a dependency to avoid runtime package installation within the chain server.</li> </ul> <h3>Deprecated</h3> <ul> <li>Deprecated support for on-prem LLM model deployment using the <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.6.0/deploy/compose/rag-app-text-chatbot.yaml#L2">NeMo Inference Framework Container</a>. Developers can use <a href="https://nvidia.github.io/GenerativeAIExamples/latest/nim-llms.html" rel="nofollow">NVIDIA NIM-LLM to deploy TensorRT-optimized models on-prem and plug them into the existing examples</a>.</li> <li>Deprecated <a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.6.0/deploy/k8s-operator/kube-trailblazer">Kubernetes operator support</a>.</li> <li>The <code>nvolveqa_40k</code> embedding model was deprecated from the <a href="https://build.nvidia.com" rel="nofollow">NVIDIA API Catalog</a>. 
Updated all <a href="/NVIDIA/GenerativeAIExamples/blob/v0.7.0/notebooks">notebooks</a> and <a href="/NVIDIA/GenerativeAIExamples/blob/v0.7.0/experimental">experimental artifacts</a> to use the <a href="https://build.nvidia.com/nvidia/embed-qa-4" rel="nofollow">NVIDIA embed-qa-4 model</a> instead.</li> <li>Removed <a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.6.0/notebooks">notebooks numbered 00-04</a>, which relied on on-prem LLM model deployment using the deprecated <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.6.0/deploy/compose/rag-app-text-chatbot.yaml#L2">NeMo Inference Framework Container</a>.</li> </ul> shubhadeepd tag:github.com,2008:Repository/707237272/v0.6.0 2024-05-10T17:19:40Z v0.6.0 <p>This release adds the ability to switch between <a href="https://build.nvidia.com/explore/discover" rel="nofollow">API Catalog</a> models and on-prem models using <a href="https://docs.nvidia.com/ai-enterprise/nim-llm/latest/index.html" rel="nofollow">NIM-LLM</a> and adds documentation on how to build a RAG application from scratch. 
It also releases a containerized end-to-end RAG evaluation application integrated with the RAG chain-server APIs.</p> <h3>Added</h3> <ul> <li>Ability to switch between <a href="https://build.nvidia.com/explore/discover" rel="nofollow">API Catalog</a> models and on-prem models using <a href="https://docs.nvidia.com/ai-enterprise/nim-llm/latest/index.html" rel="nofollow">NIM-LLM</a>.</li> <li>New API endpoint <ul> <li><code>/health</code> - Provides a health check for the chain server.</li> </ul> </li> <li>Containerized <a href="/NVIDIA/GenerativeAIExamples/blob/v0.6.0/tools/evaluation">evaluation application</a> for RAG pipeline accuracy measurement.</li> <li>Observability support for LangChain-based examples.</li> <li>New Notebooks <ul> <li>Added <a href="/NVIDIA/GenerativeAIExamples/blob/v0.6.0/notebooks/12_Chat_wtih_nvidia_financial_reports.ipynb">Chat with NVIDIA financial data</a> notebook.</li> <li>Added a notebook showcasing <a href="/NVIDIA/GenerativeAIExamples/blob/v0.6.0/notebooks/11_LangGraph_HandlingAgent_IntermediateSteps.ipynb">LangGraph agent handling</a>.</li> </ul> </li> <li>A <a href="https://nvidia.github.io/GenerativeAIExamples/latest/simple-examples.html" rel="nofollow">simple RAG example template</a> showcasing how to build an example from scratch.</li> </ul> <h3>Changed</h3> <ul> <li>Renamed the example <code>csv_rag</code> to <a href="/NVIDIA/GenerativeAIExamples/blob/v0.6.0/RetrievalAugmentedGeneration/examples/structured_data_rag">structured_data_rag</a>.</li> <li>Model engine name updates <ul> <li>The <code>nv-ai-foundation</code> and <code>nv-api-catalog</code> LLM engines have been renamed to <code>nvidia-ai-endpoints</code>.</li> <li>The <code>nv-ai-foundation</code> embedding engine has been renamed to <code>nvidia-ai-endpoints</code>.</li> </ul> </li> <li>Embedding model updates <ul> <li>The <code>developer_rag</code> example uses the <a href="https://huggingface.co/WhereIsAI/UAE-Large-V1" rel="nofollow">UAE-Large-V1</a> embedding model.</li> <li>Using <code>ai-embed-qa-4</code> 
for API Catalog examples instead of <code>nvolveqa_40k</code> as the embedding model.</li> </ul> </li> <li>Ingested data now persists across multiple sessions.</li> <li>Updated <code>langchain-nvidia-ai-endpoints</code> to version 0.0.11, enabling support for models like Llama 3.</li> <li>Added file-extension-based validation that throws an error for unsupported files.</li> <li>The default output token length in the UI has been increased from 250 to 1024 for more comprehensive responses.</li> <li>Stricter chain-server API validation to enhance API security.</li> <li>Updated the versions of llama-index and pymilvus.</li> <li>Updated the pgvector container to <code>pgvector/pgvector:pg16</code>.</li> <li>LLM model updates <ul> <li>The <a href="/NVIDIA/GenerativeAIExamples/blob/v0.6.0/RetrievalAugmentedGeneration/examples/multi_turn_rag">Multi-turn Chatbot</a> now uses the <code>ai-mixtral-8x7b-instruct</code> model for response generation.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.6.0/RetrievalAugmentedGeneration/examples/structured_data_rag">Structured data RAG</a> now uses <code>ai-llama3-70b</code> for response and code generation.</li> </ul> </li> </ul> sumitkbh tag:github.com,2008:Repository/707237272/v0.5.0 2024-03-20T18:23:30Z v0.5.0 <p>This release adds new dedicated RAG examples showcasing state-of-the-art use cases, switches to the latest <a href="https://build.nvidia.com/explore/discover" rel="nofollow">API Catalog endpoints from NVIDIA</a>, and refactors the API interface of the chain-server. 
This release also improves the developer experience by adding GitHub Pages-based documentation and streamlining the example deployment flow using dedicated compose files.</p> <h3>Added</h3> <ul> <li><a href="https://nvidia.github.io/GenerativeAIExamples/latest/index.html" rel="nofollow">GitHub Pages-based documentation.</a></li> <li>New examples showcasing <ul> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/RetrievalAugmentedGeneration/examples/multi_turn_rag">Multi-turn RAG</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/RetrievalAugmentedGeneration/examples/multimodal_rag">Multi-modal RAG</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/RetrievalAugmentedGeneration/examples/csv_rag">Structured data CSV RAG</a></li> </ul> </li> <li>Support for <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.5.0/docs/api_reference/openapi_schema.json">delete and list APIs</a> in the chain-server component</li> <li>Streamlined RAG example deployment <ul> <li>Dedicated new <a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/deploy/compose">docker compose files</a> for every example.</li> <li>Dedicated <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.5.0/deploy/compose/docker-compose-vectordb.yaml">docker compose files</a> for launching vector DB solutions.</li> </ul> </li> <li>New configurations to control the top-k and confidence score of the retrieval pipeline.</li> <li>Added <a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/models/NeMo/slm">a notebook</a> which covers how to train SLMs with various techniques using the NeMo Framework.</li> <li>Added more <a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/experimental">experimental examples</a> showcasing new use cases. 
<ul> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/experimental/oran-chatbot-multimodal">NVIDIA ORAN chatbot multimodal Assistant</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/experimental/synthetic-data-retriever-customization">NVIDIA Retrieval Customization</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/experimental/streaming_ingest_rag">NVIDIA RAG Streaming Document Ingestion Pipeline</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/v0.5.0/experimental/fm-asr-streaming-rag">NVIDIA Live FM Radio ASR RAG</a></li> </ul> </li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.5.0/notebooks/10_RAG_for_HTML_docs_with_Langchain_NVIDIA_AI_Endpoints.ipynb">New dedicated notebook</a> showcasing a RAG pipeline using web pages.</li> </ul> <h3>Changed</h3> <ul> <li>Switched from NVIDIA AI Foundation to <a href="https://build.nvidia.com/explore/discover" rel="nofollow">NVIDIA API Catalog endpoints</a> for accessing cloud-hosted LLM models.</li> <li>Refactored the <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.5.0/docs/api_reference/openapi_schema.json">API schema of the chain-server component</a> to support runtime configuration of LLM parameters such as temperature, max tokens, and chat history.</li> <li>Renamed the <code>llm-playground</code> service in compose files to <code>rag-playground</code>.</li> <li>Switched base containers for all components to Ubuntu instead of PyTorch, and optimized both container build time and container size.</li> <li>Deprecated YAML-based configuration to avoid confusion; all configuration is now environment-variable based.</li> <li>Removed the requirement to hardcode <code>NVIDIA_API_KEY</code> in the <code>compose.env</code> file.</li> <li>Upgraded all Python dependencies for the chain-server and rag-playground services.</li> </ul> <h3>Fixed</h3> <ul> <li>Fixed a bug causing hallucinated answers when the retriever 
fails to return any documents.</li> <li>Fixed accuracy issues across all the examples.</li> </ul> shubhadeepd tag:github.com,2008:Repository/707237272/v0.4.0 2024-02-22T20:51:31Z v0.4.0 <p>This release adds new dedicated notebooks showcasing usage of cloud-based NVIDIA AI Foundation models, upgrades the milvus container version to enable GPU-accelerated vector search, and adds support for the FAISS vector database. Detailed changes are listed below:</p> <h3>Added</h3> <ul> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.4.0/docs/rag/jupyter_server.md">New dedicated notebooks</a> showcasing usage of cloud-based NVIDIA AI Foundation models using LangChain connectors, as well as local model deployment using Hugging Face.</li> <li>Upgraded the milvus container version to enable GPU-accelerated vector search.</li> <li>Added support for interacting with models behind NeMo Inference Microservices using the new model engines <code>nemo-embed</code> and <code>nemo-infer</code>.</li> <li>Added support for providing an example-specific collection name for vector databases using an environment variable named <code>COLLECTION_NAME</code>.</li> <li>Added <code>faiss</code> as a generic vector database solution behind <code>utils.py</code>.</li> </ul> <h3>Changed</h3> <ul> <li>Upgraded and changed base containers for all components to PyTorch <code>23.12-py3</code>.</li> <li>Added a LangChain-specific vector database connector in <code>utils.py</code>.</li> <li>Changed speech support to use a single channel for Riva ASR and TTS.</li> <li>Changed the <code>get_llm</code> utility in <code>utils.py</code> to return a LangChain wrapper instead of LlamaIndex wrappers.</li> </ul> <h3>Fixed</h3> <ul> <li>Fixed a bug causing empty ratings in the evaluation notebook.</li> <li>Fixed the document search implementation of the query decomposition example.</li> </ul> sumitkbh tag:github.com,2008:Repository/707237272/v0.3.0 2024-01-22T16:48:50Z v0.3.0 <p>This release adds support for <a 
href="https://github.com/pgvector/pgvector">pgvector</a> vector DB, speech-in/speech-out support using RIVA, and RAG observability tooling. This release also adds a dedicated example of a RAG pipeline using only models from NVIDIA AI Foundation, and one example demonstrating query decomposition. Detailed changes are listed below:</p> <h3>Added</h3> <ul> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/docs/rag/aiplayground.md">New dedicated example</a> showcasing NVIDIA AI Playground-based models using LangChain connectors.</li> <li><a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/RetrievalAugmentedGeneration/README.md#5-qa-chatbot-with-task-decomposition-example----a100h100l40s">New example</a> demonstrating query decomposition.</li> <li>Support for using <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/RetrievalAugmentedGeneration/README.md#deploying-with-pgvector-vector-store">pgvector as a vector database in the canonical developer RAG example.</a></li> <li>Support for a speech-in/speech-out interface in the sample frontend leveraging RIVA Skills.</li> <li>New tool showcasing <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/tools/observability">RAG observability support.</a></li> <li>Support for on-prem deployment of <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/RetrievalAugmentedGeneration/README.md#6-qa-chatbot----nemotron-model">TRT-LLM-based Nemotron models.</a></li> </ul> <h3>Changed</h3> <ul> <li>Upgraded LangChain and LlamaIndex dependencies for all containers.</li> <li>Restructured <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/README.md">README</a> files to be more intuitive.</li> <li>Added a provision to plug in multiple examples using <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/RetrievalAugmentedGeneration/common/base.py">a common base class</a>.</li> <li>Changed the <code>minio</code> service's port to <code>9010</code> from <code>9000</code> in Docker-based deployment.</li> <li>Moved the <code>evaluation</code> directory from the top 
level to <code>tools</code> and created a <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/deploy/compose/docker-compose-evaluation.yaml">dedicated compose file</a>.</li> <li>Added an <a href="/NVIDIA/GenerativeAIExamples/blob/v0.3.0/experimental">experimental directory</a> for plugging in experimental features.</li> <li>Modified notebooks to use TRT-LLM and NVIDIA AI Foundation connectors from LangChain.</li> <li>Changed the <code>ai-playground</code> model engine name to <code>nv-ai-foundation</code> in configurations.</li> </ul> <h3>Fixed</h3> <ul> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/issues/19" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/GenerativeAIExamples/issues/19/hovercard">Fixed issue #19</a></li> </ul> sumitkbh tag:github.com,2008:Repository/707237272/v0.2.0 2023-12-15T20:24:48Z Release v0.2.0 <p>This release builds on the feedback received and brings many improvements, bug fixes, and new features. It is the first release to include support for NVIDIA AI Foundation models and for quantized LLM models. 
Detailed changes are listed below:</p> <h2>What's Added</h2> <ul> <li>Support for using <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/docs/rag/aiplayground.md#using-nvdia-cloud-based-llms">NVIDIA AI Foundation LLM models</a></li> <li>Support for using <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/docs/rag/aiplayground.md#using-nvidia-cloud-based-embedding-models">NVIDIA AI Foundation embedding models</a></li> <li>Support for <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/docs/rag/llm_inference_server.md#quantized-llama2-model-deployment">deploying and using quantized LLM models</a></li> <li>Support for <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/evaluation/README.md">evaluating the RAG pipeline</a></li> </ul> <h2>What's Changed</h2> <ul> <li>Repository restructuring to better support open-source contributions</li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/RetrievalAugmentedGeneration/requirements.txt">Upgraded dependencies</a> for the chain server container</li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/RetrievalAugmentedGeneration/llm-inference-server/Dockerfile">Upgraded NeMo Inference Framework container version</a>; no separate sign-up is now needed for access.</li> <li>The main <a href="https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.2.0/README.md">README</a> now provides more details.</li> <li>Documentation improvements.</li> <li>Better error handling and reporting mechanisms for corner cases.</li> <li>Renamed the triton-inference-server container and service to llm-inference-server</li> </ul> <h2>What's Fixed</h2> <ul> <li><a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="2026877567" data-permission-text="Title is private" data-url="https://github.com/NVIDIA/GenerativeAIExamples/issues/13" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/GenerativeAIExamples/issues/13/hovercard" 
href="https://github.com/NVIDIA/GenerativeAIExamples/issues/13">#13</a>: the pipeline was unable to answer questions unrelated to the knowledge base</li> <li><a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id="2026820965" data-permission-text="Title is private" data-url="https://github.com/NVIDIA/GenerativeAIExamples/issues/12" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/GenerativeAIExamples/issues/12/hovercard" href="https://github.com/NVIDIA/GenerativeAIExamples/issues/12">#12</a>: type checking while uploading PDF files</li> </ul> shubhadeepd tag:github.com,2008:Repository/707237272/v0.1.0 2023-11-16T19:51:40Z v0.1.0 <p>Bump postcss and next (<a class="issue-link js-issue-link" href="https://github.com/NVIDIA/GenerativeAIExamples/pull/4">#4</a>)</p> <p>Bumps [postcss](<a href="https://github.com/postcss/postcss">https://github.com/postcss/postcss</a>) to 8.4.31 and updates ancestor dependency [next](<a href="https://github.com/vercel/next.js">https://github.com/vercel/next.js</a>). 
These dependencies need to be updated together.</p> <p>Updates `postcss` from 8.4.14 to 8.4.31 <br />- [Release notes](<a href="https://github.com/postcss/postcss/releases">https://github.com/postcss/postcss/releases</a>) <br />- [Changelog](<a href="https://github.com/postcss/postcss/blob/main/CHANGELOG.md">https://github.com/postcss/postcss/blob/main/CHANGELOG.md</a>) <br />- [Commits](<a class="commit-link" href="https://github.com/postcss/postcss/compare/8.4.14...8.4.31">postcss/postcss@<tt>8.4.14...8.4.31</tt></a>)</p> <p>Updates `next` from 13.4.12 to 13.5.6 <br />- [Release notes](<a href="https://github.com/vercel/next.js/releases">https://github.com/vercel/next.js/releases</a>) <br />- [Changelog](<a href="https://github.com/vercel/next.js/blob/canary/release.js">https://github.com/vercel/next.js/blob/canary/release.js</a>) <br />- [Commits](<a class="commit-link" href="https://github.com/vercel/next.js/compare/v13.4.12...v13.5.6">vercel/next.js@<tt>v13.4.12...v13.5.6</tt></a>)</p> <p>--- <br />updated-dependencies: <br />- dependency-name: postcss <br /> dependency-type: indirect <br />- dependency-name: next <br /> dependency-type: direct:production <br />...</p> <p>Signed-off-by: dependabot[bot] &lt;[email protected]&gt; <br />Co-authored-by: dependabot[bot] &lt;49699333+dependabot[bot]@users.noreply.github.com&gt;</p> sumitkbh
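<p>Several releases above expose tunable retrieval behaviour; v0.5.0, for example, adds configurations to control the top-k and confidence score of the retrieval pipeline. As a minimal sketch of what such a filter does (all names here are illustrative and are not the repository's actual code):</p>

```python
# Hypothetical sketch of a "top k + confidence score" retrieval filter, as
# introduced conceptually in v0.5.0. RetrievedChunk and filter_chunks are
# illustrative names, not APIs from the GenerativeAIExamples repository.
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    text: str
    score: float  # similarity/confidence score reported by the vector DB


def filter_chunks(chunks, top_k=4, score_threshold=0.25):
    """Keep at most top_k chunks whose confidence meets the threshold."""
    kept = [c for c in chunks if c.score >= score_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]


chunks = [
    RetrievedChunk("a", 0.9),
    RetrievedChunk("b", 0.1),
    RetrievedChunk("c", 0.5),
    RetrievedChunk("d", 0.6),
    RetrievedChunk("e", 0.7),
]
print([c.text for c in filter_chunks(chunks, top_k=3)])  # prints ['a', 'e', 'd']
```

<p>In the actual examples, both knobs are driven by environment-variable configuration rather than function arguments, matching the repository's move away from YAML-based configuration.</p>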