Inspiration
The advent of Large Language Models (LLMs) has significantly changed how researchers and AI scientists work with unstructured data. The ability to understand language and context in written text has accelerated AI applications in many fields. However, after the initial amazement phase, the limitations of LLMs, such as hallucinations or forgetfulness due to limited context size, became more obvious. While re-training an LLM is usually infeasible or too expensive for any purpose other than building a foundation model, fine-tuning an LLM (e.g., through Low-Rank Adaptation) or assisting it with Retrieval-Augmented Generation (RAG) are options for incorporating private or post-training data. The latter has gained popularity because transforming text into vector databases is relatively accessible.
The fundamental problem with LLM/RAG systems lies in how they understand textual information. While the LLM has contextual knowledge of the data it was trained on, RAG systems do not learn new connections. Beyond retrieving passages by vector similarity, knowledge in the new data remains hidden if it is scattered across a document or if different sections depend on each other for context. Adding a relational, structured layer on top of a RAG system is therefore an attractive way to mitigate wrong or low-quality responses and hallucinations.
What it does
Our project builds a relational structure (a knowledge graph) as an LLM agent encounters new information in real time. Think of an agent roaming an environment and encountering new objects: with the reasoning abilities of an LLM, the agent can generate information about each object. We used a minimal example of creating food relations. The same approach applies to retrieval in large documents that span thousands of pages. Use cases range from retrieving medical billing codes in healthcare, to understanding nuances in complex judicial cases, to giving a well-rounded picture of financial documents. In other words, our solution aims to include all relevant information before retrieval.
How we built it
The backend is a Python program that calls the Prediction Guard API to generate a knowledge graph from user input. The input string combines the user's prompt with a detailed system prompt instructing the model to build a knowledge graph from user-provided rules about food preferences. The model is expected to return a JSON object describing food items and the relations between them. The script is designed for planning nutritious, affordable meals based on user preferences: the user states rules such as 'I don't like bananas', and the model generates a knowledge graph that represents them. The graph contains nodes for different types of food (e.g., fruit, vegetable, carbohydrate, protein, fat, spice) with features such as calories, protein, carbs, fat, cost, and extra information, as well as relations between foods, such as which foods pair well together or which foods a user prefers or dislikes.
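To make the expected model output concrete, here is a minimal sketch of parsing the JSON knowledge graph described above. The exact field names (`nodes`, `relations`, `features`) and the sample values are illustrative assumptions, not the project's actual schema:

```python
import json

# Hypothetical example of the JSON shape the model is prompted to return:
# nodes are food items with nutritional/cost features, relations are typed edges.
RAW_RESPONSE = """
{
  "nodes": [
    {"id": "banana", "type": "fruit",
     "features": {"calories": 89, "protein": 1.1, "carbs": 23, "fat": 0.3, "cost": 0.25}},
    {"id": "oats", "type": "carbohydrate",
     "features": {"calories": 389, "protein": 17, "carbs": 66, "fat": 7, "cost": 0.40}}
  ],
  "relations": [
    {"source": "banana", "target": "oats", "type": "pairs_well_with"},
    {"source": "user", "target": "banana", "type": "dislikes"}
  ]
}
"""

def parse_graph(raw: str) -> dict:
    """Parse the model output and index nodes by id for quick lookup."""
    data = json.loads(raw)
    nodes = {n["id"]: n for n in data["nodes"]}
    return {"nodes": nodes, "relations": data["relations"]}

graph = parse_graph(RAW_RESPONSE)
# A rule like "I don't like bananas" surfaces as a "dislikes" relation:
disliked = [r["target"] for r in graph["relations"] if r["type"] == "dislikes"]
```

Indexing nodes by id keeps downstream lookups (e.g., filtering out disliked foods when composing a meal) cheap and simple.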
Challenges we ran into
Due to LLM hallucination, the generated output was not consistent enough. Reproducibility was a required property, yet challenging to achieve. Workarounds such as repeated generation helped only to a limited extent.
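The repeated-generation workaround can be sketched as a retry loop that re-prompts until the output parses as valid JSON with the expected top-level keys. The `call_model` callable and the `nodes`/`relations` keys are assumptions standing in for the actual Prediction Guard call and schema:

```python
import json

def generate_valid_graph(call_model, max_attempts=3):
    """Retry generation until the output is parseable JSON with the expected keys.

    `call_model` is a placeholder for the real LLM API call; it should
    return the raw model output as a string.
    """
    last_error = None
    for _ in range(max_attempts):
        raw = call_model()
        try:
            data = json.loads(raw)
            if {"nodes", "relations"} <= data.keys():
                return data
            last_error = ValueError("missing required keys")
        except json.JSONDecodeError as err:
            last_error = err
    raise RuntimeError(f"no valid graph after {max_attempts} attempts: {last_error}")

# Simulated model for illustration: first attempt is truncated JSON, second is valid.
_outputs = iter(['{"nodes": [', '{"nodes": [], "relations": []}'])
result = generate_valid_graph(lambda: next(_outputs))
```

This validates structure, not content, which is why it only limits rather than eliminates hallucinated relations.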