What it does

KnowledGenie is a web application that lets users upload a PDF and ask questions about it; each answer is returned along with the reference text it was drawn from. The application uses a large language model (LLM) to process the PDF and generate answers to the user's queries. The LLM is trained on a massive dataset of text and code, which allows it to understand and respond to a wide range of queries.
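The retrieve-then-answer idea behind this can be sketched in plain Python. This is an illustrative simplification, not the project's actual code: it splits a document into chunks and picks the chunk with the most word overlap with the query as the "reference text" (the real app uses LangChain and an LLM for this).

```python
import re

def split_into_chunks(text, size=6):
    """Split raw document text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def _tokens(s):
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9-]+", s.lower()))

def best_reference(query, chunks):
    """Pick the chunk sharing the most words with the query."""
    q = _tokens(query)
    return max(chunks, key=lambda c: len(q & _tokens(c)))

doc = ("Flan-T5 is an instruction-tuned model. "
       "LangChain chains LLM calls together. "
       "Flask serves the web backend.")
chunks = split_into_chunks(doc, size=6)
ref = best_reference("what serves the backend", chunks)
print(ref)  # → "the web backend."
```

In the real application the overlap heuristic is replaced by embedding-based retrieval and the LLM generates the answer, but the shape of the pipeline (chunk, retrieve, answer with reference) is the same.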

How we built it

We built KnowledGenie using LangChain with Hugging Face LLMs. LangChain is a framework that makes it easy to build LLM-powered applications. The Hugging Face Hub is a platform with over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, where people can easily collaborate and build ML together.

The model used for answering queries in KnowledGenie is flan-t5-large, a Transformer-based LLM trained on a massive dataset of text and code. The model can generate text, translate languages, write different kinds of creative content, and answer questions in an informative way. In our case, we used flan-t5-large to answer the user's queries and point to where in the uploaded document each answer was found.
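A question-answering prompt for a model like flan-t5-large typically "stuffs" the retrieved chunks into the context, which is the pattern LangChain's stuff-style QA chains follow. The sketch below is a hypothetical illustration of that prompt format, not the exact prompt the app uses:

```python
# Hypothetical "stuff"-style QA prompt: retrieved document chunks are
# placed into the context and the model is asked to answer from them only.

def build_qa_prompt(question, context_chunks):
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_qa_prompt(
    "Which framework serves the backend?",
    ["Flask serves the web backend.", "LangChain orchestrates the LLM."],
)
print(prompt)
```

Because the retrieved chunks are part of the prompt, the application can show the user exactly which passage the answer came from.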

The backend is built with Flask; it handles the user input and calls the function that runs the LLM through LangChain. The model is downloaded from the Hugging Face Hub and run over the uploaded PDF. The answer to the query is sent to the frontend and shown in the chat interface.

Challenges we ran into

One of the major roadblocks was choosing the right approach for question answering. In LangChain, there are multiple ways to accomplish the same task, so it took us quite some time to find the way to answer the user's queries in the least time possible.

We also faced some issues with the Hugging Face API, which we solved by asking users to provide their own Hugging Face API token so that they don't face timeouts or internal errors.
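Accepting a user-supplied token while keeping an environment-variable fallback can be done in a few lines. This is a sketch under assumptions: the helper name is hypothetical, and `HUGGINGFACEHUB_API_TOKEN` is the environment variable LangChain's Hugging Face integration conventionally reads.

```python
import os

def resolve_hf_token(user_token=None):
    """Prefer a user-supplied Hugging Face token; fall back to the
    environment so one shared key isn't rate-limited for everyone."""
    token = user_token or os.environ.get("HUGGINGFACEHUB_API_TOKEN", "")
    if not token:
        raise ValueError("A Hugging Face API token is required.")
    return token

print(resolve_hf_token("hf_example_user_token"))  # prints the user's token
```

Letting each user bring their own token spreads the request quota across users instead of exhausting a single shared key.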

Hosting the whole application was also hectic. We realised that running the LLM on the same instance was not the most efficient solution. Because the dependencies for both the web and the ML parts were being installed on the same instance, building and deploying the app cost us a lot of time. This was especially problematic because building the dependencies locally took even longer, so we unfortunately could not test the application locally as intended.

Accomplishments that we're proud of

We are proud of the fact that we were able to build KnowledGenie using a relatively new technology. LLMs are still in their early stages of development, but we believe that they have the potential to revolutionize the way we interact with information.

We are also proud of the fact that we were able to build a user-friendly application that is easy to use and understand. We believe that KnowledGenie has the potential to be a valuable tool for students, researchers, and anyone else who needs to access information quickly and easily.

What we learned

We learned a lot about LLMs while building KnowledGenie: the different types of LLMs, how they are trained, and how they can be used to generate text, translate languages, write different kinds of creative content, and answer questions in an informative way.

We also learned about working together and collaborating. Integrating all the different components of the application (the frontend, the Flask backend, and the LangChain pipeline) required close coordination within the team.

What's next for KnowledGenie

We have a number of ideas for future development of KnowledGenie. We plan to add new features, such as the ability to process videos and allow users to ask questions to a video and retrieve relevant answers from it. We also plan to make KnowledGenie more accessible to a wider range of users by translating the application into multiple languages.

We are excited to see what the future holds for KnowledGenie, and we are grateful for the opportunity to share our project with you.

Built With

flask, flan-t5, hugging-face, langchain
