Inspiration

There is a lot of disinformation and misinformation spread around the world, especially about underrepresented groups such as the LGBTQ2+ community. Transgender issues are in the spotlight now, both in news articles and in new laws. PrismBot is a chat bot that answers difficult questions with answers grounded in reliable sources, including Rainbow Health Ontario. It provides a source of truth in this world of misinformation.

What it does

Via the chat interface, users can ask any question they want. Questions are compared against our database of reliable information to provide grounded answers. If the bot doesn't know the answer, it says so explicitly rather than hallucinating. It cites the sources it retrieved information from, so the user knows exactly where to look. There is also a "Take me to safety!" button on the left that lets the user quickly leave the website. No user information is stored.

How we built it

Using LangChain, I was able to connect all the components together. To parse documents, I use PyPDF to read PDFs and extract the text. This text is split into manageable chunks and converted into embeddings using Vertex AI. Everything is then stored in a Redis database. Websites are processed in a similar way: their content is extracted first and then split.
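
As a rough illustration of the splitting step, here is a minimal plain-Python sketch. It is not the code PrismBot runs (the project relies on LangChain for this). The function name `split_text`, the character-based windows, and the `chunk_size` and `overlap` values are all assumptions made for the example.

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    chunk_size and overlap are illustrative defaults, not PrismBot's settings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.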

The front end is built using Streamlit. User queries are parsed and converted into embeddings. A similarity search is then run against the Redis database to find relevant information. Using Gemini Pro, the retrieved information is turned into user-friendly text before being presented.
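
The similarity search can be pictured with a small in-memory sketch. It assumes cosine similarity as the distance measure, uses a plain Python list in place of Redis, and uses toy two-dimensional vectors in place of Vertex AI embeddings. The names `cosine_similarity` and `top_k` are hypothetical and are not taken from the project.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 3):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy store: (chunk text, embedding) pairs standing in for the database.
    store = [("chunk a", [1.0, 0.0]), ("chunk b", [0.0, 1.0]), ("chunk c", [1.0, 1.0])]
    for score, text in top_k([1.0, 0.0], store, k=2):
        print(f"{score:.3f}  {text}")
```

In the real app the vector database performs this ranking itself; the sketch only shows what "most similar" means.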

The app is built with GitHub Actions and Docker so that it can be deployed anywhere.

Using Google Cloud

Google Cloud Platform is one of the crucial pillars of this project.

Using Vertex AI, I was able to get all the resources I needed. Embeddings were generated with the "textembedding-gecko@001" model, which allowed quick processing of data for both input and output.

The large language model used for retrieval-augmented generation was "gemini-pro". Gemini used the content from the sources to properly format answers. It also determined whether questions were outside the scope of the database.
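
One common way to get this behaviour is to place the retrieved chunks in the prompt and tell the model to fall back to a fixed reply when they do not contain the answer. The sketch below only builds such a prompt as a string and does not call Gemini. The prompt wording, the `FALLBACK` text, and the `build_prompt` name are assumptions for illustration, not PrismBot's actual prompt.

```python
FALLBACK = "I don't know."  # hypothetical refusal text, not PrismBot's wording


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt from a question and retrieved chunks."""
    # Number each chunk so the model can refer back to its sources.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the numbered context below.\n"
        f"If the context does not contain the answer, reply exactly: {FALLBACK}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    print(build_prompt("What is X?", ["X is a thing.", "Y is something else."]))
```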

I did use Google Cloud Run to deploy the app online, but I encountered issues with authentication. It did deploy automatically from GitHub, but it wasn't able to process any user requests. The version linked below is a few revisions behind; it cannot access Vertex AI and errors out.

Challenges we ran into

The challenge that took the most time was GCP credentials. The app is deployed to Google Cloud, but it isn't able to access the credentials.

Accomplishments that we're proud of

  • The bot cites sources properly
  • It works fast

What we learned

  • How to integrate Vertex AI to create a chat bot
  • How to cite documents including PDF page numbers
  • How to process PDFs and websites
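
The citation step can be sketched as follows, assuming each retrieved chunk carries a `source` entry and, for PDFs, a zero-based `page` entry in its metadata. Those key names, the zero-based numbering, and the `format_citations` helper are assumptions for this example and not the project's code.

```python
def format_citations(chunks: list[dict]) -> list[str]:
    """Build a de-duplicated, order-preserving list of citation strings."""
    seen = set()
    citations = []
    for chunk in chunks:
        source = chunk["source"]
        page = chunk.get("page")  # None for sources without pages, e.g. websites
        key = (source, page)
        if key in seen:
            continue  # the same page was already cited
        seen.add(key)
        if page is None:
            citations.append(source)
        else:
            # Assumed zero-based page index, shown to the user as one-based.
            citations.append(f"{source}, p. {page + 1}")
    return citations


if __name__ == "__main__":
    retrieved = [
        {"source": "guide.pdf", "page": 0},
        {"source": "guide.pdf", "page": 0},
        {"source": "https://example.org/faq"},
        {"source": "guide.pdf", "page": 4},
    ]
    for line in format_citations(retrieved):
        print(line)
```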

What's next for PrismBot

  • Further data scraping
  • Remove sources when the bot cannot answer a question
