Inspiration
A lot of our friends and family in other countries struggled with healthcare, either because it was too expensive or because they didn't have access to it. We wanted to build a tool that would help people all over the world get a quick and accurate diagnosis.
What it does
MedSense takes in symptoms and images that the user sends through WhatsApp or Telegram. Those messages go to Gemini, which references the PubMed database to give the user an accurate diagnosis, probable causes, and home remedies. If the user shares their location, it also shows clinics they can go to.
Retrieval-Augmented Generation (RAG) is well suited to tasks like this, especially when dealing with sensitive health-related information. Minimizing LLM hallucinations and mistakes was our top priority, and grounding every response in PubMed retrieval is what ensured it.
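A minimal sketch of the kind of lookup our PubMed search tool performs, using NCBI's public E-utilities endpoints (the function names here are illustrative, not our exact code):

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def build_search_url(symptoms: str, max_results: int = 3) -> str:
    """Build an ESearch URL that finds PubMed IDs matching the symptoms."""
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "term": symptoms,
        "retmax": max_results,
        "retmode": "json",
        "sort": "relevance",
    })
    return f"{EUTILS}/esearch.fcgi?{params}"

def search_pubmed(symptoms: str, max_results: int = 3) -> list:
    """Return PubMed IDs for the query; the IDs are then passed to the
    EFetch endpoint to pull abstracts that ground the LLM's answer."""
    with urllib.request.urlopen(build_search_url(symptoms, max_results)) as resp:
        data = json.load(resp)
    return data["esearchresult"]["idlist"]
```

The retrieved abstracts are injected into the model's context, so the answer is anchored to published literature rather than the model's parametric memory.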
Furthermore, the implementation keeps track of the user's medical history, saving only the important facts and omitting unnecessary details, which improves helpfulness, accuracy, and cost-efficiency. The LangGraph agent also asynchronously follows up with the user 24 hours after they report symptoms, enabling more accurate diagnosis and dynamic support.
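The follow-up logic can be sketched like this (names are illustrative; the real agent persists profiles in a database and polls from a background task):

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

FOLLOW_UP_DELAY = timedelta(hours=24)

@dataclass
class UserProfile:
    # Only durable medical facts are kept; incidental chat is discarded.
    facts: dict = field(default_factory=dict)
    follow_ups: list = field(default_factory=list)

def record_symptom(profile: UserProfile, symptom: str, now: datetime) -> None:
    """Store the symptom as a fact and schedule a check-in 24 hours later."""
    profile.facts.setdefault("symptoms", []).append(symptom)
    profile.follow_ups.append({"symptom": symptom, "due": now + FOLLOW_UP_DELAY})

def due_follow_ups(profile: UserProfile, now: datetime) -> list:
    """Follow-ups whose timer has expired; a background task polls this
    and messages the user to ask how the symptom has developed."""
    return [f for f in profile.follow_ups if f["due"] <= now]
```

Keeping only distilled facts in the profile also keeps the prompt short, which is where the cost-efficiency comes from.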
Finally, the location feature allows the agent to point the user to the nearest medical facilities (hospitals, clinics, and so on) capable of treating the exact type of problem the user is experiencing. A background monitoring system also watches World Health Organization reports for new disease outbreaks and notifies the user as soon as one is spotted.
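Ranking the facilities that come back from an OpenStreetMap/Nominatim lookup is a matter of great-circle distance; a sketch of that step (the facility dicts here are a simplified stand-in for Nominatim results):

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def nearest_facilities(user_lat, user_lon, facilities, limit=3):
    """facilities: dicts like {"name": ..., "lat": ..., "lon": ...},
    e.g. parsed from a Nominatim search for hospitals or clinics."""
    return sorted(
        facilities,
        key=lambda f: haversine_km(user_lat, user_lon, f["lat"], f["lon"]),
    )[:limit]
```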
How we built it
We built it on a Flask/Python backend, using LangChain/LangGraph agents to implement all of the functionality.
Under the hood, the system is a generic LLM wrapper around Gemini with tools bound to it. The tools include PubMed API search, the OpenStreetMap API, various user-profile management tools (the ones that ensure only relevant information is stored), and several other complementary ones.
We also used webhooks to receive and send messages through WhatsApp and Telegram. When a first message arrives, a user profile is created in the database and paired with any data the user has provided (by default the bot asks for their sex and age). The bot then asks the user to describe the symptoms or questions they would like information or assistance with. From there, the bot walks the LangGraph graph: the LLM agent calls tools, the graph executes those tool requests, and the results are returned to the agent. This loop repeats until no further tool calls are detected.
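That loop can be sketched in plain Python (a simplified stand-in for the actual LangGraph graph; the dict-based message format and the `llm`/`tools` parameters are illustrative):

```python
def run_agent(llm, tools, messages, max_steps=8):
    """Let the model respond repeatedly; whenever it requests tools,
    execute them and feed the results back, until it answers in text."""
    for _ in range(max_steps):
        reply = llm(messages)                  # model turn
        messages.append(reply)
        calls = reply.get("tool_calls", [])
        if not calls:                          # no tool calls -> final answer
            return reply["content"]
        for call in calls:                     # the graph runs each tool
            result = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": result})
    raise RuntimeError("agent did not converge")
```

The `max_steps` cap is the safety valve that keeps a confused model from looping through tools forever.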
These tool calls usually include retrieval from the medical database, user-profile updates, and general information transfer (location, symptoms, past prescriptions, etc.). Once the loop finishes, the LLM generates a final response, including probable causes and citations to all the medical sources it used.
The entire graph is constructed with LangGraph and LangChain, coupled with various libraries that provide access to the APIs and webhooks. The server itself runs on Flask, which offered the versatility and stability we needed.
We're hosting the backend on Render with an uptime monitor to make sure it doesn't spin down.
Finally, it is important to note that the agent always makes clear that it is an AI-based assistant and does not replace real medical help. It is designed to provide quick, preliminary assessments of medical situations, not final diagnoses.
Challenges we ran into
One of the biggest challenges we had was configuring and designing the tool-using agent in LangGraph. It was a constant struggle to make sure the LLM called the correct tools at any given moment, and that each tool's arguments were designed so the model would not trip up while using them. The task required a lot of prompt engineering, and the prompt ended up hundreds of words long.
Furthermore, finding and integrating the right APIs was another difficult challenge. Gemini was easy, but we evaluated and tried several APIs before settling on PubMed as our medical database. We also faced a bug where the bot kept sending multiple messages, but with some perseverance we fixed that too.
Accomplishments that we're proud of
We managed to configure the complex execution graph that drives this agent. For most of the development period it would constantly break down, or the data flow would be flawed. Debugging the pipeline was an extensive and extremely complex task, and we are proud that we managed to make it into what it is now.
In addition, we got the webhooks functioning properly. Integrating them was a struggle; we tried many different approaches before one finally worked. This was a big deal because our whole project depends on MedSense running on WhatsApp and Telegram through webhooks.
What we learned
We learned the difficulties of creating a bot, accessing messages, and designing LLM agents. Configuring agents with precision, and writing prompts that work without causing the language model to swerve off topic (or forget a crucial task), was a massive challenge that required us to acquire many new skills.
What's next for MedSense
We want to publish this as a public bot on WhatsApp so that even more people can use it. Beyond that, we want to integrate SMS, which will be crucial for reaching people with very limited connectivity and helping them too.
Built With
- flask
- gemini
- langchain
- langgraph
- nominatim
- openstreetmap
- pubmed
- python
- render
- telegram