Inspiration
When lives are at stake, ignorance isn’t an option.
Over 60% of Americans aren’t confident that they understand their doctor’s advice and information during a visit. Nearly 25% claim that they don't understand a single thing their physician recommended.
We have a new epidemic on our hands: a widespread lack of health literacy.
Health literacy -- the ability to obtain, understand, and act on health information -- is one of the most important prerequisites for promoting a healthy society. Despite its necessity, it is one of the most overlooked crises worldwide. People with low health literacy can spend up to an additional $8,000 a year in healthcare costs, contributing to an estimated $238 billion in excess healthcare expenditure across the country.
Here's the worst part: marginalized individuals living below the poverty line are the most likely group to have inadequate health literacy. In the status quo, the people who earn the least are punished the most by a simple lack of information.
What if we had a tool that could help doctors educate their patients on their diagnoses while increasing their efficiency in the process?
Enter Thrive AI
Thrive is an AI-powered visualization tool that can bridge the knowledge gap between a veteran medical practitioner and someone who just had their first annual exam. By providing doctors with a resource that can convey information in an easy-to-digest format, we help ensure that people have enough context to make educated decisions about their health and well-being.
What it does
The core mission of Thrive is exactly what its name suggests: to help people maintain the best version of themselves through informed healthcare decisions.
Think of Thrive as an all-in-one personal assistant tailored specifically to your medical results. Since our application has widespread uses across a variety of diseases, for TreeHacks we decided to focus our functionality on serious illnesses that manifest within the lungs.
When patients open their results with Thrive, they’re greeted with 3 key components:
- A comprehensive CT scan/image of their lungs with information about specific biological markers.
- An intelligent chat interface that can answer any questions patients have about their summary and next steps.
- A personalized message from the physician outlining the results.
Together, these three components provide a comprehensive overview of the information given by a physician.
The scan highlights specific areas within the lung that are likely to have led to the patient’s diagnosis, providing them with an interactive visual understanding of the factors that contributed to their positive/negative result. The chat interface allows the patient to directly “talk” to their results and ask clarifying questions about anything related to the results or the diagnosis -- including clarifying information about the CT scan and next steps. The doctor’s note facilitates a pivotal personal communication between the patient and physician while also providing context for the chat interface to reference in its responses.
When would Thrive be used?
Thrive has two primary use environments: large healthcare providers and smaller private practices. In both cases, our application aims to serve the same purpose -- acting as a powerful tool to supplement the communication between a physician and their patient, not replace it. When coupled with in-person visits and exams, Thrive is a powerful alternative to the textbooks' worth of paperwork generally handed to patients after being diagnosed with a medical condition.
Holistically, we hope to be a powerful resource for patients looking to better understand their health and make informed decisions about their treatment options, increasing overall trust and transparency within the healthcare system.
How we built it
Our project had 3 core facets: backend/data storage, machine learning implementation, and frontend.
Backend/data storage
Account data and CT scans are stored in a locally hosted JSON database served through Flask. Passwords are hashed before storage -- never kept in plaintext -- to help ensure that user privacy is maintained.
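A minimal sketch of how such a store could work, assuming a flat JSON file and PBKDF2 password hashing -- the file name, iteration count, and helper names here are illustrative, not our production code:

```python
import hashlib
import json
import os
import secrets

DB_PATH = "accounts.json"  # illustrative local JSON store


def hash_password(password, salt=None):
    """Derive a salted PBKDF2 hash so the plaintext password is never stored."""
    salt = salt or secrets.token_hex(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt), 100_000
    )
    return salt, digest.hex()


def _load_db():
    if os.path.exists(DB_PATH):
        with open(DB_PATH) as f:
            return json.load(f)
    return {}


def create_account(username, password):
    db = _load_db()
    salt, digest = hash_password(password)
    db[username] = {"salt": salt, "hash": digest}
    with open(DB_PATH, "w") as f:
        json.dump(db, f)


def verify_login(username, password):
    record = _load_db().get(username)
    if record is None:
        return False
    # Re-derive with the stored salt and compare in constant time
    _, digest = hash_password(password, record["salt"])
    return secrets.compare_digest(digest, record["hash"])
```

Because only the salt and digest ever touch disk, a leaked JSON file does not expose the original passwords.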
Machine learning implementation
Our project relies heavily on two machine learning features running independently of one another. The first is a model trained on labeled CT scan data from existing COVID patients. We used this model to generate Grad-CAM heatmaps of the lung regions the model found most relevant to its analysis. The second is the integration of a large language model to summarize patient data. We used OpenAI's Davinci model via the OpenAI API and tuned it for this specific medical use case, restricting its responses to information pertinent to the diagnosis at hand.
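The core of the Grad-CAM step can be sketched with plain NumPy -- this is an illustrative reimplementation of the standard weighting math, not our training code, and it assumes the activations and gradients of the last convolutional layer have already been extracted by the framework:

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Grad-CAM: weight each feature map by its average gradient,
    sum across channels, apply ReLU, and normalize to [0, 1].

    activations, gradients: (H, W, C) arrays from the last conv layer.
    """
    # Channel importance = global-average-pooled gradients
    weights = gradients.mean(axis=(0, 1))                      # shape (C,)
    # Weighted sum of feature maps -> coarse localization map
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # shape (H, W)
    cam = np.maximum(cam, 0.0)                                 # ReLU: keep positive evidence
    if cam.max() > 0:
        cam /= cam.max()                                       # normalize for overlay
    return cam
```

The resulting map is upsampled to the CT scan's resolution and overlaid as a heatmap, highlighting the regions that drove the classification.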
Frontend
The frontend, or visible interface of our application, was built with React.js, a JavaScript library for building user interfaces on top of HTML and CSS. To streamline our development process, we used Tailwind CSS, a utility-first styling framework, in conjunction with DaisyUI, a library of customizable, premade components built on Tailwind.
Challenges we ran into
As we built this project, we faced a dizzying number of obstacles and hurdles. Some of them included:
- Figuring out how to manage hundreds of dependencies across three separate machines
- Training different machine learning models and comparing results to achieve the highest accuracy
- Connecting backend APIs and databases to the frontend for data display
- Writing this very summary!
Accomplishments that we're proud of
One of our most significant accomplishments was the implementation of an intelligent chat interface, which enables patients to ask questions about their diagnosis and receive immediate answers grounded solely in the information their doctor has provided.
Another key accomplishment was the incorporation of machine learning into the development of Thrive. This allowed us to analyze large amounts of data and identify patterns and insights about the impact of COVID on the lungs that would be difficult to detect manually. By using machine learning, we were able to highlight specific areas that likely contributed to the patient's diagnosis and provide an interactive visual understanding of the factors behind their results.
Our team also focused on creating a personalized experience for each user. Thrive pairs a comprehensive CT scan/image of the lungs, annotated with specific biological markers, with a personalized message from the physician outlining the results. Additionally, users can log into their own portals with industry-standard encryption and view their data isolated from everyone else using the service.
What we learned
During the hackathon, our team gained a variety of valuable skills and experiences related to healthcare technology development. From a technical standpoint, we learned more about the intricacies of web development, database management, NLP, and machine learning, all of which are critical tools for creating powerful and effective healthcare applications. We also gained a deeper understanding of medical imaging and the complex factors involved in the diagnosis of serious lung diseases.
Throughout the project, we also honed our design thinking skills, which helped us prioritize user needs and experiences when developing Thrive. By empathizing with patients and identifying pain points in the healthcare system, we were able to create an application that provides users with the information and support they need to make informed healthcare decisions.
Another key aspect of our experience during the hackathon was collaboration and communication. We learned how to divide tasks, delegate responsibilities, and communicate effectively with one another, which was essential for ensuring that the project was completed on time and to a high standard.
What's next for Thrive AI
Although TreeHacks helped our team demonstrate the power of context-driven queries using large language models for data interpretation, much of our application was restricted by our initial COVID-19 dataset. In the future, we hope to make Thrive generalizable across a wide range of diseases and applications, transforming it into a key assistant for physicians to interact with their patients. Another key feature we hope to implement is the ability to add compartmentalized modules of data for the large language model to draw inferences from. This would enable doctors to add and remove pertinent data at will via real-time updates to the model's context.
Ethical considerations
Healthcare has been established as a basic human right.
When over 77 million Americans struggle to use health services, maintain healthy behaviors, and obtain quality care, the ability to understand that care should be a right too.
Safely and ethically bolstering health literacy isn't something that can be taken lightly -- when dealing with inferences made on medical information, it is imperative that data is presented and interpreted accurately. If these two conditions aren't met with the utmost precision, we run the risk of undermining the very principles we aim to uphold.
To tackle these challenges, we followed considerations posited by medically-involved software engineer Alan Cossitt of the Hastings Center, who outlined a set of ethical questions for the use of technology in medicine. For our specific application, Cossitt raised two pertinent issues: who should have access to AI-generated data, and what type of patient consent is ethical for predictive algorithms?
How did we incorporate these considerations into our design?
- Access to AI-generated data
To maintain doctor-patient confidentiality, all AI-generated output is gated so that only doctors and patients who have logged into their portals can access information influenced by machine learning. For example, AI-generated heatmaps of lung scans are initially accessible only to the doctor, and are released to the relevant patient only once the doctor adds contextualizing information. No data is used to draw predictions about other patients, and no data leaves the closed system. Additionally, the training data used to generate the heatmaps is fully anonymized and external to the system.
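The gating rule above can be sketched as a simple access check -- the record fields and function names here are hypothetical, chosen only to illustrate the policy that the doctor always sees the heatmap while the patient sees it only after context is added:

```python
from dataclasses import dataclass


@dataclass
class HeatmapRecord:
    """One AI-generated heatmap, tied to exactly one doctor and one patient."""
    patient_id: str
    doctor_id: str
    doctor_note: str = ""  # contextualizing note; empty until the doctor adds one


def can_view_heatmap(record, user_id, role):
    """Doctors always see their own patients' heatmaps; patients see a
    heatmap only after their doctor has attached contextualizing info."""
    if role == "doctor":
        return user_id == record.doctor_id
    if role == "patient":
        return user_id == record.patient_id and bool(record.doctor_note)
    return False  # no other role may touch AI-generated output
```

Keeping the check to a single function makes the policy easy to audit: every route that serves machine-learning output calls it before returning data.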
- Patient consent
The AI-generated output where patient consent is least explicit is the heatmap produced from a patient's original lung scan. The model that generates the heatmap runs outside the Thrive environment and assumes that consent was obtained before the scan was entered as a piece of data. Inputs to the chat are user-driven and therefore inherently consented to. To maintain transparency about how the chat works, a disclaimer over the chat indicates who may see the questions and responses and how that data may be used.
How did following these guidelines help us present and interpret data accurately?
Limiting access to AI-generated data and prioritizing patient consent ensured that data influenced by machine learning was presented and interpreted in an accurate and transparent manner. For example, our prompt design for Davinci deliberately avoided introducing biases so that the model gives an impartial representation of the information when prompted, rather than an analysis of it. This also reduces the risk of large language models generating false or inaccurate information from excess context. Following these guidelines throughout our development process helped us build a more robust app while maintaining the level of privacy expected of sensitive medical information.
Although we tried our best to uphold these principles for all aspects of our project, we faced some critical ethical challenges and unforeseen consequences within our technology.
Dealing with uncertainty within machine learning models.
Machine learning models aren't perfect -- there will always be a margin of error when classifying and displaying data, and our heatmap generation is no exception. Although we minimized error and achieved a model that is accurate 95% of the time, the ethical question of presenting patients with potentially inaccurate data will always exist. Is it ethical to provide data that is accurate for the majority of use cases but not 100% consistent? Should machine learning models be involved in medical technology at all unless they are guaranteed to work? Although our implementation of machine learning serves as an educational reference and not a diagnostic tool, the model's remaining uncertainty may still hinder a patient's ability to make the best decision possible.
Limiting the amount of information fed to large language systems.
Although our integration of the Davinci model aggregates only the limited information in a diagnostic report and lets users intelligently query that data alone, there is an inherent ethical question in allowing large language models to process sensitive information. Many popular models trained on vast swaths of data inherit biases from the information they're trained on. Although we limit the scope of our integrated LLM through prompt engineering, it's possible that these inherent biases persist in an influential way. An open ethical question is whether these models are safe to use in a medical setting when they could, in theory, produce different responses based on demographic information inferred from the context provided.
As the field of medicine and its intersection with revolutionary technologies evolves, ethical viewpoints on how much machine learning should be used as a tool to assist in diagnosis and prevention are bound to change -- Thrive aims to evolve with it.
References
https://nciom.org/wp-content/uploads/2017/07/HealthLiteracy_Chap2.pdf
https://www.thehastingscenter.org/why-health-care-organizations-need-technology-ethics-committees/