Inspiration
There is a real need to safely distribute data about cancer and its characteristic expression in patients. De-identifying patient data by synthesizing new records from existing ones would give the medical field easy access to a wealth of data and help advance disease research.
What it does
CopyCat.ai removes the need to work directly with confidential patient data. It uses AI/ML tools to mimic the original data and passes the synthesized data sets to wherever they are needed.
How we built it
Using Python, we loaded the original data and fed it to a synthetic data library, which generated a new data set based on the variability of the original data.
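To illustrate the general idea of generating new data from the variability of the original, here is a minimal sketch. It is not our actual pipeline or the library we used: it swaps in a much simpler technique, sampling each numeric column independently from a normal distribution fitted to that column. The function name synthesize and the demo values are hypothetical, and real patient data would also need correlations between columns and categorical fields handled.

```python
import random
import statistics


def synthesize(rows, n_samples, seed=0):
    """Sample synthetic rows from per-column normal fits of `rows`.

    Assumption: every column is numeric and is treated as independent,
    so correlations between columns are NOT preserved.
    """
    rng = random.Random(seed)
    columns = list(zip(*rows))  # transpose rows into columns
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    # Each synthetic row draws one value per column from that column's fit.
    return [
        [rng.gauss(m, s) for m, s in zip(means, stdevs)]
        for _ in range(n_samples)
    ]


# Placeholder "original" data (made-up numbers, not patient data).
_demo_rng = random.Random(42)
real_rows = [
    [_demo_rng.gauss(5.0, 1.0), _demo_rng.gauss(10.0, 2.0)]
    for _ in range(200)
]

synthetic_rows = synthesize(real_rows, n_samples=1000)
```

The synthetic rows follow the per-column mean and spread of the originals without copying any original row. Matching these statistics is not by itself a privacy guarantee.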
Challenges we ran into
The data set is large, which made it difficult to integrate into a Python workflow securely and seamlessly.
Accomplishments that we're proud of
We all learned so much, and worked hard together in a short amount of time to come up with a project that we are proud of.
What we learned
We gained familiarity with the tools needed to train AI/ML models.
What's next for CopyCat.ai
We plan to continue developing our secure data synthesis pipeline.
