Inspiration

There is a need to safely distribute data on cancer and its characteristic expression in patients. De-identifying patient data by synthesizing new records from existing data would give medical researchers easy access to a wealth of data and help advance disease research.

What it does

CopyCat.ai securely bypasses the need to use confidential patient data directly: it uses AI/ML tools to mimic the original data and passes the synthesized data sets to wherever they are needed.

How we built it

Using Python, we loaded the source data and passed it to a synthetic data library, which generated a new data set based on the variability of the original data.
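
The idea of generating new data from the variability of the original can be illustrated with a minimal sketch. This is not our actual pipeline or the library we used; it is a simplified stand-in that uses only the Python standard library and assumes every column is numeric. It estimates the mean and standard deviation of each column and samples new values from a normal distribution with those parameters. The column names and values (`gene_a_expr`, `gene_b_expr`) are made up for illustration, and the helper names `fit_columns` and `synthesize` are hypothetical.

```python
import random
import statistics


def fit_columns(rows):
    """Estimate (mean, standard deviation) for each numeric column."""
    params = {}
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        params[col] = (statistics.mean(values), statistics.stdev(values))
    return params


def synthesize(rows, n, seed=None):
    """Draw n synthetic rows from per-column normal distributions.

    Assumption: columns are sampled independently, so correlations
    between columns in the original data are NOT preserved.
    """
    rng = random.Random(seed)
    params = fit_columns(rows)
    return [
        {col: rng.gauss(mu, sigma) for col, (mu, sigma) in params.items()}
        for _ in range(n)
    ]


# Made-up example records standing in for real patient measurements.
original = [
    {"gene_a_expr": 2.1, "gene_b_expr": 10.4},
    {"gene_a_expr": 1.8, "gene_b_expr": 11.0},
    {"gene_a_expr": 2.6, "gene_b_expr": 9.7},
    {"gene_a_expr": 2.3, "gene_b_expr": 10.9},
]

synthetic = synthesize(original, n=5, seed=42)
for row in synthetic:
    print(row)
```

Because each column is sampled independently, this sketch ignores relationships between columns and offers no formal privacy guarantee; a dedicated synthetic data library typically models the joint structure of the data as well.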

Challenges we ran into

The data set is large and difficult to integrate into a Python pipeline securely and seamlessly.

Accomplishments that we're proud of

We all learned a great deal and worked hard together in a short amount of time to build a project that we are proud of.

What we learned

We gained familiarity with the tools needed to train AI/ML models.

What's next for CopyCat.ai

Continue development of our secure data synthesis pipeline.
