Privacy in the World of Health Care

Healthcare data collected by hospitals is both confidential and fragmented. Many hospitals want to apply machine learning to bioinformatics problems, but patient-confidentiality law (PHIPA) legally restricts them from sharing data with other hospitals.

Federated Learning allows multiple parties to train a single shared model on their combined data, without ever having to share or publish their individual datasets.

Guardin: What it does

Guardin is an application that lets healthcare professionals use Federated Learning to train a single model together, without having to share their data.

It coordinates the exchange of model updates (the model weights) between clients (healthcare professionals): training runs on each hospital's own hardware, against its own dataset. Guardin then aggregates these weights into a single model that benefits from every client's data, and publishes a snapshot of that model back to the clients.
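The aggregation step described above can be sketched as FedAvg-style weighted averaging: each client's weights are averaged in proportion to its dataset size. This is a minimal illustration in plain Python, not Guardin's actual implementation; the weight vectors and sample counts are made up.

```python
# Minimal sketch of FedAvg-style aggregation: average each client's
# weight vector, weighted by how many training samples that client has.

def fed_avg(client_weights, client_sizes):
    """Merge clients' flat weight vectors, weighted by dataset size."""
    total = sum(client_sizes)
    merged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            merged[i] += w * (size / total)
    return merged

# Two hypothetical hospitals, each contributing a flat weight vector:
updates = [[0.2, 0.4], [0.8, 0.0]]
sizes = [100, 300]                # training samples per hospital
print(fed_avg(updates, sizes))    # approximately [0.65, 0.1]
```

The dataset-size weighting keeps a hospital with little data from pulling the shared model as strongly as one with a large cohort.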

How we built it

We built Guardin using the Flower Federated Learning Framework, combined with an Electron app controlling Docker containers that run PyTorch models.

The Guardin control plane (powered by Flower AI) runs in Kubernetes, and hosts a Docker Registry for storing images.

Challenges we ran into

The Flower AI Framework's standard usage pattern does not involve completely separating the clients (the models that hold the data and run training) from the control plane.

Connecting our PyTorch models to the Flower client, and the Flower client to the Flower fleet plane, was a significant challenge, since it required an unconventional setup.
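The shape of that client-side adapter can be sketched as follows. Flower's `NumPyClient` interface expects `get_parameters` / `fit` / `evaluate` methods; here a plain Python list stands in for the PyTorch `state_dict`, local training is stubbed as a small perturbation, and `HospitalClient` is a hypothetical name, not Guardin's actual class.

```python
# Sketch of the adapter between a local model and the Flower control
# plane. In a real Flower client this class would subclass
# fl.client.NumPyClient and the weights would be NumPy arrays extracted
# from a PyTorch state_dict; plain lists are used here for illustration.

class HospitalClient:
    def __init__(self, weights, dataset_size):
        self.weights = weights            # flattened model parameters
        self.dataset_size = dataset_size  # local training sample count

    def get_parameters(self, config):
        return self.weights

    def fit(self, parameters, config):
        # Load the global weights, run local training (stubbed here as a
        # fixed perturbation), and return the updated weights plus the
        # number of local examples so the server can weight the average.
        self.weights = [w + 0.1 for w in parameters]
        return self.weights, self.dataset_size, {}

    def evaluate(self, parameters, config):
        # Return (loss, num_examples, metrics) on the local test split;
        # the "loss" here is a stand-in computation.
        loss = sum(abs(w) for w in parameters) / len(parameters)
        return loss, self.dataset_size, {}

client = HospitalClient(weights=[0.0, 0.0], dataset_size=100)
new_weights, n, _ = client.fit([0.5, 0.5], config={})
print(new_weights, n)
```

Keeping this interface thin is what lets training stay entirely on the hospital's hardware: only the weight vectors and sample counts ever cross the wire.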

Accomplishments that we're proud of

In making Guardin, we built a tool that both coordinates federated learning and requires no code from healthcare professionals to contribute to model training. Guardin handles data validation, training runs, and federation for you.

By doing this we have also opened the Guardin platform to a wider user base, since it is simple to use yet powerful. Federated learning theory has existed for a while, but adoption hasn't been widespread, in part because of the overhead of implementing it.

We also managed to meet many of our stretch goals, including implementing Differential Privacy for data anonymization, as well as heavily streamlining the app experience.

What's next for Guardin

  • Incorporate incentives beyond ease-of-use for hospitals to participate in the Guardin Federated Learning Platform.
    • Federated learning isn't widespread in part because the benefits are often hard to quantify (and to rationalize against the overhead of implementation).
  • Identify potential pain points in standardizing data across hospitals.
  • Further enhance the user experience, adding more diagnostic information and model creation options, as well as an in-app model evaluation feature (built-in model garden).