We have developed multiple models for this hackathon to generate images of soft tissue with cancer. There are two main models: a text-to-image system that generates images from a prompt alone, and a text+image-to-image model that paints disease markers into an input image.
stable-diffusion-v1 with prompt 'a ultrasound image of breast with a malignant tumor with unclear boundary'
stable-diffusion-v2 with prompt 'a ultrasound image of breast with a malignant tumor with unclear boundary'
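The prompts above can be fed to a diffusers text-to-image pipeline. A minimal sketch of that flow follows; the model id `runwayml/stable-diffusion-v1-5` is a placeholder for whichever fine-tuned checkpoint you want to sample from, and in this repo the actual entry points are the `inference_text2img*.py` scripts.

```python
# Sketch of text-to-image sampling with diffusers; the model id below is a
# placeholder, not necessarily what this repo's scripts use.
def build_prompt(finding: str) -> str:
    """Compose the ultrasound prompt used in the captions above."""
    return f"a ultrasound image of breast with a {finding} with unclear boundary"

def main() -> None:
    # Heavy imports kept local so the prompt helper is importable without
    # the ML stack installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(build_prompt("malignant tumor")).images[0]
    image.save("text2img_sample.png")

# Call main() on a GPU machine to actually generate an image.
```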
The second model is a text+image-to-image model designed to paint cancer disease markers into an input image. A front-end application is provided for this; some examples are shown in the video below.
We use Gradio to create a small web app for generation:
python inference_img2img_gradio.py
Through this interface at 127.0.0.1:7860, you can upload an image together with a prompt and a negative prompt. The app generates four images from four different seeds. We have found that the prompt 'ultrasound image with a malignant tumor' combined with the negative prompt 'ultrasound scanning device' yields good results.
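The structure of such an app can be sketched as below. This is not the code of inference_img2img_gradio.py itself: the `generate` body is a placeholder for the fine-tuned img2img pipeline, and only the interface wiring (image + two text inputs, a four-image gallery, the default 127.0.0.1:7860 address) mirrors what the text describes.

```python
# Minimal sketch of the Gradio front end; generate() is a stub standing in
# for the real fine-tuned img2img pipeline.
def seeds_for_grid(n: int = 4, base: int = 0) -> list:
    """Four fixed seeds -> four reproducible variations of one input image."""
    return [base + i for i in range(n)]

def main() -> None:
    import gradio as gr  # local import: only needed to serve the app

    def generate(image, prompt, negative_prompt):
        # Placeholder: the real app runs the img2img pipeline once per seed
        # and returns the four generated images.
        return [image] * len(seeds_for_grid())

    gr.Interface(
        fn=generate,
        inputs=[
            gr.Image(type="pil"),
            gr.Textbox(label="prompt"),
            gr.Textbox(label="negative prompt"),
        ],
        outputs=gr.Gallery(label="four seeds"),
    ).launch()  # serves on 127.0.0.1:7860 by default

# Call main() to start the web app.
```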
An example of generation with the web app: the following example paints malignant tumor markers into an input image using four different initialization seeds, giving four potential ways that cancer might appear in the same tissue.
single_example_soft_tissue_cancer_simulator.mp4
The other generation methods are text-to-image only, which allows comparing a normal tissue scan with a randomly generated cancer image. Image+text prompting instead paints disease markers into the input image itself, allowing a real one-to-one comparison between normal tissue and the hypothetical same tissue with cancerous disease markers.
the_importance_of_image_guidance.mp4
Install conda on Linux with:
wget https://repo.anaconda.com/archive/Anaconda3-2023.07-1-Linux-x86_64.sh
or download the latest version from https://www.anaconda.com/download#downloads for your machine. Then run:
bash Anaconda3-2023.07-1-Linux-x86_64.sh
Now create a conda environment named ultrasound:
conda env create -f environment.yml
Finally, activate the environment:
conda activate ultrasound
The data we use for fine-tuning is the Breast Ultrasound Images Dataset, which contains 780 images of size 500x500 pixels.
The images are categorized into three classes: normal, benign, and malignant.
One can download the dataset from
https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset
The images will be stored in the folder Dataset_BUSI_with_GT.
After downloading the ultrasound dataset, create a folder ultrasound to store all the ultrasound images, excluding the masks.
Then create three subfolders benign, malignant, and normal to store the corresponding class of images.
This can be done by running
python data_loading.py
Remember to change source_dir and destination_dir for each class in data_loading.py.
This removes the mask images from each class and copies the remaining images into the ultrasound folder.
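The copy step above can be sketched as follows. This is an illustrative reimplementation, not the contents of data_loading.py; the mask-filename convention (`... (1)_mask.png`) matches the BUSI dataset, and the paths mirror the folder names given in the text.

```python
# Sketch of the dataset preparation step: drop "*_mask*" images and copy the
# rest into ultrasound/<class>/. Paths are illustrative.
import shutil
from pathlib import Path

def is_mask(filename: str) -> bool:
    """BUSI mask files are named like 'benign (1)_mask.png'."""
    return "_mask" in Path(filename).stem

def copy_class(source_dir: str, destination_dir: str) -> int:
    """Copy every non-mask .png from source_dir into destination_dir."""
    dst = Path(destination_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = 0
    for img in Path(source_dir).glob("*.png"):
        if not is_mask(img.name):
            shutil.copy(img, dst / img.name)
            copied += 1
    return copied

def main() -> None:
    for cls in ("normal", "benign", "malignant"):
        n = copy_class(f"Dataset_BUSI_with_GT/{cls}", f"ultrasound/{cls}")
        print(f"{cls}: copied {n} images")

# Call main() after downloading the dataset.
```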
The dataset and environment are now ready!
The training pipeline is mainly based on https://huggingface.co/docs/diffusers/v0.11.0/en/training/dreambooth.
Specifically, we fine-tune the stable-diffusion models
with two methods: DreamBooth, and DreamBooth with LoRA.
For each method, we fine-tune two classes of stable-diffusion models:
stable-diffusion-v1 and stable-diffusion-2.
Fine-tuning proceeds in the order
normal -> benign -> malignant, for a total of three rounds of training.
To run the code, first install Diffusers from GitHub:
pip install git+https://github.com/huggingface/diffusers
pip install -U -r diffusers/examples/dreambooth/requirements.txt
After all the dependencies have been set up, initialize an Accelerate environment with:
accelerate config
Then run the following to fine-tune a stable-diffusion-v1 model with the DreamBooth with LoRA method:
bash train_lora_v1.sh
The saved checkpoints are in the folder dreambooth_lora_v1.
For the stable-diffusion-v1 model + DreamBooth method, run
bash train_v1.sh
The saved checkpoints are in the folder output_models.
For the stable-diffusion-2 model + DreamBooth method, run
bash train_v2.sh
The saved checkpoints are in the folder output_models.
For the stable-diffusion-2 model + DreamBooth with LoRA method, run
bash train_lora_v2.sh
The saved checkpoints are in the folder path_to_saved_model.
For the inference part, we have two kinds of pipelines:
- text -> image
- image + text -> image
After we obtain the checkpoints for the fine-tuned models, we can run inference with a text prompt. For the stable-diffusion-v1 model + DreamBooth with LoRA method, run
python inference_text2img_lora_v1.py
after changing the prompt in the script to your own text prompt.
The output image will be saved in the folder saved_images/dreambooth_lora_v1.
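LoRA inference can be sketched as below, following the diffusers v0.11-era DreamBooth docs that the training pipeline is based on: load a base pipeline, then attach the saved LoRA attention weights. The base model id is a placeholder, and the checkpoint folder name comes from the training step above; the repo's actual script is inference_text2img_lora_v1.py.

```python
# Sketch of LoRA inference (diffusers v0.11-style API). The base model id is
# a placeholder; "dreambooth_lora_v1" is the checkpoint folder from training.
PROMPT = "ultrasound image with a malignant tumor"  # edit to your own prompt

def main() -> None:
    import torch
    from diffusers import StableDiffusionPipeline
    from pathlib import Path

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # placeholder base model
        torch_dtype=torch.float16,
    ).to("cuda")
    pipe.unet.load_attn_procs("dreambooth_lora_v1")  # attach LoRA weights

    out_dir = Path("saved_images/dreambooth_lora_v1")
    out_dir.mkdir(parents=True, exist_ok=True)
    pipe(PROMPT).images[0].save(out_dir / "sample.png")

# Call main() on a GPU machine to generate an image.
```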
For the other text -> image models, run
python inference_text2img.py
after changing the prompt in the Python file to your own text prompt.
The output image will be saved in the folder saved_images/text2img.
For the image + text -> image models, run
python inference_img2img_v2.py
The output image will be saved in the folder saved_images/img2img.
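The image + text -> image step can be sketched with diffusers' img2img pipeline and the four fixed seeds described earlier. The model id, input-image path, and strength value here are illustrative placeholders, not the exact values used by inference_img2img_v2.py.

```python
# Sketch of image+text -> image generation with four seeds. Model id, paths,
# and strength are placeholders for what the repo's script actually uses.
PROMPT = "ultrasound image with a malignant tumor"
NEGATIVE_PROMPT = "ultrasound scanning device"
SEEDS = (0, 1, 2, 3)  # four seeds -> four variations of the same tissue

def main() -> None:
    import torch
    from pathlib import Path
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2",  # placeholder checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")
    init = Image.open("ultrasound/normal/normal (1).png").convert("RGB")

    out_dir = Path("saved_images/img2img")
    out_dir.mkdir(parents=True, exist_ok=True)
    for seed in SEEDS:
        gen = torch.Generator("cuda").manual_seed(seed)
        image = pipe(
            prompt=PROMPT,
            negative_prompt=NEGATIVE_PROMPT,
            image=init,
            strength=0.6,  # how strongly to repaint the input image
            generator=gen,
        ).images[0]
        image.save(out_dir / f"seed_{seed}.png")

# Call main() on a GPU machine to generate the four variations.
```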