ProbingPrivacy

Reproduction code for the paper "Watch Out Your Album! On the Inadvertent Privacy Memorization in Multi-Modal Large Language Models". The preprint is publicly available at this link.

🛠️ Installation

Dependencies

The project requires two separate conda environments, one for LLaVA and one for Qwen-VL. Set up each environment as follows:

conda create -n llava python=3.10 -y
conda activate llava
pip install -r requirements_llava.txt

conda create -n qwen python=3.10 -y
conda activate qwen
pip install -r requirements_qwen.txt

Models

The models required for our experiments are Qwen-VL-Chat and vicuna-7b-v1.5. Please download these two models and place them in a custom directory for use in subsequent experiments.

Dataset Preparation

We conduct experiments using the following datasets: COCO, GQA, OCR-VQA, TextVQA, and VisualGenome.

Please download the datasets and their description files, and structure each description file according to the following format:

  {
    "id": "000000341431",
    "image": "coco/train2017/000000341431.jpg",
    "conversations": [
      {
        "from": "human",
        "value": "<image>\nWhat period of day does the image capture?"
      },
      {
        "from": "gpt",
        "value": "The image captures a nocturnal scene."
      },
      {
        "from": "human",
        "value": "How is the road illuminated in the image?"
      },
      {
        "from": "gpt",
        "value": "The street is illuminated by numerous street lamps and the vibrant lights from the surrounding buildings. Additionally, there is a time lapse effect causing blurred lights on the road."
      },
      {
        "from": "human",
        "value": "Is there any distinctive architecture in the image?"
      },
      {
        "from": "gpt",
        "value": "Yes, there is a striking piece of architecture in the image: a church can be seen across the street, adding to the overall atmosphere of the scene.\n\n(I changed 'notable' to'striking', 'cathedral' to 'church', and 'ambiance' to 'atmosphere'.)"
      },
      {
        "from": "human",
        "value": "What is situated in the middle of the thoroughfare?"
      },
      {
        "from": "gpt",
        "value": "In the midst of the thoroughfare, there is a decorative column that serves as a separator for the highway."
      },
      {
        "from": "human",
        "value": "What does the cinematic effect in the image achieve?"
      },
      {
        "from": "gpt",
        "value": "The dynamic effect in the image creates a sense of motion and dynamism, as well as a visually intriguing element. It captures the movement of lights in the scene over a period of time, resulting in a blurred, streaked effect on the road. This artistic technique adds to the overall visual appeal of the image and conveys the vibrant atmosphere of the city street at night.\n\n(I changed \"time-lapse\" to \"dynamic\", and \"aesthetic\" to \"visual\")"
      }
    ]
  }
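As a quick sanity check on the format above, the following is a minimal sketch of a validator for one description entry. It is not part of the repository: the function name validate_entry is hypothetical, and the rules it enforces (alternating "human"/"gpt" turns, an <image> placeholder in the first turn) are assumptions inferred from the example, not documented requirements.

```python
def validate_entry(entry):
    """Return a list of problems found in one description entry (empty if none)."""
    problems = []
    # Top-level keys seen in the example entry.
    for key in ("id", "image", "conversations"):
        if key not in entry:
            problems.append(f"missing key: {key}")
    turns = entry.get("conversations", [])
    for i, turn in enumerate(turns):
        # Assumption: turns alternate human, gpt, human, gpt, ...
        expected = "human" if i % 2 == 0 else "gpt"
        if turn.get("from") != expected:
            problems.append(f"turn {i}: expected speaker {expected!r}")
        if not isinstance(turn.get("value"), str):
            problems.append(f"turn {i}: 'value' must be a string")
    # Assumption: the first human turn carries the <image> placeholder.
    if turns and "<image>" not in str(turns[0].get("value", "")):
        problems.append("first turn lacks the <image> placeholder")
    return problems


sample = {
    "id": "000000341431",
    "image": "coco/train2017/000000341431.jpg",
    "conversations": [
        {"from": "human", "value": "<image>\nWhat period of day does the image capture?"},
        {"from": "gpt", "value": "The image captures a nocturnal scene."},
    ],
}
broken = {"id": "x", "conversations": [{"from": "gpt", "value": "no question asked"}]}

print(validate_entry(sample))
print(validate_entry(broken))
```

Running a check like this over every entry before fine-tuning can surface malformed records early, though the real loaders may accept more (or less) than this sketch does.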

Place the description files under data/llava and data/qwen respectively. Then filter the dataset according to the description files using tools/choose.py, and split the dataset to obtain the training set using tools/split.py:

python choose.py

python split.py
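The filter-and-split step can be pictured roughly as below. This is only an illustrative sketch, not the logic of tools/choose.py or tools/split.py: the function names, the train_fraction parameter, and the fixed seed are all hypothetical.

```python
import random


def referenced_images(entries):
    """Collect the relative image paths that the description entries point to."""
    return {entry["image"] for entry in entries if "image" in entry}


def split_entries(entries, train_fraction=0.9, seed=0):
    """Shuffle entries deterministically and split them into (train, held_out)."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten toy entries standing in for a real description file.
entries = [{"id": str(i), "image": f"coco/train2017/{i:012d}.jpg"} for i in range(10)]
keep = referenced_images(entries)
train, held_out = split_entries(entries)
print(len(keep), len(train), len(held_out))
```

Only images in the keep set would need to be copied into the training folder; the actual split ratio used in the paper is not stated here, so 0.9 is a placeholder.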

The training set should be organized as follows:

data/
└── data_without_privacy/
   └── train/
      ├── coco/
      │   └── train2017/
      ├── gqa/
      │   └── images/
      ├── ocr_vqa/
      │   └── images/
      ├── textvqa/
      │   └── train_images/
      └── vg/
          ├── VG_100K/
          └── VG_100K_2/
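To confirm that the training set matches the layout above, a small check like the following may help. It is a sketch that is not part of the repository; the helper name missing_dirs is hypothetical, and the demo builds a temporary directory rather than touching real data.

```python
import tempfile
from pathlib import Path

# Sub-folders expected under data/data_without_privacy/train, per the tree above.
EXPECTED_SUBDIRS = [
    "coco/train2017",
    "gqa/images",
    "ocr_vqa/images",
    "textvqa/train_images",
    "vg/VG_100K",
    "vg/VG_100K_2",
]


def missing_dirs(train_root):
    """Return the expected sub-folders that do not exist under train_root."""
    root = Path(train_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


# Demo: create every folder except the last one, then report what is missing.
with tempfile.TemporaryDirectory() as tmp:
    for sub in EXPECTED_SUBDIRS[:-1]:
        (Path(tmp) / sub).mkdir(parents=True)
    demo_missing = missing_dirs(tmp)

print(demo_missing)  # ['vg/VG_100K_2']
```

On a real checkout, calling missing_dirs("data/data_without_privacy/train") should return an empty list once the data is in place.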

The experiments also require datasets with embedded private information. Generate the privacy dataset as follows:

Generate private information using tools/generate_user_info.py:

python generate_user_info.py

Use tools/add_privacy_to_image.py to embed the private information into the dataset and obtain a privacy-embedded dataset. Modify the code at line 32 to adjust the embedding rate.

python add_privacy_to_image.py
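One way to think about the embedding rate is as the fraction of images that receive private information. That reading is an assumption, and the sketch below is not the code in tools/add_privacy_to_image.py; the function name, the seed parameter, and the toy file names are hypothetical.

```python
import random


def choose_images_to_embed(image_paths, embedding_rate, seed=0):
    """Pick a reproducible subset of images covering roughly `embedding_rate` of the data."""
    if not 0.0 <= embedding_rate <= 1.0:
        raise ValueError("embedding_rate must be between 0 and 1")
    ordered = sorted(image_paths)  # sort first so the choice does not depend on input order
    count = round(len(ordered) * embedding_rate)
    return set(random.Random(seed).sample(ordered, count))


paths = [f"coco/train2017/{i:012d}.jpg" for i in range(100)]
selected = choose_images_to_embed(paths, embedding_rate=0.3)
print(len(selected))  # 30
```

Fixing the seed keeps the selected subset identical across runs, which makes comparisons between different embedding rates easier to reproduce.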

Additionally, our experiments use datasets with text and image augmentations. Use tools/augmentation/text_augmentation.py and tools/augmentation/image_augmentation.py to perform text or image augmentation on the original dataset.

python text_augmentation.py

python image_augmentation.py

📘 Instructions

Fine-tuning

For the LLaVA model, use finetune/finetune_lora_llava.sh for fine-tuning. Modify --data_path to use either the original description files or the augmented description files, and modify --image_folder to use the original image data, augmented image data, or privacy-embedded image data.

bash finetune_lora_llava.sh

For the Qwen model, use finetune/finetune_lora_qwenvl.sh for fine-tuning. Modify --data_path to select the dataset to train on.

Evaluation

To explore how task-irrelevant content might affect the fine-tuning process, we examine the performance of the MLLMs on standard VQA tasks before and after embedding the task-irrelevant private content.

ScienceQA: Under data/eval/scienceqa, place images, pid_splits.json, and problems.json from the data/scienceqa folder of the ScienceQA repo, along with scienceqa_test_img.jsonl.

For LLaVA, run single-GPU inference and evaluation:

CUDA_VISIBLE_DEVICES=0 bash scripts/eval/llava/sqa.sh

For Qwen-VL, run scripts/eval/qwen/evaluate_multiple_choice.py as follows:

ds="scienceqa_test_img"
checkpoint=/PATH/TO/CHECKPOINT
python -m torch.distributed.launch --use-env \
    --nproc_per_node ${NPROC_PER_NODE:-8} \
    --nnodes ${WORLD_SIZE:-1} \
    --node_rank ${RANK:-0} \
    --master_addr ${MASTER_ADDR:-127.0.0.1} \
    --master_port ${MASTER_PORT:-12345} \
    evaluate_multiple_choice.py \
    --checkpoint $checkpoint \
    --dataset $ds \
    --batch-size 8 \
    --num-workers 2

MME: Download the data following the official instructions here.

For LLaVA, download the images to MME_Benchmark_release_version, put the official eval_tool and MME_Benchmark_release_version under data/eval/MME, then run single-GPU inference and evaluation:

CUDA_VISIBLE_DEVICES=0 bash scripts/eval/llava/mme.sh

For Qwen-VL, rearrange the images by running python get_images.py, then evaluate the Qwen-VL-Chat results by running python eval.py.

Cosine Gradient Similarity Comparison

Modify the Transformers library, add a path for saving gradient outputs, then run the fine-tuning script.

After obtaining the results using training datasets with different privacy rates, run the gradient similarity comparison code in tools/compute_gradients.

LLaVA:

python gradients_llava.py

Qwen-VL:

python gradients_qwen.py
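For reference, cosine similarity between two gradient vectors is their dot product divided by the product of their norms. The sketch below shows that computation on plain Python lists; it is not the code in tools/compute_gradients, and it assumes the saved gradients have already been flattened into equal-length vectors.

```python
import math


def cosine_similarity(grad_a, grad_b):
    """Cosine of the angle between two flattened gradient vectors."""
    if len(grad_a) != len(grad_b):
        raise ValueError("gradient vectors must have the same length")
    dot = sum(x * y for x, y in zip(grad_a, grad_b))
    norm_a = math.sqrt(sum(x * x for x in grad_a))
    norm_b = math.sqrt(sum(y * y for y in grad_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero gradient has no direction
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0 (orthogonal)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (opposite)
```

Values near 1 indicate that two training configurations push the parameters in nearly the same direction, while values near 0 indicate largely unrelated updates.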

Probing

After fine-tuning, divide the test set according to the requirements in the paper, and then follow the steps below to complete the probing experiments and line chart plotting for Qwen-VL.

Use probing/run_qwen.py to generate result files in probing/results.

python run_qwen.py --model-base --query

Use probing/results/analyze.py to generate line charts based on the results.

python analyze.py
