WhyFlow: Interrogative Debugger for Taint Analysis

Artifact Summary

This is the artifact for the ICSE 2026 paper: "WhyFlow: Interrogative Debugger for Sensemaking Taint Analysis" by Burak Yetiştiren, Hong Jin Kang, and Miryung Kim.

Claimed Badges

| Badge | Justification |
| --- | --- |
| Available | Archived on Zenodo with DOI 10.5281/zenodo.18250071 |
| Reusable | Docker environment, documented extension guide, reusable Soufflé query templates |

Provenance

User Study Data

The statistical_tests/ and data/ directories contain anonymized user study data from 12 participants. The study was conducted under UCLA IRB approval. No personally identifiable information is included. Storage requirement: < 10 MB.

Overview

WhyFlow is an interrogative debugging tool for taint analysis that enables developers to ask why, why-not, and what-if questions about dataflows.

WhyFlow addresses the challenge of making sense of taint analysis results by providing:

  • Interrogative Debugging: Ask questions about the existence or absence of specific dataflows
  • Speculative Analysis: Explore the impact of different third-party library models and configurations
  • Visual Sensemaking: Graph-based visualization with color-coded annotations for global connectivity reasoning
  • Interactive Q&A Interface: Template-based queries with contextualized selections for sources, sinks, and APIs

Setup

Hardware Requirements

  • RAM: 4 GB minimum (8 GB recommended)
  • Disk: 5 GB free space (for Docker image)
  • CPU: Any modern x86_64 or ARM64 processor

Quick Start (Docker)

The easiest way to run WhyFlow is with Docker.

Requirements: Docker 20.10+ (or Docker Desktop 4.x+)

# Clone the repository
git clone https://github.com/UCLA-SEAL/WhyFlow.git
cd WhyFlow

# Build the Docker image
docker build -t whyflow .

# Run WhyFlow
docker run -p 3000:3000 whyflow

Open your browser to http://localhost:3000

Note: The first build compiles Soufflé from source for cross-platform compatibility.

Verify Installation

To confirm WhyFlow is running correctly:

  1. Open http://localhost:3000 in your browser
  2. You should see the WhyFlow interface with the Query Options panel
  3. The D-SRC dropdown should populate with source nodes (e.g., "(2) msg : HttpRequest...")
  4. Check Docker logs for: => App running at: http://localhost:3000/

# View container logs
docker logs $(docker ps -q --filter ancestor=whyflow)
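
The first check can also be scripted. The following is a minimal sketch, assuming curl is installed on the host; the function name wait_for_whyflow and its retry defaults are illustrative and are not part of the WhyFlow repository. It polls the mapped port until the web application answers, which can help if the application needs some time to start after the container launches.

```shell
# Poll a URL until it responds or the attempts run out.
# Arguments (all optional): URL, number of attempts, delay in seconds.
# The default URL matches the port mapping used above; the other
# defaults are illustrative assumptions.
wait_for_whyflow() {
  url="${1:-http://localhost:3000}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f treats HTTP errors as failures, -s silences progress output
    if curl -fs -o /dev/null "$url" 2>/dev/null; then
      echo "WhyFlow is responding at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "No response from $url after $attempts attempts" >&2
  return 1
}
```

For example, `wait_for_whyflow http://localhost:3000 30 2` retries for roughly a minute before giving up. A successful response only shows that the server is up; the interface checks in steps 2 and 3 still need to be done in the browser.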

Installation (Native)

For development or if you prefer not to use Docker:

Prerequisites

Setup

  1. Install Meteor:

    curl https://install.meteor.com/ | sh
  2. Install Soufflé:

    # macOS
    brew install souffle-lang/souffle/souffle
    
    # Ubuntu/Debian - see https://souffle-lang.github.io/build
  3. Install dependencies:

    cd taint_debug_app/taint_debug
    meteor npm install
  4. Run WhyFlow:

    cd taint_debug_app/taint_debug
    meteor run

Open your browser to http://localhost:3000
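
If steps 3 or 4 fail, it may be worth confirming that the tools from steps 1 and 2 are on your PATH. The helper below is a small sketch rather than part of the repository; the function name check_tools is hypothetical, and it only tests whether each command can be found, not whether its version is suitable.

```shell
# Report which of the given commands are available on PATH.
# Returns 0 if all are found and 1 if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_tools meteor souffle` prints one line per tool and returns a non-zero status if either is missing.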

Using WhyFlow

Supported Queries

WhyFlow supports six interrogative query types:

| Query | Question |
| --- | --- |
| WhyFlow | Why is there a taint flow from source X to sink Y? |
| WhyNotFlow | Why is there no taint flow from source X to sink Y? |
| AffectedSinks | If we alter a third-party library's model, which sinks are affected? |
| DivergentSinks | Which third-party library model could influence multiple flows from the same source? |
| DivergentSources | Which third-party library model could influence multiple flows to the same sink? |
| GlobalImpact | Which third-party library model has the largest global influence? |

Sample Queries: See replication/Experiment-Reproduction.md for concrete example queries with specific source/sink IDs that you can execute.

Graph Visualization

  • Green nodes: Sources
  • Red nodes: Sinks
  • Orange nodes: Third-party API calls
  • Blue nodes: Other intermediate nodes
  • Solid edges: Active taint flows
  • Dashed edges: Plausible flows (currently blocked)

Click on any node to view the corresponding source code location.

Repository Structure

WhyFlow/
├── paper.pdf                     # Accepted ICSE 2026 paper
├── LICENSE                       # MIT License
├── README.md                     # This file
├── Dockerfile                    # Docker container definition
├── taint_debug_app/              # Main WhyFlow application
│   ├── taint_debug/              # Meteor web application
│   ├── analysis_files/           # Analysis data and fact files
│   ├── app_souffle_queries/      # Soufflé Datalog query files
│   └── souffle_output/           # Generated query outputs
├── Subject_Prog_CodeQL_Taint/    # Subject program (Apache Dubbo) and CodeQL results
├── statistical_tests/            # User study statistical analysis
├── data/                         # User study materials and plots
│   ├── whyflow.csv               # WhyFlow accuracy results
│   ├── codeql.csv                # CodeQL accuracy results
│   └── plots.ipynb               # Jupyter notebook for figures
└── replication/                  # Artifact evaluation resources

Extending WhyFlow

Analyzing New Programs

  1. Run CodeQL taint analysis on your target program
  2. Export results in JSON/CSV format
  3. Place results in Subject_Prog_CodeQL_Taint/
  4. Update paths in the application configuration

See replication/Extending-WhyFlow.md for detailed instructions.
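
As a rough illustration of steps 1 to 3, the sketch below wraps two standard CodeQL CLI commands. Several things here are assumptions rather than facts from this repository: the function name export_codeql_results, the default output file name results.csv, the database directory codeql-db, and the choice of --language=java (picked because the bundled subject program, Apache Dubbo, is written in Java). The queries to run and the exact result layout WhyFlow expects are not specified here, so the query argument is left to the caller; replication/Extending-WhyFlow.md remains the authoritative guide.

```shell
# Build a CodeQL database for a program and export analysis results as CSV.
# Arguments: source root, query/suite/pack to run, output file, database dir.
# The last two have hypothetical defaults chosen only for illustration.
export_codeql_results() {
  src="$1"
  queries="$2"
  out="${3:-Subject_Prog_CodeQL_Taint/results.csv}"
  db="${4:-codeql-db}"
  if [ ! -d "$src" ]; then
    echo "Source directory not found: $src" >&2
    return 1
  fi
  # Step 1: create the database (Java assumed) and run the chosen queries
  codeql database create "$db" --language=java --source-root="$src" || return 1
  # Steps 2 and 3: export the results as CSV into the chosen location
  codeql database analyze "$db" "$queries" --format=csv --output="$out"
}
```

Step 4, updating the paths in the application configuration, is not covered by this sketch.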

Adding Custom Queries

To add a custom query, place its Soufflé Datalog file (conventionally with a .dl extension) in taint_debug_app/app_souffle_queries/.
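
A new query file can be tried from the command line before it is used in the application. The sketch below relies on Soufflé's -F (fact directory) and -D (output directory) options. That the facts live in taint_debug_app/analysis_files/ and that outputs belong in taint_debug_app/souffle_output/ is an assumption based on the repository structure listed above, and the function name run_souffle_query is hypothetical.

```shell
# Run one Soufflé query file against a fact directory.
# Arguments: query file, fact directory, output directory.
# The directory defaults are assumptions drawn from the repository layout.
run_souffle_query() {
  query="$1"
  facts="${2:-taint_debug_app/analysis_files}"
  out="${3:-taint_debug_app/souffle_output}"
  if [ ! -f "$query" ]; then
    echo "Query file not found: $query" >&2
    return 1
  fi
  mkdir -p "$out"
  # -F sets where input facts are read from, -D where output relations go
  souffle -F "$facts" -D "$out" "$query"
}
```

Run it from the repository root, passing the path to your query file as the first argument. If your facts are stored elsewhere, pass the directories explicitly as the second and third arguments.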

Citation

If you use WhyFlow in your research, please cite our paper:

@inproceedings{yetistiren2026whyflow,
  title={WhyFlow: Interrogative Debugger for Sensemaking Taint Analysis},
  author={Yetiştiren, Burak and Kang, Hong Jin and Kim, Miryung},
  booktitle={Proceedings of the 48th International Conference on Software Engineering},
  year={2026},
  organization={ACM}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Acknowledgments

This work is supported by the National Science Foundation under grant numbers 2426162, 2106838, and 2106404, with additional support from Amazon and Samsung.
