

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

Hao Li, Xiaogeng Liu, Hung-Chun Chiu, Dianqi Li, Ning Zhang, Chaowei Xiao.


The official implementation of NeurIPS 2025 paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents".

Update

  • [2026.4.19] 🛠️ Update the evaluation on AgentDyn.
  • [2026.1.30] 🛠️ Support the evaluation on more agents.
  • [2026.1.30] 🛠️ Update the evaluation code on ASB.

How to Start

We provide evaluation scripts for DRIFT; you can reproduce the results as follows.

Evaluating on AgentDojo

Construct Your Environment

conda create -n drift python=3.11
source activate drift
pip install "agentdojo==0.1.35"
pip install -r requirements.txt

Set Your API Key

We support three API providers: OpenAI, Google, and OpenRouter. Set the API key(s) for the provider(s) you need.

export OPENAI_API_KEY=your_key
export GOOGLE_API_KEY=your_key
export OPENROUTER_API_KEY=your_key
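As a minimal sketch of how a script might select the right key from the environment (the function name and model-prefix mapping below are illustrative assumptions, not DRIFT's actual logic):

```python
import os

def resolve_api_key(model: str) -> str:
    # Hypothetical helper: infer the provider from the model name prefix
    # and fetch the matching environment variable.
    if model.startswith("gpt"):
        var = "OPENAI_API_KEY"
    elif model.startswith("gemini"):
        var = "GOOGLE_API_KEY"
    else:
        var = "OPENROUTER_API_KEY"
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running.")
    return key
```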

Run Task with No Attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 \
--build_constraints --injection_isolation --dynamic_validation \
--suites banking,slack,travel,workspace

Run Task Under Attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 --do_attack \
--attack_type important_instructions \
--build_constraints --injection_isolation --dynamic_validation \
--suites banking,slack,travel,workspace

You can evaluate any model from the supported providers by passing its model identifier (e.g., gemini-2.5-pro) to the --model flag. To evaluate under an adaptive attack, additionally pass the --adaptive_attack flag.
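If you want to sweep several models, suites, and attack settings, a small wrapper can assemble the command line shown above. This helper is not part of the repo; it is a sketch that simply mirrors the flags from the commands in this section:

```python
def build_command(model, suites, attack=None):
    # Compose the pipeline_main.py invocation used above; `attack` is the
    # value for --attack_type when evaluating under attack.
    cmd = ["python", "pipeline_main.py", "--model", model,
           "--build_constraints", "--injection_isolation",
           "--dynamic_validation", "--suites", ",".join(suites)]
    if attack:
        cmd += ["--do_attack", "--attack_type", attack]
    return cmd
```

The returned list can be passed directly to `subprocess.run` once per model/suite combination.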

Evaluating on AgentDyn

To evaluate on AgentDyn, you can directly replace the AgentDojo dependency with the AgentDyn version. First, run:

git clone [email protected]:SaFo-Lab/AgentDyn.git
cd AgentDyn
pip install -e .

This replaces the AgentDojo dependency with the AgentDyn version, which additionally supports the shopping, github, and dailylife suites. You can then evaluate on these three suites with the same commands as for AgentDojo:

Run Task with No Attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 \
--build_constraints --injection_isolation --dynamic_validation \
--suites shopping,github,dailylife

Run Task Under Attack

python pipeline_main.py \
--model gpt-4o-mini-2024-07-18 --do_attack \
--attack_type important_instructions \
--build_constraints --injection_isolation --dynamic_validation \
--suites shopping,github,dailylife

Evaluating on ASB

Please refer to ASB_DRIFT/README.md.

Inspect Results

You can find the cached results in runs/.
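To aggregate the cached results programmatically, something like the following works; note that the file layout and field names under runs/ are assumptions for illustration, so adjust the glob and keys to the actual cached format:

```python
import json
from pathlib import Path

def summarize_runs(root="runs"):
    # Collect every JSON result file under `root` (recursively) into a list
    # of parsed records for downstream aggregation.
    records = []
    for path in sorted(Path(root).rglob("*.json")):
        with open(path) as f:
            records.append(json.load(f))
    return records
```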

References

If you find this work useful in your research or applications, please kindly cite:

@inproceedings{DRIFT,
  title={DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents},
  author={Hao Li and Xiaogeng Liu and Hung-Chun Chiu and Dianqi Li and Ning Zhang and Chaowei Xiao},
  booktitle={NeurIPS},
  year={2025}
}