🏆 #1 on OpenAI MLE-Bench
AIBuildAI is an AI agent that automatically builds AI models. Given a task, it runs an agent loop that analyzes the problem, designs models, writes code to implement them, trains them, tunes hyperparameters, evaluates model performance, and iteratively improves the models. By automating the model development workflow, AIBuildAI reduces much of the manual effort required to build AI models.
On OpenAI MLE-Bench, AIBuildAI ranked #1, demonstrating strong performance on real-world AI model building tasks.
AIBuildAI requires a Linux x86_64 machine.
curl -L -O https://github.com/aibuildai/AI-Build-AI/releases/latest/download/aibuildai-linux-x86_64-v0.1.0.tar.gz
tar -xzf aibuildai-linux-x86_64-v0.1.0.tar.gz
cd aibuildai-linux-x86_64-v0.1.0
./install.sh
export ANTHROPIC_API_KEY=your-api-key
Example task: predict the enzyme class of a protein from its amino acid sequence (Yu et al., Science 2023).
git clone https://github.com/aibuildai/AI-Build-AI.git && cd AI-Build-AI
aibuildai --task-name protein-ec-prediction \
--data-dir data/protein-ec-prediction \
--playground-dir /path/to/playground \
--model claude-opus-4-6 \
--max-agent-calls 8 \
--run-budget-minutes 60 \
--num-candidates 3 \
--instruction "$(cat tasks/protein-ec-prediction.md)" \
--pipeline-budget-minutes 90 \
--no-form
AIBuildAI takes two key inputs: --data-dir, the path to the training data for the task, and --instruction, a natural-language description of the AI task to solve.
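For reference, the instruction passed via --instruction is just a plain-language task spec. The actual contents of tasks/protein-ec-prediction.md are not shown in this README, so the fields below are purely illustrative:

```markdown
# Task: Protein EC Class Prediction

Given a protein's amino acid sequence, predict its enzyme commission (EC) class.

- Training data: a CSV with `sequence` and `ec_class` columns (hypothetical schema)
- Test input: unlabeled sequences; write predictions to `submission.csv`
- Metric: classification accuracy on the held-out test set
```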
Important:
Run the command directly in your terminal. Do not wrap it in a .sh or .bash script; running it through a script may cause the TUI (Text User Interface) to crash.
After a run completes, the output directory typically looks like this (the exact structure may vary slightly by task):
├── candidate_1/ # Auto-generated training scripts and model checkpoints
├── candidate_2/
├── candidate_3/
├── checkpoint.pth # Best model checkpoint
├── inference.py # Standalone inference script for the final model
├── submission.csv # Test predictions (if test inputs are provided)
└── progress.pdf # Visual progress report
The main outputs of an AIBuildAI run are the model checkpoints and the script inference.py, which runs predictions with the final model on any data.
In the example protein-ec-prediction task, we provide unlabeled test data in the data folder, so AIBuildAI also generates a predicted-label file submission.csv. To evaluate the predictions against ground-truth labels:
python scripts/eval_protein_ec.py \
--labels data/labels/protein-ec-prediction.csv \
--submission /path/to/playground/code/protein-ec-prediction/timestamp/submission.csv
We provide additional task markdowns in the tasks/ folder. You can also write your own task markdown and point --data-dir to your own dataset.
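As a sketch of what such an evaluation computes: the script scripts/eval_protein_ec.py is not shown here, so the column names (`id`, `ec_class`) and the accuracy metric below are assumptions, not the script's actual implementation.

```python
import csv


def accuracy(labels_path: str, submission_path: str) -> float:
    """Fraction of test proteins whose predicted EC class matches the label.

    Assumes both CSVs have 'id' and 'ec_class' columns (hypothetical schema).
    """
    def load(path: str) -> dict:
        with open(path, newline="") as f:
            return {row["id"]: row["ec_class"] for row in csv.DictReader(f)}

    truth = load(labels_path)        # ground-truth labels
    pred = load(submission_path)     # model predictions
    # Count predictions that exactly match the labeled EC class.
    matched = sum(1 for pid, ec in truth.items() if pred.get(pid) == ec)
    return matched / len(truth)
```

A submission that matched one of two labels would score 0.5 under this metric.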
To see all available options, run:
aibuildai -h
Alternatively, you can use the interactive form interface by running AIBuildAI without --no-form:
aibuildai
This will launch a TUI (Text User Interface) where you can fill in the required parameters interactively.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
@misc{zhang2026aibuildai,
title={AIBuildAI: An AI agent that automatically builds AI models},
author={Ruiyi Zhang and Peijia Qin and Qi Cao and Li Zhang and Pengtao Xie},
year={2026}
}
