AIBuildAI – An AI agent that automatically builds AI models

🏆 #1 on OpenAI MLE-Bench


demo.mp4

Introduction

AIBuildAI is an AI agent that automatically builds AI models. Given a task, it runs an agent loop that analyzes the problem, designs models, writes code to implement them, trains them, tunes hyperparameters, evaluates model performance, and iteratively improves the models. By automating the model development workflow, AIBuildAI reduces much of the manual effort required to build AI models.

AIBuildAI Architecture


Current Results

AIBuildAI ranked #1 on OpenAI MLE-Bench, demonstrating strong performance on real-world AI model-building tasks.

MLE-Bench Results


Quick Start

Installation

AIBuildAI requires a Linux x86_64 machine.
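To confirm the platform requirement before downloading, a small check like the following can help. It is a minimal sketch using only the Python standard library; the helper name `is_supported_platform` is hypothetical, and accepting "amd64" as an alias for x86_64 is an assumption of this example, not something the installer documents.

```python
import platform


def is_supported_platform(system: str, machine: str) -> bool:
    """Return True for Linux on x86_64.

    Some environments report the same architecture as "amd64", so both
    spellings are accepted here (an assumption of this sketch).
    """
    return system.lower() == "linux" and machine.lower() in {"x86_64", "amd64"}


if __name__ == "__main__":
    system, machine = platform.system(), platform.machine()
    verdict = "supported" if is_supported_platform(system, machine) else "not supported"
    print(f"{system} {machine}: {verdict}")
```

If the check passes, proceed with the download and install commands: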

curl -L -O https://github.com/aibuildai/AI-Build-AI/releases/latest/download/aibuildai-linux-x86_64-v0.1.0.tar.gz
tar -xzf aibuildai-linux-x86_64-v0.1.0.tar.gz
cd aibuildai-linux-x86_64-v0.1.0
./install.sh

Set up credentials

export ANTHROPIC_API_KEY=your-api-key
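
Because a missing key only surfaces once a run starts, it can be worth checking the environment first. The sketch below is illustrative: `has_api_key` is a hypothetical helper, and it only tests that the variable is set to a non-blank value, not that the key is valid.

```python
import os
from typing import Mapping, Optional


def has_api_key(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if ANTHROPIC_API_KEY is set to a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())


if __name__ == "__main__":
    if has_api_key():
        print("ANTHROPIC_API_KEY is set")
    else:
        print("ANTHROPIC_API_KEY is missing; export it before running aibuildai")
```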

Run

Example task: Predict the enzyme class of a protein from its amino acid sequence (Yu et al., Science 2023).

git clone https://github.com/aibuildai/AI-Build-AI.git && cd AI-Build-AI
aibuildai --task-name protein-ec-prediction \
  --data-dir data/protein-ec-prediction \
  --playground-dir /path/to/playground \
  --model claude-opus-4-6 \
  --max-agent-calls 8 \
  --run-budget-minutes 60 \
  --num-candidates 3 \
  --instruction "$(cat tasks/protein-ec-prediction.md)" \
  --pipeline-budget-minutes 90 \
  --no-form

AIBuildAI takes two key inputs: --data-dir, the path to the training data for the task, and --instruction, a natural-language description of the AI task to solve.

Important:

Run the command directly in your terminal. Do not wrap the command in a .sh or .bash script. Running it through a script may cause the TUI (Text User Interface) to crash.
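
Since the command should be run directly in the terminal, one way to reuse it across tasks is to generate the command line and paste it, rather than launching it from a script. The following is a rough sketch: `build_command` is a hypothetical helper, it uses only the flags shown in the example above, and its default values simply mirror that example.

```python
import shlex
from typing import List


def build_command(
    task_name: str,
    data_dir: str,
    playground_dir: str,
    instruction: str,
    model: str = "claude-opus-4-6",
    max_agent_calls: int = 8,
    run_budget_minutes: int = 60,
    num_candidates: int = 3,
    pipeline_budget_minutes: int = 90,
) -> List[str]:
    """Assemble the aibuildai argument list from the flags in the example."""
    return [
        "aibuildai",
        "--task-name", task_name,
        "--data-dir", data_dir,
        "--playground-dir", playground_dir,
        "--model", model,
        "--max-agent-calls", str(max_agent_calls),
        "--run-budget-minutes", str(run_budget_minutes),
        "--num-candidates", str(num_candidates),
        "--instruction", instruction,
        "--pipeline-budget-minutes", str(pipeline_budget_minutes),
        "--no-form",
    ]


if __name__ == "__main__":
    command = build_command(
        task_name="protein-ec-prediction",
        data_dir="data/protein-ec-prediction",
        playground_dir="/path/to/playground",
        # Placeholder text; in practice, pass the contents of the task markdown.
        instruction="Predict the enzyme class of a protein from its amino acid sequence.",
    )
    # shlex.join quotes each argument so the printed line is safe to paste.
    print(shlex.join(command))
```

The script only prints the command; copy the printed line into the terminal to start the run.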

Results

Output directory

After a run completes, the output directory usually looks like this (the structure may vary slightly by task):

├── candidate_1/  candidate_2/  candidate_3/  # Auto-generated training scripts and model checkpoints
├── checkpoint.pth       # Best model checkpoint
├── inference.py         # Standalone inference script for the final model
├── submission.csv       # Test predictions (if test inputs are provided)
└── progress.pdf         # Visual progress report

The main outputs of an AIBuildAI run are the model checkpoints and inference.py, a standalone script that runs the final model on new input data.
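
To see at a glance which of these outputs a run produced, the directory can be inspected with a few lines of Python. This is a minimal sketch: `summarize_run` is a hypothetical helper, and the file names are taken from the listing above, so a given task may not produce all of them.

```python
import sys
from pathlib import Path
from typing import Dict, Union

# File names taken from the listing above; a task may not produce all of them.
EXPECTED_FILES = ("checkpoint.pth", "inference.py", "submission.csv", "progress.pdf")


def summarize_run(output_dir: Union[str, Path]) -> Dict[str, object]:
    """Report candidate folders and which expected files exist in output_dir."""
    root = Path(output_dir)
    candidates = sorted(p.name for p in root.glob("candidate_*") if p.is_dir())
    files = {name: (root / name).is_file() for name in EXPECTED_FILES}
    return {"candidates": candidates, "files": files}


if __name__ == "__main__":
    # Pass the run's output directory as the first argument (defaults to ".").
    summary = summarize_run(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("candidates:", ", ".join(summary["candidates"]) or "none")
    for name, present in summary["files"].items():
        print(f"{name}: {'found' if present else 'missing'}")
```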

Evaluation

In the example protein-ec-prediction task, we provide unlabeled test data in the data folder, so AIBuildAI also generates a predicted-label file submission.csv. To evaluate the predictions against ground-truth labels:

python scripts/eval_protein_ec.py \
  --labels data/labels/protein-ec-prediction.csv \
  --submission /path/to/playground/code/protein-ec-prediction/timestamp/submission.csv
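
For a custom task without a ready-made evaluation script, a comparison of this kind can be sketched as follows. This is not the logic of scripts/eval_protein_ec.py, which may use a different metric and file layout. The column names `id` and `label` and the helper names are assumptions made for illustration, and the metric is plain exact-match accuracy with missing predictions counted as wrong.

```python
import csv
from typing import Dict, Iterable, List, Mapping


def exact_match_accuracy(
    labels: Iterable[Mapping[str, str]],
    submission: Iterable[Mapping[str, str]],
    id_col: str = "id",        # assumed column name
    label_col: str = "label",  # assumed column name
) -> float:
    """Fraction of ground-truth rows whose predicted label matches exactly."""
    truth = {row[id_col]: row[label_col] for row in labels}
    if not truth:
        raise ValueError("no ground-truth rows to evaluate against")
    predicted = {row[id_col]: row[label_col] for row in submission}
    # A sample with no prediction never matches, so it counts as an error.
    correct = sum(1 for key, value in truth.items() if predicted.get(key) == value)
    return correct / len(truth)


def read_rows(path: str) -> List[Dict[str, str]]:
    """Load a CSV file with a header row into a list of dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))
```

With two CSV files that follow the assumed layout, `exact_match_accuracy(read_rows(labels_path), read_rows(submission_path))` returns a value between 0 and 1.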

Other tasks

Additional task descriptions (Markdown files) are provided in the tasks/ folder. You can also write your own task description, pass its contents via --instruction, and point --data-dir at your own dataset.


Command line options

To see all available options, run:

aibuildai -h

Interactive form mode

Alternatively, omit --no-form to run AIBuildAI through its interactive form interface:

aibuildai

This will launch a TUI (Text User Interface) where you can fill in the required parameters interactively.


License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.


Citation

@misc{zhang2026aibuildai,
  title={AIBuildAI: An AI agent that automatically builds AI models},
  author={Ruiyi Zhang and Peijia Qin and Qi Cao and Li Zhang and Pengtao Xie},
  year={2026}
}
