A Claude/Codex skill for ensembling gradient boosting models on large-scale tabular datasets, optimized for the Numerai competition.
Built from *The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data*. I wanted to see how much of that accumulated wisdom could be distilled into a skill that Codex CLI could actually execute on.
Generated using Claude Opus 4.5 with Anthropic's official skill-creator.
Claude recommended embedding Numerai specializations (era-based validation, embargo handling, multi-target ensembling) in the skill instead of the prompt.
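Era-based validation with an embargo is the core of those specializations: Numerai targets overlap across neighboring eras, so eras adjacent to the validation block must be dropped from training. A minimal sketch of what such a splitter might look like (the fold count and embargo width here are illustrative assumptions, not the skill's exact settings):

```python
import numpy as np

def era_splits(eras, n_folds=4, embargo=4):
    """Yield (train_eras, val_eras) pairs: contiguous era blocks for
    validation, with `embargo` eras dropped on each side of the block
    to limit leakage from overlapping targets."""
    unique = np.array(sorted(set(eras)))
    for val_eras in np.array_split(unique, n_folds):
        lo, hi = val_eras[0], val_eras[-1]
        mask = (unique < lo - embargo) | (unique > hi + embargo)
        yield unique[mask], val_eras

# Example over the era range this repo uses:
for train_eras, val_eras in era_splits(range(200, 1001)):
    assert not set(train_eras) & set(val_eras)  # no era appears in both
```

Multi-target ensembling then averages predictions from models trained on each of the six targets, validated fold by fold with splits like these.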
The skill assumes a Colab Pro+ environment with an A100-80GB GPU. I wanted to give the agent unfettered runway to go deep (30K trees on GPU frameworks). Adjust these assumptions if you're working with different hardware.
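To make the "deep runway" concrete, here is the kind of GPU boosting configuration that assumption implies. These are hypothetical XGBoost-style settings I'm using for illustration; the skill's actual hyperparameters may differ:

```python
# Hypothetical settings sized for an A100-80GB run; the values are
# illustrative assumptions, not the skill's exact configuration.
xgb_params = dict(
    n_estimators=30_000,   # deep tree budget enabled by the GPU runway
    learning_rate=0.01,    # small step size to match the large ensemble
    max_depth=6,
    subsample=0.8,
    colsample_bytree=0.8,
    device="cuda",         # GPU device selector (XGBoost >= 2.0)
    tree_method="hist",
)
# model = xgboost.XGBRegressor(**xgb_params)  # requires xgboost + CUDA
```

On smaller GPUs you would shrink `n_estimators` (or rely on early stopping) and raise `learning_rate` accordingly.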
If you are using your ChatGPT subscription for Codex instead of an API key, you will need to copy the ~/.codex/auth.json file from your local computer to Colab after installing the Codex CLI in the Colab terminal.
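A sketch of that step in the Colab terminal; the /content/auth.json upload path is an assumption (adjust it to wherever you upload the file):

```shell
# After installing the Codex CLI on Colab, create its config directory
mkdir -p ~/.codex
# Upload auth.json from your local ~/.codex/ (e.g. via the Colab file
# browser), then move it into place. /content/auth.json is an assumed
# upload location, not a fixed path.
if [ -f /content/auth.json ]; then
  mv /content/auth.json ~/.codex/auth.json
  chmod 600 ~/.codex/auth.json   # keep the credential file private
fi
```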
The skill and prompt expect a preprocessed dataset:
- Merged `train.parquet` and `validation.parquet` into a single file
- Eras 0200–1000 only
- Six target columns
You'll need to generate a similar data file or modify the skill to work with the official Numerai data directly.
This isn't a finished, ready-to-use project. It's an experiment on what AI skills can accomplish. That said, you should be able to copy the `tabular-ml-modeling` folder into `.codex/skills/` or `.claude/skills/` and reference it in your prompt. The included `numerai_prompt.md` shows how I ran the task. And yes, the prompt was also generated by Claude.