degerhan/tabular-ml-modeling

Tabular ML Modeling Skill for Numerai Data

A Claude/Codex skill for ensembling gradient boosting models on large-scale tabular datasets, optimized for the Numerai competition.

Source

Built from The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data. I wanted to see how much of that accumulated wisdom could be distilled into a skill that Codex CLI could actually execute.

Skill Generation

Generated using Claude Opus 4.5 with Anthropic's official skill-creator.

Numerai-specific logic

Claude recommended embedding the Numerai specializations (era-based validation, embargo handling, multi-target ensembling) in the skill itself rather than in the prompt.
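To make "era-based validation" and "embargo handling" concrete, here is a minimal sketch of an era-grouped K-fold split that drops a few eras on each side of the validation block. It is not taken from the skill: the function name `era_kfold_with_embargo`, the fold layout, and the default embargo of 4 eras are all assumptions chosen for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def era_kfold_with_embargo(
    eras: Sequence[str], n_splits: int = 5, embargo: int = 4
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train_eras, valid_eras) pairs, keeping every era whole.

    Hypothetical helper: eras are assumed to be zero-padded numeric
    strings such as "0200". The `embargo` eras directly before and after
    each validation block are left out of training, so targets that
    overlap in time cannot leak across the boundary.
    """
    unique = sorted(set(eras), key=int)
    if n_splits < 2 or n_splits > len(unique):
        raise ValueError("n_splits must be between 2 and the number of eras")

    fold_size, remainder = divmod(len(unique), n_splits)
    start = 0
    for fold in range(n_splits):
        # Spread any leftover eras over the first `remainder` folds.
        stop = start + fold_size + (1 if fold < remainder else 0)
        valid = unique[start:stop]
        lo, hi = start - embargo, stop + embargo
        train = [era for i, era in enumerate(unique) if i < lo or i >= hi]
        yield train, valid
        start = stop
```

Rows are then selected by era membership, for example `df[df["era"].isin(train_eras)]`, assuming the era label lives in a column called `era`.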

Runtime Environment

The skill assumes a Colab Pro+ environment with an A100-80GB GPU. I wanted to give the agent unfettered runway to go deep (30K trees on GPU frameworks). Adjust these assumptions if you're working with different hardware.
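As a rough illustration of adjusting these assumptions, the sketch below builds a parameter dictionary that switches between a GPU and a CPU setup. The README does not say which frameworks or settings the skill uses, so everything here is an assumption: XGBoost (2.0 or later, scikit-learn interface) is picked as one example of a GPU framework, `xgb_params` is a hypothetical helper, and the learning rate, depth, and early-stopping values are placeholders rather than tuned settings.

```python
from typing import Any, Dict


def xgb_params(use_gpu: bool = True, n_trees: int = 30_000) -> Dict[str, Any]:
    """Return illustrative keyword arguments for xgboost.XGBRegressor.

    Hypothetical helper: only the 30K tree budget comes from the README;
    every other value is a placeholder to adjust for your hardware.
    """
    return {
        "tree_method": "hist",                  # histogram-based tree builder
        "device": "cuda" if use_gpu else "cpu",  # XGBoost >= 2.0 device switch
        "n_estimators": n_trees,                # upper bound on boosting rounds
        "learning_rate": 0.01,                  # placeholder, not tuned
        "max_depth": 6,                         # placeholder, not tuned
        "early_stopping_rounds": 200,           # needs an eval_set at fit time
    }
```

On smaller hardware, something like `xgb_params(use_gpu=False, n_trees=2_000)` keeps the same structure with a lighter budget; the dictionary can be passed as `xgboost.XGBRegressor(**xgb_params())`.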

If you are using your ChatGPT subscription for Codex instead of an API key, install Codex CLI in the Colab terminal first, then copy ~/.codex/auth.json from your local computer to the same path on Colab.

Data

The skill and prompt expect a preprocessed dataset:

  • Merged train.parquet and validation.parquet into a single file
  • Eras 0200–1000 only
  • Six target columns

You'll need to generate a similar data file or modify the skill to work with the official Numerai data directly.
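One way to produce a comparable file is sketched below. It assumes the official files carry an `era` column of zero-padded strings and feature columns prefixed with `feature`, and it leaves the choice of the six targets to the caller, since the README does not name them. The helper names (`era_in_range`, `select_columns`, `build_dataset`) are hypothetical.

```python
from typing import Iterable, List

ERA_MIN, ERA_MAX = 200, 1000  # era range listed above


def era_in_range(era: str, lo: int = ERA_MIN, hi: int = ERA_MAX) -> bool:
    """True when an era label such as "0200" falls inside [lo, hi]."""
    return era.isdigit() and lo <= int(era) <= hi


def select_columns(columns: Iterable[str], target_cols: Iterable[str]) -> List[str]:
    """Keep the era column, all feature columns, and the chosen targets."""
    wanted = set(target_cols)
    return [
        c for c in columns
        if c == "era" or c.startswith("feature") or c in wanted
    ]


def build_dataset(
    train_path: str,
    validation_path: str,
    out_path: str,
    target_cols: Iterable[str],
) -> None:
    """Merge train and validation, filter eras, and write a single file."""
    # Imported here so the pure helpers above work without pandas;
    # reading parquet also needs an engine such as pyarrow.
    import pandas as pd

    merged = pd.concat(
        [pd.read_parquet(train_path), pd.read_parquet(validation_path)], axis=0
    )
    merged = merged.loc[merged["era"].map(era_in_range)]
    merged[select_columns(merged.columns, target_cols)].to_parquet(out_path)
```

Loading both files whole can need a lot of memory; passing `columns=` to `pd.read_parquet` is one way to trim that if it becomes a problem.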

Usage

This isn't a finished, ready-to-use project. It's an experiment in what AI skills can accomplish. That said, you should be able to copy the tabular-ml-modeling folder into .codex/skills/ or .claude/skills/ and reference it in your prompt. The included numerai_prompt.md shows how I ran the task. And yes, the prompt was also generated by Claude.
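The copy step can be scripted if you set up fresh Colab sessions often. The following is a small sketch using only the standard library; `install_skill` is a hypothetical helper, and whether your skills directory lives under the home directory or the project directory is an assumption to check against your own setup.

```python
import shutil
from pathlib import Path


def install_skill(skill_dir: str, skills_root: str) -> Path:
    """Copy a skill folder into a skills directory and return the new path.

    Hypothetical helper: `skills_root` would be something like
    "~/.codex/skills" or "~/.claude/skills".
    """
    src = Path(skill_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"skill folder not found: {src}")
    dest = Path(skills_root).expanduser() / src.name
    dest.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok (Python 3.8+) lets a re-run refresh an existing copy.
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```

For example, `install_skill("tabular-ml-modeling", "~/.codex/skills")` run from the repository root would place the folder where the paragraph above suggests.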

About

Claude skill for The Kaggle Grandmasters Playbook: 7 Battle-Tested Modeling Techniques for Tabular Data specializing in numer.ai data
