This project creates NCAA Division I college basketball tournament bracket predictions (men's or women's) using a neural network.
The starting point is a student research paper in research/—Comparing Various Machine Learning Statistical Methods Using Vari—that compares several machine learning and classical statistical approaches to predicting NCAA Division I basketball tournament outcomes from team-level statistics. The paper frames the problem as a classification task (e.g. which side of a matchup wins) and evaluates how different estimators behave on that data.
mmnn turns that idea into a small, usable codebase: it fetches and normalizes tournament data, builds the same style of stat-based features, trains a neural network as one such model, and adds commands to score a full bracket (holdout year) or a single head-to-head matchup. Men’s and women’s tournaments are both supported.
Tournament data for 2010–2026 (men's) is included under data/men/. To add more years, use mmnn data fetch. Use -w / --women on mmnn data … and mmnn nn … to use women's data under data/women/ (Sports Reference URLs use women instead of men in the path).
Basic workflow (men's, default): Processed *-data.csv files are already in data/men/ and data/women/, so you can train immediately. To add or refresh a year, run mmnn data fetch <year> and mmnn data process <year> first.
mmnn nn train
mmnn nn bracket 2025
mmnn nn predict Duke Siena
Women's tournament: add -w or --women to each command below. Bracket evaluation needs at least two processed years total (e.g. another year already in data/women/ plus the year you evaluate):
mmnn data fetch 2025 --women
mmnn data process 2025 --women
mmnn nn train --women
mmnn nn bracket 2025 --women
mmnn nn predict "South Carolina" UConn --women
mmnn data fetch <year>
mmnn data fetch <year> --women
Fetch the raw bracket and team stats for the given year from Sports Reference.
Process raw data for a given year into the format needed by the neural network. Reads data/men|women/YEAR-teams.csv and YEAR-games.csv, then writes YEAR-data.csv with per-game delta features and a Winner label.
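As a rough illustration of what "per-game delta features and a Winner label" can mean, here is a minimal sketch. The column names (Team, PPG, RPG, Team1, Team2), the sample rows, the delta_ prefix, and the 1/0 label encoding are all assumptions made for the example; the real YEAR-teams.csv, YEAR-games.csv, and YEAR-data.csv layouts may differ.

```python
import csv
import io

# Hypothetical sample data; the real YEAR-teams.csv and YEAR-games.csv
# columns may differ.
TEAMS_CSV = """Team,PPG,RPG
Duke,80.0,38.0
Siena,70.0,33.0
"""
GAMES_CSV = """Team1,Team2,Winner
Duke,Siena,Duke
"""


def build_delta_rows(teams_text, games_text):
    """Return one row per game: stat differences (team1 - team2) plus a label."""
    teams = {row["Team"]: row for row in csv.DictReader(io.StringIO(teams_text))}
    # Every column except the team name is treated as a numeric stat.
    stat_names = [name for name in next(iter(teams.values())) if name != "Team"]
    rows = []
    for game in csv.DictReader(io.StringIO(games_text)):
        t1, t2 = teams[game["Team1"]], teams[game["Team2"]]
        row = {f"delta_{s}": float(t1[s]) - float(t2[s]) for s in stat_names}
        # Assumed encoding: 1 when the first listed team won, else 0.
        row["Winner"] = 1 if game["Winner"] == game["Team1"] else 0
        rows.append(row)
    return rows


rows = build_delta_rows(TEAMS_CSV, GAMES_CSV)
print(rows)
```

Using differences rather than raw stats gives the model one row per matchup that describes how the two teams compare, which fits the classification framing described earlier.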
mmnn data process <year>
mmnn data process <year> --women
Development (with Hatch):
hatch run mmnn data process 2025
Train the model on all *-data.csv files in data/men/ or data/women/ (90% train / 10% test split), then save weights to data/men/model.pt or data/women/model.pt:
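For readers unfamiliar with a 90% / 10% split, the idea can be sketched in a few lines of standard-library Python. The split_rows helper, the fixed seed, and the shuffle-before-split step are assumptions made for illustration; how mmnn actually shuffles, seeds, or stratifies its split is not stated here.

```python
import random


def split_rows(rows, test_fraction=0.1, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for rows loaded from *-data.csv files
train, test = split_rows(rows)
print(len(train), len(test))
```

The held-out 10% is never used to fit the model, so accuracy measured on it gives a rough estimate of how the saved weights behave on games the network has not seen.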
mmnn nn train
mmnn nn train --women
Retrain the network on every *-data.csv except the bracket year, then predict each game in that year’s tournament and print per-game results plus accuracy, log loss, and related metrics. The model is fit in memory only; it does not read or overwrite data/men/model.pt or data/women/model.pt.
You need at least one other processed year besides the bracket year (mmnn data process <year>), and the bracket year’s {year}-games.csv and {year}-teams.csv must exist.
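The holdout evaluation described above can be sketched as follows. The function names, the 0.5 decision threshold, and the sample probabilities are assumptions for the example, not the project's actual code; the metric definitions are the standard ones for binary classification.

```python
import math


def holdout_years(available_years, bracket_year):
    """Return the training years: every processed year except the bracket year."""
    training = [y for y in available_years if y != bracket_year]
    if not training:
        raise ValueError("need at least one processed year besides the bracket year")
    return training


def accuracy(probs, labels):
    """Fraction of games where the 0.5-thresholded prediction matches the label."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def log_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy, clipping probabilities away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


probs = [0.9, 0.2, 0.6, 0.4]  # made-up predicted probabilities, not real results
labels = [1, 0, 0, 1]
print(holdout_years([2023, 2024, 2025], 2025))
print(accuracy(probs, labels), log_loss(probs, labels))
```

Accuracy only counts whether the favored side won, while log loss also penalizes confident wrong predictions, which is why reporting both is informative for bracket evaluation.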
mmnn nn bracket 2025
mmnn nn bracket 2025 --women
Optional --epochs sets the training epoch count (same default as mmnn nn train). Useful for quicker runs while iterating:
mmnn nn bracket 2025 --epochs 50
Predict which team wins (higher- or lower-ranked) given two team names. Team stats are looked up from data/men/2026-teams.csv (or data/women/2026-teams.csv with --women):
mmnn nn predict <team1> <team2>
mmnn nn predict "Ohio State" TCU
Look in the appropriate 2026-teams.csv for the correct team names to use.
From PyPI (end users):
pip install mmnn
From source (development):
Development uses Hatch for environments and commands rather than an editable install (pip install -e .). Clone the repo and run the CLI or tests through Hatch:
hatch run mmnn data fetch 2024
hatch run mmnn data process 2024
hatch run mmnn nn train
hatch run mmnn nn bracket 2025
hatch run mmnn nn predict Duke UConn
hatch run test:test
test:test runs the test script in the test environment (pytest over tests/; see [tool.hatch.envs.test] in pyproject.toml). Use hatch shell if you want an interactive shell with the project and its dependencies on the path.
mmnn is distributed under the terms of the MIT license.