Chrislysen/Norwegian-AI-Championship

NM i AI 2026 — Team INNBerkeley

Norway's National AI Championship · March 19–22, 2026 · 1,000,000 NOK Prize Pool · 410+ Teams

Solo competitor entry across three AI challenges: grocery shelf object detection, AI-powered accounting agent, and Norse world state prediction.

| Task | Score | Details |
|---|---|---|
| Object Detection (NorgesGruppen) | 0.8807 mAP@0.5 | YOLOv8x + multi-scale TTA + classifier re-ranking |
| AI Accounting Agent (Tripletex) | Scored | FastAPI + Gemini 2.5 Flash + Tripletex API |
| Astar Island | Built | Viewport observation + prior-based prediction |

Repository Structure

nm-ai-2026/
├── object-detection/
│   ├── run.py                  # Competition inference script (ONNX + TTA + WBF + classifier)
│   ├── train.py                # YOLOv8x training at 1280px resolution
│   ├── train_v8l.py            # YOLOv8l backup model training at 1024px
│   ├── export_onnx.py          # PyTorch → ONNX export with FP32 at 1280px
│   ├── prepare_data.py         # COCO → YOLO format conversion
│   ├── build_ref_bank.py       # Reference product image feature extraction
│   ├── eval_submission.py      # Local mAP evaluation against COCO ground truth
│   ├── gpu_sweep.py            # GPU-accelerated parameter sweep (WBF, thresholds)
│   ├── sweep_eval.py           # Multi-scale inference + parameter optimization
│   └── package_submission.py   # Zip packaging with size validation
│
├── tripletex-agent/
│   ├── main.py                 # FastAPI server with /solve and /health endpoints
│   ├── agent.py                # LLM-powered agent with Gemini 2.5 Flash
│   ├── tripletex_client.py     # Tripletex REST API client with Basic Auth
│   ├── prompts.py              # Norwegian accounting system prompt + API schemas
│   ├── requirements.txt        # Python dependencies
│   └── Dockerfile              # Cloud Run deployment container
│
├── astar-island/
│   ├── main.py                 # End-to-end pipeline: observe → predict → submit
│   ├── client.py               # Astar Island REST API client
│   ├── config.py               # Configuration, terrain classes, API settings
│   ├── predictor.py            # Prior + observation blending for probability tensors
│   ├── aggregator.py           # Viewport observation frequency aggregation
│   ├── tiler.py                # Viewport tiling strategy for 40×40 maps
│   └── scores.py               # Score retrieval and display
│
└── README.md

1. Object Detection Pipeline

Challenge

Detect and classify 357 grocery product categories on store shelf images. Scored by mAP@0.5 with a weighted split: 70% detection (bounding box accuracy, category ignored) and 30% classification (correct product identification).

  • Training data: 248 shelf images, ~22,700 COCO-format bounding box annotations
  • Categories: 357 product classes across 4 store sections (Egg, Frokost, Knekkebrod, Varmedrikker)
  • Reference data: 327 individual products with multi-angle photos
  • Sandbox: NVIDIA L4 GPU (24GB), Python 3.11, 300-second timeout, no network access
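
The 70/30 weighting in the challenge description can be written as a one-line formula. The sketch below assumes the two components are simply combined linearly; the function name and the example numbers are illustrative, not taken from the competition's scoring code.

```python
def combined_score(detection_map: float, classification_map: float) -> float:
    """Blend the detection and classification mAP values with a 70/30 split."""
    for name, value in (("detection_map", detection_map),
                        ("classification_map", classification_map)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 0.7 * detection_map + 0.3 * classification_map


# Illustrative inputs only, not results from the competition.
example = combined_score(0.90, 0.80)
print(f"{example:.4f}")
```

Because detection carries most of the weight, a box that is well localized but mislabeled still earns the larger share of the credit under this reading.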

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                        INFERENCE PIPELINE                           │
│                                                                     │
│  ┌──────────┐    ┌───────────────────┐    ┌──────────────────────┐  │
│  │          │    │  Multi-Scale TTA   │    │                      │  │
│  │  Input   │───▶│  960px + 1280px    │───▶│   Weighted Box       │  │
│  │  Image   │    │  + 1536px          │    │   Fusion (WBF)       │  │
│  │          │    │                    │    │   IoU=0.55           │  │
│  └──────────┘    └───────────────────┘    └──────────┬───────────┘  │
│                                                      │              │
│                         ┌────────────────────────────┘              │
│                         ▼                                           │
│            ┌─────────────────────────┐                              │
│            │   Section-Aware Filter  │                              │
│            │   Suppress impossible   │                              │
│            │   categories per store  │                              │
│            │   section (4 sections)  │                              │
│            └────────────┬────────────┘                              │
│                         ▼                                           │
│            ┌─────────────────────────┐                              │
│            │  Classifier Re-Ranking  │                              │
│            │  EfficientNet (timm)    │                              │
│            │  crops vs reference     │                              │
│            │  product images         │                              │
│            └────────────┬────────────┘                              │
│                         ▼                                           │
│            ┌─────────────────────────┐                              │
│            │   COCO JSON Output      │                              │
│            │   predictions.json      │                              │
│            └─────────────────────────┘                              │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│                        TRAINING PIPELINE                            │
│                                                                     │
│  ┌──────────┐    ┌────────────────┐    ┌──────────────────────────┐ │
│  │  COCO    │    │  prepare_data  │    │   YOLOv8x Training       │ │
│  │  Dataset │───▶│  .py           │───▶│   imgsz=1280, batch=1    │ │
│  │  248 img │    │  COCO → YOLO   │    │   AdamW, cos_lr          │ │
│  └──────────┘    └────────────────┘    │   epochs=300, patience=80│ │
│                                        └────────────┬─────────────┘ │
│                                                     ▼               │
│  ┌──────────────┐    ┌────────────────┐    ┌────────────────────┐   │
│  │  submission   │    │  package_      │    │  export_onnx.py    │   │
│  │  .zip         │◀──│  submission.py  │◀──│  ONNX @ 1280px     │   │
│  │  324 MB       │    │  verify < 420MB│    │  FP32, opset=17    │   │
│  └──────────────┘    └────────────────┘    └────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
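
The hyperparameters in the training diagram map fairly directly onto Ultralytics arguments. The following is a minimal sketch of what such a train and export step might look like, not the contents of train.py or export_onnx.py; the dataset path `data.yaml` and the starting checkpoint `yolov8x.pt` are assumptions.

```python
# Values copied from the training diagram; "data" is a hypothetical path.
TRAIN_ARGS = {
    "data": "data.yaml",
    "imgsz": 1280,
    "batch": 1,
    "optimizer": "AdamW",
    "cos_lr": True,
    "epochs": 300,
    "patience": 80,
}

# half=False keeps the exported graph in FP32.
EXPORT_ARGS = {"format": "onnx", "imgsz": 1280, "opset": 17, "half": False}


def train(weights: str = "yolov8x.pt"):
    """Fine-tune from a checkpoint (assumed here to be the pretrained YOLOv8x)."""
    from ultralytics import YOLO  # lazy import: requires the ultralytics package
    return YOLO(weights).train(**TRAIN_ARGS)


def export(trained_weights: str) -> str:
    """Export a trained checkpoint to ONNX and return the output path."""
    from ultralytics import YOLO
    return YOLO(trained_weights).export(**EXPORT_ARGS)
```

Keeping the arguments in plain dictionaries makes it easy to confirm that the export resolution matches the training resolution.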

Training Iterations

| Run | Model | Resolution | Epochs | Val mAP@0.5 | Notes |
|---|---|---|---|---|---|
| 1 | YOLOv8l | 640 | 81 (early stop) | 0.764 | First successful run, OOM debugging |
| 2 | YOLOv8x | 960 | 117 (interrupted) | 0.763 | Auto-batch tuning |
| 3 | YOLOv8l | 1024 | 67 (early stop) | 0.765 | Backup model for ensemble experiments |
| 4 | YOLOv8x | 1280 | ~200 | 0.8807* | Overnight training, final submission |

*Competition leaderboard score with full inference pipeline

Optimization Journey

| Optimization | Impact | Description |
|---|---|---|
| Baseline (single scale) | 0.7202 | YOLOv8l @ 1024, single inference pass |
| + Section restriction | +0.005 | Suppress impossible categories per store section |
| + Multi-scale TTA | +0.036 | Inference at 960 + 1280 + 1536px with WBF fusion |
| + Classifier re-ranking | +0.003 | EfficientNet crop classifier vs reference product images |
| + WBF parameter tuning | +0.009 | conf_type=box_and_model_avg, removing dedup NMS |
| Training at 1280 resolution | ~+0.10 | Eliminating train/inference resolution mismatch |
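
The section restriction step amounts to dropping detections whose category cannot occur in the image's store section. A minimal sketch follows, assuming the section of each image is already known and that detections are dictionaries with a `category_id` key; the mapping and the category IDs are made up for illustration.

```python
from typing import Dict, List, Set

# Hypothetical mapping: store section -> category IDs that can appear there.
SECTION_CATEGORIES: Dict[str, Set[int]] = {
    "Egg": {0, 1, 2},
    "Frokost": {3, 4, 5},
}


def filter_by_section(detections: List[dict], section: str) -> List[dict]:
    """Keep only detections whose category is plausible for the section."""
    allowed = SECTION_CATEGORIES.get(section)
    if allowed is None:
        # Unknown section: do not suppress anything.
        return list(detections)
    return [d for d in detections if d["category_id"] in allowed]
```

Falling back to the unfiltered list for unknown sections keeps the filter from deleting valid boxes when the section cannot be determined.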

Final Inference Configuration

TTA_SCALES = [960, 1280, 1536]
TTA_HFLIP = False

WBF_IOU = 0.55
CONF_TYPE = "box_and_model_avg"
FINAL_SCORE_THRES = 0.001
DEDUP_IOU = 1.0

CLS_OVERRIDE_SIM = 0.55
CLS_YOLO_CONF_THRES = 0.60
CLS_BOOST = 1.30
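
These WBF settings correspond to parameters of `weighted_boxes_fusion` in the ensemble-boxes package, which expects boxes normalized to the [0, 1] range. The sketch below shows one way the per-scale predictions could be fused. It is not the code from run.py, and applying FINAL_SCORE_THRES after fusion is an assumption about how that constant is used.

```python
from typing import List, Sequence, Tuple

WBF_IOU = 0.55
CONF_TYPE = "box_and_model_avg"
FINAL_SCORE_THRES = 0.001

Box = Tuple[float, float, float, float]


def normalize_boxes(boxes: Sequence[Box], width: float, height: float) -> List[List[float]]:
    """Convert pixel [x1, y1, x2, y2] boxes to [0, 1] coordinates, clipped."""
    def clip(v: float) -> float:
        return min(max(v, 0.0), 1.0)

    return [[clip(x1 / width), clip(y1 / height), clip(x2 / width), clip(y2 / height)]
            for x1, y1, x2, y2 in boxes]


def fuse_scales(boxes_per_scale, scores_per_scale, labels_per_scale):
    """Fuse predictions from all TTA scales; each list has one entry per scale."""
    # Lazy import so normalize_boxes works without ensemble-boxes installed.
    from ensemble_boxes import weighted_boxes_fusion

    boxes, scores, labels = weighted_boxes_fusion(
        boxes_per_scale, scores_per_scale, labels_per_scale,
        iou_thr=WBF_IOU, conf_type=CONF_TYPE,
    )
    keep = scores >= FINAL_SCORE_THRES  # assumed post-fusion score filter
    return boxes[keep], scores[keep], labels[keep]
```

Boxes predicted at 960, 1280, and 1536px would first be mapped back to the original image size, then passed through `normalize_boxes` with that image's width and height.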

Sandbox Constraints

The competition sandbox blocks imports it considers dangerous (os, sys, subprocess, pickle, yaml, shutil, multiprocessing, threading, etc.), so all file operations use pathlib and configuration is stored as JSON. A submission may be at most 420MB and may contain at most 3 weight files and 10 Python files.
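
Both constraints can be handled with pathlib and json alone. The following is a rough sketch of a pre-submission check, not the contents of package_submission.py; the weight-file suffixes and the reading of "420MB" as 420 × 1024 × 1024 bytes are assumptions.

```python
import json
from pathlib import Path

MAX_BYTES = 420 * 1024 * 1024        # assumes "420MB" means mebibytes
MAX_WEIGHT_FILES = 3
MAX_PY_FILES = 10
WEIGHT_SUFFIXES = {".onnx", ".pt"}   # hypothetical set of weight file types


def check_submission(root: Path) -> dict:
    """Count files and bytes under root and compare them against the limits."""
    files = [p for p in root.rglob("*") if p.is_file()]
    report = {
        "total_bytes": sum(p.stat().st_size for p in files),
        "weight_files": sum(p.suffix in WEIGHT_SUFFIXES for p in files),
        "python_files": sum(p.suffix == ".py" for p in files),
    }
    report["ok"] = (report["total_bytes"] <= MAX_BYTES
                    and report["weight_files"] <= MAX_WEIGHT_FILES
                    and report["python_files"] <= MAX_PY_FILES)
    return report


def load_config(path: Path) -> dict:
    """Read a JSON config file without importing os or yaml."""
    return json.loads(path.read_text(encoding="utf-8"))
```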


2. AI Accounting Agent (Tripletex)

Challenge

Build an autonomous AI agent exposed as an HTTPS endpoint that receives accounting tasks in Norwegian natural language, parses them, and executes the correct API calls against Tripletex — Norway's leading accounting platform. Scored on field-by-field correctness with an efficiency bonus for fewer API calls.

  • 30 unique task types across 3 difficulty tiers
  • 300-second timeout per task
  • Fresh API credentials provided with each submission
  • Norwegian language prompts with domain-specific accounting terminology

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                     TRIPLETEX AI AGENT                                │
│                                                                      │
│  Competition                                                         │
│  Platform        ┌─────────────┐                                     │
│     │            │  FastAPI     │                                     │
│     │  POST      │  /solve      │                                     │
│     ├───────────▶│  endpoint    │                                     │
│     │            └──────┬──────┘                                     │
│     │                   │                                            │
│     │            ┌──────▼──────────────────────────┐                 │
│     │            │     API Discovery Layer          │                 │
│     │            │  GET /vatType, /currency,         │                 │
│     │            │  /costCategory, /paymentType,     │                 │
│     │            │  /department, /employee            │                 │
│     │            └──────┬──────────────────────────┘                 │
│     │                   │ cached IDs injected                        │
│     │            ┌──────▼──────────────────────────┐                 │
│     │            │     Gemini 2.5 Flash             │                 │
│     │            │  Parse Norwegian prompt →         │                 │
│     │            │  Structured tool calls            │                 │
│     │            └──────┬──────────────────────────┘                 │
│     │                   │ function calls                             │
│     │            ┌──────▼──────────────────────────┐                 │
│     │            │  Tripletex API Client            │                 │
│     │            │  Basic Auth (0:session_token)     │                 │
│     │            │  POST /employee, /customer,       │                 │
│     │            │  /invoice, /order, etc.            │                 │
│     │            └──────┬──────────────────────────┘                 │
│     │                   │                                            │
│     │  {"status":       │                                            │
│     │   "completed"}    │                                            │
│     │◀──────────────────┘                                            │
└──────────────────────────────────────────────────────────────────────┘

How It Works

  1. Request arrives with a Norwegian prompt, optional file attachments (PDFs, images), and fresh Tripletex API credentials
  2. Discovery layer queries Tripletex to cache VAT types, currencies, cost categories, departments, and employees
  3. Gemini 2.5 Flash parses the prompt using the full OpenAPI schema (3.6MB parsed) and returns structured function calls
  4. API client executes calls against the Tripletex proxy using Basic Auth (0:session_token)
  5. Agent returns {"status": "completed"} — the competition platform then verifies field-by-field correctness
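
Step 4 uses HTTP Basic Auth with the username 0 and the session token as the password. A small sketch of how that header can be built with the standard library is shown here; the function name is illustrative, and the actual client in tripletex_client.py may rely on an HTTP library's built-in auth support instead.

```python
import base64


def basic_auth_header(session_token: str, username: str = "0") -> dict:
    """Build an Authorization header for Basic Auth as username:session_token."""
    raw = f"{username}:{session_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder token for illustration only.
headers = basic_auth_header("example-session-token")
print(sorted(headers))
```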

Norwegian Accounting Terminology

| Norwegian | English | API Endpoint |
|---|---|---|
| Opprett en ansatt | Create employee | POST /employee |
| Opprett en kunde | Create customer | POST /customer |
| Lag en faktura | Create invoice | POST /invoice |
| Registrer betaling | Register payment | POST /invoice/{id}/payment |
| Kreditnota | Credit note | POST /creditNote |
| Reiseregning | Travel expense | POST /travelExpense |
| Bokfør bilag | Post voucher | POST /ledger/voucher |
| Bankavstemming | Bank reconciliation | POST /bankReconciliation |
| Kontoadministrator | Admin user | userType: "EXTENDED" |

Technical Decisions

  • Gemini 2.5 Flash for LLM inference — free via GCP, 3-6 second responses, no rate limits
  • OpenAPI spec parsing — extracted exact field names from 3.6MB Tripletex Swagger spec to eliminate validation errors
  • Dynamic ID discovery — pre-fetched real entity IDs on each request so the LLM operates on concrete values
  • 240-second hard timeout — enforced below 300s competition limit to prevent silent timeouts
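
The hard timeout can be enforced by wrapping the agent run in `asyncio.wait_for`. This is a minimal sketch under the assumption that the agent is an async callable; the `{"status": "timeout"}` response is hypothetical, since the source does not say what the service returns when the deadline is hit.

```python
import asyncio

HARD_TIMEOUT_S = 240  # stays below the 300-second competition limit


async def solve_with_deadline(run_agent, timeout_s: float = HARD_TIMEOUT_S) -> dict:
    """Run the agent coroutine and stop waiting once the deadline passes."""
    try:
        await asyncio.wait_for(run_agent(), timeout=timeout_s)
        return {"status": "completed"}
    except asyncio.TimeoutError:
        # Hypothetical response; the real service may answer differently.
        return {"status": "timeout"}
```

`wait_for` cancels the inner task when the timeout expires, so the handler can still respond before the platform's own limit is reached.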

3. Astar Island — Norse World Prediction

Challenge

Observe a procedurally generated Norse civilization simulator through limited 15×15 viewports (50 queries shared across 5 seeds) and predict the final terrain probability distribution across a 40×40 grid with 6 terrain types. Scored by entropy-weighted KL divergence.
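
With 15×15 viewports on a 40×40 grid, three overlapping positions per axis are enough to see every cell, which gives nine viewports per map. The sketch below computes evenly spaced offsets. Treating each offset as the top-left corner of a viewport is an assumption about the API, and this is not necessarily how tiler.py places its tiles.

```python
from typing import List, Tuple


def tile_origins(grid: int = 40, viewport: int = 15) -> List[int]:
    """Evenly spaced start offsets so viewports cover every index in [0, grid)."""
    count = -(-grid // viewport)          # ceiling division: tiles per axis
    if count == 1:
        return [0]
    last = grid - viewport                # start offset of the final tile
    return [(i * last) // (count - 1) for i in range(count)]


def tile_grid(grid: int = 40, viewport: int = 15) -> List[Tuple[int, int]]:
    """All (row, col) viewport origins needed to cover a square grid."""
    origins = tile_origins(grid, viewport)
    return [(r, c) for r in origins for c in origins]


print(tile_origins())      # offsets along one axis
print(len(tile_grid()))    # number of viewports per map
```

The overlap between neighboring tiles means some cells are observed more than once, which gives slightly more evidence for those cells.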

Approach

  • Tiled observation: 9 viewport tiles cover the full 40×40 map per seed (the 50-query budget allows 10 queries per seed)
  • Prior initialization: Initial terrain type boosted by 0.5 probability weight
  • Observation blending: Smoothed empirical frequency merged with prior for observed cells, pure prior for unobserved
  • Probability floor: Minimum 0.01 per terrain type to avoid KL divergence explosions
  • Validation: Per-cell probability normalization verified before submission
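
The prior, blending, and floor steps above can be sketched for a single cell as follows. Several details are assumptions rather than the logic in predictor.py: the prior is taken to be uniform plus 0.5 on the initial terrain type, the blend uses a pseudo-count weight `alpha`, and the floor is applied by mixing with a constant so the result still sums to 1.

```python
from typing import List

NUM_CLASSES = 6
PRIOR_BOOST = 0.5
FLOOR = 0.01


def prior_for(initial_class: int, num_classes: int = NUM_CLASSES) -> List[float]:
    """Uniform prior with extra weight on the initially observed terrain type."""
    weights = [1.0 / num_classes] * num_classes
    weights[initial_class] += PRIOR_BOOST
    total = sum(weights)
    return [w / total for w in weights]


def apply_floor(probs: List[float], floor: float = FLOOR) -> List[float]:
    """Guarantee every entry is >= floor while keeping the sum at 1."""
    scale = 1.0 - floor * len(probs)
    return [floor + scale * p for p in probs]


def predict_cell(initial_class: int, counts: List[int], alpha: float = 1.0) -> List[float]:
    """Blend observation counts with the prior; alpha is a hypothetical prior weight."""
    prior = prior_for(initial_class, len(counts))
    n = sum(counts)
    if n == 0:
        blended = prior                      # unobserved cell: pure prior
    else:
        blended = [(c + alpha * p) / (n + alpha) for c, p in zip(counts, prior)]
    return apply_floor(blended)
```

Mixing with a constant, rather than clipping and renormalizing, ensures no class falls below 0.01 after normalization, so a single wrong prediction cannot drive the KL term to infinity.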

Installation

Object Detection

cd object-detection/

# Training environment
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
pip install ultralytics==8.1.0
pip install onnx onnxruntime-gpu onnxsim ensemble-boxes timm==0.9.12

# Prepare data (COCO → YOLO format)
python prepare_data.py

# Train
python train.py

# Export to ONNX
python export_onnx.py

# Evaluate locally
python eval_submission.py

# Package submission
python package_submission.py

Tripletex Agent

cd tripletex-agent/

pip install -r requirements.txt

# Set environment variable
export GEMINI_API_KEY="your-gemini-api-key"

# Run locally
python main.py  # Starts on port 8080

# Deploy to Google Cloud Run
gcloud run deploy tripletex-agent \
  --source . \
  --region europe-north1 \
  --allow-unauthenticated \
  --memory 1Gi \
  --timeout 300

Astar Island

cd astar-island/

pip install requests numpy

# Edit config.py with your JWT token
python main.py

Competition Context

NM i AI 2026 is Norway's national AI championship — a 69-hour competition bringing together 3,000+ participants across 410+ teams competing for a 1,000,000 NOK prize pool. The competition featured three independent AI challenges spanning computer vision, autonomous agents, and probabilistic prediction, with the overall score calculated as the average of normalized scores across all tasks.

This was my first AI competition. I competed solo, building all three systems from scratch over a single weekend — debugging CUDA memory constraints at 3am, iterating on model architectures, parsing Norwegian accounting terminology, and managing three parallel codebases under extreme time pressure. The experience was an intensive crash course in applied ML engineering: every decision was a real-time tradeoff between model quality, inference speed, and the ticking clock.

Tech Stack

| Category | Tools |
|---|---|
| ML Frameworks | PyTorch, Ultralytics YOLOv8, ONNX Runtime, timm |
| LLM APIs | Google Gemini 2.5 Flash |
| Backend | FastAPI, Docker |
| Infrastructure | Google Cloud Platform (Cloud Run, Vertex AI) |
| Hardware | NVIDIA RTX 5080 Laptop (16GB), NVIDIA L4 (competition sandbox) |

License

MIT


Built in 48 hours as a solo competitor. A first AI competition — and definitely not the last.
