This repository contains a small experimental test harness for applying the
EML operator from EML.tex to Transformer learning-rate schedules.
The foundational identity is:
eml(x, y) = exp(x) - log(y)
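In Python, the operator is a one-liner (a direct transcription of the identity above, not necessarily the exact code in test_modeling.py):

```python
import math

def eml(x: float, y: float) -> float:
    """EML operator: exp(x) - log(y). Defined for y > 0."""
    return math.exp(x) - math.log(y)

# Example: eml(0, 1) = exp(0) - log(1) = 1.0
```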
test_modeling.py uses that operator to build an EML-shaped warmup and decay
schedule, then compares it against common Transformer schedules:
- cosine decay with warmup
- linear decay with warmup
- inverse-sqrt schedule
- constant schedule with warmup
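To make the comparison concrete, here is a hypothetical sketch of how an EML-shaped decay multiplier can be built next to a cosine baseline. The shaping choices (warmup length, the `1 - t` / `1 + t` arguments, the normalization to the [0, 1] range) are illustrative assumptions, not the exact builders in test_modeling.py:

```python
import math

def eml(x, y):
    """EML operator: exp(x) - log(y)."""
    return math.exp(x) - math.log(y)

def eml_multiplier(t):
    """Hypothetical EML-shaped decay on progress t in [0, 1].
    f(u) = eml(1 - u, 1 + u) is strictly decreasing on [0, 1];
    normalizing makes the multiplier start at 1 and end at 0."""
    f = lambda u: eml(1 - u, 1 + u)
    return (f(t) - f(1)) / (f(0) - f(1))

def cosine_multiplier(t):
    """Standard cosine decay on progress t in [0, 1]."""
    return 0.5 * (1 + math.cos(math.pi * t))

def with_warmup(decay, step, warmup_steps=4, total_steps=32):
    """Linear warmup to the peak LR, then the given decay over the rest."""
    if step < warmup_steps:
        return step / warmup_steps
    t = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    return decay(t)
```

Both multipliers are 1 at the end of warmup and 0 at the final step; the difference being tested is the shape of the curve in between.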
The model is a tiny BertForSequenceClassification created from
transformers.BertConfig, so the tests do not download pretrained weights.
Dropout is disabled to make the scheduler comparison deterministic. The
training and validation batches are synthetic but separate; both use the same
simple token-label rule.
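As a sketch of what "synthetic but separate, same rule" can look like, here is a hypothetical batch generator (the actual rule and shapes in test_modeling.py may differ; token-sum parity is an assumed stand-in for the real rule):

```python
import random

def make_batch(n, seq_len, vocab_size, seed):
    """Hypothetical synthetic data: label is the parity of the token sum.
    Different seeds give separate train/validation batches that still
    follow the same underlying token-label rule."""
    rng = random.Random(seed)
    tokens = [[rng.randrange(vocab_size) for _ in range(seq_len)]
              for _ in range(n)]
    labels = [sum(seq) % 2 for seq in tokens]
    return tokens, labels

train = make_batch(8, 16, 100, seed=0)
val = make_batch(8, 16, 100, seed=1)
```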
Project Files
- EML.tex: source research note describing the EML operator and the symbolic regression/master-formula idea.
- test_modeling.py: pytest file with the EML operator, LR scheduler builders, schedule trace tests, and a tiny Transformer training comparison.
- pyproject.toml: Python project metadata and dependencies.
- main.py: placeholder entry point.
Setup
This project is configured for uv.
uv sync
The declared dependencies are:
pytest
torch
transformers
pyproject.toml currently requires Python >=3.14. If your local Torch or
Transformers build does not support that Python version yet, use a compatible
Python version and adjust requires-python accordingly.
Run Tests
uv run pytest -q
Or run the modeling test directly:
uv run pytest -q test_modeling.py
If torch or transformers are missing, test_modeling.py skips cleanly via
pytest.importorskip.
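The skip mechanism can be seen in isolation. pytest.importorskip returns the module if it imports, and otherwise marks the test (or the whole module, when called at top level) as skipped rather than errored. Demonstrated here with a stdlib module; test_modeling.py applies the same call to torch and transformers:

```python
import pytest

# Returns the imported module on success; skips instead of raising
# ImportError when the dependency is absent.
math = pytest.importorskip("math")
assert math.sqrt(4) == 2.0
```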
Show Scheduler Logs
The comparison tests emit LR traces, training losses, and held-out validation losses through Python logging. Use pytest log output to inspect them:
uv run pytest -q test_modeling.py -o log_cli=true --log-cli-level=INFO
Expected log records include entries like:
schedule=eml lr_trace_start=[...] lr_trace_end=[...] peak=... final=...
schedule=cosine step=0 train_loss=... val_loss=... lr=...
schedule=linear initial_train_loss=... best_val_loss=... final_val_loss=...
What This Measures
These tests are smoke tests and instrumentation, not proof that the EML schedule is better in general. They verify that:
- the EML log identity matches torch.log on positive real inputs
- each schedule produces finite LR values
- the EML schedule warms up and then decays over a 32-point trace
- a tiny Transformer can run 32 training steps under each schedule
- comparable LR, training-loss, and held-out validation-loss logs are available for inspection
Current Toy Result
On the current 32-step synthetic setup, the observed ranking by final validation loss (best first) is:
constant > EML > inverse-sqrt > linear > cosine
The most recent logged run produced:
schedule final_train_loss final_val_loss
constant 0.138098 0.126626
EML 0.292432 0.290060
inverse-sqrt 0.352382 0.343020
linear 0.363872 0.363223
cosine 0.367602 0.367854
Interpretation:
- constant LR wins overall because the task is easy and benefits from continued high learning rate
- EML is currently the best decaying schedule in this harness
- cosine and linear decay too aggressively for this short run
- inverse-sqrt keeps a larger late LR than EML but performs worse on this synthetic rule
This is still not a real benchmark. To evaluate whether EML is actually better, run larger ablations with the same model, data, optimizer, seeds, and training budget, changing only the LR schedule. Useful next stress tests: a higher base LR, 128+ steps, multiple seeds, and train-only label noise with clean validation.