Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
Updated May 17, 2025 · Python
[CVPR2025] CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Implements the paper "Wukong: Towards a Scaling Law for Large-Scale Recommendation" from Meta.
Navigating Model Phase Transitions to Enable Extreme Lossless Compression: A Perspective
This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (NeurIPS 2023).
(NeurIPS 2025) Official Code for L²M: Mutual Information Scaling Law for Long-Context Language Modeling
Scaling Law of Neural Koopman Operators
Theory-derived PyTorch optimizer matching Adam with zero tuning. τ* = κ√(σ²/λ) — validated on CIFAR-10 and CIFAR-100
Scaling laws for positron production in laser–electron-beam collisions
Optimize learning rates dynamically with the Syntonic optimizer, enhancing performance beyond Adam without the need for hyperparameter tuning.
Bond Order Scaling Law: 100% accurate prediction of molecular vs. crystalline character for binary compounds. Cascade classifier (L4→L5→L6) covering 76K JARVIS-DFT materials; no DFT required.