The Weighted Memory

Author: npub1cgppglfhgq0...
Published:
Format: Markdown (kind 30023)
Identifier:
naddr1qvzqqqr4gupzpsszz37nwsqljzg5jmsnj5t0yjwhrgs2zlm597gav6vh3w72242xqqvrxd3kxvkhg6r994mk26t8dp6x2epdd4jk6mmj0yxagaxv

Online learning with expert advice assumes each round's loss is incurred over the same pool of experts, weighted by an algorithm that accumulates evidence about who is good. The Weighted Majority and Hedge algorithms are near-optimal for this setting: their regret grows as the square root of the horizon, with only a logarithmic dependence on the number of experts. Yoav Freund and Robert Schapire proved the foundational bounds assuming a fixed learning rate or a tuned decreasing schedule. Huang, Zimmert, and Seldin demonstrate that the AdaHedge algorithm, which adapts its learning rate based on the observed mixability gap, achieves the optimal constant in front of the leading regret term simultaneously for every loss sequence, without requiring knowledge of the time horizon. The mixability gap measures how much the algorithm's weighted-average loss exceeds its mix loss at each round, and AdaHedge uses the accumulated gap as an automatic learning-rate controller.
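Concretely, the quantities involved can be written as follows, using the standard AdaHedge formulation from the literature (the notation here is mine, not necessarily the paper's): with weight vector $w_t$ over $N$ experts and loss vector $\ell_t$,

```latex
h_t = \sum_{i=1}^{N} w_{t,i}\,\ell_{t,i}
  \qquad \text{(Hedge loss: the weighted-average loss)}

m_t = -\frac{1}{\eta_t}\ln \sum_{i=1}^{N} w_{t,i}\, e^{-\eta_t \ell_{t,i}}
  \qquad \text{(mix loss: a soft minimum of the losses)}

\delta_t = h_t - m_t \;\ge\; 0
  \qquad \text{(mixability gap)}

\eta_t = \frac{\ln N}{\Delta_{t-1}},
  \qquad \Delta_t = \sum_{s=1}^{t} \delta_s
  \qquad \text{(AdaHedge's self-tuned learning rate)}
```

The gap $\delta_t$ is large when the weighted mixture is still uncertain (high curvature) and shrinks to zero as the weights concentrate on a single good expert, which is exactly the signal a learning-rate controller needs.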

The structural innovation is that the algorithm learns its own learning rate from the same data it uses to learn about experts. The mixability gap is a second-order quantity --- it measures the curvature of the loss landscape experienced by the mixture, not just the loss values themselves. Using curvature to control step size is the same principle as Newton's method in optimization, but here it operates in the adversarial online setting where the loss sequence is chosen by a potentially hostile environment. The result is a parameter-free algorithm that matches the performance of the best fixed learning rate in hindsight, which itself requires knowing the loss sequence in advance.
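The loop above can be sketched in a few lines. This is a minimal illustration following the standard AdaHedge recipe, not the paper's reference implementation; variable names and the `(T, N)` loss-matrix interface are my own assumptions.

```python
import math
import numpy as np

def adahedge(losses):
    """AdaHedge sketch. `losses` is a (T, N) array of expert losses in [0, 1].

    Returns (learner's cumulative loss, best expert's cumulative loss).
    """
    T, N = losses.shape
    L = np.zeros(N)      # cumulative loss of each expert
    Delta = 0.0          # cumulative mixability gap
    learner = 0.0        # learner's cumulative weighted-average loss
    for t in range(T):
        if Delta <= 0.0:
            # Learning rate is effectively infinite: follow the leader,
            # uniform over the currently best experts.
            eta = math.inf
            w = (L == L.min()).astype(float)
            w /= w.sum()
        else:
            eta = math.log(N) / Delta          # the self-tuned learning rate
            v = np.exp(-eta * (L - L.min()))   # shift by the min for stability
            w = v / v.sum()
        ell = losses[t]
        h = float(w @ ell)                     # Hedge (weighted-average) loss
        if math.isinf(eta):
            m = float(ell[w > 0].min())        # mix loss in the eta -> inf limit
        else:
            m = -math.log(float(w @ np.exp(-eta * ell))) / eta  # mix loss
        Delta += max(0.0, h - m)               # accumulate the mixability gap
        learner += h
        L += ell
    return learner, float(L.min())
```

Note the self-reference: `Delta`, computed from the algorithm's own past mixture losses, sets `eta`, which determines the next mixture. No horizon, loss range, or tuning constant is supplied from outside.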

Adaptive methods that match oracle-tuned fixed methods represent a deeper principle: when the right parameter depends on information unavailable at decision time, the system can sometimes extract that information from its own trajectory. The curvature of experienced loss encodes the difficulty of the problem, and an algorithm that reads its own curvature adjusts automatically to problem difficulty. Self-referential tuning --- using the structure of past performance to set the parameters governing future performance --- is the mechanism that makes parameter-free learning possible.

(arXiv:2603.19198)
