Welcome to PyLO!

Overview

PyLO is a PyTorch-based learned optimizer library that enables researchers and practitioners to implement, experiment with, and share learned optimizers. It bridges the gap between learned optimizer research and practical, real-world training workloads.

Check out our paper on arXiv: https://arxiv.org/abs/2506.10315

Note

New to PyLO? Check out our Usage Guide and explore complete training examples in the pylo_examples repository.

Key Features

  • Pre-trained learned optimizers ready for production use

  • Seamless integration with the PyTorch optim library and existing training loops

  • Comprehensive benchmarking utilities against standard optimizers (a minimal comparison sketch follows the quick example below)

  • Support for sharing model weights through the Hugging Face Hub, as sketched below

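Pre-trained optimizer weights are distributed through the Hugging Face Hub. As a hedged illustration of the underlying mechanism (not necessarily PyLO's own loading API), the sketch below fetches a checkpoint with the huggingface_hub client; the repo_id and filename are hypothetical placeholders for whatever checkpoint is actually published.

import torch
from huggingface_hub import hf_hub_download

# Hypothetical repo id and filename -- substitute the checkpoint actually
# published for the optimizer you want to use
ckpt_path = hf_hub_download(repo_id="your-org/velo-weights", filename="velo.pt")
state_dict = torch.load(ckpt_path, map_location="cpu")  # assumes a plain PyTorch checkpoint
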
Quick Example

import torch
from pylo.optim import VeLO_CUDA

# Initialize a model
model = torch.nn.Linear(10, 2)

# Create a learned optimizer instance
optimizer = VeLO_CUDA(model.parameters())

# Toy data and loss so the snippet runs end to end
loss_fn = torch.nn.MSELoss()
inputs, targets = torch.randn(32, 10), torch.randn(32, 2)

# Use it like any PyTorch optimizer
for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step(loss)  # pass the loss; VeLO conditions its update on it

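The benchmarking utilities mentioned above amount to running a learned optimizer head-to-head against a hand-designed baseline. Below is a minimal sketch building directly on the quick example; it assumes only the VeLO_CUDA usage shown there plus the standard torch.optim.AdamW, not any specific PyLO benchmarking API.

import copy
import torch
from pylo.optim import VeLO_CUDA

# Two identical copies of a toy model so both optimizers start from the same weights
torch.manual_seed(0)
base = torch.nn.Linear(10, 2)
model_velo, model_adamw = copy.deepcopy(base), copy.deepcopy(base)

opt_velo = VeLO_CUDA(model_velo.parameters())
opt_adamw = torch.optim.AdamW(model_adamw.parameters(), lr=1e-3)

loss_fn = torch.nn.MSELoss()
inputs, targets = torch.randn(32, 10), torch.randn(32, 2)

for step in range(100):
    # Learned optimizer: VeLO's update is conditioned on the current loss
    opt_velo.zero_grad()
    loss_velo = loss_fn(model_velo(inputs), targets)
    loss_velo.backward()
    opt_velo.step(loss_velo)

    # Hand-designed baseline: a standard AdamW step
    opt_adamw.zero_grad()
    loss_adamw = loss_fn(model_adamw(inputs), targets)
    loss_adamw.backward()
    opt_adamw.step()

print(f"final loss -- VeLO: {loss_velo.item():.4f}, AdamW: {loss_adamw.item():.4f}")
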
More Examples

Looking for complete, runnable examples? Check out the pylo_examples repository, which includes:

  • Image Classification - Training Vision Transformers (ViT) and ResNets on ImageNet and CIFAR-10

  • Language Modeling - Training GPT-2 models

  • Distributed Training - Multi-GPU examples with FSDP and DDP (a minimal DDP sketch follows below)

Each example includes detailed setup instructions, training scripts, and configuration files to help you get started quickly.
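
For a flavor of the distributed setup, here is a minimal single-node DDP sketch. It uses only standard torch.distributed APIs plus the VeLO_CUDA usage from the quick example; any extra care the learned optimizer needs (e.g. synchronizing the loss across ranks) is covered by the tested recipes in pylo_examples.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from pylo.optim import VeLO_CUDA

# Launch with: torchrun --nproc_per_node=<num_gpus> train.py
dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)
device = torch.device(f"cuda:{local_rank}")

model = DDP(torch.nn.Linear(10, 2).to(device), device_ids=[local_rank])
optimizer = VeLO_CUDA(model.parameters())

loss_fn = torch.nn.MSELoss()
inputs = torch.randn(32, 10, device=device)
targets = torch.randn(32, 2, device=device)

for step in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()       # DDP all-reduces gradients across ranks here
    optimizer.step(loss)  # same step(loss) call as on a single GPU

dist.destroy_process_group()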

How to Cite

If you use PyLO in your research, please cite:

@article{pylo,
  title={PyLO: Towards Accessible Learned Optimizers in PyTorch},
  author={Janson, Paul and Therien, Benjamin and Anthony, Quentin and Huang, Xiaolong and Moudgil, Abhinav and Belilovsky, Eugene},
  journal={arXiv preprint arXiv:2506.10315},
  year={2025}
}