This repository contains the official implementation of the methods proposed in our ICLR 2026 paper: Learning Dynamics of Logits Debiasing for Long-Tailed Semi-Supervised Learning
Yue Cheng*1, 2, Jiajun Zhang*1, Xiaohui Gao3, Weiwei Xing✉1, Zhanxing Zhu✉4
*Equal Contribution, ✉Corresponding Author
1Beijing Jiaotong University 2Ant Group
3Northwest Polytechnical University 4University of Southampton
(I) Learning Dynamics Analysis. We analyze long-tailed semi-supervised learning from the perspective of learning dynamics and show that class imbalance induces accumulated logit bias that dominates model predictions.
(II) Baseline Image as Bias Indicator. We introduce a task-irrelevant baseline image and theoretically show that its logits converge to the class prior, providing a direct and interpretable indicator of accumulated class bias.
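A minimal sketch of this idea (not the paper's implementation): train a toy linear classifier on imbalanced data with plain gradient descent, then read off the softmax of an all-zero "baseline" input. Since a zero input carries no class information, its logits are just the accumulated bias term, which leans toward the head class and tracks the class prior. All sizes and the model below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced binary problem: 90 head vs. 10 tail samples (illustrative sizes).
n_head, n_tail, d = 90, 10, 5
X = np.vstack([rng.normal(+0.5, 1.0, (n_head, d)),
               rng.normal(-0.5, 1.0, (n_tail, d))])
Y = np.eye(2)[[0] * n_head + [1] * n_tail]

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Plain full-batch gradient descent on the cross-entropy loss.
W, b = np.zeros((d, 2)), np.zeros(2)
for _ in range(2000):
    G = (softmax(X @ W + b) - Y) / len(X)
    W -= 0.5 * X.T @ G
    b -= 0.5 * G.sum(axis=0)

# "Baseline image": an all-zero, task-irrelevant input. Its logits equal the
# accumulated bias b, so its softmax exposes the class bias directly.
baseline_probs = softmax(b)
prior = np.array([n_head, n_tail]) / (n_head + n_tail)
print(baseline_probs, prior)
```

Here `baseline_probs` puts most mass on the head class, mirroring the 90/10 prior — the bias indicator the paragraph describes.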
(III) Unified View of Debiasing Methods. Within this framework, existing debiasing strategies such as logit adjustment, reweighting, and resampling are unified as mechanisms that reshape gradient dynamics to counteract bias accumulation.
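Logit adjustment, for instance, counters the accumulated bias by subtracting the (scaled) log class prior from the raw logits. A minimal sketch with a hypothetical prior, made-up logits, and a scaling factor `tau` chosen for illustration:

```python
import numpy as np

prior = np.array([0.7, 0.2, 0.1])   # hypothetical long-tailed class prior
logits = np.array([2.0, 1.9, 1.8])  # raw logits, biased toward the head class
tau = 1.0                           # adjustment strength

# Post-hoc logit adjustment: subtract the scaled log prior.
adjusted = logits - tau * np.log(prior)

# The prediction flips from the head class (0) to the tail class (2).
print(int(np.argmax(logits)), int(np.argmax(adjusted)))
```

Subtracting `log(prior)` penalizes classes exactly in proportion to how much the prior inflated their logits, which is one way the framework interprets such methods as reshaping gradient dynamics.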
(IV) Dynamic Dataset Pruning for LTSSL. Based on these insights, we propose DyTrim, a dynamic dataset pruning framework that reallocates the gradient budget via class-aware pruning of labeled data and confidence-based soft pruning of unlabeled data, outperforming existing state-of-the-art baselines.
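The two pruning branches can be caricatured as follows. This is an illustrative sketch with made-up rules and data, not DyTrim's actual criteria: labeled samples are kept with probability inversely proportional to their class frequency (class-aware pruning), while confidently pseudo-labeled unlabeled samples are down-weighted rather than dropped (soft pruning).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled set with long-tailed class counts (illustrative sizes).
labels = np.repeat([0, 1, 2], [60, 30, 10])
counts = np.bincount(labels)

# Class-aware pruning on labeled data: keep each sample with probability
# inversely proportional to its class frequency, with a floor of 0.2
# (an illustrative rule, not the paper's exact one). Head classes are
# pruned hardest; the rarest class is always kept.
keep_prob = np.maximum(counts.min() / counts[labels], 0.2)
kept = rng.random(len(labels)) < keep_prob

# Confidence-based soft pruning on unlabeled data: instead of discarding
# samples, down-weight those the model is already confident about, so
# their gradient budget is reallocated to harder samples.
confidence = rng.uniform(0.5, 1.0, size=200)   # stand-in pseudo-label confidences
weights = np.where(confidence > 0.95, 0.1, 1.0)

print(kept.sum(), weights.mean())
```

The net effect in both branches is the same: gradient updates are steered away from over-represented or already-mastered samples, which is the bias-accumulation mechanism the analysis above targets.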
conda create -n dytrim python=3.10
conda activate dytrim
git clone https://github.com/Jiajun0425/DyTrim.git
cd DyTrim
pip install -r requirements.txt
⚠️ Note: Please use pip install git+https://github.com/ildoonet/pytorch-randaugment to install RandAugment.
If you want to use Small-ImageNet-127, follow the instructions in prepare_small_imagenet_127/README.md.
Below are the scripts for training the model on different datasets.
CIFAR-10:
python train.py --num_max 1500 --num_max_u 3000 --imb_ratio 100 --imb_ratio_u 100 --dataset cifar10 --gpu 0 --manualSeed 0

CIFAR-100:
python train.py --num_max 150 --num_max_u 300 --imb_ratio 50 --imb_ratio_u 50 --dataset cifar100 --gpu 0 --manualSeed 0

STL-10:
python train.py --num_max 450 --num_max_u 1 --imb_ratio 10 --imb_ratio_u 1 --dataset stl10 --gpu 0 --manualSeed 0

Small-ImageNet-127:
python train.py --img-size 32 --imb_ratio 1 --imb_ratio_u 1 --dataset smallimagenet --gpu 0 --manualSeed 0

If you find this code useful, please cite our paper:
@inproceedings{cheng2026dytrim,
title = {Learning Dynamics of Logits Debiasing for Long-Tailed Semi-Supervised Learning},
author = {Cheng, Yue and Zhang, Jiajun and Gao, Xiaohui and Xing, Weiwei and Zhu, Zhanxing},
booktitle = {International Conference on Learning Representations},
year = {2026}
}

Our code is based on CDMAD and InfoBatch.
For questions or issues, please:
- Open an issue on GitHub, or
- Contact:
jiajunzhang@bjtu.edu.cn
