🧶 Improving Representation for Imbalanced Regression
through Geometric Constraints

Improve Representation for Imbalanced Regression through Geometric Constraints (CVPR 2025)
Zijian Dong¹*, Yilei Wu¹*, Chongyao Chen²*, Yingtian Zou¹, Yichi Zhang¹, Juan Helen Zhou¹
¹National University of Singapore, ²Duke University, *Equal contribution

Illustration of our geometric constraint-based approach

💡 Introduction

Our paper addresses representation learning for imbalanced regression by introducing two geometric constraints: enveloping loss, which encourages representations to uniformly occupy a hypersphere's surface, and homogeneity loss, which ensures evenly spaced representations along a continuous trace. Unlike classification-based methods that cluster features into distinct groups, our approach preserves the continuous and ordered nature essential for regression tasks. We integrate these constraints into a Surrogate-driven Representation Learning (SRL) framework. Experiments on several datasets demonstrate significant performance improvements, especially in regions with limited data.
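
To make the two geometric ideas more concrete, here is a minimal sketch in plain Python. It is not the paper's enveloping or homogeneity loss (see dfr.py for the actual method). The function names `coverage_proxy` and `spacing_variance` are hypothetical, as are the toy 2D points. `coverage_proxy` swaps in a pairwise Gaussian potential, a common stand-in for how widely points spread over a hypersphere, and `spacing_variance` measures how unevenly label-sorted representations are spaced along their trace.

```python
import math

def normalize(v):
    """Project a vector onto the unit hypersphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def coverage_proxy(feats, t=2.0):
    """Log of the mean pairwise Gaussian potential (assumed proxy only,
    not the paper's enveloping loss). Lower means more spread out."""
    z = [normalize(f) for f in feats]
    pots = [math.exp(-t * sq_dist(z[i], z[j]))
            for i in range(len(z)) for j in range(i + 1, len(z))]
    return math.log(sum(pots) / len(pots))

def spacing_variance(feats, labels):
    """Variance of the step lengths between label-sorted neighbours
    (assumed proxy only, not the paper's homogeneity loss).
    Zero means the representations are evenly spaced."""
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    z = [normalize(feats[i]) for i in order]
    steps = [math.sqrt(sq_dist(a, b)) for a, b in zip(z, z[1:])]
    mean = sum(steps) / len(steps)
    return sum((s - mean) ** 2 for s in steps) / len(steps)

def on_circle(angles):
    """Toy 2D representations placed on the unit circle."""
    return [[math.cos(a), math.sin(a)] for a in angles]

labels = list(range(9))
even = on_circle([math.pi * k / 8 for k in range(9)])           # evenly spaced trace
uneven = on_circle([math.pi * (k / 8) ** 2 for k in range(9)])  # bunched near the start
clustered = on_circle([0.01 * k for k in range(9)])             # collapsed into a small arc

print(spacing_variance(even, labels) < spacing_variance(uneven, labels))
print(coverage_proxy(even) < coverage_proxy(clustered))
```

In this toy setup, the evenly spaced trace scores lower on both proxies than the bunched or collapsed alternatives, which is the qualitative behaviour the two constraints aim for.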

🔧 Usage

An example dataset is provided as follows.

💻 Pretrained Weights

We provide our model weights trained on DIR benchmark datasets:

📂 File Structure

The repository is organized as follows:

imbalanced-regression/
├── sts-b-dir/             # STS-B dataset for semantic textual similarity regression
│   ├── preprocess.py      # Preprocessing and data preparation for STS-B
│   ├── dfr.py             # Method implementation
│   ├── evaluate.py        # Evaluation scripts for model performance
│   ├── models.py          # Model architectures for the regression tasks
│   ├── tasks.py           # Task-specific configurations and operations
│   ├── trainer.py         # Training and evaluation pipelines
│   ├── train.py           # Script to initiate the training process
│   └── glue_data/         # Directory containing raw and preprocessed STS-B data
├── imdb-wiki-dir/         # IMDB-WIKI dataset for age estimation
│   ├── dataset.py         # Preprocessing and data preparation for IMDB-WIKI
│   ├── data/              # Dataset directory
│   ├── dfr.py             # Method implementation
│   ├── resnet.py          # Network implementation
│   ├── evaluate.py        # Evaluation pipelines
│   └── utils.py           # Utility functions
└── agedb-dir/             # AgeDB dataset for age estimation
    ├── evaluate.py        # Evaluation pipelines
    ├── data/              # Dataset directory
    ├── dfr.py             # Method implementation
    ├── resnet.py          # Network implementation
    ├── dataset.py         # Preprocessing and data preparation for AgeDB
    └── utils.py           # Utility functions

🧑🏻‍💻 Running (STS-B-DIR)

Download the GloVe word embeddings (840B tokens, 300D vectors) using:

python glove/download_glove.py

We use the standard file (./glue_data/STS-B) provided by DIR, which is used to set up the balanced STS-B-DIR dataset. To reproduce the results in the paper, use this file directly. If you want to try different balanced splits, delete the folder ./glue_data/STS-B and run:

python glue_data/create_sts.py
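
As a rough illustration of what a "balanced" split means, the sketch below draws the same number of samples from each target bin, so every score range is equally represented in the held-out set. This reflects an assumption about the general idea, not the logic of glue_data/create_sts.py. The helper `balanced_split`, the bin width, the per-bin count, and the toy scores are all hypothetical.

```python
import random
from collections import defaultdict

def balanced_split(scores, bin_width=1.0, per_bin=2, seed=0):
    """Return (held_out, train) index lists.

    Up to `per_bin` samples are drawn from each target bin for the
    held-out set; everything else stays in the training set.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for idx, s in enumerate(scores):
        bins[int(s // bin_width)].append(idx)

    held_out = []
    for key in sorted(bins):
        members = bins[key][:]
        rng.shuffle(members)
        held_out.extend(members[:per_bin])

    held = set(held_out)
    train = [i for i in range(len(scores)) if i not in held]
    return sorted(held_out), train

# Toy similarity scores, heavily skewed toward the 3-4 range (hypothetical data).
scores = ([0.2, 0.7, 1.1, 1.5, 1.9, 2.4, 2.8]
          + [3.0 + 0.05 * k for k in range(20)]
          + [4.2, 4.6, 4.9])

held_out, train = balanced_split(scores)
print(len(held_out), len(train))
```

The training set stays imbalanced while the held-out set contains two samples per bin, so errors in rare score ranges count as much as errors in common ones.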

The dependencies for this task differ substantially from those of the other tasks, so we recommend creating a separate environment for it. If you use conda, you can create the environment and install the dependencies with the following commands:

conda create -n sts python=3.6
conda activate sts
# PyTorch 0.4 (required) + CUDA 9.2
conda install pytorch=0.4.1 cuda92 -c pytorch
# other dependencies
pip install -r requirements.txt
# The latest "overrides" release installed along with allennlp 0.5.0 raises an error,
# so downgrade "overrides" to version 3.1.0.
pip install overrides==3.1.0

Training:

python train.py --dfr --w1 1e-4 --w2 1e-2 --w3 1e-4 --temp 0.1

🧑🏻‍ Evaluating

# agedb-dir and imdb-wiki-dir
python evaluate.py --evaluate --resume <path_to_evaluation_ckpt>

# sts-b-dir
python evaluate.py --evaluate --eval_model <path_to_evaluation_ckpt>
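
Because the improvements are described as largest in regions with limited data, it can help to break the error down by how often each target value appears in the training set. The sketch below is a stand-alone illustration, not the logic of evaluate.py. The helper `mae_by_region` is hypothetical, and the thresholds (more than 100 training samples for "many", fewer than 20 for "few") are assumed defaults that should be adjusted to the benchmark at hand.

```python
from collections import Counter

def mae_by_region(train_labels, test_labels, preds, many=100, few=20):
    """Mean absolute error per region, where a test sample's region is
    decided by how many training samples share its (integer) label."""
    freq = Counter(train_labels)
    errors = {"many": [], "medium": [], "few": []}
    for y, p in zip(test_labels, preds):
        n = freq[y]
        region = "many" if n > many else "few" if n < few else "medium"
        errors[region].append(abs(y - p))
    # Skip regions that have no test samples.
    return {r: sum(e) / len(e) for r, e in errors.items() if e}

# Toy age labels (hypothetical data): 30 is common, 50 less so, 80 is rare.
train_labels = [30] * 150 + [50] * 40 + [80] * 5
test_labels = [30, 30, 50, 80]
preds = [31.0, 29.0, 53.0, 70.0]

print(mae_by_region(train_labels, test_labels, preds))
# {'many': 1.0, 'medium': 3.0, 'few': 10.0}
```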

Acknowledgment

Our codebase was built on DIR and RankSim. Thanks for their wonderful work!

Citation

If you find this repository useful in your research, please consider giving a star ⭐️ and a citation:

@inproceedings{dong2025improve,
  title={Improve Representation for Imbalanced Regression through Geometric Constraints},
  author={Dong, Zijian and Wu, Yilei and Chen, Chongyao and Zou, Yingtian and Zhang, Yichi and Zhou, Juan Helen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}
