TestNUC: Enhancing Test-Time Computing Approaches and Scaling through Neighboring Unlabeled Data Consistency
This repository contains the implementation of the paper:
TestNUC: Enhancing Test-Time Computing Approaches and Scaling through Neighboring Unlabeled Data Consistency [Paper] [ACL Anthology] [arXiv]
The 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Henry Peng Zou, Zhengyao Gu, Yue Zhou, Yankai Chen, Weizhi Zhang, Liancheng Fang, Yibo Wang, Yangning Li, Kay Liu, Philip S. Yu
TL;DR: Simple Test-Time Scaling with Unlabeled Data
Test-time computing approaches, which leverage additional computational resources during inference, have proven effective at enhancing large language model performance. This work introduces TestNUC, a novel, linearly scaling approach that improves test-time predictions by leveraging the local consistency of neighboring unlabeled data: it classifies an input instance by considering not only the model's prediction on that instance but also the model's predictions on neighboring unlabeled instances. TestNUC scales effectively with increasing amounts of unlabeled data and performs robustly across different embedding models, making it practical for real-world applications.
To get started, create a new environment and install the dependencies:
conda create -n testnuc python=3.10 -y
conda activate testnuc
pip install -r requirements.txt
Simply run the aggregation_num_unlabeled_data.ipynb notebook to reproduce and visualize test-time scaling results with varying amounts of unlabeled data.
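As a toy illustration of why aggregating over more unlabeled neighbors can help (not code from the paper or this repository): if each neighbor's pseudo-label is independently correct with probability p > 0.5 and neighbors share the query's true label, a Condorcet-style argument says majority voting becomes more accurate as the number of neighbors k grows.

```python
# Illustrative sketch (an assumption for intuition, not the paper's analysis):
# probability that a majority vote over k independent pseudo-labels is correct,
# when each individual pseudo-label is correct with probability p.
from math import comb

def majority_accuracy(p, k):
    """Probability that a strict majority of k independent votes, each
    correct with probability p, is correct (use odd k to avoid ties)."""
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(k // 2 + 1, k + 1))

# e.g. majority_accuracy(0.7, 5) ≈ 0.837, rising toward 1 as k grows
for k in (1, 5, 15, 45):
    print(k, round(majority_accuracy(0.7, k), 3))
```

This is only a simplified view; in practice neighbor predictions are correlated and neighbors may not share the query's label, which is why the notebook above measures scaling empirically.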
This section provides a straightforward and efficient way to implement the TestNUC algorithm. As an example, we demonstrate it using gpt-4o-mini with NV-Embed-v2 on the Banking77 dataset. The same approach applies to other datasets, embedders, and LLMs.
You can download pre-extracted embeddings from this Google Drive Folder:
gdown --folder https://drive.google.com/drive/folders/1Y6QhJW9nb3objSHQue0rA3pmT4cvcg2f -O ./data/
You can also extract embeddings yourself with any embedder from the Hugging Face embedding leaderboard by following its instructions, e.g., Qwen Embedder Usage.
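Once downloaded, it is worth sanity-checking the embeddings before running TestNUC. The sketch below assumes the embeddings are stored as a 2-D .npy matrix (the file layout is an assumption on our part; adapt it to whatever format the downloaded folder actually uses).

```python
# Hedged helper: load a (num_instances, dim) embedding matrix saved with
# numpy and L2-normalize it so dot products equal cosine similarity.
# The .npy format assumption is illustrative, not guaranteed by the repo.
import numpy as np

def load_embeddings(path):
    embs = np.load(path)
    assert embs.ndim == 2, "expected a (num_instances, dim) matrix"
    return embs / np.linalg.norm(embs, axis=1, keepdims=True)
```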
Extract initial LLM predictions by running pseudo_labeling.ipynb in the scripts_llm folder -> example_file_link.
Simply run aggregation_num_unlabeled_data.ipynb in the scripts_llm folder -> example_file_link.
If you have any questions about the code or the paper, feel free to email Henry Peng Zou (pzou3@uic.edu). If you encounter any problems when using the code, or want to report a bug, you can open an issue. Please describe the problem in detail so we can help you better and faster!
If you find this repository helpful, please consider citing our paper 💕:
@misc{zou2025testnucenhancingtesttimecomputing,
title={TestNUC: Enhancing Test-Time Computing Approaches and Scaling through Neighboring Unlabeled Data Consistency},
author={Henry Peng Zou and Zhengyao Gu and Yue Zhou and Yankai Chen and Weizhi Zhang and Liancheng Fang and Yibo Wang and Yangning Li and Kay Liu and Philip S. Yu},
year={2025},
eprint={2502.19163},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.19163},
}