[04.08.2025] The refined and documented version of the code will be made available in a few weeks.
In this repository, we provide the code and the datasets for RadlER, a novel solution for drawing clean samples that follow a target distribution from data containing duplicates.
You can find all details about deduplicated sampling on-demand with RadlER in our research paper, which will be presented together with the related demonstration at VLDB in London (September 1-5, 2025):
@article{radler,
author = {Luca {Zecchini} and Vasilis {Efthymiou} and Felix {Naumann} and Giovanni {Simonini}},
title = {{Deduplicated Sampling On-Demand}},
journal = {{Proceedings of the VLDB Endowment (PVLDB)}},
volume = {18},
number = {8},
pages = {2482--2495},
year = {2025},
doi = {10.14778/3742728.3742742}
}
You can also take a look at our demonstration:
@article{radler_demo,
author = {Luca {Zecchini} and Ziawasch {Abedjan} and Vasilis {Efthymiou} and Giovanni {Simonini}},
title = {{RadlER: Deduplicated Sampling On-Demand}},
journal = {{Proceedings of the VLDB Endowment (PVLDB)}},
volume = {18},
number = {12},
pages = {5319--5322},
year = {2025},
doi = {10.14778/3750601.3750661}
}
