Should try a distilled variant of ModernBERT for all sub-modeling tasks: https://huggingface.co/blog/modernbert