Mind the gap

This project aims to identify, measure, and mitigate social biases, such as gender, race, or profession-related stereotypes, in lightweight transformer models through hands-on fine-tuning and evaluation on targeted NLP tasks. More specifically, the project follows a four-step methodology:

  1. Choose a lightweight pre-trained transformer model (e.g., DistilBERT, ALBERT, RoBERTa-base) suitable for local fine-tuning and evaluation.
  2. Evaluate the presence and extent of social bias (e.g., gender, racial, or occupational stereotypes) using dedicated benchmark datasets, considering both quantitative metrics and qualitative outputs (a metric sketch is given after this list).
  3. Apply a bias mitigation technique, such as fine-tuning on curated counter-stereotypical data, integrating adapter layers, or employing contrastive learning, while keeping the solution computationally efficient and transparent (a LoRA sketch follows the metric sketch below).
  4. Re-assess the model using the same benchmark(s) to measure improvements, comparing pre- and post-intervention results, discussing trade-offs (e.g., performance vs. fairness), and visualizing the impact of the approach.
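
As one concrete quantitative measurement for steps 2 and 4, StereoSet (Nadeem et al., 2021) defines the ICAT score from a language modeling score (lms) and a stereotype score (ss). The helper below is a minimal sketch that restates that published formula; it assumes lms and ss have already been computed as percentages from the model's sentence preferences, which is outside this snippet.

```python
# Minimal sketch of StereoSet's Idealized CAT (ICAT) score (Nadeem et al., 2021).
# Assumes lms and ss are percentages in [0, 100] computed elsewhere:
#   lms: how often the model prefers a meaningful sentence over an unrelated one
#   ss:  how often the model prefers the stereotypical over the anti-stereotypical sentence
def icat_score(lms: float, ss: float) -> float:
    # An ideal model is fluent (lms = 100) and balanced (ss = 50), giving ICAT = 100.
    return lms * min(ss, 100.0 - ss) / 50.0

print(icat_score(lms=92.0, ss=62.0))  # fluent but biased -> 69.92
print(icat_score(lms=90.0, ss=51.0))  # fluent and nearly balanced -> 88.2
```

Comparing lms, ss, and ICAT before and after the intervention makes the performance-vs-fairness trade-off of step 4 explicit: lms tracks language modeling quality while ss tracks bias.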
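
For step 3, this project performs the adaptation through LoRA (low-rank adapters). The sketch below shows one plausible setup with the Hugging Face transformers and peft libraries, attaching LoRA adapters to roberta-base for the MLM task. It is a minimal sketch rather than the repository's exact training code: the rank, scaling factor, dropout, and target modules are illustrative assumptions.

```python
# Minimal sketch: LoRA adapters on roberta-base for masked language modeling.
# Hyperparameters below are illustrative assumptions, not the project's settings.
from transformers import AutoModelForMaskedLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

lora_config = LoraConfig(
    r=8,                                # rank of the low-rank update (assumed)
    lora_alpha=16,                      # scaling factor (assumed)
    lora_dropout=0.1,
    target_modules=["query", "value"],  # RoBERTa self-attention projections
)
model = get_peft_model(model, lora_config)  # base weights are frozen, adapters are trainable
model.print_trainable_parameters()          # only a small fraction of parameters will be updated
```

Because only the low-rank adapter matrices are trained, the intervention stays computationally lightweight and the frozen base model remains available for the pre-/post-intervention comparison of step 4. A similar setup with a sequence-classification head could cover the NSP-style intersentence task.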

Dataset

The project uses the StereoSet benchmark (see References), which pairs contexts with stereotypical, anti-stereotypical, and unrelated sentences across gender, profession, race, and religion domains; its intrasentence examples drive the MLM task and its intersentence examples the NSP task.
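
The snippet below is a minimal sketch for inspecting the data. It assumes the copy published on the Hugging Face Hub under McGill-NLP/stereoset; the dataset id, configuration names, and field names may differ from the files actually used in this repository.

```python
# Minimal sketch for browsing StereoSet; the dataset id and field names are
# assumptions based on the Hugging Face Hub copy (McGill-NLP/stereoset).
from datasets import load_dataset

# "intersentence" pairs a context with stereotype / anti-stereotype / unrelated
# continuations (NSP-style task); "intrasentence" fills a blank inside a single
# sentence (MLM-style task). Only a validation split is publicly available.
stereoset = load_dataset("McGill-NLP/stereoset", "intersentence", split="validation")

example = stereoset[0]
print(example["bias_type"])   # gender, profession, race, or religion
print(example["context"])     # the shared context sentence
for sentence, label in zip(example["sentences"]["sentence"],
                           example["sentences"]["gold_label"]):
    print(label, sentence)    # gold_label marks stereotype / anti-stereotype / unrelated
```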

References

  • Nadeem, M., Bethke, A., & Reddy, S. (2021). StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021), paper 2021.acl-long.416.
  • Zhang, Y., & Zhou, F. (2024). Bias mitigation in fine-tuning pre-trained models for enhanced fairness and efficiency. arXiv preprint arXiv:2403.00625.
  • Fu, C. L., Chen, Z. C., Lee, Y. R., & Lee, H. Y. (2022). AdapterBias: Parameter-efficient token-dependent representation shift for adapters in NLP tasks. arXiv preprint arXiv:2205.00305.
  • Park, K., Oh, S., Kim, D., & Kim, J. (2024, June). Contrastive Learning as a Polarizer: Mitigating Gender Bias by Fair and Biased Sentences. In Findings of the Association for Computational Linguistics: NAACL 2024 (pp. 4725-4736).

About

Adapting a language model (e.g., RoBERTa-base) to mitigate social biases using the StereoSet dataset. The adaptation is performed through LoRA on both the MLM and NSP tasks, with quantitative and qualitative measurements visualized.
