Max J. Tsai
Email: mt8168@gmail.com
GitHub: https://github.com/JANQLIANGTSAI
© 2025 Max J. Tsai. All rights reserved.
This project is released under the MIT License, which allows modification, distribution, and private use with minimal restrictions.
This Python program scans webpages for Diversity, Equity, Inclusion, and Accessibility (DEIA) terms to assist with compliance under Executive Order 14151. It extracts webpage content, identifies DEIA-related terms, and reports the number of occurrences for each URL.
- Scrapes webpage content using BeautifulSoup
  - in progress: an option to exclude sections, e.g. <footer>
- Tokenizes and matches DEIA terms and their synonyms
  - to-do: stemming for matching; spaCy does not work with Python 3.13 (can anyone help?)
- Displays URLs with the count of DEIA-related term occurrences
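The features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it swaps in the standard library's `html.parser` for BeautifulSoup so that it runs without dependencies, and the `DEIA_TERMS` mapping, the `TextExtractor` class, and the `count_deia_terms` function are hypothetical names made up for this example.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical term list: canonical term -> synonyms (all lowercase).
DEIA_TERMS = {
    "diversity": ["diverse"],
    "equity": ["equitable"],
    "inclusion": ["inclusive"],
    "accessibility": ["accessible"],
}


class TextExtractor(HTMLParser):
    """Collect text content, skipping any element listed in `exclude`."""

    def __init__(self, exclude=("script", "style", "footer")):
        super().__init__()
        self.exclude = set(exclude)
        self.skip_depth = 0  # > 0 while inside an excluded element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.exclude:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.exclude and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def count_deia_terms(html, exclude=("script", "style", "footer")):
    """Return a Counter of canonical DEIA terms found in the page text."""
    parser = TextExtractor(exclude)
    parser.feed(html)
    parser.close()

    # Simple tokenization: lowercase runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())

    # Map every term and synonym back to its canonical term.
    lookup = {}
    for term, synonyms in DEIA_TERMS.items():
        for word in [term, *synonyms]:
            lookup[word] = term

    return Counter(lookup[t] for t in tokens if t in lookup)
```

Matching here is exact on whole tokens, so a form that is not listed as a synonym (for example "inclusivity") is not counted; that is the gap stemming would close. Fetching the HTML for each URL and printing the URL next to its counts is left out of the sketch.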
Make sure you have Python 3.x installed, then install the required dependencies:
python -m venv .venv (then activate it)
pip install -r requirements.txt