Sideloading-Research/the-anonymiser-tool

Text Acronymizer Module

A lightweight Python module for converting named entities and capitalized phrases into acronyms using natural language processing and regular expressions.

📌 Purpose

This tool helps standardize and compress text by replacing proper names and formal phrases with acronyms. It's useful for:

  • Text summarization
  • Preprocessing for NLP tasks
  • Creating anonymized or encoded corpora

βš™οΈ Features

  • Named Entity Recognition via spaCy
  • Regex-based detection of capitalized phrases
  • Chainable transformations (NER → Capital Phrases)
  • Simple integration with file processing

🧠 How It Works

  1. Detect named entities like Barack Obama or New York.
  2. Convert them into acronyms: BO, NY.
  3. Detect multi-word capitalized phrases: Artificial Intelligence System → AIS.
  4. Output the transformed text.
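
The four steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the code in acronymizer.py: the names `to_acronym`, `acronymize_entities`, `acronymize_capital_phrases`, and `acronymize` are hypothetical, and the entity list is passed in by hand so the example runs without spaCy. With spaCy installed, such a list could be built from `doc.ents`.

```python
import re
from typing import Iterable

# Two or more consecutive capitalized words, e.g. "Artificial Intelligence System".
CAPITAL_PHRASE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def to_acronym(phrase: str) -> str:
    """Join the first letter of each word: 'New York' becomes 'NY'."""
    return "".join(word[0].upper() for word in phrase.split())


def acronymize_entities(text: str, entities: Iterable[str]) -> str:
    """Replace each multi-word entity string with its acronym.

    `entities` stands in for NER output (assumption), for example
    [ent.text for ent in nlp(text).ents] when using spaCy.
    """
    # Longest first, so "New York City" is handled before "New York".
    for entity in sorted(set(entities), key=len, reverse=True):
        if len(entity.split()) > 1:
            acronym = to_acronym(entity)
            pattern = r"\b" + re.escape(entity) + r"\b"
            text = re.sub(pattern, lambda _match, a=acronym: a, text)
    return text


def acronymize_capital_phrases(text: str) -> str:
    """Replace remaining runs of capitalized words with their acronyms."""
    return CAPITAL_PHRASE.sub(lambda m: to_acronym(m.group(0)), text)


def acronymize(text: str, entities: Iterable[str]) -> str:
    """Chain the two passes: entities first, then capitalized phrases."""
    return acronymize_capital_phrases(acronymize_entities(text, entities))


if __name__ == "__main__":
    sample = "Barack Obama visited New York to see the Artificial Intelligence System."
    print(acronymize(sample, ["Barack Obama", "New York"]))
```

One limitation of this regex: a capitalized word at the start of a sentence is absorbed into the phrase that follows it, so "The Artificial Intelligence System" would become TAIS in this sketch.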

🚀 Usage

python acronymizer.py
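
The command above runs the script as is; this README does not document any arguments. To apply a text transformation to a file from your own code, a minimal sketch could look like the following. `process_file` is a hypothetical helper, not a function confirmed to exist in acronymizer.py, and the transform is passed in as a plain callable.

```python
from pathlib import Path
from typing import Callable


def process_file(src: str, dst: str, transform: Callable[[str], str]) -> None:
    """Read `src`, apply `transform` to its text, and write the result to `dst`."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(transform(text), encoding="utf-8")
```

Any function that maps a string to a string can be passed as `transform`.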
