Nepali Unicoder

A robust Python package for converting Romanized Nepali text and Preeti font text into Unicode Devanagari script. It uses a greedy matching algorithm for Roman transliteration and a two-phase conversion process for Preeti with contextual rules.

Read the Full Documentation for detailed usage guides, Preeti mapping references, and API details.

Features

Accurate Transliteration: Uses a greedy matching algorithm to prioritize longer phonetic matches (e.g., 'kha' is matched before 'k' and 'h').
Preeti Font Support: Full support for Preeti to Unicode conversion with 30+ contextual rules for accurate transformation.
Smart Vowel Handling: Distinguishes between independent vowels (e.g., 'aa' -> 'आ') and vowel signs/matras (e.g., 'ka' -> 'क', 'kaa' -> 'का').
Contextual Rules: Handles complex Devanagari rules like reph positioning, matra reordering, and special character combinations.
Mixed Content Support: Allows keeping English words or specific text in Roman script using {} blocks.
Customizable: Supports custom word-level overrides via word_maps.json.
CLI Support: Can be used directly from the command line.
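
The greedy matching mentioned in the first feature can be sketched in a few lines. This is a minimal illustration with a toy mapping table, not the package's actual implementation; `ROMAN_TO_DEVANAGARI` and `greedy_transliterate` are hypothetical names chosen for this example.

```python
# Toy table for illustration only; a real mapping would be much larger.
ROMAN_TO_DEVANAGARI = {
    "kha": "ख",
    "kh": "ख्",
    "ka": "क",
    "k": "क्",
    "ha": "ह",
    "h": "ह्",
}
MAX_KEY_LEN = max(len(key) for key in ROMAN_TO_DEVANAGARI)


def greedy_transliterate(text: str) -> str:
    """Convert text by always taking the longest key that matches at the cursor."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(MAX_KEY_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in ROMAN_TO_DEVANAGARI:
                out.append(ROMAN_TO_DEVANAGARI[chunk])
                i += length
                break
        else:
            # No key matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(greedy_transliterate("kha"))   # ख
print(greedy_transliterate("kaha"))  # कह
```

Because the three-letter key 'kha' is tried before 'k', the input never falls through to the shorter matches unless the longer one fails.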

Installation

Install the package with pip:

pip install nepali-unicoder

Usage

Command Line Interface (CLI)

You can use the converter directly from the terminal:

# Using the installed command
nepali-unicoder "namaste"
# Output: नमस्ते

# Or run it as a Python module
python -m nepali_unicoder "namaste"
# Output: नमस्ते

# Pipe input
echo "mero naam sanjeev ho" | nepali-unicoder
# Output: मेरो नाम सन्जीव् हो

Python API

from nepali_unicoder.convert import Converter

converter = Converter()

# Basic conversion
text = "namaste nepal"
print(converter.convert(text))
# Output: नमस्ते नेपाल

# Using 'as-is' blocks for English text
mixed_text = "mero naam {Sanjeev} ho"
print(converter.convert(mixed_text))
# Output: मेरो नाम Sanjeev हो

Preeti Mode

Convert Preeti font text to Unicode with full support for contextual rules:

from nepali_unicoder.convert import Converter

# Create converter in Preeti mode
preeti_converter = Converter(mode="preeti")

# Basic conversion
preeti_text = "s{sf"  # Preeti characters
print(preeti_converter.convert(preeti_text))
# Output: र्कर्का

# The converter handles:
# - Reph positioning: { → र् (moves before consonant)
# - Matra reordering: l (ि) moves after consonant
# - Special m transformations
# - Vowel combinations
# - Literal brackets: { and } are treated as normal characters in Preeti mode
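
To make the two-phase idea concrete, here is a rough sketch of how a character-mapping pass followed by a reordering pass could work. It is an assumption-laden toy, not the package's code: the four-entry `CHAR_MAP`, the helper names, and the rule that the short-i sign is typed before its consonant are all chosen for illustration.

```python
# Phase 1 table: a tiny, hypothetical subset of a Preeti-to-Unicode mapping.
CHAR_MAP = {"s": "क", "f": "ा", "l": "ि", "{": "र्"}

I_MATRA = "ि"   # assumed to be typed before its consonant in Preeti
REPH = "र्"      # typed after the syllable it belongs to
CONSONANTS = {"क"}
MATRAS = {"ा", "ि"}


def map_characters(text: str) -> list:
    """Phase 1: replace each Preeti character with its Unicode unit."""
    return [CHAR_MAP.get(ch, ch) for ch in text]


def apply_contextual_rules(units: list) -> str:
    """Phase 2: move short-i and reph into Unicode logical order."""
    out = []
    pending_i = False
    for unit in units:
        if unit == I_MATRA:
            # Hold the matra until the next unit has been emitted.
            pending_i = True
            continue
        if unit == REPH:
            # Skip back over trailing matras to find the base consonant.
            pos = len(out)
            while pos > 0 and out[pos - 1] in MATRAS:
                pos -= 1
            if pos > 0 and out[pos - 1] in CONSONANTS:
                out.insert(pos - 1, REPH)
            else:
                out.append(REPH)
            continue
        out.append(unit)
        if pending_i:
            out.append(I_MATRA)
            pending_i = False
    if pending_i:
        # A short-i with nothing after it is emitted where it was typed.
        out.append(I_MATRA)
    return "".join(out)


def preeti_to_unicode(text: str) -> str:
    return apply_contextual_rules(map_characters(text))


print(preeti_to_unicode("s{"))   # र्क
print(preeti_to_unicode("ls"))   # कि
print(preeti_to_unicode("sf{"))  # र्का
```

Keeping the mapping and the reordering separate means new contextual rules can be added to the second pass without touching the character table.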

Preeti Character Examples

Preeti	Unicode	Description
`s`	`क`	Consonant ka
`s{`	`र्क`	Reph + ka (contextual)
`ls`	`कि`	ka + short i (reordered)
`qm`	`क्र`	Special m transformation
`!@#`	`१२३`	Nepali numbers
`Ù` / `Ú`	`;` / `:`	Literal punctuation
`«` / `»`	`्र`	Ra-foot (for ट, ठ, ड, ढ)
`¿`	`रू`	Combined ruu
`å`	`द्व`	Combined dva
`ˆ`	`फ्`	Half ph
`ª`	`ङ`	Consonant nga
`æ` / `Æ`	`“` / `”`	Curly quotes
`¥`	`र्`	Half ra
`¶`	`ठ्ठ`	Combined thth
`§`	`ट्ट`	Combined tt
`£`	`घ्`	Half gh
`Ë` / `Í`	`ङ्ग` / `ङ्क`	Combined nga-ga / nga-ka
`‰`	`झ्`	Half jh

CLI for Preeti

python -m nepali_unicoder --preeti "s{sf"
# Output: र्कर्का

Transliteration Rules

Consonants: k -> क्, ka -> क, kh -> ख्, kha -> ख
Vowels: a -> अ, aa -> आ, i -> इ, u -> उ
Matras: ki -> कि, ko -> को
Special: . -> ।, .. -> ॥
Numbers: 0-9 -> ०-९ (Decimal points are preserved: 1.5 -> १.५)
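
The number and punctuation rules interact, since a period can be either a decimal point or a sentence end. One possible way to tell them apart is sketched below; the regular expression and the function name `convert_numbers_and_stops` are assumptions for this example and may differ from how the package does it.

```python
import re

# Map ASCII digits to Devanagari digits.
DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")

# A period is kept as a decimal point only when digits sit on both sides;
# otherwise ".." becomes a double danda and "." a single danda.
_TOKEN = re.compile(r"\d+(?:\.\d+)*|\.\.|\.")


def convert_numbers_and_stops(text: str) -> str:
    def replace(match: re.Match) -> str:
        token = match.group(0)
        if token == "..":
            return "॥"
        if token == ".":
            return "।"
        # A number, possibly with decimal points inside it.
        return token.translate(DIGITS)

    return _TOKEN.sub(replace, text)


print(convert_numbers_and_stops("1.5"))          # १.५
print(convert_numbers_and_stops("12.5 barsa."))  # १२.५ barsa।
```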

Advanced Usage

Handling Complex Text

The converter handles mixed content gracefully. You can use {} to keep text as-is (e.g., for English words or code snippets).

text = "mero naam {Sanjeev} ho ra ma 12.5 barsa ko bhaye."
print(converter.convert(text))
# Output: मेरो नाम Sanjeev हो र म १२.५ बर्स को भए।
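
For readers curious how {} blocks can be kept out of the conversion, the sketch below splits the input on brace pairs and converts only the text outside them. It is a simplified assumption of the approach (no nested or unbalanced braces), and `convert_outside_braces` is a hypothetical helper; `str.upper` stands in for a real converter so the example runs without the package installed.

```python
import re
from typing import Callable

# Matches one {...} block with no nested braces.
_BLOCK = re.compile(r"\{([^{}]*)\}")


def convert_outside_braces(text: str, convert: Callable[[str], str]) -> str:
    """Apply `convert` to everything except the contents of {} blocks."""
    parts = []
    cursor = 0
    for match in _BLOCK.finditer(text):
        # Convert the text before the block, then copy the block body as-is.
        parts.append(convert(text[cursor:match.start()]))
        parts.append(match.group(1))
        cursor = match.end()
    parts.append(convert(text[cursor:]))
    return "".join(parts)


print(convert_outside_braces("mero naam {Sanjeev} ho", str.upper))
# MERO NAAM Sanjeev HO
```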

Configuration

The package reads custom word-level overrides from word_maps.json, located in the src/nepali_unicoder directory. Use it for words that don't follow the standard phonetic rules.

Example word_maps.json:

{
    "nepal": "नेपाल",
    "kathamandu": "काठमाडौँ"
}
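
A word-level override can be thought of as a dictionary lookup that runs before the phonetic rules. The sketch below shows that idea under stated assumptions: the lookup is exact and per whitespace-separated word, `convert_with_overrides` is a hypothetical helper, and `str.upper` stands in for the phonetic fallback. The package may apply its overrides differently.

```python
import json
from typing import Callable, Dict

# Inline JSON keeps the example self-contained; a file would be read with
# json.load(open(path, encoding="utf-8")) instead.
WORD_MAP: Dict[str, str] = json.loads('{"nepal": "नेपाल"}')


def convert_with_overrides(
    text: str,
    word_map: Dict[str, str],
    fallback: Callable[[str], str],
) -> str:
    """Use the override for a word if one exists, else the fallback converter."""
    converted = []
    for word in text.split(" "):
        if word in word_map:
            converted.append(word_map[word])
        else:
            converted.append(fallback(word))
    return " ".join(converted)


print(convert_with_overrides("nepal ho", WORD_MAP, str.upper))
# नेपाल HO
```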

Contribution

We welcome contributions! Here's how you can help:

Clone the repository:

git clone https://github.com/realsanjeev/nepali_unicoder.git
cd nepali_unicoder

Set up a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Run tests:

```
python -m unittest discover tests
```

Submit a Pull Request: Create a new branch, make your changes, and submit a PR.

Development

To run tests:

python -m unittest discover tests

License

MIT
