Note (2.0+): The str/extra module is now internal (str/internal/extra). All functions documented here are available via import str. Use str.function_name() in your code.
The str library provides practical utilities for:
- Converting Unicode text to ASCII equivalents
- Generating URL-friendly slugs
- Common naming convention transformations (camelCase, snake_case, kebab-case)
Pragmatic over Perfect: This module prioritizes practical, deterministic results for common use cases.
- Scope: Optimized for Latin-based scripts, common ligatures, and frequently-used symbols
- Approach: Uses curated replacement tables rather than algorithmic transformation
- Extensibility: For comprehensive transliteration (Cyrillic, Arabic, CJK), integrate an external library
Converts Unicode text to ASCII by applying character replacements and removing combining marks.
Process:
- Apply replacement table (é → e, ß → ss, æ → ae, etc.)
- Decompose common Latin characters
- Remove combining marks (diacritics)
- Reapply replacements for any decomposed forms
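The steps above can be sketched with the Gleam standard library alone. This is a minimal illustration, not the library's implementation: the three-entry table and the names replace_known, strip_combining_marks, and fold_sketch are hypothetical, the decomposition step is left out, and only the Combining Diacritical Marks block (U+0300 to U+036F) is stripped.

```gleam
import gleam/list
import gleam/string

/// Hypothetical three-entry stand-in for the curated replacement table.
fn replace_known(text: String) -> String {
  text
  |> string.replace("\u{00E9}", "e")
  |> string.replace("\u{00DF}", "ss")
  |> string.replace("\u{00E6}", "ae")
}

/// Drops codepoints in U+0300..U+036F (Combining Diacritical Marks).
fn strip_combining_marks(text: String) -> String {
  text
  |> string.to_utf_codepoints
  |> list.filter(fn(codepoint) {
    let code = string.utf_codepoint_to_int(codepoint)
    code < 0x0300 || code > 0x036F
  })
  |> string.from_utf_codepoints
}

/// Replace, strip marks, then replace again (decomposition step omitted).
pub fn fold_sketch(text: String) -> String {
  text
  |> replace_known
  |> strip_combining_marks
  |> replace_known
}
```

With these definitions, fold_sketch("stra\u{00DF}e") gives "strasse", and the decomposed input "cafe\u{0301}" gives "cafe" because the combining acute accent is filtered out.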
Examples:
ascii_fold("café") // "cafe"
ascii_fold("Münchner") // "Munchner"
ascii_fold("naïve") // "naive"
ascii_fold("Crème Brûlée") // "Creme Brulee"
Ligatures:
ascii_fold("æon") // "aeon"
ascii_fold("Æsir") // "AEsir"
ascii_fold("straße") // "strasse"
Applies only the replacement table without decomposition.
Use case: When you want to preserve combining marks for further processing.
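Whether a string carries combining marks depends on how it is encoded, not on how it looks. The sketch below uses only the Gleam standard library to show that the precomposed and decomposed spellings of "café" are different strings; codepoint_count is a hypothetical helper, not part of str.

```gleam
import gleam/list
import gleam/string

/// Counts Unicode codepoints (not graphemes) in a string.
pub fn codepoint_count(text: String) -> Int {
  text
  |> string.to_utf_codepoints
  |> list.length
}

pub fn main() -> Nil {
  // é as a single codepoint, U+00E9
  let precomposed = "caf\u{00E9}"
  // e followed by the combining acute accent, U+0301
  let decomposed = "cafe\u{0301}"

  // The two spellings render identically but are not equal as strings.
  let assert False = precomposed == decomposed
  let assert 4 = codepoint_count(precomposed)
  let assert 5 = codepoint_count(decomposed)
  Nil
}
```

A function that skips decomposition only sees the first spelling as something its replacement table can match; the second keeps its combining mark.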
Example:
ascii_fold_no_decompose("café") // "cafe"
// But decomposed input like "cafe\u{0301}" is preserved as-is
Accepts a custom normalizer function for production Unicode handling.
Example:
import unicode_helpers // Your OTP wrapper module
pub fn fold_production(text: String) -> String {
  ascii_fold_with_normalizer(text, unicode_helpers.nfd)
}
Variant without internal decomposition, relying solely on the provided normalizer.
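The normalizer examples assume a unicode_helpers module that exposes nfd as a fn(String) -> String. That module is not part of str. On the Erlang target it could be as small as the sketch below, which binds OTP's unicode:characters_to_nfd_binary/1 through an external function. The file name and the extra nfc function are assumptions for illustration; typing the OTP return value as String relies on the input being valid UTF-8, which holds for any Gleam String.

```gleam
// unicode_helpers.gleam (hypothetical wrapper module, Erlang target only)

/// Canonical decomposition (NFD) via OTP's unicode module.
@external(erlang, "unicode", "characters_to_nfd_binary")
pub fn nfd(text: String) -> String

/// Canonical composition (NFC), included for symmetry.
@external(erlang, "unicode", "characters_to_nfc_binary")
pub fn nfc(text: String) -> String
```

Under this sketch, nfd("caf\u{00E9}") returns "cafe\u{0301}", which is the form a mark-stripping step can then reduce to "cafe".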
Creates a URL-friendly slug with default settings:
- Separator: -
- No token limit
- ASCII output only
Examples:
slugify("Hello, World!") // "hello-world"
slugify("Café & Bar") // "cafe-bar"
slugify("2025 — New Year!") // "2025-new-year"
Configurable slug generation.
Parameters:
- max_len: Maximum number of tokens (0 = unlimited)
- sep: Token separator (typically - or _)
- preserve_unicode: If True, keeps Unicode characters; if False, applies ASCII folding
Process:
- Trim and normalize whitespace
- Apply ASCII folding (if preserve_unicode is False)
- Convert to lowercase
- Replace non-alphanumeric sequences with separator
- Collapse consecutive separators
- Trim separators from ends
- Limit to max_len tokens (if specified)
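The process above can be approximated in a few lines of standard-library Gleam. This is a rough sketch under stated assumptions, not the library's code: is_alnum, tokens, and slug_sketch are hypothetical names, only ASCII letters and digits count as alphanumeric, and the ASCII-folding step is skipped. Splitting on every non-alphanumeric character and dropping empty tokens covers whitespace normalization, separator collapsing, and end trimming in one pass.

```gleam
import gleam/list
import gleam/string

/// True for lowercase ASCII letters and digits only (sketch assumption).
fn is_alnum(grapheme: String) -> Bool {
  string.contains("abcdefghijklmnopqrstuvwxyz0123456789", grapheme)
}

/// Lowercases the text and splits it into alphanumeric tokens.
fn tokens(text: String) -> List(String) {
  text
  |> string.lowercase
  |> string.to_graphemes
  |> list.map(fn(grapheme) {
    case is_alnum(grapheme) {
      True -> grapheme
      False -> " "
    }
  })
  |> string.concat
  |> string.split(" ")
  |> list.filter(fn(token) { token != "" })
}

/// Joins at most max_len tokens with sep (0 means no limit).
pub fn slug_sketch(text: String, max_len: Int, sep: String) -> String {
  let all = tokens(text)
  let kept = case max_len > 0 {
    True -> list.take(all, max_len)
    False -> all
  }
  string.join(kept, sep)
}
```

Because the limit is applied to the token list before joining, a truncated slug never ends in a dangling separator.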
Examples:
// Token limit
slugify_opts("one two three four", 2, "-", False)
// "one-two"
// Custom separator
slugify_opts("Hello World", 0, "_", False)
// "hello_world"
// Preserve Unicode
slugify_opts("Café ❤️ Gleam", 0, "-", True)
// "café-❤️-gleam"
// ASCII only
slugify_opts("Café ❤️ Gleam", 0, "-", False)
// "cafe-gleam"
Convenience alias for slugify_opts_with_normalizer using defaults.
slugify_opts_with_normalizer(text: String, max_len: Int, sep: String, preserve_unicode: Bool, normalizer: fn(String) -> String) -> String
Full control over slugification with custom normalization.
Example:
import unicode_helpers
pub fn create_slug(title: String, max_words: Int) -> String {
  slugify_opts_with_normalizer(
    title,
    max_words,
    "-",
    False, // ASCII output
    unicode_helpers.nfd,
  )
}
Converts text to kebab-case (lowercase with hyphens).
Example:
to_kebab_case("Hello World") // "hello-world"
to_kebab_case("getUserById") // "getuserbyid"
Converts text to snake_case (lowercase with underscores).
Example:
to_snake_case("Hello World") // "hello_world"
to_snake_case("getUserById") // "getuserbyid"
Converts text to camelCase (lowercase first word, capitalize subsequent words).
Example:
to_camel_case("hello world") // "helloWorld"
to_camel_case("get user by id") // "getUserById"
Converts text to PascalCase (capitalize all words, no separator).
Example:
to_pascal_case("hello world") // "HelloWorld"
to_pascal_case("get user by id") // "GetUserById"
Converts text to Title Case (capitalize all words, separated by spaces).
Example:
to_title_case("hello world") // "Hello World"
to_title_case("get user by id") // "Get User By Id"
The internal replacement table covers:
- Latin accents: À, Á, Â, Ã, Ä, Å, È, É, Ê, Ë, etc.
- Ligatures: æ, Æ, œ, Œ, ß
- Special Latin: Ð, Þ, Ø, ð, þ, ø
- Common symbols: ©, ®, ™, €, £, ¥
The internal Latin decomposer handles:
- Common precomposed characters (é, ñ, ü, etc.)
- Base + combining mark sequences (e + ´, n + ˜, etc.)
Note: Coverage is optimized for Western European languages. For comprehensive Unicode support, use an external transliteration library.
import str
pub fn create_post_slug(title: String) -> String {
  str.slugify(title)
}

import str
import unicode_helpers
pub fn create_url_slug(title: String, max_words: Int) -> String {
  str.slugify_opts_with_normalizer(
    title,
    max_words,
    "-",
    False,
    unicode_helpers.nfd,
  )
}

import str
pub fn sanitize_filename(name: String) -> String {
  name
  |> str.ascii_fold()
  |> str.slugify_opts(0, "_", False)
}

import str
pub fn to_variable_name(text: String) -> String {
  str.to_snake_case(text)
}

pub fn to_function_name(text: String) -> String {
  str.to_camel_case(text)
}

- Script Coverage: Optimized for Latin scripts; limited support for Cyrillic, Greek, Arabic, CJK
- Semantic Preservation: Transliteration is lossy; "café" and "cafe" become indistinguishable
- Bidirectional Text: No special handling for RTL scripts
- Contextual Rules: Simple character-by-character replacement without linguistic context
To add custom replacements:
- Edit src/str/internal/translit.gleam
- Add entries to the replacements() table
- Regenerate documentation: python3 scripts/generate_character_tables.py
- Test with real-world examples
- str — Grapheme-aware core utilities
- OTP Integration Guide — Unicode normalization setup
- Examples — Integration patterns
- Character Tables — Machine-readable replacement data