NDF Escaping and Edge Cases

paulmothapo edited this page Jan 3, 2026 · 1 revision

This document specifies how to handle special characters and edge cases in NDF without turning it into "YAML: the sequel."

Design Philosophy

Principle: Keep escaping simple and predictable. Use quoting only when necessary.

Goal: Avoid complex escape sequences. Prefer explicit quoting over implicit escaping.

When Quoting is Required

A value must be quoted if it contains:

  1. Structural characters: :, ,, #, [, ], {, }
  2. Quotes: ", ' (to avoid ambiguity)
  3. Leading/trailing whitespace
  4. Reserved keywords: yes, no, true, false, null, none, -
  5. Leading special prefixes: $ (unless it's a reference), @ (unless it's a type hint)
  6. Numeric strings: Values that look like numbers but should be strings

Quoting Rules

Double Quotes (Preferred)

Use double quotes (") for values that need quoting.

# Colon in value
time: "10:30:00"
url: "https://example.com:8080"

# Comma in value
message: "Hello, world"

# Hash in value
color: "#ff0000"

# Reserved keyword
status: "yes"  # String, not boolean

Single Quotes (Alternative)

Single quotes (') work the same but are less common. Prefer double quotes for consistency.

message: 'Hello, world'

Escaping Inside Quotes

Inside quoted strings, escape these characters:

  • \n → newline
  • \t → tab
  • \r → carriage return
  • \\ → backslash
  • \" → double quote (when using ")
  • \' → single quote (when using ')
  • \: → colon (optional, colon is safe inside quotes)
  • \, → comma (optional, comma is safe inside quotes)

Note: : and , don't need escaping inside quotes, but escaping is supported for clarity.

# Escaped newline
message: "Line 1\nLine 2"

# Escaped quote
quote: "He said \"Hello\""

# Escaped backslash
path: "C:\\Users\\Name"
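
The escape table above can be sketched as a small decoder. This is an illustration only: `unescapeQuoted` and the `ESCAPES` map are hypothetical names, and throwing on an unknown escape is an assumption (the page only says that failures should produce clear errors).

```typescript
// Hypothetical decoder for the body of a quoted NDF string (quotes already removed).
// Maps the character after a backslash to the character it stands for.
const ESCAPES = new Map<string, string>([
  ["n", "\n"],
  ["t", "\t"],
  ["r", "\r"],
  ["\\", "\\"],
  ['"', '"'],
  ["'", "'"],
  [":", ":"], // optional escape, same as a bare colon
  [",", ","], // optional escape, same as a bare comma
]);

function unescapeQuoted(body: string): string {
  let out = "";
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    if (ch !== "\\") {
      out += ch;
      continue;
    }
    // charAt returns "" past the end, which is not in the map and is rejected too
    const decoded = ESCAPES.get(body.charAt(i + 1));
    if (decoded === undefined) {
      // Assumption: unsupported or dangling escapes are reported, not passed through
      throw new Error(`Unsupported escape sequence at index ${i} in "${body}"`);
    }
    out += decoded;
    i++; // skip the escaped character
  }
  return out;
}
```

Rejecting anything outside the table keeps the decoder in line with the small, fixed set of escapes this page allows.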

Edge Cases

Colons in Values

Problem: Colon is the key-value separator.

Solution: Quote the value.

# ❌ Invalid (colon interpreted as separator)
time: 10:30:00

# ✅ Valid (quoted)
time: "10:30:00"

# ✅ Valid (multiline)
time: |
  10:30:00

Commas in Values

Problem: Comma is used for list separation.

Solution: Quote the value.

# ❌ Invalid (comma splits into list)
greeting: Hello, world

# ✅ Valid (quoted)
greeting: "Hello, world"

# ✅ Valid (multiline)
greeting: |
  Hello, world

Commas in Lists

Problem: How to include a comma in a list item?

Solution: Quote the item.

# List with comma in item
tags: "tag, with comma", "another tag", "third tag"

# Or use multiline array
tags:
  - "tag, with comma"
  - "another tag"
  - "third tag"
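
One way a reader might split an inline list while respecting quoted items is sketched below. `splitListItems` is a hypothetical helper, not part of NDF; it assumes items are separated by commas and that quotes are kept on the returned items for a later step to decode.

```typescript
// Hypothetical: split an inline list on commas that are outside quotes.
function splitListItems(line: string): string[] {
  const items: string[] = [];
  let current = "";
  let quote: string | null = null; // the quote character we are inside, if any

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      current += ch;
      if (ch === "\\" && i + 1 < line.length) {
        current += line.charAt(++i); // keep the escaped character verbatim
      } else if (ch === quote) {
        quote = null; // closing quote
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
      current += ch;
    } else if (ch === ",") {
      items.push(current.trim()); // unquoted comma ends the item
      current = "";
    } else {
      current += ch;
    }
  }
  items.push(current.trim());
  return items;
}

// The first item keeps its comma because it is quoted:
// splitListItems('"tag, with comma", "another tag", "third tag"')
```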

Hashes in Values

Problem: Hash starts a comment.

Solution: Quote the value.

# ❌ Invalid (hash starts comment)
color: #ff0000

# ✅ Valid (quoted)
color: "#ff0000"
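
The difference between the two lines above comes down to whether the hash sits inside quotes. A minimal sketch of comment stripping under that assumption follows; `stripComment` is a hypothetical name and not taken from the NDF implementation.

```typescript
// Hypothetical: remove a trailing comment, ignoring '#' inside quoted strings.
function stripComment(line: string): string {
  let quote: string | null = null;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      if (ch === "\\") {
        i++; // skip the escaped character
      } else if (ch === quote) {
        quote = null; // closing quote
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
    } else if (ch === "#") {
      // Unquoted hash: everything from here on is a comment
      return line.slice(0, i).replace(/\s+$/, "");
    }
  }
  return line;
}

// stripComment('color: "#ff0000"') keeps the whole line,
// while stripComment('color: #ff0000') leaves only 'color:'.
```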

Brackets and Braces

Problem: Brackets/braces indicate arrays/objects.

Solution: Quote the value.

# ❌ Invalid (interpreted as array)
pattern: [a-z]

# ✅ Valid (quoted)
pattern: "[a-z]"

# ❌ Invalid (interpreted as object)
template: {name}

# ✅ Valid (quoted)
template: "{name}"

Quotes in Values

Problem: Quotes delimit strings.

Solution: Escape inside quotes, or use opposite quote type.

# Escaped double quote
message: "He said \"Hello\""

# Or use single quotes
message: 'He said "Hello"'

# Escaped single quote
message: 'It\'s great'

# Or use double quotes
message: "It's great"

Leading/Trailing Whitespace

Problem: Leading and trailing whitespace is trimmed from unquoted values.

Solution: Quote the value.

# ❌ Invalid (whitespace trimmed)
name:  Alice  

# ✅ Valid (quoted preserves whitespace)
name: "  Alice  "

Empty Strings

Problem: Empty value is interpreted as null.

Solution: Quote empty string.

# ❌ Invalid (interpreted as null)
empty: 

# ✅ Valid (explicit empty string)
empty: ""

Reserved Keywords

Problem: Keywords have special meaning.

Solution: Quote to use as string.

# Boolean keywords
status: "yes"     # String, not boolean
enabled: "true"   # String, not boolean

# Null keywords
value: "null"     # String, not null
empty: "none"     # String, not null
placeholder: "-"  # String, not null

Numeric Strings

Problem: Numbers are parsed as numbers, not strings.

Solution: Quote to preserve as string.

# ❌ Invalid (parsed as number)
zipcode: 01234

# ✅ Valid (quoted as string)
zipcode: "01234"

# ❌ Invalid (parsed as number)
id: 00123

# ✅ Valid (quoted as string)
id: "00123"
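
The keyword and numeric rules can be sketched as a single scalar classifier. This is an assumption-laden illustration: `parseScalar` and the `Scalar` type are hypothetical, keyword matching is taken to be case-insensitive (as in the `needsQuoting()` regex later on this page), `no`/`false` are assumed to map to boolean false, and escape decoding of quoted values is left out.

```typescript
type Scalar = string | number | boolean | null;

// Hypothetical: interpret one raw value as it appears after the key-value separator.
function parseScalar(raw: string): Scalar {
  const text = raw.trim();
  const first = text.charAt(0);

  // Quoted values are always strings (escape decoding omitted in this sketch)
  if (
    text.length >= 2 &&
    (first === '"' || first === "'") &&
    text.charAt(text.length - 1) === first
  ) {
    return text.slice(1, -1);
  }

  if (text === "") return null; // empty value is read as null

  const lower = text.toLowerCase();
  if (lower === "yes" || lower === "true") return true;
  if (lower === "no" || lower === "false") return false;
  if (lower === "null" || lower === "none" || lower === "-") return null;

  // Unquoted numeric text becomes a number, so leading zeros are lost
  if (/^-?\d+(\.\d+)?([eE][+-]?\d+)?$/.test(text)) return Number(text);

  return text;
}

// parseScalar("01234")   → 1234
// parseScalar('"01234"') → "01234"
// parseScalar("yes")     → true
// parseScalar('"yes"')   → "yes"
```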

Leading $ and @

Problem: $ indicates references, @ indicates type hints.

Solution: Quote to use literally.

# ❌ Invalid (interpreted as reference)
price: $100

# ✅ Valid (quoted)
price: "$100"

# ❌ Invalid (interpreted as type hint)
tag: @important

# ✅ Valid (quoted)
tag: "@important"

Newlines in Values

Problem: Newlines end key-value pairs.

Solution: Use multiline string (|) or escape (\n).

# Multiline (preferred)
description: |
  Line 1
  Line 2
  Line 3

# Escaped (alternative)
description: "Line 1\nLine 2\nLine 3"
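
Under one plausible reading, the two forms above produce the same string. The sketch below assumes that a multiline (|) block strips the common indentation and joins lines with \n, without a trailing newline; this page does not specify either detail, and `readBlockScalar` is a hypothetical name.

```typescript
// Hypothetical: turn the indented lines that follow "key: |" into one string.
function readBlockScalar(indentedLines: string[]): string {
  // Measure indentation on non-blank lines only
  const indents = indentedLines
    .filter((line) => line.trim() !== "")
    .map((line) => line.search(/\S/));
  const common = indents.length > 0 ? Math.min(...indents) : 0;

  // Drop the shared indentation, keep any extra indentation inside the block
  return indentedLines.map((line) => line.slice(common)).join("\n");
}

// readBlockScalar(["  Line 1", "  Line 2", "  Line 3"])
//   → "Line 1\nLine 2\nLine 3"
```

Because the block content is taken verbatim, colons, commas, and quotes inside it need no escaping.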

Special Characters in Keys

Problem: Keys can contain most characters, but structural characters (such as : and ,) and spaces can be misread as syntax.

Solution: Quote keys with special characters.

# Valid unquoted key
name: Alice

# Valid quoted key (if needed)
"key:with:colons": value
"key,with,commas": value
"key with spaces": value

Note: Keys with colons/commas are discouraged. Prefer using valid identifiers.
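
A quoted key works because the separator is the first colon outside quotes. A minimal sketch of that split is shown here, assuming a single-line key-value pair; `splitKeyValue` is a hypothetical helper, and the returned key and value still carry their quotes.

```typescript
// Hypothetical: split "key: value" at the first colon that is outside quotes.
function splitKeyValue(line: string): [string, string] {
  let quote: string | null = null;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      if (ch === "\\") {
        i++; // skip the escaped character
      } else if (ch === quote) {
        quote = null; // closing quote
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
    } else if (ch === ":") {
      return [line.slice(0, i).trim(), line.slice(i + 1).trim()];
    }
  }
  throw new Error(`No key-value separator found in: ${line}`);
}

// splitKeyValue('"key:with:colons": value') → ['"key:with:colons"', 'value']
```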

Nested Quotes

Problem: Quotes inside quoted strings.

Solution: Escape or use opposite quote type.

# Escaped
message: "He said \"Hello\" and she said \"Hi\""

# Mixed quotes
message: "He said 'Hello' and she said 'Hi'"

Backslashes

Problem: Backslash is escape character.

Solution: Always escape backslashes.

# Escaped backslash
path: "C:\\Users\\Name"

# Backslash at end
text: "Ends with backslash\\"

Automatic Quoting Detection

needsQuoting() determines whether a string value must be quoted when it is written out:

function needsQuoting(str: string): boolean {
  // Empty string
  if (str === '') return true;
  
  // Leading/trailing whitespace
  if (str !== str.trim()) return true;
  
  // Special characters and line breaks
  if (/[:#\[\]{},"'\n\r]/.test(str)) return true;
  
  // Reserved keywords
  if (/^(yes|no|true|false|null|none|-)$/i.test(str)) return true;
  
  // Numeric-looking strings (e.g. "01234"): the input is a string value,
  // so quote it; unquoted, it would be read back as a number
  if (/^-?\d+(\.\d+)?([eE][+-]?\d+)?$/.test(str)) return true;
  
  // Leading $ or @
  if (str.startsWith('$') || str.startsWith('@')) return true;
  
  return false;
}

Serialization Rules

When serializing, use quoteIfNeeded():

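function quoteIfNeeded(str: string): string {
  if (needsQuoting(str)) {
    // escapeString() is not shown here; it is assumed to escape backslashes and
    // control characters but not double quotes, which are escaped below
    const escaped = escapeString(str);
    return `"${escaped.replace(/"/g, '\\"')}"`;
  }
  return str;
}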
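
quoteIfNeeded() relies on an escapeString() helper that is not shown on this page. A minimal sketch of what it might look like follows, assuming it escapes backslashes and control characters and leaves double quotes to the caller. `quoteAlways` is a hypothetical wrapper used only to show the combined result.

```typescript
// Assumed behaviour: escape backslashes first, then control characters.
// Double quotes are deliberately left alone; the caller escapes them.
function escapeString(str: string): string {
  return str
    .replace(/\\/g, "\\\\") // must run first so later escapes are not doubled
    .replace(/\n/g, "\\n")
    .replace(/\t/g, "\\t")
    .replace(/\r/g, "\\r");
}

// Hypothetical wrapper: the quoting branch of quoteIfNeeded() on its own.
function quoteAlways(str: string): string {
  return `"${escapeString(str).replace(/"/g, '\\"')}"`;
}

// quoteAlways('He said "Hello"') produces the NDF text "He said \"Hello\""
// quoteAlways("C:\\Users")       produces the NDF text "C:\\Users"
```

Escaping the backslash before anything else matters: if it ran last, the backslashes introduced for \n or \" would themselves be doubled.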

Common Patterns

URLs

# URL with port (contains colon)
url: "https://example.com:8080"

# URL with query string (quoted because of the colon after the scheme)
api: "https://api.example.com/v1?key=value&format=json"

File Paths

# Windows path (contains backslashes)
path: "C:\\Users\\Name\\Documents"

# Unix path (usually no quoting needed)
path: /home/user/documents

JSON-like Values

# JSON string (contains quotes and braces)
json: "{\"key\": \"value\"}"

# Or use multiline
json: |
  {"key": "value"}

Regular Expressions

# Regex pattern (contains brackets, etc.)
pattern: "[a-z]+"

Code Snippets

# Code with special characters
code: |
  function test() {
    return "hello: world";
  }

Avoiding "YAML: The Sequel"

To keep NDF simple, we:

  1. Limit escape sequences: Only \n, \t, \r, \\, \", \', plus the optional \: and \,
  2. No complex escaping: No Unicode escapes, no octal, no hex
  3. Explicit over implicit: Prefer quoting over complex rules
  4. Predictable behavior: Same input always produces same output
  5. Clear error messages: When escaping fails, show why

Testing Edge Cases

Test these scenarios:

# All special characters
special: ":,[]{}#\"'$@"

# Empty and whitespace
empty: ""
spaces: "  hello  "

# Reserved words as strings
keywords: "yes no true false null none -"

# Numbers as strings
numeric: "01234"

# Mixed quotes
mixed: "He said 'Hello'"

# Escaped sequences
escaped: "Line 1\nLine 2\tTabbed"

Summary

Rule of thumb: If a value contains any character that could be interpreted as syntax, quote it.

Keep it simple: Use double quotes, escape only what's necessary, prefer multiline for complex values.

Avoid complexity: Don't add YAML-like features (anchors, aliases, complex escaping). Keep NDF focused on simplicity.