@Mjboothaus
Contributor

Phase 2.3: Indentation Tracking - LEXER COMPLETE!

This completes Phase 2 of the YAML Lite implementation. The lexer now has full indentation tracking.

Implementation Details

Core Logic:

  • Track indentation level at start of each line using indent_stack
  • Emit INDENT token when indentation increases
  • Emit DEDENT token(s) when indentation decreases
  • Handle edge cases: blank lines, comment-only lines, EOF
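
The core logic above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from this PR: only `indent_stack`, `INDENT`, and `DEDENT` come from the description, while the function name `indent_tokens`, the `LINE` token kind, the tuple token shape, and the space-only indentation rule are assumptions made for the example.

```python
from typing import Iterator, List, Tuple

# Hypothetical token shape for this sketch: (kind, text).
Token = Tuple[str, str]


def indent_tokens(source: str) -> Iterator[Token]:
    """Yield INDENT/DEDENT tokens around a placeholder LINE token per content line."""
    indent_stack: List[int] = [0]  # column 0 is always the outermost level

    for line in source.splitlines():
        stripped = line.lstrip(" ")
        # Blank lines and comment-only lines never change the indentation level.
        if not line.strip() or stripped.startswith("#"):
            continue

        width = len(line) - len(stripped)  # number of leading spaces

        if width > indent_stack[-1]:
            # Deeper than the current level: open exactly one new block.
            indent_stack.append(width)
            yield ("INDENT", "")
        else:
            # Shallower: close blocks until we are back at a known level.
            while width < indent_stack[-1]:
                indent_stack.pop()
                yield ("DEDENT", "")
            # The target column must match a level that was opened earlier.
            if width != indent_stack[-1]:
                raise ValueError(f"misaligned dedent to column {width}")

        yield ("LINE", stripped)

    # At EOF, close every block that is still open.
    while len(indent_stack) > 1:
        indent_stack.pop()
        yield ("DEDENT", "")
```

Keeping column 0 permanently on the stack means the EOF loop and the dedent loop never have to special-case an empty stack.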

Features:

  • ✅ Multiple nesting levels
  • ✅ Multiple DEDENT tokens (jumping back multiple levels)
  • ✅ Mixed structures (lists + mappings)
  • ✅ Indentation validation (detect misaligned dedents)
  • ✅ Blank line handling (blank lines don't affect indentation)
  • ✅ Comment line handling (comment-only lines don't affect indentation)
  • ✅ EOF handling (emit remaining DEDENTs)
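
To make the "misaligned dedent" check concrete, here is a small hedged sketch in Python. The names `MisalignedDedent` and `count_dedents` are hypothetical and not taken from this PR; the sketch assumes the indent stack holds strictly increasing column widths, starting at 0, and that the caller only invokes it for a line whose indentation is not deeper than the current level.

```python
from typing import List


class MisalignedDedent(ValueError):
    """Raised when a line dedents to a column that no enclosing block uses."""


def count_dedents(indent_stack: List[int], width: int, line_no: int) -> int:
    """Return how many DEDENT tokens are needed to get back to `width`.

    Assumes `width <= indent_stack[-1]`; the stack itself is not modified.
    """
    if width not in indent_stack:
        # The column falls between two known levels, so it cannot close a block cleanly.
        raise MisalignedDedent(
            f"line {line_no}: indentation {width} matches no enclosing level {indent_stack}"
        )
    # Every level above the matching one has to be closed.
    return len(indent_stack) - 1 - indent_stack.index(width)
```

For example, with levels at columns 0, 2, and 4 open, returning to column 0 closes two blocks, while a line at column 3 is rejected.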

Testing

10 new indentation tests, including:

  • Simple indent/dedent
  • Multiple nesting levels
  • Multiple dedents at once
  • Blank lines ignored
  • Comment lines ignored
  • List with indented items
  • Same indent (no tokens)
  • DEDENT at EOF
  • Mixed indentation with lists

Updated 2 structural tests to account for INDENT/DEDENT tokens

Test Results

  • 74 total tests (10 basic + 15 scalars + 7 structural + 10 indentation + 17 values + 15 fixtures)
  • 100% passing

Phase 2 Complete! ✅

  • ✅ Phase 2.1: Scalar tokenisation
  • ✅ Phase 2.2: Structural tokens
  • ✅ Phase 2.3: Indentation logic

Next Steps

Phase 3 will implement the parser to build YamlValue structures from the token stream.

…kens

Phase 2 lexer complete! Indentation logic is one of the hardest parts of YAML parsing.

Implementation:
- Track indentation level at start of each line
- Emit INDENT token when indentation increases
- Emit DEDENT token(s) when indentation decreases
- Handle blank lines and comment-only lines (don't change indentation)
- Emit remaining DEDENT tokens at EOF
- Validate indentation consistency (no misaligned dedents)

Features:
- Support for multiple nesting levels
- Multiple DEDENT tokens when jumping back multiple levels
- Proper handling of mixed structures (lists + mappings)
- Edge case handling (blank lines, comments, EOF)

Testing:
- 10 new indentation-specific tests
- Updated 2 structural tests to account for INDENT/DEDENT
- Total: 74 tests passing (10 basic + 15 scalars + 7 structural + 10 indentation + 17 values + 15 fixtures)

Phase 2 Lexer Status: COMPLETE ✅
- Phase 2.1: Scalar tokenisation ✅
- Phase 2.2: Structural tokens ✅
- Phase 2.3: Indentation logic ✅

Next: Phase 3 - Parser implementation

Co-Authored-By: Warp <agent@warp.dev>
@Mjboothaus Mjboothaus merged commit 257a967 into main Jan 14, 2026
0 of 2 checks passed
@Mjboothaus Mjboothaus deleted the feature/phase-2.3-indentation-logic branch January 14, 2026 08:37