Implemented Specs v3.0 #17

Closed
krlosflipdev wants to merge 10 commits into toon-format:main from krlosflipdev:feature/spec-3.0-implementation

@krlosflipdev commented Nov 27, 2025

Description

This PR implements full compliance with TOON Specification v3.0, including critical bug fixes, test improvements, and comprehensive documentation updates. The implementation now passes all 371 tests (100% pass rate, 0 skipped) with complete specification coverage and is production-ready for .NET 8.0 and 9.0.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Test coverage improvement

Related Issues

Closes #9 - IDictionary check unreachable due to IEnumerable ordering

Changes Made

1. SPEC v3.0 Section 10 Compliance - Objects as List Items

Fixed critical indentation bug in encoder:

  • Objects with tabular arrays as first field now correctly indent tabular rows at depth +2 (was depth +1)
  • Sibling fields correctly appear at depth +1 relative to hyphen line
  • File: src/ToonFormat/Internal/Encode/Encoders.cs:451
  • Change: WriteTabularRows(objects, header, writer, depth + 2, options) (was depth + 1)

Example of fixed output:

items[1]:
  - users[2]{id,name}:
      1,Ada          # Now at depth +2 (6 spaces)
      2,Bob          # Correctly indented
    status: active   # Sibling field at depth +1 (4 spaces)
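
The layout rule described above can be sketched in a few lines. This is only an illustration, not the code in Encoders.cs: `ListItemSketch`, `Render`, and the two-space indent unit are assumptions made for the example.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration of the Section 10 layout; not the library's encoder.
public static class ListItemSketch
{
    private const int IndentSize = 2; // assumed indent unit

    private static string Indent(int depth) => new string(' ', depth * IndentSize);

    // Renders one list item whose first field is a tabular array.
    // hyphenDepth is the depth of the "- " line itself.
    public static string Render(int hyphenDepth, string header, IEnumerable<string> rows, string sibling)
    {
        var sb = new StringBuilder();
        sb.Append(Indent(hyphenDepth)).Append("- ").Append(header).Append('\n');
        foreach (var row in rows)
        {
            // Tabular rows sit two levels below the hyphen line.
            sb.Append(Indent(hyphenDepth + 2)).Append(row).Append('\n');
        }
        // Sibling fields sit one level below the hyphen line.
        sb.Append(Indent(hyphenDepth + 1)).Append(sibling);
        return sb.ToString();
    }
}
```

Calling `ListItemSketch.Render(1, "users[2]{id,name}:", new[] { "1,Ada", "2,Bob" }, "status: active")` places the hyphen line at 2 spaces, the rows at 6 spaces, and the sibling field at 4 spaces, matching the example output above.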

2. Path Expansion Feature (SPEC Section 13.4)

Added complete path expansion implementation:

  • New file: src/ToonFormat/Internal/Decode/PathExpansion.cs (187 lines)
  • Expands dotted keys like "a.b.c" into nested objects {"a":{"b":{"c":...}}}
  • Added ExpandPaths option to ToonDecodeOptions ("off" or "safe")
  • Implements conflict detection for strict mode
  • Implements LWW (Last-Write-Wins) for non-strict mode
  • Supports deep merging of expanded paths
  • NEW: Implements quoted key tracking to preserve literal dotted keys
    • Quoted keys like "c.d" remain as literal "c.d" instead of expanding
    • Unquoted keys like a.b expand to nested objects {"a":{"b":...}}
    • Tracks quoted state through KeyParseResult.WasQuoted property

Usage:

var options = new ToonDecodeOptions { ExpandPaths = "safe" };
var result = ToonDecoder.Decode("a.b.c: 1", options);
// Result: {"a":{"b":{"c":1}}}

// Quoted keys remain literal
var mixed = ToonDecoder.Decode("a.b: 1\n\"c.d\": 2", options);
// Result: {"a":{"b":1},"c.d":2}
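
For readers who want to see the idea behind the feature, here is a minimal sketch of dotted-key expansion with quoted-key tracking. It is not the code in PathExpansion.cs: `PathExpansionSketch`, `Expand`, and its parameters are hypothetical, a conflict is assumed to mean "a path segment or leaf collides with a value of a different shape", and the sketch omits key validation and deep merging of object-valued leaves.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of dotted-key expansion; not the library's implementation.
public static class PathExpansionSketch
{
    // entries:    decoded key/value pairs in document order.
    // quotedKeys: keys that were quoted in the source and must stay literal.
    // strict:     throw on conflicts instead of applying last-write-wins (LWW).
    public static Dictionary<string, object> Expand(
        IEnumerable<KeyValuePair<string, object>> entries,
        ISet<string> quotedKeys,
        bool strict)
    {
        var root = new Dictionary<string, object>();
        foreach (var entry in entries)
        {
            var key = entry.Key;
            if (quotedKeys.Contains(key) || key.IndexOf('.') < 0)
            {
                Assign(root, key, entry.Value, strict); // literal key, no expansion
                continue;
            }

            var segments = key.Split('.');
            var current = root;
            for (var i = 0; i < segments.Length - 1; i++)
            {
                if (current.TryGetValue(segments[i], out var existing))
                {
                    if (existing is Dictionary<string, object> child)
                    {
                        current = child; // merge into the existing nested object
                        continue;
                    }
                    if (strict)
                        throw new InvalidOperationException("Path conflict at '" + segments[i] + "'");
                }
                // Missing segment, or non-strict overwrite of a non-object (LWW).
                var created = new Dictionary<string, object>();
                current[segments[i]] = created;
                current = created;
            }
            Assign(current, segments[segments.Length - 1], entry.Value, strict);
        }
        return root;
    }

    private static void Assign(Dictionary<string, object> target, string key, object value, bool strict)
    {
        // Assumed rule: in strict mode a leaf may not replace an existing nested object.
        if (strict && target.TryGetValue(key, out var existing) && existing is Dictionary<string, object>)
            throw new InvalidOperationException("Path conflict at '" + key + "'");
        target[key] = value;
    }
}
```

With the entries `a.b = 1` and `c.d = 2`, and `c.d` listed in `quotedKeys`, the result holds a nested object under `a` and a literal `c.d` key, mirroring the mixed example above.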

3. Scientific Notation Fix (SPEC v3.0 Section 2)

Fixed number encoding to prevent scientific notation:

  • Updated FormatNumber() in Primitives.cs to convert scientific notation to decimal form
  • Numbers like 0.000001 now output as "0.000001" instead of "1E-06"
  • Implements -0 → 0 normalization
  • Preserves up to 16 significant digits
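
One way to get exponent-free output in .NET is a custom numeric format string, sketched below. This is not the `FormatNumber()` in Primitives.cs: `NumberFormatSketch` and `Format` are hypothetical names, and a custom format string may round to fewer significant digits than the 16 mentioned above. Magnitudes below 1e-20 would also collapse to zero with this particular format.

```csharp
using System.Globalization;

// Hypothetical sketch of exponent-free number output; not the library's FormatNumber().
public static class NumberFormatSketch
{
    public static string Format(double value)
    {
        // Non-finite numbers are encoded as null (see the conformance checklist).
        if (double.IsNaN(value) || double.IsInfinity(value)) return "null";

        // 0.0 == -0.0 under IEEE 754, so this also normalizes negative zero.
        if (value == 0) return "0";

        // A custom format without an 'E' specifier never emits an exponent.
        return value.ToString("0.####################", CultureInfo.InvariantCulture);
    }
}
```

For instance, `NumberFormatSketch.Format(0.000001)` yields `0.000001`, whereas the default `double.ToString()` yields `1E-06`, the output this PR describes fixing.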

4. Dictionary Ordering Bug Fix (Issue #9)

Fixed unreachable IDictionary check:

  • Reordered type checks in Normalize.cs to check IDictionary BEFORE IEnumerable
  • Fixed in both NormalizeValue() methods (lines 76 and 168)
  • Dictionaries now correctly serialize as objects instead of arrays
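
The ordering issue is easy to reproduce in isolation, because every `IDictionary` is also an `IEnumerable`. The sketch below is not the code in Normalize.cs; `NormalizeOrderSketch` and `Classify` are hypothetical names used only to show why the more specific check has to come first.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch of the type-check ordering; not the library's NormalizeValue().
public static class NormalizeOrderSketch
{
    public static string Classify(object value)
    {
        switch (value)
        {
            case null:
                return "null";
            case string _:
                return "string";    // string is IEnumerable<char>, so test it early
            case IDictionary _:
                return "object";    // must precede IEnumerable to be reachable
            case IEnumerable _:
                return "array";
            default:
                return "primitive";
        }
    }
}
```

With the checks in this order, a `Dictionary<string, int>` is classified as `object` and a `List<int>` as `array`. Note that this sketch only matches the non-generic `IDictionary`; a type that implements only `IDictionary<TKey, TValue>` would need an additional check.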

5. Spec Generator Improvements

Fixed specgen.sh script issues:

  • Converted CRLF line endings to LF (Unix-style)
  • Added .csproj extension to dotnet build command
  • Enhanced GitTool.cs to check git clone exit codes and capture errors
  • Now properly handles git clone failures with descriptive error messages

6. Test Suite Enhancements

Fixed all 12 initially failing tests:

  • Added ExpandPaths = "safe" to all path expansion tests
  • Updated test expectations to match SPEC v3.0 Section 10 requirements
  • Fixed conflict detection tests
  • Implemented quoted key tracking to pass previously skipped test
  • Final result: 371 passing, 0 failed, 0 skipped (100% pass rate)

New test files:

  • tests/ToonFormat.Tests/Encode/ArraysObjectsManual.cs (137 lines)
  • tests/ToonFormat.Tests/PerformanceBenchmark.cs (131 lines)

Updated test files:

  • Fixed indentation in all Section 10-related tests (ArraysNested.cs, ArraysObjects.cs)
  • Updated expectations to match depth +2 for tabular rows
  • Added comprehensive edge case coverage

7. Documentation Updates

XML Documentation:

  • Added complete XML documentation to all 22 public members in Constants.cs
  • Eliminated all CS1591 warnings
  • Added SPEC references to encoder/decoder methods

README.md Enhancements:

  • Expanded from ~60 lines to ~275 lines
  • Added comprehensive API documentation with 10+ code examples
  • Added Type Conversions table showing .NET type handling
  • Added Quick Start section
  • Added Installation section
  • Added delimiter options examples (comma, tab, pipe)
  • Added key folding examples
  • Added path expansion examples
  • Added round-trip conversion examples
  • Added Project Status section
  • Added Documentation links section
  • Now aligned with Java implementation README structure

SPEC Compliance

  • This PR implements/fixes spec compliance
  • Spec section(s) affected: Section 2 (normalization), Section 10 (list items), Section 13.4 (path expansion)
  • Spec version: v3.0 (2025-11-24)

Conformance Checklist Status (Section 13.1 & Section 13.2)

Encoder Compliance:

  • UTF-8 output with LF line endings (Section 5)
  • Consistent indentation (Section 12)
  • String escaping (Section 7.1)
  • Array length markers (Section 6, Section 9)
  • Key order preservation (Section 2)
  • Number normalization without scientific notation (Section 2)
  • -0 → 0 conversion (Section 2)
  • NaN/±Infinity → null (Section 3)
  • No trailing spaces/newlines (Section 12)
  • Objects as list items indentation (Section 10) ← FIXED
  • Key folding when enabled (Section 13.4)

Decoder Compliance:

  • Array header parsing (Section 6)
  • Delimiter splitting (Section 11)
  • String unescaping (Section 7.1)
  • Type inference (Section 4)
  • Strict mode enforcement (Section 14)
  • Order preservation (Section 2)
  • Path expansion (Section 13.4) ← NEW
  • Conflict detection in strict mode
  • LWW conflict resolution
  • Objects as list items parsing (Section 10) ← VERIFIED

Testing

  • All existing tests pass
  • Added new tests for changes
  • Tested on .NET 8.0
  • Tested on .NET 9.0

Test Results:

Passed! - Failed: 0, Passed: 371, Skipped: 0, Total: 371
Build succeeded with 0 warnings

Test Coverage:

  • 371 specification tests (100% pass rate, 0 skipped)
  • All SPEC v3.0 encoder requirements tested
  • All SPEC v3.0 decoder requirements tested
  • Edge cases covered (empty arrays, nested structures, conflicts)
  • Quoted key tracking tested and working
  • Performance benchmarks added

Checklist

  • My code follows the project's coding standards
  • I have run dotnet format
  • I have added tests that prove my fix/feature works
  • New and existing tests pass locally
  • I have updated documentation (if needed)
  • My changes do not introduce new dependencies

Additional Context

Performance Impact

  • No performance regressions introduced
  • Path expansion is opt-in via ExpandPaths option
  • Number formatting optimized to avoid scientific notation without sacrificing precision
  • Added performance benchmark tests for future regression detection

Quoted Key Tracking Implementation

Path expansion now correctly handles quoted dotted keys:

  • Modified Parser.KeyParseResult to include WasQuoted property
  • Updated Decoders.DecodeObject to track quoted keys during parsing
  • Enhanced PathExpansion.ExpandPaths to skip expansion for quoted keys
  • Modified ToonDecoder.Decode to pass quoted keys set through the pipeline

Behavior:

  • Input: a.b: 1 (unquoted) → Expands to {"a":{"b":1}}
  • Input: "c.d": 2 (quoted) → Remains as {"c.d":2}
  • Mixed input preserves both behaviors correctly

Files Changed

Core Implementation (29 files, +1205/-164 lines):

  • 13 source files modified/added
  • 16 test files modified/added
  • 3 documentation files updated

Key Files Modified:

  • src/ToonFormat/Internal/Decode/PathExpansion.cs (added quotedKeys parameter)
  • src/ToonFormat/Internal/Decode/Decoders.cs (tracking quoted keys)
  • src/ToonFormat/Internal/Decode/Parser.cs (WasQuoted property)
  • src/ToonFormat/ToonDecoder.cs (passing quoted keys to expansion)
  • src/ToonFormat/Internal/Encode/Encoders.cs (Section 10 fix)
  • src/ToonFormat/Internal/Encode/Primitives.cs (scientific notation fix)
  • src/ToonFormat/Internal/Encode/Normalize.cs (Issue #9 fix)
  • tests/ToonFormat.Tests/Decode/PathExpansion.cs (unskipped test)
  • README.md (comprehensive rewrite)

Migration Notes

This PR is backward compatible with one exception:

  • The Section 10 indentation fix changes the encoder output for objects as list items with tabular arrays
  • Old output had rows at depth +1, new output has rows at depth +2
  • This is a correction to match the specification, not a breaking change to the API
  • Decoders handle both formats correctly

@ghost1face self-assigned this Nov 28, 2025
@ghost1face added the enhancement label Nov 28, 2025

@ghost1face left a comment

Hey @krlosflipdev, thanks for your contribution and help to get this up to spec! I've made some initial comments and will have to look into this deeper.

…ansion.cs`

- Replaced dot (.) with `Constants.DOT` in `PathExpansion.cs`
- Added `ToonPathExpansionException`
- Updated spec generator aligned to v3.0.0

@johannschopplich left a comment

Love the README overhaul. Thanks! Only one minor suggestion.

Co-authored-by: Johann Schopplich <johann@schopplich.com>

@johannschopplich left a comment

LGTM (README-wise) – the overall code has to be reviewed by @toon-format/python-maintainers.

Thank you for the contribution!

- Updated `specgen.sh` as same to ps1 file
- Added exception type validation in `FixtureWriter.cs`
- Make Encode windows-friendly
- Make SpecGenerator make a clean of tests folder and re-generate the files
- Updated `specgen.sh` aligned to .ps1 file
This reverts commit f3c8e11.
@krlosflipdev

Hi, I've added more commits with updates. The SpecGenerator now recreates the entire test folder, and I tested on Windows and macOS with 371/371 tests passing.

On Windows, run:

.\specgen.ps1; dotnet build; dotnet format; dotnet test

On macOS, run:

./specgen.sh && dotnet build && dotnet format && dotnet test

@ghost1face

Overall this looks good and I do appreciate the changes. I may take these changes and rebase them while applying some refactoring and some cleanup.

@johannschopplich if you don't mind I'd like to request higher level access to be able to contribute directly to push this over the finish line and get a package prepped for release.

@ghost1face

Looks like you've got some code that won't compile: LengthMarker has been removed. Can you remove it here in this PR?

@ghost1face mentioned this pull request Dec 24, 2025
@ghost1face

Opened #23 which supersedes this PR ensuring compilation and cleaning up namespaces, code generation and formatting. Thanks for your contribution!

@ghost1face closed this Dec 24, 2025