Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Nov 25, 2025

  • Analyze the TOON v2 specification changes
  • Review the current goon implementation
  • Identify required changes:
    • Change [N|] format to [N] for default comma delimiter
    • Use inline format for primitive arrays instead of list format
    • Ensure proper quoting of strings per TOON v2 spec
  • Update encoder.go:
    • Add quoting logic for strings that look like numbers/booleans
    • Use [N]: format for comma delimiter (instead of [N|])
    • Use inline format for primitive arrays (key[N]: v1,v2,v3)
    • Keep list format for arrays of objects
    • Add Version flag to EncodeOptions (V1 or V2)
    • Add V1 encoder functions for legacy format output
    • Fix isPrimitive to handle JsonObject and JsonArray types
    • Fix empty object handling (write just "-" without trailing space)
  • Update parser.go and parser_utils.go:
    • Add ParseArrayHeaderLineTOONv2 for new format
    • Handle inline primitive arrays in decoding
    • Fix sibling field depth detection in object-as-list-item parsing
    • Fix escape sequence handling to process char by char
  • Update test files to match new TOON v2 format
  • Update marshal_test.go to reflect new behavior
  • Add escape sequence tests
  • Add comprehensive tests for both V1 and V2 formats (version_test.go):
    • V2 format encoding tests (inline arrays)
    • V1 format encoding tests (list arrays)
    • Decoding tests for both formats
    • Round-trip tests for both formats
    • Edge case tests (empty, single element, special chars)
    • Struct round-trip tests
  • Run and verify all tests pass
  • Address PR review comments
  • Run code review
  • Run security scan
Original prompt

update to support toon v2 that was just released


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Summary by CodeRabbit

  • New Features

    • Introduced TOON v2 format with updated array syntax and inline value handling
    • Added version control option allowing users to choose between V1 (legacy) and V2 formats
    • Improved string handling with proper escape sequence support and smart quoting for numeric-like values
  • Tests

    • Comprehensive test suite added for v1/v2 format compatibility and round-trip integrity

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 25, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR introduces TOON v2 format support alongside v1 backward compatibility. Changes include: new inline array header syntax ([N]: format), string quoting for numeric-like values, version-aware encoding/decoding paths, updated parser utilities to handle both formats, and test data conversions to v2 format.

Changes

Cohort / File(s) Summary
Encoding Logic Enhancement
pkg/toon/encoder.go
Added primitive detection helpers (isPrimitive, isPrimitiveArray), string quoting utilities (needsQuoting, quoteString), and new array encoding paths (encodeArrayWithKey, encodeArrayV1, encodeArray) with v1/v2 branching. Refactored encodeValue and encodeObject to inline key handling and route array values through version-specific encoders.
Parsing Infrastructure
pkg/toon/parser.go
Added TOON v2 root array detection and parsing via ParseArrayHeaderLineTOONv2, with branching for inline values, tabular arrays, and list arrays. Enhanced key-value decoding (decodeKeyValue) to recognize v2 array headers. Improved list item object decoding (decodeObjectFromListItem) with corrected sibling depth handling. Maintained backward-compatibility paths for legacy formats.
Parser Utilities
pkg/toon/parser_utils.go
Enhanced ParsePrimitiveToken to handle quoted strings with escape sequences (\n, \t, \", \\). Added new function ParseArrayHeaderLineTOONv2 to parse v2 array headers with support for numeric length, delimiters, and optional fields. Preserved existing helpers for v1 compatibility.
Versioning Support
pkg/toon/types.go
Added public type TOONVersion (int), introduced version constants V2 (default/iota) and V1 (legacy), and extended EncodeOptions struct with new Version field.
Test Coverage & Updates
pkg/toon/marshal_test.go, pkg/toon/version_test.go
Updated existing tests to reflect v2 behavior (quoted numeric strings, inline array headers). Added comprehensive test suite covering v1/v2 encoding variants, round-trip compatibility, escape sequences, edge cases, and struct-based marshaling.
Test Data Conversion
tests/test.toon, tests/test_advanced.toon, tests/test_parity.toon
Converted test files from v1 to v2 format: replaced array syntax from [N|] to [N]: (inline headers), converted YAML-style lists to comma-separated indexed arrays, and adjusted nesting/indentation accordingly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Encoding/Decoding Logic: Multiple new conditional paths with v1/v2 branching require careful verification of correctness, especially nested array/object handling and quoting behavior
  • Parser Enhancements: New array header parsing with fallback chains and depth tracking in list items need thorough validation
  • Backward Compatibility: V1 format support alongside v2 requires checking that legacy tests and edge cases are properly handled
  • Test Coverage: Comprehensive version_test.go should be reviewed for adequate edge case coverage and correct assertion of format signatures

Poem

🐰 A rabbit's tale of versions spun,
From v1 dusty, v2 is fun,
Arrays inline with colon's grace,
Quoted strings in their right place,
Backward hops still work just fine—
Format evolution, by design!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.65% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately describes the main change: adding TOON v2 format support while maintaining v1 compatibility, which aligns with the primary objective and scope of the changes across encoder, parser, tests, and sample files.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI and others added 3 commits November 25, 2025 01:22
Co-authored-by: tnfssc <29162020+tnfssc@users.noreply.github.com>
Co-authored-by: tnfssc <29162020+tnfssc@users.noreply.github.com>
Co-authored-by: tnfssc <29162020+tnfssc@users.noreply.github.com>
Copilot AI changed the title [WIP] Update to support Toon V2 release Add TOON v2 format support Nov 25, 2025
Copilot AI requested a review from tnfssc November 25, 2025 01:29
@tnfssc
Copy link
Owner

tnfssc commented Nov 25, 2025

@copilot write and test comphrensively for both v1 and v2 formats of TOON. Default the output to v2 but add a flag that can output v1 as well

Co-authored-by: tnfssc <29162020+tnfssc@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Nov 25, 2025

@copilot write and test comphrensively for both v1 and v2 formats of TOON. Default the output to v2 but add a flag that can output v1 as well

Added comprehensive tests for both V1 and V2 formats and a Version flag in EncodeOptions. V2 is now the default output.

Usage:

// V2 (default) - inline primitive arrays
data, _ := toon.Marshal(obj, toon.EncodeOptions{IndentSize: 2})
// Output: tags[3]: a,b,c

// V1 - list-style arrays  
data, _ := toon.Marshal(obj, toon.EncodeOptions{IndentSize: 2, Version: toon.V1})
// Output: tags: [3|]\n  - a\n  - b\n  - c

New tests in version_test.go cover encoding, decoding, round-trips, and edge cases for both formats.

Commit: 2184b1c

@tnfssc tnfssc marked this pull request as ready for review November 28, 2025 04:48
@tnfssc
Copy link
Owner

tnfssc commented Nov 28, 2025

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 28, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
pkg/toon/encoder.go (1)

337-431: Code duplication acknowledged but acceptable for now.

There's substantial structural similarity between the V1 and V2 encoding functions (encodeArrayWithKey vs encodeArrayWithKeyV1, encodeArray vs encodeArrayV1). The PR description notes this was reviewed and marked acceptable. Consider extracting shared logic into helper functions in a future refactor to improve maintainability.

pkg/toon/marshal_test.go (1)

350-377: Consider adding edge case for trailing backslash.

The escape sequence tests are comprehensive, but consider adding a test case for a trailing backslash at the end of the string (e.g., "hello\\""hello\"), which could expose edge cases in the character-by-character parsing loop.

{"trailing backslash", `"hello\\"`, "hello\\"},
pkg/toon/version_test.go (1)

288-306: Consider extracting repeated array conversion logic into a helper.

The same type-switch pattern for handling []interface{} vs JsonArray is repeated ~15 times throughout the file. A helper function would reduce duplication and improve maintainability.

// toSlice converts JsonArray or []interface{} to []interface{}
func toSlice(t *testing.T, v interface{}) []interface{} {
	t.Helper()
	switch arr := v.(type) {
	case []interface{}:
		return arr
	case JsonArray:
		result := make([]interface{}, len(arr))
		for i, item := range arr {
			result[i] = item
		}
		return result
	default:
		t.Fatalf("Expected array type, got %T", v)
		return nil
	}
}
pkg/toon/parser.go (1)

192-197: Dead code: decodeArrayFromHeader inline handling is now unreachable.

The inline array path is now handled directly in decodeValueFromLines (lines 40-48) and decodeKeyValue (lines 133-141) before decodeArrayFromHeader is called. The TODO and error on lines 194-196 are never reached.

Consider removing the dead inline handling or updating the function signature:

 func decodeArrayFromHeader(header *ArrayHeaderInfo, inlineValues string, cursor *LineCursor, baseDepth int, options DecodeOptions) (JsonArray, error) {
-	if inlineValues != "" {
-		// Inline primitive array
-		// TODO: Implement inline array decoding
-		return nil, fmt.Errorf("inline arrays not yet implemented")
-	}
-
 	if len(header.Fields) > 0 {
 		return decodeTabularArray(header, cursor, baseDepth, options)
 	}
 
 	return decodeListArray(header, cursor, baseDepth, options)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 84a7d55 and 2184b1c.

📒 Files selected for processing (9)
  • pkg/toon/encoder.go (6 hunks)
  • pkg/toon/marshal_test.go (3 hunks)
  • pkg/toon/parser.go (4 hunks)
  • pkg/toon/parser_utils.go (1 hunks)
  • pkg/toon/types.go (2 hunks)
  • pkg/toon/version_test.go (1 hunks)
  • tests/test.toon (1 hunks)
  • tests/test_advanced.toon (1 hunks)
  • tests/test_parity.toon (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/toon/encoder.go (1)
pkg/toon/types.go (3)
  • EncodeOptions (42-45)
  • V1 (32-32)
  • IndentSize (12-12)
pkg/toon/marshal_test.go (2)
pkg/toon/marshal.go (1)
  • Unmarshal (31-38)
pkg/toon/types.go (2)
  • DecodeOptions (36-39)
  • IndentSize (12-12)
pkg/toon/parser.go (2)
pkg/toon/parser_utils.go (6)
  • ParseArrayHeaderLineTOONv2 (79-147)
  • ParseDelimitedValues (264-310)
  • ParsePrimitiveToken (18-74)
  • ParseArrayHeaderLine (152-261)
  • ParseInlineArray (313-334)
  • IsArrayHeaderAfterHyphen (337-340)
pkg/toon/types.go (1)
  • JsonArray (23-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL analysis (go)
  • GitHub Check: Agent
🔇 Additional comments (20)
tests/test_parity.toon (1)

3-7: LGTM!

The tabular array header format update from [3|name|role] to [3|]{name|role}: correctly follows the TOON v2 specification changes described in the PR objectives. The data rows remain properly formatted with pipe-delimited values.

tests/test.toon (1)

1-11: LGTM!

The test file correctly demonstrates TOON v2 format features:

  • contributors[2]: uses the v2 array header syntax
  • features[3]: parser,encoder,cli shows inline primitive array encoding
  • Object and scalar values follow proper v2 formatting
pkg/toon/encoder.go (5)

44-87: LGTM!

The needsQuoting function comprehensively covers edge cases for string quoting: empty strings, whitespace, reserved literals, numeric-like values, special characters, and control characters. The ordering of checks is logical.


89-98: LGTM!

The escape sequence order is correct—backslash is escaped first (line 92) before other characters, which prevents double-escaping issues.


170-193: LGTM!

The value type branching is well-structured:

  • Objects: colon followed by newline and nested content
  • Arrays: delegated to encodeArrayWithKey for version-aware handling
  • Primitives: colon-space followed by value

100-119: LGTM!

The encodePrimitiveValue function correctly handles all expected primitive types with appropriate formatting. The default case provides a reasonable fallback, and callers are expected to filter non-primitives via isPrimitive checks.


433-438: LGTM!

The version branching correctly defaults to V2 format (since V2 = iota means zero-value defaults to V2), with V1 requiring explicit opt-in. This aligns with the PR requirement to "default output to v2."

tests/test_advanced.toon (1)

1-7: LGTM!

The test file effectively demonstrates TOON v2 format features:

  • Line 1: Inline primitive array with unquoted string values
  • Line 2: String values that look like numbers are properly quoted ("1", "3.5")
  • Line 3: Tabular array header with column definitions
  • Line 7: Nested inline array within an object

This provides good coverage for v2 encoding behavior.

pkg/toon/types.go (2)

25-33: LGTM!

The TOONVersion type design is well thought out:

  • V2 = iota ensures zero-value defaults to V2 (the new default format)
  • Clear documentation distinguishes V2 (inline arrays) from V1 (list-style arrays)
  • Dedicated type provides type safety for version handling

41-45: LGTM!

The Version field addition to EncodeOptions is clean and non-breaking. Existing callers that don't specify Version will automatically get V2 format due to the zero-value default.

pkg/toon/marshal_test.go (3)

124-128: LGTM - TOON v2 quoting behavior validated.

The test correctly expects numeric-like string "123" to be quoted in the output, aligning with TOON v2 spec requirements.


230-234: LGTM - Updated to TOON v2 array header format.

The test now expects [5]: instead of the legacy [5|] format, correctly reflecting V2 default behavior.


237-270: Good coverage for both V2 inline and V1 legacy array formats.

The tests validate:

  • V2 inline format ([3]: 10,20,30) in TestUnmarshalArray
  • Legacy list format ([3|] with - item) in TestUnmarshalArrayLegacyList

This ensures backward compatibility while testing the new format.

pkg/toon/parser_utils.go (1)

76-147: LGTM - Well-structured TOON v2 header parser.

The function correctly:

  • Extracts length from [N] bracket content
  • Determines delimiter (comma default, pipe, or tab)
  • Parses optional {field1,field2} segment
  • Enforces trailing colon requirement

The implementation aligns with the V2 specification described in the PR objectives.

pkg/toon/version_test.go (3)

702-719: LGTM - Version constant validation is sound.

The tests correctly verify:

  • V2 == 0 (default zero value for Go's iota)
  • V1 == 1
  • Default EncodeOptions.Version equals V2

This ensures the zero-value behavior defaults to V2 as intended by the PR.


725-739: Good edge case coverage for special characters.

The test validates that strings containing commas, colons, and quotes are properly quoted in V2 output, ensuring delimiter/syntax conflicts are handled.


807-821: I'll help you verify whether there's a corresponding decode test for null arrays. Let me search the codebase for relevant test coverage.
<function_calls>


#!/bin/bash

Check for decode tests related to null arrays

rg -n "null,null" --type=go -C 3


</function_calls>

Let me also search for Unmarshal or decode tests that might handle this case:
<function_calls>


#!/bin/bash

Search for Unmarshal tests in the test file

rg -n "Unmarshal.*null" --type=go -C 2


</function_calls>

Let me get a broader view of the test file structure:
<function_calls>


#!/bin/bash

Find the test file and check its structure

fd "version_test.go" --type f


</function_calls>

Let me search for tests that specifically decode or unmarshal null array formats:
<function_calls>


#!/bin/bash

Search for test functions that handle null decoding/unmarshaling

rg -n "func Test.[Uu]nmarshal.[Nn]ull|func Test.[Dd]ecode.[Nn]ull" --type=go


</function_calls>

Let me search more broadly for Unmarshal tests:
<function_calls>


#!/bin/bash

Get all test functions in version_test.go that involve Unmarshal

rg -n "func Test.Unmarshal|Unmarshal.[.*null" --type=go -A 5


</function_calls>

pkg/toon/parser.go (3)

30-70: LGTM - Well-structured fallback chain for root array parsing.

The implementation correctly attempts parsing in order:

  1. TOON v2 format ([N]: with inline values or list/tabular)
  2. Legacy format ([N|])
  3. Inline array ([item1, item2])

This ensures backward compatibility while prioritizing the new format.


111-152: LGTM - Clean integration of V2 array header detection in key-value parsing.

The logic correctly:

  • Detects key[N]: or key[N]{fields}: patterns
  • Extracts the key before the bracket
  • Delegates to ParseArrayHeaderLineTOONv2 for header parsing
  • Handles inline values vs list/tabular formats appropriately

274-299: LGTM - Correct sibling depth calculation for list-item objects.

The fix correctly sets siblingDepth = baseDepth + 1 to account for the indentation after - . The added check on line 285 properly terminates object parsing when encountering a new list item at the same base depth.

Comment on lines 228 to 235
if obj, ok := item.(map[string]interface{}); ok {
// Special handling for object in list
if len(obj) == 0 {
// Empty object - just the hyphen (remove the space we added)
// Actually, per spec, empty object is just "-" on its own line
// We already wrote "- ", so we need to handle this better
// Let's trim the space we added
} else {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Empty object handling leaves trailing whitespace.

The comment on lines 231-234 acknowledges that empty objects should be formatted as just - without a trailing space, but the code block is empty and doesn't actually trim the space written on line 226. This results in - (with trailing space) instead of - for empty objects.

Consider implementing the trim or adjusting the write logic:

-			sb.WriteString("- ")
-
 			if obj, ok := item.(map[string]interface{}); ok {
-				// Special handling for object in list
 				if len(obj) == 0 {
-					// Empty object - just the hyphen (remove the space we added)
-					// Actually, per spec, empty object is just "-" on its own line
-					// We already wrote "- ", so we need to handle this better
-					// Let's trim the space we added
+					sb.WriteString("-")
 				} else {
+					sb.WriteString("- ")
 					keys := make([]string, 0, len(obj))

Note: This same pattern appears in encodeArrayWithKeyV1 (line 350), encodeArray (line 464), and encodeArrayV1 (line 572).

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/toon/encoder.go around lines 228 to 235, the code comments note empty
objects should render as "-" but the code doesn't remove the trailing space
already written, leaving "- " in output; update the logic to avoid writing the
extra space for empty objects (either detect empty map before writing the " - "
prefix and write "\n-" alone, or write the "- " as now and explicitly remove the
last space byte when obj is empty), and apply the same fix to the equivalent
blocks in encodeArrayWithKeyV1 (around line 350), encodeArray (around line 464),
and encodeArrayV1 (around line 572) so empty objects produce a single "-" with
no trailing space.

Comment on lines +36 to 71
// Handle quoted strings with escape sequences
if strings.HasPrefix(token, "\"") && strings.HasSuffix(token, "\"") && len(token) >= 2 {
inner := token[1 : len(token)-1]
// Unescape: process escape sequences character by character
var result strings.Builder
i := 0
for i < len(inner) {
if inner[i] == '\\' && i+1 < len(inner) {
switch inner[i+1] {
case '\\':
result.WriteByte('\\')
i += 2
case '"':
result.WriteByte('"')
i += 2
case 'n':
result.WriteByte('\n')
i += 2
case 'r':
result.WriteByte('\r')
i += 2
case 't':
result.WriteByte('\t')
i += 2
default:
// Unknown escape, keep as is
result.WriteByte(inner[i])
i++
}
} else {
result.WriteByte(inner[i])
i++
}
}
return result.String()
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Byte-level iteration may corrupt multi-byte UTF-8 characters.

The escape sequence parser iterates over inner using byte indices (inner[i]), but non-ASCII characters in the string occupy multiple bytes. If a backslash precedes a multi-byte character or appears mid-sequence, the logic may misinterpret bytes.

Consider using []rune for character-by-character iteration:

-		inner := token[1 : len(token)-1]
-		// Unescape: process escape sequences character by character
-		var result strings.Builder
-		i := 0
-		for i < len(inner) {
-			if inner[i] == '\\' && i+1 < len(inner) {
-				switch inner[i+1] {
+		inner := []rune(token[1 : len(token)-1])
+		var result strings.Builder
+		i := 0
+		for i < len(inner) {
+			if inner[i] == '\\' && i+1 < len(inner) {
+				switch inner[i+1] {
 				case '\\':
-					result.WriteByte('\\')
+					result.WriteRune('\\')
 					i += 2
 				case '"':
-					result.WriteByte('"')
+					result.WriteRune('"')
 					i += 2
 				// ... similar changes for other cases
 				default:
-					result.WriteByte(inner[i])
+					result.WriteRune(inner[i])
 					i++
 				}
 			} else {
-				result.WriteByte(inner[i])
+				result.WriteRune(inner[i])
 				i++
 			}
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Handle quoted strings with escape sequences
if strings.HasPrefix(token, "\"") && strings.HasSuffix(token, "\"") && len(token) >= 2 {
inner := token[1 : len(token)-1]
// Unescape: process escape sequences character by character
var result strings.Builder
i := 0
for i < len(inner) {
if inner[i] == '\\' && i+1 < len(inner) {
switch inner[i+1] {
case '\\':
result.WriteByte('\\')
i += 2
case '"':
result.WriteByte('"')
i += 2
case 'n':
result.WriteByte('\n')
i += 2
case 'r':
result.WriteByte('\r')
i += 2
case 't':
result.WriteByte('\t')
i += 2
default:
// Unknown escape, keep as is
result.WriteByte(inner[i])
i++
}
} else {
result.WriteByte(inner[i])
i++
}
}
return result.String()
}
// Handle quoted strings with escape sequences
if strings.HasPrefix(token, "\"") && strings.HasSuffix(token, "\"") && len(token) >= 2 {
inner := []rune(token[1 : len(token)-1])
var result strings.Builder
i := 0
for i < len(inner) {
if inner[i] == '\\' && i+1 < len(inner) {
switch inner[i+1] {
case '\\':
result.WriteRune('\\')
i += 2
case '"':
result.WriteRune('"')
i += 2
case 'n':
result.WriteRune('\n')
i += 2
case 'r':
result.WriteRune('\r')
i += 2
case 't':
result.WriteRune('\t')
i += 2
default:
// Unknown escape, keep as is
result.WriteRune(inner[i])
i++
}
} else {
result.WriteRune(inner[i])
i++
}
}
return result.String()
}
🤖 Prompt for AI Agents
In pkg/toon/parser_utils.go around lines 36 to 71, the escape-sequence
unescaping iterates bytes which can corrupt multi-byte UTF-8 runes; convert
inner to a []rune and iterate rune-by-rune so you treat Unicode characters as
single code points, perform escape lookahead on the next rune (not next byte),
write runes to the strings.Builder (use WriteRune) and when encountering unknown
escapes emit the backslash and the following rune unchanged; ensure you advance
the rune index correctly for two-rune escapes and handle end-of-input safely.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements support for TOON v2 format specification, introducing inline primitive array syntax while maintaining backward compatibility with TOON v1. The default behavior now uses V2 format (key[3]: a,b,c), but V1 format (key: [3|] with list items) is still available via the Version option.

Key changes:

  • V2 format uses inline arrays for primitives ([N]: values) and list format only for complex types
  • Added string quoting logic to prevent ambiguity with numbers/booleans
  • Implemented character-by-character escape sequence processing for proper \\n vs \n handling

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pkg/toon/types.go Added TOONVersion type with V1/V2 constants and Version field to EncodeOptions
pkg/toon/encoder.go Implemented V2 inline array encoding, string quoting logic, and separate V1 encoding functions
pkg/toon/parser.go Added V2 format parsing with inline array support and improved object-in-list depth handling
pkg/toon/parser_utils.go Added ParseArrayHeaderLineTOONv2 function and character-by-character escape sequence handling
pkg/toon/marshal_test.go Updated tests for V2 format expectations and added escape sequence tests
pkg/toon/version_test.go Comprehensive test suite covering V1/V2 encoding, decoding, and round-trip scenarios
tests/*.toon Updated test files to use V2 format syntax

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 24 to 32
// isPrimitive checks if a value is a primitive (not object or array)
func isPrimitive(v interface{}) bool {
switch v.(type) {
case map[string]interface{}, []interface{}:
return false
default:
return true
}
}
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isPrimitive function doesn't handle JsonObject and JsonArray types, which are defined in types.go. Since these are type aliases for map[string]JsonValue and []JsonValue, they won't match the case statement and will incorrectly be classified as primitives. Add cases for JsonObject and JsonArray types.

Copilot uses AI. Check for mistakes.
Comment on lines 226 to 234
sb.WriteString("- ")

if obj, ok := item.(map[string]interface{}); ok {
// Special handling for object in list
if len(obj) == 0 {
// Empty object - just the hyphen (remove the space we added)
// Actually, per spec, empty object is just "-" on its own line
// We already wrote "- ", so we need to handle this better
// Let's trim the space we added
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment indicates incomplete handling of empty objects. The code writes \"- \" on line 226 but doesn't remove the trailing space for empty objects as the comment suggests. Either implement the intended behavior or remove the misleading comment.

Suggested change
sb.WriteString("- ")
if obj, ok := item.(map[string]interface{}); ok {
// Special handling for object in list
if len(obj) == 0 {
// Empty object - just the hyphen (remove the space we added)
// Actually, per spec, empty object is just "-" on its own line
// We already wrote "- ", so we need to handle this better
// Let's trim the space we added
if obj, ok := item.(map[string]interface{}); ok {
// Special handling for object in list
if len(obj) == 0 {
// Empty object - just the hyphen, per spec
sb.WriteString("-")

Copilot uses AI. Check for mistakes.
Comment on lines +195 to +196
// encodeArrayWithKey encodes an array as a value for a key (key already written, need to write header and content)
func encodeArrayWithKey(sb *strings.Builder, arr []interface{}, depth int, options EncodeOptions) error {
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is significant code duplication between encodeArrayWithKey (lines 195-335), encodeArray (lines 433-557), and their V1 counterparts. The object-in-list encoding logic (lines 228-297, 463-524, etc.) is nearly identical across multiple functions. Consider extracting this into a shared helper function to improve maintainability and reduce the risk of inconsistencies.

Copilot uses AI. Check for mistakes.
{"simple tab", `"hello\tworld"`, "hello\tworld"},
{"simple quote", `"hello\"world"`, "hello\"world"},
{"simple backslash", `"hello\\world"`, "hello\\world"},
{"backslash then n", `"hello\\nworld"`, "hello\\nworld"},
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Test case name 'backslash then n' is ambiguous. This tests the sequence \\n which should be interpreted as a literal backslash followed by 'n', not a newline. Consider renaming to 'escaped backslash then n' or 'literal backslash-n' to clarify the intent.

Suggested change
{"backslash then n", `"hello\\nworld"`, "hello\\nworld"},
{"literal backslash-n", `"hello\\nworld"`, "hello\\nworld"},

Copilot uses AI. Check for mistakes.
Comment on lines 461 to 466
sb.WriteString("- ")

if obj, ok := item.(map[string]interface{}); ok {
if len(obj) == 0 {
// Empty object - remove the trailing space from "- "
} else {
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to Comment 2, the code writes \"- \" but the comment indicates the trailing space should be removed for empty objects. This same incomplete handling appears in multiple places (lines 230-235, 464-466, 572-574). Implement consistent empty object handling across all occurrences.

Suggested change
sb.WriteString("- ")
if obj, ok := item.(map[string]interface{}); ok {
if len(obj) == 0 {
// Empty object - remove the trailing space from "- "
} else {
if obj, ok := item.(map[string]interface{}); ok {
if len(obj) == 0 {
// Empty object - remove the trailing space from "- "
sb.WriteString("-")
} else {
sb.WriteString("- ")

Copilot uses AI. Check for mistakes.
@tnfssc
Copy link
Owner

tnfssc commented Nov 28, 2025

@copilot Address the comments dropped in the PR

… test name

Co-authored-by: tnfssc <29162020+tnfssc@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Nov 28, 2025

@copilot Address the comments dropped in the PR

Addressed all PR review comments:

  1. isPrimitive function - Added JsonObject and JsonArray types to the type switch to prevent them from being incorrectly classified as primitives.

  2. Empty object handling - Fixed in all 4 encoder functions. Now writes just "-" (no trailing space) for empty objects in list items.

  3. Test case name - Renamed "backslash then n" to "literal backslash-n" for clarity.

Commit: df22f92

@tnfssc tnfssc merged commit 7b2c771 into develop Nov 29, 2025
13 checks passed
@tnfssc tnfssc deleted the copilot/update-to-support-toon-v2 branch November 29, 2025 00:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants