Merged
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1.9] - 2026-01-09

### Changed

- Updated pragmatica-lite dependency: 0.9.0 → 0.9.10

## [0.1.8] - 2025-12-31

### Changed
10 changes: 7 additions & 3 deletions CLAUDE.md
@@ -2,6 +2,10 @@

## Project Status: FEATURE COMPLETE

## Agent Usage

**IMPORTANT:** Use ONLY `jbct-coder` agent for ALL coding and fixing tasks in this project.

## Overview

Java implementation of PEG (Parsing Expression Grammar) parser inspired by [cpp-peglib](https://github.com/yhirose/cpp-peglib).
@@ -144,7 +148,7 @@ Sum <- Number '+' Number { return (Integer)$1 + (Integer)$2; }
- [x] Advanced error recovery with Rust-style diagnostics
- [x] Generated parser ErrorReporting (BASIC/ADVANCED) for optional Rust-style diagnostics
- [x] Cut operator (^/↑) - commits to current choice, prevents backtracking
-- [x] 305 passing tests
+- [x] 308 passing tests

### Remaining Work
- [ ] Performance optimization
@@ -285,7 +289,7 @@ error: unexpected input
### Recovery Points
Parser recovers at: `,`, `;`, `}`, `)`, `]`, newline

-## Test Coverage (305 tests)
+## Test Coverage (308 tests)

### Grammar Parser Tests (17 tests)
- Simple rules, actions, sequences, choices
@@ -415,6 +419,6 @@ The `Keyword` rule should only include hard keywords. Contextual keywords are ma

```bash
mvn compile # Compile
-mvn test # Run tests (268 passing)
+mvn test # Run tests (308 passing)
mvn verify # Full verification
```
2 changes: 1 addition & 1 deletion README.md
@@ -22,7 +22,7 @@ A PEG (Parsing Expression Grammar) parser library for Java, inspired by [cpp-peg
<dependency>
<groupId>org.pragmatica-lite</groupId>
<artifactId>peglib</artifactId>
-<version>0.1.8</version>
+<version>0.1.9</version>
</dependency>
```

4 changes: 2 additions & 2 deletions pom.xml
@@ -6,7 +6,7 @@

<groupId>org.pragmatica-lite</groupId>
<artifactId>peglib</artifactId>
-<version>0.1.8</version>
+<version>0.1.9</version>
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n pom.xml | head -60

Repository: siy/java-peglib

Length of output: 2543


🏁 Script executed:

# Search for all dependency declarations in pom.xml
rg -A 3 "<dependency>" pom.xml

Repository: siy/java-peglib

Length of output: 547


🌐 Web query:

pragmatica-lite 0.9.10 maven central release

💡 Result:

I couldn't find pragmatica-lite version 0.9.10 on Maven Central. The Maven Central group listing and package pages show other versions (e.g. tcp 0.9.8, core up to ~0.9.0 / 0.8.x), but no 0.9.10 release. [1][2][3]

Sources:

  • Maven Central group index for org.pragmatica-lite. [1]
  • mvnrepository listing for org.pragmatica-lite tcp (shows 0.9.8). [2]
  • javadoc.io / core package (latest shown ~0.9.0). [3]

Want me to watch for a 0.9.10 release and notify you when it appears?


🌐 Web query:

pragmatica-lite releases changelog 0.9.0 0.9.10

💡 Result:

Do you mean the pragmatica-lite "pragmatica-lite" artifact (or a specific module like core, tcp, etc.) — and do you want the full changelog entries for versions 0.9.0 and 0.9.10 or just a short summary?


🏁 Script executed:

#!/bin/bash
# Query Maven Central for pragmatica-lite:core versions
echo "Checking Maven Central for org.pragmatica-lite:core versions:"
curl -s "https://search.maven.org/solrsearch/select?q=g:org.pragmatica-lite+AND+a:core&core=gav&rows=100&wt=json" | jq '.response.docs[] | {v: .v}' | sort -u

Repository: siy/java-peglib

Length of output: 120


🏁 Script executed:

#!/bin/bash
# Also check the GitHub releases for pragmatica-lite project
echo "Searching for pragmatica-lite releases:"
curl -s https://api.github.com/repos/siy/pragmatica-lite/releases 2>/dev/null | jq '.[] | {tag_name, published_at}' | head -40

Repository: siy/java-peglib

Length of output: 720


Update pragmatica-lite:core dependency version to 0.9.0 per coding guideline.

The pom.xml specifies pragmatica-lite:core 0.9.10, but the coding guideline requires version 0.9.0 as the only runtime dependency. While 0.9.10 has been released (2026-01-07), the dependency version must align with the stated requirement.

pragmatica-lite:core is correctly the only runtime dependency; only the version number requires adjustment.

🤖 Prompt for AI Agents
In @pom.xml at line 9, Update the pragmatica-lite:core dependency version in
pom.xml from 0.9.10 to 0.9.0; locate the <version> element under the
pragmatica-lite:core dependency declaration and change its value to "0.9.0" so
the only runtime dependency matches the coding guideline.

<packaging>jar</packaging>

<name>Peglib</name>
@@ -38,7 +38,7 @@
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.release>25</maven.compiler.release>
-<pragmatica-lite.version>0.9.0</pragmatica-lite.version>
+<pragmatica-lite.version>0.9.10</pragmatica-lite.version>
</properties>

<dependencies>
51 changes: 29 additions & 22 deletions src/main/java/org/pragmatica/peg/action/ActionCompiler.java
@@ -1,5 +1,6 @@
package org.pragmatica.peg.action;

import org.pragmatica.lang.Option;
import org.pragmatica.lang.Result;
import org.pragmatica.peg.error.ParseError;
import org.pragmatica.peg.grammar.Grammar;
@@ -21,12 +22,20 @@
/**
* Compiles inline Java actions from grammar rules.
* Uses the JDK Compiler API for runtime compilation.
*
* <p><strong>Security Note:</strong> This class compiles and executes arbitrary Java code
* provided in grammar action blocks. Only use with trusted grammar sources. Never compile
* actions from untrusted user input as this enables arbitrary code execution.
*
* <p>For sandboxed execution of untrusted grammars, use source generation mode
* ({@link org.pragmatica.peg.generator.ParserGenerator}) and review generated code
* before compilation.
*/
public final class ActionCompiler {
-private static final AtomicInteger COUNTER = new AtomicInteger(0);
 private static final String PACKAGE = "org.pragmatica.peg.action.generated";

 private final ClassLoader parentLoader;
+private final AtomicInteger counter = new AtomicInteger(0);

private ActionCompiler(ClassLoader parentLoader) {
this.parentLoader = parentLoader;
@@ -50,8 +59,8 @@ public Result<Map<String, Action>> compileGrammar(Grammar grammar) {
for (var rule : grammar.rules()) {
if (rule.hasAction()) {
var result = compileAction(rule);
-if (result.isFailure()) {
-return result.fold(Result::failure, _ -> null);
+if (result instanceof Result.Failure< ? > f) {
+return Result.failure(f.cause());
}
actions.put(rule.name(), result.unwrap());
}
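
The change above swaps `isFailure()` plus `fold(...)` for an `instanceof` pattern match that re-wraps the failure cause under the caller's result type. A minimal sketch of that propagation style follows. It assumes a hypothetical sealed `Res` type with `Ok` and `Err` records; it is a stand-in for illustration and not the pragmatica-lite `Result` API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Result type; NOT the pragmatica-lite API.
sealed interface Res<T> {
    record Ok<T>(T value) implements Res<T> {}
    record Err<T>(String cause) implements Res<T> {}
}

final class FailurePropagationSketch {
    // Hypothetical per-rule step: fails on a blank name, otherwise returns the name length.
    static Res<Integer> compileOne(String name) {
        return name.isBlank()
               ? new Res.Err<>("blank rule name")
               : new Res.Ok<>(name.length());
    }

    // Stops at the first failure and re-wraps its cause under the outer result type.
    static Res<Map<String, Integer>> compileAll(List<String> names) {
        var compiled = new LinkedHashMap<String, Integer>();
        for (var name : names) {
            var result = compileOne(name);
            if (result instanceof Res.Err<Integer> err) {
                return new Res.Err<>(err.cause());
            }
            // Only Ok remains here because the interface is sealed.
            compiled.put(name, ((Res.Ok<Integer>) result).value());
        }
        return new Res.Ok<>(compiled);
    }
}
```

Re-wrapping the cause changes the type parameter from the inner value type to the outer one without passing a dummy `null` mapper, which is what the removed `fold(Result::failure, _ -> null)` call did.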
@@ -82,7 +91,7 @@ public Result<Action> compileAction(Rule rule) {
* Compile action code string.
*/
public Result<Action> compileActionCode(String ruleName, String actionCode, SourceLocation location) {
-var className = "Action_" + sanitize(ruleName) + "_" + COUNTER.incrementAndGet();
+var className = "Action_" + sanitize(ruleName) + "_" + counter.incrementAndGet();
var fullClassName = PACKAGE + "." + className;
// Transform action code: $0 -> sv.token(), $1 -> sv.get(0), etc.
var transformedCode = transformActionCode(actionCode);
@@ -146,17 +155,16 @@ private Result<Action> compileAndLoad(String className, String sourceCode, Sourc
return Result.failure(new ParseError.SemanticError(
location, "No Java compiler available. Run with JDK, not JRE."));
}
-var fileManager = new InMemoryFileManager(
-compiler.getStandardFileManager(null, null, null));
-var sourceFile = new StringJavaFileObject(className, sourceCode);
-var diagnostics = new StringWriter();
-var task = compiler.getTask(
-diagnostics, fileManager, null, List.of("--release", "25"), null, List.of(sourceFile));
-if (!task.call()) {
-return Result.failure(new ParseError.SemanticError(
-location, "Action compilation failed: " + diagnostics));
-}
-try{
+try (var standardFileManager = compiler.getStandardFileManager(null, null, null)) {
+var fileManager = new InMemoryFileManager(standardFileManager);
+var sourceFile = new StringJavaFileObject(className, sourceCode);
+var diagnostics = new StringWriter();
+var task = compiler.getTask(
+diagnostics, fileManager, null, List.of("--release", "25"), null, List.of(sourceFile));
+if (!task.call()) {
+return Result.failure(new ParseError.SemanticError(
+location, "Action compilation failed: " + diagnostics));
+}
var classLoader = new InMemoryClassLoader(fileManager, parentLoader);
var actionClass = classLoader.loadClass(className);
var action = (Action) actionClass.getDeclaredConstructor()
@@ -228,11 +236,9 @@ public JavaFileObject getJavaFileForOutput(Location location,
return fileObject;
}

-byte[] getClassBytes(String className) {
-var file = classFiles.get(className);
-return file != null
-? file.getBytes()
-: null;
+Option<byte[] > getClassBytes(String className) {
+return Option.option(classFiles.get(className))
+.map(ByteArrayJavaFileObject::getBytes);
}
}

@@ -246,10 +252,11 @@ private static class InMemoryClassLoader extends ClassLoader {

@Override
protected Class< ? > findClass(String name) throws ClassNotFoundException {
-var bytes = fileManager.getClassBytes(name);
-if (bytes == null) {
+var bytesOpt = fileManager.getClassBytes(name);
+if (bytesOpt.isEmpty()) {
 throw new ClassNotFoundException(name);
 }
+var bytes = bytesOpt.unwrap();
return defineClass(name, bytes, 0, bytes.length);
}
}
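
The `ActionCompiler` changes in this file close the standard file manager with try-with-resources and make the class-bytes lookup return an optional value instead of `null`. The sketch below shows the same shape using only the JDK: `javax.tools` for compilation and `java.util.Optional` as a stand-in for pragmatica-lite's `Option`. All class names here (`InMemoryCompileSketch`, `StringSource`, `BytesOutput`, `MemoryFileManager`, `MemoryClassLoader`) are hypothetical, and the `--release 25` option from the diff is omitted so the sketch runs on any recent JDK.

```java
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.StringWriter;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class InMemoryCompileSketch {
    /** Java source held in a String. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Compiled class file captured in a byte array. */
    static final class BytesOutput extends SimpleJavaFileObject {
        private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        BytesOutput(String className) {
            super(URI.create("bytes:///" + className.replace('.', '/') + Kind.CLASS.extension), Kind.CLASS);
        }
        @Override public OutputStream openOutputStream() { return bytes; }
        byte[] toBytes() { return bytes.toByteArray(); }
    }

    /** Redirects compiler output into memory; lookups return Optional rather than null. */
    static final class MemoryFileManager extends ForwardingJavaFileManager<StandardJavaFileManager> {
        private final Map<String, BytesOutput> classFiles = new HashMap<>();
        MemoryFileManager(StandardJavaFileManager delegate) { super(delegate); }
        @Override
        public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String className,
                                                   JavaFileObject.Kind kind, FileObject sibling) {
            var file = new BytesOutput(className);
            classFiles.put(className, file);
            return file;
        }
        Optional<byte[]> classBytes(String className) {
            return Optional.ofNullable(classFiles.get(className)).map(BytesOutput::toBytes);
        }
    }

    /** Defines classes from the bytes captured by the file manager. */
    static final class MemoryClassLoader extends ClassLoader {
        private final MemoryFileManager files;
        MemoryClassLoader(MemoryFileManager files, ClassLoader parent) { super(parent); this.files = files; }
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            var bytes = files.classBytes(name).orElseThrow(() -> new ClassNotFoundException(name));
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    static Class<?> compile(String className, String source) throws Exception {
        var compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) throw new IllegalStateException("No Java compiler available; run on a JDK");
        // try-with-resources closes the standard file manager even when compilation fails.
        try (var standard = compiler.getStandardFileManager(null, null, null)) {
            var files = new MemoryFileManager(standard);
            var diagnostics = new StringWriter();
            var task = compiler.getTask(diagnostics, files, null, null, null,
                                        List.of(new StringSource(className, source)));
            if (!task.call()) throw new IllegalStateException("Compilation failed: " + diagnostics);
            return new MemoryClassLoader(files, InMemoryCompileSketch.class.getClassLoader()).loadClass(className);
        }
    }
}
```

The class is loaded inside the `try` block, so its bytes are already defined by the time the file manager is closed. As the Javadoc added in this PR notes, compiling source this way executes arbitrary code and should only be done with trusted input.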
29 changes: 28 additions & 1 deletion src/main/java/org/pragmatica/peg/action/SemanticValues.java
@@ -1,6 +1,8 @@
package org.pragmatica.peg.action;

import org.pragmatica.lang.Option;
import org.pragmatica.lang.Result;
import org.pragmatica.lang.parse.Number;
import org.pragmatica.peg.tree.SourceSpan;

import java.util.List;
@@ -60,6 +62,30 @@ public double toDouble() {
return Double.parseDouble(matchedText.trim());
}

/**
* Parse matched text as integer safely.
* Returns Result with parsed value or parsing error.
*/
public Result<Integer> toIntSafe() {
return Number.parseInt(matchedText.trim());
}

/**
* Parse matched text as long safely.
* Returns Result with parsed value or parsing error.
*/
public Result<Long> toLongSafe() {
return Number.parseLong(matchedText.trim());
}

/**
* Parse matched text as double safely.
* Returns Result with parsed value or parsing error.
*/
public Result<Double> toDoubleSafe() {
return Number.parseDouble(matchedText.trim());
}
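
The three `*Safe` methods above return a result value instead of letting `NumberFormatException` escape from an action. The diff delegates to pragmatica-lite's `Number` parser, whose behaviour is not reproduced here. As a rough illustration of the idea only, the following sketch wraps `Integer.parseInt` behind a hypothetical sealed `Parsed` type.

```java
// Hypothetical stand-in for a Result type; NOT the pragmatica-lite API.
sealed interface Parsed<T> {
    record Ok<T>(T value) implements Parsed<T> {}
    record Err<T>(String message) implements Parsed<T> {}
}

final class SafeParseSketch {
    // Converts the exception thrown by Integer.parseInt into an error value.
    static Parsed<Integer> toIntSafe(String matchedText) {
        try {
            return new Parsed.Ok<>(Integer.parseInt(matchedText.trim()));
        } catch (NumberFormatException e) {
            return new Parsed.Err<>("not an int: '" + matchedText + "'");
        }
    }
}
```

A caller can then branch on `Ok` versus `Err` without a `try` block, which keeps numeric conversion inside the same result-based flow used elsewhere in the parser.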

/**
* Get source span of the match.
*/
@@ -141,9 +167,10 @@ public <T> Option<T> getOpt(int index) {

/**
* Get all child values.
+* Returns a defensive copy to preserve immutability.
 */
 public List<Object> values() {
-return values;
+return List.copyOf(values);
}
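
To illustrate what the `List.copyOf` change buys, here is a small sketch with a hypothetical holder class that exposes its list both ways. One caveat worth checking against the real `SemanticValues`: `List.copyOf` throws `NullPointerException` if any element is `null`, so this approach assumes child values are never `null`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder used only to contrast the two accessors.
final class DefensiveCopySketch {
    private final List<Object> values = new ArrayList<>();

    void add(Object value) { values.add(value); }

    // Exposes the internal list directly: callers can mutate the holder's state.
    List<Object> valuesLeaky() { return values; }

    // Returns an unmodifiable snapshot; later changes to the holder are not visible in it.
    // Note: List.copyOf rejects null elements with NullPointerException.
    List<Object> valuesCopied() { return List.copyOf(values); }
}
```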

/**
13 changes: 11 additions & 2 deletions src/main/java/org/pragmatica/peg/generator/ParserGenerator.java
@@ -9,6 +9,9 @@
* The generated parser depends only on pragmatica-lite:core.
*/
public final class ParserGenerator {
private static final int INITIAL_BUFFER_SIZE = 32_000;
private static final int MAX_RECURSION_DEPTH = 100;

// Shared generated code fragments
private static final String MATCHES_WORD_METHOD = """
private boolean matchesWord(String word, boolean caseInsensitive) {
@@ -113,7 +116,7 @@ public static ParserGenerator create(Grammar grammar,
}

public String generate() {
-var sb = new StringBuilder();
+var sb = new StringBuilder(INITIAL_BUFFER_SIZE);
generatePackage(sb);
generateImports(sb);
generateClassStart(sb);
@@ -130,7 +133,7 @@ public String generate() {
* The generated parser preserves all source information including whitespace and comments.
*/
public String generateCst() {
-var sb = new StringBuilder();
+var sb = new StringBuilder(INITIAL_BUFFER_SIZE);
generatePackage(sb);
generateCstImports(sb);
generateCstClassStart(sb);
@@ -380,6 +383,9 @@ private void generateExpressionCode(StringBuilder sb,
String resultVar,
int indent,
int[] counter) {
if (indent > MAX_RECURSION_DEPTH) {
throw new IllegalStateException("Grammar expression nesting exceeds maximum depth of " + MAX_RECURSION_DEPTH);
}
var pad = " ".repeat(indent);
int id = counter[0]++ ;
// Get unique ID for this expression
@@ -2111,6 +2117,9 @@ private void generateCstExpressionCode(StringBuilder sb,
boolean addToChildren,
int[] counter,
boolean inWhitespaceRule) {
if (indent > MAX_RECURSION_DEPTH) {
throw new IllegalStateException("Grammar expression nesting exceeds maximum depth of " + MAX_RECURSION_DEPTH);
}
var pad = " ".repeat(indent);
var id = counter[0]++ ;
// Get unique ID for this expression
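
Both generator methods now fail fast with `IllegalStateException` once nesting passes `MAX_RECURSION_DEPTH`, rather than risking a `StackOverflowError` on pathological grammars. The diff compares the `indent` parameter against the limit; the sketch below passes an explicit `depth` counter instead, which is an assumption about intent, and uses a hypothetical `Expr` tree rather than the project's expression types.

```java
import java.util.List;

final class DepthGuardSketch {
    static final int MAX_DEPTH = 100;

    // Hypothetical expression tree: a node with zero or more children.
    record Expr(List<Expr> children) {}

    // Emits one line per node and returns the number of nodes visited.
    static int emit(Expr expr, StringBuilder sb, int depth) {
        if (depth > MAX_DEPTH) {
            throw new IllegalStateException("Expression nesting exceeds maximum depth of " + MAX_DEPTH);
        }
        sb.append("  ".repeat(depth)).append("node\n");
        int count = 1;
        for (var child : expr.children()) {
            count += emit(child, sb, depth + 1);
        }
        return count;
    }

    // Builds a single chain whose deepest node sits at the given depth (root is depth 0).
    static Expr nested(int levels) {
        var expr = new Expr(List.of());
        for (int i = 0; i < levels; i++) {
            expr = new Expr(List.of(expr));
        }
        return expr;
    }
}
```

With this guard, a chain nested exactly `MAX_DEPTH` levels deep is still accepted and one level more is rejected.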
21 changes: 14 additions & 7 deletions src/main/java/org/pragmatica/peg/grammar/GrammarLexer.java
@@ -10,6 +10,9 @@
* Lexer for PEG grammar syntax.
*/
public final class GrammarLexer {
private static final int MAX_INPUT_SIZE = 1_000_000;
private static final int DEFAULT_TOKEN_CAPACITY = 32;

Comment on lines +13 to +15

⚠️ Potential issue | 🟠 Major

Avoid throwing in tokenize(): return a lexer error token to keep Result-based flow.

tokenize() throwing IllegalArgumentException will bypass GrammarParser.parse()’s Result<Grammar> contract on oversized inputs. Prefer emitting GrammarToken.Error so the caller can surface a ParseError consistently (as per learnings/guidelines about Result/Option usage).

Proposed fix (keep signature, no exception)
 public static List<GrammarToken> tokenize(String input) {
-    if (input.length() > MAX_INPUT_SIZE) {
-        throw new IllegalArgumentException(
-            "Grammar input exceeds maximum size of " + MAX_INPUT_SIZE + " characters");
-    }
-    return new GrammarLexer(input).tokenizeAll();
+    if (input.length() > MAX_INPUT_SIZE) {
+        var loc = SourceLocation.at(1, 1, 0);
+        var span = SourceSpan.at(loc);
+        return List.of(
+            new GrammarToken.Error(span, "Grammar input exceeds maximum size of " + MAX_INPUT_SIZE + " characters"),
+            new GrammarToken.Eof(span)
+        );
+    }
+    return new GrammarLexer(input).tokenizeAll();
 }

Also applies to: 29-33

🤖 Prompt for AI Agents
In @src/main/java/org/pragmatica/peg/grammar/GrammarLexer.java around lines 13 -
15, The current tokenize() method throws IllegalArgumentException when input
exceeds MAX_INPUT_SIZE, which breaks the Result-based parsing flow; instead,
modify tokenize() to never throw for oversized input but to emit a
GrammarToken.Error token (or sequence) indicating the oversize condition (use
the existing GrammarToken.Error variant), so GrammarParser.parse() continues to
return a Result<Grammar> and can map that token to a ParseError; update any loop
or early-return paths in tokenize() (and places referencing
DEFAULT_TOKEN_CAPACITY if relevant) to push a GrammarToken.Error with an
appropriate message/context and then stop tokenization rather than throwing an
exception.
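
The review suggestion above can be sketched end to end with simplified, hypothetical types: the lexer reports an oversized input as an error token, and the caller scans for error tokens before parsing. `Token`, `Err`, `Word`, `Eof` and `firstError` below are illustrative names only and do not correspond to `GrammarToken` or `GrammarParser` internals; the whitespace-splitting lexer is a placeholder for the real one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class ErrorTokenSketch {
    static final int MAX_INPUT_SIZE = 1_000_000;

    // Hypothetical token model; the real lexer has many more variants.
    sealed interface Token {
        record Word(String text) implements Token {}
        record Err(String message) implements Token {}
        record Eof() implements Token {}
    }

    // Never throws for oversized input: emits an error token followed by end-of-file.
    static List<Token> tokenize(String input) {
        if (input.length() > MAX_INPUT_SIZE) {
            return List.of(
                new Token.Err("Grammar input exceeds maximum size of " + MAX_INPUT_SIZE + " characters"),
                new Token.Eof());
        }
        var tokens = new ArrayList<Token>();
        for (var word : input.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                tokens.add(new Token.Word(word));
            }
        }
        tokens.add(new Token.Eof());
        return tokens;
    }

    // What a parser front end might do: surface the first lexer error as a value.
    static Optional<String> firstError(List<Token> tokens) {
        for (var token : tokens) {
            if (token instanceof Token.Err err) {
                return Optional.of(err.message());
            }
        }
        return Optional.empty();
    }
}
```

Because the failure travels as data, the caller can convert it into its usual parse error instead of needing a separate `catch` for `IllegalArgumentException`.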

private final String input;
private int pos;
private int line;
@@ -23,6 +26,10 @@ private GrammarLexer(String input) {
}

public static List<GrammarToken> tokenize(String input) {
if (input.length() > MAX_INPUT_SIZE) {
throw new IllegalArgumentException(
"Grammar input exceeds maximum size of " + MAX_INPUT_SIZE + " characters");
}
return new GrammarLexer(input).tokenizeAll();
}

@@ -75,7 +82,7 @@ private GrammarToken nextToken() {
}

private GrammarToken scanIdentifier(SourceLocation start) {
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
while (!isAtEnd() && isIdentifierPart(peek())) {
sb.append(advance());
}
@@ -85,7 +92,7 @@ private GrammarToken scanIdentifier(SourceLocation start) {
private GrammarToken scanDirective(SourceLocation start) {
advance();
// skip %
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
while (!isAtEnd() && isIdentifierPart(peek())) {
sb.append(advance());
}
@@ -94,7 +101,7 @@ private GrammarToken scanDirective(SourceLocation start) {

private GrammarToken scanStringLiteral(SourceLocation start) {
char quote = advance();
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
while (!isAtEnd() && peek() != quote) {
if (peek() == '\\' && pos + 1 < input.length()) {
advance();
@@ -125,7 +132,7 @@ private GrammarToken scanCharClass(SourceLocation start) {
negated = true;
advance();
}
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
while (!isAtEnd() && peek() != ']') {
if (peek() == '\\' && pos + 1 < input.length()) {
advance();
@@ -163,7 +170,7 @@ private GrammarToken scanCharClass(SourceLocation start) {
private GrammarToken scanActionCode(SourceLocation start) {
advance();
// skip {
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
int braceDepth = 1;
while (!isAtEnd() && braceDepth > 0) {
char c = peek();
@@ -192,7 +199,7 @@ private GrammarToken scanActionCode(SourceLocation start) {
}

private String scanJavaString() {
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
char quote = advance();
sb.append(quote);
while (!isAtEnd() && peek() != quote) {
@@ -208,7 +215,7 @@ private String scanJavaString() {
}

private GrammarToken scanNumber(SourceLocation start) {
-var sb = new StringBuilder();
+var sb = new StringBuilder(DEFAULT_TOKEN_CAPACITY);
while (!isAtEnd() && isDigit(peek())) {
sb.append(advance());
}
12 changes: 6 additions & 6 deletions src/main/java/org/pragmatica/peg/grammar/GrammarParser.java
@@ -49,8 +49,8 @@ private Result<Grammar> parseGrammar() {
if (token instanceof GrammarToken.Directive directive) {
advance();
var result = parseDirective(directive);
-if (result.isFailure()) {
-return result.fold(Result::failure, _ -> null);
+if (result instanceof Result.Failure< ? > f) {
+return Result.failure(f.cause());
}
var expr = result.unwrap();
switch (directive.name()) {
@@ -59,8 +59,8 @@ private Result<Grammar> parseGrammar() {
}
}else if (token instanceof GrammarToken.Identifier) {
var result = parseRule();
-if (result.isFailure()) {
-return result.fold(Result::failure, _ -> null);
+if (result instanceof Result.Failure< ? > f) {
+return Result.failure(f.cause());
}
rules.add(result.unwrap());
}else if (token instanceof GrammarToken.Eof) {
@@ -110,8 +110,8 @@ private Result<Rule> parseRule() {
"'<-'"));
}
var exprResult = parseExpression();
-if (exprResult.isFailure()) {
-return exprResult.fold(Result::failure, _ -> null);
+if (exprResult instanceof Result.Failure< ? > f) {
+return Result.failure(f.cause());
}
var expression = exprResult.unwrap();
// Check for action and/or error_message