Learning Context-Free Grammars (CFG) and Visibly Pushdown Languages (VPL)

A Python implementation for learning Context-Free Grammars and Visibly Pushdown Languages using tree automata and the TL* algorithm.

Overview

This project implements algorithms for learning formal languages, specifically:

Tree Automata Learning: Implementation of the TL* algorithm for learning tree automata from examples
Visibly Pushdown Language (VPL) Learning: Extension of tree automata learning to VPLs using VPL* algorithm
Context-Free Grammar Learning: Learning CFGs through tree automata representations

The project provides a complete framework for:

Tree automata construction and manipulation
Observation table-based learning algorithms
Tree-to-sequence and sequence-to-tree encoding/decoding
Well-formedness checking for VPL sequences
Various examples demonstrating the learning algorithms

Project Structure

learning-cfg/
├── base/                          # Core data structures
│   ├── alphabet.py               # Alphabet definitions (RankedAlphabet, VPAlphabet)
│   ├── state.py                  # State definitions
│   ├── tree.py                   # Tree and Context data structures
│   └── vpl.py                    # Abstract VPL base class
├── tree_automata/                # Tree automata implementation
│   ├── tree_automata.py          # Main tree automata class
│   ├── tree_comparator.py        # Tree automata comparison utilities
│   ├── tree_generator.py         # Tree generation utilities
│   └── tree_state.py             # Tree automata state definitions
├── tree_automata_extraction/     # Learning algorithms
│   ├── observation_table.py      # Observation table implementation
│   └── tl_star.py               # TL* learning algorithm
├── vpl_extraction/               # VPL learning algorithms
│   ├── vpl_star.py              # VPL* learning algorithm
│   ├── vpl_star_oracle.py       # VPL oracle implementation
│   ├── vpl_star_comparator.py   # VPL comparison utilities
│   └── vpl_star_random_comparator.py  # Random comparator for VPLs
├── models/                       # Model implementations
│   ├── tree_automata_vpl.py     # Tree automata to VPL conversion
│   └── vpa.py                   # Visibly Pushdown Automata
├── utils/                        # Utility functions
│   ├── encoding.py              # Tree-sequence encoding/decoding
│   └── well_formed.py           # Well-formedness checking
├── examples/                     # Example implementations
│   ├── tree_automata_example.py # Tree automata examples
│   ├── tl_star_example.py       # TL* algorithm examples
│   ├── ot_example.py            # Observation table examples
│   ├── context_example.py       # Context examples
│   └── dyck1.py                 # Dyck1 language implementation
├── run.py                       # Main execution script
└── encoding_tests.ipynb         # Jupyter notebook with encoding tests

Key Features

1. Tree Automata

Tree Structure: Implementation of ranked trees with symbols of varying arity
Tree Automata: Deterministic tree automata with state transitions
Tree Processing: Efficient tree traversal and acceptance checking
Tree Comparison: Equivalence checking between tree automata

2. TL* Learning Algorithm

Observation Table: Maintains examples and their observations
Context Management: Handles tree contexts for learning
Counterexample Processing: Extends observation table with counterexamples
Automata Synthesis: Constructs tree automata from observation tables

3. VPL Learning (VPL*)

VPL Support: Learning Visibly Pushdown Languages
Tree-to-VPL Conversion: Converting learned tree automata to VPLs
Well-formedness Checking: Validating VPL sequences
Encoding/Decoding: Converting between trees and VPL sequences

4. Utility Functions

Alphabet Management: Ranked alphabets and VPA alphabets
Tree Operations: Context application, tree manipulation
Sequence Processing: Well-formedness validation for VPL sequences

Usage

Running Examples

The project includes several examples that can be run using the main script:

# Run all examples
python run.py

# Run specific examples
python run.py 1  # TL* example
python run.py 2  # Observation table example
python run.py 3  # Tree automata example
python run.py 4  # Context example

Basic Tree Automata Usage

from tree_automata.tree_automata import TreeAutomata, TreeAutomataState
from base.tree import Tree, TreeSymbol
from base.alphabet import RankedAlphabet

# Define states
q0 = TreeAutomataState('q0')
q1 = TreeAutomataState('q1')

# Define alphabet
a = TreeSymbol('a', 2)  # Binary symbol
b = TreeSymbol('b', 0)  # Leaf symbol
alphabet = RankedAlphabet({a, b}, "TestAlphabet")

# Define transitions
transitions = {
    TreeAutomataTransitionKey(b, []): q1,
    TreeAutomataTransitionKey(a, [q1, q1]): q0,
}

# Create automaton
automaton = TreeAutomata({q0, q1}, alphabet, {q0}, transitions)

# Test a tree
tree = Tree(a, [Tree(b, []), Tree(b, [])])
is_accepted = automaton.is_accepted(tree)

TL* Learning Example

from tree_automata_extraction.tl_star import TLStar
from examples.tree_automata_example import simple_tree_automaton

# Create TL* learner
tl_star = TLStar(simple_tree_automaton)

# Learn the automaton
learned_automaton = tl_star.learn()
print("Learned automaton:", learned_automaton)

VPL Learning Example

from vpl_extraction.vpl_star import VPLStar
from examples.dyck1 import Dyck1

# Create VPL oracle
dyck1 = Dyck1()

# Create VPL* learner
vpl_star = VPLStar(dyck1)

# Learn the VPL
learned_vpl = vpl_star.learn()

Examples Included

Tree Automata Examples: Basic tree automata construction and testing
TL Examples*: Learning tree automata using the TL* algorithm
Observation Table Examples: Demonstrating observation table operations
Context Examples: Tree context manipulation and application
Dyck1 Language: Implementation of the Dyck1 language (balanced parentheses)

Dependencies

The project uses only Python standard library modules:

typing
collections
abc
sys

No external dependencies are required.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Juan Pedro da Silva

Research Context

This implementation is designed for research in formal language learning, specifically:

Learning Context-Free Grammars through tree automata
Visibly Pushdown Language learning
Active learning algorithms for formal languages
Tree automata synthesis from examples

The project provides a foundation for experimenting with different learning algorithms and formal language representations.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Learning Context-Free Grammars (CFG) and Visibly Pushdown Languages (VPL)

Overview

Project Structure

Key Features

1. Tree Automata

2. TL* Learning Algorithm

3. VPL Learning (VPL*)

4. Utility Functions

Usage

Running Examples

Basic Tree Automata Usage

TL* Learning Example

VPL Learning Example

Examples Included

Dependencies

License

Author

Research Context

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
base		base
data		data
examples		examples
experiments		experiments
models		models
tree_automata		tree_automata
tree_automata_extraction		tree_automata_extraction
utils		utils
vpl_extraction		vpl_extraction
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
TransformerExtractorExperiments.ipynb		TransformerExtractorExperiments.ipynb
VPLStart_test.ipynb		VPLStart_test.ipynb
dyck_k1_causal_dataset.txt		dyck_k1_causal_dataset.txt
encoding_tests.ipynb		encoding_tests.ipynb
run.py		run.py
run_experiments.py		run_experiments.py

License

JuanCommits/learning-cfg

Folders and files

Latest commit

History

Repository files navigation

Learning Context-Free Grammars (CFG) and Visibly Pushdown Languages (VPL)

Overview

Project Structure

Key Features

1. Tree Automata

2. TL* Learning Algorithm

3. VPL Learning (VPL*)

4. Utility Functions

Usage

Running Examples

Basic Tree Automata Usage

TL* Learning Example

VPL Learning Example

Examples Included

Dependencies

License

Author

Research Context

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages