Alignment JSON Format

I find it useful to represent AMR alignments as JSON. The package includes tools for converting AMR alignments to and from JSON, which looks like the following.

{"amr1":
    [{"type":"isi", "tokens":[0], "nodes":["1.1"], "edges":[]},
    {"type":"isi", "tokens":[1], "nodes":["1"], "edges":[["1",":ARG0","1.1"],["1",":ARG1","1.2"]]},
    {"type":"isi", "tokens":[2], "nodes":["1.2"], "edges":[]},
    ...
    ],
"amr2":
    [{"type":"isi", "tokens":[0], "nodes":["1"], "edges":[]},
    {"type":"isi", "tokens":[1], "nodes":[], "edges":[["1",":ARG0","1.1"]]},
    {"type":"isi", "tokens":[2], "nodes":["1.1"], "edges":[]},
    ...
    ]
}

The JSON is a dictionary mapping each AMR id to a list of alignments. Each alignment is a dictionary with the attributes 'type', 'tokens', 'nodes', and 'edges'. In the example above, 'tokens' holds token indices, 'nodes' holds node ids, and each entry in 'edges' is a triple of source node, relation, and target node. The 'type' attribute distinguishes different categories of alignments or other information. Because the node ids stored in an alignment must match the node ids in the AMR to be interpreted, we default to LDC/ISI-style node ids.
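As a rough illustration of this structure, the sketch below rebuilds the 'amr1' entry from the example as a plain Python dictionary and looks up the nodes aligned to a token. It uses only the standard `json` module. The helper `nodes_for_token` and its `align_type` parameter are hypothetical names introduced here, not part of the package.

```python
import json

# Data mirroring the 'amr1' example (the elided "..." entries are left out).
alignments = {
    "amr1": [
        {"type": "isi", "tokens": [0], "nodes": ["1.1"], "edges": []},
        {"type": "isi", "tokens": [1], "nodes": ["1"],
         "edges": [["1", ":ARG0", "1.1"], ["1", ":ARG1", "1.2"]]},
        {"type": "isi", "tokens": [2], "nodes": ["1.2"], "edges": []},
    ],
}


def nodes_for_token(alignments, amr_id, token, align_type=None):
    """Collect the node ids aligned to `token` in the AMR `amr_id`.

    Hypothetical helper: if `align_type` is given, only alignments
    whose 'type' matches it are considered.
    """
    found = []
    for align in alignments.get(amr_id, []):
        if align_type is not None and align["type"] != align_type:
            continue
        if token in align["tokens"]:
            found.extend(align["nodes"])
    return found


# Round-trip through a JSON string to show the structure is plain JSON.
restored = json.loads(json.dumps(alignments))
print(nodes_for_token(restored, "amr1", 1))  # prints ['1']
```

Because each alignment is an ordinary dictionary, the same loop works whether the data was built by hand, as here, or read from a file.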

The advantages of using JSON are:

Easy to load and save (no need to write a special script to read some esoteric format).
The 'type' attribute can store additional information to distinguish different types of alignments.
Multiple sets of alignments can be stored separately, without modifying an AMR file. That makes it easy to compare different sets of alignments or to align different information in different layers of alignment.
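To make the last two points concrete, here is a minimal sketch that keeps two alignment sets in separate JSON files and compares them, again using only the standard library. The file names, the `gold` and `predicted` data, and the helpers `filter_by_type` and `shared_alignments` are all hypothetical. It also assumes the files hold exactly the dictionary structure described above.

```python
import json
import os
import tempfile


def filter_by_type(alignments, align_type):
    """Keep only alignments whose 'type' equals `align_type`."""
    return {
        amr_id: [a for a in aligns if a["type"] == align_type]
        for amr_id, aligns in alignments.items()
    }


def _key(align):
    """Hashable summary of one alignment (tokens, nodes, edges)."""
    return (
        tuple(align["tokens"]),
        tuple(align["nodes"]),
        tuple(tuple(e) for e in align["edges"]),
    )


def shared_alignments(set_a, set_b):
    """Count, per AMR id, the alignments present in both sets."""
    counts = {}
    for amr_id in set_a.keys() & set_b.keys():
        keys_a = {_key(a) for a in set_a[amr_id]}
        keys_b = {_key(b) for b in set_b[amr_id]}
        counts[amr_id] = len(keys_a & keys_b)
    return counts


# Two hypothetical alignment sets for the same AMR, kept in separate files.
gold = {"amr1": [
    {"type": "isi", "tokens": [0], "nodes": ["1.1"], "edges": []},
    {"type": "isi", "tokens": [2], "nodes": ["1.2"], "edges": []},
]}
predicted = {"amr1": [
    {"type": "isi", "tokens": [0], "nodes": ["1.1"], "edges": []},
    {"type": "isi", "tokens": [2], "nodes": ["1"], "edges": []},
]}

with tempfile.TemporaryDirectory() as tmp:
    paths = {}
    for name, data in (("gold", gold), ("predicted", predicted)):
        paths[name] = os.path.join(tmp, name + ".json")
        with open(paths[name], "w", encoding="utf-8") as f:
            json.dump(data, f)
    with open(paths["gold"], encoding="utf-8") as f:
        loaded_gold = json.load(f)
    with open(paths["predicted"], encoding="utf-8") as f:
        loaded_pred = json.load(f)

print(shared_alignments(loaded_gold, loaded_pred))  # prints {'amr1': 1}
```

Neither file touches the AMR file itself, and `filter_by_type` can narrow either set to one layer (for example `filter_by_type(loaded_gold, "isi")`) before comparing.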

To read alignments from a JSON file, do:

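# Import path assumes the amr-utils package layout; adjust if yours differs.
from amr_utils.amr_readers import AMR_Reader
reader = AMR_Reader()
alignments = reader.load_alignments_from_json(alignments_file)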

To save alignments to a JSON file, do:

reader = AMR_Reader()
reader.save_alignments_to_json(alignments_file, alignments)
