Add the AMR graph construction and RGCN in example by cminus01 · Pull Request #578 · graph4ai/graph4nlp

cminus01 · 2022-09-09T08:48:34Z

Description

Checklist

Please feel free to remove inapplicable items for your PR.

The PR title starts with [$CATEGORY] (such as [Doc], [Feature])
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented
To the best of my knowledge, examples are either not affected by this change,
or have been fixed to be compatible with this change
Related issue is referred in this PR
If the PR is for a new model/paper, I've updated the example index here.

Changes

AlanSwift

Please check the comments.

AlanSwift · 2022-09-09T09:06:22Z

graph4nlp/pytorch/data/data.py

        return self.nodes[:].features

+    @property
+    def ntypes(self) -> List[str]:


If the graph is homogeneous, accessing this property should raise an exception.
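
A minimal sketch of the requested behavior, not the library's actual code. `GraphDataSketch` and `HomogeneousGraphError` are hypothetical names, and the rule that a graph counts as heterogeneous exactly when node types were supplied is an assumption made for illustration.

```python
from typing import List, Optional


class HomogeneousGraphError(Exception):
    """Hypothetical exception for hetero-only accessors used on a homogeneous graph."""


class GraphDataSketch:
    """Hypothetical stand-in for GraphData, reduced to what the property needs."""

    def __init__(self, ntypes: Optional[List[str]] = None) -> None:
        # Assumption: the graph is heterogeneous exactly when node types were supplied.
        self.is_hetero = ntypes is not None
        self._ntypes: List[str] = list(ntypes) if ntypes is not None else []

    @property
    def ntypes(self) -> List[str]:
        # Fail loudly instead of returning an empty or undefined value.
        if not self.is_hetero:
            raise HomogeneousGraphError("ntypes is only defined for heterogeneous graphs.")
        return self._ntypes
```

With this shape, a caller that forgets to check `is_hetero` gets an explicit error at the access site rather than silently working with an empty list.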

AlanSwift · 2022-09-09T09:07:55Z

graph4nlp/pytorch/data/data.py

        ), "The number of nodes to be added should be greater than 0. (Got {})".format(node_num)

+        if not self.is_hetero:
+            assert ntypes is None, "The graph is homogeneous, ntypes should be None."


Please raise exceptions instead of using assert.
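
One way the two checks could be written with explicit exceptions. This is a sketch: `check_add_nodes_args` is a hypothetical helper, and `ValueError` is one reasonable choice of exception type, not something the PR prescribes. Unlike `assert`, these checks are not stripped when Python runs with `-O`.

```python
from typing import List, Optional


def check_add_nodes_args(is_hetero: bool, node_num: int, ntypes: Optional[List[str]]) -> None:
    """Validate the arguments of a hypothetical add_nodes(node_num, ntypes) call."""
    if node_num <= 0:
        raise ValueError(
            "The number of nodes to be added should be greater than 0. (Got {})".format(node_num)
        )
    # Node types only make sense on a heterogeneous graph.
    if not is_hetero and ntypes is not None:
        raise ValueError("The graph is homogeneous, ntypes should be None.")
```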

AlanSwift · 2022-09-09T09:14:15Z

graph4nlp/pytorch/data/data.py

        return EdgeView(self)

+    @property
+    def etypes(self) -> List[Tuple[str, str, str]]:


When the graph is homogeneous, self._etypes is not assigned.

AlanSwift · 2022-09-09T09:18:26Z

graph4nlp/pytorch/data/data.py

        return self._edge_attributes

    # Conversion utility functions
+    def make_data_dict(self) -> Dict[Tuple[str, str, str], Tuple[torch.Tensor, torch.Tensor]]:


I think this could be a utility function.
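
A sketch of what a standalone utility might look like. The signature is hypothetical, and it returns plain Python lists to stay dependency-free; the method in the diff returns `torch.Tensor` pairs, so a real version would wrap each list in a tensor. It also assumes the node ids passed in are already local to their node type.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

EdgeType = Tuple[str, str, str]  # (source node type, relation, destination node type)


def make_data_dict(
    edges: Sequence[Tuple[int, int]],
    etypes: Sequence[EdgeType],
) -> Dict[EdgeType, Tuple[List[int], List[int]]]:
    """Group (src, dst) pairs by edge type into parallel source/destination lists."""
    if len(edges) != len(etypes):
        raise ValueError("edges and etypes must have the same length.")
    src_by_type: Dict[EdgeType, List[int]] = defaultdict(list)
    dst_by_type: Dict[EdgeType, List[int]] = defaultdict(list)
    for (src, dst), etype in zip(edges, etypes):
        src_by_type[etype].append(src)
        dst_by_type[etype].append(dst)
    return {etype: (src_by_type[etype], dst_by_type[etype]) for etype in src_by_type}
```

Keeping it as a free function means it only depends on the edge list and the edge types, so it can be tested without constructing a full graph object.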

AlanSwift · 2022-09-09T09:23:49Z

graph4nlp/pytorch/data/dataset.py

+                                edge_token = graph.edge_attributes[edge_idx]["token"]
+                                s.add(edge_token)
+                except Exception as e:
+                    pass


Please handle the caught exception.
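
A sketch of one way to handle it, assuming the failure being guarded against is an edge without a "token" attribute. `collect_edge_tokens` and its input shape (one list of attribute dicts per graph) are hypothetical simplifications of the loop in the diff.

```python
import logging
from typing import Dict, Iterable, List, Set

logger = logging.getLogger(__name__)


def collect_edge_tokens(edge_attributes_per_graph: Iterable[List[Dict[str, str]]]) -> Set[str]:
    """Collect the distinct edge tokens, logging edges that have no token."""
    tokens: Set[str] = set()
    for graph_idx, edge_attributes in enumerate(edge_attributes_per_graph):
        for edge_idx, attrs in enumerate(edge_attributes):
            try:
                tokens.add(attrs["token"])
            except KeyError:
                # Catch only the expected error and leave a trace instead of passing silently.
                logger.warning(
                    "Graph %d, edge %d has no 'token' attribute; skipping.", graph_idx, edge_idx
                )
    return tokens
```

Catching `KeyError` rather than `Exception` lets unrelated bugs surface instead of being swallowed.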

AlanSwift · 2022-09-09T09:25:06Z

graph4nlp/pytorch/data/dataset.py

            if "val" in self.__dict__:
                self.val = self.build_topology(self.val)
+            # build_edge_vocab and save
+            if self.init_edge_vocab:


This should be a new function similar to "build_vocab()", e.g. build_edge_vocab().
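
A sketch of such a function, written standalone for brevity. The signature, the reserved `<unk>` entry, and the sorted id assignment are all assumptions for illustration, not the existing `build_vocab()` contract.

```python
from typing import Dict, Iterable


def build_edge_vocab(edge_tokens: Iterable[str], unk_token: str = "<unk>") -> Dict[str, int]:
    """Map each distinct edge token to an integer id; id 0 is reserved for unknown tokens."""
    vocab = {unk_token: 0}
    # Sorting makes the ids reproducible across runs.
    for token in sorted(set(edge_tokens)):
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab
```

Called once after topology building, the result could then be saved alongside the other vocabularies.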

AlanSwift · 2022-09-09T09:26:14Z

graph4nlp/pytorch/data/dataset.py

        for_inference=False,
        reused_vocab_model=None,
        nlp_processor_args=None,
+        init_edge_vocab=False,


init_edge_vocab is not clear. It covers two things: 1. building the edge vocab, and 2. indicating whether to use a heterogeneous graph.
TODO: discuss

build_edge_vocab: bool

is_hetero: bool

AlanSwift · 2022-09-09T09:32:49Z

graph4nlp/pytorch/inference_wrapper/base.py

        for i in range(len(data_item_collect)):
            data_item_collect[i] = self.dataset._vectorize_one_dataitem(
-                data_item_collect[i], self.vocab_model, use_ie=use_ie
+                data_item_collect[i], self.vocab_model, use_ie=use_ie, edge_vocab=self.edge_vocab


But dataset._vectorize_one_dataitem doesn't have an "edge_vocab" parameter; please check it.

Move the edge vocab to the library code.

Unify edge_vocab with vocab_model.

AlanSwift · 2022-09-09T09:34:36Z

graph4nlp/pytorch/modules/graph_embedding_initialization/embedding_construction.py

            "w2v_bert",
            "w2v_bert_bilstm",
            "w2v_bert_bigru",
+            "w2v_amr",


Why should AMR be a general embedding strategy?
I don't think it should be regarded as a new general embedding strategy.

AlanSwift · 2022-09-09T09:35:16Z

graph4nlp/pytorch/modules/graph_embedding_initialization/embedding_construction.py

            rnn_input_size = word_emb_size

+        if "pos" in word_emb_type:
+            self.word_emb_layers["pos"] = WordEmbedding(37, 50)


Please don't use magic numbers.
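
One possible way to name the two literals. This is a sketch: `PosEmbeddingConfig` is hypothetical, and reading 37 as the POS vocabulary size and 50 as the embedding size is an assumption about what the diff intends.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PosEmbeddingConfig:
    """Hypothetical config that names the two literals from the diff."""

    vocab_size: int = 37  # assumed: number of distinct POS tags
    emb_size: int = 50  # assumed: dimension of each POS embedding vector

    def as_args(self) -> Tuple[int, int]:
        """Return the values in the order the embedding layer is called in the diff."""
        return (self.vocab_size, self.emb_size)
```

The layer construction could then read `WordEmbedding(*config.as_args())`, with the values overridable from the model configuration.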

cminus01 force-pushed the dev_amr_rgnn_demo branch 4 times, most recently from 52d81ee to dee086d · September 9, 2022 09:05

AlanSwift requested changes Sep 9, 2022

AlanSwift mentioned this pull request Sep 18, 2022

[Roadmap] Graph4NLP v0.6 plan #496

Open

xiao03 and others added 23 commits October 8, 2022 00:34

rgcn training/inference

826a7c4

Add amrgraph construction

ebfde71

add support for heterogeneous graph in from_dgl

e3f021f

add support for heterogeneous graph in from_dgl

76f1618

temp commit

9dcdf6c

rgcn training/inference

e4fcda9

fix

6f38015

Added batch graph attributes support

8baa34d

rgcn first version

f2a3f23

RGCN wrap up

dbb1242

add test_sementic_parsing

f02cbb4

fix some bug in graph embedding

030fff4

fix the embedding construction

17c6ab7

fix

b3db05e

fix

9d93760

fix

66c533d

The number of nodes is not required now.

fd3d524

test amr-graph to rgcn

eda04f1

fix

31eb12c

fix

2092d50

fix

e3831e8

fix the inference

3bbe8ea

fix the library

3cd1114

cminus01 force-pushed the dev_amr_rgnn_demo branch from dee086d to 0d0400c · October 7, 2022 16:52

fix the library

0dd8edb

cminus01 force-pushed the dev_amr_rgnn_demo branch from 0d0400c to 0dd8edb · October 7, 2022 17:40

fix the bug in data.py

37814d3

4 participants