Add the AMR graph constrction and RGCN in example#578
Add the AMR graph constrction and RGCN in example#578cminus01 wants to merge 25 commits intograph4ai:developfrom
Conversation
52d81ee to
dee086d
Compare
AlanSwift
left a comment
There was a problem hiding this comment.
Please check the comments.
| return self.nodes[:].features | ||
|
|
||
| @property | ||
| def ntypes(self) -> List[str]: |
There was a problem hiding this comment.
If the graph is homogeneous, it should be an exception.
graph4nlp/pytorch/data/data.py
Outdated
| ), "The number of nodes to be added should be greater than 0. (Got {})".format(node_num) | ||
|
|
||
| if not self.is_hetero: | ||
| assert ntypes is None, "The graph is homogeneous, ntypes should be None." |
| return EdgeView(self) | ||
|
|
||
| @property | ||
| def etypes(self) -> List[Tuple[str, str, str]]: |
There was a problem hiding this comment.
When it is homogeneous, self._etypes is not assigned.
graph4nlp/pytorch/data/data.py
Outdated
| return self._edge_attributes | ||
|
|
||
| # Conversion utility functions | ||
| def make_data_dict(self) -> Dict[Tuple[str, str, str], Tuple[torch.Tensor, torch.Tensor]]: |
There was a problem hiding this comment.
I think this could be a utility function.
graph4nlp/pytorch/data/dataset.py
Outdated
| edge_token = graph.edge_attributes[edge_idx]["token"] | ||
| s.add(edge_token) | ||
| except Exception as e: | ||
| pass |
There was a problem hiding this comment.
please handle the catched exception
graph4nlp/pytorch/data/dataset.py
Outdated
| if "val" in self.__dict__: | ||
| self.val = self.build_topology(self.val) | ||
| # build_edge_vocab and save | ||
| if self.init_edge_vocab: |
There was a problem hiding this comment.
It should be a new function similar to "build_vocab()". E.g., build_edge_vocab()
| for_inference=False, | ||
| reused_vocab_model=None, | ||
| nlp_processor_args=None, | ||
| init_edge_vocab=False, |
There was a problem hiding this comment.
init_edge_vocab is not clear. It covers 1.build edge vocab, 2. an indicator to use heterogeneous graph.
TODO: discuss
There was a problem hiding this comment.
- build_edge_vocab: bool
- is_hetero: bool
| for i in range(len(data_item_collect)): | ||
| data_item_collect[i] = self.dataset._vectorize_one_dataitem( | ||
| data_item_collect[i], self.vocab_model, use_ie=use_ie | ||
| data_item_collect[i], self.vocab_model, use_ie=use_ie, edge_vocab=self.edge_vocab |
There was a problem hiding this comment.
But the api in dataset._vectorize_one_dataitem doesn't have the parameter: "edge_vocab", please check it.
_vectorize_one_dataitem
There was a problem hiding this comment.
- move the edge vocab to the library code
- unify edge_vocab with vocab_model
| "w2v_bert", | ||
| "w2v_bert_bilstm", | ||
| "w2v_bert_bigru", | ||
| "w2v_amr", |
There was a problem hiding this comment.
Why amr shoud be a general embedding strategy?
I don't think it should be regarded as a new general embedding strategy.
| rnn_input_size = word_emb_size | ||
|
|
||
| if "pos" in word_emb_type: | ||
| self.word_emb_layers["pos"] = WordEmbedding(37, 50) |
There was a problem hiding this comment.
please don't use magic numbers.
dee086d to
0d0400c
Compare
0d0400c to
0dd8edb
Compare
Description
Checklist
Please feel free to remove inapplicable items for your PR.
or have been fixed to be compatible with this change
Changes