Learning noun-compound vectors by composing the embeddings of their constituent words.
The compositional methods rely on distributional vectors for both the noun compounds and their constituent words: we first train distributional embeddings, and then learn a composition function that maps the constituent vectors to the compound vector.
We implemented four composition functions:
- Add (Mitchell and Lapata, 2009) - f(xy) = alpha * x + beta * y for scalars alpha, beta.
- FullAdd (Zanzotto et al., 2010) - f(xy) = A * x + B * y for matrices A, B.
- Matrix (Socher et al., 2012) - f(xy) = g(W * [x ; y] + b) for non-linearity g and matrix W.
- LSTM - f(xy) = LSTM([x ; y]).
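The first three functions can be sketched in a few lines of numpy. This is an illustrative sketch, not the repository's implementation: the dimensionality `d`, the constituent vectors, and all parameters (`alpha`, `beta`, `A`, `B`, `W`, `b`) are random stand-ins for values that would normally come from pretrained embeddings and training.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # embedding dimensionality (illustrative)

# Stand-ins for the pretrained distributional vectors of the two constituents,
# e.g. x = vec("olive"), y = vec("oil") for the compound "olive oil".
x = rng.standard_normal(d)
y = rng.standard_normal(d)

def add(x, y, alpha=0.6, beta=0.4):
    """Add: scalar-weighted sum of the constituent vectors."""
    return alpha * x + beta * y

# FullAdd: each constituent is transformed by its own d-by-d matrix.
A = 0.1 * rng.standard_normal((d, d))
B = 0.1 * rng.standard_normal((d, d))

def full_add(x, y):
    return A @ x + B @ y

# Matrix: one matrix applied to the concatenation [x ; y], then a non-linearity.
W = 0.1 * rng.standard_normal((d, 2 * d))
b = np.zeros(d)

def matrix(x, y):
    return np.tanh(W @ np.concatenate([x, y]) + b)

# All three map a pair of d-dimensional vectors to one d-dimensional vector.
for f in (add, full_add, matrix):
    print(f.__name__, f(x, y).shape)
```

In practice the scalar/matrix parameters are learned by minimizing the distance between the composed vector and the compound's observed distributional vector; the LSTM variant replaces the fixed matrix with a recurrent encoder over the constituent sequence.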
To train the composition models:
bash train_composition_models.sh
To predict the embeddings of the test set:
bash predict_composition_models.sh