Use word embeddings instead of tokenization for text extraction

Currently we are using a simple `CountVectorizer` for extracting features from text. It might be a good idea to experiment with using word embeddings instead. The rationaly behind this is that count vectorization doesn't factor in the context of the text where there might be semantic context important for the triaging model.