Service NLP
Carlos Badenes edited this page Mar 31, 2016 · 1 revision
Some Natural Language Processing (NLP) tasks have been externalized as a service to reuse common functionality and optimize the use of resources. The service builds on existing NLP libraries such as GATE and Stanford CoreNLP, together with some functionality of its own.
In short, it offers:
- tokenization: Splits a stream of text into tokens, i.e. words and symbols.
- sentence splitting: Splits a sequence of tokens into sentences.
- lemmatization: Generates the word lemmas for all tokens.
- stemming: Reduces the token to the morphological root of the word.
- part-of-speech: Labels each token with its POS tag based on both its definition and its context.
- entity recognition: Identifies entities such as Person, Organization, Location, Time and Numerical expressions.
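To make the first two tasks concrete, here is a minimal Python sketch of tokenization followed by sentence splitting over the resulting tokens. It is only an illustration of the idea: the function names `tokenize` and `split_sentences`, the regular expression, and the set of sentence-ending symbols are assumptions made for this example, and the service itself relies on libraries such as GATE and Stanford CoreNLP rather than on this code.

```python
import re

# Assumption for this sketch: a token is either a run of word characters
# or a single non-space symbol (e.g. punctuation).
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")

# Assumption for this sketch: these symbols close a sentence.
SENTENCE_END = {".", "!", "?"}


def tokenize(text):
    """Split a stream of text into tokens, i.e. words and symbols."""
    return TOKEN_PATTERN.findall(text)


def split_sentences(tokens):
    """Split a sequence of tokens into sentences (lists of tokens)."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:
        # Keep trailing tokens that have no closing punctuation.
        sentences.append(current)
    return sentences


tokens = tokenize("NLP is fun. Is it hard?")
sentences = split_sentences(tokens)
print(tokens)
print(sentences)
```

A rule this simple mishandles abbreviations and decimal numbers, which is one reason to delegate these tasks to established libraries.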
Currently, it is implemented both as an internal resource and as an external resource with two interfaces: a REST web service for public clients and a Thrift-based interface for internal clients.
This work is supported by the European Community's Seventh Framework Programme (FP7-ICT-2013-8.1) under grant agreement no. 611383. For further information, please see http://DrInventor.eu
