
Technical Documentation

Carlos Badenes edited this page Mar 31, 2016 · 1 revision

epnoi is a system composed of a loosely coupled set of modules that are connected by an event bus and communicate through standardized data protocols and formats.

[Figure: architecture]

It follows a Staged Event-Driven Architecture (SEDA), which decomposes the processing flow into a set of stages connected by queues. When new research content is published or modified in a remote source, or when someone uploads a new document through the API, a new event is raised in the system to create or update the relationships between this new information and the existing information.
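
The idea of stages connected by queues can be illustrated with a minimal sketch. It is not epnoi's implementation: the `Stage` class, the `run_pipeline` function, the two handlers and the event fields (`uri`, `downloaded`, `text`) are all hypothetical names chosen for illustration, and Python's standard `queue` and `threading` modules stand in for whatever messaging layer the system actually uses.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the event stream


class Stage:
    """One stage: a worker thread that drains its own inbound queue."""

    def __init__(self, name, handler, outbox=None):
        self.name = name
        self.handler = handler        # function applied to every event
        self.inbox = queue.Queue()    # queue that feeds this stage
        self.outbox = outbox          # queue of the next stage, if any
        self.thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self.thread.start()

    def _run(self):
        while True:
            event = self.inbox.get()
            if event is SENTINEL:
                # Propagate shutdown downstream, then stop this stage.
                if self.outbox is not None:
                    self.outbox.put(SENTINEL)
                break
            result = self.handler(event)
            if self.outbox is not None:
                self.outbox.put(result)


def run_pipeline(events):
    """Push events through two hypothetical stages and collect the results."""
    results = queue.Queue()
    # Assumed handlers: the second one fakes text extraction.
    harvest = Stage(
        "harvest",
        lambda e: {**e, "text": "text of " + e["uri"]},
        outbox=results,
    )
    hoard = Stage(
        "hoard",
        lambda e: {**e, "downloaded": True},
        outbox=harvest.inbox,
    )
    for stage in (hoard, harvest):
        stage.start()
    for event in events:
        hoard.inbox.put(event)
    hoard.inbox.put(SENTINEL)

    collected = []
    while True:
        item = results.get()
        if item is SENTINEL:
            break
        collected.append(item)
    for stage in (hoard, harvest):
        stage.thread.join()
    return collected
```

Because each stage only reads from its own queue, a slow stage delays the events behind it without blocking the producers upstream, which is the main reason for decoupling stages this way.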

Each of these modules has specific responsibilities:

  • api: deploys a web interface that lets users perform operations on the system
  • hoarder: downloads documents to be added to the system
  • harvester: extracts text and meta-information from the downloaded documents
  • learner: identifies relevant terms and relations in the text and creates ontologies from them
  • modeler: creates internal models to represent and categorize the documents
  • comparator: measures the similarity between documents according to the model created
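
To make the division of responsibilities concrete, here is a small sketch of modules that communicate only through an event bus. The `EventBus` class, the topic names (`source.created`, `document.downloaded`, `document.harvested`, `model.updated`) and the order in which modules trigger each other are assumptions made for illustration; the page does not specify epnoi's actual topics or wiring.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Copy the list so callbacks may subscribe while we iterate.
        for callback in list(self._subscribers[topic]):
            callback(payload)


def build_system():
    """Wire hypothetical module handlers to hypothetical topics."""
    bus = EventBus()
    log = []  # records which module handled an event, in order

    def hoarder(source_url):
        log.append("hoarder")
        bus.publish("document.downloaded", {"source": source_url, "raw": "Some Text"})

    def harvester(doc):
        log.append("harvester")
        bus.publish("document.harvested", {**doc, "text": doc["raw"].lower()})

    def learner(doc):
        log.append("learner")

    def modeler(doc):
        log.append("modeler")
        bus.publish("model.updated", doc)

    def comparator(doc):
        log.append("comparator")

    # Modules know topic names, not each other.
    bus.subscribe("source.created", hoarder)
    bus.subscribe("document.downloaded", harvester)
    bus.subscribe("document.harvested", learner)
    bus.subscribe("document.harvested", modeler)
    bus.subscribe("model.updated", comparator)
    return bus, log
```

Publishing a single `source.created` event is enough to drive every handler in this sketch, and a module can be replaced or removed by changing only its subscription.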

This content-based recommender system is updated with both gathered and submitted information, and it handles the following resources:

[Figure: resources]

Each resource has several states and represents the following information:

  • source: a repository of documents
  • domain: a logical grouping of documents
  • document: a logical entity that includes meta-information and textual content
  • item: a piece of textual content from a document
  • part: a section of text
  • topic: a sorted list of the most representative words of a text
  • relation: an associative or semantic link between two words
  • word: a term, entity or any other textual unit that composes a text
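
The resource definitions above could be modeled roughly as follows. This is only a sketch: the class fields, the containment relationships (for example, that parts belong to an item) and the `kind` label on a relation are assumptions, and the resource states mentioned above are omitted because the page does not list them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Word:
    content: str  # a term, entity or other textual unit


@dataclass(frozen=True)
class Relation:
    start: Word
    end: Word
    kind: str  # hypothetical label, e.g. "associative" or "semantic"


@dataclass
class Topic:
    words: List[Word]  # most representative word first


@dataclass
class Part:
    text: str  # a section of text


@dataclass
class Item:
    text: str
    parts: List[Part] = field(default_factory=list)  # assumed containment


@dataclass
class Document:
    title: str  # stands in for the meta-information
    items: List[Item] = field(default_factory=list)


@dataclass
class Domain:
    name: str
    documents: List[Document] = field(default_factory=list)


@dataclass
class Source:
    url: str
    documents: List[Document] = field(default_factory=list)


def build_example():
    """Assemble one tiny, made-up instance of each resource."""
    part = Part(text="Stages are connected by queues.")
    item = Item(text=part.text, parts=[part])
    document = Document(title="Example paper", items=[item])
    source = Source(url="http://example.org/feed", documents=[document])
    domain = Domain(name="architecture", documents=[document])
    stages, queues = Word("stages"), Word("queues")
    topic = Topic(words=[stages, queues])
    relation = Relation(start=stages, end=queues, kind="associative")
    return source, domain, topic, relation
```

Note that the same `Document` object appears in both a `Source` and a `Domain` here, reflecting the distinction between where a document comes from and how it is logically grouped.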
