Customization
remap is easy to customize. Whereas platforms like Spark and Hadoop build specific functionality for their intended purpose, remap uses a plugin system that describes what a particular task is supposed to do. This makes it possible to create monitor and worker processes of many different types and to implement super-algorithms with relatively little effort.
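The plugin idea can be sketched as a registry that maps a task-type string to the class implementing it. This is a minimal illustration only; the names (`register_plugin`, `MapperPlugin`, `"mapper"`) are hypothetical and not remap's actual API.

```python
# Hypothetical plugin registry: task-type string -> plugin class.
PLUGINS = {}

def register_plugin(task_type):
    """Class decorator that records a plugin class under a task-type name."""
    def wrap(cls):
        PLUGINS[task_type] = cls
        return cls
    return wrap

@register_plugin("mapper")
class MapperPlugin:
    def run(self, record):
        # A trivial word-count style map step as a stand-in for real work.
        return [(word, 1) for word in record.split()]

# A monitor or core would look up the plugin by the type named in the job:
plugin = PLUGINS["mapper"]()
pairs = plugin.run("to be or not to be")
```

Because tasks are looked up by name rather than hard-coded, new monitor and worker behaviors can be added without touching the dispatch machinery.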
## Monitor and initiator
For each job to be executed, a coordinating machine must be started. This machine runs a (Flask) webserver that receives REST API requests to start a particular job. The job description specifies the application to use for the job and the type of job monitor to construct.
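A job description along these lines might be posted to the monitor's webserver. The field names (`"app"`, `"jobtype"`) and the endpoint shown in the comment are assumptions for illustration; consult the actual REST API for the real schema.

```python
import json

# Hypothetical job description: it names both the application to run
# and the kind of job monitor to construct for it.
job = {
    "app": "wordcount",      # which application to use for the job
    "jobtype": "mapreduce",  # which type of job monitor to construct
}
payload = json.dumps(job)

# The monitor's Flask webserver would receive this via a POST request,
# e.g. (endpoint is illustrative):
#   requests.post("http://monitor:5000/job", data=payload)
```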
The job monitor then searches for nodes and cores on which to run the algorithm.
## Nodes and cores
A node is a physical machine that can run one or more cores; a core is essentially a CPU core.
The node itself is nothing but a daemon that manages the machine's resources: it can start and stop core processes.
A core starts as an empty 'shell', which does nothing but communicate with the node daemon, or disappears if there is no work to do. As soon as the node hands the core some work, the core first discovers what type of functionality to start; it then creates a plugin to execute that kind of task.
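The core's startup dispatch could look roughly like this. All names here (`GenericMapper`, `TASK_TYPES`, `start_core`, the shape of the work item) are hypothetical stand-ins for remap's real interfaces.

```python
# Hypothetical task plugins the core can instantiate once it knows the type.
class GenericMapper:
    def execute(self, work):
        return f"mapped {work['input']}"

class GenericReducer:
    def execute(self, work):
        return f"reduced {work['input']}"

TASK_TYPES = {"mapper": GenericMapper, "reducer": GenericReducer}

def start_core(work):
    """First discover which functionality to start, then create the plugin."""
    plugin_cls = TASK_TYPES[work["type"]]   # discover the task type
    plugin = plugin_cls()                   # create the matching plugin
    return plugin.execute(work)             # hand it the work

result = start_core({"type": "mapper", "input": "part-0"})
```

Until `start_core` is called, the shell holds no task-specific state at all, which is what lets the same core process serve any job type.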
For example, a type may be a generic mapper implementation, a generic reducer implementation, or something else entirely.
The actual application is loaded next. The application determines the file type and format used to load data, the combiners in use, and how data is written out. For other types of jobs, it can determine the stream of input data, the protocol for communication between cores, and so on.
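One way to picture the split between the generic plugin and the application is a base class that the application fills in: how to read input, how to combine values, and how to write output. This is a sketch under assumed names (`Application`, `WordCount`), not remap's real class hierarchy.

```python
# Hypothetical application interface: the generic mapper/reducer plugin
# calls into these hooks, so only the application decides formats.
class Application:
    def read(self, path):
        raise NotImplementedError   # determines input file type/format
    def combine(self, key, values):
        raise NotImplementedError   # the combiner in use
    def write(self, pairs):
        raise NotImplementedError   # determines how data is written out

class WordCount(Application):
    def read(self, path):
        # Pretend the file holds a single line of text.
        return ["to be or not to be"]
    def combine(self, key, values):
        return key, sum(values)
    def write(self, pairs):
        return "\n".join(f"{k}\t{v}" for k, v in pairs)

app = WordCount()
key, total = app.combine("be", [1, 1])
```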
## Where?
You can find the currently known pluggable types for a job monitor in the src/initiator directory.
The same job types must also be present in the src/core directory so that jobs of those types can be instantiated across the network.