FlowConfiguration
Executing Flows
GeoBatch is based on flows, actions and events.
Events are generated by an Event Generator, which builds a queue of events controlled by the Event Dispatcher (a producer-consumer pattern). The Event Dispatcher spawns an Event Consumer thread, which activates the actions sequentially as specified in the flow configuration file.
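The producer-consumer relationship described above can be sketched with a bounded queue. This is an illustrative sketch, not GeoBatch's actual classes: a generator thread plays the Event Generator, and a consumer thread drains the queue as the Event Consumer would before running the configured actions.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative producer-consumer sketch (hypothetical names, not GeoBatch code).
public class EventQueueSketch {
    // The dispatcher's bounded event queue
    static final BlockingQueue<String> queue = new LinkedBlockingQueue<>(100);

    public static void main(String[] args) throws InterruptedException {
        // Producer: the "event generator" pushing file events into the queue
        Thread generator = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                queue.offer("FILE_ADDED:file" + i + ".tif");
            }
        });
        // Consumer: drains events; here it would activate the flow's actions in order
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    String event = queue.take();
                    System.out.println("consuming " + event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        generator.start();
        consumer.start();
        generator.join();
        consumer.join();
    }
}
```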
GeoBatch currently uses XStream, so flows are defined as XML files stored in the ${GEOBATCH}/src/web/src/main/webapp/WEB-INF/data/ directory. Each such file is the configuration that defines an ingestion flow and the rules to apply. Here is an example of the entire flow configuration:
<?xml version="1.0" encoding="UTF-8"?>
<FlowConfiguration>
<id></id>
<name></name>
<description></description>
...
<corePoolSize>10</corePoolSize>
<maximumPoolSize>30</maximumPoolSize>
<keepAliveTime>150</keepAliveTime> <!--seconds-->
<workQueueSize>100</workQueueSize>
<!-- keep consumer instance into memory map until they are manually removed -->
<keepConsumers>false</keepConsumers>
<!-- maximum number of consumer instances -->
<maxStoredConsumers>6</maxStoredConsumers>
...
<EventGeneratorConfiguration>
<serviceID></serviceID>
...
</EventGeneratorConfiguration>
<EventConsumerConfiguration>
...
<!-- keep runtime dir when consumer instance is disposed -->
<keepRuntimeDir>[true|false]</keepRuntimeDir>
<ACTION_1_Configuration>
...
</ACTION_1_Configuration>
...
<ACTION_N_Configuration>
...
</ACTION_N_Configuration>
</EventConsumerConfiguration>
<ListenerConfigurations>
...
</ListenerConfigurations>
</FlowConfiguration>
The main nodes of this configuration represent an identifiable resource, so you have to specify:
<id>FLOW_ID</id>
<name>FLOW_NAME</name>
<description>FLOW_DESCRIPTION</description>
E.g., a file called X_FLOW.xml contains:
<id>X_FLOW</id>
- The working directory:
Can be:
- relative to the GeoBatch working dir
- an absolute path
<workingDirectory>CONSUMER_WORKING_DIR</workingDirectory>
- The thread pool:
Each flow configuration is handled by a Flow Manager instance, which creates a new ThreadPoolExecutor with the given initial parameters and the default thread factory and rejection handler. It may be more convenient to use one of the Executors factory methods instead of this general-purpose constructor.
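As a sketch of what the Flow Manager builds from these parameters, the standard ThreadPoolExecutor constructor maps onto the configuration nodes one to one (this is illustrative code, not the actual GeoBatch implementation):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a pool built from the flow configuration's parameters.
public class FlowPoolSketch {
    static ThreadPoolExecutor buildPool() {
        return new ThreadPoolExecutor(
                10,                             // corePoolSize
                30,                             // maximumPoolSize
                150, TimeUnit.SECONDS,          // keepAliveTime (seconds)
                new ArrayBlockingQueue<>(100)); // workQueueSize
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = buildPool();
        System.out.println("core=" + pool.getCorePoolSize()
                + " max=" + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```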
- Parameters:
<corePoolSize>10</corePoolSize>
<maximumPoolSize>30</maximumPoolSize>
<keepAliveTime>150</keepAliveTime> <!--seconds-->
<workQueueSize>100</workQueueSize>
Example:
<?xml version="1.0" encoding="UTF-8"?>
<FlowConfiguration>
<id>FLOW_NAME</id>
<name>NAME</name>
<description>DESCRIPTION</description>
<workingDirectory>geotiff</workingDirectory>
<autorun>true</autorun>
<corePoolSize>10</corePoolSize>
<maximumPoolSize>30</maximumPoolSize>
<keepAliveTime>150</keepAliveTime> <!--seconds-->
<workQueueSize>100</workQueueSize>
...
</FlowConfiguration>
- The event generator:
<EventGeneratorConfiguration>
<wildCard>*.*</wildCard>
<watchDirectory>geotiff/in</watchDirectory>
<osType>OS_UNDEFINED</osType>
<eventType>FILE_ADDED</eventType>
<interval>10000</interval>
<id>geotiff_id</id>
<serviceID>fsEventGeneratorService</serviceID>
<description>description</description>
<name>geotiff</name>
</EventGeneratorConfiguration>
- The consumer configuration:
As usual, this is an identifiable component, so you have to specify:
<id>CONSUMER_ID</id>
<name>CONSUMER_NAME</name>
<description>CONSUMER_DESCRIPTION</description>
- The list of listeners used by this consumer, specified by ID (see the listeners configuration section for details):
<listenerId>ConsumerLogger0</listenerId>
...
<listenerId>ConsumerCumulator0</listenerId>
- The preserveInput flag:
Can be:
- true
- false (default)
If this flag is set to true, the consumer works directly on the input data. Be careful with this option, since the event generator can trigger events on file modification.
<preserveInput>true</preserveInput>
- The working directory:
Can be:
- relative to the GeoBatch working dir
- absolute path
This is used as a temporary directory to work on the data to 'consume' (if 'preserveInput' is set to 'false'). Here a directory named after the current timestamp is created, and the files matching the rules specified in the 'FileEventRule' list are moved into it.
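The timestamped run directory described above can be sketched in a few lines. This is a hypothetical helper, not GeoBatch's implementation; the directory name here is simply the current time in milliseconds:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: create a per-run subdirectory named after the current timestamp.
public class RuntimeDirSketch {
    static Path createRunDir(Path workingDir) throws IOException {
        // e.g. CONSUMER_WORKING_DIR/1696251234567/
        Path runDir = workingDir.resolve(Long.toString(System.currentTimeMillis()));
        Files.createDirectories(runDir);
        return runDir;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("consumer_working_dir");
        Path run = createRunDir(base);
        System.out.println("created " + run);
    }
}
```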
<workingDirectory>CONSUMER_WORKING_DIR</workingDirectory>
- The performBackup flag:
Can be:
- true
- false (default)
If this flag is set to true, a directory called backup is created:
/CONSUMER_WORKING_DIR/TIMESTAMP/backup/
<!--into this directory will be performed a copy of the input files before the work starts.-->
<performBackup>false</performBackup>
- The FileEventRule list:
<FileEventRule>
...
</FileEventRule>
- The actions list:
<ACTION_1_Configuration>
...
</ACTION_1_Configuration>
...
<ACTION_N_Configuration>
...
</ACTION_N_Configuration>
Example:
<EventConsumerConfiguration>
<id>CONSUMER_ID</id>
<name>CONSUMER_NAME</name>
<description>CONSUMER_DESCRIPTION</description>
<listenerId>ConsumerLogger0</listenerId>
...
<listenerId>ConsumerCumulator0</listenerId>
<preserveInput>false</preserveInput>
<workingDirectory>CONSUMER_WORKING_DIR</workingDirectory>
<performBackup>false</performBackup>
<FileEventRule>
...
</FileEventRule>
<ACTION_1_Configuration>
...
</ACTION_1_Configuration>
...
<ACTION_N_Configuration>
...
</ACTION_N_Configuration>
</EventConsumerConfiguration>
- File event rule:
The FileEventRule list is a list of rules which are checked before ingestion starts.
Each FileEventRule is an identifiable component so you have to specify:
<id></id>
<description></description>
<name></name>
- The node optional:
Specifies whether this rule is mandatory for the ingestion to start. Can be:
- true
- false (default)
<optional>false</optional>
- The node originalOccurrencies:
Specifies the number of file occurrences which should match this rule. Can be:
- a positive integer
<originalOccurrencies>1</originalOccurrencies>
- The node regex:
Specifies the regular expression which should match the input file name.
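For instance, the default pattern .*\..* matches any file name containing a dot. A quick illustrative check (the helper name is hypothetical):

```java
// Illustrative sketch: testing a file name against a FileEventRule regex.
public class RegexRuleSketch {
    static boolean matchesRule(String fileName) {
        // ".*\..*" as it appears in the XML; the backslash is doubled in Java
        return fileName.matches(".*\\..*");
    }

    public static void main(String[] args) {
        System.out.println(matchesRule("ortho.tif")); // contains a dot -> true
        System.out.println(matchesRule("README"));    // no dot -> false
    }
}
```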
<regex>.*\..*</regex>
Example:
<FileEventRule>
<optional>false</optional>
<originalOccurrencies>1</originalOccurrencies>
<regex>.*\..*</regex>
<id>rule_1_id</id>
<description>description</description>
<name>rule_1</name>
</FileEventRule>
...
<FileEventRule>
...
</FileEventRule>
- Event Generator:
Currently the only supported Event Generator is the file system event generator.
<EventGeneratorConfiguration>
<serviceID>fsEventGeneratorService</serviceID>
...
</EventGeneratorConfiguration>
The fsEventGeneratorService
It is an identifiable object, so you have to specify:
<id></id>
<description></description>
<name></name>
- The node interval:
Used to specify the polling interval (in milliseconds). Can be:
- a positive integer, >0 and <2^63-1 (it is stored as a long)
Default:
- 5000 milliseconds
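A polling generator of this kind can be sketched with a scheduled task that rescans the watch directory at the configured interval. This is a hypothetical sketch, not the actual fsEventGeneratorService implementation:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative polling loop: one scan every <interval> milliseconds.
public class PollingSketch {
    static final AtomicInteger scans = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        long intervalMs = 5000; // the default used when <interval> is omitted
        ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();
        // First scan runs immediately, then one every intervalMs milliseconds;
        // a real generator would diff the directory contents and emit FILE_ADDED events.
        poller.scheduleAtFixedRate(
                () -> System.out.println("scan #" + scans.incrementAndGet()),
                0, intervalMs, TimeUnit.MILLISECONDS);
        Thread.sleep(200); // let the first scan happen, then stop
        poller.shutdownNow();
    }
}
```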
Example:
<EventGeneratorConfiguration>
<serviceID>fsEventGeneratorService</serviceID>
...
<id></id>
<description></description>
<name></name>
<wildCard>*.*</wildCard>
<watchDirectory>geotiff/in</watchDirectory>
<osType>OS_UNDEFINED</osType>
<eventType>FILE_ADDED</eventType>
<interval>10000</interval>
...
</EventGeneratorConfiguration>
The listeners configuration:
Each listener is referenced in the previously described components using the node:
<listenerId>NAME</listenerId>
which has its counterpart in this list in the node:
<id>NAME</id>
The node:
<serviceID>...</serviceID>
represents an alias id for the class to use and currently can be:
- cumulatingListenerService
It is a service used to instantiate the ProgressCumulatingListener class, which is used to cumulate status messages for the graphical interface; it must be configured at the consumer level.
- statusListenerService
It is a service used to instantiate the ProgressStatusListener class, which is used by the graphical interface to monitor the status of individual actions; accordingly, it should be used only in the configuration of an action.
- loggingListenerService
It is a service used to instantiate the ProgressLoggingListener class, which is used by the actions and by the consumer to log events in progress,
for example:
- Consumer started
- Action started
- Action concluded
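The listener pattern behind these services can be sketched as follows. The interface and class names here are illustrative only; see the GeoBatch sources for the real ProgressLoggingListener/ProgressStatusListener/ProgressCumulatingListener interfaces:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a progress listener receives lifecycle messages.
public class ListenerSketch {
    interface ProgressListener {
        void progress(String message);
    }

    // A logging-style listener: records and prints each message it receives
    static class RecordingProgressListener implements ProgressListener {
        final List<String> log = new ArrayList<>();
        public void progress(String message) {
            log.add(message);
            System.out.println("[progress] " + message);
        }
    }

    public static void main(String[] args) {
        RecordingProgressListener listener = new RecordingProgressListener();
        // The consumer and its actions would fire events like these:
        listener.progress("Consumer started");
        listener.progress("Action started");
        listener.progress("Action concluded");
    }
}
```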
<ListenerConfigurations>
<CumulatingProgressListener>
<serviceID>cumulatingListenerService</serviceID>
<id>ConsumerLogger0</id>
</CumulatingProgressListener>
<StatusProgressListener>
<serviceID>statusListenerService</serviceID>
<id>ActionListener0</id>
</StatusProgressListener>
<LoggingProgressListener>
<serviceID>loggingListenerService</serviceID>
<id>ActionListener1</id>
<loggerName>ActionListener1</loggerName>
</LoggingProgressListener>
<LoggingProgressListener>
<serviceID>loggingListenerService</serviceID>
<id>ConsumerLogger0</id>
<loggerName>ConsumerLogger0</loggerName>
</LoggingProgressListener>
</ListenerConfigurations>