Features

Mayank Mishra edited this page Jun 5, 2014 · 2 revisions

Cluster Monitor:

![Cluster Monitor](http://i.imgur.com/xrdVRoM.png "Cluster Monitor")

Prominent users: Hadoop cluster administrators. The cluster monitor lets administrators monitor the cluster through fine-grained, system-level statistics for each node. They can mark frequently monitored statistics as favorites, fetch refreshed results at a specified time interval, and select trends for specific statistics.

Distinguishing feature: monitoring is on demand and can be turned on or off as the user requires. Jumbune is also loosely coupled: it can be deployed on a remote machine without additional setup on each cluster machine. The cluster monitoring features are summarized below:

  • Node level cluster view to monitor system and Hadoop parameters.

  • Network latency view to detect network latency across the nodes in the cluster.

  • Data load partition to monitor the data load distribution among the various nodes of the cluster.

  • Replica management view to show the replication of data blocks in HDFS.
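To make the data load partition view concrete, here is a minimal sketch of how a per-node share of the data could be computed and uneven nodes flagged. It is an illustration only, not Jumbune's implementation: the function names, the `tolerance` parameter, and the sample byte counts are all assumptions.

```python
def load_distribution(bytes_per_node):
    """Return each node's share of the total stored data, as a percentage."""
    total = sum(bytes_per_node.values())
    if total == 0:
        return {node: 0.0 for node in bytes_per_node}
    return {node: 100.0 * used / total for node, used in bytes_per_node.items()}


def skewed_nodes(bytes_per_node, tolerance=10.0):
    """List nodes whose share differs from an even split by more than
    `tolerance` percentage points (a hypothetical threshold)."""
    shares = load_distribution(bytes_per_node)
    if not shares:
        return []
    even_share = 100.0 / len(shares)
    return sorted(node for node, share in shares.items()
                  if abs(share - even_share) > tolerance)


# Hypothetical sample: bytes stored on each node.
sample = {"node1": 600, "node2": 200, "node3": 200}
print(load_distribution(sample))
print(skewed_nodes(sample, tolerance=15.0))
```

With the sample above, `node1` holds 60% of the data against an even share of roughly 33%, so it is the only node reported at a 15-point tolerance.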

Job Profiler:

![Job Profiler](http://i.imgur.com/sxHXE35.png "Job Profiler")

Prominent users: Hadoop MapReduce developers and cluster administrators. This feature profiles the MapReduce jobs in the cluster and gives insight into the CPU and heap dumps of a Hadoop job. The profiler also provides an in-depth graph view of the MapReduce phases: Map, Reduce, Sort, Shuffle, Setup and Cleanup.

Distinguishing feature: the graphical view correlates parameters such as execution time, CPU consumption, memory usage and data flow rate across these phases. It helps developers understand which resources are creating a bottleneck and which MapReduce phases are taking longer than estimated and require optimization to achieve efficient job performance.

MapReduce Job Flow Debugger:

Prominent users: MapReduce developers and quality engineers. The debugger provides code-level control flow statistics for a MapReduce job. Users may apply regex validations or their own user-defined validation classes. Based on the validations applied, Jumbune's flow debugger checks the flow of <key,value> data through the mapper and reducer respectively.

Distinguishing feature: Jumbune provides a comprehensive table and chart view that displays the flow of input records at job level, MR level and instance level. Unmatched keys/values represent the number of erroneous key/value data in the job execution result. The debugger drilldown looks deeper into the code to examine the flow of data for various counters, such as loops and conditions (if, else-if and so on).
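The idea of counting unmatched keys and values against regex validations can be sketched as follows. This is a simplified illustration, not Jumbune's debugger: the function name, the patterns, and the sample pairs are hypothetical, and keys and values are assumed to be plain strings.

```python
import re


def count_unmatched(pairs, key_pattern, value_pattern):
    """Count keys and values that fail to fully match their regex."""
    key_re = re.compile(key_pattern)
    value_re = re.compile(value_pattern)
    unmatched_keys = sum(1 for key, _ in pairs if not key_re.fullmatch(key))
    unmatched_values = sum(1 for _, value in pairs if not value_re.fullmatch(value))
    return {"unmatched_keys": unmatched_keys, "unmatched_values": unmatched_values}


# Hypothetical mapper output: one bad value ("n/a") and one bad key ("").
mapper_output = [("user42", "17"), ("user7", "n/a"), ("", "3")]
print(count_unmatched(mapper_output, r"user\d+", r"\d+"))
```

Running the same check on mapper output and on reducer output separately shows at which stage the erroneous records first appear.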

Data Validation:

Prominent users: MapReduce developers and quality engineers. HDFS data should conform to a specific, predefined format. Jumbune provides a simple, easy-to-use utility for finding discrepancies and errors in HDFS data. Users may check for data violations in categories such as data type, null values and regular expression values.

Distinguishing feature: users get a detailed view of violations by category. The results are displayed for each file present at the specified HDFS path, with details of the violations for each record, which makes it easy to track down the actual corrupted records.
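The three violation categories can be illustrated with a short sketch that checks delimited records field by field. It is not Jumbune's validator: the rule format, the function names, and the comma delimiter are assumptions made for the example, and it reads lines from a list rather than from HDFS.

```python
import re


def validate_record(fields, rules):
    """Return (field_index, category) tuples for every violation in one record.

    Each rule is a dict with optional keys (all hypothetical):
    'type' (a callable such as int or float), 'nullable' (bool), 'regex' (str).
    """
    violations = []
    for index, rule in enumerate(rules):
        value = fields[index] if index < len(fields) else ""
        if value == "":
            # An empty or missing field is treated as null.
            if not rule.get("nullable", True):
                violations.append((index, "null"))
            continue
        expected_type = rule.get("type")
        if expected_type is not None:
            try:
                expected_type(value)
            except ValueError:
                violations.append((index, "data type"))
        pattern = rule.get("regex")
        if pattern and not re.fullmatch(pattern, value):
            violations.append((index, "regex"))
    return violations


def validate_lines(lines, rules, delimiter=","):
    """Map 1-based line numbers to their violations, skipping clean records."""
    report = {}
    for line_number, line in enumerate(lines, start=1):
        found = validate_record(line.rstrip("\n").split(delimiter), rules)
        if found:
            report[line_number] = found
    return report


rules = [{"type": int, "nullable": False}, {"regex": r"[A-Z]{2}"}, {"type": float}]
lines = ["1,IN,3.5", ",US,2.0", "x,usa,abc"]
print(validate_lines(lines, rules))
```

Keying the report by line number mirrors the per-record detail described above; running it once per file would give the per-file breakdown.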