107 changes: 107 additions & 0 deletions README_v1
@@ -0,0 +1,107 @@
# enodo
A utility for ad-hoc analysis of CSV files when `grep`, `sed`, `awk` and `cut` aren't enough.

## Prerequisites
* Java 6 (or later)

## Build
* [Download ZIP](https://github.com/headstar/enodo/archive/master.zip) or `$ git clone https://github.com/headstar/enodo.git`
* `$ cd enodo`
* `$ ./gradlew build`

The resulting jar is written to `./build/libs`.

## Example

Example csv file:

    The,quick,brown
    fox,jumps
    over,the,lazy,dog

Convenience base class for processors, `AbstractCSVProcessor`:

    package com.headstartech.enodo;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import java.io.PrintWriter;

    /**
     * Convenience base class for a {@link CSVProcessor}.
     */
    public abstract class AbstractCSVProcessor implements CSVProcessor {

        // Logger named after the concrete subclass.
        private final Logger logger = LoggerFactory.getLogger(this.getClass());

        // Destination for the processor's output, injected via setOutputWriter.
        private PrintWriter out;

        @Override
        public void setOutputWriter(PrintWriter out) {
            this.out = out;
        }

        @Override
        public void afterLastRow() {
            // do nothing
        }

        @Override
        public void beforeFirstRow() {
            // do nothing
        }

        protected PrintWriter getOutputWriter() {
            return out;
        }

        protected Logger getLogger() {
            return logger;
        }
    }
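To illustrate the lifecycle the base class exposes (`setOutputWriter`, `beforeFirstRow`, `afterLastRow`), here is a minimal, self-contained sketch of a processor that counts field lengths over the example csv file. It is not enodo's API and not the contents of `FieldLengthCount.groovy`: the `SketchProcessor` interface is a hypothetical stand-in, the `processRow` method name is an assumption, and the "length,count" output format is only a guess at what such a script might print.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Map;
import java.util.TreeMap;

class FieldLengthCountSketch {

    // Hypothetical stand-in for enodo's CSVProcessor. The three lifecycle
    // methods mirror AbstractCSVProcessor; processRow is an ASSUMED name.
    interface SketchProcessor {
        void setOutputWriter(PrintWriter out);
        void beforeFirstRow();
        void processRow(String[] fields);
        void afterLastRow();
    }

    // Counts how many fields have each length and prints the totals at the end.
    static class FieldLengthCounter implements SketchProcessor {
        private final Map<Integer, Integer> counts = new TreeMap<Integer, Integer>();
        private PrintWriter out;

        public void setOutputWriter(PrintWriter out) {
            this.out = out;
        }

        public void beforeFirstRow() {
            counts.clear();
        }

        public void processRow(String[] fields) {
            for (String field : fields) {
                Integer seen = counts.get(field.length());
                counts.put(field.length(), seen == null ? 1 : seen + 1);
            }
        }

        public void afterLastRow() {
            // One "length,count" line per distinct field length, shortest first.
            for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
                out.println(entry.getKey() + "," + entry.getValue());
            }
            out.flush();
        }
    }

    // Drives the lifecycle over the given lines and returns what was written.
    static String run(String[] lines) {
        StringWriter buffer = new StringWriter();
        FieldLengthCounter processor = new FieldLengthCounter();
        processor.setOutputWriter(new PrintWriter(buffer));
        processor.beforeFirstRow();
        for (String line : lines) {
            processor.processRow(line.split(","));
        }
        processor.afterLastRow();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(run(new String[] {
            "The,quick,brown", "fox,jumps", "over,the,lazy,dog"}));
    }
}
```

For the example file this reports four fields of length 3, two of length 4 and three of length 5. A real processor would extend `AbstractCSVProcessor` and write through `getOutputWriter()` instead of holding its own writer.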



Run the utility:

`$ java -jar enodo-1.0.0.jar -i /tmp/example.csv -s src/test/resources/FieldLengthCount.groovy`


## Misc

### Input
If no input file is specified with `-i`, input is read from `stdin`.

The example above can also be run as:

`$ cat /tmp/example.csv | java -jar build/libs/enodo-1.0.0.jar -s src/test/resources/FieldLengthCount.groovy`

#### Regular expression filter
Regular expressions are supported when specifying the input.

Example:

If an application produces files named with the pattern "myapp.yyyymmdd_hhmmss", the pattern below could be used to process all files written between 23:00 and midnight in October 2015.

`$ java -jar enodo-1.0.0.jar -s FieldLengthStats.groovy -i "myapp.201510.._(23)+.*"`

**Note the `"` around the input file argument! Without them the shell may try to expand the pattern itself.**
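To see which file names a pattern like this selects, the check can be sketched with `java.util.regex`. This is only an illustration: it assumes the whole file name must match the expression, which may differ from how enodo applies it, and the class name, method name and file names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class InputPatternDemo {

    // Returns the names that match the regex in full (assumed semantics).
    static List<String> select(String regex, List<String> fileNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<String>();
        for (String name : fileNames) {
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical file names following "myapp.yyyymmdd_hhmmss".
        List<String> names = Arrays.asList(
            "myapp.20151015_230501",   // October 2015, hour 23
            "myapp.20151015_220501",   // October 2015, hour 22
            "myapp.20151115_230501",   // November 2015, hour 23
            "myappX20151031_235959");  // not a literal '.' after "myapp"

        // Pattern from the example above: '.' matches any character.
        System.out.println(select("myapp.201510.._(23)+.*", names));

        // Stricter variant with the first dot escaped.
        System.out.println(select("myapp\\.201510.._23.*", names));
    }
}
```

With these names, the first pattern selects `myapp.20151015_230501` and also `myappX20151031_235959`, because an unescaped `.` matches any character. The stricter variant selects only `myapp.20151015_230501`.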

### Output
If no output file is specified, output is written to `stdout`.

### Logging
The application log location is `/tmp/enodo.log`.

### Useful libraries available at runtime

#### [Guava](https://github.com/google/guava)
Useful if you want to do further splitting, joining etc.

#### [Joda Time](http://www.joda.org/joda-time/)
For easier date & time manipulation.

#### [The Apache Commons Mathematics Library](http://commons.apache.org/proper/commons-math/)
The [statistics package](http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics) contains a number of useful functions:
* arithmetic and geometric means
* variance and standard deviation
* minimum, maximum, median, and percentiles
* ...and more