carob-data/carob

Carob

Carob creates reproducible workflows that standardize primary agricultural research data from experiments and surveys. Standardization includes the use of a common file format, variable names, units and accepted values according to the terminag standard. Standardized data sets are aggregated into larger collections that can be used in further research. We do this by writing an R script for each individual dataset. See the website for more information.
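As a rough illustration of what standardization means in practice, the sketch below renames variables, harmonizes values, and converts units for a tiny made-up data set. The raw column names, the target names (`country`, `crop`, `yield`), and the kg/ha target unit are assumptions chosen for illustration. They are not taken from the terminag standard or from any actual Carob script.

```r
# Hypothetical raw data, as it might appear in a published experiment file.
raw <- data.frame(
  Country = c("kenya", "KENYA"),
  Crop = c("Maize", "maize"),
  Yield_t_ha = c(2.5, 3.1)
)

# Helper (illustrative only): lower-case a string, then capitalize its first letter.
cap_first <- function(x) {
  x <- tolower(x)
  paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x)))
}

# Assumed target variable names, accepted values, and units.
std <- data.frame(
  country = cap_first(raw$Country),  # one spelling per country
  crop = tolower(raw$Crop),          # lower-case crop names
  yield = raw$Yield_t_ha * 1000      # convert t/ha to kg/ha
)
```

Once every source data set has the same names and units, aggregating them comes down to row-binding the standardized tables.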

Carob is an open-access Extract, Transform, and Load (ETL) framework, backed by CGIAR, to support predictive analytics (machine learning, artificial intelligence) and other types of data analysis.

Contributions are welcome from anyone and can be made via pull requests. Feel free to improve these scripts or provide new ones. See the instructions on how to write a Carob script described here. You can also raise an issue. A good place to discover new data sets is the Gardian website or our to-do list.

Get the data

Standardized data can be downloaded from carob-data.org (data with a CC license only), or with the R package caramba.

You can also compile your own version by cloning this repo and running

remotes::install_github("carob-data/carobiner")
ff <- carobiner::make_carob(path)

where path is the folder of the cloned repo (e.g. "d:/github/carob").
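The two calls above can be wrapped so that a wrong path fails early and carobiner is installed only when it is missing. This is a minimal sketch. `build_carob` is a hypothetical helper and not part of carobiner; it only combines the `install_github` and `make_carob` calls shown above with base R checks.

```r
# Hypothetical helper (not part of carobiner): validate the clone path, then build.
build_carob <- function(path) {
  # Fail early if the path does not point to an existing folder.
  if (!dir.exists(path)) {
    stop("Carob repo folder not found: ", path)
  }
  # Install carobiner only if it is not already available.
  if (!requireNamespace("carobiner", quietly = TRUE)) {
    remotes::install_github("carob-data/carobiner")
  }
  carobiner::make_carob(path)
}

# Example call; replace the path with the location of your own clone.
# ff <- build_carob("d:/github/carob")
```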
