A Rust implementation of the Concise Data Definition Language (CDDL). CDDL is an IETF standard that "proposes a notational convention to express CBOR and JSON data structures." As of 2019-06-12, it is published as RFC 8610 (Proposed Standard) at https://tools.ietf.org/html/rfc8610.
This crate supports the following CDDL-related RFCs:
| RFC | Title | Status |
|---|---|---|
| RFC 8610 | Concise Data Definition Language (CDDL) | ✔️ Full parsing and validation |
| RFC 9165 | Additional Control Operators for CDDL | ✔️ `.cat`, `.det`, `.plus`, `.abnf`, `.abnfb`, `.feature` |
| RFC 9682 | Updates to CDDL (Empty Data Models, `\u{hex}` Escapes, Non-Literal Tag Numbers) | ✔️ Full grammar and parser support |
| RFC 9741 | Additional Control Operators for Text in CDDL | ✔️ `.b64u`, `.b64c`, `.hex`, `.hexlc`, `.hexuc`, `.b32`, `.h32`, `.b45`, `.base10`, `.printf`, `.json`, `.join` and sloppy variants |
| draft-bormann-cbor-cddl-csv-08 | Using CDDL for CSV | ✔️ CSV validation via generic data model mapping |
| draft-bormann-cbor-cddl-freezer-17 | CDDL Feature Freezer | ✔️ `.pcre` (PCRE2), `.iregexp` (RFC 9485), `.bitfield` |
This crate uses Pest (a PEG parser generator for Rust) to parse CDDL. The grammar is defined in cddl.pest and closely follows the ABNF grammar in Appendix B. of the spec. All CDDL must use UTF-8 for its encoding per the spec.
This crate supports validation of CBOR, JSON, and CSV data structures. The minimum supported Rust version (MSRV) is 1.88.0.
Also bundled into this repository is a basic language server implementation and extension for Visual Studio Code for editing CDDL. The implementation is backed by the compiled WebAssembly target included in this crate.
- Parse CDDL documents into an AST
- Verify conformance of CDDL documents against RFC 8610
- Validate CBOR data structures
- Validate JSON documents
- Validate CSV data
- Generate Rust types from CDDL
- Generate dummy JSON from conformant CDDL
- As close to zero-copy as possible
- Compile WebAssembly target for browser and Node.js
- `no_std` support (lexing and parsing only)
- Language server implementation and Visual Studio Code Extension
- Support CBOR diagnostic notation
- I-JSON compatibility
This crate is used in several notable projects:
| Project | Description |
|---|---|
| google/cddlconv | A command-line utility for converting CDDL to TypeScript and Zod schemas |
| input-output-hk/catalyst-core | Core Catalyst governance engine for Cardano, uses this crate for CBOR validation of CIP-36 voter registration data |
Rust is a systems programming language designed around safety and is ideally-suited for resource-constrained systems. CDDL and CBOR are designed around small code and message sizes and constrained nodes, scenarios for which Rust has also been designed.
A CLI is available for various platforms. The tool supports parsing of CDDL files for verifying conformance against RFC 8610. It can also be used to validate JSON documents, CBOR binary files, and CSV files against CDDL documents. Detailed information about the JSON, CBOR, and CSV validation implementation can be found in the sections below.
Binaries for Linux, macOS and Windows can be downloaded from GitHub Releases.
```sh
cargo install cddl
```

```sh
docker pull ghcr.io/anweiss/cddl-cli:latest
```

Instructions for using the tool can be viewed by executing the help subcommand:

```sh
cddl help
```

If using Docker:

> Replace `<version>` with an appropriate release tag. Requires use of the `--volume` argument for mounting CDDL documents into the container when executing the command. JSON or CBOR files can either be included in the volume mount or passed into the command via STDIN.

```sh
docker run -it --rm -v $PWD:/cddl -w /cddl ghcr.io/anweiss/cddl-cli:<version> help
```

You can validate JSON documents, CBOR binary files, and/or CSV files:

```sh
cddl validate [OPTIONS] --cddl <CDDL> <--stdin|--json <JSON>...|--cbor <CBOR>...|--csv <CSV>...>
```

For CSV files, use the `--csv-header` flag if the first row is a header:

```sh
cddl validate --cddl schema.cddl --csv data.csv --csv-header
```

It also supports validating files from STDIN (if it detects the input as valid UTF-8, it will attempt to validate the input as JSON, otherwise it will treat it as CBOR):

```sh
cat reputon.json | cddl validate --cddl reputon.cddl --stdin
cat reputon.cbor | cddl validate --cddl reputon.cddl --stdin
```

or using Docker:

```sh
docker run -i --rm -v $PWD:/data -w /data ghcr.io/anweiss/cddl-cli:0.10.4 validate --cddl reputon.cddl --stdin < reputon.json
```

You can also find a simple RFC 8610 conformance tool at https://cddl.anweiss.tech. This same codebase has been compiled for use in the browser via WebAssembly.
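The STDIN detection rule described above (valid UTF-8 is tried as JSON, anything else is treated as CBOR) can be sketched with the standard library alone. `StdinKind` and `classify_stdin` are hypothetical names used for illustration; they are not part of the cddl crate or its CLI.

```rust
/// Input kinds distinguished when reading from STDIN (illustrative only).
#[derive(Debug, PartialEq)]
enum StdinKind {
    /// Valid UTF-8: the input is validated as JSON.
    Json,
    /// Anything else: the input is treated as CBOR bytes.
    Cbor,
}

/// Classify raw STDIN bytes using the UTF-8 rule.
/// Hypothetical helper, not the CLI's actual implementation.
fn classify_stdin(input: &[u8]) -> StdinKind {
    match std::str::from_utf8(input) {
        Ok(_) => StdinKind::Json,
        Err(_) => StdinKind::Cbor,
    }
}
```

Under this sketch, a short CBOR item whose bytes happen to be valid UTF-8 (for example the single byte 0x02) would be classified as JSON, so passing such files with an explicit `--cbor` argument may be the safer choice.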
An extension for editing CDDL documents with Visual Studio Code has been published to the Marketplace. You can find more information in the README.
- maps
- structs
- tables
- cuts
- groups
- arrays
- values
- choices
- ranges
- enumeration (building a choice from a group)
- root type
- occurrence
- predefined types
- tags
- unwrapping
- controls
- socket/plug
- generics
- operator precedence
- comments
- numerical int/uint values
- numerical hexfloat values
- numerical values with exponents
- unprefixed byte strings
- prefixed byte strings
Simply add the dependency to Cargo.toml:

```toml
[dependencies]
cddl = "0.10.4"
```

JSON, CBOR, and CSV validation all require std.
A few convenience features have been included to make the AST more concise and for enabling additional functionality. You can build with default-features = false for a no_std build and selectively enable any of the features below.
--feature ast-span
Add the Span type to the AST for keeping track of the position of the lexer and parser. Enabled by default.
--feature ast-comments
Include comment strings in the AST. Enabled by default.
--feature ast-parent
Add the ParentVisitor implementation so that the AST can be traversed using parent pointers. Enabled by default.
--feature json
Enable JSON validation. Enabled by default.
--feature cbor
Enable CBOR validation. Enabled by default.
--feature csv-validate
Enable CSV validation per draft-bormann-cbor-cddl-csv-08. Enabled by default.
--feature additional-controls
Enable validation support for the additional control operators defined in RFC 9165 and RFC 9741. Enabled by default.
--feature freezer
Enable control operators from the CDDL Feature Freezer draft. Enabled by default. Includes:
- `.pcre`: PCRE2 regular expressions via `fancy-regex` (supports lookahead, lookbehind, backreferences). Patterns are anchored on both sides per the spec.
- `.iregexp`: RFC 9485 interoperable regular expressions. Anchored matching.
- `.bitfield`: Structured bitfield validation for unsigned integers. The controller is an array of bit widths; the validator checks that the uint value fits within the total declared bit width.
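The `.bitfield` width check described above amounts to summing the declared bit widths and confirming that the value has no bits set beyond that total. The following is a minimal sketch of that arithmetic; `fits_bitfield` is a hypothetical helper and does not reflect how the crate implements the operator internally.

```rust
/// Check that `value` fits in the total number of bits declared by `widths`.
/// Hypothetical helper illustrating the `.bitfield` rule; not the crate's API.
fn fits_bitfield(value: u64, widths: &[u32]) -> bool {
    // Sum the declared widths, saturating so large inputs cannot overflow.
    let total = widths.iter().fold(0u32, |acc, w| acc.saturating_add(*w));
    if total >= u64::BITS {
        // Every u64 fits in 64 or more bits.
        return true;
    }
    // For total < 64, the value fits when no bit at or above `total` is set.
    value >> total == 0
}
```

For example, with widths `[3, 5]` the total is 8 bits, so 255 fits and 256 does not.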
```rust
use cddl::parser::cddl_from_str;

let input = r#"myrule = int"#;
assert!(cddl_from_str(input, true).is_ok())
```

The companion crate cddl-derive provides proc macros for generating Rust types from CDDL definitions at compile time.
```toml
[dependencies]
cddl-derive = "0.1"
serde = { version = "1", features = ["derive"] }
```

Apply `#[cddl(path = "...")]` to a stub struct. The macro reads the CDDL file, finds the matching rule (by converting the struct name from PascalCase to kebab-case), and replaces the struct with fully populated fields:
```rust
use cddl_derive::cddl;

// schema.cddl contains: person = { name: tstr, age: uint, ? email: tstr }
#[cddl(path = "schema.cddl")]
struct Person;
```

This expands at compile time to:
```rust
use serde::{Deserialize, Serialize};

#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct Person {
    pub name: String,
    pub age: u64,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub email: Option<String>,
}
```

Use the `rule` attribute to target a specific CDDL rule name when the struct name does not match:
```rust
#[cddl(path = "schema.cddl", rule = "person-record")]
struct MyPerson;
```

`cddl_typegen!` generates Rust types for every rule in a CDDL file:
```rust
use cddl_derive::cddl_typegen;

cddl_typegen!("schema.cddl");
// Generates: struct Person { ... }, struct Address { ... }, etc.
```

| CDDL construct | Rust output |
|---|---|
| Map types `{ key: type, ... }` | `struct` with named fields |
| Type choices `a / b / c` | `enum` with variants |
| Simple type references `foo = tstr` | `type` alias |
| Array types `[* T]` | `Vec<T>` |
| Table types `{ * tstr => T }` | `HashMap<String, T>` |
| Optional fields `? key: type` | `Option<T>` with `skip_serializing_if` |
| Nullable types `T / null` | `Option<T>` |
| Hyphenated names `my-field` | `snake_case` field + `#[serde(rename = "my-field")]` |
| Rust keywords `type`, `match` | Escaped with trailing `_` + serde rename |
All standard prelude types (tstr, uint, int, float, bstr, bool, null, any, etc.) are mapped to their idiomatic Rust equivalents. All generated types include Serialize and Deserialize derives for use with Serde.
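The naming rules mentioned above (PascalCase struct names matched to kebab-case rules, hyphenated members turned into snake_case fields, and keywords escaped with a trailing underscore) can be illustrated with a small standalone sketch. `pascal_to_kebab` and `field_ident` are hypothetical helpers written for this example, and the keyword list is deliberately partial; cddl-derive's actual conversion may handle edge cases such as acronyms differently.

```rust
/// Convert a PascalCase struct name to a kebab-case CDDL rule name.
/// Assumption: every uppercase ASCII letter starts a new word, so an
/// acronym such as "HTTP" would be split letter by letter.
fn pascal_to_kebab(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('-');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Map a CDDL member name to a Rust field identifier. The second element
/// is the original name when a `#[serde(rename = "...")]` would be needed.
fn field_ident(cddl_name: &str) -> (String, Option<String>) {
    // A small, illustrative subset of Rust keywords.
    const KEYWORDS: [&str; 6] = ["type", "match", "fn", "struct", "enum", "impl"];
    let mut ident = cddl_name.replace('-', "_");
    if KEYWORDS.contains(&ident.as_str()) {
        ident.push('_');
    }
    if ident == cddl_name {
        (ident, None)
    } else {
        (ident, Some(cddl_name.to_string()))
    }
}
```

With these helpers, `PersonRecord` maps to the rule name `person-record`, `my-field` becomes the field `my_field` with a rename, and `type` becomes `type_` with a rename.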
```rust
use cddl::validate_json_from_str;

let cddl = r#"person = {
  name: tstr,
  age: uint,
  address: tstr,
}"#;
let json = r#"{
  "name": "John",
  "age": 50,
  "address": "1234 Lakeshore Dr"
}"#;
assert!(validate_json_from_str(cddl, json).is_ok())
```

This crate uses the Serde framework, and more specifically, the serde_json crate, for parsing and validating JSON. Serde was chosen due to its maturity in the ecosystem and its support for serializing and deserializing CBOR via the ciborium crate.
As outlined in Appendix E. of the standard, only the JSON data model subset of CBOR can be used for validation. The limited prelude from the spec has been included below for brevity:
```cddl
any = #
uint = #0
nint = #1
int = uint / nint
tstr = #3
text = tstr
number = int / float
float16 = #7.25
float32 = #7.26
float64 = #7.27
float16-32 = float16 / float32
float32-64 = float32 / float64
float = float16-32 / float64
false = #7.20
true = #7.21
bool = false / true
nil = #7.22
null = nil
```
Furthermore, the following data types from the standard prelude can be used for validating JSON strings and numbers:
```cddl
tdate = #6.0(tstr)
uri = #6.32(tstr)
b64url = #6.33(tstr)
time = #6.1(number)
```
The first non-group rule defined by a CDDL data structure definition determines the root type, which is subsequently used for validating the top-level JSON data type.
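The root-type rule above can be pictured as a scan over the rules in definition order that skips group rules. The sketch below uses a simplified, hypothetical `Rule` enum as a stand-in; the crate's real AST types are richer and are not shown here.

```rust
/// Simplified stand-in for a parsed rule; the crate's AST is richer.
#[derive(Debug, PartialEq)]
enum Rule<'a> {
    /// A type rule, identified by its name.
    Type(&'a str),
    /// A group rule, identified by its name.
    Group(&'a str),
}

/// Return the name of the first type (non-group) rule, if any.
/// Hypothetical helper illustrating root type selection.
fn root_rule<'a>(rules: &[Rule<'a>]) -> Option<&'a str> {
    rules.iter().find_map(|rule| match rule {
        Rule::Type(name) => Some(*name),
        Rule::Group(_) => None,
    })
}
```

In practice this means the type that should describe the whole document is best placed before any other type rule in the CDDL file.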
The following types and features of CDDL are supported by this crate for validating JSON:
| CDDL | JSON |
|---|---|
| structs | objects |
| arrays | arrays<sup>1</sup> |
| `text` / `tstr` | string |
| `uri` | string (valid RFC3986 URI) |
| `tdate` | string (valid RFC3339 date/time) |
| `b64url` | string (base64url-encoded) |
| `time` | number (valid UNIX timestamp integer in seconds) |
| `number` / `int` / `float` | number<sup>2</sup> |
| `bool` / `true` / `false` | boolean |
| `null` / `nil` | null |
| `any` | any valid JSON |
| byte strings | not yet implemented |
| unwrap (`~`) | any JSON that matches unwrapped type from map, array or tag |
CDDL groups, generics, sockets/plugs and group-to-choice enumerations can all be used when validating JSON.
Since JSON objects only support keys whose types are JSON strings, when validating JSON, member keys defined in CDDL structs must use either the colon syntax ( mykey: tstr or "mykey": tstr ) or the double arrow syntax provided that the member key is either a text string value ( "mykey" => tstr ) or a bareword that resolves to either a string data type ( text or tstr ) or another text string value ( * tstr => any ).
Occurrence indicators can be used to validate key/value pairs in a JSON object and the number of elements in a JSON array, depending on how the indicators are defined in a CDDL data definition.
Below is the table of supported control operators:
| Control operator | Supported |
|---|---|
| `.pcre` | ✔️<sup>3</sup> |
| `.regex` | ✔️<sup>3</sup> (alias for `.pcre`) |
| `.size` | ✔️ |
| `.bits` | Ignored when validating JSON |
| `.cbor` | Ignored when validating JSON |
| `.cborseq` | Ignored when validating JSON |
| `.within` | ✔️ |
| `.and` | ✔️ |
| `.lt` | ✔️ |
| `.le` | ✔️ |
| `.gt` | ✔️ |
| `.ge` | ✔️ |
| `.eq` | ✔️ |
| `.ne` | ✔️ |
| `.default` | ✔️ |
1: When groups with multiple group entries are used to validate arrays, occurrence indicators are "greedy": only the first occurrence indicator encountered is used in the validation. Subsequent entries with occurrence indicators are ignored because of the complexity of resolving these ambiguities. For proper JSON validation, avoid writing CDDL that looks like the following: `[ * a: int, b: tstr, ? c: int ]`.
2: While JSON itself does not distinguish between integers and floating-point numbers, this crate does provide the ability to validate numbers against a more specific numerical CBOR type, provided that its equivalent representation is allowed by JSON. Refer to Appendix E. of the standard for more details on the implications of using CDDL with JSON numbers.
3: Due to Perl-Compatible Regular Expressions (PCREs) being more widely used than XSD regular expressions, this crate also provides support for the .pcre control operator from the CDDL Feature Freezer. When the freezer feature is enabled, .pcre uses the fancy-regex crate for full PCRE2 support (lookahead, lookbehind, backreferences). Patterns are anchored on both sides per the spec. When freezer is disabled, .pcre falls back to the same XSD regex engine as .regexp.
If you've enabled the additional-controls feature, the control operators from RFC 9165 below are also available for use:
| Control operator | Supported |
|---|---|
| `.plus` | ✔️ |
| `.cat` | ✔️ |
| `.det` | ✔️ |
| `.abnf` | ✔️ |
| `.abnfb` | Ignored when validating JSON |
| `.feature` | ✔️ |
The text content control operators from RFC 9741 are also available:
| Control operator | Supported |
|---|---|
| `.b64u` | ✔️ Validates base64url-encoded text against byte string controller |
| `.b64u-sloppy` | ✔️ Lenient base64url decoding (tolerates non-zero trailing bits and padding) |
| `.b64c` | ✔️ Validates base64 classic-encoded text against byte string controller |
| `.b64c-sloppy` | ✔️ Lenient base64 classic decoding |
| `.hex` | ✔️ Validates hex-encoded text (case-insensitive) |
| `.hexlc` | ✔️ Validates lowercase hex-encoded text |
| `.hexuc` | ✔️ Validates uppercase hex-encoded text |
| `.b32` | ✔️ Validates base32-encoded text |
| `.h32` | ✔️ Validates base32hex-encoded text |
| `.b45` | ✔️ Validates base45-encoded text |
| `.base10` | ✔️ Validates decimal integer text representation |
| `.printf` | ✔️ Validates text against printf-style format string with arguments |
| `.json` | ✔️ Validates text as JSON matching a CDDL type |
| `.join` | ✔️ Validates text as concatenation of array element values |
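To make the difference between `.hex`, `.hexlc`, and `.hexuc` in the table above concrete, here is a minimal standard-library sketch that decodes hex text under a case policy and compares the result with a byte string controller. `HexCase`, `decode_hex`, and `matches_hex` are hypothetical names for this example only; the crate's own validation code is not shown and may differ.

```rust
/// Which letter case the hex digits may use (illustrative only).
#[derive(Clone, Copy)]
enum HexCase {
    Any,   // corresponds to .hex
    Lower, // corresponds to .hexlc
    Upper, // corresponds to .hexuc
}

/// Decode hex text to bytes, or return None if the text is not valid
/// hex for the requested case policy.
fn decode_hex(text: &str, case: HexCase) -> Option<Vec<u8>> {
    let bytes = text.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    // Convert one ASCII hex digit to its value, honoring the case policy.
    let digit = |b: u8| -> Option<u8> {
        match (b, case) {
            (b'0'..=b'9', _) => Some(b - b'0'),
            (b'a'..=b'f', HexCase::Any | HexCase::Lower) => Some(b - b'a' + 10),
            (b'A'..=b'F', HexCase::Any | HexCase::Upper) => Some(b - b'A' + 10),
            _ => None,
        }
    };
    bytes
        .chunks(2)
        .map(|pair| Some(digit(pair[0])? << 4 | digit(pair[1])?))
        .collect()
}

/// Check hex text against a byte string controller.
fn matches_hex(text: &str, controller: &[u8], case: HexCase) -> bool {
    decode_hex(text, case).as_deref() == Some(controller)
}
```

For instance, the text `48656c6c6f` decodes to the bytes of `Hello` under the lowercase policy, while `48656C6C6F` is rejected by it and accepted by the uppercase policy.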
If you've enabled the freezer feature, the control operators from the CDDL Feature Freezer draft are also available:
| Control operator | Supported |
|---|---|
| `.pcre` | ✔️ PCRE2 regex via `fancy-regex` (lookahead, lookbehind, backreferences) |
| `.iregexp` | ✔️ RFC 9485 interoperable regular expressions |
| `.bitfield` | ✔️ Bitfield validation for unsigned integers (CBOR only) |
You can activate features during validation as follows:
```rust
use cddl::validate_json_from_str;

let cddl = r#"
v = JC<"v", 2>
JC<J, C> = C .feature "cbor" / J .feature "json"
"#;
let json = r#""v""#;
assert!(validate_json_from_str(cddl, json, Some(&["json"])).is_ok())
```

```rust
use cddl::validate_csv_from_str;

let cddl = r#"person-file = [*person-record]
person-record = [name: text, age: uint]"#;
let csv = "Alice,30\nBob,25\n";
assert!(validate_csv_from_str(cddl, csv, None, None).is_ok())
```

This crate implements CSV validation as described in draft-bormann-cbor-cddl-csv-08. CSV data is parsed according to RFC 4180 and mapped to the CDDL generic data model:
```cddl
csv = [?header, *record]
header = [+header-field]
record = [+field]
header-field = text
field = text
```
Each row becomes an array of fields, and the entire CSV becomes an array of rows. Fields are text strings by default, but the validator coerces them to their JSON representation when the CDDL schema specifies application-level types such as `uint`, `int`, or `float`.
The has_header parameter controls whether the first row is treated as a header row (kept as text strings without numeric coercion). When using the CLI, pass --csv-header to enable this.
The first non-group rule in the CDDL definition determines the root type for validation, which should describe the entire CSV as an array of records.
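The mapping described above (rows become arrays, fields are text unless the schema asks for a numeric type, and a header row stays as text) can be sketched as follows. This is a simplified illustration under stated assumptions: `ColumnType`, `Field`, `coerce`, and `parse_csv` are hypothetical names, the column types are supplied by hand rather than derived from CDDL, and RFC 4180 quoting is not handled.

```rust
/// Column types a schema might request; a simplified stand-in for CDDL types.
#[derive(Clone, Copy)]
enum ColumnType {
    Text,
    Uint,
    Float,
}

/// A coerced CSV field.
#[derive(Debug, PartialEq)]
enum Field {
    Text(String),
    Uint(u64),
    Float(f64),
}

/// Coerce one raw field; None means the field does not fit the column type.
fn coerce(raw: &str, ty: ColumnType) -> Option<Field> {
    match ty {
        ColumnType::Text => Some(Field::Text(raw.to_string())),
        ColumnType::Uint => raw.parse::<u64>().ok().map(Field::Uint),
        ColumnType::Float => raw.parse::<f64>().ok().map(Field::Float),
    }
}

/// Split unquoted CSV into rows and coerce every record. The header row,
/// if present, is kept as text. Quoting per RFC 4180 is NOT handled here.
fn parse_csv(csv: &str, columns: &[ColumnType], has_header: bool) -> Option<Vec<Vec<Field>>> {
    let mut rows = Vec::new();
    for (i, line) in csv.lines().enumerate() {
        let raw: Vec<&str> = line.split(',').collect();
        let row: Vec<Field> = if has_header && i == 0 {
            // Header fields are never coerced to numbers.
            raw.iter().map(|f| Field::Text(f.to_string())).collect()
        } else {
            if raw.len() != columns.len() {
                return None;
            }
            raw.iter()
                .zip(columns)
                .map(|(f, ty)| coerce(f, *ty))
                .collect::<Option<Vec<Field>>>()?
        };
        rows.push(row);
    }
    Some(rows)
}
```

Applied to `"Alice,30\nBob,25\n"` with a text column followed by a uint column, the sketch yields two rows whose second field is an unsigned integer, while a row such as `Alice,thirty` is rejected.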
CDDL, JSON schema and JSON schema language can all be used to define JSON data structures. However, the approaches taken to develop each of these are vastly different. A good place to find past discussions on the differences between these formats is the IETF mail archive, specifically in the JSON and CBOR lists. The purpose of this crate is not to argue for the use of CDDL over any one of these formats, but simply to provide an example implementation in Rust.
```rust
use cddl::validate_cbor_from_slice;

let cddl = r#"rule = false"#;
let cbor = b"\xF4";
assert!(validate_cbor_from_slice(cddl, cbor).is_ok())
```

This crate also uses Serde and ciborium for validating CBOR data structures. CBOR validation is done via the loosely typed `ciborium::value::Value` enum. In addition to all of the same features implemented by the JSON validator, this crate also supports validating CBOR tags (e.g. `#6.32(tstr)`), CBOR major types (e.g. `#1.2`), table types (e.g. `{ [ + tstr ] => int }`) and byte strings. The `.bits`, `.cbor` and `.cborseq` control operators are all supported as well.
The following tags are supported when validating CBOR:
| Tag | Supported |
|---|---|
| `tdate = #6.0(tstr)` | ✔️ |
| `time = #6.1(number)` | ✔️ |
| `biguint = #6.2(bstr)` | ✔️ |
| `bignint = #6.3(bstr)` | ✔️ |
| `decfrac = #6.4([e10: int, m: integer])` | not yet implemented |
| `bigfloat = #6.5([e2: int, m: integer])` | not yet implemented |
| `eb64url = #6.21(any)` | ✔️ |
| `eb64legacy = #6.22(any)` | ✔️ |
| `eb16 = #6.23(any)` | ✔️ |
| `encoded-cbor = #6.24(bstr)` | ✔️ |
| `uri = #6.32(tstr)` | ✔️ |
| `b64url = #6.33(tstr)` | ✔️ |
| `b64legacy = #6.34(tstr)` | ✔️ |
| `regexp = #6.35(tstr)` | ✔️ |
| `mime-message = #6.36(tstr)` | ✔️ |
| `cbor-any = #6.55799(any)` | ✔️ |
If you've enabled the additional-controls feature, the control operators from RFC 9165 below are also available for use:
| Control operator | Supported |
|---|---|
| `.plus` | ✔️ |
| `.cat` | ✔️ |
| `.det` | ✔️ |
| `.abnf` | ✔️ |
| `.abnfb` | ✔️ |
| `.feature` | ✔️ |
The text content control operators from RFC 9741 are also available for CBOR validation:
| Control operator | Supported |
|---|---|
| `.b64u` / `.b64u-sloppy` | ✔️ |
| `.b64c` / `.b64c-sloppy` | ✔️ |
| `.hex` / `.hexlc` / `.hexuc` | ✔️ |
| `.b32` / `.h32` | ✔️ |
| `.b45` | ✔️ |
| `.base10` | ✔️ |
| `.printf` | ✔️ |
| `.json` | ✔️ |
| `.join` | ✔️ |
If you've enabled the freezer feature, the CDDL Feature Freezer operators are also available:
| Control operator | Supported |
|---|---|
| `.pcre` | ✔️ PCRE2 regex via `fancy-regex` |
| `.iregexp` | ✔️ RFC 9485 interoperable regular expressions |
| `.bitfield` | ✔️ Bitfield width validation for uints |
You can activate features during validation by passing a slice of feature strings as follows:
```rust
use cddl::validate_cbor_from_slice;

let cddl = r#"
v = JC<"v", 2>
JC<J, C> = C .feature "cbor" / J .feature "json"
"#;
let cbor = b"\x02";
assert!(validate_cbor_from_slice(cddl, cbor, Some(&["cbor"])).is_ok())
```

The lexer and parser can be used in a `no_std` context (including bare-metal targets such as `thumbv7em-none-eabihf`) provided that a heap allocator is available. This is enabled by opting out of the default features in your Cargo.toml file:
```toml
[dependencies]
cddl = { version = "0.10.4", default-features = false }
```

When the `std` feature is disabled, the crate declares `#![no_std]` and relies on `core` and `alloc` only. All std-only dependencies (codespan-reporting, hexf-parse, regex, simplelog, log, pest_vm, pest_meta, abnf_to_pest) are gated behind the `std` feature flag and will not be compiled.
Zero-copy parsing is implemented to the extent that is possible. Allocation is required for error handling, diagnostics, and AST construction.
Note: hexadecimal float literals (hexfloat) are not supported in no_std mode since they depend on the hexf-parse crate which requires std.
JSON, CBOR, and CSV validation depend on their respective heap-allocated Value types. Since these types aren't supported in a no_std context, validation isn't supported by this crate in no_std either.