The cerberus-cpp documentation¶
Getting Started¶
Installation¶
Cerberus-cpp is header-only, so it should be fairly easy to get up and running. It requires the following software to be available:
A C++14-compliant C++ compiler
CMake >= 3.11
The yaml-cpp library, version >= 0.6
git
The easiest way to get yaml-cpp is:
On Debian, Ubuntu:
sudo apt install libyaml-cpp-dev
On macOS:
brew install yaml-cpp
With these prerequisites met, cerberus-cpp is installed like any other CMake project, e.g.
git clone https://github.com/dokempf/cerberus-cpp.git
cd cerberus-cpp
mkdir build
cd build
cmake ..
make
make install
Usage example¶
This is the most basic usage example that validates a given document against a schema.
#include <cerberus-cpp/validator.hh>
#include <yaml-cpp/yaml.h>
#include <iostream>
#include <string>

int main()
{
  // The schema is loaded from inline YAML.
  YAML::Node schema = YAML::Load(
    "answer: \n"
    " type: integer \n"
    " default: 42 \n"
    "question: \n"
    " type: string \n"
  );

  // The document is constructed programmatically.
  YAML::Node document;
  document["question"] = "What is 6x9?";

  cerberus::Validator validator(schema);
  if (validator.validate(document))
  {
    // The normalized document also contains the default for "answer".
    YAML::Node doc = validator.getDocument();
    std::cout << doc["question"].as<std::string>() << " " << doc["answer"].as<int>() << std::endl;
  }
  else
    std::cerr << validator << std::endl;

  return 0;
}
As you can see, both the schema and the document are defined using the YAML::Node
data structure. In order to work with cerberus-cpp, it is important to be familiar with
the basic usage of yaml-cpp, as described e.g. in the yaml-cpp documentation.
In the above example, we load the schema from inline YAML with YAML::Load,
while we construct the document programmatically. Often, your document will of course come
from user input, e.g. by loading it from disk with YAML::LoadFile.
Basic Usage¶
Validation¶
The most important component provided by cerberus-cpp is the cerberus::Validator class.
An instance of this validator is given a schema and a document. The document is then validated
against this schema using the validate method - returning a boolean value indicating success.
If the validation process fails, the errors can be written to a stream using the validator’s
printErrors method, or more conveniently by passing the validator itself into the stream:
cerberus::Validator validator;
if (!validator.validate(document, schema))
  std::cerr << validator << std::endl;
The schema and the document are both provided as instances of YAML::Node. Using the same
data structure for schemas and documents is considered a feature of cerberus-cpp. A tutorial
on how to construct these documents from YAML files, from inline strings or programmatically
can be found in the yaml-cpp documentation.
Both schemas and documents are always expected to be mappings.
The validator class has the following configurable validation policies:
whether or not unknown fields in the document (fields that do not appear in the schema) should make the validation process fail (can be toggled using the setAllowUnknown(value) method). The default is false.
whether or not all fields are considered to be required fields (can be toggled using the setRequireAll(value) method). The default is false.
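The effect of the two policies can be sketched without the library. The following is only an illustration of the semantics described above, not cerberus-cpp's implementation: documents and schemas are reduced to sets of field names, and the helper names unknownFields, missingFields and passesPolicies are hypothetical. Individual required rules are ignored in this sketch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: fields that appear in the document but not in the schema.
std::vector<std::string> unknownFields(const std::set<std::string>& documentKeys,
                                       const std::set<std::string>& schemaKeys)
{
  std::vector<std::string> result;
  for (const auto& key : documentKeys)
    if (schemaKeys.count(key) == 0)
      result.push_back(key);
  return result;
}

// Hypothetical helper: fields that appear in the schema but not in the document.
std::vector<std::string> missingFields(const std::set<std::string>& documentKeys,
                                       const std::set<std::string>& schemaKeys)
{
  std::vector<std::string> result;
  for (const auto& key : schemaKeys)
    if (documentKeys.count(key) == 0)
      result.push_back(key);
  return result;
}

// Unknown fields fail unless allowUnknown is set;
// missing fields fail only if requireAll is set.
bool passesPolicies(const std::set<std::string>& documentKeys,
                    const std::set<std::string>& schemaKeys,
                    bool allowUnknown, bool requireAll)
{
  if (!allowUnknown && !unknownFields(documentKeys, schemaKeys).empty())
    return false;
  if (requireAll && !missingFields(documentKeys, schemaKeys).empty())
    return false;
  return true;
}
```

With both policies left at false, a document that only contains question passes against a schema with answer and question, while a document with an additional field does not.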
In the following, we provide a summary of the validation rules available in cerberus-cpp. As cerberus-cpp aims for compatibility with the Python package cerberus, you can read more on the semantics of these rules in the Cerberus Validation Rule Documentation. For inconsistencies between the Python package and cerberus-cpp, see Compatibility with cerberus. This is the list of implemented validation rules, in order of relevance for most applications:
type specifies the expected type for this field. Possible values are integer, float, string, boolean, list, dict or any identifier of a custom type (see Custom Validation Rules).
required forces the existence of the field in the document.
schema specifies a schema for a submapping or a sublist.
allowed specifies the list of allowed values for the field.
forbidden, in contrast, specifies a list of forbidden values for the field.
min and max specify a minimum and maximum for the value and require type to be set to something that allows comparison.
regex matches the field’s value against the given regular expression.
keysrules and valuesrules apply rules only to the keys or the values of a submapping, respectively.
minlength and maxlength constrain the allowed length of a list or mapping.
items specifies a list of schemas that the entries of a list need to fulfill.
contains forces the existence of a value within a list.
dependencies forces the existence of another field, if the given field is present.
excludes forces the absence of another field, if the given field is present.
nullable specifies whether the field accepts a null value.
allow_unknown changes the validator’s policy regarding unknown values, e.g. for a submapping.
require_all changes the validator’s policy regarding requiring all fields, e.g. for a submapping.
Normalization¶
Cerberus-cpp can not only perform validation, but also modify the document according to
normalization rules. Cerberus-cpp does not do this in-place. Instead you need to use the validator’s getDocument() method to
access the normalized document.
YAML::Node schema = YAML::Load(
  "name: \n"
  " type: string \n"
  " default: John Doe \n"
  " rename: user \n"
);
cerberus::Validator validator(schema);
if (validator.validate(YAML::Node()))
  std::cout << "The normalized document: " << validator.getDocument() << std::endl;
else
  std::cerr << validator << std::endl;
The normalized output document of the above example is user: John Doe.
Additionally, the validator class has a configurable policy whether or not unknown fields
should be purged from the normalized document (can be toggled using the setPurgeUnknown(value) method).
The default is false.
This is a list of normalization rules available in cerberus-cpp:
default provides the field’s default value.
rename renames a given field in the normalized document.
purge_unknown changes the validator’s policy regarding purging unknown fields, e.g. for a submapping.
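To make the interplay of these rules concrete, here is a library-free sketch that applies default, rename and purging to a flat string map. It is an assumption-laden illustration rather than cerberus-cpp's implementation: FieldRules, Document, Schema and normalize are hypothetical names, and nested mappings are not handled. Like the validator, it leaves the input untouched and returns a normalized copy.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the normalization rules of one schema field.
struct FieldRules {
  bool hasDefault = false;
  std::string defaultValue;
  std::string renameTo;  // empty means "do not rename"
};

using Document = std::map<std::string, std::string>;
using Schema = std::map<std::string, FieldRules>;

// Returns a normalized copy of the input; the input itself is not modified.
Document normalize(const Document& input, const Schema& schema, bool purgeUnknown)
{
  Document output;

  // Copy known fields; copy unknown fields only if they are not purged.
  for (const auto& entry : input)
    if (schema.count(entry.first) != 0 || !purgeUnknown)
      output[entry.first] = entry.second;

  for (const auto& item : schema) {
    const std::string& name = item.first;
    const FieldRules& rules = item.second;

    // Fill in the default if the field is absent.
    if (output.count(name) == 0 && rules.hasDefault)
      output[name] = rules.defaultValue;

    // Move the value to its new key if a rename is requested.
    if (!rules.renameTo.empty() && rules.renameTo != name && output.count(name) != 0) {
      output[rules.renameTo] = output[name];
      output.erase(name);
    }
  }
  return output;
}
```

Applied to an empty document and a schema whose name field has the default John Doe and is renamed to user, this sketch yields a single entry user with the value John Doe, matching the example above.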
Advanced Usage¶
This section is only relevant to users who seek to enhance the capabilities of
cerberus-cpp by e.g. providing custom rules and types. All customizations described
in this documentation operate on instances of cerberus::Validator. You may
also apply these in the constructor of a derived class.
Custom Validation Rules¶
Custom validation rules can be registered on instances of cerberus::Validator.
This is an example that registers a custom rule oddity that only accepts odd
integer values:
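cerberus::Validator validator;
validator.registerRule(
  YAML::Load(
    "oddity: \n"
    "  type: boolean \n"
    "  dependencies: \n"
    "    type: integer \n"
  ),
  [](auto& v) {
    // Nothing to check if the field is absent from the document.
    if (!v.getDocument().IsDefined())
      return;
    // Compare parity as booleans: value % 2 is -1 for negative odd numbers.
    const bool isOdd = v.getDocument().template as<int>() % 2 != 0;
    if (isOdd != v.getSchema().template as<bool>())
      v.raiseError("oddity-Rule violated!");
  }
);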
The first argument defines a schema that is used to validate the rule in
user-provided schemas (a meta-schema, so to speak). On the one hand, this defines the name
of the rule (here: oddity); on the other hand, it rules out misuse (e.g.
providing oddity: 42, where only bool arguments are allowed). You can
use all available schema rules, though typically only the name is required. Here, we
additionally require the field's type rule to be set to integer by adding a
dependencies rule.
The second argument is expected to be a templated callable (here: a generic lambda) that implements the rule. The only argument is typically a reference to an instance of the ValidationRuleInterface API, although the type is accepted as a template parameter to integrate well with custom derived validator classes. In our example, only the most relevant methods of the ValidationRuleInterface API are used:
getDocument() gives the YAML::Node that describes the document snippet that is currently validated.
getSchema() provides the YAML::Node that describes the schema snippet for this validation.
raiseError() reports a validation error.
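The comparison at the heart of a rule such as oddity can be tried out without a validator. The sketch below uses hypothetical helper names (isOdd, oddityHolds), with value standing in for the document snippet and expectOdd for the rule's schema argument. One detail worth keeping in mind: in C++, -3 % 2 evaluates to -1, so parity is best tested against zero rather than compared with a boolean directly.

```cpp
#include <cassert>

// True for odd values, including negative ones (-3 % 2 is -1, which is != 0).
bool isOdd(int value)
{
  return value % 2 != 0;
}

// Hypothetical stand-in for the rule body: `expectOdd` plays the role of
// the schema argument (oddity: true or oddity: false).
bool oddityHolds(int value, bool expectOdd)
{
  return isOdd(value) == expectOdd;
}
```

In a real rule, a false result from such a check would be the point at which raiseError() is called.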
Some rules need to be applied before or after certain other rules in order to
implement the correct semantics. Cerberus-cpp gives control over this by providing
a number of hooks at which rules execute. The hook at which a custom rule executes can
be controlled by passing a third argument. The possible values are given by the
enumeration cerberus::RulePriority.
Custom Types¶
By default, cerberus-cpp supports integers, floating point types, strings, boolean values, as well as sequences and mappings. You can, however, provide custom types as well. We illustrate this by implementing a simple date type that only stores a year. While this could of course be achieved with an integer as well, we use it to illustrate how a custom class is validated.
struct SimpleDate {
  int year;

  bool operator==(const SimpleDate& other) const
  {
    return year == other.year;
  }

  bool operator<(const SimpleDate& other) const
  {
    return year < other.year;
  }
};
This simple implementation at the same time documents the minimum requirement on the interface of eligible C++ types:
operator== and operator< need to be defined.
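Why these two operators suffice can be sketched as follows: bound checks in the style of min and max need only operator<, and membership checks in the style of allowed need only operator==. The helpers withinBounds and isAllowed are hypothetical and only illustrate the idea; they are not the library's implementation.

```cpp
#include <cassert>
#include <vector>

struct SimpleDate {
  int year;
  bool operator==(const SimpleDate& other) const { return year == other.year; }
  bool operator<(const SimpleDate& other) const { return year < other.year; }
};

// Inclusive bound check written with operator< only.
template<typename T>
bool withinBounds(const T& value, const T& min, const T& max)
{
  return !(value < min) && !(max < value);
}

// Membership check written with operator== only.
template<typename T>
bool isAllowed(const T& value, const std::vector<T>& allowed)
{
  for (const auto& candidate : allowed)
    if (candidate == value)
      return true;
  return false;
}
```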
On top of that, yaml-cpp's (de)serialization needs to be implemented for this type according to the
yaml-cpp guide:
namespace YAML {

  template<>
  struct convert<SimpleDate>
  {
    static Node encode(const SimpleDate& rhs)
    {
      return Node(rhs.year);
    }

    static bool decode(const Node& node, SimpleDate& rhs)
    {
      return convert<int>::decode(node, rhs.year);
    }
  };

}
Registration of the new type with a given validator is then as simple as this:
cerberus::Validator validator;
validator.registerType<SimpleDate>("date");
Now, this type can be referenced with a rule type: date and validation will fail if the
YAML deserialization of the input fails.
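The contract behind this is that decoding reports success or failure through its boolean return value. The following library-free sketch mimics that contract for a year given as text. The function decodeYear is hypothetical and parses with a string stream instead of yaml-cpp, so it only illustrates the pattern of signalling a failed conversion.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct SimpleDate {
  int year;
};

// Hypothetical decoder in the spirit of convert<SimpleDate>::decode:
// returns false and leaves `out` untouched if `text` is not a plain integer.
bool decodeYear(const std::string& text, SimpleDate& out)
{
  std::istringstream in(text);
  int year = 0;
  if (!(in >> year))
    return false;  // no integer at the start of the input
  char leftover;
  if (in >> leftover)
    return false;  // trailing characters after the integer
  out.year = year;
  return true;
}
```

A false return value corresponds to the case in which validation against type: date would fail.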
Schema Registration¶
If you intend to reuse schemas a lot, you can also have a validator instance store them by
using the registerSchema method. Later on, a registered schema can be referenced either by
passing its name as a string to validate instead of a schema, or
by giving the name as the value of a schema validation rule.
cerberus::Validator validator;
validator.registerSchema("user", schema);
if (!validator.validate(document, "user"))
  std::cerr << validator << std::endl;
Compatibility with cerberus¶
Cerberus-cpp tries to be compatible with the Python package cerberus. In reality, some inconsistencies exist. If you have a use case where cerberus-cpp differs from cerberus that cannot be explained by one of the following reasons please open an issue attaching YAML files with schema and data:
Several validation rules require the type rule to be present as well. These are the rules that require equality or comparison to be implemented, e.g.:
min and max
allowed
Your safest bet is to always define the type rule.
The allowed rule does not validate iterables, because that would lead to conflicting semantics of the type field.
The contains rule currently has no access to the item type information. It currently assumes string values, although it could be changed to inspect a given schema rule for type information - I am not sure yet I want to go that route.
Some of the types built into cerberus are hard to implement in C++ and are therefore omitted from the library. If you need these, register a custom type and choose the correct C++ data structure yourself. These are:
date and datetime: With C++ lacking standardization of these types and completely missing a parser for such types, it would be unwise to implement these.
binary: There is no sensible C++ equivalent of a Python bytes object, so it seems wise to skip this one.
set: This type seems to be inaccessible when starting from serialized YAML. I am currently not planning to add this.
The regex rule is not guaranteed to accept exactly the same dialect of regular expressions as in the Python package. Currently, the C++ implementation uses plain std::regex. Maybe this can be fixed by picking the correct grammar for std::regex.
The following rules are currently considered a won’t fix for one reason or the other:
allof, anyof, noneof, oneof: These rules are a major headache to implement. Yet, the cerberus documentation actively warns users that the need for such rules hints at a design flaw. Also, these rules disable normalization. Currently, I would rather opt for not implementing these rules at all.
readonly: Just from reading the documentation, I get neither the semantics nor the use case for this rule. So, I am omitting it until somebody urges me to implement it.
check_with: In the context of cerberus-cpp, I fail to see how this rule differs from applying a custom rule, which you should do in that case.
coerce: Similarly to check_with, a custom coercer is not really different from a custom normalization rule. I might add a coerce rule later for compatibility with Python cerberus though.
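Regarding the remark on std::regex grammars above: the standard library lets the caller choose a grammar when a pattern is compiled, with ECMAScript being the default. The sketch below shows how a pattern could be tested under different grammars. The helper matchesWith is hypothetical, and whether any of these grammars matches Python's dialect closely enough is exactly the open question mentioned above.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Compiles `pattern` with the given grammar and requires the whole of
// `text` to match (std::regex_match does not accept partial matches).
bool matchesWith(const std::string& pattern, const std::string& text,
                 std::regex::flag_type grammar = std::regex::ECMAScript)
{
  const std::regex compiled(pattern, grammar);
  return std::regex_match(text, compiled);
}
```

For example, matchesWith("[a-z]+[0-9]", "abc1", std::regex::extended) compiles the pattern with the POSIX extended grammar instead of the default.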
API documentation¶
Cerberus-cpp has two core APIs:
The Validator API is the end user interface that is used when validating data against schemas.
The ValidationRuleInterface API is the interface used when developing custom rules. It gives access to the internal state of the validation process that is necessary to implement custom validation logic.
If you do not intend to implement custom rules, there is no need to understand the latter.
Validator API¶
ValidationRuleInterface API¶
Contributing¶
Cerberus-cpp welcomes contributions. Before contributing, please read the following guidelines:
If you have a use case that does not work in cerberus-cpp, but does work in the Python package cerberus, please open a bug report and attach YAML files with a schema and some data. Ideally, the file follows the syntax that the cerberus-cpp tests use (see the test/testdata.yml file).
Bear in mind that cerberus-cpp tries to stay compatible with the Python package cerberus. Pull requests that increase incompatibilities will not be considered, while pull requests that remove them are highly welcome.
If you are implementing a custom rule and you need to extend the ValidationRuleInterface API, please provide a description of your use case, so that we can better discuss the interface design.
When opening a pull request against the cerberus-cpp repository, please add your name to COPYING.md as well.