Google’s Protocol Buffers is a structured way to serialize data.
Think JSON, on steroids.
We are going to discuss a testing technique: filling Protobuf messages with random data, using reflection and Boost.Random.
Extending Protobuf: reflection
Protobuf, C++ and Boost
This article tackles advanced Protobuf topics, so you should be comfortable with the basics before reading it.
We're going to use the C++ library to implement a simple class that fills Protobuf messages with random data.
Boost.Random is used to generate the values.
The code in action
The techniques mentioned here are used within my project ProfaneDB for testing purposes. profanedb::util::RandomGenerator can be checked for reference.
It can also be seen in action in these unit tests.
Reflection
The idea behind reflection (without referring to Protobuf in particular) is for code to be able to inspect and manipulate other code that it has no knowledge of at compile time.
For Protobuf, this means reading and writing messages whose types the calling code does not know at compile time, including messages that were never compiled with `protoc`.
Generation of values
We are first going to look at how values are generated using Boost.Random.
For our purposes, we’ll need to be able to generate random values for the following types:
C++ type | Protobuf types | FieldDescriptor::CPPTYPE |
---|---|---|
`google::protobuf::int32` | `int32`, `sint32`, `sfixed32` | `CPPTYPE_INT32` |
`google::protobuf::int64` | `int64`, `sint64`, `sfixed64` | `CPPTYPE_INT64` |
`google::protobuf::uint32` | `uint32`, `fixed32` | `CPPTYPE_UINT32` |
`google::protobuf::uint64` | `uint64`, `fixed64` | `CPPTYPE_UINT64` |
`double` | `double` | `CPPTYPE_DOUBLE` |
`float` | `float` | `CPPTYPE_FLOAT` |
`bool` | `bool` | `CPPTYPE_BOOL` |
`google::protobuf::string` (`std::string`) | `string`, `bytes` | `CPPTYPE_STRING` |
For integer values, we use `boost::random::uniform_int_distribution`.
In the code below, `TYPE` stands for each of the C++ integer types we need to support:
- google::protobuf::int32
- google::protobuf::int64
- google::protobuf::uint32
- google::protobuf::uint64
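```cpp
#include <limits>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int_distribution.hpp>

boost::random::mt19937 generator;

// Primary template; TYPE below stands for each integer type listed above.
template<typename T> T RandomValue();

template<>
TYPE RandomValue< TYPE >() {
    boost::random::uniform_int_distribution< TYPE > range(
        std::numeric_limits< TYPE >::min(),
        std::numeric_limits< TYPE >::max()
    );
    return range(generator);
}
```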
Let's see what happens here.
We define `RandomValue` as a function template, so that it can simply be called with the desired type to get a valid value.

`uniform_int_distribution` takes a template parameter that determines the type returned by its `operator()`, which draws from the `mt19937` Mersenne Twister generator as its source of randomness.
Its constructor takes two parameters, the minimum and the maximum value to return (both inclusive).
Here we simply use `std::numeric_limits`, which provides exactly these bounds for integer types.

This code can be seen here, with macros substituting `TYPE` with each of the required types at compile time.
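The macro substitution can be sketched roughly as follows. This is only an illustration, not the project's actual code: the macro name `DEFINE_RANDOM_INT` is hypothetical, the standard `<random>` header stands in for Boost.Random (the two interfaces closely mirror each other), and the fixed-width `<cstdint>` types stand in for the `google::protobuf` integer typedefs.

```cpp
#include <cstdint>
#include <limits>
#include <random>

// Standard-library stand-in for boost::random::mt19937 (assumption).
std::mt19937 generator;

// Primary template; only the specializations generated below are defined.
template<typename T>
T RandomValue();

// Hypothetical macro: expands to one explicit specialization per integer type.
#define DEFINE_RANDOM_INT(TYPE)                      \
    template<>                                       \
    TYPE RandomValue<TYPE>() {                       \
        std::uniform_int_distribution<TYPE> range(   \
            std::numeric_limits<TYPE>::min(),        \
            std::numeric_limits<TYPE>::max());       \
        return range(generator);                     \
    }

DEFINE_RANDOM_INT(std::int32_t)
DEFINE_RANDOM_INT(std::int64_t)
DEFINE_RANDOM_INT(std::uint32_t)
DEFINE_RANDOM_INT(std::uint64_t)

#undef DEFINE_RANDOM_INT
```

With this in place, `RandomValue<std::int64_t>()` returns a value drawn from the whole 64-bit signed range, and asking for an unsupported type fails at link time rather than silently compiling.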
The same procedure is repeated for `double` and `float`, using `boost::random::uniform_real_distribution`.
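For the floating-point types, a minimal sketch could look like this, again with the standard `<random>` header standing in for Boost.Random. One caveat when repeating the integer recipe literally: for floating-point types `std::numeric_limits<T>::min()` is the smallest positive normalized value, not the most negative one, so the range below only produces positive numbers. The bounds are an assumption made for this sketch; the original code may choose different ones.

```cpp
#include <limits>
#include <random>

// Standard-library stand-in for boost::random::mt19937 (assumption).
std::mt19937 generator;

template<typename T>
T RandomValue();

template<>
double RandomValue<double>() {
    // min() is the smallest positive normalized double, so only positive
    // values are generated. lowest()..max() is avoided on purpose: the
    // width of that interval is not representable as a double.
    std::uniform_real_distribution<double> range(
        std::numeric_limits<double>::min(),
        std::numeric_limits<double>::max());
    return range(generator);
}

template<>
float RandomValue<float>() {
    // Same bounds and same caveat as the double specialization.
    std::uniform_real_distribution<float> range(
        std::numeric_limits<float>::min(),
        std::numeric_limits<float>::max());
    return range(generator);
}
```

If negative values matter for the test at hand, one option is to pick a narrower symmetric range, such as `-1e6` to `1e6`, whose width is safely representable.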
For `string`, a random length is generated first (as an unsigned integer), and then that many random characters, each drawn from a fixed list of characters, are appended to the result.
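A sketch of such a string generator follows. The character list `kAlphabet`, the upper bound `kMaxLength` and the function name `RandomString` are all hypothetical choices made for this example, and the standard `<random>` header is used in place of Boost.Random.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Standard-library stand-in for boost::random::mt19937 (assumption).
std::mt19937 generator;

// Hypothetical character list and length limit.
const std::string kAlphabet =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
const std::size_t kMaxLength = 32;

std::string RandomString() {
    // First draw the length, then draw one character index per position.
    std::uniform_int_distribution<std::size_t> length_range(0, kMaxLength);
    std::uniform_int_distribution<std::size_t> index_range(0, kAlphabet.size() - 1);

    const std::size_t length = length_range(generator);
    std::string result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        result += kAlphabet[index_range(generator)];
    }
    return result;
}
```

Restricting the characters to a known list keeps the output valid for `string` fields, which are expected to hold UTF-8 text; `bytes` fields would accept arbitrary values.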
Finally, `bool` values are generated with `uniform_int_distribution` restricted to the numbers 0 and 1.
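The boolean case is short enough to sketch in full. As before, the standard `<random>` header stands in for Boost.Random. The distribution is instantiated with `int` rather than `bool`, since the standard distribution is only specified for the regular integer types.

```cpp
#include <random>

// Standard-library stand-in for boost::random::mt19937 (assumption).
std::mt19937 generator;

template<typename T>
T RandomValue();

template<>
bool RandomValue<bool>() {
    // Draw 0 or 1 with equal probability and map 1 to true.
    std::uniform_int_distribution<int> range(0, 1);
    return range(generator) == 1;
}
```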
Filling the messages
Now a single message can be filled with random values. Nested messages must also be filled recursively.
This is where reflection is needed.
First, the message's `Descriptor` is used to retrieve the list of fields.
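```cpp
// GetDescriptor() and field() both return pointers to const.
const Descriptor * descriptor = message->GetDescriptor();
for (int i = 0; i < descriptor->field_count(); i++) {
    const FieldDescriptor * fd = descriptor->field(i);
    // fd is then handled according to its C++ type
}
```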
Then, for each field, a random value is generated according to its C++ type, which can be retrieved using `FieldDescriptor::cpp_type()`.
For instance:
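```cpp
// GetReflection() returns a pointer to const.
const Reflection * reflection = message->GetReflection();
switch(fd->cpp_type()) {
    case FieldDescriptor::CPPTYPE_INT32:
        reflection->SetInt32(message, fd, RandomValue<google::protobuf::int32>());
        break;
    // one case for each of the other C++ types
}
```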
This is repeated for each C++ type, and also for `repeated` fields, where methods such as `Reflection::AddInt32` and `Reflection::AddString` are used.
All of this can be seen here.
If the given field is a nested message, the method is simply called recursively with a pointer to the mutable message, hence filling the whole message tree.
```cpp
const Reflection * reflection = message->GetReflection();
switch(fd->cpp_type()) {
    case FieldDescriptor::CPPTYPE_MESSAGE:
        // MutableMessage returns a pointer to the nested message, creating it if needed.
        this->FillRandomly(reflection->MutableMessage(message, fd));
        break;
}
```
Again, if the field is `repeated`, `Reflection::AddMessage` is used.