Spec: DDS-XTYPES
Document Number: ptc/2013-12-18
Revision Date: 2013-12-18
Version 1.2
Nature: Enhancement
Severity: Significant
Title: Deserialization issues with Extensible types
Problem description
The XTypes specification uses "traditional" CDR to represent extensible types on the wire.
The usage of "traditional" CDR encapsulation poses a problem in some scenarios where a DataReader with an extensible type "A" receives samples coming from a DataWriter with an extensible type "B" and where type "B" contains a subset of the members in type "A" (base-to-derived relationship). For example:
struct TypeB {
char member1;
};
struct TypeA {
char member1;
short member2;
};
At first sight, the usage of traditional CDR is enough to deal with the scenario described above:
The deserialize operation for type "A" will deserialize member1. After that, the function will try to deserialize member2 but it will find that there are no more bytes in the input stream. In the absence of a value for member2, the deserialize function will initialize this member with its default value 0.
So far so good.
The problem is that the previous algorithm works fine only when the number of bytes in the input stream (serialized sample published by the DataWriter) is exactly equal to the number of bytes produced by the serialization function for Type "B". In our example, this number is 1. Unfortunately, in some scenarios, including our example, the number of bytes in the input stream will include some padding bytes that are added to guarantee that the RTPS DATA sub-message containing the sample published by the DataWriter has a size divisible by 4.
The PSM in the RTPS specification requires that each sub-message is aligned on a 32-bit boundary with respect to the start of the RTPS message.
In our example, the deserialization function for Type "A" will receive a stream with 4 bytes (1 for the char in member1 and 3 bytes for padding). Because of that, the previous algorithm may end up initializing member2 with padding bytes. Since padding bytes may have a value other than 0, the member2 value will not necessarily be equal to zero.
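The failure can be reproduced with a minimal sketch. The function deserialize_type_a below is hypothetical and is not taken from any DDS implementation. It assumes big-endian CDR with the payload starting at offset 0, and it falls back to the default value only when the stream has no bytes left, as described above.

```python
import struct

def deserialize_type_a(payload: bytes):
    """Naive TypeA reader: char member1, then short member2 (big-endian CDR)."""
    member1 = payload[0:1]
    offset = 1
    offset += offset % 2              # align the short to a 2-byte boundary
    if offset + 2 <= len(payload):
        (member2,) = struct.unpack_from(">h", payload, offset)
    else:
        member2 = 0                   # no bytes left: use the default value
    return member1, member2

# TypeB sample exactly as serialized (1 byte): member2 gets its default.
exact = deserialize_type_a(b"x")

# Same sample after alignment added 3 padding bytes with arbitrary content:
# member2 is read from the last two padding bytes.
padded = deserialize_type_a(b"x" + b"\xAA\xBB\xCC")
```

With the 1-byte stream member2 is 0, while with the 4-byte stream it takes whatever value the padding bytes happen to contain.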
Affected types
The previous issue may occur when all the following conditions are met:
- The top-level types used by DataReader and DataWriter are extensible.
- The DataReader type is assignable from the DataWriter type and it contains more fields at the end (see example above).
- The last primitive member on the DataWriter type is: char, octet, boolean, short, unsigned short, or string. Other types are not a problem because they require 4-byte alignment and the middleware will not have to add padding bytes at the end of the DATA message.
- The first primitive member on the DataReader type after the last primitive member on the DataWriter type is: char, octet, boolean, short, unsigned short. Other types are not a problem because they require 4-byte alignment. The deserialization function will not use the padding bytes at the end of the DataWriter's sample to initialize the contents of the first primitive member on the DataReader type.
For example:
struct TypeB {
char member1;
};
struct TypeA {
char member1;
short member2;
};
The previous two types will be problematic.
struct TypeB {
long member1;
};
struct TypeA {
long member1;
short member2;
};
The previous two types will be OK.
struct TypeB {
string member1;
};
struct TypeA {
string member1;
short member2;
};
The previous two types may or may not be problematic, depending on the length of the string in member1, which determines how many padding bytes follow it.
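The dependency on the string value can be checked with a small sketch. It assumes the traditional CDR string layout (a 4-byte length that counts the terminating NUL, followed by the characters and the NUL) and a payload starting at offset 0. The function names are hypothetical.

```python
def cdr_string_size(value: str) -> int:
    """Serialized size of a CDR string at offset 0: length + characters + NUL."""
    return 4 + len(value.encode("ascii")) + 1

def short_reads_padding(member1: str) -> bool:
    """True if a short placed after member1 would be read from padding bytes."""
    size = cdr_string_size(member1)
    stream_len = size + (-size) % 4   # stream padded to a multiple of 4
    short_offset = size + size % 2    # the short is aligned to 2 bytes
    return short_offset + 2 <= stream_len

results = {text: short_reads_padding(text) for text in ("", "a", "ab", "abc")}
```

Under these assumptions the strings "" and "a" leave two or three padding bytes, enough for the reader to take member2 from them, while "ab" and "abc" do not.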
struct TypeC {
char member1;
};
struct TypeB {
TypeC member1;
};
struct TypeA {
TypeC member1;
short member2;
};
The previous types TypeB and TypeA will be problematic since the last primitive member of TypeB (including nested types) is a char.
Are Mutable Types Affected?
This problem does not affect mutable types directly. However, it may affect mutable types when they contain members of extensible types.
For example:
struct TypeB {
char member1;
};
struct TypeBMutable {
TypeB member1;
};
struct TypeA {
char member1;
short member2;
};
struct TypeAMutable {
TypeA member1;
};
Mutable members are encapsulated using parametrized CDR representation.
Each member within a mutable type is simply a CDR-encapsulated block of data. Preceding each one is a parameter header consisting of a two-byte parameter ID followed by a two-byte parameter length. One parameter follows another until a list-terminating sentinel is reached. Parameter headers must be aligned to a 4-byte boundary. If the serialized length of a member value is not divisible by 4, the implementation will add some padding bytes.
The problem is that the length field of a mutable member represents the data length measured from the end of that field until the start of the next parameter ID (this is part of the XTypes specification). Therefore, the deserialize function for the member type may receive as input a stream containing the padding bytes.
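A minimal sketch of this behavior follows. The function names are hypothetical, the encoding is assumed to be big-endian, and the pad byte 0xCD is an arbitrary value chosen to stand in for uninitialized padding.

```python
import struct

def serialize_member(member_id: int, value: bytes, pad_byte: bytes = b"\xCD") -> bytes:
    """Parameter header (ID, length) followed by the member and its padding."""
    padding = (-len(value)) % 4
    length = len(value) + padding     # length runs up to the next parameter ID
    return struct.pack(">HH", member_id, length) + value + pad_byte * padding

def read_member(stream: bytes):
    """Return the member ID and the bytes handed to the member deserializer."""
    member_id, length = struct.unpack_from(">HH", stream, 0)
    return member_id, stream[4:4 + length]

# TypeB (a single char) serialized as member 1 of TypeBMutable.
parameter = serialize_member(1, b"x")
member_id, body = read_member(parameter)
```

The body passed to the member deserializer is 4 bytes long even though TypeB serializes to 1 byte, so a TypeA deserializer would see the 3 padding bytes as data.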
IN Implementation Changes
To reduce the number of scenarios in which we run into the problem we will initialize the padding bytes to zero. This includes the padding bytes at the end of the serialized sample, and the padding bytes at the end of a mutable member serialization.
By doing this, the following scenario would no longer be an issue:
struct TypeB {
char member1;
};
struct TypeA {
char member1;
short member2;
};
In the previous example the deserialize function would use padding bytes to initialize member2. However, since these padding bytes have a zero value, the value assigned to member2 would be the right value.
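As a sketch of the writer-side change, assuming big-endian CDR and a hypothetical helper pad_with_zeros:

```python
import struct

def pad_with_zeros(serialized: bytes, alignment: int = 4) -> bytes:
    """Extend the serialized sample to the alignment using zero-valued bytes."""
    return serialized + b"\x00" * ((-len(serialized)) % alignment)

# TypeB sample (one char) padded by the writer.
sample_b = pad_with_zeros(b"x")

# A TypeA reader aligns member2 to offset 2 and reads it from the padding.
(member2,) = struct.unpack_from(">h", sample_b, 2)
```

Here member2 is read from padding, but the value obtained equals the default value 0.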
With the previous fix, issues would remain only when the first primitive member on the DataReader type is a char, octet, boolean, short, or unsigned short that is also the discriminator of a union.
For example:
struct TypeB {
char member1;
};
union UnionA switch (short) {
case 0:
long member1;
default:
long member2;
};
struct TypeA {
char member1;
UnionA member2;
};
To detect the previous problem, we will change the type compatibility algorithm so that it reports an error when the issue occurs.
Potential Changes to the XTypes Specification
Resolving the extensibility problem described above for all the scenarios will likely require changes to the XTypes specification.
For the top-level types, we propose to use two bits from the options bytes following the encapsulation identifier to identify the padding.
For example:
struct TypeB {
char member1;
};
In the previous example a sample for Type B would be encapsulated as follows:
CDR_BE |
x x x x x x x x x x x x x x 1 1 |
member1 |
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
Notice that the value '11' (binary for 3) in the last two option bits indicates the number of padding bytes.
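The proposal can be sketched as follows. This is an illustration, not specification text: the function names are hypothetical, the option bits marked 'x' above are assumed to be zero, and CDR_BE is assumed to be the two-byte identifier 0x0000.

```python
import struct

CDR_BE = 0x0000          # assumed encapsulation identifier for big-endian CDR
PADDING_MASK = 0x0003    # the two low-order option bits carry the padding count

def encapsulate(payload: bytes) -> bytes:
    """Prefix the payload with identifier and options, then pad to 4 bytes."""
    padding = (-len(payload)) % 4
    options = padding & PADDING_MASK
    return struct.pack(">HH", CDR_BE, options) + payload + b"\x00" * padding

def decapsulate(message: bytes) -> bytes:
    """Return only the bytes produced by the serialization function."""
    _identifier, options = struct.unpack_from(">HH", message, 0)
    padding = options & PADDING_MASK
    body = message[4:]
    return body[:len(body) - padding]

message = encapsulate(b"x")      # TypeB sample: one char
payload = decapsulate(message)
```

With the padding removed before deserialization, a TypeA reader sees a 1-byte stream and assigns member2 its default value.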
For mutable members we propose to change the interpretation of the length field to not include the padding bytes to the next header.
For example:
struct TypeB {
char member1;
};
struct TypeBMutable {
TypeB member1;
};
In the previous example TypeBMutable would be encapsulated as follows:
CDR_BE | x x x x x x x x x x x x x x 0 0 |
ID 1 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 |
member1 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
PID_LIST_END | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 |
Notice that the length field for the member with ID 1 is set to 1 instead of 4.
Forward compatibility
The changes described in the previous section should not affect backward compatibility with earlier XTypes-compliant versions.
Let's assume two XTypes versions:
- One DDS compliant with XTYPES 1.1 and already deployed
- A future DDS compliant with XTYPES 1.2 incorporating the changes to the XTypes spec described in the previous section
TOP-LEVEL Types:
- XTYPES 1.1 DataReaders should ignore the option bits set by the XTYPES 1.2 DataWriters
- XTYPES 1.2 DataReaders receiving samples from XTYPES 1.1 DataWriters will assume that there are no padding bytes
MUTABLE Members
- XTYPES 1.1 DataReaders will receive members with a header where the data length is smaller than expected. This should not be a problem because the DataReader will always align the beginning of the next parameter to a 4-byte boundary. Therefore, the padding bytes will be skipped by the align operation.
- XTYPES 1.2 DataReaders will receive members from XTYPES 1.1 DataWriters with a header where the data length includes padding. That should be fine as well. The align operation to go to the beginning of the next parameter will be a no-op.
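The argument for mutable members can be sketched with a reader that aligns after every member. The function read_members is hypothetical, the parameter list is assumed to be big-endian and to start on a 4-byte boundary, and the value 0x3F02 for PID_LIST_END is an assumption of this sketch.

```python
import struct

PID_LIST_END = 0x3F02    # assumed sentinel parameter ID

def read_members(stream: bytes) -> dict:
    """Parse a parameter list, aligning to 4 bytes after each member."""
    members = {}
    offset = 0
    while True:
        member_id, length = struct.unpack_from(">HH", stream, offset)
        offset += 4
        if member_id == PID_LIST_END:
            return members
        members[member_id] = stream[offset:offset + length]
        offset += length
        offset += (-offset) % 4   # no-op when the length already includes padding

sentinel = struct.pack(">HH", PID_LIST_END, 0)
# Length includes padding (current interpretation).
with_padding = struct.pack(">HH", 1, 4) + b"x\x00\x00\x00" + sentinel
# Length excludes padding (proposed interpretation).
without_padding = struct.pack(">HH", 1, 1) + b"x\x00\x00\x00" + sentinel

old_style = read_members(with_padding)
new_style = read_members(without_padding)
```

Both streams are parsed up to the sentinel. The only difference is whether the padding bytes are included in the bytes handed to the member deserializer.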