-
Key: DDSXTY13-16
-
Status: closed
-
Source: Real-Time Innovations (Dr. Gerardo Pardo-Castellote, Ph.D.)
-
Summary:
XCDR serialization rules for strings and sequences serialize the length as a uint32 ahead of serializing each of the elements.
For short strings and sequences this can be a significant overhead. For example, a 4-element sequence of octets would use 4 bytes for the sequence length and 4 bytes for the sequence content. That is 100% overhead. Worse, since the uint32 length must be serialized at a 4-byte-aligned offset, in some cases 3 additional padding bytes would be required. This would result in 7 bytes of overhead for 4 bytes of payload, or 175% overhead.
It would be better if short sequences, that is, those whose maximum length is less than 255, could be serialized using a uint8 length instead of a uint32. With this the overhead in the previous example would be 1 byte (or 25%).
An additional optimization would be to just serialize the string characters without the terminating NUL since this information is redundant with the length.
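The overhead figures above can be checked with a small sketch. It is only an illustration: the helper names are hypothetical, and it assumes that the length field is aligned to its own size and that the current string encoding counts and writes a terminating NUL, as the text implies.

```python
def align(offset: int, boundary: int) -> int:
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary


def octet_seq_size(start: int, n: int, length_bytes: int) -> int:
    """Bytes consumed (padding + length field + payload) by a sequence of
    n octets whose serialization begins at stream offset `start`."""
    length_at = align(start, length_bytes)  # length field aligned to its own size
    return length_at + length_bytes + n - start


def overhead_pct(start: int, n: int, length_bytes: int) -> float:
    """Non-payload bytes as a percentage of the n payload bytes."""
    return 100 * (octet_seq_size(start, n, length_bytes) - n) / n


def string_size(start: int, text: str, short: bool) -> int:
    """Bytes consumed by a string. Assumption: the current encoding uses a
    uint32 length and a trailing NUL; the proposed one uses a uint8 length
    and no NUL."""
    length_bytes, nul = (1, 0) if short else (4, 1)
    payload = len(text.encode("utf-8")) + nul
    return align(start, length_bytes) + length_bytes + payload - start


if __name__ == "__main__":
    print(overhead_pct(0, 4, 4))  # uint32 length, already aligned
    print(overhead_pct(1, 4, 4))  # uint32 length, 3 padding bytes needed
    print(overhead_pct(1, 4, 1))  # uint8 length, no padding needed
    print(string_size(0, "RED", short=False), string_size(0, "RED", short=True))
```

Starting the sequence at offset 1 is just one way to force the 3 padding bytes mentioned above; any offset that is one past a multiple of 4 behaves the same.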
This encoding would affect type compatibility in that a "short" sequence would not be compatible with a longer one. This is different from the current situation where string/sequence length would not affect compatibility. With the new encoding short strings would be compatible with each other (regardless of their length) and likewise long strings would also be compatible with each other (regardless of length) as well as compatible with unbounded ones.
Therefore we would need an annotation to indicate that we want to serialize it as a short string. Perhaps we could reuse @bit_bound. For example:
struct ShapeType { @bit_bound(8) string<32> color; long x; long y; };
The disadvantage is that the only valid values when applied to strings/sequences would be @bit_bound(8) and @bit_bound(32).
Alternatively we could also add a new annotation like @compact, @pack, @short, @small, ...
-
Reported: DDS-XTypes 1.2 — Wed, 17 Jan 2018 11:29 GMT
-
Disposition: Deferred — DDS-XTypes 1.3
-
Disposition Summary:
Defer until we can determine whether the optimization is worth the added complexity
The RTF thinks it is best to wait and see if the need for this kind of optimization is sufficient to justify the added complexity to the user and the implementation.
The concern is that this effectively introduces new sets of incompatible collection types:
"short" strings are now incompatible with regular strings,
"short" sequences are incompatible with regular sequences,
and so on.
Note that two strings with a maximum length of, say, 20 would be incompatible if one is defined as a "short" string and the other is not. The one not defined as a "short" string would be compatible with (non-short) strings of any length. The one defined as short would be compatible with short strings of any length.
This will increase the complexity to the user who must now decide whether their strings (and sequences) should be defined as short or not to save some bytes, but then live with the consequence that it will not be possible to extend them...
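The compatibility concern can be made concrete with a minimal sketch. It models only the length-encoding aspect described above, not the full XTypes assignability rules, and the `StringType` class and its `short` flag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StringType:
    bound: Optional[int]   # maximum length; None means unbounded
    short: bool = False    # hypothetical marker for the uint8-length encoding

    def __post_init__(self):
        # Assumption: only bounded strings can be declared short.
        if self.short and self.bound is None:
            raise ValueError("an unbounded string cannot be short")


def length_compatible(a: StringType, b: StringType) -> bool:
    """Under the proposal, short strings match only short strings and
    regular strings match only regular ones; the bound itself is ignored."""
    return a.short == b.short


if __name__ == "__main__":
    short20 = StringType(20, short=True)
    regular20 = StringType(20)
    print(length_compatible(short20, regular20))                     # same bound, different encoding
    print(length_compatible(regular20, StringType(None)))            # regular vs unbounded
    print(length_compatible(short20, StringType(200, short=True)))   # short vs short
```

The first case is the one the RTF highlights: two strings with the same bound of 20 stop being compatible as soon as only one of them is declared short.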
-
Updated: Tue, 8 Oct 2019 17:55 GMT
DDSXTY13 — Provide a more efficient serialization for short strings and sequences
- OMG Task Force: DDS-XTYPES 1.3 RTF