DDS-XTypes 1.2 RTF

DDSXTY12 — The encoding of strings and wide strings should be standardized

  • Key: DDSXTY12-113
  • Status: closed  
  • Source: Real-Time Innovations (Dr. Gerardo Pardo-Castellote, Ph.D.)
  • Summary:

    Currently, strings and wide strings are treated as mere arrays of chars and wide chars respectively, regardless of their character encoding configuration. This can lead to interoperability issues between systems that assume different encoding configurations implicitly.

    Java and C# use UTF-16, so any user who wants a C/C++ application to interoperate with a Java/.NET application must manually ensure that its strings are encoded in UTF-16. There is no API to convert between DDS_Wchar (4 bytes) and a platform’s wchar_t. By default the conversion is done by casting wchar_t to DDS_Wchar. The fundamental problem is that, after the value is cast back to wchar_t on the receiving side, the right character may not be retrieved if the publisher used a different encoding than the subscriber.

    The proposal is to standardize the wire encoding for char and wide char. For chars we propose UTF-8, and for wide chars we propose UTF-32. Along with this change, we should provide conversion APIs in C/C++ to convert between wchar_t and DDS_Wchar, to avoid issues between platforms. For Java and C#, the middleware knows both the wire encoding (UTF-32) and the native encoding (UTF-16), so the conversion can be done automatically for the user.

    We need to think about how to maintain backwards compatibility with applications that may be using other wire encodings today.
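
    The casting problem described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and uses no DDS API; the names `utf16_units`, `widened`, and `is_scalar_value` are hypothetical. It assumes a sender whose wchar_t holds 16-bit UTF-16 code units and a receiver that interprets each 32-bit value as a Unicode code point.

```python
# Illustrative sketch only: models "conversion by casting" with plain integers.
text = "A\U0001F600"  # 'A' followed by U+1F600, which lies outside the BMP

# Assumed sender: wchar_t is 16 bits wide and holds UTF-16 code units.
raw = text.encode("utf-16-le")
utf16_units = [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

# Casting widens each 16-bit unit to a 32-bit value without changing it.
widened = list(utf16_units)

# Assumed receiver: treats every 32-bit value as one Unicode code point.
expected = [ord(c) for c in text]


def is_scalar_value(cp):
    """True if cp is a valid Unicode scalar value (not a surrogate)."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


print([hex(u) for u in widened])               # values the receiver gets
print([hex(cp) for cp in expected])            # values the receiver needed
print(all(is_scalar_value(cp) for cp in widened))
```

    Characters inside the Basic Multilingual Plane survive the cast unchanged, which is why the mismatch can go unnoticed until a supplementary character is sent.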

  • Reported: DDS-XTypes 1.1 — Wed, 13 Jul 2016 20:55 GMT
  • Disposition: Resolved — DDS-XTypes 1.2
  • Disposition Summary:

    *Change the standard encoding of strings*

    See the discussion and attached documents in DDSXTY12-113 for the rationale behind the decisions to:

    • Not specify an encoding for characters
    • Encode strings using UTF-8
    • Encode wide characters and strings using UTF-16
    • Change the type of WChar from Char32 to Char16
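
    As a rough illustration of the resolved encodings, the sketch below encodes one sample string both ways. It is Python for illustration only, not an XTypes API; the helper names `to_char16_units` and `from_char16_units` are hypothetical, and byte order, length prefixes, and other serialization framing are deliberately left out.

```python
def to_char16_units(s):
    """Return the UTF-16 code units of s as a list of 16-bit integers."""
    # Little-endian is used here only to split the bytes into units.
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def from_char16_units(units):
    """Rebuild a string from a list of UTF-16 code units."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le")


sample = "h\u00e9\U0001F600"  # 'h', e with acute accent, U+1F600

# String: encoded as UTF-8, one to four bytes per code point.
utf8_bytes = sample.encode("utf-8")

# Wide string: encoded as UTF-16, one or two 16-bit units per code point.
units = to_char16_units(sample)

print(len(sample), len(utf8_bytes), len(units))
print(from_char16_units(units) == sample)
```

    The sample has three code points but four 16-bit units, because U+1F600 needs a surrogate pair. A single 16-bit wide character can therefore hold only one code unit, not every possible code point.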
  • Updated: Thu, 22 Jun 2017 16:42 GMT
  • Attachments: