Interface Definition Language 4.2 RTF — Open Issues - OMG Issue Tracker

Key: IDL42-2

Status: closed

Source: Real-Time Innovations ( Dr. Gerardo Pardo-Castellote, Ph.D.)

Summary:

As of version 4.1, IDL lacks explicit support for 8-bit signed/unsigned integers. While there are two data types with a 1-byte size, neither is suitable for encoding 8-bit integers:

char => 8-bit quantity that:
- encodes a single-byte character from any byte-oriented code set, or
- in an array, encodes a multi-byte character from a multi-byte code set
octet => opaque 8-bit quantity, guaranteed not to undergo any change by the middleware.

Unfortunately, reference programming languages don't have consistent behavior:

Java: byte => signed 8-bit
C#
- byte => unsigned 8-bit
- sbyte => signed 8-bit
C (C90 and later)
- char => unspecified. Compiler dependent (http://www.arm.linux.org.uk/docs/faqs/signedchar.php)
- signed char => signed 8-bit number
- unsigned char => unsigned 8-bit number
C++
- char => recommended for chars only. Allowed for numbers but unspecified sign
- signed char => signed 8-bit number
- unsigned char => unsigned 8-bit number
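
The practical effect of this inconsistency is that the same byte on the wire reads back as a different number depending on the target type. The following is a minimal Python sketch, not part of the issue itself, using the standard struct module; the helper names as_signed and as_unsigned are made up for illustration.

```python
import struct


def as_signed(raw: bytes) -> int:
    """Interpret one byte as a signed 8-bit integer (as Java byte or C signed char would)."""
    return struct.unpack("b", raw)[0]


def as_unsigned(raw: bytes) -> int:
    """Interpret one byte as an unsigned 8-bit integer (as C# byte or C unsigned char would)."""
    return struct.unpack("B", raw)[0]


# The same bit pattern, 0xC8, yields two different numbers.
raw = bytes([0xC8])
print(as_signed(raw), as_unsigned(raw))
```

Values up to 0x7F read the same either way; the two interpretations only diverge once the top bit is set.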

Other dialects of IDL provide support for 8-bit signed/unsigned integers as follows:

MIDL (Microsoft's IDL)
- Introduces the small keyword to represent 8-bit integers (see MIDL documentation). Additionally, it uses the hyper keyword to represent 64-bit integers (i.e., a long long in OMG IDL).
- The integer types in MIDL are:
  - [unsigned] small (8-bit integer)
  - [unsigned] short (16-bit integer)
  - [unsigned] long (32-bit integer)
  - [unsigned] hyper (64-bit integer)
- This IDL dialect also introduces the signed keyword, so numeric values can be marked as signed or unsigned (if unspecified, they default to signed).
Web IDL
- Introduces the byte keyword to represent a signed 8-bit integer and treats octet as an unsigned 8-bit integer (see documentation).
- The integer types in Web IDL are:
  - byte (signed 8-bit integer)
  - octet (unsigned 8-bit integer)
  - [unsigned] short (16-bit integer)
  - [unsigned] long (32-bit integer)
  - [unsigned] long long (64-bit integer)
XPIDL
- Does not support signed 8-bit integers (see documentation)

Reported: IDL 4.1 — Wed, 26 Jul 2017 14:28 GMT

Disposition: Resolved — IDL 4.2

Disposition Summary:

Add support for 8-bit Integers

We have identified four possible candidate solutions listed below. See the end for the decision on the adopted solution.

Solution 1: Introduce a new keyword in IDL to indicate an 8-bit integer

This would preserve the current semantics and mapping of octet (opaque, not to be used as a number). The new type would behave like the other integers in IDL: signed unless qualified with "unsigned".

This solution has four alternatives depending on the keyword selected. We have considered byte, small, tinyint, and the pair int8/uint8:

Alternative 1.1: Add new byte keyword

byte  ==> signed 8-bit integer
unsigned byte  ==> unsigned 8-bit number

This approach is clean and consistent, but we are concerned about using the keyword "byte" because of the ambiguity with the definitions in C# and Java. It also does not sound like a number; it sounds like "octet", and in fact it means exactly that.

Alternative 1.2: Add new small keyword (following MIDL's model)

small  ==> signed 8-bit integer
unsigned small  ==> unsigned 8-bit number

Alternative 1.3: Add new tinyint keyword (following SQL model)

tinyint  ==> signed 8-bit integer
unsigned tinyint  ==> unsigned 8-bit number

Alternative 1.4: Add new int8 and uint8 keywords (following new C/C++ standard names)

int8    ==> signed 8-bit integer
uint8  ==> unsigned 8-bit number

Solution 2: Use annotations on "octet" to indicate use as integer

This would mean that we could have "octet" type used three different ways:

octet                        -- current meaning (opaque 8-bit) not a number
@signed octet       -- signed 8-bit number
@unsigned octet  -- unsigned 8-bit number

This approach seems less clean than Solution 1, but it has the advantage of being less ambiguous and backwards compatible. That said, having both a keyword and an annotation called unsigned is itself somewhat ambiguous. This would also be the only use of the @unsigned annotation, which is confusing: someone may be tempted to write @unsigned short as a type.

Solution 3: Leverage the existing @bit_bound annotation

IDL4 says this about the @bit_bound annotation:

"it may be used to force a size, smaller than the default one to members of an enumeration or to integer elements."

So it seems if we used:

struct MyStruct {
     @bit_bound(8)  short             member1;
     @bit_bound(8)  unsigned short    member2;
};

The @bit_bound(8) would affect the serialization and also the language mapping.

For example in Java member1 could be typed as byte whereas member2 would be short in order to accommodate values between 128 and 255.
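
The reasoning behind that Java mapping can be sketched as picking the narrowest signed Java type whose range covers the bounded IDL type. This is a Python illustration of the argument only, not a rule taken from any IDL-to-Java mapping; the function smallest_java_type and the JAVA_TYPES table are hypothetical names.

```python
# Value ranges of Java's signed integer types, narrowest first.
JAVA_TYPES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("int", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_java_type(bits: int, signed: bool) -> str:
    """Return the narrowest Java integer type that holds every value of a
    `bits`-wide signed or unsigned integer."""
    if signed:
        lo, hi = -2**(bits - 1), 2**(bits - 1) - 1
    else:
        lo, hi = 0, 2**bits - 1
    for name, type_min, type_max in JAVA_TYPES:
        if type_min <= lo and hi <= type_max:
            return name
    raise ValueError("no Java integer type covers this range")


print(smallest_java_type(8, signed=True))   # member1: range -128..127
print(smallest_java_type(8, signed=False))  # member2: range 0..255
```

Under this rule the signed 8-bit member fits in byte, while the unsigned one needs short because Java's byte stops at 127.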

The advantage of Solution 3 is that it reuses the concepts already present without new keywords and annotations in a manner consistent with the intended purpose.

There are two alternative ways to do this:

Alternative 3.1
Just explain in the IDL 4 spec that this is the pattern used to model 8-bit integer types.

Alternative 3.2 (providing typedef for all integer types)

Expand the IDL 4 spec to define typedefs for all integer types, making the size of the representation more obvious (similar to the standard fixed-width integer types in C99 and C++).

typedef @bit_bound(8) short              int8;
typedef @bit_bound(8) unsigned short     uint8;
typedef               short              int16; 
typedef               unsigned short     uint16;
typedef               long               int32;
typedef               unsigned long      uint32;
typedef               long long          int64;
typedef               unsigned long long uint64;

Solution 4: Allow the existing keyword "unsigned" to be used with octet

There are two alternative ways to implement this.

Alternative 4.1 Add also a signed keyword

Add signed as a keyword, which overcomes the ambiguity of Solution 2. With the signed keyword, we could specify whether an octet (or a char) is signed or unsigned, following the C/C++ convention.

octet                   -- current meaning (opaque 8-bit) not a number
signed octet       -- signed 8-bit number
unsigned octet   -- unsigned 8-bit number

Alternative 4.2 Redefine “octet” to mean signed int

With this approach we would have:

octet => opaque 8-bit quantity, guaranteed not to undergo any change by the middleware. Able to hold signed integer values within the range -128 to 127. In the language, it is mapped to a type that can handle signed integer values within the range -128 to 127.
unsigned octet => like octet, an opaque 8-bit quantity guaranteed not to undergo any change by the middleware. Able to hold unsigned integer values within the range 0 to 255.

Thus in C we would map IDL "octet" to C90 "signed char" and IDL "unsigned octet" to C90 "unsigned char".

This mapping may break API portability with previous mappings, or break application code where the "char" was mapped to an unsigned value (e.g. in ARM processors).

To work around this, we could provide some way to prevent the mapping to "signed char", perhaps a command-line option to rtiddsgen such as "-nosignedchar".

Adopted solution

The chosen solution is Solution 1 with Alternative 1.4 (int8, uint8).
The reason to prefer introducing new keywords rather than relying on annotations is that an annotation (e.g. @bit_bound(8)) opens the door for that annotation to be used in unexpected contexts and with unexpected parameters. For example, the user could annotate not just a short, but also a long or long long, or use a value other than 8. All these cases would have to be described and handled, resulting in more complexity for the user and tools. The "keyword" approach is simpler, more constrained, and better aligned with what the user expects.
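
To make the complexity argument concrete, here is a rough Python sketch of the checks a tool might need if 8-bit integers were spelled with @bit_bound. The classification rules, the function check_bit_bound, and its return strings are all hypothetical and not taken from the IDL specification; only the type names and the @bit_bound(8) pattern come from this issue.

```python
# Default widths, in bits, of the IDL integer types named in this issue.
DEFAULT_BITS = {
    "short": 16, "unsigned short": 16,
    "long": 32, "unsigned long": 32,
    "long long": 64, "unsigned long long": 64,
}


def check_bit_bound(idl_type: str, bound: int) -> str:
    """Classify an @bit_bound(bound) annotation applied to idl_type.

    The rules here are assumptions made for illustration only.
    """
    if idl_type not in DEFAULT_BITS:
        return "error: not an integer type"
    if bound <= 0 or bound > DEFAULT_BITS[idl_type]:
        return "error: bound out of range"
    if bound == 8 and idl_type in ("short", "unsigned short"):
        return "ok: 8-bit integer pattern"
    # Every other legal combination still needs a documented mapping.
    return "needs a rule: unusual combination"


for idl_type, bound in [("short", 8), ("long", 8), ("short", 5), ("short", 17)]:
    print(idl_type, bound, "->", check_bit_bound(idl_type, bound))
```

With dedicated int8 and uint8 keywords, none of these branches exist: the parser either sees one of the two keywords or it does not.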

The reason to choose int8 and uint8 is to align with the resolution of IDL42-9. This saves us from adding another keyword like "small" or "tiny", which would not be intuitive to the user.

Updated: Tue, 19 Dec 2017 20:04 GMT

Attachments:

ISSUE IDL42-2 IDL lacks support for 8-bit unsigned Integers.docx 21 kB (application/vnd.openxmlformats-officedocument.wordprocessingml.document)