-
Key: IDL42-2
-
Status: closed
-
Source: Real-Time Innovations (Dr. Gerardo Pardo-Castellote, Ph.D.)
-
Summary:
As of version 4.1, IDL lacks explicit support for 8-bit signed/unsigned integers. Although there are two data types with a 1-byte size, neither is suitable for encoding 8-bit integers:
- char => 8-bit quantity that:
  - encodes a single-byte character from any byte-oriented code set, or
  - in an array, encodes a multi-byte character from a multi-byte code set
- octet => opaque 8-bit quantity, guaranteed not to undergo any change by the middleware.
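The gap can be sketched in IDL 4.1 terms as follows. The struct and member names are hypothetical, and widening to short is only one possible workaround, not something the issue itself proposes:

```idl
// Hypothetical IDL 4.1 type that needs small integer fields.
struct SensorSample {
    octet gain;     // opaque 8-bit value: not defined as a number, no signedness
    char  offset;   // character type: tied to a code set, not a numeric range
    short widened;  // assumed workaround: 16-bit signed integer, twice the size needed
};
```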
Unfortunately, the reference programming languages do not treat 8-bit types consistently:
- Java: byte => signed 8-bit
- C#
  - byte => unsigned 8-bit
  - sbyte => signed 8-bit
- C (C90 and later)
  - char => signedness unspecified, compiler dependent (http://www.arm.linux.org.uk/docs/faqs/signedchar.php)
  - signed char => signed 8-bit number
  - unsigned char => unsigned 8-bit number
- C++
  - char => recommended for chars only. Allowed for numbers, but with unspecified sign
  - signed char => signed 8-bit number
  - unsigned char => unsigned 8-bit number
Other dialects of IDL provide support for 8-bit signed/unsigned integers as follows:
- MIDL (Microsoft's IDL)
  - Introduces the small keyword to represent 8-bit integers (see MIDL documentation). Additionally, it uses the hyper keyword to represent 64-bit integers (i.e., a long long in OMG IDL).
  - The integer types in MIDL are:
    - [unsigned] small (8-bit integer)
    - [unsigned] short (16-bit integer)
    - [unsigned] long (32-bit integer)
    - [unsigned] hyper (64-bit integer)
  - This IDL dialect also introduces the signed keyword, so numeric values can be marked as signed or unsigned (if unspecified, they default to signed).
- Web IDL
  - Introduces the byte keyword to represent a signed 8-bit integer and treats octet as an unsigned 8-bit integer (see documentation).
  - The integer types in Web IDL are:
    - byte (signed 8-bit integer)
    - octet (unsigned 8-bit integer)
    - [unsigned] short (16-bit integer)
    - [unsigned] long (32-bit integer)
    - [unsigned] long long (64-bit integer)
- XPIDL
  - Does not support signed 8-bit integers (see documentation)
-
Reported: IDL 4.1 — Wed, 26 Jul 2017 14:28 GMT
-
Disposition: Resolved — IDL 4.2
-
Disposition Summary:
Add support for 8-bit Integers
We have identified four possible candidate solutions listed below. See the end for the decision on the adopted solution.
Solution 1: Introduce a new keyword in IDL to indicate an 8-bit integer
This would preserve the current semantics and mapping of octet (opaque, not to be used as a number). The new type would behave like the other integers in IDL: signed unless qualified with "unsigned".
This solution has four alternatives depending on the keyword selected. We have considered byte, small, tinyint, and int8/uint8:
Alternative 1.1: Add new byte keyword
byte ==> signed 8-bit integer
unsigned byte ==> unsigned 8-bit number
This approach is clean and consistent but we are concerned with using the keyword "byte" because of the ambiguity with the definitions in C# and Java. It also does not sound like a number. It sounds like "octet". In fact it means exactly that.
Alternative 1.2: Add new small keyword (following MIDL's model)
small ==> signed 8-bit integer
unsigned small ==> unsigned 8-bit number
Alternative 1.3: Add new tinyint keyword (following SQL model)
tinyint ==> signed 8-bit integer
unsigned tinyint ==> unsigned 8-bit number
Alternative 1.4: Add new int8 and uint8 keywords (following new C/C++ standard names)
int8 ==> signed 8-bit integer
uint8 ==> unsigned 8-bit number
Solution 2: Use annotations on "octet" to indicate use as integer
This would mean that we could have "octet" type used three different ways:
octet -- current meaning (opaque 8-bit), not a number
@signed octet -- signed 8-bit number
@unsigned octet -- unsigned 8-bit number
This approach seems less clean than Solution 1, but it has the advantage of being less ambiguous and backwards compatible. That said, having both a keyword and an annotation called unsigned is itself ambiguous. This would also be the only use of the @unsigned annotation, which is confusing. For example, someone may be tempted to use @unsigned short as a type.
Solution 3: Leverage the existing @bit_bound annotation
IDL4 says this about the @bit_bound annotation:
"it may be used to force a size, smaller than the default one to members of an enumeration or to integer elements."
So it seems that if we used:
struct MyStruct {
    @bit_bound(8) short member1;
    @bit_bound(8) unsigned short member2;
};
the @bit_bound(8) would affect the serialization and also the language mapping.
For example, in Java member1 could be typed as byte, whereas member2 would be typed as short in order to accommodate values between 128 and 255.
The advantage of Solution 3 is that it reuses the concepts already present without new keywords and annotations in a manner consistent with the intended purpose.
There are two alternative ways to do this:
Alternative 3.1
Just explain in the IDL 4 spec that this is the pattern used to model 8-bit integer types.
Alternative 3.2 (providing typedefs for all integer types)
Expand the IDL 4 spec to define typedefs for all integer types, making the size of the representation more obvious (similar to the new standard integer types in C99 and C++).
typedef @bit_bound(8) short int8;
typedef @bit_bound(8) unsigned short uint8;
typedef short int16;
typedef unsigned short uint16;
typedef long int32;
typedef unsigned long uint32;
typedef long long int64;
typedef unsigned long long uint64;
Solution 4: Allow the existing keyword "unsigned" to be used with octet
There are two alternative ways to implement this.
Alternative 4.1 Add also a signed keyword
Add signed as a keyword, which overcomes the ambiguity of Solution 2. With the signed keyword, we could specify whether an octet (or a char) is signed or unsigned, following the C/C++ convention.
octet -- current meaning (opaque 8-bit), not a number
signed octet -- signed 8-bit number
unsigned octet -- unsigned 8-bit number
Alternative 4.2 Redefine “octet” to mean signed int
With this approach we would have:
octet
- Opaque 8-bit quantity, guaranteed not to undergo any change by the middleware.
- Able to hold signed integer values within the range -128 to 127.
- In the language, it is mapped to a type that can handle signed integer values within the range -128 to 127.
unsigned octet
- Like octet, also an opaque 8-bit quantity, guaranteed not to undergo any change.
- Able to hold unsigned integer values within the range 0 to 255.
Thus in C we would map IDL octet to C90 "signed char" and IDL "unsigned octet" to C90 "unsigned char".
This mapping may break API portability with previous mappings, or break application code where the "char" was mapped to an unsigned value (e.g. in ARM processors).
To work around this, we could provide some way to prevent the mapping to "signed char". This could be a command-line option to rtiddsgen, such as "-nosignedchar".
Adopted solution
The chosen solution is Solution 1 with Alternative 1.4 (int8, uint8).
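As a minimal sketch of what the adopted keywords might look like in use (the module, struct, and member names are hypothetical and not taken from the specification text):

```idl
// Hypothetical IDL 4.2 module illustrating the adopted int8/uint8 keywords.
module Example {
    struct SensorSample {
        int8  offset;   // signed 8-bit integer, range -128 to 127
        uint8 gain;     // unsigned 8-bit integer, range 0 to 255
        octet flags;    // octet keeps its meaning: opaque 8-bit, not a number
    };
};
```

Unlike the @bit_bound(8) pattern of Solution 3, these keywords cannot be applied to a wider type or given a different size.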
The reason to prefer introducing new keywords rather than relying on annotations is that using an annotation (e.g. @bit_bound(8)) opens the door for that annotation to be used in unexpected contexts and with unexpected parameters. For example, the user could annotate not just a short, but also a long or long long. Or they could use a value other than 8. All these cases would have to be described and handled, resulting in more complexity for the user and tools. The "keyword" approach is simpler, more constrained, and better aligned with what the user expects.
The reason to choose int8 and uint8 is to align with the resolution of IDL42-9. This saves us from having to add another keyword like "small" or "tiny", which would not be intuitive to the user.
-
Updated: Tue, 19 Dec 2017 20:04 GMT
-
Attachments:
- ISSUE IDL42-2 IDL lacks support for 8-bit unsigned Integers.docx 21 kB (application/vnd.openxmlformats-officedocument.wordprocessingml.document)
IDL42 — IDL Lacks Support for 8-bit Signed/Unsigned Integers
- Key: IDL42-2
- OMG Task Force: Interface Definition Language 4.2 RTF