

15.3.1 Primitive Types


   Primitive data types are specified for both big-endian and little-endian orderings. The message formats (see Section 15.4, “GIOP Message Formats,” on page 15-30) include tags in message headers that indicate the byte ordering in the message. Encapsulations include an initial flag that indicates the byte ordering within the encapsulation, described in Section 15.3.3, “Encapsulation,” on page 15-14. The byte ordering of any encapsulation may be different from the message or encapsulation within which it is nested. It is the responsibility of the message recipient to translate byte ordering if necessary. Primitive data types are encoded in multiples of octets. An octet is an 8-bit value.

   15.3.1.1 Alignment

   In order to allow primitive data to be moved into and out of octet streams with instructions specifically designed for those primitive data types, in CDR all primitive data types must be aligned on their natural boundaries (i.e., the alignment boundary of a primitive datum is equal to the size of the datum in octets). Any primitive of size n octets must start at an octet stream index that is a multiple of n. In CDR, n is one of 1, 2, 4, or 8.

   Where necessary, an alignment gap precedes the representation of a primitive datum. The value of octets in alignment gaps is undefined. A gap must be the minimum size necessary to align the following primitive. Table 15-1 gives alignment boundaries for CDR/OMG-IDL primitive types.

   Table 15-1 Alignment requirements for OMG IDL primitive data types

   TYPE                  OCTET ALIGNMENT
   char                  1
   wchar                 1, 2, or 4 for GIOP 1.1; 1 for GIOP 1.2 and 1.3
   octet                 1
   short                 2
   unsigned short        2
   long                  4
   unsigned long         4
   long long             8
   unsigned long long    8
   float                 4
   double                8
   long double           8
   boolean               1
   enum                  4

   Alignment is defined above as being relative to the beginning of an octet stream. The first octet of the stream is octet index zero (0); any data type may be stored starting at this index. Such octet streams begin at the start of a GIOP message header (see Section 15.4.1, “GIOP Message Header,” on page 15-31) and at the beginning of an encapsulation, even if the encapsulation itself is nested in another encapsulation (see Section 15.3.3, “Encapsulation,” on page 15-14).
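
   The alignment rule can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the names ALIGNMENT, alignment_gap, and layout are hypothetical, and the sketch covers only types whose size equals their alignment boundary (long double, for example, is left out).

```python
# Alignment boundaries in octets for a few primitive types, taken from Table 15-1.
# The dictionary and function names are illustrative assumptions.
ALIGNMENT = {"octet": 1, "short": 2, "long": 4, "long long": 8, "double": 8}


def alignment_gap(index: int, n: int) -> int:
    """Return the minimum number of gap octets so that index + gap is a multiple of n."""
    return (-index) % n


def layout(types):
    """Return (type, start_index) pairs for primitives marshaled in order from index 0."""
    index = 0
    placed = []
    for name in types:
        n = ALIGNMENT[name]
        index += alignment_gap(index, n)  # skip the alignment gap, if any
        placed.append((name, index))
        index += n  # for these types the size equals the alignment boundary
    return placed


print(layout(["octet", "long", "short", "double"]))
# [('octet', 0), ('long', 4), ('short', 8), ('double', 16)]
```

   In this run the long is preceded by a 3-octet gap and the double by a 6-octet gap; the short needs no gap because index 8 is already a multiple of 2.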

   15.3.1.2 Integer Data Types

   Figure 15-1 on page 15-7 illustrates the representations for OMG IDL integer data types, including the following data types: short, unsigned short, long, unsigned long, long long, and unsigned long long.

   The figure illustrates bit ordering and size. Signed types (short, long, and long long) are represented as two’s complement numbers; unsigned versions of these types are represented as unsigned binary numbers.

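   [Figure 15-1: octet layouts for octet, short (octets 0-1), long (octets 0-3), and long long (octets 0-7). In the big-endian encoding the most significant byte (MSB) is at octet 0; in the little-endian encoding the least significant byte (LSB) is at octet 0.]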

   Figure 15-1 Sizes and bit ordering in big-endian and little-endian encodings of OMG IDL integer data types, both signed and unsigned.
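
   As an informal check of the integer representations, the sketch below uses Python's standard struct module to produce both byte orderings. The helper names encode_long and decode_ushort are hypothetical and chosen only for this example.

```python
import struct


def encode_long(value: int, little_endian: bool) -> bytes:
    """Encode a signed 32-bit IDL long (two's complement) in the requested byte order."""
    return struct.pack("<i" if little_endian else ">i", value)


def decode_ushort(data: bytes, little_endian: bool) -> int:
    """Decode a 2-octet IDL unsigned short from the given byte order."""
    return struct.unpack("<H" if little_endian else ">H", data)[0]


print(encode_long(-2, little_endian=False).hex())   # fffffffe
print(encode_long(-2, little_endian=True).hex())    # feffffff
print(decode_ushort(b"\x01\x02", little_endian=False))  # 258
print(decode_ushort(b"\x01\x02", little_endian=True))   # 513
```

   The same two octets decode to different values depending on the ordering, which is why the recipient must honor the byte-order indication carried by the message or encapsulation.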

   15.3.1.3 Floating Point Data Types

   Figure 15-2 on page 15-9 illustrates the representation of floating point numbers. These exactly follow the IEEE standard formats for floating point numbers [1], selected parts of which are abstracted here for explanatory purposes. The diagram shows three different components for floating point numbers: the sign bit (s), the exponent (e), and the fractional part (f) of the mantissa. The sign bit has values of 0 or 1, representing positive and negative numbers, respectively.

   [1] “IEEE Standard for Binary Floating-Point Arithmetic,” ANSI/IEEE Standard 754-1985, Institute of Electrical and Electronics Engineers, August 1985.

   For single-precision float values the exponent is 8 bits long, comprising e1 and e2 in the figure, where the 7 bits in e1 are most significant. The exponent is represented as excess 127. The fractional part of the mantissa (f1 through f3) is a 23-bit fraction f, so that the mantissa m = 1 + f satisfies 1.0 <= m < 2.0, with f1 being most significant and f3 being least significant. The value of a normalized number is described by:

   $(-1)^{\mathrm{sign}} \times 2^{(\mathrm{exponent} - 127)} \times (1 + \mathrm{fraction})$

   For double-precision values the exponent is 11 bits long, comprising e1 and e2 in the figure, where the 7 bits in e1 are most significant. The exponent is represented as excess 1023. The fractional part of the mantissa (f1 through f7) is a 52-bit fraction f, so that the mantissa m = 1 + f satisfies 1.0 <= m < 2.0, with f1 being most significant and f7 being least significant. The value of a normalized number is described by:

   $(-1)^{\mathrm{sign}} \times 2^{(\mathrm{exponent} - 1023)} \times (1 + \mathrm{fraction})$

   For double-extended floating-point values the exponent is 15 bits long, comprising e1 and e2 in the figure, where the 7 bits in e1 are the most significant. The exponent is represented as excess 16383. The fractional part of the mantissa (f1 through f14) is 112 bits long, with f1 being the most significant. The value of a long double is determined by:

   $(-1)^{\mathrm{sign}} \times 2^{(\mathrm{exponent} - 16383)} \times (1 + \mathrm{fraction})$

   [Figure 15-2: octet layouts for float (octets 0-3), double (octets 0-7), and long double (octets 0-15). In the big-endian encoding, octet 0 holds the sign bit s and e1; octet 1 holds e2 together with f1 for float and double, and e2 alone for long double; the remaining octets hold the rest of the fraction in order of decreasing significance (through f3 for float, f7 for double, and f14 for long double). In the little-endian encoding the octet order is reversed, so the least significant fraction octet is at octet 0 and the octet holding s and e1 is last.]

   Figure 15-2 Sizes and bit ordering in big-endian and little-endian representations of OMG IDL single, double precision, and double extended floating point numbers.
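
   The single-precision layout and the normalized-value formula can be cross-checked with a short Python sketch. It is illustrative only: decode_float_be is a hypothetical helper, it handles normalized numbers only, and it relies on Python's struct module to produce the reference IEEE encoding.

```python
import struct


def decode_float_be(octets: bytes) -> float:
    """Decode a normalized single-precision value from 4 big-endian octets."""
    sign = octets[0] >> 7            # s: top bit of octet 0
    e1 = octets[0] & 0x7F            # e1: the 7 most significant exponent bits
    e2 = octets[1] >> 7              # e2: the least significant exponent bit
    exponent = (e1 << 1) | e2        # 8-bit excess-127 exponent
    # f1 (7 bits), f2 (8 bits), f3 (8 bits) form the 23-bit fraction
    frac_bits = ((octets[1] & 0x7F) << 16) | (octets[2] << 8) | octets[3]
    if exponent in (0, 255):
        raise ValueError("zero, denormalized, infinity, or NaN: formula does not apply")
    fraction = frac_bits / 2**23
    return (-1) ** sign * 2.0 ** (exponent - 127) * (1 + fraction)


wire = struct.pack(">f", -6.25)      # big-endian IEEE single precision
print(wire.hex())                    # c0c80000
print(decode_float_be(wire))         # -6.25
```

   A little-endian float can be decoded with the same helper by reversing the four octets first, for example decode_float_be(struct.pack("<f", -6.25)[::-1]).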

   15.3.1.4 Octet

   Octets are uninterpreted 8-bit values whose contents are guaranteed not to undergo any conversion during transmission. For the purposes of describing possible octet values in this specification, octets may be considered as unsigned 8-bit integer values.

   15.3.1.5 Boolean

   Boolean values are encoded as single octets, where TRUE is encoded as the value 1 and FALSE as the value 0.

   15.3.1.6 Character Types

   An IDL character is represented as a single octet; the code set used for transmission of character data (e.g., TCS-C) between a particular client ORB and server ORB is determined via the process described in Section 13.10, “Code Set Conversion,” on page 13-37. In the case of multi-byte encodings of characters, a single instance of the char type may only hold one octet of any multi-byte character encoding.

   Note – Full representation of multi-byte characters will require the use of an array of IDL char variables.

   For GIOP version 1.1, the transfer syntax for an IDL wide character depends on whether the transmission code set (TCS-W, which is determined via the process described in Section 13.10, “Code Set Conversion,” on page 13-37) is byte-oriented or non-byte-oriented:

   For GIOP versions 1.2 and 1.3, wchar is encoded as an unsigned binary octet value, followed by the elements of the octet sequence representing the encoded value of the wchar. The initial octet contains a count of the number of elements in the sequence, and the elements of the sequence of octets represent the wchar, using the negotiated wide character encoding.

   Note – The GIOP 1.2 and 1.3 encoding of wchar is similar to the encoding of an octet sequence, except for its use of a single octet to encode the value of the length.
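
   The GIOP 1.2 and 1.3 wchar layout (one length octet followed by the encoded octets) can be sketched as follows. This is a hedged illustration: it assumes UTF-16 has been negotiated as the TCS-W and uses Python's "utf-16-be" codec to stand in for it; encode_wchar_12 and decode_wchar_12 are hypothetical names.

```python
def encode_wchar_12(ch: str, codec: str = "utf-16-be") -> bytes:
    """Encode one wide character as a length octet followed by its encoded octets."""
    encoded = ch.encode(codec)
    if len(encoded) > 255:
        raise ValueError("encoded wchar does not fit a single-octet length")
    return bytes([len(encoded)]) + encoded


def decode_wchar_12(data: bytes, offset: int = 0, codec: str = "utf-16-be"):
    """Decode one wchar at offset; return (character, index of the next octet)."""
    length = data[offset]            # initial octet: count of following octets
    start = offset + 1
    return data[start:start + length].decode(codec), start + length


print(encode_wchar_12("A").hex())                 # 020041
print(decode_wchar_12(bytes.fromhex("020041")))   # ('A', 3)
```

   Because the length is carried in a single octet that needs no alignment, this encoding is consistent with the alignment of 1 listed for wchar under GIOP 1.2 and 1.3.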

   For GIOP versions prior to 1.2, interoperability for wchar is limited to the use of two-octet fixed-length encoding.

   wchar values in encapsulations are assumed to be encoded using GIOP version 1.2 and 1.3 CDR.

   If UTF-16 is selected as the TCS-W, the CDR encoding can be big-endian or little-endian, but defaults to big-endian. By placing a BOM (byte order marker) at the front of the wstring or wchar encoding, it can be sent either big-endian or little-endian. In particular, the CDR rules for endianness of UTF-16 encoded wstring or wchar values are as follows:

   If an ORB decides to use BOM to indicate endianness, it shall add the BOM to the beginning of wchar or wstring values when encoding the value, since it is not present in wchar or wstring values passed by the user.

   If a BOM is present at the beginning of a wchar or wstring received in a GIOP message, the ORB shall remove the BOM before passing the value to the user.
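
   The two BOM rules above might be handled along the following lines. This is a sketch under stated assumptions, not the ORB's actual marshaling code: the function names are hypothetical, the payload is taken to be only the UTF-16 octets (length prefixes and other framing are omitted), and a payload without a BOM is treated as big-endian in line with the default mentioned earlier.

```python
BOM_BE = b"\xfe\xff"  # U+FEFF encoded big-endian
BOM_LE = b"\xff\xfe"  # U+FEFF encoded little-endian


def add_bom(text: str, little_endian: bool) -> bytes:
    """Sender side: prepend a BOM, since the user-supplied value does not contain one."""
    if little_endian:
        return BOM_LE + text.encode("utf-16-le")
    return BOM_BE + text.encode("utf-16-be")


def strip_bom_and_decode(payload: bytes) -> str:
    """Receiver side: remove a leading BOM, if any, before handing the value to the user."""
    if payload[:2] == BOM_BE:
        return payload[2:].decode("utf-16-be")
    if payload[:2] == BOM_LE:
        return payload[2:].decode("utf-16-le")
    return payload.decode("utf-16-be")  # assumption: no BOM means the big-endian default


print(add_bom("hi", little_endian=True).hex())                 # fffe68006900
print(strip_bom_and_decode(add_bom("hi", little_endian=True))) # hi
```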

   If a client ORB erroneously sends wchar or wstring data in a GIOP 1.0 message, the server shall generate a MARSHAL standard system exception, with standard minor code 5.

   If a server erroneously sends wchar data in a GIOP 1.0 response, the client ORB shall raise a MARSHAL exception to the client application with standard minor code 6.