A DATUM is, in C terminology, a structure. One of the items in that structure is an indicator of what kind of item is currently being held in that DATUM. For example, if it's an integer, then it's stored in four bytes. If it's a float, then it's stored in eight bytes using a sign bit, a 52-bit mantissa and an 11-bit (biased) exponent, as described in the IEEE 754 standard. If it's a connection to an ODBC data source, then it's a pointer to the structure returned by SQLAllocConnect(). And so on. If it's a Char(1) and NLS is not enabled, the actual character is stored in one byte within the structure. If it's a Char(1) and NLS is enabled, the actual character is stored in somewhere between one and four bytes, since DataStage uses a UTF-8 encoding of Unicode. A DATUM might also hold a file handle, a pointer to a subroutine, DataStage's internal representation of NULL (one byte, value 0x80), or an "unassigned variable" (nothing in the data area). As mentioned earlier, other elements in the structure support statements and functions such as REMOVE and EXTRACT. It all works - it's been around more than 20 years. Don't worry about it. And the call interfaces (ICI and GCI) and the supplied NLS maps do have the data type mappings correct; I invite you to take it up with IBM if you believe that this is not the case.
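To make the "structure with a type indicator" idea concrete, here is a minimal sketch in C of a tagged union. It is not the real DATUM layout, which isn't published in this thread; every name in it (datum, datum_type, the DATUM_* tags, the datum_from_* helpers) is made up for illustration, and it only covers a few of the kinds of item mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags; the real DataStage names are not given here. */
typedef enum {
    DATUM_UNASSIGNED,   /* "unassigned variable": nothing in the data area */
    DATUM_NULL,         /* internal NULL: one byte, value 0x80 */
    DATUM_INTEGER,      /* four bytes */
    DATUM_FLOAT,        /* eight bytes, IEEE 754 double */
    DATUM_CHAR,         /* one byte without NLS, one to four bytes as UTF-8 */
    DATUM_POINTER       /* e.g. a connection handle or a subroutine pointer */
} datum_type;

/* Illustrative layout only: a tag, a length, and a shared data area. */
typedef struct {
    datum_type type;    /* which member of the union is currently valid */
    size_t     length;  /* bytes in use in the data area */
    union {
        int32_t       i;
        double        f;
        unsigned char c[4];
        void         *p;
    } data;
} datum;

static datum datum_unassigned(void)
{
    datum d;
    memset(&d, 0, sizeof d);
    d.type = DATUM_UNASSIGNED;
    d.length = 0;
    return d;
}

static datum datum_null(void)
{
    datum d = datum_unassigned();
    d.type = DATUM_NULL;
    d.length = 1;
    d.data.c[0] = 0x80;
    return d;
}

static datum datum_from_int(int32_t v)
{
    datum d = datum_unassigned();
    d.type = DATUM_INTEGER;
    d.length = sizeof d.data.i;
    d.data.i = v;
    return d;
}

static datum datum_from_float(double v)
{
    datum d = datum_unassigned();
    assert(sizeof(double) == 8);   /* holds on IEEE 754 platforms */
    d.type = DATUM_FLOAT;
    d.length = sizeof d.data.f;
    d.data.f = v;
    return d;
}
```

Code that reads such a structure switches on the type field first and only then touches the matching member of the union, which is why one variable can hold an integer at one moment and a string or a handle the next.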
BTW, if you visit the Unicode Consortium web site you will get the full story about Unicode, which can be encoded in 16-bit or 32-bit units. The UTF-8 and UTF-16 encodings specify variable-length representations.
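If you want to see where "between one and four bytes" comes from, here is a small sketch of the standard UTF-8 encoding rules in C. It illustrates the encoding itself, not DataStage's own code; the function name utf8_encode is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out[0..3].
 * Returns the number of bytes written (1 to 4), or 0 if cp is not
 * a valid Unicode scalar value (a surrogate, or above U+10FFFF). */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* 7 bits: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 11 bits: two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {    /* surrogates are not characters */
        return 0;
    }
    if (cp < 0x10000) {                    /* 16 bits: three bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 21 bits: four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

So "A" (U+0041) takes one byte, "é" (U+00E9) takes two, the euro sign (U+20AC) takes three, and anything outside the Basic Multilingual Plane takes four.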
Diff between server Job Parallel Job