Data Representations...

Adam_Clone · Post by **Adam_Clone** » Mon Apr 11, 2005 5:05 am

Hi
I'd like to get some information about how the DS backend engines manage the data from different platforms...like 32-bit, 64-bit etc.
When data comes from heterogeneous databases, how are the data representations made compatible? If data from a 64-bit platform is to be warehoused in a repository on a 32-bit machine, will there be any data loss, say when the data comes from a sequential file?
I know the question is a bit obscure, but I believe it's at least understandable with a little reasoning.
...Regards....

ray.wurlod · Post by **ray.wurlod** » Mon Apr 11, 2005 5:31 am

Currently DataStage is a 32-bit application. You need to access data through 32-bit drivers for the relevant database.

Adam_Clone · Post by **Adam_Clone** » Mon Apr 11, 2005 6:01 am

Thanks....
That has shed some light... but may I ask: on a platform like XP or a desktop Linux, how are large numbers (those that need more than 32 bits, say from a 64-bit platform used for scientific applications) stored when data is "ETL'd" into something like a sequential file on a 32-bit platform such as desktop Linux? Hope you've understood what I asked.

ArndW · Post by **ArndW** » Mon Apr 11, 2005 7:56 am

The 16, 32 & 64-bit designations refer more to internal machine-level pointers, offsets & code and don't relate (directly) to the data types. Thus, even on an 8-bit machine you can have a 64-bit number represented. The database datatypes are largely machine independent. Of course on "wider" machines many operations can be performed on a single word (where an 8-bit machine might need 4 words), so they can be significantly faster; but as far as the ETL process is concerned, the machine bus width or word size is completely transparent.
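
To illustrate the point, here is a minimal sketch in Python (not DataStage code; the variable names are made up for the example). It assumes only the standard `struct` module and shows that the width of a stored value is fixed by the declared format, not by the word size of the machine running the code:

```python
import struct

# A value that needs more than 32 bits to represent.
big_value = 2**40 + 12345

# "<q" declares a little-endian signed 64-bit integer. The layout comes from
# this format string, not from the word size of the host machine.
encoded = struct.pack("<q", big_value)
decoded = struct.unpack("<q", encoded)[0]

# Written as text, as in a sequential file, the number is just a string of
# digits, so its size is limited by the column definition, not by the CPU.
as_text = str(big_value)
restored_from_text = int(as_text)
```

Here `encoded` is always 8 bytes long and both `decoded` and `restored_from_text` equal `big_value`, whether the interpreter is a 32-bit or a 64-bit build.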

Adam_Clone · Post by **Adam_Clone** » Mon Apr 11, 2005 10:17 pm

Hey
But the machine representation of data differs between platforms...
For instance, when I was developing an encryption algorithm in Java on Win 98, the decrypted text (a set of integers representing ASCII) was 16-bit, while the same program on XP gave 32 bits.
Those representational differences are the ones I am talking about. I have been told that the drivers handling connectivity to the different databases manage them implicitly. Can I get some details about that?
...Regards....

Adam_Clone · Post by **Adam_Clone** » Mon Apr 11, 2005 10:31 pm

It was giving 32-bit outputs on Linux also....
But types are handled by the Java Virtual Machine, no matter what the platform... My actual doubt is this: when a join is done between two or more tables from heterogeneous databases, how is representational consistency preserved? Do the relevant drivers convert the data into the representation used on the platform where it is warehoused?

ArndW · Post by **ArndW** » Tue Apr 12, 2005 4:07 am

Adam,

When you feed data to DataStage you define the metadata (i.e. column definitions), and that is where you specify what format is coming in. The full complement of data conversion is available either straight in DataStage built-ins or via the use of the OCONV and ICONV functions.
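
As a rough illustration of what a column definition's precision and scale imply, here is a small sketch in Python (not DataStage code; `fits_decimal` is a hypothetical helper invented for this example). It assumes the target column is declared like a DECIMAL(precision, scale) and checks whether a value can be stored without losing digits:

```python
from decimal import Decimal

def fits_decimal(value: Decimal, precision: int, scale: int) -> bool:
    """Return True if value fits DECIMAL(precision, scale) without loss.

    Hypothetical helper for illustration only; non-finite values never fit.
    """
    if not value.is_finite():
        return False
    # Drop trailing zeros so that 100.00 counts as three integer digits.
    _sign, digits, exponent = value.normalize().as_tuple()
    frac_digits = max(0, -exponent)              # digits after the point
    int_digits = max(0, len(digits) + exponent)  # digits before the point
    return frac_digits <= scale and int_digits <= precision - scale

source_value = Decimal("12345.678")
ok_wide = fits_decimal(source_value, 10, 3)    # target wide enough
ok_narrow = fits_decimal(source_value, 7, 2)   # target has too small a scale
```

With these inputs `ok_wide` is true and `ok_narrow` is false: the value only loses information when the declared target type is narrower than the source value, regardless of the machine's word size.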

Adam_Clone · Post by **Adam_Clone** » Tue Apr 12, 2005 4:42 am

Arnd,
I know that the incoming type is set at the time the stage is defined... but what I asked is this: say data is coming from a DB2 database on a mainframe, and it is to be warehoused on, say, a desktop Unix system with lesser precision. The new representation may well be inadequate, right? What about that?

Adam_Clone · Post by **Adam_Clone** » Tue Apr 12, 2005 4:50 am

.....and what are ICONV and OCONV? Are they used for conversions during extraction by the drivers used for connectivity?

DSXchange
