Overall limits of DataStage

Posted: Tue Aug 22, 2006 2:02 pm
by bcarlson
How many fields can DataStage handle on import? How about the overall length of the input record?

If there are hard limits set in DataStage, where are these documented? We have a file with 4300 fields and an overall record length of 32000 bytes that we are having a hard time importing. Are there any environment settings (like APT_* variables) that we need to tweak to accommodate such a large file format?

Any help would be most appreciated! Thanks!

Brad.

Posted: Tue Aug 22, 2006 7:26 pm
by kduke
Brad

The largest I have heard of is just over 2500 fields, and that took forever to import. I think ODBC may choke on record lengths that long.

Posted: Wed Aug 23, 2006 3:03 am
by ray.wurlod
ODBC is limited to 400 columns and a row length of 16KB by default. These limits can be changed by setting MAXFETCHCOLS and MAXFETCHBUFF respectively in the uvodbc.config file.
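
For anyone hitting the same wall, a minimal sketch of what such an entry might look like in uvodbc.config (the DSN name and the values here are placeholders rather than recommended settings; check your installed defaults before changing anything):

    <MyWideSourceDSN>
    DBMSTYPE = ODBC
    MAXFETCHCOLS = 4300
    MAXFETCHBUFF = 65536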

There is no documented limit for the Sequential File import, other than the tedium of checking that each column definition has been guessed correctly!

Posted: Wed Aug 23, 2006 8:19 am
by bcarlson
We will only need about 800 fields spread throughout the 4300, so I imagine we will use the drop option in the input schema. That will let us define the whole schema but not carry any fields that are not needed downstream.

In an ideal world, we would define every field so that later on we can easily add dropped fields back in when the customer changes their mind. :) However, if 4300 is too many, I guess we can always redefine sections as filler and reduce the field count.
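
For what it's worth, a rough sketch of the kind of record schema I have in mind is below; the field names, delimiter settings and the exact {drop} syntax are my guesses for illustration, not taken from our real layout:

    record
    {final_delim=end, delim='|', quote=none}
    (
        cust_id:    string[max=18];
        open_dt:    string[max=10];
        promo_code: string[max=6] {drop};
        filler_01:  string[max=250];
    )

The idea is that a field flagged with the drop option still gets parsed on import but is never carried downstream, while the filler approach simply collapses an unused stretch of the record into a single dummy field.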

Brad.

What about splitting the file?

Posted: Wed Aug 23, 2006 2:04 pm
by jdmiceli
Hi all,

Similar to Klaus' suggestion of cutting out certain fields to work with:

Create multiple sequential files from the source, putting only a hundred or so fields in each file along with the primary keys for those rows. Have one of the sequential files be the primary source for an insert to the target, then use the other files to update the target based on the keys. Alternatively, you could write one job that brings all the source sequential files in at once, though I think that would drop you right back to your current problem of field count.
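
To make the splitting idea a little more concrete, here is a small Python sketch of how the wide file could be carved up outside of DataStage; the file name, delimiter, key positions and fields-per-file value are all placeholder assumptions, not anything from the real layout:

    import csv
    from contextlib import ExitStack
    from itertools import chain

    SOURCE = "wide_source.dat"   # hypothetical input file name
    DELIM = "|"                  # assumed field delimiter
    KEY_COLS = [0, 1]            # assumed positions of the primary key fields
    FIELDS_PER_FILE = 100        # roughly a hundred non-key fields per output file

    with open(SOURCE, newline="") as src:
        reader = csv.reader(src, delimiter=DELIM)
        first = next(reader)     # peek at one record to learn the field count

        non_key = [i for i in range(len(first)) if i not in KEY_COLS]
        slices = [non_key[i:i + FIELDS_PER_FILE]
                  for i in range(0, len(non_key), FIELDS_PER_FILE)]

        with ExitStack() as stack:
            writers = []
            for n, cols in enumerate(slices, start=1):
                out = stack.enter_context(open(f"split_{n:02d}.dat", "w", newline=""))
                writers.append((csv.writer(out, delimiter=DELIM), cols))

            for row in chain([first], reader):
                for writer, cols in writers:
                    # each output file repeats the key fields so the pieces can be re-joined
                    writer.writerow([row[i] for i in KEY_COLS] + [row[i] for i in cols])

Each output file would then carry the keys plus its own slice of roughly a hundred fields, matching the insert-then-update pattern described above.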

Bear in mind, this is just an off-the-wall theory and I have not actually done it this way; it's a newbie trying to think outside the box. If this method makes the more experienced veterans grind their teeth, please correct me. I too have some large field-count files that I'm looking for an efficient way to process, and your corrections to my thought process would be greatly appreciated.

Bestest!