Dataset Size

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

shaimil
Charter Member
Posts: 37
Joined: Fri Feb 28, 2003 5:37 am
Location: UK

Post by shaimil »

I'm loading 4 million rows from Oracle into a dataset. The dataset files tend to be tens of gigabytes, but when I load the same data into a lookup fileset or a plain fileset, the result is only megabytes.

The load to the dataset aborts due to a space issue, but I estimate that it would require 80 GB for the 4 million rows.

Any explanation would be appreciated.
shaimil
Charter Member
Posts: 37
Joined: Fri Feb 28, 2003 5:37 am
Location: UK

Post by shaimil »

I ran the test again with only 2 columns (4k), and the load is much quicker, although I still ended up with two 2 GB files.

The file I need to load has a record length of 44k (varchar).

Does a dataset treat a varchar as a char? In most cases my actual record lengths are nowhere near 44k.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Are your VarChar() columns bounded or unbounded? Using unbounded strings will save space, at a cost of performance.
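
To make the bounded versus unbounded question concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that a bounded VarChar is stored at its full declared length in every row, and that an unbounded one is stored at its actual length plus a small length prefix. The 4-byte prefix, the 500-byte average length, and the reading of "44k" as 44,000 bytes are all hypothetical values chosen for illustration; they are not taken from DataStage documentation and are not meant to reproduce the 80 GB estimate above.

```python
def fixed_width_bytes(rows: int, declared_len: int) -> int:
    """Storage if every row occupies the full declared column length."""
    return rows * declared_len


def variable_width_bytes(rows: int, avg_actual_len: int, prefix_bytes: int = 4) -> int:
    """Storage if every row occupies its actual length plus a length prefix.

    prefix_bytes is an assumed value, not a documented DataStage figure.
    """
    return rows * (avg_actual_len + prefix_bytes)


rows = 4_000_000      # row count from the thread
declared = 44_000     # "44k" record length, read here as 44,000 bytes (assumption)
avg_actual = 500      # hypothetical average of the real data length

fixed = fixed_width_bytes(rows, declared)
variable = variable_width_bytes(rows, avg_actual)

print(f"padded to declared length: {fixed / 1e9:.1f} GB")
print(f"stored at actual length:   {variable / 1e9:.2f} GB")
```

Under these assumptions the padded layout is larger by roughly the ratio of declared length to actual length, which would be consistent with a dataset growing to tens of gigabytes while a fileset of the same rows stays small.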