how many records a dataset can hold


Pavan_Yelugula
Premium Member
Posts: 133
Joined: Tue Nov 23, 2004 11:24 pm
Location: India

how many records a dataset can hold

Post by Pavan_Yelugula »

Hi,
I have huge tables in my database, with around 15 million records. Instead of opening the table every time I need the required information, I am writing all the data to a Dataset and using that (for one country at a time, the Dataset might hold around 6 million records).

Will this be a performance bottleneck?
How many records can a Dataset hold? Where is the trade-off between using the Dataset and keeping the table and using it directly?

Can you also please tell me how many records my files can hold and still be considered for use in the Lookup stage?

Any inputs will be really helpful
Thanks and Regards
Pavan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Pavan,

15 million records can easily be held in a dataset. If your file system is still limited to 2 GB per file, then your partitioning algorithm needs to ensure that no single file exceeds that size. This depends not only on the number of records but also on the size of each record.
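
As a rough illustration of that check, here is a minimal Python sketch that estimates the size of each partition file and compares it with a 2 GB limit. The function names and the 200-byte average record size are assumptions made up for the example, and the estimate assumes an even split and ignores any storage overhead the dataset format adds.

```python
# Illustrative helpers only; the names and sample figures are assumptions.
TWO_GB = 2 * 1024**3  # 2 GB per-file limit, in bytes


def per_partition_bytes(records: int, avg_record_bytes: int, partitions: int) -> int:
    """Approximate size of each partition file, assuming an even split."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    total = records * avg_record_bytes
    return -(-total // partitions)  # ceiling division


def fits_limit(records: int, avg_record_bytes: int, partitions: int,
               limit_bytes: int = TWO_GB) -> bool:
    """True if no partition file is expected to exceed the limit."""
    return per_partition_bytes(records, avg_record_bytes, partitions) <= limit_bytes


if __name__ == "__main__":
    # 15 million records at an assumed 200 bytes each.
    for parts in (1, 2, 4):
        size = per_partition_bytes(15_000_000, 200, parts)
        print(parts, size, fits_limit(15_000_000, 200, parts))
```

With these assumed figures a single file would be about 3,000,000,000 bytes and would exceed the limit, while two or more evenly filled partitions would stay under it. A skewed partitioning key would change the picture, since one file could then be much larger than the average.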
nkreddy
Premium Member
Posts: 23
Joined: Mon Jun 21, 2004 7:12 am
Location: New York

Post by nkreddy »

Arnd,

Would you please explain more on this?

Does that mean that if we have three partitions, we will have 2 GB times 3, or that the 2 GB file is split evenly across all three partitions?

Thanks,
NK
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Pavan,

If you have an even split, each partitioned file can hold (approximately) 2 GB, assuming your OS and/or file system is still limited to a 2 GB maximum file size, so with 3 partitions you can hold up to 6 GB of data. It would make sense to adjust your number of partitions so that you don't come close to hitting that limit. In addition, the more files you have, the more you can parallelize (is that a valid English verb?) intrinsically.
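
To put numbers on that, here is a small Python sketch that works out the minimum number of partitions needed to keep every file under the limit with some safety margin. The function name and the 80% headroom default are assumptions chosen for illustration, and the calculation assumes the data is spread evenly.

```python
# Illustrative helper only; the name and the headroom default are assumptions.
TWO_GB = 2 * 1024**3  # 2 GB per-file limit, in bytes


def min_partitions(total_bytes: int, limit_bytes: int = TWO_GB,
                   headroom: float = 0.8) -> int:
    """Smallest partition count that keeps each evenly filled file
    at or below headroom * limit_bytes."""
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in the range (0, 1]")
    usable = int(limit_bytes * headroom)  # bytes we allow per file
    return max(1, -(-total_bytes // usable))  # ceiling division, at least 1


if __name__ == "__main__":
    six_gb = 6 * 1024**3
    print(min_partitions(six_gb, headroom=1.0))  # no safety margin
    print(min_partitions(six_gb))                # default 80% margin
```

With no margin, 6 GB fits exactly into 3 files of 2 GB each. With the assumed 80% margin the same data needs 4 partitions, which leaves room for growth and for a split that is not perfectly even.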
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

No, it's an Americanism: "nouning" verbs and "verbing" nouns is a trait primarily encountered between the North Atlantic and Pacific oceans.
Ostensibly this is in the interests of efficient language. I disagree, and feel that they abase the language.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Ray, I agree with you that it is an abuse of English, but at least it is a non-cardinal language sin. Using telephone shorthand (Cuz, Pls, u) and intentional misspellings in plain-language English text is a crime punishable by either several hundred years on the 9th plane of hell or having to watch a non-stop shopping TV channel while tied and bound to a chair like Alex was in Anthony Burgess' "A Clockwork Orange" :twisted: