Sequential File- Using multiple readers



LavanyaRamesh007
Participant
Posts: 42
Joined: Mon Apr 21, 2008 1:49 am

Post by LavanyaRamesh007 »

Hi,

Why do we need to explicitly specify the number of readers per node for the Sequential File stage? What implications does it have for performance?
How do I decide how many readers per node to use? There is also an option to read using multiple nodes (Yes/No).
I would like to understand both options. Kindly help.
AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Post by AmeyJoshi14 »

Hi,
The link below might help you out.

viewtopic.php?t=104589&highlight=number ... s+per+node

Do search this forum.
http://findingjobsindatastage.blogspot.com/
Theory is when you know all and nothing works. Practice is when all works and nobody knows why. In this case we have put together theory and practice: nothing works. and nobody knows why! (Albert Einstein)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You don't need to specify number of readers per node - that's why it's an option. Its default value is 1. You should only need to increase it to read very large files in a smaller amount of time, but not so fast that you flood the next operator downstream.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
smishra.ds
Premium Member
Posts: 9
Joined: Wed Apr 23, 2008 12:11 pm
Location: Global

how many readers will be generated?

Post by smishra.ds »

I am trying to use the multiple readers per node option in the Sequential File stage.

My parallel environment is SMP.
My APT configuration file has two logical nodes on the same physical box (SMP environment).

If I choose 2 readers per node, how many readers will be created to read the file when the job runs?

1. Number of readers = 2 * 2 = 4
Explanation: the option says "readers per node", and there are two nodes.

2. Number of readers = 2
Explanation: the sequential file is read by only one node even when more nodes are available, so however many readers we specify are all created on the single node that reads the file.

When I ran the job with Number Of Readers Per Node = 2, the progress report showed two readers reading simultaneously, each reading 50% of the data.
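The two candidate counts can be written out as a small sketch. The function names and variables are only illustrative (they are not DataStage APIs), and the "observed" value is simply the two readers seen in the progress report for this one job.

```python
def readers_if_every_node_reads(nodes, readers_per_node):
    # Explanation 1: every node in the configuration file starts its own readers.
    return nodes * readers_per_node


def readers_if_one_node_reads(readers_per_node):
    # Explanation 2: the file is read on a single node, whatever the node count.
    return 1 * readers_per_node


nodes_in_config = 2    # two logical nodes in the APT configuration file
readers_per_node = 2   # value set on the Sequential File stage
observed_readers = 2   # readers shown in the progress report for this run

candidates = {
    "explanation 1": readers_if_every_node_reads(nodes_in_config, readers_per_node),
    "explanation 2": readers_if_one_node_reads(readers_per_node),
}
matching = [name for name, count in candidates.items() if count == observed_readers]
print(candidates, matching)
```

For this run, only the second explanation gives a count that agrees with the progress report.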

I am confused between the two explanations above, Ray.

Also, is my understanding correct that a sequential file is read by only one node (any one of those in the configuration file)?

Please help me understand the actual behaviour of the job with these options.

Thanks in advance
LavanyaRamesh007
Participant
Posts: 42
Joined: Mon Apr 21, 2008 1:49 am

Post by LavanyaRamesh007 »

Hi,
This is a slightly tricky question.

1. A single sequential file can be read on ONLY ONE NODE.
Yes, I agree there is an option that tells DataStage whether to keep partitions or not. It comes into the picture only when we read multiple files with one Sequential File stage. For example, say I have 3 files: ab1.csv, ab2.csv and ab3.csv.
I give the file pattern for these files, ask DataStage to read them, and also set the option to read using multiple nodes. In this case the 3 files are read by 3 nodes and in turn the partitioning is remembered. (This is not like a File Set or Data Set remembering its partitioning. A sequential file cannot remember partitioning within a single file.)

2. You can specify that a number of readers run on a single node. This means, for example, that a single file can be partitioned as it is read. I can ask DataStage to read it using 4 or 10 readers, and internally DataStage will use that many readers to read the records.

So, to answer your question: the 2 readers read the records between them, and a single sequential file cannot be read by multiple nodes.
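To picture how several readers on one node could share a single file, here is a minimal Python sketch. It is only an illustration under my own assumptions (newline-delimited records, an even split by byte offset, hypothetical function names) and is not a description of how DataStage implements the option internally.

```python
import os
import tempfile


def record_aligned_ranges(path, readers):
    """Return one (start, end) byte range per reader, each starting on a record boundary."""
    size = os.path.getsize(path)
    starts = []
    with open(path, "rb") as f:
        for i in range(readers):
            nominal = size * i // readers  # even split by bytes
            if nominal == 0:
                starts.append(0)
                continue
            # Back up one byte and read through the next newline, so the
            # range starts at the first whole record at or after `nominal`.
            f.seek(nominal - 1)
            f.readline()
            starts.append(f.tell())
    ends = starts[1:] + [size]
    return list(zip(starts, ends))


def read_range(path, start, end):
    """Read the records that fall inside one reader's byte range."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start).splitlines()


# Demo on a throwaway file with 10 newline-terminated records.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.writelines(b"row%d,value\n" % n for n in range(10))
    demo_path = tmp.name

ranges = record_aligned_ranges(demo_path, readers=2)
per_reader = [read_range(demo_path, s, e) for s, e in ranges]
print(ranges, [len(rows) for rows in per_reader])
os.remove(demo_path)
```

Each reader gets a contiguous slice of the file and no record is split between two readers, which is the same kind of even division smishra.ds saw in the progress report.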