Sequential File - Using multiple readers

Posted: Thu Jul 10, 2008 4:18 am
by LavanyaRamesh007
Hi,

I would like to know why we need to explicitly specify the number of readers per node for the Sequential File stage. What implications does it have for performance?
How do I decide how many readers per node to use? There is also an option that says read using multiple nodes (Yes/No).
I would like to know about this. Kindly help.

Posted: Thu Jul 10, 2008 4:44 am
by AmeyJoshi14
Hi,
The link below might help you out.

viewtopic.php?t=104589&highlight=number ... s+per+node

Do a search in this forum.

Posted: Thu Jul 10, 2008 7:09 am
by ray.wurlod
You don't need to specify number of readers per node - that's why it's an option. Its default value is 1. You should only need to increase it to read very large files in a smaller amount of time, but not so fast that you flood the next operator downstream.

how many readers will be generated?

Posted: Thu Jul 31, 2008 4:39 am
by smishra.ds
I am trying to utilize this multiple readers per node option in the sequential file stage.

My parallel environment is SMP.
My APT Configuration File has two logical nodes on the same physical box (SMP env).

If I choose the 2 readers per node option, how many readers will be generated to read the file when the job runs?

1. Number of readers = 2 * 2 = 4
Explanation: the option says "readers per node", and there are two nodes.

2. Number of readers = 2
Explanation: the sequential file will be read by only one node even if more than one node is defined, so however many readers we specify will all be generated on the single node that is used to read the file.

When I ran the job with Number Of Readers Per Node = 2, the progress report showed two readers reading simultaneously, each reading 50% of the data.

I am getting confused between the two explanations above, Ray.

Also, is my understanding correct that a sequential file will be read by only one node (any one of those in the config file)?

Please help me out to find the actual behaviour of the job with these options.

Thanks in advance

Posted: Fri Aug 01, 2008 4:08 am
by LavanyaRamesh007
Hi..
This is a bit of a tricky question.
1. Actually, a sequential file can be read on ONLY ONE NODE.
Yes, I do agree that we have an option that says whether we can ask DataStage to keep partitions or not. This comes into the picture only when we read multiple files using a single Sequential File stage. For example, I have 3 files: ab1.csv, ab2.csv and ab3.csv.
I give the pattern for these files and ask DataStage to read them, and I also say read using multiple nodes. In this case the 3 files will be read by 3 nodes, and in turn the partitioning will be remembered. (This is not the same as a file set or data set remembering its partitioning. A sequential file can't remember the partitioning within a single file.)

2. You can specify that a number of readers run on a single node. This means, for example, that a single file can be partitioned as it is read. I can ask DataStage to read it using 4 or 10 readers, and internally DataStage will read the records using that many readers.

So, to answer your question: the 2 readers have read the records, and a single sequential file can't be read by multiple nodes.
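
To picture how several readers can share one file, here is a minimal Python sketch. It is not DataStage's implementation, only an illustration under the assumption that the file is cut into near-equal byte ranges and each reader takes the newline-delimited records that begin inside its own range. The function names `byte_ranges`, `read_range` and `read_with_readers` are made up for this example.

```python
import os


def byte_ranges(file_size, readers):
    """Divide [0, file_size) into `readers` contiguous, near-equal ranges."""
    base, extra = divmod(file_size, readers)
    ranges, start = [], 0
    for i in range(readers):
        # The first `extra` ranges get one additional byte each.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def read_range(path, start, end):
    """Return the records whose first byte lies inside [start, end)."""
    records = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and read to the next newline, so we land
            # on the first record that begins at or after `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:  # end of file
                break
            records.append(line.rstrip(b"\n"))
    return records


def read_with_readers(path, readers):
    """Simulate `readers` readers, one after another, on a single file."""
    size = os.path.getsize(path)
    return [read_range(path, s, e) for s, e in byte_ranges(size, readers)]
```

With 2 readers on a file of equal-length records, each list returned by `read_with_readers` holds about half of the records, which matches the 50/50 split seen in the progress report earlier in the thread. In a real job the readers run concurrently, while this sketch runs them one after another to keep it short.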