Read same Persistent DataSet multiple times in a job

Posted: Fri Nov 24, 2006 9:14 am
by rwierdsm
Folks,

I'm creating a job that needs to look into a list of values using multiple fields in the incoming file. In server it is quite common practice to set up a list in a hashed file and have multiple links from the single hashed file to the transformer performing the lookup.

The EE version doesn't like that so much. What I've done is created multiple Dataset stages, all pointing to the same underlying persistent dataset and linked each one separately to the lookup stage.

Thus far I have been quite successful with this approach, encountering no errors.

I have not been able to find anything in the manuals or on these forums that explicitly states that this is or is not a permissible thing to do with persistent datasets.

Does anyone out there have any experience using datasets this way?

Thanks in advance for your responses,

Rob W

Posted: Fri Nov 24, 2006 9:19 am
by ArndW
DataSets can be read from simultaneously by different processes, but cannot be written to and read from at the same time.

Look at them as glorified sequential files when it comes to concurrency control.
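To make the sequential-file comparison concrete, here is a minimal Python sketch using an ordinary text file, not a DataStage dataset. The file contents and the names `read_all` and `concurrent_reads` are made up for illustration. Each reader opens its own read-only handle on a file that has already been fully written and closed, and no lock is taken.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_all(path):
    # Each reader opens its own read-only handle; no lock is requested.
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read().splitlines()


def concurrent_reads(rows, readers=4):
    # Write the reference data once and close the file before any reader starts,
    # so nothing is writing while the reads are in progress.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("\n".join(rows) + "\n")
        path = tmp.name
    try:
        # Several readers work on the same path at the same time.
        with ThreadPoolExecutor(max_workers=readers) as pool:
            return list(pool.map(read_all, [path] * readers))
    finally:
        os.remove(path)


results = concurrent_reads(["A,1", "B,2", "C,3"])
```

Every entry in `results` holds the same three rows. The sketch says nothing about what happens if a writer is active during the reads, which is the case to avoid.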

Posted: Fri Nov 24, 2006 9:22 am
by rwierdsm
Thanks, Arnd.

No internal locking for reading or anything like that?

Rob

Posted: Fri Nov 24, 2006 9:44 am
by Nageshsunkoji
rwierdsm wrote:Thanks, Arnd.

No internal locking for reading or anything like that?

Rob
Hi,

Rather than using multiple Dataset stages, a better approach is to use one Dataset stage in your job, followed by a Copy stage downstream that makes as many copies as you need to filter your data. That will save memory and improve performance. As Arnd said, you can read the data from the same underlying persistent dataset at the same time, but you can't write to it at the same time.
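As a loose analogy outside DataStage, the read-once-and-reuse idea can be sketched in Python. This is an illustration only: the reference rows, the record layout and the names `load_reference` and `lookup_fields` are hypothetical, and the sketch makes no claim about how the Copy or Lookup stages work internally.

```python
def load_reference(lines):
    # Read the reference list once and keep it in memory as code -> description.
    reference = {}
    for line in lines:
        code, description = line.split(",", 1)
        reference[code] = description
    return reference


def lookup_fields(record, reference, fields):
    # Look up several fields of one incoming record against the same reference.
    # A field whose value is not in the reference maps to None.
    return {field: reference.get(record[field]) for field in fields}


reference = load_reference(["A,Alpha", "B,Bravo"])
record = {"primary_code": "A", "secondary_code": "B", "other_code": "Z"}
matches = lookup_fields(
    record, reference, ["primary_code", "secondary_code", "other_code"]
)
```

Here `matches` maps `primary_code` to `"Alpha"`, `secondary_code` to `"Bravo"` and `other_code` to `None`, with the reference list read only once.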

Posted: Fri Nov 24, 2006 9:48 am
by rwierdsm
Hi Nageshsunkoji,

I am only using one DataSet, but referring to it multiple times by using multiple stages. Each stage points to the same underlying dataset in the OS.

No need to copy anything. I just needed to know that I could access the same dataset multiple times in the same job.

Rob W

Posted: Fri Nov 24, 2006 10:02 am
by Nageshsunkoji
rwierdsm wrote:Hi Nageshsunkoji,

I am only using one DataSet, but referring to it multiple times by using multiple stages. Each stage points to the same underlying dataset in the OS.

No need to copy anything. I just needed to know that I could access the same dataset multiple times in the same job.

Rob W
That's OK if you're using only one dataset. But in your post you wrote: "What I've done is created multiple Dataset stages, all pointing to the same underlying persistent dataset and linked each one separately to the lookup stage."

If you're using only one dataset and accessing it multiple times, I don't think there is any problem. We use datasets in a similar manner and haven't faced any problems.

Posted: Fri Nov 24, 2006 10:21 am
by rwierdsm
Nageshsunkoji wrote:If you're using only one dataset and accessing it multiple times, I don't think there is any problem. We use datasets in a similar manner and haven't faced any problems.
Good news.

Thanks.

Posted: Fri Nov 24, 2006 12:11 pm
by ArndW
I've used this in the past; no problems with concurrent reads on DataSets or LookupFileSets.

Posted: Fri Nov 24, 2006 1:25 pm
by rwierdsm
Thanks for the input, folks.

Rob W