Problem in Reading from dataset

Ragunathan Gunasekaran · Wed Dec 02, 2009 10:32 am

Hi,

The following is my load job design:

New_records.ds --------> Funnel ----->Oracle Enterprise stage
Update_records.ds ----->

1) The datasets for the load job are created on a two-node configuration.
2) The upstream job that created the datasets is completely sequential until the point where it writes to the datasets. Only the target dataset stages are allowed to execute in parallel (parallelism is turned off for all the other stages in this upstream parallel job).
3) When I look at the data distribution in the two datasets above, all the data is on node 2 and node 1 is not used at all.
4) The new dataset has just 2000 records and the update dataset has 0 records.

Any clues as to why the dataset is read so slowly (half an hour before anything shows up on the output link, for 2000 records)?

chulett · Wed Dec 02, 2009 10:48 am

You sure it's the read? What happens if you replace the OE stage and write to a sequential file instead?

Ragunathan Gunasekaran · Wed Dec 02, 2009 11:13 am

I tried hash partitioning on a key column and re-ran the upstream job that creates the datasets.

Now the dataset management utility shows records distributed across both nodes, and the load job completed in 2 seconds.

I haven't tried replacing the OE stage, as I could see from the Monitor that no records were coming out of the output links of the datasets.

Any idea why this happens when a dataset is read from a single partition?
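To illustrate what hash partitioning on a key column does to the distribution, here is a minimal sketch in Python. It is not DataStage code and does not reproduce the hash function DataStage uses; the key format (`CUST00000` and so on), the MD5-based hash, and the function names are all assumptions made up for this example. It only shows the general idea: each record goes to the partition given by the hash of its key modulo the number of nodes, so 2000 distinct keys end up spread over both partitions instead of sitting on one.

```python
import hashlib
from collections import Counter


def hash_partition(key: str, num_partitions: int) -> int:
    """Pick a partition from a stable hash of the key (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribution(keys, num_partitions):
    """Return the number of records assigned to each partition."""
    counts = Counter(hash_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical key values standing in for the 2000 new records.
keys = [f"CUST{i:05d}" for i in range(2000)]
counts = distribution(keys, 2)
print(counts)  # one count per partition; the two counts sum to 2000
```

If the records are written without repartitioning by a stage running sequentially, there is nothing to spread them out, which matches the single-node distribution seen in the dataset management utility before the change.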

Ragunathan Gunasekaran · Thu Dec 03, 2009 5:44 am

No ideas?

chulett · Thu Dec 03, 2009 7:43 am

Me? No, not really, and unfortunately I don't have time to build my own test case and experiment with this. Besides, it sounds like you solved your problem and your load happens in 2 seconds now.

Perhaps someone else has an idea. Guessing it's related to the Funnel and the unbalanced partitioning.

ray.wurlod · Thu Dec 03, 2009 2:34 pm

Do you have many unbounded VarChar data types in the record? What is the speed if you use this design as a test?

DataSet  --------->  Copy