Problem in Reading from dataset

Ragunathan Gunasekaran
Participant
Posts: 247
Joined: Mon Jan 22, 2007 11:33 pm

Post by Ragunathan Gunasekaran »

Hi,

The following is my load job design:

New_records.ds --------> Funnel ----->Oracle Enterprise stage
Update_records.ds ----->


1) The datasets for the load job are created on a two-node configuration.
2) The upstream job that creates the datasets is completely sequential until the point where it writes to the datasets. Only the target Data Set stages are allowed to execute in parallel (all the other stages in this upstream parallel job are set to run sequentially).
3) When I look at the data distribution in the two datasets, all the data is on node 2 and node 1 is not used at all.
4) The new-records dataset holds just 2000 records and the update dataset holds 0 records.

Any clues as to why the dataset read is so slow (half an hour before anything shows on the output link, for 2000 records)?
Regards
Ragu
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You sure it's the read? What happens if you replace the OE stage and write to a sequential file instead?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Ragunathan Gunasekaran
Participant
Posts: 247
Joined: Mon Jan 22, 2007 11:33 pm

Post by Ragunathan Gunasekaran »

I hash partitioned on a key column and re-ran the upstream job that creates the datasets.

Now the Data Set Management utility shows the records distributed across both nodes, and the load job completes in 2 seconds.
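
To illustrate the difference in distribution, here is a minimal Python sketch (not DataStage code). It assumes a CRC32-based hash and two partitions; the function names and the use of `zlib.crc32` are illustrative assumptions and do not reflect how the parallel engine actually hashes keys. It only shows that hashing a key column spreads the 2000 records over both partitions, whereas a sequential writer leaves them all in one.

```python
import zlib
from collections import Counter


def hash_partition(keys, num_partitions):
    """Count records per partition when each key is hashed (assumed CRC32)."""
    counts = Counter()
    for key in keys:
        partition = zlib.crc32(str(key).encode("utf-8")) % num_partitions
        counts[partition] += 1
    return counts


def single_partition(keys, target_partition):
    """Count records per partition when every record lands in one partition."""
    return Counter({target_partition: sum(1 for _ in keys)})


if __name__ == "__main__":
    keys = range(2000)  # hypothetical key values for the 2000 new records
    print("sequential writer:", dict(single_partition(keys, 1)))
    print("hash on key:      ", dict(hash_partition(keys, 2)))
```

This does not explain the half-hour read, it only reproduces the two layouts described above.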

I haven't tried replacing the OE stage, since I could see in the Monitor that no records were coming out of the output links of the datasets.

Any idea why this happens when a dataset is read from a single partition?
Regards
Ragu
Ragunathan Gunasekaran
Participant
Posts: 247
Joined: Mon Jan 22, 2007 11:33 pm

Post by Ragunathan Gunasekaran »

No ideas?
Regards
Ragu
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Me? No, not really and unfortunately I don't have time to build my own test case and experiment with this. Besides, sounds like you solved your problem and your load happens in 2 seconds now.

Perhaps someone else has an idea. Guessing it's related to the Funnel and the unbalanced partitioning.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Do you have many unbounded VarChar data types in the record? What is the speed if you use this design as a test?

DataSet  --------->  Copy
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.