Cass stage performance

Infosphere's Quality Product

Moderators: chulett, rschirm

pattemk
Participant
Posts: 84
Joined: Wed May 16, 2007 4:04 pm

Cass stage performance

Post by pattemk »

Hi,

I searched but could not find much help.

I am trying to understand why a job that uses the CASS stage takes so long to finish.

What can be done to improve performance when using the CASS stage in a job?

Is there anything we should avoid doing when using the CASS stage?

Please advise.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Throw us a bone here: you've given us nothing to work with.

How much data?
What else are you doing in the job?
How long is "too long"?
pattemk
Participant
Posts: 84
Joined: Wed May 16, 2007 4:04 pm

Post by pattemk »

Approximately 50,000 records (but it can vary).
The job is: seq file --> CASS --> seq file
Currently it takes 45 minutes to process 50,000 records.

I am wondering if there are any tuning tips for the CASS stage.
Thanks for helping.

[quote="stuartjvnorton"]Throw us a bone here: you've given us nothing to work with.

How much data?
What else are you doing in the job?
How long is "too long"?[/quote]
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

That's way too long to process 50,000 records. Where are the CASS database files stored?
pattemk
Participant
Posts: 84
Joined: Wed May 16, 2007 4:04 pm

Post by pattemk »

Just an FYI, the project is using only one node. I know this might be the reason, but I am wondering whether 45 minutes to process 50,000 records is really too long on one node. Also, what rules of thumb should we follow when using the CASS stage?

Please advise.

[quote="lstsaur"]That's way too long to process 50,000 records. Where are the CASS database files stored?[/quote]
Last edited by pattemk on Thu Nov 04, 2010 3:34 pm, edited 3 times in total.
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

Strange. Usually the CASS database files are under a directory like /../../CASS. Try running a test job using the data (6,500 records) in the QualityStage tutorial directory. It should finish in less than 2 seconds, even with one node.
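
To put the two figures quoted in this thread side by side, here is a rough sketch of the arithmetic in Python. The helper name `records_per_second` is made up for illustration, and the 2-second figure is an upper bound on the tutorial run time, so the tutorial rate below is only a lower bound.

```python
def records_per_second(records: int, seconds: float) -> float:
    """Average throughput over a whole job run."""
    return records / seconds


# Figures quoted in this thread (not measured here).
reported = records_per_second(50_000, 45 * 60)  # 50,000 records in 45 minutes
tutorial = records_per_second(6_500, 2)         # 6,500 records in under 2 seconds

print(f"reported job: {reported:.1f} records/sec")
print(f"tutorial job: at least {tutorial:.0f} records/sec")
print(f"gap: about {tutorial / reported:.1f}x")
```

A gap of that size is hard to put down to the single node, since the tutorial figure is also for one node.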