Performance
Moderators: chulett, rschirm, roy
Performance
Hi
Please suggest a way to make the job run faster on EE.This is the sql on a server job.Our database is Sybase IQ.SELECT
MAX(KEY) KEY,
Num,
case when substring(trim(col1),1,locate(trim(col1),' ',1)-1)='0' then ''
else substring(trim(col1),1,locate(trim(col1),' ',1)-1) end||' '||
case when substring(trim(col1),(locate(trim(col1),' ',-1))+1,length(trim(col1)))='0' then ''
else substring(trim(col1),(locate(trim(col1),' ',-1))+1,length(trim(col1))) end
AS col1
FROM table
GROUP BY Num, col1
I created a view to do the following:
key col1
-------------- ----------------
529193430 130076 6385 272
529193431 130076 6355 272
529193432 130076 5355 272
529193433 130076 5345 272
473221279 172808 555555555
481475905 173338 555555555
494442315 174095 555555555
results
key col1
-------------- ----------------
529193433 130076 272
473221279 172808 555555555
494442315 174095 555555555
481475905 173338 555555555
Please suggest a way to make the job run faster on EE.This is the sql on a server job.Our database is Sybase IQ.SELECT
MAX(KEY) KEY,
Num,
case when substring(trim(col1),1,locate(trim(col1),' ',1)-1)='0' then ''
else substring(trim(col1),1,locate(trim(col1),' ',1)-1) end||' '||
case when substring(trim(col1),(locate(trim(col1),' ',-1))+1,length(trim(col1)))='0' then ''
else substring(trim(col1),(locate(trim(col1),' ',-1))+1,length(trim(col1))) end
AS col1
FROM table
GROUP BY Num, col1
I created a view to do the following:
key col1
-------------- ----------------
529193430 130076 6385 272
529193431 130076 6355 272
529193432 130076 5355 272
529193433 130076 5345 272
473221279 172808 555555555
481475905 173338 555555555
494442315 174095 555555555
results
key col1
-------------- ----------------
529193433 130076 272
473221279 172808 555555555
494442315 174095 555555555
481475905 173338 555555555
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
How long does it take outside datastage ?
You appear to be trimming leading and trailing zero. Can't you use trim function itself for that ?
Or something like
I am not a Sybase expert. But you can try preparing an 'inner query' and rewrite that so you avoid tirm, substr etc multiple times.
You appear to be trimming leading and trailing zero. Can't you use trim function itself for that ?
Or something like
Code: Select all
SubStr(trim(col1), 1, 2) = '0 '
SubStr(trim(col1), -1, -2) = ' 0'
The existing job that need improvement does selects from a view that is created in Sybase IQ and the job takes 2 hours.The view does the whole removing the middle number in col1 and get the max (num) to remove dumplicate.
Doing this in datastage i first use stage variables to remove the middle number in col1 and the put the data in aggreagor stage to get the max and group by num and col1.It takes 4 to 6 hours.
My task is to increase performance now that we use EE bt the more i try to find new ways the slower if gets..I am even starting to think 2 hours s the best perfomance for rows 529193388.
Thanks for ur replies
Doing this in datastage i first use stage variables to remove the middle number in col1 and the put the data in aggreagor stage to get the max and group by num and col1.It takes 4 to 6 hours.
My task is to increase performance now that we use EE bt the more i try to find new ways the slower if gets..I am even starting to think 2 hours s the best perfomance for rows 529193388.
Thanks for ur replies
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
So, one job running on 6 nodes brought your server to its knees? And 2 hours to process half a billion records is 'poor performance'? Me, I would be happy with the 2 hours, especially considering the amount of hardware you (apparently) have to throw at the problem.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Thank you,i needed a second person to say this,i told our developers that this job is actually designed very well and is running fast for these records.Most people who complain about poor perfomance in Dsxchange they are talking less than 80 000 000 with simple tranformation.My case is even worse getting max value and grouping by.
I agree 100%.
I agree 100%.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Simple economics: supply and demand. Did 6 nodes make an improvement in elapsed time? Maybe you can schedule this job when there are no developers? Anyway in a production environment there should not be any developers, so you would throw as many resources at the job as the system can deliver comfortably.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI