Sort stage outputs Fewer Records
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 12
- Joined: Wed Jul 25, 2007 6:56 am
- Location: Hyd
Sort stage outputs Fewer Records
Hi,
We are using a federated join between 2 DB2 systems. and joining the output. Before joining we are using explicit sort stages to sort the data.
Following is the job design (RCP is used)
1DB2 Stage ------------ 1Sort Stage
######################## ----- Join Stage --Dataset
2DB2 Stage ------------ 2Sort Stage
Following are the record counts
Output 1DB2 Stage and Output 1 Sort Stage - 5000
Output 2DB2 stage is 5000 BUT Output 2 Sort Stage is 2500 only. (This is the problem. This should have given out 5000 records)
The job ran fine without any warnings. But we could find the following informational message (not warning) in the log
[code]/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 0 600 edwdev edwdev-sw1 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 1 600 edwdev-2 edwdev-sw1a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 2 600 edwdev-3 edwdev-sw2 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 3 600 edwdev-4 edwdev-sw2a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 4 600 edwdev-5 edwdev-sw1 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 5 600 edwdev-6 edwdev-sw1a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 6 600 edwdev-7 edwdev-sw2 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 7 600 edwdev-8 edwdev-sw2a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 8 600 edwddb1 edwddb1-sw1a 1214385257.105183.607018[/code]
Some observations:
I made sure that all the partitioning keys are right.
We have similar jobs and they are giving the same issue at the 2Sort Stage.
The other jobs which do not have federated join (and have 2 sort stages followed by join stage) run fine.
Also the user running the job has access to all the nodes.
Please help me identify whats going wrong.
Thanks
Satish Balantarapu
We are using a federated join between 2 DB2 systems. and joining the output. Before joining we are using explicit sort stages to sort the data.
Following is the job design (RCP is used)
1DB2 Stage ------------ 1Sort Stage
######################## ----- Join Stage --Dataset
2DB2 Stage ------------ 2Sort Stage
Following are the record counts
Output 1DB2 Stage and Output 1 Sort Stage - 5000
Output 2DB2 stage is 5000 BUT Output 2 Sort Stage is 2500 only. (This is the problem. This should have given out 5000 records)
The job ran fine without any warnings. But we could find the following informational message (not warning) in the log
[code]/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 0 600 edwdev edwdev-sw1 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 1 600 edwdev-2 edwdev-sw1a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 2 600 edwdev-3 edwdev-sw2 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 3 600 edwdev-4 edwdev-sw2a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 4 600 edwdev-5 edwdev-sw1 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 5 600 edwdev-6 edwdev-sw1a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 6 600 edwdev-7 edwdev-sw2 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 7 600 edwdev-8 edwdev-sw2a 1214385257.105183.607018
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 8 600 edwddb1 edwddb1-sw1a 1214385257.105183.607018[/code]
Some observations:
I made sure that all the partitioning keys are right.
We have similar jobs and they are giving the same issue at the 2Sort Stage.
The other jobs which do not have federated join (and have 2 sort stages followed by join stage) run fine.
Also the user running the job has access to all the nodes.
Please help me identify whats going wrong.
Thanks
Satish Balantarapu
-
- Participant
- Posts: 42
- Joined: Wed Dec 26, 2007 1:58 am
- Location: India
Re: Sort stage outputs Fewer Records
Hi Satish,
what kind of partitioning ur applying?
Instead of aplying a explicit sort stage,why don't u try sorting in the partition.
How you are telling that the second sort is gving only 2500 records,if u r seeing through the perormance statistics,then u hv to give more info of what kind of join u r doing in the join stage..........
Thanks
what kind of partitioning ur applying?
Instead of aplying a explicit sort stage,why don't u try sorting in the partition.
How you are telling that the second sort is gving only 2500 records,if u r seeing through the perormance statistics,then u hv to give more info of what kind of join u r doing in the join stage..........
Thanks
satishbalantrapu wrote:Hi,
We are using a federated join between 2 DB2 systems. and joining the output. Before joining we are using explicit sort stages to sort the data.
Following is the job design (RCP is used)
1DB2 Stage ------------ 1Sort Stage
######################## ----- Join Stage --Dataset
2DB2 Stage ------------ 2Sort Stage
Following are the record counts
Output 1DB2 Stage and Output 1 Sort Stage - 5000
Output 2DB2 stage is 5000 BUT Output 2 Sort Stage is 2500 only. (This is the problem. This should have given out 5000 records)
The job ran fine without any warnings. But we could find the following informational message (not warning) in the log
Some observations:Code: Select all
/u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 0 600 edwdev edwdev-sw1 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 1 600 edwdev-2 edwdev-sw1a 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 2 600 edwdev-3 edwdev-sw2 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 3 600 edwdev-4 edwdev-sw2a 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 4 600 edwdev-5 edwdev-sw1 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 5 600 edwdev-6 edwdev-sw1a 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 6 600 edwdev-7 edwdev-sw2 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 7 600 edwdev-8 edwdev-sw2a 1214385257.105183.607018 /u001/Ascential/DataStage/PXEngine/etc/standalone.sh -APT_PMsetupFailedFlag /u001/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag edwdev 10003 8 600 edwddb1 edwddb1-sw1a 1214385257.105183.607018
I made sure that all the partitioning keys are right.
We have similar jobs and they are giving the same issue at the 2Sort Stage.
The other jobs which do not have federated join (and have 2 sort stages followed by join stage) run fine.
Also the user running the job has access to all the nodes.
Please help me identify whats going wrong.
Thanks
Satish Balantarapu
Regards
-
- Charter Member
- Posts: 193
- Joined: Tue Sep 05, 2006 8:01 pm
- Location: Australia
-
- Participant
- Posts: 12
- Joined: Wed Jul 25, 2007 6:56 am
- Location: Hyd
Re: Sort stage outputs Fewer Records
Hi Poornima,
I believe you meant inline sorting in join stage. I used explicit sort stage because we had existing jobs. I just wanted to know why there is a difference in the record counts.
Since an inline sort is equivalent to explicit sort, I thought this should work.
Also I am using an inner join.
Partitioning:
Hash partitioning in both sort stages
Same partitioning in both input links of join stage. (Tried Auto too )
I believe you meant inline sorting in join stage. I used explicit sort stage because we had existing jobs. I just wanted to know why there is a difference in the record counts.
Since an inline sort is equivalent to explicit sort, I thought this should work.
Also I am using an inner join.
Partitioning:
Hash partitioning in both sort stages
Same partitioning in both input links of join stage. (Tried Auto too )
-
- Participant
- Posts: 12
- Joined: Wed Jul 25, 2007 6:56 am
- Location: Hyd
-
- Participant
- Posts: 49
- Joined: Fri May 11, 2007 12:24 am
- Location: kolkata
hi,
As per the job design i would suggest you why do not you perform the join in the DataBase it will reduce the time and the overhead of the system.
get the output of the join from the database into the required dataset
in the DB2 stage
do the following
select a.clo1, a.col2 , b.col1,b.col3
from table1 a , table2 b
where a.col1=b.col1
now get the data into the Dataset.
As per the job design i would suggest you why do not you perform the join in the DataBase it will reduce the time and the overhead of the system.
get the output of the join from the database into the required dataset
in the DB2 stage
do the following
select a.clo1, a.col2 , b.col1,b.col3
from table1 a , table2 b
where a.col1=b.col1
now get the data into the Dataset.
regards
sunayan
sunayan
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia
-
- Participant
- Posts: 12
- Joined: Wed Jul 25, 2007 6:56 am
- Location: Hyd
Keshav,keshav0307 wrote:sometime the monitor behave crazy.
probably the link is showing fewer records because of the join stage.
can you create another job with just the
2DB2 Stage ------------ 2Sort Stage ----------Sequential file Stage
to check if the sort stage is still giving fewer records.
You are right. The monitor did not show me the correct results. I got the actual record counts using the env variable APT_RECORD_COUNTS. So my problem about the record count is solved.
Moral: Do not completely rely the on job performance statistics
Now I have a problem with the join stage. Even though there are no duplicates at the source level (both the sources), join stage's output has duplicates Any ideas
Thanks,
Satish
-
- Participant
- Posts: 12
- Joined: Wed Jul 25, 2007 6:56 am
- Location: Hyd
sunayan_pal wrote:hi,
As per the job design i would suggest you why do not you perform the join in the DataBase it will reduce the time and the overhead of the system.
get the output of the join from the database into the required dataset
in the DB2 stage
do the following
select a.clo1, a.col2 , b.col1,b.col3
from table1 a , table2 b
where a.col1=b.col1
now get the data into the Dataset.
Hi sunayan
Thanks. That is my last resort. (Because I am expected to develop jobs similar to existing ones.)
Thanks,
Satish