Job running for ever

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
sshettar
Premium Member
Premium Member
Posts: 264
Joined: Thu Nov 30, 2006 10:37 am

Job running for ever

Post by sshettar »

Hi All,

I have this job which is running for ever and i had to stop the job . the warnings that i see in the director is as follows

Sort_15: When checking operator: When binding output schema variable "sortRec": When binding output interface field "CHK_NUM" to field "CHK_NUM": Implicit conversion from source type "string[max=15]" to result type "string[max=10]": Possible truncation of variable length string.

Sort_15: When checking operator: When binding output schema variable "sortRec": When binding output interface field "EXTERNAL_NUM" to field "EXTERNAL_NUM": Implicit conversion from source type "string[max=15]" to result type "string[max=10]": Possible truncation of variable length string.

Join_12: When checking operator: User inserted sort "Sort_15" does not fulfill the sort requirements of the downstream operator "APT_JoinSubOperatorNC in Join_12"

Join_12: When checking operator: When binding output schema variable "leftRec": When binding output interface field "PO_NUM_PRE" to field "PO_NUM_PRE": Implicit conversion from source type "int32" to result type "int16": Possible range limitation.

Join_12: When checking operator: When binding output schema variable "leftRec": When binding output interface field "DIV_NUM" to field "DIV_NUM": Implicit conversion from source type "int32" to result type "int16": Possible range limitation.

Join_12: When checking operator: User inserted sort "Sort_15" does not fulfill the sort requirements of the downstream operator "APT_JoinSubOperatorNC in Join_12"

Join_12: When checking operator: User inserted sort "Sort_15" does not fulfill the sort requirements of the downstream operator "APT_JoinSubOperatorNC in Join_12"

After these many warnings i stopped the job.

My design is as follows
i am doing a join here . the first DB2 stage i have written the sql where i have joined 5 tables and i have also done a order by on the key columns which is needed to join with another table with the external join stage.

similarly in the second DB2 stage i have written the query which pulls out only the fields that is needed and have done a order by on the same 3 key columns.

i have used sort stages after the DB2 stages to hash partitioin the data based on the joining key columns and have selected dont sort (previously sorted) option in the sort stage.
After which i have selected the SAME partioin in the join stage for both the links . Then the data goes through the transformer where few fields are derived and then to a dataset.

Can somebody shed light on where i' going wrong?

Any help is highly appreciated.

Thanks in advance
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Fix "Join_12" first. You need to sort the data on exactly the same column(s) as is/are used in the join. Once that warning is gone, see what happens.
sshettar
Premium Member
Premium Member
Posts: 264
Joined: Thu Nov 30, 2006 10:37 am

Post by sshettar »

Hi Arwd,

I changed my design right now , i have removed the sort stage and removed the order by clause in the sql (in both db2 stages) as well. i instead am sorting and partitoing the data in the join stage for both the links . despite this the job is taking for ever to run.

Any light on this would help me resolve the issue.

Thanks in advance
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

If you monitor the progress can you see how the data is going through the sort? Also, replace any WRITE stages to databases with a peek stage temporarily to determine if the slow speed is due to those stages.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Did you check the performance of your query ?

Does it start returning any rows ?
sshettar
Premium Member
Premium Member
Posts: 264
Joined: Thu Nov 30, 2006 10:37 am

Post by sshettar »

My Query returns rows in less than a minute.
Where could i be going wrong?

Any help is highly appreciated

Thanks in advance
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Use APT_DUMP_SCORE and look at the score, I would wager that you are getting inserted extra tsort operators that are slowing down the job by doing extra sorts. Note that not only does your join data need to be sorted, but DataStage needs to know that it is sorted (through inserting a sort operator, but with "don't sort" on the key).
sshettar
Premium Member
Premium Member
Posts: 264
Joined: Thu Nov 30, 2006 10:37 am

Post by sshettar »

Note that not only does your join data need to be sorted, but DataStage needs to know that it is sorted (through inserting a sort operator, but with "don't sort" on the key).

How can i accomplish this??What env Apt file am i suppose to intrduce in my parameter list?

Let me put my design more clear
In my DB2 API stage1 i have custom sql with order by clause on 4 fields. This goes to the sort stage1 where the sort option is Dont sort previously sorted but hash partitioned the 4 fields.

In my DB2 API stage2 , i have custom sql with order by clause on the same 4 fields. This goes into the sort stage 2 where the sort option is dont sort previosuly sorted but hash partitioned the same 4 fields.

not output from sort stage1 and sort stage 2 goes into the join stage where join is inner join and the data for both the links is on SAME partition and sort option not enabled( as in stable sort option) .

The output of Join stage goes into the transformer for few transformation and then to a dataset .

Now when i run the job i see that data is flowing from DB2 API stage1 and also through SORT stage1 but only 1 record flowing from DB2 API stage2 and no record through SORT stage2 .
it has been 20 min now and the link from DB2 Stage2 still shows 1 record

Any light on why this could be happening ?


Any help is highly appreciated

Thanks in advance
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

API Stage1 & API Stage2
Since you are re-partitioning your sorted data it is no longer sorted, so you need to make that a "real" sort on your join criteria.
sshettar
Premium Member
Premium Member
Posts: 264
Joined: Thu Nov 30, 2006 10:37 am

Post by sshettar »

Do you think i rather do the sorting and partitioing in the join stage rather than using 2 sort stages?

do you think that would make my job run much faster?

Thanks i advance
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

You cannot sort in the join stage, unless you edit in the input links and do it there, but it is preferable to do so in an explicit sort stage since you can opt to not sort certain columns there.
Post Reply