Too much Compilation Time

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Aquilis
Participant
Posts: 204
Joined: Thu Apr 05, 2007 4:54 am
Location: Bangalore

Post by Aquilis »

Hi all,
I have a job design in which I use two stage variables in a Transformer stage. I am validating the data format of decimal and integer values; the two stage variables contain roughly 150 and 35 'AND' clauses respectively. The whole design takes far too long to compile, nearly 2-3 hours.

Sample stage variable code:

StageVariable1:

IF
(IsValid('DECIMAL[22,6]',TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_BOOK_VALUE_PS)))) AND
(IsValid('DECIMAL[22,6]',TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_DIV_FY8)))) AND
(IsValid('DECIMAL[22,6]',TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_DIV_FY7)))).........
Then
0
Else
-1

StageVariable2:

IF
(IsValid('INT32',TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_BOOK_VALUE_FLG)))) AND
(IsValid('INT32',TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_FY_LAST_DT)))) AND
(IsValid('INT32',TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_FQ_LAST_DT))))........
Then
0
Else
-1

When I remove the validation checks from the stage variables, compilation completes within 2-3 minutes.
If you have suggestions for reducing the compilation time, please share them.
Aquilis
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

At the site where I am currently working we have seen similar issues, although our compile times peak at 40 minutes. The root issue does seem to be the network transfer rate between the UNIX server and the PC on which the compilation is being done. Some of the job compile is done via Java on the client side, and I think (this is unsubstantiated) that we are hitting a network I/O bottleneck.
The PX job compile times are abysmally slow. Perhaps if enough of us complain they will find a way to move the Java processing onto the server and eliminate some of the bottleneck. In addition, it seems that all of the transform stages are compiled sequentially rather than concurrently, regardless of platform (some C++ compilers limit concurrent usage).
Aquilis
Participant
Posts: 204
Joined: Thu Apr 05, 2007 4:54 am
Location: Bangalore

Post by Aquilis »

ArndW,
What you have said makes sense.
However, all the other jobs with the same type of design but fewer validations compile within the expected time.
Only the two jobs with this functionality take too long.

Also, the same DataStage user ID is being used by two more developers at the same time. Could this hamper compilation? (I don't think it should cause any issues, since DS supports compiling multiple jobs at once.)

Would splitting the single stage variable into multiple stage variables make sense and improve the compilation time?
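
For illustration only, here is a minimal Python sketch of what such a split could look like. It generates the derivation text for several smaller stage variables plus one combining variable. The stage variable names (`svDecChk1`, ...), the chunk size and the helper functions are hypothetical; only the clause pattern and the link name come from the post above. Nothing in this thread confirms that a split actually reduces compilation time.

```python
from typing import List, Tuple

LINK = "LNK_Read_Input"  # input link name taken from the original post


def clause(type_name: str, column: str) -> str:
    """Build one validation clause in the form used in the original post."""
    return f"(IsValid('{type_name}',TrimLeadingTrailing(NullToZero({LINK}.{column}))))"


def split_checks(type_name: str, columns: List[str],
                 chunk_size: int, prefix: str) -> List[Tuple[str, str]]:
    """Return (stage variable name, derivation) pairs, one per chunk of columns.

    The names are hypothetical: prefix plus a running number.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    result = []
    for start in range(0, len(columns), chunk_size):
        chunk = columns[start:start + chunk_size]
        condition = " AND ".join(clause(type_name, c) for c in chunk)
        name = f"{prefix}{start // chunk_size + 1}"
        result.append((name, f"IF {condition} Then 0 Else -1"))
    return result


def combine(names: List[str]) -> str:
    """Derivation for a final stage variable that is 0 only if every part is 0."""
    condition = " AND ".join(f"{n} = 0" for n in names)
    return f"IF {condition} Then 0 Else -1"


if __name__ == "__main__":
    # Only the three column names visible in the post; the rest are elided there.
    decimal_columns = ["FF_BOOK_VALUE_PS", "FF_DIV_FY8", "FF_DIV_FY7"]
    parts = split_checks("DECIMAL[22,6]", decimal_columns, 2, "svDecChk")
    for name, derivation in parts:
        print(f"{name}:\n{derivation}\n")
    print("Combined:\n" + combine([name for name, _ in parts]))
```

Each generated derivation would be pasted into its own stage variable, with the combining variable placed after the parts so that they are evaluated first.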
Aquilis
ajay.vaidyanathan
Participant
Posts: 53
Joined: Fri Apr 18, 2008 8:13 am
Location: United States

Post by ajay.vaidyanathan »

Yes, splitting the stage variables will take time to compile.
Regards
Ajay
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The functions you use are expanded inline and will result in a large C++ program. I'm not sure whether the location of the logic in the transform stage (stage variable, constraint or derivation) will make a difference, but it might.
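
To give a rough sense of scale, here is a small Python sketch that counts the function calls in one clause from the original post and multiplies by the approximate clause counts quoted there (about 150 and 35). The counting helper is hypothetical, and the result only estimates how many calls would be expanded if every call is inlined as described; it says nothing about the size of the generated C++ itself.

```python
import re


def count_calls(expression: str) -> int:
    """Count call sites: an identifier immediately followed by '('."""
    return len(re.findall(r"[A-Za-z_]\w*\(", expression))


# One clause copied from the original post.
ONE_CLAUSE = ("(IsValid('DECIMAL[22,6]',"
              "TrimLeadingTrailing(NullToZero(LNK_Read_Input.FF_BOOK_VALUE_PS))))")

# Approximate clause counts quoted in the original post.
DECIMAL_CLAUSES = 150
INTEGER_CLAUSES = 35

calls_per_clause = count_calls(ONE_CLAUSE)
total_calls = calls_per_clause * (DECIMAL_CLAUSES + INTEGER_CLAUSES)

if __name__ == "__main__":
    print(f"calls per clause: {calls_per_clause}")
    print(f"approximate calls across both stage variables: {total_calls}")
```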

The number of identical user IDs working in parallel should not make a difference at all.

What UNIX are you on?
udankar
Premium Member
Posts: 14
Joined: Tue Oct 18, 2005 6:27 am
Location: India

Compilation of Simple job takes time

Post by udankar »

We are using DS 8.1. Our databases (source as well as target) and the DS server components are in one location, while the development team and the client components are in another.

In this kind of setup even a simple job, such as a one-to-one mapping from DB2 to Oracle, takes 10 minutes to compile. I was under the impression that this was due to network traffic, but after reading this post I may have to consider other problems or bottlenecks. Please let me know if you have a solution for this problem.