cyclic or linear dependency error

Posted: Fri Jan 16, 2004 1:01 pm
by sumitgulati
Hi All,

I have a Server Job that uses a shared container 'X'. From a transformer 'T' I pass one column to the container and the remaining columns to a transformer 'T1'. The output of the container 'X' populates a hash file 'H'. This hash file 'H' is then used as a lookup in transformer 'T1'.
This setting actually forms a loop in the Server Job.
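Sketched out, the design being described looks roughly like this (same stage names; the source and target stages are assumed for illustration):

Code:

source --> TRANS (T) --> SHRD CONT (X) --> HASH (H)
               |                              |
               |                              v
               +--------------------------> TRANS (T1) --> target

The reference link from 'H' back into 'T1' is what closes the loop: 'T' feeds both branches, and one branch feeds back into the other.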

When I compile the Job it gives the following error:
"Job contains cyclic or linear dependencies and will not run".

Is there any way to get this resolved?

Regards,
Sumit

Re: cyclic or linear dependency error

Posted: Fri Jan 16, 2004 1:07 pm
by raju_chvr
I would approach this error in multiple steps to find out at what point it is being introduced into the job.

1) In JOB1, just have T -> X -> Hash/seq file.
If this works, then go to step 2.
2) Then add your T1 to the job and see.

If step 2 fails, then you know what the problem is. Just an idea to zero in on the error; a rough sketch of the two steps is below.
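Roughly, with the stage names from the original post:

Code:

Step 1:  source --> T --> X (SHRD CONT) --> Hash/seq file

Step 2:  the same, plus T1 fed from T, with the hash file as
         the reference lookup into T1

If step 1 compiles and step 2 does not, the lookup link is the culprit.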

Good Luck .. I will be watching this topic

Posted: Fri Jan 16, 2004 1:17 pm
by chulett
Actually, it sounds like we know exactly what the problem is:
sumitgulati wrote: This setting actually forms a loop in the Server Job.
Can't do that. That is what is meant by the error "cyclic or linear dependencies" here - you've created a looping construct and that construct will not run.

You'll have to rethink your design to remove the loop.

Posted: Fri Jan 16, 2004 1:18 pm
by sumitgulati
Hi Raju,

If I remove the link from hash file 'H' to transformer 'T1', the job works fine. But with that link in place, it gives the same error at compile time.

Regards,
Sumit

Posted: Fri Jan 16, 2004 2:08 pm
by kcbland
I bet if you change the container to local, then deconstruct it, the job will compile. A job can reference and write to the same hash file; that's okay. But I would guess that you can't do this using a shared container as you've described, because of some internal rules about shared containers.

Posted: Fri Jan 16, 2004 2:40 pm
by kcbland
Whoa whoa whoa. I misunderstood the design. What is going on has nothing to do with a shared container.

The issue is that the primary input stream is split. One branch writes to a hash file, and the other branch references that hash file.

YOU CAN'T DO THIS.

You should stream all your data to the hash file first, in your case going thru the shared container. Then ANOTHER link from your sequential source has to go thru ANOTHER transformer, and it's there that you can reference the hash file.

Posted: Fri Jan 16, 2004 3:00 pm
by chulett
Guess I should have used a bigger font. :wink:

Posted: Fri Jan 16, 2004 3:00 pm
by sumitgulati
Hi,

I tried to avoid the loop by using two hash file stages instead of one. The new design goes like this.
From a transformer 'T' I pass one column to the shared container 'X' and the remaining columns to a transformer 'T1'. The output of the container 'X' populates a hash file 'H'.
Then I use a separate hash file stage to read the hash file 'H' and use it as a lookup in transformer 'T1'.

Now, is there any way to ensure that the hash file load from the output of shared container 'X' is finished before the lookup happens?

To avoid a loop, there is no link between the stage that populates the hash file and the stage that reads it.

Regards,
Sumit

Posted: Fri Jan 16, 2004 3:04 pm
by kcbland
sumitgulati wrote: Then I use a separate hash file stage to read the hash file 'H' and use it as a lookup in transformer 'T1'.

You can't do this. Look at the job design shown on this page from raju_chvr. That is your solution. I've corrected it for clarity:

Code:

SEQ_source --> SHRD CONT --> HASH 
         |                    |
         |                    |
         |                    v
         ------------------>TRANS ---> SEQ_target

Posted: Fri Jan 16, 2004 3:09 pm
by sumitgulati
Hi Kenneth Bland,

The solution you gave works fine, but I don't want to run two parallel paths right from the source. This is because my shared container is used multiple times in the same Server Job. In that case I would have to draw multiple parallel paths from the source.

Is there any other way? My Server Job is already very complex, and drawing so many parallel paths would make it look even more complex. I am afraid this might even degrade performance drastically.

Thanks and Regards,
Sumit

Posted: Fri Jan 16, 2004 3:15 pm
by kcbland
The job design you see on this page could easily be broken into two jobs that run serially. This is how the process would run anyway. So your complicated job is most likely a candidate for simplification thru decomposition. It's my experience that beginner DataStage developers go thru this learning curve of always wanting to build huge job designs, and the rest of us here on the forum get to spend a lot of time telling them to break down their jobs into smaller, simpler, tunable job designs.

I can tell you for a 100% fact that the design on this page has no ability to instantiate. Broken into two pieces, each piece could be instantiated. Therefore, any concern people have can be washed away by using simpler, smaller job designs and only then can they use instantiation to achieve MULTIPLES of throughput improvement, most likely FAR SURPASSING any other design for performance.
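For illustration, a serial two-job split of the design on this page might look like this (same stage names; the source is read once per job):

Code:

Job 1:  odbc_src --> Transfr(T) --> SHRD CONT (X) --> HASH (H1)

Job 2:                              HASH (H1)
                                        |
                                        v
        odbc_src --> Transfr(T) --> Transfr(T1) --> odbc_tgt

Job 1 finishes loading the hash file before Job 2 starts, so every lookup sees a fully loaded file, and each job can be instantiated on its own.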

Posted: Fri Jan 16, 2004 4:02 pm
by sumitgulati
Hi Kenneth Bland,

I totally agree with what you said. I too believe in creating simple jobs. But the project I am working on involves converting Informatica maps to DataStage Jobs. The client is highly reluctant to change the job design. They want the functionality implemented exactly the same way as in the Informatica maps.

Considering this, I came up with the design below.

[diagram garbled by collapsed whitespace; the corrected version is in the next post]

Kindly suggest if there is any way in which I can make HASH_stage2 dependent upon HASH_stage1 and also avoid the loop.

Thanks in advance,
Sumit

Posted: Fri Jan 16, 2004 4:20 pm
by sumitgulati
Sorry, the design in my last post was not clear. Following is the correct design.

Code:

                  HASH_stage1 (H1)
                         ^
                         |
                         |
                   SHRD CONT9 (X)     HASH_stage2 (H1)
                         ^                    |
                         |                    |
                         |                    v
odbc_src ----------> Transfr(T) --------> Transfr(T1) ----> odbc_tgt

Posted: Fri Jan 16, 2004 5:00 pm
by kcbland
Your problem is that there is no way to guarantee that the row destined for HASH_stage1 will be there when it needs to be referenced via HASH_stage2. I would bet that it isn't. So your design fundamentally doesn't work.

Posted: Fri Jan 16, 2004 5:23 pm
by sumitgulati
Kenneth Bland, I guess you are right, and I should now think about changing the design.

The best option would be breaking the Job into simpler ones. Hope I am able to convince people here.

Thanks for all your inputs,
Sumit