Logic required

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
savis
Premium Member
Premium Member
Posts: 27
Joined: Tue Apr 15, 2008 11:06 pm

Logic required

Post by savis »

I have a requirement as stated below:

From_Id To_Id
100 200
200 300
300 400
111 222
444 555

I have field named 'Id'. An Id may be transferred to another Id.
Say 100 gets transferred to 200, in this case, I have to fetch all records with Id as 100 and change it to 200.

This is a straight forward one but the issue is when we have cyclic transfers, I am not sure how to do it.
say 100 gets transferred to 200 and 200 gets transferred to 300 and 300 gets transferred to 400 In this case I should transfer all records with Id as 100,200,300 to 400. There can be N number of cyclic transfers. I am not sure how to do it.

My output should be as follows:
From_Id To_Id
100 400
200 400
300 400
111 222
444 555

please let me know your thoughts.
Thank you
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

What is your source? Is it a database or a flat file? What databases do you have access to?
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
savis
Premium Member
Premium Member
Posts: 27
Joined: Tue Apr 15, 2008 11:06 pm

Post by savis »

Source is a flat file.
I have access to Oracle database
uegodawa
Participant
Posts: 71
Joined: Thu Apr 27, 2006 12:46 pm

Post by uegodawa »

Define your transforms [ 1..N] in flat files.
Example :
First file say Transfer_1.txt has following structure
FROM_ID TO_ID
100 , 200
200, 300
300 , 400

Second file say Transfer_2.txt has following structure
FROM_ID TO_ID
100 , 400
200, 400
300 , 400

So you can pre define those files and use them as lookups.

Source -- LookUP1------------------Lookup2
File...............|................................|
------------Transfer_1.txt.......Transfer_2.txt
Thanks,
Upul
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

The best way that I can think of is using the oracle table. Load your flat file into a temp table and use a tree like join. Google up 'connect by' clause. Thats your best bet.
I did something similar a while back in DB2. That was a real pain as DB2 does not have connect by. You have to do several unions and self joins. In oracle it should be pretter straight forward.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
savis
Premium Member
Premium Member
Posts: 27
Joined: Tue Apr 15, 2008 11:06 pm

Post by savis »

Second file say Transfer_2.txt has following structure
FROM_ID TO_ID
100 , 400
200, 400
300 , 400

How did you get the second file?
savis
Premium Member
Premium Member
Posts: 27
Joined: Tue Apr 15, 2008 11:06 pm

Post by savis »

I used the following query.

SELECT from_ID,to_ID,level FROM ID START WITH to_ID = 400 CONNECT BY PRIOR from_ID = to_ID;


FROM_ID TO_ID LEVEL
---------- ---------- ----------
300 400 1
200 300 2
100 200 3

But I am not sure how to get the below output?

FROM_ID TO_ID
---------- ----------
300 400
200 400
100 400

please help.
savis
Premium Member
Premium Member
Posts: 27
Joined: Tue Apr 15, 2008 11:06 pm

Post by savis »

Thanks for your help!

I used the below query and it worked.

select a.from_Id,a.to_Id from
(SELECT from_Id as From_Id, CONNECT_BY_ROOT to_Id as TO_Id,
LEVEL-1 as Pathlen
FROM Id CONNECT BY PRIOR from_Id = to_Id)a,
(select from_Id, max(pathlen) MAX_PATHLEN from (SELECT from_Id as From_Id, CONNECT_BY_ROOT to_Id as TO_Id,
LEVEL-1 as Pathlen
FROM Id CONNECT BY PRIOR from_Id = to_Id ) group by from_Id)b
where a.from_Id=b.from_Id and a.pathlen = b.MAX_PATHLEN;

How to do it in datastage?
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Stick with the oracle query solution. But if I had no choice but to do it in dagtastage then I would write a server routine. Use COMMON areas and dynamic arrays to store values, do comparisons and write it out. The same thing can be implemented in a C function and use it in a parallel job.
My suggestion, stick to the oracle solution rather then re-inventing the wheel. It will come handy when you need to enhance it or even for support.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Arbitrary levels of recursion aren't easy to do in any ETL tool and don't really belong there (in the same way that RANK doesn't). These really are display functions, not transformation functions.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply