Denormalizing Data

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
phillip.small
Participant
Posts: 29
Joined: Mon Aug 16, 2004 8:12 am

Denormalizing Data

Post by phillip.small »

Hello

I have two input tables with 3 keys. I am taking some fields from both tables and putting them into another table. Based on the two input tables, I have a phone type field with values either "BUSINESS" or "HOME" and this indicates which type if the phone number. The tables also have a status column with either "A" or "I" (Active or Inactive) as the statuses. Therefore, the error correctly states that there are two rows per PK. In the target, I changed the phone column and recreated the table with BUSN_PHONE and HOME_PHONE and did not create a phone_type column. If I run the load and populate the BUSN_PHONE, how do I loop to load the HOME_PHONE column on the same record?

input table1 ---- input table2 ---->Transformer ----> target table
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Your solution is not a loop, it is a UNION in your SQL. You need to collapse two rows into one. Imagine SQL like this:


Code: Select all

select key, max(bus_phone), max(home_phone) from
   (
   select key, NULL "BUS_PHONE", home_phone from table A
   union all
   select key, bus_phone, NULL "HOME_PHONE" from table B
   )
group by key
This solution provides a FULL OUTER JOIN, meaning all rows from table A and all rows from table B will make it into your target. Using one table as a reference DOESN'T WORK unless you do a pass thru both tables as a source, using the other as a reference each time. This SQL is very elegant.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Just as an FYI for those in our studio audience using Oracle as their database of choice, there is actual "Full Outer Join" syntax available in 9i:

Code: Select all

SELECT a.x, b.x
FROM a FULL OUTER JOIN b
ON (a.y = b.y)
-craig

"You can never have too many knives" -- Logan Nine Fingers
phillip.small
Participant
Posts: 29
Joined: Mon Aug 16, 2004 8:12 am

Post by phillip.small »

chulett wrote:Just as an FYI for those in our studio audience using Oracle as their database of choice, there is actual "Full Outer Join" syntax available in 9i:

Code: Select all

SELECT a.x, b.x
FROM a FULL OUTER JOIN b
ON (a.y = b.y)
DB2 is my DB env. in DataStage. How do you implement this in DataStage?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Doing it in DataStage is ugly, that's why I gave you an "elegant" SQL solution. Write source table "A" into a hash file, leaving the B phone number column blank. Read source table "B" and reference the hash file, if the row is there, add the B phone number to the B phone number column and write out the complete row. If the row is not in the hash file, create it but leave the "A" phone number blank. Then, dump out all of the rows from the hash file.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
phillip.small
Participant
Posts: 29
Joined: Mon Aug 16, 2004 8:12 am

Post by phillip.small »

kcbland wrote:Doing it in DataStage is ugly, that's why I gave you an "elegant" SQL solution. Write source table "A" into a hash file, leaving the B phone number column blank. Read source table "B" and reference the hash file, if the row is there, add the B phone number to the B phone number column and write out the complete row. If the row is not in the hash file, create it but leave the "A" phone number blank. Then, dump out all of the rows from the hash file.
When you say Hash file? I thought you meant doing in DataStage? Outside of Ascential, what do you mean by Hash file? Are you referring to DB2 command line or another tool that also references Hash files? Sorry but I am new to this and I am only familar with that term for Ascential.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

I'm talking about a DataStage Server job utilizing DataStage Server hash files. You are posting questions in the Server forum, so I'm giving you a Server solution.

If you want a Parallel solution, I'd tell you to a different solution.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DB2 is my DB env. in DataStage. How do you implement this in DataStage?
You use user-defined SQL in the stage type that is extracting the data from DB2.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply