Combine Records

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
Manfred Hagedorn
Participant
Posts: 58
Joined: Wed Apr 04, 2007 10:02 am

Combine Records

Post by Manfred Hagedorn »

Hello,
i have tried and searched for hours, and i don't find a solution!
I have the following type of data input:

Record-Id, Field-Name, Field-Value
1,Name,Hagedorn
1,Age,50
2,Name,Schmitz
2,City,Hamburg
2,Income,50000
3,Name,Meier
3,City,Berlin

You see, not all columns are always available for each id.

And i want to get the following output:
Record-Id, Name, City, Age, Income
1,Hagedorn,,50,
2,Schmitz,Hamburg,,50000
3,Meier,Berlin,,

All Input Records with the same ID should be in one Output record, and the values should be assigned to the Outputfields accordingly to the Fieldnames.

The real file has more than 300 possible Fieldnames!

The Combine Records Stage seems not to solve my problem.

Is it in general possible with Datastage?
Pagadrai
Participant
Posts: 111
Joined: Fri Dec 31, 2004 1:16 am
Location: Chennai

Re: Combine Records

Post by Pagadrai »

Hi,
In this case, your target will have 300+ column names.
I can see a way using lookup.

Lets say your source file has all distinct ids.
Your lookup will be the different lookup file sets created for each field value like name etc.

But the might jobs look ugly with so many links & stages. :(
you can split the task into multiple jobs to make it simpler.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If the inport is sorted by Record_ID, then you can assemble the output record putting the correct value in the correct field via If..Then..Else statements in a Transformer stage (use stage variables initialized to null, and name the stage variable in the Else part). Then run the output through a Remove Duplicates stage keeping only the final row for each Record_ID value.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Manfred Hagedorn
Participant
Posts: 58
Joined: Wed Apr 04, 2007 10:02 am

Re: Combine Records

Post by Manfred Hagedorn »

Hi Pagadrai,

thanks a lot for your feedback!
The main reason for posting this issue was, that i expected, that there is an easy solution within Datastage, but i simply don't find or don't know it.

Has anybody else any idea, how to solve this problem?
Best regards
Manfred
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You were probably typing that while I was typing my solution.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Manfred Hagedorn
Participant
Posts: 58
Joined: Wed Apr 04, 2007 10:02 am

Re: Combine Records

Post by Manfred Hagedorn »

I understand, that there is no smart solution within Datastage.
I think, i will use a SAS-Stage within Datastage and do a kind of pre-processing with SAS-coding and write an output file which will be used for further processing with real Datastage steps.

Thanks for your support/feedback.
Manfred
sbass1
Premium Member
Premium Member
Posts: 211
Joined: Wed Jan 28, 2009 9:00 pm
Location: Sydney, Australia

Post by sbass1 »

Yeah I was going to suggest SAS since it supports first.key and last.key processing plus conditional output.

If you don't have SAS licensed (although it sounds like you do) you could also do this in perl, loading a hash (associative array with character keys) and outputing the hash (accumulated record) when the key changes.
Post Reply