copy rows based on input column value!

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
samplify
Participant
Posts: 5
Joined: Mon Aug 22, 2005 12:42 am

copy rows based on input column value!

Post by samplify »

Hi,

I have a curious problem which didn't turn up in DSXchange search.
i have a key column and a count column in the source, need to produce "as many" output rows for "that key column" in the source.

Sample:

(input)

Key/Count
1,2
2,1
3,5
4,2

(output)

Key/Count
1,2
1,2
2,1
3,5
3,5
3,5
3,5
3,5
4,2
4,2

Thanks in advance! (and hope its not a UNIX solution!)

Sam
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

This could not be done directly in a PX job but could be done in a server / BASIC routine or px C++ routine

Here is the logic

Open your input file

Loop start here
Read first line to a Linevariable

Assign second column value to a countvariable

Open Output file
For countvariable =1, increment by 1
Write Linevaraible to output file
Next

Next line in Loop

Close both files.
Kandy
_________________
Try and Try again…You will succeed atlast!!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

One question.

Why?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
samplify
Participant
Posts: 5
Joined: Mon Aug 22, 2005 12:42 am

@Ray

Post by samplify »

Well, the requirement wants the evaluation of an expression (Field1+(Field2*n))
where n is (0 to (Field3)-1)
so i was wondering if i split row into as many rows as Field3 value minus 1 then i can straight away use the formula at a "scalar" level.
(oh yes and i need that many multiple rows being written to the output)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Generating Rows Using a Multiplier

Post by ray.wurlod »

samplify wrote:(oh yes and i need that many multiple rows being written to the output)
I still don't understand why (that is, how any such output could be useful/meaningful). Maybe it's just me.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jusami25
Premium Member
Premium Member
Posts: 84
Joined: Tue Oct 26, 2004 12:49 am

Post by jusami25 »

Hi,

he had the same requeriment and we solved it with two jobs:

First job

Input Record --> KEY, COUNT

Seq File --> Transform -- Seq File

On the Transfrom job we use the Str() function in this way:

If InputLink.COUNT > 0 Then
InputLink.KEY : ',' : InputLink.COUNT: ',' : Str(Char(10) : InputLink.KEY : ',' : InputLink.COUNT, InputLink.COUNT-1)
Else ''

We map this derivation to a single field with enought lenght to store all the records generated for each input row.

Output Record --> ALL_RECORDS


Second job

Seq File --> Transform -- Seq File

This job will use the output file from the First job as input with the next format:

field delimiter = ,
record delimiter = Char(10)

Input Record --> KEY, COUNT

At this point we have the number of records we need and on the transform we can perform the required operations...

Key points here are the Char(10) Return code, the Str function to repeat the fields dinamically and the format on which we read the file on the second job.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Or, maybe a three to four liner shell script.....
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Re: copy rows based on input column value!

Post by ray.wurlod »

samplify wrote:Thanks in advance! (and hope its not a UNIX solution!)
I'd be inclined to write a before-job subroutine (or just a routine to be invoked from a Routine activity in a sequence).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply