column sampling



ashishm
Premium Member
Posts: 37
Joined: Thu Jun 16, 2011 8:12 am
Location: india


Post by ashishm »

Hi all,


I have to pick data from a column at random. The column datatype is varchar, and I am using DataStage 8.1. How can I do this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

We'll need a better explanation than that. Can you provide an example of your 'columns' and what the output of your 'random sampling' might look like?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ashishm
Premium Member
Posts: 37
Joined: Thu Jun 16, 2011 8:12 am
Location: india

Post by ashishm »

Hi chulett

I have two input sequential files, each with two columns of datatype varchar. The output file needs four columns: the first two are mapped directly from the first input file, and the other two have to be loaded with data picked at random from the two columns of the second file. There is no business rule for picking the data from the second input file. How can I do this?
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

One potential method:

On your second file (the one you will randomly select from), assign a sequential row number (NOT random) to each row. You could use row number generation in the Sequential File stage, for example; this would probably be the best place to do it. The results should be 1, 2, 3, 4, ..., number_of_rows.

Knowing exactly how many rows are present in your second file, assign a random sequence number to each row of your first file, with a maximum value of the number of rows in the second file. You can use one of the random number functions in a transformer or use a column generator stage.

Then, using a lookup/join/merge, join the two files on the sequence number columns.
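
The three steps above can be sketched outside DataStage as follows. This is only a minimal Python illustration of the logic, not DataStage code; the function name attach_random_columns and the sample rows are hypothetical. It assumes the second file fits in memory and that picking the same row more than once is acceptable.

```python
import random


def attach_random_columns(first_rows, second_rows, seed=None):
    """Pair every row of the first input with a randomly chosen row of the second."""
    rng = random.Random(seed)

    # Step 1: give each row of the second input a sequential key 1..N.
    keyed_second = {key: row for key, row in enumerate(second_rows, start=1)}
    row_count = len(keyed_second)
    if row_count == 0:
        raise ValueError("second input has no rows to sample from")

    output = []
    for col_a, col_b in first_rows:
        # Step 2: give each row of the first input a random key between 1 and N.
        random_key = rng.randint(1, row_count)
        # Step 3: look up the second-input row that carries the same key.
        col_c, col_d = keyed_second[random_key]
        output.append((col_a, col_b, col_c, col_d))
    return output


# Hypothetical sample data standing in for the two sequential files.
first = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
second = [("x1", "y1"), ("x2", "y2")]

result = attach_random_columns(first, second, seed=42)
for row in result:
    print(row)
```

Because each random key is drawn independently, a row of the second file can be picked several times or not at all, which fits a requirement with no business rule for the selection.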

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
Satwika
Participant
Posts: 45
Joined: Mon Jan 02, 2012 11:29 pm

Post by Satwika »

Hi ashishm,

Is it resolved? If so, let us know how you did it. Thank you.
Last edited by Satwika on Mon Feb 06, 2012 3:14 am, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's not marked as resolved, nor is there any indication that U (one of our posters) had any involvement at all.

The second person personal pronoun in English is spelled "you".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Satwika
Participant
Posts: 45
Joined: Mon Jan 02, 2012 11:29 pm

Post by Satwika »

Thank you ray