Address Shuffle

Posted: Thu Dec 06, 2012 1:14 pm
by kennyapril
I have a source file with 1M records which has addresses in it.

Please suggest a way to shuffle the addresses within the file. Each person's new address should come from the same state as the original.


I sorted on the state_cd field and generated a key column to identify the state change.

Can anyone help me with the next step, or suggest another approach?

Posted: Thu Dec 06, 2012 2:29 pm
by ray.wurlod
What do you mean by "shuffle"? Sort? Use a Sort stage.

Posted: Thu Dec 06, 2012 2:31 pm
by chulett
Shuffle = Randomize, in a sense. As in, re-arrange who has which address within a given state.

Posted: Thu Dec 06, 2012 2:33 pm
by ray.wurlod
I think we need to wait for the OP's answer on this one.

Re: Address Shuffle

Posted: Tue Dec 11, 2012 12:19 pm
by elsont
I will try to explain using an example.
Suppose your records look like this:
"Name Address State"
Now you want to shuffle Name and Address within each State.

Ans: Split the record into two streams

1: Name + State
2: Address + State

Now add a new column "Order" to both streams and use a random function to populate it. (I haven't used the random function in DataStage. It should not give the same sequence each time; otherwise we have to find another way so that it gives a different sequence each time.) Then partition on "State" only and sort on "State, Order". This should give you a different order in each of the two streams. Next, add another column "Key" to both streams and assign the values 0, 1, 2, etc. within each State (simply assigning @INROWNUM should also work).
Now you can join both streams on the "State" and "Key" columns, and the output will be shuffled.
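
The steps above can be sketched outside DataStage to check the logic. This is a minimal Python illustration, not a DataStage job: the function name shuffle_addresses_within_state, the tuple layout (name, address, state) and the seed parameter are all assumptions made for the example.

```python
import random
from collections import defaultdict

def shuffle_addresses_within_state(records, seed=None):
    """records: list of (name, address, state) tuples (hypothetical layout)."""
    rng = random.Random(seed)

    # Split into two streams and add a random "Order" column to each.
    names = [(state, rng.random(), name) for name, _, state in records]
    addresses = [(state, rng.random(), addr) for _, addr, state in records]

    # Sort each stream on (State, Order).
    names.sort(key=lambda r: (r[0], r[1]))
    addresses.sort(key=lambda r: (r[0], r[1]))

    def add_key(stream):
        # Assign Key = 0, 1, 2, ... restarting for each State.
        counters = defaultdict(int)
        keyed = {}
        for state, _, value in stream:
            keyed[(state, counters[state])] = value
            counters[state] += 1
        return keyed

    name_by_key = add_key(names)
    address_by_key = add_key(addresses)

    # Join the two streams on (State, Key).
    return [(name_by_key[k], address_by_key[k], k[0]) for k in name_by_key]
```

Because the two "Order" columns are drawn independently, a person can occasionally end up with their own original address, especially in states with very few rows. If that is not acceptable, an extra check or a different pairing rule would be needed.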

Posted: Wed Dec 12, 2012 5:19 pm
by kennyapril
Thanks very much!

I will try the same scenario and let you know.

Posted: Wed Dec 12, 2012 5:24 pm
by kennyapril
Just to be clear, my requirement is:

Before:
1) John, 123 rew dr, chicago, IL
2) Anthony, 456 qwe dr, springfield, IL
3) Ronny, 789 hjg dr, queens, NY
4) Joseph, 345 kli dr, nyc, NY

After:
1) John, 456 qwe dr, springfield, IL
2) Anthony, 123 rew dr, chicago, IL
3) Ronny, 345 kli dr, nyc, NY
4) Joseph, 789 hjg dr, queens, NY

Thank you!
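
A requirement like this can be stated as a check on the output: every person keeps their state, and each state ends up with exactly the same set of addresses it started with. The Python sketch below is only an illustration of that check; the function name is_valid_state_shuffle and the (name, address, state) tuple layout are assumptions, with the city folded into the address string.

```python
from collections import Counter

def is_valid_state_shuffle(before, after):
    """True if 'after' only re-pairs names and addresses within each state."""
    # Each (name, state) pair must survive unchanged.
    same_people = (Counter((n, s) for n, _, s in before)
                   == Counter((n, s) for n, _, s in after))
    # Each (address, state) pair must survive unchanged.
    same_addresses = (Counter((a, s) for _, a, s in before)
                      == Counter((a, s) for _, a, s in after))
    return same_people and same_addresses

before = [
    ("John", "123 rew dr, chicago", "IL"),
    ("Anthony", "456 qwe dr, springfield", "IL"),
    ("Ronny", "789 hjg dr, queens", "NY"),
    ("Joseph", "345 kli dr, nyc", "NY"),
]
after = [
    ("John", "456 qwe dr, springfield", "IL"),
    ("Anthony", "123 rew dr, chicago", "IL"),
    ("Ronny", "345 kli dr, nyc", "NY"),
    ("Joseph", "789 hjg dr, queens", "NY"),
]
print(is_valid_state_shuffle(before, after))
```

The check does not require that every address actually moved; it only confirms that nothing crossed a state boundary and nothing was lost or duplicated.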

Posted: Thu Dec 13, 2012 3:03 pm
by kennyapril
In this scenario I was trying to partition on state only. Can you please suggest how to partition on just one field? After that, I used a Sort stage to sort on state code and order.

Thank you!

Posted: Fri Dec 14, 2012 11:25 am
by suryadev
In the Transformer, run the flow in parallel, select Hash partitioning on that field, and don't select the sort option there. Then, in the next step, do the sort on the other fields.

That should do it.
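
To see why hash partitioning on the single state field is enough, here is a small Python sketch of the idea. It does not reproduce DataStage's own hash partitioner; the function hash_partition, the use of zlib.crc32 as the hash, and the row layout are assumptions for illustration only.

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key_index, num_partitions):
    """Assign each row to a partition using a hash of one field only."""
    partitions = defaultdict(list)
    for row in rows:
        key_bytes = str(row[key_index]).encode("utf-8")
        # Same key value always yields the same partition number.
        p = zlib.crc32(key_bytes) % num_partitions
        partitions[p].append(row)
    return dict(partitions)

# Hypothetical rows: (name, order, state); partition on state (index 2).
rows = [
    ("John", 0.71, "IL"),
    ("Anthony", 0.12, "IL"),
    ("Ronny", 0.55, "NY"),
    ("Joseph", 0.08, "NY"),
]
parts = hash_partition(rows, key_index=2, num_partitions=4)

# Sorting happens afterwards, inside each partition, on (state, order).
for p in parts:
    parts[p].sort(key=lambda r: (r[2], r[1]))
```

Since all rows for a given state land in the same partition, the later sort on state and order sees every row of that state together, even though the sort uses more fields than the partitioning does.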