Address Shuffle

kennyapril · Post by **kennyapril** » Thu Dec 06, 2012 1:14 pm

I have a source file with 1M records that contain addresses.

Please give me an idea of how to shuffle the addresses within the file, such that each address is only exchanged with another address from the same state.

I sorted on the state_cd field and generated a key column to identify the state change.

Can anyone help me with the next step, or suggest another approach?

ray.wurlod · Post by **ray.wurlod** » Thu Dec 06, 2012 2:29 pm

What do you mean by "shuffle"? Sort? Use a Sort stage.

chulett · Post by **chulett** » Thu Dec 06, 2012 2:31 pm

Shuffle = Randomize, in a sense. As in re-arrange who has which address within a given state.

ray.wurlod · Post by **ray.wurlod** » Thu Dec 06, 2012 2:33 pm

I think we need to wait for the OP's answer on this one.

elsont · Post by **elsont** » Tue Dec 11, 2012 12:19 pm

I will try to explain using an example.
Suppose your records look like this:
"Name Address State"
Now you want to shuffle the Address against the Name within each State.

Ans: Split the record into two streams:

1: Name + State
2: Address + State

Now add a new column "Order" to both streams and use a random function to populate it. (I haven't used the random function in DataStage. It should not give the same sequence each time; otherwise we have to find another way to get a different sequence.) Then partition on "State" only and sort on "State, Order". This should give you a different order in each stream. Now add another column "Key" to both streams and assign the values 0, 1, 2, etc. within each State (or simply assigning @INROWNUM should also work).
Now you can join both streams on the "State" and "Key" columns, and the output will be shuffled.
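The steps above can be sketched outside DataStage to make the data flow concrete. This is only an illustration in Python, not a DataStage job: the function name `shuffle_addresses_within_state`, the tuple layout `(name, address, state)`, and the `seed` parameter are all assumptions made for the example.

```python
import random
from collections import defaultdict

def shuffle_addresses_within_state(records, seed=None):
    """records: list of (name, address, state) tuples (hypothetical layout)."""
    rng = random.Random(seed)

    # Split into two streams, each carrying State plus a random "Order" value.
    names = [(state, rng.random(), name) for name, _, state in records]
    addresses = [(state, rng.random(), address) for _, address, state in records]

    # Sort each stream on (State, Order); the two orders are independent.
    names.sort(key=lambda row: (row[0], row[1]))
    addresses.sort(key=lambda row: (row[0], row[1]))

    def add_key(rows):
        # Assign "Key" = 0, 1, 2, ... restarting for each State.
        counters = defaultdict(int)
        keyed = {}
        for state, _, value in rows:
            keyed[(state, counters[state])] = value
            counters[state] += 1
        return keyed

    name_by_key = add_key(names)
    address_by_key = add_key(addresses)

    # Join the two streams on (State, Key).
    return [(name_by_key[k], address_by_key[k], k[0]) for k in name_by_key]

records = [
    ("John", "123 rew dr, chicago", "IL"),
    ("Anthony", "456 qwe dr, springfield", "IL"),
    ("Ronny", "789 hjg dr, queens", "NY"),
    ("Joseph", "345 kli dr, nyc", "NY"),
]
for row in shuffle_addresses_within_state(records):
    print(row)
```

Because the two random orders are independent, a person can occasionally end up with their own original address; if that must never happen, an extra rule would be needed on top of this approach.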

kennyapril · Post by **kennyapril** » Wed Dec 12, 2012 5:19 pm

Thanks very much!

I will try the same scenario and let you know.

kennyapril · Post by **kennyapril** » Wed Dec 12, 2012 5:24 pm

Just to be clear, my requirement is:

Before:
1) John, 123 rew dr, chicago, IL
2) Anthony, 456 qwe dr, springfield, IL
3) Ronny, 789 hjg dr, queens, NY
4) Joseph, 345 kli dr, nyc, NY

After:
1) John, 456 qwe dr, springfield, IL
2) Anthony, 123 rew dr, chicago, IL
3) Ronny, 345 kli dr, nyc, NY
4) Joseph, 789 hjg dr, queens, NY

Thank you!
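The requirement in this example can be stated as two checks: every person keeps their state, and each state keeps exactly the same set of addresses. A small Python sketch of such a check follows; the function name `is_valid_state_shuffle` and the `(name, address, state)` tuple layout are assumptions for illustration only.

```python
from collections import Counter

def is_valid_state_shuffle(before, after):
    """Both arguments are lists of (name, address, state) tuples (hypothetical layout)."""
    # Check 1: every name is still attached to the same state.
    if Counter((n, s) for n, _, s in before) != Counter((n, s) for n, _, s in after):
        return False
    # Check 2: each state still holds exactly the same addresses.
    return Counter((a, s) for _, a, s in before) == Counter((a, s) for _, a, s in after)

before = [
    ("John", "123 rew dr, chicago", "IL"),
    ("Anthony", "456 qwe dr, springfield", "IL"),
    ("Ronny", "789 hjg dr, queens", "NY"),
    ("Joseph", "345 kli dr, nyc", "NY"),
]
after = [
    ("John", "456 qwe dr, springfield", "IL"),
    ("Anthony", "123 rew dr, chicago", "IL"),
    ("Ronny", "345 kli dr, nyc", "NY"),
    ("Joseph", "789 hjg dr, queens", "NY"),
]
print(is_valid_state_shuffle(before, after))  # prints True
```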

kennyapril · Post by **kennyapril** » Thu Dec 13, 2012 3:03 pm

In the scenario below I was trying to partition on state only. Can you please suggest how to partition on just one field? After that, I used a Sort stage to sort on state code and order.

Thank you!

suryadev · Post by **suryadev** » Fri Dec 14, 2012 11:25 am

In the Transformer, pass the flow in parallel, select Hash partitioning for that field, and don't select the sort option. Then, in the next step, do the sort on the other fields.

That should do it
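To illustrate why hash partitioning on the state field alone is enough, here is a rough Python sketch of the idea. It is not how DataStage implements its hash partitioner; the function `hash_partition`, the use of `zlib.crc32`, and the row layout `(name, order, state)` are assumptions chosen for the example. The point is only that rows with the same state always land in the same partition, so a later sort on state and order sees every row for that state together.

```python
import zlib
from collections import defaultdict

def hash_partition(rows, key_index, num_partitions):
    """Assign each row to a partition based on a hash of one field."""
    partitions = defaultdict(list)
    for row in rows:
        key = str(row[key_index]).encode("utf-8")
        # Same key value -> same hash -> same partition number.
        partitions[zlib.crc32(key) % num_partitions].append(row)
    return partitions

rows = [
    ("John", 0.71, "IL"),
    ("Ronny", 0.15, "NY"),
    ("Anthony", 0.32, "IL"),
    ("Joseph", 0.90, "NY"),
]
parts = hash_partition(rows, key_index=2, num_partitions=3)

# Next step: sort within each partition on (State, Order).
for number, part in parts.items():
    part.sort(key=lambda row: (row[2], row[1]))
    print(number, part)
```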
