Re-generating sequence numbers within list
Posted: Wed Jun 18, 2008 9:30 pm
by wahi80
Hi,
I have data as follows:
Code: Select all
Jersey City,NJ
Princeton, NJ
Houston,TX
Dallas,TX
LA,CA
Miami,FL
I need to assign a sequence number to each city within the state. Hence my output should look like this:
Code: Select all
1,Jersey City,NJ
2,Princeton, NJ
1,Houston,TX
2,Dallas,TX
1,LA,CA
1,Miami,FL
The sequence should re-start for each state.
I think I need to sort the data by state first and use the Sort stage's "create key change column" option. But how do I restart the sequence for each state?
Regards
Wah
Posted: Wed Jun 18, 2008 11:40 pm
by Minhajuddin
You can declare a stage variable in a transformer after your sort stage which can be used as a counter.
Code: Select all
Input =====> Sort ====================> Transformer =====> Output
             (create key change         (use the stage variable given
             on state)                  below to generate counts)
Code: Select all
counterVariable==> if not(ip.keyChange) then (counterVariable + 1) else 1
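To make the counter logic concrete outside DataStage, here is a minimal Python sketch of the same idea. It is not DataStage code; the function name and the tuple layout are hypothetical, and it assumes the rows are already sorted by state and carry a key-change flag that is 1 on the first row of each state and 0 otherwise.

```python
def number_within_groups(rows):
    """Assign a per-state sequence number.

    rows: iterable of (city, state, key_change) tuples, already sorted
    by state. key_change is assumed to be 1 on the first row of each
    state and 0 on the remaining rows of that state.
    """
    counter = 0
    numbered = []
    for city, state, key_change in rows:
        # Same rule as the stage variable: restart at 1 on a key change,
        # otherwise increment the running counter.
        counter = 1 if key_change else counter + 1
        numbered.append((counter, city, state))
    return numbered


rows = [
    ("Jersey City", "NJ", 1),
    ("Princeton", "NJ", 0),
    ("Houston", "TX", 1),
    ("Dallas", "TX", 0),
    ("LA", "CA", 1),
    ("Miami", "FL", 1),
]
result = number_within_groups(rows)
for seq, city, state in result:
    print(f"{seq},{city},{state}")
```

The counter keeps its value from one row to the next, which is the role the stage variable plays in the Transformer.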
Posted: Thu Jun 19, 2008 12:36 am
by ray.wurlod
This is the correct approach, and it also requires that the data are partitioned and sorted by state.
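Why partitioning matters can be illustrated with a small Python sketch (an analogy only, not how the parallel engine is implemented). The function names, the CRC32-based hash, and the partition count are assumptions made for the example. Because every row of a given state lands in the same partition, each partition can restart its own counter safely.

```python
import zlib


def partition_by_state(rows, num_partitions):
    """Send every row of a given state to the same partition."""
    parts = [[] for _ in range(num_partitions)]
    for city, state in rows:
        idx = zlib.crc32(state.encode("utf-8")) % num_partitions
        parts[idx].append((city, state))
    return parts


def number_partition(part):
    """Sort one partition by state, then number rows within each state."""
    numbered = []
    counter = 0
    previous = None
    # sorted() is stable, so cities keep their input order within a state.
    for city, state in sorted(part, key=lambda r: r[1]):
        counter = 1 if state != previous else counter + 1
        previous = state
        numbered.append((counter, city, state))
    return numbered


rows = [
    ("Jersey City", "NJ"), ("Princeton", "NJ"),
    ("Houston", "TX"), ("Dallas", "TX"),
    ("LA", "CA"), ("Miami", "FL"),
]
numbered = []
for part in partition_by_state(rows, num_partitions=2):
    numbered.extend(number_partition(part))

# Collect the sequence numbers seen for each state across all partitions.
by_state = {}
for seq, city, state in numbered:
    by_state.setdefault(state, []).append(seq)
```

If rows for one state were spread across partitions instead, each partition would start that state at 1 and the output would contain duplicate sequence numbers.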
Posted: Thu Jun 19, 2008 8:41 am
by wahi80
ray.wurlod wrote: This is the correct approach, and requires also that the data are partitioned and sorted by state. ...
Hi,
The keyChange column from the Sort stage is not being generated properly. I think it is due to a partitioning error. I did the following for the first half of the job:
Code: Select all
InputSeq -------------> Sort Stage -------------> OutputSeq
                        (Hash partitioned and     (Sort Merge Collector)
                        sorted by State)
Is there anything I'm missing?
Regards
Wah
Posted: Thu Jun 19, 2008 9:46 am
by vidya_6_2000
I have the same situation, but my file is already sorted, so I cannot use the Sort stage. In that case, how do I generate a variable that changes value when the key value changes and otherwise remains the same?
In my server job, I could use the RowProcCompareWithPreviousValue routine that IBM ships with the tool. There is no such routine for parallel jobs.
Regards,
Vidya Iyer
Posted: Thu Jun 19, 2008 10:23 am
by wahi80
Hi,
There were some spaces in the fields which needed to be trimmed.
The numbers are now generated in the right order.
Thanks for the help!!
Wah
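The effect of untrimmed spaces on key-change detection can be shown with a short Python sketch. This is an illustration only; the function name and the sample values are hypothetical. A state read as " NJ" compares as different from "NJ", so a key change is flagged in the middle of what should be one group.

```python
def key_change_flags(states):
    """Return 1 where a state differs from the previous row's state, else 0."""
    flags = []
    previous = None
    for state in states:
        flags.append(1 if state != previous else 0)
        previous = state
    return flags


# The second value carries a leading space, as in "Princeton, NJ".
raw = ["NJ", " NJ", "TX", "TX"]

untrimmed = key_change_flags(raw)
trimmed = key_change_flags([s.strip() for s in raw])

print(untrimmed)  # the padded value is treated as a new key
print(trimmed)    # after trimming, both NJ rows share one key
```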
Posted: Thu Jun 19, 2008 10:54 pm
by ray.wurlod
The Sort stage is perfect to use if the data are already sorted. Set the key's Sort Key Mode property to "Don't Sort (Previously Sorted)". This prevents DataStage from inserting a tsort operator into the step (= job).
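For input that is already sorted, the numbering needs only a single pass and no re-sort. A minimal Python sketch of that idea follows; it is an analogy rather than DataStage code, the function name is hypothetical, and it assumes rows for the same state are adjacent in the input.

```python
from itertools import groupby


def number_presorted(rows):
    """Number rows within each run of equal states, without sorting.

    rows: iterable of (city, state) tuples in which rows for the same
    state are already adjacent.
    """
    numbered = []
    # groupby only groups consecutive equal keys, so it relies on the
    # input order exactly as "already sorted" data would.
    for state, group in groupby(rows, key=lambda r: r[1]):
        for seq, (city, _) in enumerate(group, start=1):
            numbered.append((seq, city, state))
    return numbered


rows = [
    ("Jersey City", "NJ"), ("Princeton", "NJ"),
    ("Houston", "TX"), ("Dallas", "TX"),
    ("LA", "CA"), ("Miami", "FL"),
]
presorted_result = number_presorted(rows)
```

If the same state reappeared later in the input, this sketch would restart at 1 for the second run, which is why the input really does have to be sorted or grouped on the key.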
Posted: Fri Jun 20, 2008 7:38 am
by vidya_6_2000
Oh, thank you! That helped, and it makes sense.
Appreciate everybody's time and effort.
Regards,
Vidya Iyer