How to generate row numbers for each file-Job Design

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

How to generate row numbers for each file-Job Design

Post by sureshreddy2009 »

Hi :lol:

I have files in my inbound directory as
whs_e1_20100502.txt,
whs_e2_20100502.txt,
whs_e3_20100502.txt. dates are changing for every day load.

Now I have to generate the row numbers for each and every file. For example if first file contains 100 records, numbers will be generated in a count column as 1,2,3...100 for all records. If second file contains 30 records then count column will have numbers as 1,2,3...30.
some day multi batch scenario is also there like I am running 3 days batch at a time. For this example there are files for so many days with names whs_e1*.txt , in the place of * dates are coming.

Please tell me the job design to implement this... Thanks in advance for your solutions.
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
zulfi123786
Premium Member
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

If you are having specific count of files to read then you can use a transformer and create a column with deriavation as @OUTROWNUM system variable instead if you are not sure of how many files you would be reading each time then you can specify the multiple files to read option in this case you will not have the numbers restarted each time for different files.

In such case you can have a small unix script to do the same where you can generate numbers for each file using the cat -n option
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

Now Got the answer

Post by sureshreddy2009 »

Hi,

I got the answer for my question/scenario
Here is the solution.
I read all the files through sequential file stage by putting file pattern , I put another option file name column. after sequential file stage I used sort stage and is sorted based on file name column including hash paritioning on file name column. after that in transformer stage variables based on file name column by using previous key and after key logic. i generated numbers for each and every file starting from number 1 incrementing 1 by 1 up to completion of all records in that file. and atlast i loaded in the target table.
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You can't do that in a server job.
(Moved, as requested)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

While you may not be able to do that, you can certainly do that in a Server job. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

Post by sureshreddy2009 »

If your post is explaining about this logic will be implement only in server jobs not in paralell jobs.

I can say one thing that I implemented in parallel jobs 8.0.1
I tested the scenario also.
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

No, the comment from Ray is directed at your choice of forum to post in.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

Post by sureshreddy2009 »

But my problem is related to parallel jobs that is why I posted in parallel jobs forum.
Any how I got it thanks
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Double-check where you posted, where we are right now. We are not in the parallel jobs forum. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
sureshreddy2009
Participant
Posts: 62
Joined: Sat Mar 07, 2009 4:59 am
Location: Chicago
Contact:

Post by sureshreddy2009 »

Next time I will take care the forum selection :? .. Thanks a ton :)
Suresh Reddy
ETL Developer
Research Operations

"its important to know in which direction we are moving rather than where we are"
Post Reply