How to generate row numbers for each file-Job Design
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact:
How to generate row numbers for each file-Job Design
Hi
I have files in my inbound directory as
whs_e1_20100502.txt,
whs_e2_20100502.txt,
whs_e3_20100502.txt. dates are changing for every day load.
Now I have to generate the row numbers for each and every file. For example if first file contains 100 records, numbers will be generated in a count column as 1,2,3...100 for all records. If second file contains 30 records then count column will have numbers as 1,2,3...30.
some day multi batch scenario is also there like I am running 3 days batch at a time. For this example there are files for so many days with names whs_e1*.txt , in the place of * dates are coming.
Please tell me the job design to implement this... Thanks in advance for your solutions.
I have files in my inbound directory as
whs_e1_20100502.txt,
whs_e2_20100502.txt,
whs_e3_20100502.txt. dates are changing for every day load.
Now I have to generate the row numbers for each and every file. For example if first file contains 100 records, numbers will be generated in a count column as 1,2,3...100 for all records. If second file contains 30 records then count column will have numbers as 1,2,3...30.
some day multi batch scenario is also there like I am running 3 days batch at a time. For this example there are files for so many days with names whs_e1*.txt , in the place of * dates are coming.
Please tell me the job design to implement this... Thanks in advance for your solutions.
Suresh Reddy
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
-
- Premium Member
- Posts: 730
- Joined: Tue Nov 04, 2008 10:14 am
- Location: Bangalore
If you are having specific count of files to read then you can use a transformer and create a column with deriavation as @OUTROWNUM system variable instead if you are not sure of how many files you would be reading each time then you can specify the multiple files to read option in this case you will not have the numbers restarted each time for different files.
In such case you can have a small unix script to do the same where you can generate numbers for each file using the cat -n option
In such case you can have a small unix script to do the same where you can generate numbers for each file using the cat -n option
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact:
Now Got the answer
Hi,
I got the answer for my question/scenario
Here is the solution.
I read all the files through sequential file stage by putting file pattern , I put another option file name column. after sequential file stage I used sort stage and is sorted based on file name column including hash paritioning on file name column. after that in transformer stage variables based on file name column by using previous key and after key logic. i generated numbers for each and every file starting from number 1 incrementing 1 by 1 up to completion of all records in that file. and atlast i loaded in the target table.
I got the answer for my question/scenario
Here is the solution.
I read all the files through sequential file stage by putting file pattern , I put another option file name column. after sequential file stage I used sort stage and is sorted based on file name column including hash paritioning on file name column. after that in transformer stage variables based on file name column by using previous key and after key logic. i generated numbers for each and every file starting from number 1 incrementing 1 by 1 up to completion of all records in that file. and atlast i loaded in the target table.
Suresh Reddy
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact:
If your post is explaining about this logic will be implement only in server jobs not in paralell jobs.
I can say one thing that I implemented in parallel jobs 8.0.1
I tested the scenario also.
I can say one thing that I implemented in parallel jobs 8.0.1
I tested the scenario also.
Suresh Reddy
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
ETL Developer
Research Operations
"its important to know in which direction we are moving rather than where we are"
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact:
-
- Participant
- Posts: 62
- Joined: Sat Mar 07, 2009 4:59 am
- Location: Chicago
- Contact: