Find Last Record in File
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
Hi Guys,
I have a sequential file that can contain a variable number of rows. While processing the input data in the Transformer, is there any way to tell whether the row I am currently processing is the last row in the input source file? Any help would be appreciated.
Thanks
Naveen
-
- Premium Member
- Posts: 224
- Joined: Tue Sep 24, 2002 7:32 am
- Location: Denver, CO USA
If it is a fixed-width file, you could use a stage variable to call a routine (once) that goes to the OS and parses out the size of the file. Divide that by the length of each record, then compare the result with @INROWNUM to determine when you are at the last record.
Sorry I don't have an example to give you, just an idea. Otherwise, there isn't a built-in way of knowing you are on the last row.
John
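A minimal sketch of the arithmetic described above, written in Python rather than as a DataStage routine, so it only illustrates the idea. The function names are hypothetical, and it assumes the record length passed in includes the line terminator (if the file has one).

```python
import os


def fixed_width_row_count(path, record_length):
    """Derive the row count of a fixed-width file from its size in bytes.

    record_length must include the line terminator, if the file has one.
    """
    size = os.path.getsize(path)
    if size % record_length != 0:
        raise ValueError("file size is not a whole multiple of the record length")
    return size // record_length


def fixed_width_rows(path, record_length):
    """Yield (row_number, record, is_last) for each fixed-width record."""
    total = fixed_width_row_count(path, record_length)  # computed once, up front
    with open(path, "rb") as f:
        for row_number in range(1, total + 1):  # 1-based, like @INROWNUM
            record = f.read(record_length)
            yield row_number, record, row_number == total
```

The size check is there because a file whose size is not a whole multiple of the record length is not really fixed-width, and the division would silently give a wrong row count.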
Naveen,
Since the sequential file has no forward pointer, you can never know whether your next READ is going to reach the end-of-file. Depending upon what you want to do at EOF, you have several approaches. As John mentioned, only a fixed-length record file will let you know the number of lines, and even there you would have to position to the end of the file to get that number. In other words, you will need to make at least one pass of all the data.
One approach would be to use the result of a UNIX wc -l command, which counts the number of lines in the file, and pass that result to your job as a parameter. Then you could perform a check like "IF @INROWNUM=#MyInputNumberOfLines# THEN ..."
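The two-pass idea can be sketched outside DataStage as follows. This is an illustration in Python under stated assumptions, not a job design: `count_lines` and `flag_last_row` are hypothetical names, and `count_lines` counts newline characters, which is what wc -l reports.

```python
def count_lines(path):
    """Count newline characters in the file, as `wc -l` does."""
    count = 0
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            count += chunk.count(b"\n")
    return count


def flag_last_row(path, expected_rows):
    """Yield (row_number, line, is_last), using a row count obtained beforehand.

    expected_rows plays the role of the job parameter holding the line count.
    """
    with open(path, "rb") as f:
        for row_number, line in enumerate(f, start=1):  # 1-based, like @INROWNUM
            yield row_number, line, row_number == expected_rows
```

One caveat worth checking on real data: because the count is of newline characters, a file whose final line has no trailing newline is reported as one line shorter than the number of rows read, so the equality test would fire one row early.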
-
- Charter Member
- Posts: 199
- Joined: Tue Jan 18, 2005 2:50 am
- Location: India
Alternatively,
Step 1: Write a routine that uses DSExecute to run the wc -l command on the UNIX box and return the line count. Pass the file name and path as arguments to the routine.
Step 2: Call that routine in a stage variable in one of your Transformers to get control over the records. :D
Shantanu Choudhary
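As a rough analogy for step 1, here is the same shell-out written in Python with `subprocess` instead of a DataStage routine calling DSExecute. The function name is hypothetical, and the sketch assumes a UNIX-like host where `wc` is on the PATH.

```python
import subprocess


def wc_line_count(path):
    """Run `wc -l` on the given file and return the count as an int."""
    result = subprocess.run(
        ["wc", "-l", path],
        capture_output=True,
        text=True,
        check=True,  # raise if wc exits with a non-zero status
    )
    # With a file argument, wc prints "<count> <path>"; the count is the first token.
    return int(result.stdout.split()[0])
```

Passing the command as a list avoids shell quoting problems when the path contains spaces. A real routine would also need to decide what to return when the command fails.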
-
- Premium Member
- Posts: 224
- Joined: Tue Sep 24, 2002 7:32 am
- Location: Denver, CO USA
A technique I use to call a routine in a stage variable only once is:
1. set the default to @NULL when defining the stage variable
2. use 'IF IsNull(stage_variable_name) THEN call routine ELSE stage_variable_name'
3. of course, you need to make sure the routine doesn't return NULL or it will be called again...
John
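The call-once pattern in the three steps above can be mimicked in Python to show why point 3 matters. This is only a sketch: `CallOnce` is a hypothetical class standing in for the stage variable, with `None` playing the role of the @NULL default.

```python
class CallOnce:
    """Mimics a stage variable that defaults to null and is filled on the first row."""

    def __init__(self, routine):
        self.routine = routine  # hypothetical expensive call, e.g. a line count
        self.value = None       # plays the role of the @NULL default
        self.calls = 0          # how many times the routine actually ran

    def evaluate(self):
        # Equivalent of: IF IsNull(stage_variable) THEN routine() ELSE stage_variable
        if self.value is None:
            self.calls += 1
            self.value = self.routine()
        return self.value
```

Evaluating this once per row runs the routine on the first row only, provided the routine returns something other than `None`. If it does return `None`, the null test succeeds on every row and the routine is called each time, which is the pitfall noted in point 3.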
Last edited by ds_developer on Fri Jul 01, 2005 11:32 am, edited 1 time in total.
John,
that's a good approach. I usually use a COMMON in the routine to skip subsequent calls, but calling a function or subroutine has much more overhead than an IF-THEN construct. In either case we are adding extraneous code for each row, so for efficiency it makes sense to put this part outside of the loop. One still needs an IF-THEN on each row to test whether or not we are at the last record, but that is unavoidable under the circumstances.
-
- Participant
- Posts: 232
- Joined: Sat May 07, 2005 2:49 pm
- Location: USA
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Charter Member
- Posts: 199
- Joined: Tue Jan 18, 2005 2:50 am
- Location: India
ArndW wrote:
Shantanu,
the basic idea is good, but a stage variable will get executed for each row passed through the transformer... This will probably slow down your job considerably. Better a combination - use the routine, but call it from a parent Sequencer.

Thanks for highlighting that; sorry, I missed writing that part. For that I would have used a COMMON, or the way ds_developer suggested, or StageVar = If @INROWNUM = 1 Then (call routine) Else StageVar. The third approach is almost the same as ds_developer's approach.
Chulett, your approach is great; I never thought of calling a routine in the Initial Value.
Shantanu Choudhary
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: