
Processing First Record Differently Than Others

Posted: Fri Oct 09, 2009 6:20 pm
by tracy
I've got a data file being sent to me that has various Record Types. For instance:

RecordType1|10
RecordType2|09/09/2009|Tracy|3
RecordType2|09/08/2009|Tracy|5
RecordType2|09/10/2009|Tracy|2

Record Type 1 is a validation record. I need to add up the numbers in all Record Type 2 records and make sure the total matches the validation record (Type 1). Here, 3 + 5 + 2 = 10, so it's a match and the file is valid.
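
For illustration only (outside DataStage), the same check can be sketched in awk against a hypothetical file name datafile.txt, assuming the pipe-delimited layout above:

    awk -F'|' '
        # RecordType1 carries the expected total; RecordType2 rows carry the counts
        $1 == "RecordType1" { expected = $2 }
        $1 == "RecordType2" { total += $4 }
        END {
            if (total == expected) print "Valid file: " total " = " expected
            else                   print "Mismatch: " total " != " expected
        }
    ' datafile.txt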

I've got this working by processing the file through two different Sequential File stages:
1. Once in the format of Record Type 1 (where there are two fields)
2. A second time in the format of Record Type 2 (where there are four fields), with the Incomplete column option set to Replace (otherwise the stage complains that the Record Type 1 row doesn't have enough fields).

In my Router Stage I've got a constraint that will pull either RecordType1 or RecordType2 as appropriate.

It seems to be working. My concern is that the data file is huge and takes quite some time to process. When I'm processing for Record Type 2, I have to go through the whole file, so there's no choice there. But when I'm processing for Record Type 1, I don't want it to go through the entire file again when I know I'm only interested in the first record.

I've read through some of the other posts and see suggestions to change Abort After Rows to 1 in the Router stage, but I don't want the job to end in an Aborted state; I want to reserve aborting for other purposes.

I also read a suggestion to put "@INROWNUM < 2" in the constraint in the Router. I've done this and it appears to be working. However, when I watch it while it's processing, I see this:

Sequential File (100 records) ---> Router ---> Database (1 record)

So it does limit the output to one record in the end, but it still appears to read through the entire sequential file. Is there a way to prevent this?

Posted: Fri Oct 09, 2009 6:25 pm
by chulett
Read just the header record (I assume there's only 1) using "head -1" in the Filter option of the stage.
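
As a sketch (the file name is hypothetical), the Filter option pipes the file through the given Unix command before the stage parses it, so only the first line ever reaches the job:

    $ head -1 datafile.txt
    RecordType1|10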

Posted: Sat Oct 10, 2009 2:49 am
by Sainath.Srinivasan
Or use a Complex Flat File stage to constrain the flow by record type.

Posted: Sat Oct 10, 2009 3:53 am
by ray.wurlod
Or read the file as a single string per row, and parse the different record types in a downstream Transformer stage.

Or read the file parsed according to RecordType2 metadata, with missing columns rules in place, and handle RecordType1 in the downstream Transformer stage.
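
Either way, each row ends up being split on '|' only after the record type is known. A rough analog outside DataStage (hypothetical file name), roughly what a Transformer derivation with Field() would do:

    awk '
        # Take each row as one string ($0) and parse it by record type
        {
            split($0, f, "|")
            if (f[1] == "RecordType1")      print "validation total = " f[2]
            else if (f[1] == "RecordType2") print "detail count = " f[4]
        }
    ' datafile.txt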

Posted: Mon Oct 12, 2009 1:37 pm
by tracy
Thanks! I used "head -1" in the Filter option of the stage since it was the first and easiest suggestion. Works like a charm.

I wasn't aware of this feature. It's going to come in really handy. Thanks!

Posted: Mon Oct 12, 2009 1:44 pm
by chulett
8)