Sequential File: More columns received than expected.

dwscblr · Post by **dwscblr** » Mon Jul 12, 2004 7:31 pm

We have a DataStage server job that reads from a sequential file, transforms the data and outputs it to another sequential file. The input is a quote-delimited file and is supposed to have only 15 columns.

But the data we receive from our source is erroneous at times. In a file, some rows are sent with fewer columns and some rows are sent with more columns. In either case, we have to drop the record.

For rows with fewer columns we use the Missing Columns option provided by DataStage on the output link, where we specify "Discard and Warn". But for too many columns there is no such option.

Has anybody come across this scenario?

Thanks
Sreeja

ray.wurlod · Post by **ray.wurlod** » Mon Jul 12, 2004 7:54 pm

You could re-design, where you read the entire record as a single column, then parse out only the pieces you require.
Using this design, you can also report on erroneous input data.

rasi · Post by **rasi** » Mon Jul 12, 2004 10:07 pm

In the DataStage Designer there is an option in the Format tab of the Sequential File stage called "Suppress row truncation warnings". You can use it to suppress the warnings for rows with too many columns.

You can also do what Ray suggested and read each record as a single column. In addition, design a job that reads that single-column record and counts the number of delimiters in it. If the count doesn't match, reject the record; otherwise pass the valid record to another sequential file. Then use that sequential file as the input to the original job.

Thanks
Rasi Siva
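
For illustration only, here is a minimal Python sketch of the check described in the two replies above: treat each record as a single string, parse it, and keep it only if it has exactly the expected number of columns. This is not DataStage code. The function name `split_by_column_count`, the comma delimiter and the double-quote character are assumptions, since the thread only says the file is quote delimited with 15 columns.

```python
import csv

EXPECTED_COLUMNS = 15  # column count stated in the original post


def split_by_column_count(lines, expected=EXPECTED_COLUMNS,
                          delimiter=",", quotechar='"'):
    """Return (valid_rows, rejected_lines) for an iterable of raw records.

    The delimiter and quote character are assumptions, adjust them to
    match the real file format.
    """
    valid, rejected = [], []
    for line in lines:
        record = line.rstrip("\r\n")
        # Parse one physical line at a time so that a malformed row
        # cannot swallow the rows that follow it.
        fields = next(
            csv.reader([record], delimiter=delimiter, quotechar=quotechar),
            [],
        )
        if len(fields) == expected:
            valid.append(fields)      # parsed columns, ready for the main job
        else:
            rejected.append(record)   # keep the raw text for error reporting
    return valid, rejected
```

Parsing with the `csv` module rather than simply counting delimiter characters means a delimiter inside a quoted value is not counted as a column break. Records whose quoted values span several physical lines are not handled by this sketch.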

dwscblr · Post by **dwscblr** » Tue Jul 13, 2004 10:36 am

I found a way to achieve this.

In the Transformer stage I added the following constraint to the output link: NOT(OutputLink.REJECTED=@FALSE)

But this is exactly the opposite of what the DataStage documentation states.

Documentation excerpt: Input and Output link variables are predefined variables applicable to links. They can be used in the constraint or derivation expression of an output link.

The available Output link variables are:

LinkName.REJECTED. Is set according to whether the write of an output row to the named link was successful or not, i.e., if the write was successful LinkName.REJECTED is set FALSE.

Linkname.REJECTEDCODE. Returns an error code number if the write fails, or 0 if either the write succeeds or is rejected because a link constraint was not met. You can set return error codes for linkname.REJECTEDCODE by selecting from the Expression Editor Link Variables > Constants... menu options.

Notes
As per the documentation the constraint should be OutputLink.REJECTED = @FALSE.

If I set OutputLink.REJECTED = @FALSE, all records are rejected.
If I set OutputLink.REJECTED = @TRUE, it works exactly the opposite of what I want: the reject and output data are swapped.
If I set NOT(OutputLink.REJECTED = FALSE), it works as I want!!!

Am I missing something?