
Skip header record from complex flat file stage

Posted: Thu May 11, 2017 1:01 am
by harikhk
Hi,

I am receiving a zipped binary (EBCDIC) complex flat file.
It is a fixed-width file and contains a header record.

Could you please suggest how I can filter out the header record in the Complex Flat File stage?

The file is being unzipped at the job level, and on running the job it fails with the error below:


Short read encountered on import; this most likely indicates one of the following possibilities:
1) the import schema you specified is incorrect
2) invalid data (the schema is correct, but there is an error in the data).

I understand the issue is either with the file definition or with the data.

I was told to remove the header record and continue with the load.
Please suggest how I can filter the header record out in the Complex Flat File stage.

Posted: Thu May 11, 2017 2:13 am
by ray.wurlod
Presumably there is some kind of "record type" field. Use that to direct the header to a different output.

Posted: Thu May 11, 2017 2:42 am
by harikhk
I tried defining a new record with a single character column equal to the record length, but with no result.

Posted: Thu May 11, 2017 7:00 am
by chulett
How exactly is the file being unzipped "at the job level"? If you are sending it to standard out then you should be able to pipe it to something like "sed" to remove the first record before the job consumes it.
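A minimal sketch of that idea, assuming a hypothetical file name (data.gz) and a hypothetical 100-byte header; note that if the EBCDIC file is truly fixed-width with no line terminators, sed's line addressing won't apply and a byte-count skip is safer:

```shell
# If the records happen to be newline-delimited, drop the first record
# with sed before the job consumes the stream:
gunzip -c data.gz | sed '1d' > data_no_header.dat

# Fixed-width EBCDIC files usually have no line terminators; in that case
# skip the header by byte count instead. tail -c +N starts output at byte N,
# so for an assumed 100-byte header record use +101:
gunzip -c data.gz | tail -c +101 > data_no_header.dat
```

Either way the job would then read data_no_header.dat (or the pipe directly) with the header already stripped.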

Posted: Thu May 11, 2017 9:43 am
by FranklinE
Not sure about the compression aspect. If the data is coming into CFF in the clear, it should not be an issue.

CFF requires you to define every record type on the input, but it does not require you to have an output link for every type. I have many files with header and trailer records, and I never need the trailer, so I don't code an output link for it.

Posted: Thu May 11, 2017 2:57 pm
by chulett
... or that. :wink:

Posted: Thu May 11, 2017 5:11 pm
by vmcburney
Is this a file generated by Cobol or from IMS? There should be a Cobol Definition File or Copy Book that DataStage can import to load up the required metadata to read the file. This would not just identify the record type, level and position of the header record but would also handle Cobol keywords like OCCURS and REDEFINES.

Posted: Mon May 15, 2017 3:51 am
by harikhk
Thank you all for the replies. It has been identified that the copybook definition was outdated, and I am waiting for the revised copybook.
I hope it does not create any issues.

Posted: Mon May 15, 2017 5:31 am
by ray.wurlod
That'll do it every time, and says a great deal about information governance in your organization. You should escalate the existence of this lack of communication through the stewardship community - as far as the CDO if necessary.

Posted: Mon May 15, 2017 6:39 am
by FranklinE
It is often the case that there is a very different "method" of version control between platforms. Cobol systems generally have an embedded (and sometimes proprietary) method. We use Endevor, which both manages code between regions (dev, test, prod) and does automatic compiles. Interfacing that with a DataStage environment, at least here, just doesn't work.

We download copybooks during development, then rely on that same communication that didn't work for HK. We also enjoy strong discipline in our host development, and changes to copybooks are almost always made in the end "filler" of a copybook layout. This means that our code only needs to change if we have data being added in that filler area. If we don't need it, it doesn't actually change for us.