Sequential file stage - Number of readers per node

Thripura
Participant
Posts: 26
Joined: Wed Jan 27, 2010 5:56 am

Sequential file stage - Number of readers per node

Post by Thripura »

Hi,

In the Sequential File stage, the Number of Readers Per Node property is set to 2, and the Filter property is also used.
I am getting a broken pipe error and the job aborts when it is run with 15K records; there is no issue with smaller volumes.

Job design:

Sequential File ---> Filter ---> Column Import ---> Target DB

I am getting a warning about the Filter property of the Sequential File stage.

Warning message:
A filter can only be applied with multiple readers if the filter command is guaranteed to return the same number of bytes in the record for every record in the input file. If more or fewer bytes are returned the output dataset will contain duplicate or missing records

Director log:

Source subproc: tr: write error: Broken pipe
tr: write error
.

main_program: APT_PMsectionLeader(1, node1), player 4 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 5 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 7 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 8 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 9 - Unexpected exit status 1.
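
A side note on the tr errors in the log: whether a filter command is safe with multiple readers depends on whether it preserves record length, which is exactly what the warning above is saying. The broken pipe itself typically appears when a downstream reader exits early while tr is still writing. A minimal sketch with hypothetical commands (the actual filter used in the job is not shown in the post):

# Safe with multiple readers: one-for-one substitution,
# so every record keeps the same byte length
tr 'a-z' 'A-Z' < input.txt

# Not safe: deleting bytes changes record lengths, so the readers'
# byte-offset arithmetic no longer lines up with record boundaries
tr -d '\r' < input.txt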
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

15K might be the point where a differently-sized record is first encountered.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
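
One quick way to test that theory (a minimal sketch, assuming the source is a plain text file, hypothetically named input.txt):

# Count the records of each distinct length; more than one
# distinct length means the file is not truly fixed-width
awk '{ print length($0) }' input.txt | sort -n | uniq -c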
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

What is the tr command used for?

A better practice is to read the fixed-width records as a single column, without a filter command, using multiple readers per node; then apply the filtering logic in a downstream stage, and finally parse the single column into multiple columns using substring functions (a sketch follows below).

The added advantages would be:
1. Reading in parallel with multiple readers
2. Parsing the data into columns in parallel
3. No warnings/errors :)
- Zulfi
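
To illustrate zulfi123786's suggestion, a minimal sketch of the downstream logic, assuming a hypothetical fixed-width layout (key in bytes 1-5, name in bytes 6-25, amount in bytes 26-34) and a single input column hypothetically named Record:

Filter stage / Transformer constraint (keep only the wanted records):
    InRec.Record[1,5] = 'KEY01'

Transformer derivations (substring split of the single column):
    CustomerId = InRec.Record[1,5]
    Name       = Trim(InRec.Record[6,20])
    Amount     = InRec.Record[26,9]

Because the Sequential File stage now has no filter, the multiple-readers warning goes away, and both the filtering and the parsing run in parallel.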