Sequential file stage - Number of readers per node

Thripura
Participant
Posts: 26
Joined: Wed Jan 27, 2010 5:56 am

Sequential file stage - Number of readers per node

Post by Thripura »

Hi,

In the Sequential File stage, the Number of Readers Per Node property is set to 2, and the Filter property is also used.
I am getting a broken pipe error and the job aborts when it is run with 15K records; there is no issue with smaller volumes.

Job design:

Sequential File ---> Filter ---> Column Import ---> Target DB

I am getting a warning about the Filter property of the Sequential File stage.

Warning message:
A filter can only be applied with multiple readers if the filter command is guaranteed to return the same number of bytes in the record for every record in the input file. If more or fewer bytes are returned the output dataset will contain duplicate or missing records

Director log:

Source subproc: tr: write error: Broken pipe
tr: write error
.

main_program: APT_PMsectionLeader(1, node1), player 4 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 5 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 7 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 8 - Unexpected exit status 1.
APT_PMsectionLeader(1, node1), player 9 - Unexpected exit status 1.
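
A side note on the tr errors in the log: whether a filter command is safe with multiple readers depends on whether it preserves record length, which is exactly what the warning above is saying. The broken pipe itself typically appears when a downstream reader exits early while tr is still writing. A minimal sketch with hypothetical commands (the actual filter used in the job is not shown in the post):

# Safe with multiple readers: one-for-one substitution,
# so every record keeps the same byte length
tr 'a-z' 'A-Z' < input.txt

# Not safe: deleting bytes changes record lengths, so the readers'
# byte-offset arithmetic no longer lines up with record boundaries
tr -d '\r' < input.txt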
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

15K might be the point where a differently-sized record is first encountered.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
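
One quick way to test that theory (a minimal sketch, assuming the source is a plain text file, hypothetically named input.txt):

# Count the records of each distinct length; more than one
# distinct length means the file is not truly fixed-width
awk '{ print length($0) }' input.txt | sort -n | uniq -c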
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

What is the tr command used for?

A better practice is to read the fixed-width records as a single column, without a filter command, using multiple readers per node; then apply the filtering logic in a downstream stage, and finally parse the single column into multiple columns using substring functions (a sketch follows below).

The added advantages would be:
1. Reading in parallel with multiple readers
2. Parsing the data into columns in parallel
3. No warnings/errors :)
- Zulfi
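
To illustrate zulfi123786's suggestion, a minimal sketch of the downstream logic, assuming a hypothetical fixed-width layout (key in bytes 1-5, name in bytes 6-25, amount in bytes 26-34) and a single input column hypothetically named Record:

Filter stage / Transformer constraint (keep only the wanted records):
    InRec.Record[1,5] = 'KEY01'

Transformer derivations (substring split of the single column):
    CustomerId = InRec.Record[1,5]
    Name       = Trim(InRec.Record[6,20])
    Amount     = InRec.Record[26,9]

Because the Sequential File stage now has no filter, the multiple-readers warning goes away, and both the filtering and the parsing run in parallel.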