Sequential File Size Limit

estevesm
Participant
Posts: 7
Joined: Wed Jun 30, 2004 10:25 am

Sequential File Size Limit

Post by estevesm »

I'm trying to import a 1 GB sequential file using the Sequential File Stage in EE 7.5.
The job aborts at the last record, saying the data is bad. This is the error message:

Code:

StatementDetailFile,0: Short read encountered on import.  This most likely indicates one of the following possibilities:
1) The import schema you specified is incorrect
2) Invalid data (the schema is correct, but there is an error in the data)
Expected 504 record bytes, got 56

StatementDetailFile,0: Import error at record 2130439.
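
A quick way to check whether the file really does end in a short record is to compare the file size against the record length. Below is a minimal Python sketch, assuming the fixed 504-byte record length reported by the error; the file path is hypothetical:

Code:

import os

RECORD_LEN = 504  # fixed record length reported by the import error
path = "/data/StatementDetailFile.dat"  # hypothetical path to the source file

size = os.path.getsize(path)
records, leftover = divmod(size, RECORD_LEN)

print(f"{size} bytes = {records} full records + {leftover} leftover bytes")
if leftover:
    # A non-zero remainder means the last record is short -- exactly
    # what "Expected 504 record bytes, got 56" is reporting.
    print(f"short final record starts at byte offset {records * RECORD_LEN}")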
I split the file in two using sed, and the ETL job then works fine. I thought this might be related to the ulimit settings for my user ID, which are:

Code:

time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        32768
memory(kbytes)       32768
coredump(blocks)     2048
nofiles(descriptors) 2000
Am I missing something? Any recommendations on how to fix this?
I read somewhere that the FileSet stage is recommended for big files, but I could not find anything (even in the documentation) explaining how to use it to read that 1 GB file.

Any help would be really appreciated.

Rgds

Marcelo Silva, PMP
JPMorganChase
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

The usual limitation you run into is at 2 GB, not 1 GB. The error message is pretty clear: you are working with fixed-length records, and the last one simply doesn't match that length. If you were to add one line at the beginning of the file, you would almost certainly get the same error message, just one record later.

I think that when you split the file in two with sed, you may have (luckily) padded the last line so that it now matches the metadata format. This cause is likely if your data is blank-padded at the end of each record.
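
If that is what happened, a quick way to confirm it (and, on a copy of the file, to work around it) is to pad the short final record with blanks up to the full record length. A minimal Python sketch, again assuming the 504-byte record length from the error message and a hypothetical file path; if your records end in a newline terminator, the padding would need to go before it:

Code:

import os

RECORD_LEN = 504  # record length from the import error
path = "/data/StatementDetailFile.dat"  # hypothetical path -- pad a copy, not the original

leftover = os.path.getsize(path) % RECORD_LEN
if leftover:
    # Append blanks so the short final record reaches the full fixed
    # length -- the same effect the sed split may have had by accident.
    with open(path, "ab") as f:
        f.write(b" " * (RECORD_LEN - leftover))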