Sequential file - Record delimiter

bakul
Participant
Posts: 60
Joined: Wed Nov 10, 2004 2:12 am

Post by bakul »

Hello,

I am writing records to a sequential file and then using a shell script to split the file. The record delimiter is 'UNIX Newline' and the final delimiter is 'end'.
There are two columns in the file, and the second column is a VarChar(2000) that itself contains double quotes. When I use an awk command to retrieve the second column, only a partial string is returned.
If I open the file, manually press Enter at the end of every record, and then delete the resulting blank line (effectively putting a UNIX newline at the end), the awk statement gives the correct results.
What could be the reason for this? What change should I make to ensure that the file is read correctly?

Regards,
Bakul
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
The reason is ASCII zero (the NUL character).
This can occur, for example, when DS BASIC code opens a sequential file and seeks to a position past EOF. In that case every position skipped up to the newly written position is automatically filled with ASCII CHAR 0 (see the SEEK statement in the basic.pdf documentation).
IHTH,
Roy R.
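
For anyone wanting to confirm whether ASCII 0 bytes are really in the file, here is a minimal sketch. The sample file, its comma field delimiter, and its contents are made up for illustration; point the commands at the real output file instead.

```shell
#!/bin/sh
# Hypothetical sample standing in for the DataStage output file:
# two comma-separated columns, with one NUL byte inside the second column.
sample=$(mktemp)
printf 'k1,"abc\000def"\nk2,"plain"\n' > "$sample"

# Compare the byte count before and after deleting NULs.
# Any difference is the number of NUL bytes in the file.
total=$(wc -c < "$sample" | tr -d ' ')
clean=$(tr -d '\000' < "$sample" | wc -c | tr -d ' ')
nul_count=$((total - clean))
echo "NUL bytes found: $nul_count"

# od -c dumps the file character by character; NUL bytes show up as \0.
od -c "$sample" | head -n 5

rm -f "$sample"
```

A non-zero count would support the NUL explanation; a count of zero would suggest the truncation has some other cause.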
bakul
Participant
Posts: 60
Joined: Wed Nov 10, 2004 2:12 am

Post by bakul »

Thanks for your reply! But I am not sure I understand it completely. :?:
Is it because the ASCII 0 character is present, or because it is missing?
In either case, what is the workaround? How can I ensure that the complete record will be returned by the awk statement? Is there some specific setting for the sequential file stage to achieve this?
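
If embedded ASCII 0 bytes do turn out to be the cause, one possible workaround on the shell side is to strip them before awk sees the data, since some awk implementations stop reading a string at the first NUL. This is only a sketch under assumptions: the sample file, its contents, and the comma field delimiter are hypothetical, and it does not address any setting in the sequential file stage itself.

```shell
#!/bin/sh
# Hypothetical input: two columns separated by a comma (assumed delimiter),
# second column quoted and containing an embedded NUL byte.
infile=$(mktemp)
printf 'k1,"abc\000def"\nk2,"plain"\n' > "$infile"

# Delete NUL bytes with tr, then take everything after the first comma as
# column 2, so commas inside the second column do not split it further.
second_col=$(tr -d '\000' < "$infile" |
    awk '{ i = index($0, ","); print substr($0, i + 1) }')

printf '%s\n' "$second_col"

rm -f "$infile"
```

Splitting on the first delimiter only is a deliberate choice here, because a free-text VarChar column may itself contain the delimiter character.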