
Reading a sequential file from an EE job into a server job

Posted: Wed Feb 28, 2007 8:33 am
by sjordery
Hello Everyone,

I have searched the forum and could not find this specific problem, so if anyone can help out, that would be much appreciated.

I have a Parallel job which ends up creating a sequential file with the following Format settings:

Final Delimiter: None
Delimiter: Comma
Quote: Double

This file is intended to be read further downstream by a Server job. I have the settings for the Sequential File stage in the server job set to:

Delimiter: ,
Quote character: "

Both sequential stages are set up to use the same Table Definition.

The problem is that the EE job runs fine and produces a file that I can view easily in DataStage. However, when I try to view the file in the Server job it displays very strangely, with odd control characters.

Can sequential files created by an EE job be read into Server jobs?

Thanks in advance.

Posted: Wed Feb 28, 2007 8:46 am
by kumar_s
Hi, welcome to DSXchange :D !!!!
Are you sure there is no change in the metadata?
Was the file migrated across servers?
Are the special characters at the end of the lines?

Posted: Wed Feb 28, 2007 8:57 am
by sjordery
Hi kumar_s,

Thanks for your reply, and for the welcome! :)

The metadata is definitely the same. When I view the file in the Server job all the columns are displayed, but the data seems to spill over into the next rows.

The file is on the same server. It might help to tell you that I used a Join stage in the EE job - the input to the job was four datasets and the output is the sequential file.

The control characters don't appear at the end of the lines - they are all over the place!

Thanks again.

Posted: Wed Feb 28, 2007 10:26 am
by ArndW
What does the file look like when you view the contents from UNIX?
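
If you want to make the non-printing characters visible from the UNIX prompt, something along these lines should do it (the path below is just an example - substitute your own file):

# show non-printing characters explicitly (carriage returns appear as ^M)
cat -v /path/to/output_file.txt | head -20

# or dump each byte in character/escape form
od -c /path/to/output_file.txt | head -20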

Posted: Wed Feb 28, 2007 10:53 am
by sjordery
Hi ArndW,

The file when viewed in vi contains all kind of junk like:

"0^M" - I was expecting "0" - EDITED: checked the input and there are control characters where-ever the ^M appears. The stuff below is still a mystery though.

and

^@28/12/2003^F^@137064^B^@NA^B^

where I would have expected:

"28/12/2003","137064","NA"

It looks to me therefore that the join of datasets into a sequential file isn't working. Is this something that could cause problems and that should be avoided?

Thanks

Re: Reading a sequential file from an EE job into a server j

Posted: Wed Feb 28, 2007 12:51 pm
by girish119d
I am not sure, but one possibility is that the parallel job is creating the file as a DOS file, and when you try to view it in the server job you are reading it as a UNIX file. Please check this - you should create and read the file as a UNIX file.
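
A quick way to test that from the UNIX prompt, assuming the file is at /path/to/output.txt (placeholder path):

# 'file' usually reports "with CRLF line terminators" when it is a DOS-format file
file /path/to/output.txt

# strip the carriage returns to produce a UNIX-format copy
tr -d '\r' < /path/to/output.txt > /path/to/output_unix.txt

If dos2unix is installed on your server, it does the same job as the tr command.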

Posted: Wed Feb 28, 2007 1:57 pm
by kumar_s
As noted, what do you see when you vi the file on the UNIX server?

Posted: Wed Feb 28, 2007 2:58 pm
by ray.wurlod
What stage type did you use to write the sequential file in the Parallel job?

Try changing the format of the file so that it uses a UNIX-style record delimiter. Having no record delimiter is contra-indicated if the data are variable length and/or in delimited format.
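
You can confirm what record delimiter is actually being written by dumping the last few bytes of the file, for example (placeholder path):

# UNIX records end in \n, DOS records in \r \n, and with no delimiter the records simply run together
tail -c 60 /path/to/output.txt | od -c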

Posted: Wed Feb 28, 2007 4:20 pm
by sjordery
Thanks Ray.

The sequential file is being written as output from a Join stage. The input to the Join stage is four separate datasets, all with the same layout.

I tried setting the format record delimiter to UNIX newline, but no luck.

The other odd thing is that there are 601 rows input, and the stats show 601 rows output from the Join, but when I View Data on the sequential file, only 24 of the rows are shown!

Cheers

Posted: Wed Feb 28, 2007 4:37 pm
by DSguru2B
24 must be the limit on the Sequential File stage when you hit View Data. Increase that to 9999, which I think is the maximum you can go to, and it will pull up everything - at least in your case.

Posted: Wed Feb 28, 2007 4:42 pm
by sjordery
[quote="DSguru2B"]24 must be the limit on the sequential file stage when you hit view data.[/quote]

Thanks DSguru2B, but I have set the limit to 1000 and the display still only shows 24. They are also 24 rows from various places in the file, not just the first 24.

I put in a Peek and its output shows that all 601 rows are being passed down the link.

Thanks

Posted: Wed Feb 28, 2007 5:10 pm
by DSguru2B
It has to be something with the display limit. Do a wc -l at the UNIX level and see what the true count is.
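
For example (placeholder path):

wc -l /path/to/output.txt   # counts newline-terminated records
wc -c /path/to/output.txt   # total bytes - useful when records are not newline-terminated

Bear in mind that wc -l only counts newlines, so if the records are not newline-terminated the line count will come out low.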

Posted: Wed Feb 28, 2007 5:44 pm
by sjordery
Thanks for that, I shall check it out.

Incidentally, out of the following columns, the first outputs ok, the second doesn't:

Int_18266_M9Q:string[max=35];
Int_05350_N0Q:string[max=35] {prefix=2};

This is from the OSH. Could you tell me where the prefix=2 is defined please?
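
In case it helps the diagnosis, I am dumping the raw bytes to see whether the control characters in front of each value could be a two-byte length rather than a delimiter - something like this (placeholder path):

# character view and hex view of the first few records
od -c /path/to/output.txt | head -40
od -t x1 /path/to/output.txt | head -40

If the ^F^@ in front of 137064 really is a two-byte length, it should show up as 06 00 in the hex dump.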

Thanks

Posted: Thu Mar 01, 2007 5:23 am
by sjordery
Hello again all,

I have been through the job with a fine-toothed comb and cannot for the life of me work out why there is "{prefix=2}" after the fields in the OSH. This seems to me likely to be the problem, or at least indicative of it.

Can anyone advise me please as to where this prefix is set?

Many thanks in advance.

Posted: Thu Mar 01, 2007 9:42 am
by sjordery
Hi All,

Managed to get around this by joining the datasets into a complex flat file, then loading the CFF to a sequential file, at which point it is readable in a Server job.

Still perplexed by the original problem, but I am happy with a workaround, so onwards and upwards...

Thanks again to anyone that offered help.