
AGGGGGGGGGGGGG!!!

Posted: Thu Jan 08, 2004 11:37 am
by JDionne
OK, I'm going to go crazy here. I have taken previous advice and am running a large file through three Transformers, creating three small files, and then using the DOS copy command I am adding them back together for a load.
I'm getting the following error: JOCDEVLoadStage..Sequential_File_0.DSLink3: nls_read_delimited() - row 627743, column LINE, required column missing

This made me think that I have a null in the data. I have investigated further and found no nulls or blank strings in that column. I have to load this now, so I'm going to load all the little files, but I have to get this fixed.
Any suggestions would be great.
Jim

Posted: Thu Jan 08, 2004 11:49 am
by kcbland
Are you using fixed-width files or delimited files? When dealing with sequential files you have to be aware of embedded <CR> and <LF> characters, as well as, in the case of delimited files, the delimiter being used within the data.

It doesn't matter if 1 big file or 100 small files are used. I suspect your issue is related to an embedded <CR> or <LF>. This message typically happens because the <CR> or <LF> forms a newline in the data, so the next line is just the remainder of the previous row. Now your metadata and data do not match, thus the error message.

Unless line 627743 falls near where two files were concatenated together (look at your row counts on the three files and figure it out), your problem is data related. The other day I told you to make sure you did NOT check omit newline, because of the way sequential files behave without a trailing newline. If you didn't heed that advice, this could also be your problem.
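[Editor's note] The embedded-delimiter/newline theory can be pre-checked outside DataStage. A minimal Python sketch, assuming a comma delimiter and a five-column layout (both placeholders for the job's real metadata):

```python
# Flag rows whose field count differs from the expected metadata.
# A delimiter embedded in the data, or a stray <CR>/<LF> splitting
# a row, will show up here as a short or long line.

EXPECTED_COLS = 5          # assumed column count from the job metadata
DELIM = ","                # assumed delimiter

def find_bad_rows(path, expected=EXPECTED_COLS, delim=DELIM):
    bad = []
    with open(path, "rb") as f:          # binary mode keeps stray bytes visible
        for lineno, raw in enumerate(f, start=1):
            line = raw.rstrip(b"\r\n")
            ncols = line.count(delim.encode()) + 1
            if ncols != expected:
                bad.append((lineno, ncols))
    return bad
```

Any row reported either contains the delimiter inside a field or was split by an unexpected newline.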

Posted: Thu Jan 08, 2004 12:02 pm
by JDionne
kcbland wrote:Are you using fixed-width files or delimited files? When dealing with sequential files you have to be aware of embedded <CR> and <LF> characters, as well as, in the case of delimited files, the delimiter being used within the data.

It doesn't matter if 1 big file or 100 small files are used. I suspect your issue is related to an embedded <CR> or <LF>. This message typically happens because the <CR> or <LF> forms a newline in the data, so the next line is just the remainder of the previous row. Now your metadata and data do not match, thus the error message.

Unless line 627743 falls near where two files were concatenated together (look at your row counts on the three files and figure it out), your problem is data related. The other day I told you to make sure you did NOT check omit newline, because of the way sequential files behave without a trailing newline. If you didn't heed that advice, this could also be your problem.

@#@$%!#$#@%!%@#$^#$$@# stupid delimiters!!! I'm willing to put money on the fact that I have an embedded delimiter. I'll check and let you know.
Jim

Posted: Thu Jan 08, 2004 12:50 pm
by JDionne
kcbland wrote:Are you using fixed-width files or delimited files? When dealing with sequential files you have to be aware of embedded <CR> and <LF> characters, as well as, in the case of delimited files, the delimiter being used within the data.

It doesn't matter if 1 big file or 100 small files are used. I suspect your issue is related to an embedded <CR> or <LF>. This message typically happens because the <CR> or <LF> forms a newline in the data, so the next line is just the remainder of the previous row. Now your metadata and data do not match, thus the error message.

Unless line 627743 falls near where two files were concatenated together (look at your row counts on the three files and figure it out), your problem is data related. The other day I told you to make sure you did NOT check omit newline, because of the way sequential files behave without a trailing newline. If you didn't heed that advice, this could also be your problem.
It's not a delimiter problem; I reran the job with a delimiter that was not in the data ("|") and it still failed. I haven't seen the option for omit newline, so I'm sure I didn't check it. I'd like to know where it is, though, so that I can check to be certain. I don't think I have a <CR> or <LF> in the data; I can get the original file to load. It's the file that has been combined with the DOS copy command that will not load. Still scratching around here.
Jim

Posted: Thu Jan 08, 2004 1:54 pm
by roy
Hi,
Well, another way might be the type command, something like

Code: Select all

type file1 file2 file3 > file4
In case that doesn't work, you might have a problem with one of two things:
1. As Ken said, a line break due to some unseen character in your data (could be a CR).
2. A delimiter mismatch between the three files.

The thing that bothers me is that loading each file works, so could it be that somehow the copy command messes things up? :shock:

Do tell us if and how you solved this.
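[Editor's note] One DOS-specific wrinkle worth checking: in ASCII mode (copy file1+file2 out), copy can append a Ctrl-Z (0x1A) end-of-file byte to the result, while copy /B file1+file2+file3 out forces binary mode and avoids it. As a sanity check, the same byte-for-byte concatenation can be sketched in Python (filenames hypothetical):

```python
# Byte-for-byte concatenation, avoiding the Ctrl-Z (0x1A) EOF marker
# that DOS "copy" can append in ASCII mode ("copy /B" avoids it there).

def concat_files(parts, target, chunk=1 << 20):
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                while True:
                    buf = src.read(chunk)
                    if not buf:
                        break
                    out.write(buf)   # write exactly the bytes read, nothing added
```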

Posted: Thu Jan 08, 2004 1:58 pm
by kcbland
Look at the Sequential File stage definition. There is a check box for omit newline. If you wrote a file with this NOT checked, the data looks like this:

Code: Select all

file 1:
aaaa <LF>
bbbb <LF>
cccc <LF>

file 2:
dddd <LF>
eeee <LF>
ffff<LF>

concatenated you get:
aaaa <LF>
bbbb <LF>
cccc <LF>
dddd <LF>
eeee <LF>
ffff<LF>
If you wrote a file with this checked, the data looks like this:

Code: Select all

Omitting newline file 1:
aaaa <LF>
bbbb <LF>
cccc

Omitting newline file 2:
dddd <LF>
eeee <LF>
ffff

concatenated you get:
aaaa <LF>
bbbb <LF>
ccccdddd <LF>
eeee <LF>
ffff

Hash File??

Posted: Thu Jan 08, 2004 2:53 pm
by 1stpoint
Why are you writing to 3 separate DOS files and then concatenating them??
I would have used a non-indexed hash file and then cleared it at the beginning of the process.

Re: Hash File??

Posted: Thu Jan 08, 2004 2:56 pm
by kcbland
1stpoint wrote:Why are you writing to 3 separate DOS files and then concatenating them??
I would have used a non-indexed hash file and then cleared it at the beginning of the process.
Because you incur the overhead of hashing and seeking/writing/dynamically resizing/overflowing/etc a hash file when what you need to do is divide-and-conquer data and stream it to a reliable landing zone for recombination. The link collector is pitiful, and mkfifo pipes are problematic, not to mention they are destructively read, so no restart capabilities.

He's doing it on my recommendation, please read the prior commentary on this post.

Posted: Thu Jan 08, 2004 3:37 pm
by JDionne
kcbland wrote:Look at the Sequential File stage definition. There is a check box for omit newline. If you wrote a file with this NOT checked, the data looks like this:

Code: Select all

file 1:
aaaa <LF>
bbbb <LF>
cccc <LF>

file 2:
dddd <LF>
eeee <LF>
ffff<LF>

concatenated you get:
aaaa <LF>
bbbb <LF>
cccc <LF>
dddd <LF>
eeee <LF>
ffff<LF>
If you wrote a file with this checked, the data looks like this:

Code: Select all

Omitting newline file 1:
aaaa <LF>
bbbb <LF>
cccc

Omitting newline file 2:
dddd <LF>
eeee <LF>
ffff

concatenated you get:
aaaa <LF>
bbbb <LF>
ccccdddd <LF>
eeee <LF>
ffff
I'll give it a gander.
Jim

Posted: Thu Jan 08, 2004 4:44 pm
by ray.wurlod
There is one other possibility, and that is that a required column is indeed missing. In the Columns grid of the Sequential File stage, scroll to the right and you can change the rules about what happens when a column is missing, for example from aborting the job to substituting a pad character (maybe NULL).

Posted: Thu Jan 08, 2004 5:08 pm
by wdudek
Since you stated that loading each file works, I'd suggest using a hex editor to look at the concatenated file, at the byte position starting where the first file ends, to see if there's anything unusual in the data that won't show up in an editor.
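[Editor's note] The boundary check suggested here can also be scripted. A minimal Python sketch that hex-dumps the bytes around a given offset (pass the size of the first file as the offset):

```python
# Dump the bytes around a file offset in hex, to spot stray control
# characters (e.g. 0x0D <CR>, 0x0A <LF>, 0x1A Ctrl-Z) at the point
# where two files were joined.

def dump_around(path, offset, context=16):
    with open(path, "rb") as f:
        start = max(0, offset - context)
        f.seek(start)
        data = f.read(2 * context)
    return " ".join("%02X" % b for b in data)
```

Compare the output against the delimiters and line terminators the job's metadata expects.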

Posted: Fri Jan 09, 2004 7:24 am
by JDionne
wdudek wrote:Since you stated that loading each file works, I'd suggest using a hex editor to look at the concatenated file, at the byte position starting where the first file ends, to see if there's anything unusual in the data that won't show up in an editor.
Um, hex editor? I have no idea even about the name of such a thing. Which one would you suggest?
Jim

Posted: Fri Jan 09, 2004 12:29 pm
by wdudek
Try this one:

http://www-physics.mps.ohio-state.edu/~prewett/hexedit/

It's free, so you're not wasting any money if it doesn't work out for you. UltraEdit also works, but unless you pay for it you can only use the trial version.

Hex editors are similar to a text editor like Notepad, except that instead of the ASCII characters (which may or may not be displayable) you are viewing the hexadecimal (thus the name) or decimal numbers that represent the data. Therefore, you will be able to see the carriage return and line feed characters (13 and 10 in decimal, respectively) and any other non-displayable character that may be in your data and could be causing problems.

Posted: Fri Jan 09, 2004 12:45 pm
by JDionne
wdudek wrote:Try this one:

http://www-physics.mps.ohio-state.edu/~prewett/hexedit/

It's free, so you're not wasting any money if it doesn't work out for you. UltraEdit also works, but unless you pay for it you can only use the trial version.

Hex editors are similar to a text editor like Notepad, except that instead of the ASCII characters (which may or may not be displayable) you are viewing the hexadecimal (thus the name) or decimal numbers that represent the data. Therefore, you will be able to see the carriage return and line feed characters (13 and 10 in decimal, respectively) and any other non-displayable character that may be in your data and could be causing problems.
Thanks, I'll give it a go and get back to you guys at the beginning of the week.
Jim

Posted: Mon Jan 12, 2004 8:00 am
by aaronej
JDionne wrote: It's not a delimiter problem; I reran the job with a delimiter that was not in the data ("|") and it still failed. I haven't seen the option for omit newline, so I'm sure I didn't check it. I'd like to know where it is, though, so that I can check to be certain. I don't think I have a <CR> or <LF> in the data; I can get the original file to load. It's the file that has been combined with the DOS copy command that will not load. Still scratching around here.
Jim
Try using a routine that checks the data for carriage returns or line feeds. I have had this issue and use some code like this in a routine to handle it:

Code: Select all

Ans = CONVERT(CHAR(13), '', CONVERT(CHAR(10), '', Arg1))
I just pass the data I am looking at into this routine (Arg1) and it spits out the data with the line feeds and carriage returns removed (Ans).

Hope this helps!

Aaron
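[Editor's note] For reference outside DataStage BASIC, the nested CONVERT calls above translate directly to Python:

```python
# Equivalent of the BASIC routine above: strip carriage returns
# and line feeds from a field value.

def strip_crlf(value: str) -> str:
    return value.replace("\r", "").replace("\n", "")
```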