Sequential file issue

raviyn
Participant
Posts: 57
Joined: Mon Dec 16, 2002 6:03 am

Sequential file issue

Post by raviyn »

Hi,
1) I have a fixed-width file that has the column names in the first line.
The EOR for the first row, i.e. after the last column name, I think, occurs a bit early, so the DS job is aborting.
Does that mean that even though the first line holds the column names, it still has to follow the metadata?

2) One more thing, on a .csv file: when I saved an .xls file as .csv, a few records have the full metadata, but for others where there is no data in the end columns, no blank commas were written. Saved from .xls to .csv, it looks like this:
1,2,3,4,5
3,4,5
1,,2,,3

How do I handle both conditions?
For the second one, I think the Field option is the only way out.

Reg
WoMaWil
Participant
Posts: 482
Joined: Thu Mar 13, 2003 7:17 am
Location: Amsterdam

Post by WoMaWil »

Hi Reg,

concerning (1)
Fixed width is not *.csv, so if you set "first line is column names" in the Format tab, that line will be ignored and this shouldn't be a problem.
concerning (2)
My experience is: if you can switch your users away from Excel to Access or anything else, do it. This will save you a lot of painful coding.
For your problem: in a first step, read the whole file with a Sequential File stage as a single column per line, count the elements with the Field function, append the number of missing commas, and write the result to a new file. Then read that file with the right metadata and you won't have any problems. If there are only a few fields, you can instead avoid reading the file a second time by extracting the fields directly with the Field function.
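
The pad-and-reread step described above can also be sketched outside DataStage. The following is a minimal Python illustration, not the DataStage job itself; the function name `pad_rows` and the parameter `expected_fields` are hypothetical, and it assumes a plain comma delimiter with no quoted fields that contain commas.

```python
def pad_rows(lines, expected_fields, delimiter=","):
    """Yield each line padded with trailing delimiters up to expected_fields fields.

    Assumption: fields never contain the delimiter inside quotes, so counting
    delimiters is enough to count fields.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        field_count = line.count(delimiter) + 1
        missing = max(expected_fields - field_count, 0)
        yield line + delimiter * missing


# The three sample rows from the original question.
sample = ["1,2,3,4,5\n", "3,4,5\n", "1,,2,,3\n"]
padded = list(pad_rows(sample, expected_fields=5))
print(padded)
```

Note that this pads on the right only, which matches the case where Excel dropped the trailing empty columns. It cannot recover rows where a field is missing in the middle.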


Wolfgang Huerter
=====================
Cologne, Germany
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

raviyn,

For your first question, yes, you can get around this problem. I assume you have already found the checkbox on the Sequential File stage's Format tab for column headers in the first row. This will ignore the first row, but it will still cause an error if that row is shorter than your regular rows. A possible patch for this error is to go to the Columns tab, scroll right to the 'Incomplete column' column, and change the value from Error to Retain on the necessary fields at the end of your record. I call this a patch because you will lose data integrity testing in the fields you changed to Retain.

As for the second question, I would check for any settings you have changed in Excel. I tried a test .csv file and was able to see trailing commas for trailing null fields (using WordPad to view). Have you tried viewing your .csv file with WordPad or some other editor? Perhaps the commas are there, but some setting in DS is ignoring them.
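
For a large file, eyeballing it in an editor gets tedious, so the same check can be done by counting fields per line. This is a rough Python sketch under the assumption of a plain comma delimiter with no quoted commas; `field_count_report` is a hypothetical helper name.

```python
from collections import Counter


def field_count_report(lines, delimiter=","):
    """Return a dict mapping number-of-fields to how many lines have that count."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        counts[line.count(delimiter) + 1] += 1
    return dict(counts)


# The three sample rows from the original question.
sample = ["1,2,3,4,5\n", "3,4,5\n", "1,,2,,3\n"]
report = field_count_report(sample)
print(report)
```

If the report shows more than one field count, the trailing commas really are missing from the file and the problem is not in DS.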

Hope this helps
Steve
raviyn
Participant
Posts: 57
Joined: Mon Dec 16, 2002 6:03 am

Post by raviyn »

Hi all,

Thanks for the "patch" for the first one; DS is no longer aborting. Is it some bug?

Yes, for the second one I also don't know; there seem to be some problems in Excel. When I create the Excel file myself it is OK, so I think the problem is only in the way that Excel file was created by them, or some other bug at the Excel level. Yes, I checked the file in WordPad: the columns were missing, so it is not a DS problem.

Wolf, even though I say the first line is column names and DS does not insert the row 1 data, it still checks whether that row follows the metadata, and thus my DS job was aborting, saying that in row 1 the EOR was found early.

Thanks for the help
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

raviyn,

I know this doesn't help you, other than to confirm your problem with Excel, but I have also had this problem when exporting data to flat files from Excel. It "forgets" to write anything at the end of the line if there is no data in those columns. Happens after the first 8 rows or so. So, you end up with too few columns in a line. Very irritating.

I'm not sure if MS acknowledges it as a bug or if, in some odd fashion, this is by design.

Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There is an option on the Sequential stage, at least in the newest version, to 'ignore row truncation'... something like that, anyway. It tells DataStage not to worry about a row that ends early. I would think that would help with #1.

-craig