Page 1 of 1

from BCP to Oracle using DataStage

Posted: Fri Jan 25, 2008 4:35 am
by Victoria
From SQLServer, I get my data through BCP and create a datafile. I read this datafile and load the data to Oracle using DataStage. My problem is that one of the columns in the table contains a text which has tabs and newline characters. Say this column name is email_msg varchar(500). It will have value such as:

Code: Select all

Sent: Wednesday, January 09, 2008 11:03 PM
To: Kathy Shouner
Cc: Brent Ford; William Dew
Subject: RE: Revision of presentation schedule

 

I support the recommended revisions.


When I open the datafile, it spans 6 separate lines - ending in carriage returns. Hence, I am having a problem importing it to my table. When I query the database, it appears as one line under that column - which is the way it should be.

My BCP command is something like this:

Code: Select all

BCP "select email_msg, email_id from emails" QUERYOUT "c:\email.txt" -c -T -S sunny.erwin.com
When I run DS, there are errors because of unmatching column values. IDeally, my records are separated by newline characters, however, one of the field values actually have newline characters. How can I properly load such data??? Please help..

Posted: Fri Jan 25, 2008 6:54 am
by WoMaWil
Hi Victoria,

If you Code something like:

Code: Select all

BCP "select '**ROW**', email_msg, email_id from emails" QUERYOUT "c:\email.txt" -c -T -S sunny.erwin.com 
you can tell DataStage, that a line beginning with **ROW** is a real new line and all other new lines belong to the row before.

Posted: Fri Jan 25, 2008 7:49 am
by chulett
Slide the Columns display over to the right and change the Contains Terminators value for that column to 'Yes'.

Posted: Mon Jan 28, 2008 9:45 am
by Victoria
I'm sorry but I'm no pro in DS. How do I do that? Besides, how will that handle the tabs (which are my field delimiters) inside that email_msg field? Hence, even if I have this **ROW**, because of the tabs inside that field, i may map incorrect column mappings.. :(

Posted: Mon Jan 28, 2008 10:13 am
by Victoria
Hi Chulett,
I made my datafile to have fixed widths, but I am getting "column unexpectedly ended by EOR".
All I need is to be able to load a field value that has tabs and newline characters inside it. I hope anyone can help me figure out how to do it..
Thanks..

Posted: Mon Jan 28, 2008 2:00 pm
by chulett
First off, you can't just declare your file to be 'fixed width' if it isn't, that will never work - so let's undo that change.

You don't need to be any kind of pro to make the change I suggested. Just go to the Columns tab in the Sequential File stage and notice there is a slider at the bottom of the window. Grab it and slide everything over so you can see the other columns 'hidden' off to the right. One of them will be Contains Terminators and you can try setting it to 'Yes'. That way the newlines in that field will just be treated as data and not the end of the record. That should let you read the file properly.

After that, it's up to you what to do with the data in the field. If you don't want to load the tabs and newline characters into your database, you could use Convert or EReplace to remove them from the field.

Posted: Mon Feb 04, 2008 9:09 am
by Victoria
hi Craig,

It did load the columns with newline characters correctly - however, it failed for those which have tabs. Do you think it will help if I change my delimiter?

Thanks..

Posted: Mon Feb 04, 2008 9:52 am
by chulett
Meaning you couldn't read the file or load the data with tabs in it? For the latter, remove them inline with EReplace or Convert:

Code: Select all

Convert(CHAR(9),"",YourField)
That will remove them from the field. For the former, we'd need more specifics.

Posted: Mon Feb 04, 2008 4:20 pm
by ray.wurlod
Does the Format tab nominate tab as the field delimiter character? Change it to 000 (or any other character not in the data). You can't have tabs in tab-delimited data unless all character strings are quoted. More generally you can't have "x" in "x"-delimited data unless all character strings are quoted.

Posted: Tue Feb 05, 2008 3:53 am
by ray.wurlod
That's OK, it just means that the data browser doesn't know how to display Char(137). DataStage is probably reading it successfully, so you can convert the Char(137) back to whatever you like within the job, perhaps tab (Char(9)), perhaps "". Whatever is appropriate.