Page 1 of 1

reading excel file

Posted: Sun Oct 10, 2010 2:45 am
by Aggie99
I have an excel workbook "myworkbook" with two sheets, namely regions and mappings. And I have been trying to read the mapping sheet in datastage with no success.

note:
Because the excel file contained Chinese characters, I was not able to save it as csv file, which changed all the Chinese chars to ???? signs. So instead of processing the file as csv file, I need to handle it as an Excel file.

I came down to using the odbc stage, but I need help to set it up properly.

Can someone give me details on how to setup the odbc dsn for the excel file, and how would the select statement looks like.

thanks in advance,
aggie

Posted: Sun Oct 10, 2010 5:36 am
by ray.wurlod
The main trick is that Excel worksheets present as "system tables". Therefore, you need to import the table definitions with "include system tables" option checked. It is also a requirement (of the ODBC driver for Excel, not particularly of DataStage) that the worksheets have the first line as column headings - these become the column names when table definitions are imported.

Secondarily you will, of course, need to specify the correct code page - the same one that Excel uses - as the NLS map for DataStage.

The SELECT statement is not special - let DataStage generate it. The only unusual feature is that "system tables" in Excel have names beginning with $.

Posted: Sun Oct 10, 2010 5:52 am
by Aggie99
thanks Ray.

I can now import the table definition with your advise.

Can you tell me which stage to use to read the Excel file. and how to set it up.

thx,
-aggie

Posted: Sun Oct 10, 2010 6:10 am
by naveen.p
Hi,

You could use a ODBC Stage and extract the data from Excel File.

Steps to be followed:

1. Create a Machine DSN which consists the path of the excel file.
2. Create the Named Range in the excel sheet.
3. Use a ODBC Stage, use the DSN name and Named Range as DB Name and Table respectively.

Let me correct if am wrong.
I hope this will work

Naveen

Posted: Sun Oct 10, 2010 9:06 am
by Aggie99
I did step 2 in Excel to create the range name. But still can't read it as table name. Error message was object (which was the table name I use) not exist.

I can import the table definition, but I still can't use odbc stage to read the worksheet yet.

Posted: Sun Oct 10, 2010 9:24 pm
by Aggie99
ok, I made some progress and like to share with others.

On window:
1) open odbc and set up a dsn point to my excel file
2) I have one sheet in the excel file called mappings
3) open the excel file and define a name range.


In datastage:
1) use ODBC stage
2) under properties,
Connection:
Data Source = CUS (this is the dsn name i created)
User = I put the user id here
Password = " put the pwd here
Source:
Table=`E:\data\myexcel.xls`.`mappings` (i have to give it a fullpathname in order for it to work)


All worked.
But, the table name is a bit too long. I was thinking table=mappings instead. What can I do further?

thx again.

Posted: Mon Oct 11, 2010 7:52 am
by naveen.p
Hi,

You can use the Name Range as your table name.

Naveen

Posted: Mon Oct 11, 2010 1:39 pm
by ray.wurlod
Table name should be $mappings as I observed earlier.