reading excel file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
Aggie99
Participant
Posts: 54
Joined: Thu Sep 04, 2008 6:54 pm

reading excel file

Post by Aggie99 »

I have an excel workbook "myworkbook" with two sheets, namely regions and mappings. And I have been trying to read the mapping sheet in datastage with no success.

note:
Because the excel file contained Chinese characters, I was not able to save it as csv file, which changed all the Chinese chars to ???? signs. So instead of processing the file as csv file, I need to handle it as an Excel file.

I came down to using the odbc stage, but I need help to set it up properly.

Can someone give me details on how to setup the odbc dsn for the excel file, and how would the select statement looks like.

thanks in advance,
aggie
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The main trick is that Excel worksheets present as "system tables". Therefore, you need to import the table definitions with "include system tables" option checked. It is also a requirement (of the ODBC driver for Excel, not particularly of DataStage) that the worksheets have the first line as column headings - these become the column names when table definitions are imported.

Secondarily you will, of course, need to specify the correct code page - the same one that Excel uses - as the NLS map for DataStage.

The SELECT statement is not special - let DataStage generate it. The only unusual feature is that "system tables" in Excel have names beginning with $.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Aggie99
Participant
Posts: 54
Joined: Thu Sep 04, 2008 6:54 pm

Post by Aggie99 »

thanks Ray.

I can now import the table definition with your advise.

Can you tell me which stage to use to read the Excel file. and how to set it up.

thx,
-aggie
naveen.p
Participant
Posts: 55
Joined: Sat Nov 21, 2009 5:19 am
Location: Chennai

Post by naveen.p »

Hi,

You could use a ODBC Stage and extract the data from Excel File.

Steps to be followed:

1. Create a Machine DSN which consists the path of the excel file.
2. Create the Named Range in the excel sheet.
3. Use a ODBC Stage, use the DSN name and Named Range as DB Name and Table respectively.

Let me correct if am wrong.
I hope this will work

Naveen
Aggie99
Participant
Posts: 54
Joined: Thu Sep 04, 2008 6:54 pm

Post by Aggie99 »

I did step 2 in Excel to create the range name. But still can't read it as table name. Error message was object (which was the table name I use) not exist.

I can import the table definition, but I still can't use odbc stage to read the worksheet yet.
Aggie99
Participant
Posts: 54
Joined: Thu Sep 04, 2008 6:54 pm

Post by Aggie99 »

ok, I made some progress and like to share with others.

On window:
1) open odbc and set up a dsn point to my excel file
2) I have one sheet in the excel file called mappings
3) open the excel file and define a name range.


In datastage:
1) use ODBC stage
2) under properties,
Connection:
Data Source = CUS (this is the dsn name i created)
User = I put the user id here
Password = " put the pwd here
Source:
Table=`E:\data\myexcel.xls`.`mappings` (i have to give it a fullpathname in order for it to work)


All worked.
But, the table name is a bit too long. I was thinking table=mappings instead. What can I do further?

thx again.
naveen.p
Participant
Posts: 55
Joined: Sat Nov 21, 2009 5:19 am
Location: Chennai

Post by naveen.p »

Hi,

You can use the Name Range as your table name.

Naveen
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Table name should be $mappings as I observed earlier.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply