Japanese .xlsx data handling in DataStage 8.7

naveed.zuber
Participant
Posts: 19
Joined: Wed Jul 16, 2014 4:14 am

Post by naveed.zuber »

I'm facing an issue reading Japanese data in .xlsx format.
I tried converting the .xlsx to .csv, but the CSV file converts the Japanese characters to ?????

What options are available to read/process Japanese .xlsx data in DataStage 8.7 (DB2 and Unix)?

Please suggest.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

There are no options for reading XLSX files on UNIX in DataStage 8.7.
You need first to export them to CSV or some other text format. And you will need to do this on a Windows machine set to use similarly coded Japanese characters.
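
As an illustration of why the export machine's encoding matters, here is a minimal Python sketch. It assumes the "?????" came from a lossy conversion to a Western code page (cp1252 is used here as an example; the thread does not say which code page was actually involved). The sample text is made up.

```python
# Illustration only: why Japanese text can turn into "?" during CSV export.
text = "日本語"  # hypothetical sample data

# A Western code page such as cp1252 has no mapping for these characters,
# so a lossy conversion substitutes "?" for each one.
lossy = text.encode("cp1252", errors="replace")
print(lossy)

# Japanese-capable encodings round-trip the same text without loss.
for codec in ("cp932", "utf-8"):
    assert text.encode(codec).decode(codec) == text
```

Once the characters have been replaced this way, the original text cannot be recovered from the CSV, so the export itself has to be redone with a suitable encoding.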

In version 9.1.2 and later you might be able to use the Unstructured Data stage to read XLSX files directly.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
naveed.zuber
Participant
Posts: 19
Joined: Wed Jul 16, 2014 4:14 am

Post by naveed.zuber »

Thanks Ray!

Converted the XLSX to CSV, with the CSV enabled to support Japanese characters.
Now, after moving the CSV file from Windows to Unix, the same CSV file shows junk characters in place of the Japanese.
Tried both Binary and ASCII transfer modes in SSH Tectia File Transfer.
Kindly suggest.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

naveed.zuber wrote: Now, after moving the CSV file from Windows to Unix, the same CSV file shows junk characters in place of the Japanese.
Showing how / where / using what tool?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How did you move the file to UNIX? If you used FTP, did you use BINARY mode?
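
One way to confirm that a transfer left the file untouched is to compare a checksum on both machines. The following is a rough sketch, assuming Python is available on both the Windows and UNIX sides; the function name `sha256_of` is just an illustrative choice.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in binary mode."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run it against the CSV before and after the transfer. If the two digests differ, something (for example an ASCII-mode line terminator conversion) changed the bytes on the way.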
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not familiar with the tool used but they did say this:
naveed.zuber wrote: Tried both Binary and ASCII transfer modes in SSH Tectia File Transfer.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You answered Ray's question but not mine which is about the viewing of the transferred data rather than the transfer itself.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What is the language setting on the Windows machine where you're saving the CSV file? This seems to be at the heart of the problem. Can you (or someone else) save the CSV file on a Japanese version of Windows?
naveed.zuber
Participant
Posts: 19
Joined: Wed Jul 16, 2014 4:14 am

Post by naveed.zuber »

The CSV file was created by someone else.
I have been told that the CSV was created as Unicode, with Japanese characters enabled.
Your previous post seems to be premium content, which I couldn't see completely.
What options are available now? Please suggest.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The problem at source appears to have been resolved. You now need accurate knowledge about how the Japanese characters are encoded in the CSV file, and you must effect a binary transfer to guarantee that nothing in the file is changed during its transfer to the UNIX file system. This should also mean that the file has Windows line terminators, so you may need to allow for that in the Sequential File stage used to read the file.
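
To find out how the file is encoded and whether it carries Windows line terminators, the raw bytes can be inspected directly. Below is a heuristic sketch in Python, not a definitive detector: it checks for a byte order mark first and otherwise tries UTF-8 and then cp932 (Windows Shift-JIS). The candidate list and the function name `inspect_csv_bytes` are assumptions; the real file may use something else, such as EUC-JP.

```python
import codecs


def inspect_csv_bytes(raw):
    """Return (encoding_guess, has_crlf) for the raw bytes of a text file."""
    # A byte order mark, if present, is the most reliable hint.
    if raw.startswith(codecs.BOM_UTF8):
        encoding = "utf-8-sig"
    elif raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        encoding = "utf-16"
    else:
        # No BOM: fall back to trial decoding (a heuristic, not proof).
        encoding = None
        for candidate in ("utf-8", "cp932"):
            try:
                raw.decode(candidate)
            except UnicodeDecodeError:
                continue
            encoding = candidate
            break
    text = raw.decode(encoding) if encoding else ""
    # Windows line terminators show up as "\r\n" in the decoded text.
    return encoding, "\r\n" in text
```

For a real file, read it in binary mode first, for example `inspect_csv_bytes(open(path, "rb").read())`, where `path` is wherever the CSV landed on UNIX. The result suggests which character set and record delimiter to try when reading the file in the job.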
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What are you using to view the contents of the DB2 table? Do not rely upon DataStage's data browser (View Data) - it is notoriously bad when the language is anything but Amurrican. Try some other tool - even the db2 command itself.