How to load data with Special Characters

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
bmouton
Participant
Posts: 18
Joined: Thu Nov 06, 2008 8:31 pm

How to load data with Special Characters

Post by bmouton »

Hi and thanks for reading ...

We have a database that we are extracting from that has the possibility of ALL and ANY special characters that is humanly / computerly (is that a word?) possible. So ... We can't use a pipe (|) delimiter or any other type of delimiter.

If our data was just varchar, we could use fixed length formats and dump the data, then move it to our target platform, then load the data. However, we are using DB2 8.2 and 9.5 which we have CLOB data. So, space is now an issue with fixed length records ...

Long story short, is there an easy method to dump the data in a variable length format with some kind of delimiter? Since our application / database accepts all special characters, I don't know how we can dump the data, then read it ...

We have DS Server 8.0.1 running on SUSE Linux 10. We were thinking we could extract and land the data in XML files. However, we are not having any luck with using the XML approach. We can't get the XSDs right and there is not much documentation, tutorials, or classes that address this that we have sought.

Please help ...

Thanks,

Benny
John Smith
Charter Member
Charter Member
Posts: 193
Joined: Tue Sep 05, 2006 8:01 pm
Location: Australia

Post by John Smith »

Can you please clarify "humanly/computerly" ? Do you mean what could be input from the keyboard? Or that your data will contain all possible ASCII characters eg char(167) ?
Producing XML is possible with the XML Output stage. Try the IBM Developerworks website , there are some handy tips/examples in there.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

DB2 9.5 can also produce XML, which may be an easier solution. And you could use it to extract the data from the version 8.2 database.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: How to load data with Special Characters

Post by chulett »

bmouton wrote:We can't get the XSDs right and there is not much documentation, tutorials, or classes that address this that we have sought.
Documentation on XML in DataStage? Have you gotten your hands on the XML Best Practices document that Kim Duke is hosting for us in his Tips section?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bmouton
Participant
Posts: 18
Joined: Thu Nov 06, 2008 8:31 pm

Post by bmouton »

Thanks for the information so far ... the XML Best Practices information seems to be helping us ...

Here is what we are doing. Based on a specific set of extract criteria, we use a DB2 API stage to pull data and land in a file (whether it be XML or Sequential). Due to the facts that any possible typed or otherwise entered (e.g. Escape Sequences, Chinese Characters, Greek Characters) COULD be in our character-based fields (including char, varchar, CLOB), we have to find a way to land that data in a format that another set of Datastage jobs can read.

It's the typical three-tier data architecture. Extract the source data and land it, clean that file in a staging area, then load it.

We have tried to just use a Sequential File stage with a variety of delimiters. No luck.

The target data structures are a DB2 9.5 relational database. We don't have the option to move toward an XML format due to the BI Tools we are using. Not only that, we are less than 90 days from implementation.

Our team is going to review the XML Best Practices documentation. It seems that our issue has been creating the "right" XSD to land the file and then read it ....

Again thanks for the help so far!!!
Post Reply