Is Unicode an overkill?

jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Is Unicode an overkill?

Post by jerome_rajan »

Hi DS Gurus,
I'm reviewing some of the code built by my peers and came across a scenario where we are not dealing with Unicode data, and never will be. Yet all the metadata (output column definitions) within DataStage has been defined as Unicode. The jobs are fairly complex, with lookups, joins, change captures, etc.

How much of an impact is this going to have in terms of performance or otherwise?

I would also appreciate it if someone could point me to an exhaustive DS checklist.
Thank you
Last edited by jerome_rajan on Sun Mar 13, 2016 10:02 pm, edited 1 time in total.
Jerome
Data Integration Consultant at AWS

Life is really simple, but we insist on making it complicated.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The only real impact of using Unicode is that the total number of bytes processed is necessarily larger (typically 2X). Not using Unicode means that you are limited to the ASCII code set. Depending on what external data you are using there may also be a requirement to translate from/to the encoding that the external data use.
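To put a number on the "typically 2X" figure, here is a rough sketch in plain Python, not DataStage code. It assumes the Unicode representation is a fixed two-bytes-per-character encoding such as UTF-16, which is an assumption made for illustration; the sample value is made up.

```python
# Compare the byte footprint of one ASCII-only value in a single-byte
# encoding and in a two-byte-per-character Unicode encoding.
# Assumption: UTF-16 stands in for the "Unicode" column representation.
sample = "Piscataway"  # hypothetical ASCII-only column value

narrow = sample.encode("ascii")      # 1 byte per character
wide = sample.encode("utf-16-le")    # 2 bytes per character for these code points

ratio = len(wide) / len(narrow)
print(f"narrow={len(narrow)} bytes, wide={len(wide)} bytes, ratio={ratio}")

# Reading external data in another encoding needs an explicit translation
# step, for example ISO-8859-1 bytes decoded into a Unicode string.
external_bytes = "café".encode("latin-1")
decoded = external_bytes.decode("latin-1")
```

The same doubling applies to every string column carried through lookups, joins and change captures, so the extra bytes scale with row count and column count rather than staying a fixed overhead.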

As for "an exhaustive DS checklist", that's how several of us make our living (doing DataStage health checks). On that basis you are asking for a gift of our intellectual property. Would you countenance a commercial arrangement?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rkashyap
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

We have seen issues due to Unicode, so we have adopted a standard practice of using Unicode only when necessary.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I have not seen such issues. However, even without using the Unicode extension for string data types, you clearly still have NLSMODE enabled (so that you can use Unicode as required).