Is Unicode an overkill?

Posted: Fri Mar 11, 2016 11:44 pm
by jerome_rajan
Hi DS Gurus,
I'm reviewing some of the code built by my peers and came across a scenario where we are not dealing with Unicode data and never will be. However, all the metadata (output column definitions) within DataStage has been defined as Unicode. The jobs are fairly complex, with lookups, joins, change captures, etc.

How much of an impact is this going to have in terms of performance or otherwise?

I would also appreciate it if someone could point me to an exhaustive DS checklist.
Thank you

Posted: Sat Mar 12, 2016 2:54 pm
by ray.wurlod
The only real impact of using Unicode is that the total number of bytes processed is necessarily larger (typically 2X). Not using Unicode means that you are limited to the ASCII code set. Depending on what external data you are using there may also be a requirement to translate from/to the encoding that the external data use.
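To put a rough number on the "typically 2X" figure, here is a minimal Python sketch (not DataStage code). It assumes the Unicode representation uses two bytes per character, as UTF-16 does for ASCII-range text; the thread does not say how DataStage stores these strings internally, and the sample value is hypothetical.

```python
# Illustration only: compare the byte size of the same ASCII-only text in a
# single-byte encoding and in UTF-16 (assumed here as the Unicode form).
sample = "CUSTOMER_NAME_0001"  # hypothetical ASCII-only column value

single_byte = sample.encode("latin-1")   # one byte per character
utf16 = sample.encode("utf-16-le")       # two bytes per character, no BOM

ratio = len(utf16) / len(single_byte)
print(f"single-byte: {len(single_byte)} bytes")
print(f"UTF-16:      {len(utf16)} bytes")
print(f"ratio:       {ratio:.1f}x")
```

Under that assumption, string columns double in size, while numeric and date columns are unaffected, so the overall growth of a row depends on how much of it is string data.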

As for "an exhaustive DS checklist", that's how several of us make our living (doing DataStage health checks). On that basis you are asking for a gift of our intellectual property. Would you countenance a commercial arrangement?

Posted: Sun Mar 13, 2016 12:58 pm
by rkashyap
We have seen issues due to Unicode, so we have adopted a standard practice of using Unicode only when necessary.

Posted: Sun Mar 13, 2016 9:45 pm
by ray.wurlod
I have not seen such issues. However, even without the Unicode extension on your string data types, you clearly still have NLSMODE enabled (so that you can use Unicode as required).