Different character collation types across source and produc

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Xanadu
Participant
Posts: 61
Joined: Thu Jul 22, 2004 9:29 am

Different character collation types across source and produc

Post by Xanadu »

I posted this in another EPM specific thread too..but reposting it to hear from others..(i am not sure how many clicked on that thread after seeing that specific subject :) )

Wouldn't it be a problem if character datatypes in source and target follow different collation types ? Is there any advantage of using one over the other ?
In my implementation, all the target/staging tables use Latin1_General_BIN as collation type where as the sources use SQL_Latin1_General_CP1_CI_AS. This sometimes causes problems in user defined queries when doing the joins. (Collation conflict..)
Any one faced similar problem ?

~Xan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It shouldn't be a problem, as you're not comparing or sorting or joining during the load process.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Xanadu
Participant
Posts: 61
Joined: Thu Jul 22, 2004 9:29 am

Post by Xanadu »

not during the load but even the staging tables have a different collation type...so this might be a problem during the transformation right ?
and also Ray.....one question..is there any particular reason why DWH tables contain char instead of varchar ?

thanks Ray
ray.wurlod wrote:It shouldn't be a problem, as you're not comparing or sorting or joining during the load process.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Q: is there any particular reason why DWH tables contain char instead of varchar ?

A1: There's the obvious one; whoever issued the CREATE TABLE command put them there. :lol:

A2: In some databases (probably most) extracting CHAR is far more efficient than extracting VARCHAR, at a physical level, and definitely far more efficient for loading. Some databases (transparently) put all the VARCHAR columns at the end of the physical record.

A3: Some databases, particularly older versions, do not support VARCHAR. For example, and from memory, VARCHAR support was only introduced into Red Brick at version 6, and that reluctantly.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply