Nulls - to convert or not to convert?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Nulls - to convert or not to convert?

Post by SettValleyConsulting »

I am currently setting out some best practice guidelines for DS EE developers in our department and want to canvas opinions as to if it is possible or advisable to set down a firm rule for nullable columns.

We have two schools of thought. One camp maintains that nullable columns cause too many problems, e.g. in transform derivations, writing to sequential files etc and should be converted to non-nullable and a suitable value inserted to signify a Null as soon as possible in the data stream.

The other camp points out that it is not always possible to choose a suitable value (for example a bank account balance), that Null sometimes has a legitimate 'meaning' within the data and every data type conversion incurs a (small) performance penalty, they insist that if a column starts nullable it should remain nullable throughout, especially as it will probably be written to a nullable output column at the end of the ETL process, and transforms file, writes etc should be coded to handle nulls properly.

I am in the second camp, but I can see the legitimacy of the other viewpoint. What do you do at your site? Is there a standard or do you let developers choose on a case by case basis?

Not an interview question, I promise.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I prefer it when a column is not nullable; if there is a choice then I always recommend not allowing null values. This simplifies database queries and storage.

But as you have pointed out, there are many legitimate cases where a null value should be used. Trying to avoid a null in those columns is a mistake (I shudder to remembering the issues I've seen with "special" values in fields that are meant to denote a null or unknown value - what a mess!).

I think that there is no real choice when it comes to nullable fields; the metadata or business rules will determine nullability.
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

Thanks, Arnd.

As often happens - you have confirmed my own thinking.

BTW - SettValleyConsulting is actually my old company name, I am more usually known as Phil Clarke, currently sitting three floors above you in Griffin House :lol:

Note to self - must sign posts going forward.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Lol, now you've given away our location. Perhaps we can pretend that "Griffin House" is a code name :roll:

Actually, it does sound like a name that comes from a Harry Potter book.
SettValleyConsulting
Premium Member
Premium Member
Posts: 72
Joined: Thu Sep 04, 2003 5:01 am
Location: UK & Europe

Post by SettValleyConsulting »

... opinions as to if it is possible or advisable to set down a firm rule for nullable columns.

I guess the conclusion is no, it is not advisable to lay down a one-size-fits-all-occasions null conversion rule.

.. comes from a Harry Potter book

That would explain Matt's beard ... he fancies himself as the next Dumbledore ...
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I don't know if I'd aspire to be Dumbledore -doesn't he get ETL'd and loaded to the big database in the sky?
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Dumbledore

Post by asorrell »

I believe he becomes a null value...
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
Post Reply