EVERYONE!
Pay attention.
APT_CombinedOperatorController,0: Field 'field name' from input dataset '0' is NULL. Record dropped.
If you are getting the above message, you are NOT programming correctly in DataStage EE.
This is NOT a case of...
It's quite common and we used to get this warning in our jobs.
NEVER EVER EVER EVER EVER EVER EVER EVER EVER EVER get used to this message.
You are DROPPING records due to this message. Records that your company is most likely needing for their future processing.
NEVER EVER EVER EVER... *deep breaths*... EVER IGNORE THIS MESSAGE!
* * *
Okay... now that I am done ranting... Pay attention:
The most common cause of this job is due to the Transformer stage. When you try to handle some data in a certain way, and it have NULL as a value, DataStage take one look at you and go, 'Hell no if I am going to handle your bad programming. RECORD DROPPED!'
Why is it bad programming? Lets assume here that you have an input field: Foo. You want to transform it to Bar, with the following function:
The function expects that there be an actual data there. However, if it is NULL... how can the function handle that?
Simply, it doesn't. Absolutely NO functions within DataStage handle NULL values (except, well... see below). It is fully expected that the Transformer pay attention to the NULL values. So the Transformer follows these rules:
1. If you have a derivative, check to see if the input data have NULL.
2. If the input data is NULL, well... see the entire record say "good bye!" on its way to the /dev/null in the sky.
So what do we do if we REALLY want that record? We HANDLE the NULL ourselves.
Code: Select all
If IsNull(input.Foo) Then SetNull() Else DecimalToString(input.Foo)
Yes, we have to explicitly tell it, "Hey, if this is a NULL value... pass it along as... a Null, otherwise, do the derivative."
Of course, if the output field is Not Nullable, you will need to supply an actual value.
* * *
Why is Null so special in DataStage. That is because the actual data field CAN be Null while the data field's metadata is NOT Null. You can actually pass an Ascii(0), and have DataStage consider it as a VALID value. The Null flag is set elsewhere within that field's metadata.
So, thus, unlike C/C++ where you can be lazy (sometimes), in DataStage, you must handle the Null field whenever it's possible and you're doing a derivative.
* * *
Is NULL BAD? No. In Kimball, however, you should minimize the possibility of NULL fields. Not to the point where you must provide a dummy Null value, mind you. But don't give EVERY database field the ability to contain Null if you know it will never have Null.
* * *
What is:
APT_CombinedOperatorController,0
You may ask?
It is DataStage's way of reducing the amount of processing by combining several stages together into one big stream all handled in memory. This allows for less disk I/O, and yield better performance. Woo woo!
But when you debug, it sure is a pain to identify the cause, eh? That's where $APT_DISABLE_COMBINATION comes in play. Set the variable in your Job parameter as "True", recompile, and let it rip. Suddenly, the offending stage's name will pop up where APT_CombinedOperatorController is.
Also, ",0" means the first node.
* * *
Now in summary, what have you learned today?
1. NEVER ignore Null messages.
2. NEVER ignore messages that says "dropping records."
3. Null values must be explicitly handled when you do derivatives.
4. APT_DISABLE_COMBINATION works pretty well for debugging.
Please. The warning messages are there for a reason. Learn what those reasons are by experimenting with your job. Above all, some warning messages are pretty mild, but other can mean your job.
After all, what is there to stop your company from firing you for dropping 25% of their production data for several months simply because it have a NULL somewhere that is not handled?
That's it for today. Have fun!