Good Debugging Practices: Guidance needed

vijayrc · Post by **vijayrc** » Wed Jan 31, 2007 3:39 pm

Hi,
Most companies use DS for data warehousing and the like, but we are using it for application development. DS is a new product at our place, and ours is the first group to try it out. We are on the learning curve and, in a short time, have grasped the tool's capabilities and power, and at the same time noticed its glitches [particularly the debugging part].
I would like some guidance on this subject. Besides putting in Peek stages, Copy stages with Data Sets, and so on, what would be the best practice for establishing useful debugging aids in an application environment where DS is being used? In other words, how do you go about developing your functionality with debugging in mind? I know I haven't expressed this clearly. Thanks a ton. -Vijay

ray.wurlod · Post by **ray.wurlod** » Wed Jan 31, 2007 4:34 pm

Planning and Peer Review
Before you even touch DataStage, sketch out your plan of attack. Which stage types will you use, and why? Sources and targets - are their table definitions in the Repository? Are any custom routines needed, or can all logic be performed using available components? Sketch out the transformation expressions you plan to use. Get a colleague to check it. (Get the janitor to check it also - you'd be surprised what they spot that the "experts" miss! Consider Dilbert's garbage man, for example.)

Incremental design
Design the "E" part first and get it working. Tested.
Then design the "T" part - break this down into smaller pieces where possible. Test as you go. Document test results. All the earlier pieces tested OK, so any problems must be in the current piece.
While a piece is under test, you can end the job with a Copy stage with no output, which simply consumes the rows, or with a Peek stage with no output, which consumes the rows and also directs a sample of them to the job log.

Test Data
Use Row Generator stage to create test data with which to test your logic.

Compile in Trace mode
This is an option in the job properties dialog. When you run the job, process only a small number of rows; otherwise your trace files will be large and cumbersome. Recompile with trace mode turned off when done.

Conditionally Compiled Routine Statements
If you develop your own routines or BuildOps, you can use #define to set up tokens and #ifdef to include statements purely for debugging purposes in your code. Disable them before promoting out of development.

narasimha · Post by **narasimha** » Wed Jan 31, 2007 4:37 pm

Good points Ray!
I would vote to move this post to the FAQ forum.

DSguru2B · Post by **DSguru2B** » Wed Jan 31, 2007 6:31 pm

I am putting this in my favourites bucket.

vijayrc · Post by **vijayrc** » Wed Jan 31, 2007 6:35 pm

ray.wurlod wrote:Planning and Peer Review
Before you even touch DataStage, sketch out your plan of attack. Which stage types will you use, and why. Sources and targets - are their table definitions in Reposi ...

Thanks Ray. It helps.
Any other suggestions/best practices are welcome and appreciated.

ray.wurlod · Post by **ray.wurlod** » Wed Jan 31, 2007 10:58 pm

Whilst it's not in the debugging basket, I always add a sanity check to make sure that I'm doing things as efficiently as I know how, especially not processing any data unnecessarily or evaluating any expression more than once per row.
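
The "no more than once per row" point can be sketched in C++ as follows. All names here (InputRow, OutputRow, make_key, transform, key_evaluations) are invented for the illustration, and the counter exists only so the number of evaluations can be checked. The idea is simply to compute a derived value once, hold it in a local variable, and reuse it for every output column that needs it, which is roughly the job a stage variable does in a Transformer stage.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical row layouts for the illustration.
struct InputRow  { std::string customer; std::string region; };
struct OutputRow { std::string key; std::string label; };

// Counts how often the derived expression is evaluated (demonstration only).
int key_evaluations = 0;

// Stand-in for a costly derivation: builds an upper-cased "REGION-CUSTOMER" key.
std::string make_key(const InputRow& row) {
    ++key_evaluations;
    std::string key = row.region + "-" + row.customer;
    for (char& c : key) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return key;
}

std::vector<OutputRow> transform(const std::vector<InputRow>& rows) {
    std::vector<OutputRow> out;
    out.reserve(rows.size());
    for (const InputRow& row : rows) {
        // Evaluate once per row, then reuse the result in both output columns.
        const std::string key = make_key(row);
        out.push_back(OutputRow{key, "Customer " + key});
    }
    return out;
}
```

For two input rows, make_key runs twice in total, even though the key appears in two output columns per row.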

DSguru2B · Post by **DSguru2B** » Thu Feb 01, 2007 7:39 am

That is certainly a very good practice. All these things make for a good design with no room for tuneups.

chulett · Post by **chulett** » Thu Feb 01, 2007 8:46 am

Much like Jello, there's always room for tuneups.

DSguru2B · Post by **DSguru2B** » Thu Feb 01, 2007 8:59 am

Let me rephrase that: "... with little room for tuneups."

ray.wurlod · Post by **ray.wurlod** » Thu Feb 01, 2007 4:42 pm

Delighted I am that you have been paying attention, grasshopper. One small step along the path to enlightenment.

DSguru2B · Post by **DSguru2B** » Thu Feb 01, 2007 6:11 pm

I always pay attention to your comments, Ray.

That's how my journey towards enlightenment goes on.