Balancing of jobs

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

Post Reply
bluezed28
Participant
Posts: 6
Joined: Wed May 17, 2006 7:25 pm

Balancing of jobs

Post by bluezed28 »

I'd like to know or get some advice from any shops or projects that anyone has worked on where there was a requirement to ensure that all records were accounted for during processing, i..e records input - rejects/dropped = records extracted. How was this done, i.e. data stage job, sql , scripts etc. If it has not been done, what things do you have in place to ensure processing integrity.

One of my customers requires it, but due to the complexity and number of different jobs there are alot of rejects (for business reasons, i.e. not necessarily data issues), remove duplicates, aggregations, creation of records etc. which make trying to create a generic formula for all jobs impractical. The main goal is to prove to auditors and downstream consumers of the data that there is processing integrity on the system as well as provide a report if needed.

Thanks,
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard, This is usually done post-run with a routine that invokes DSGetLinkInfo() a sufficient number of times to get the row count for each link. This could then be matched, for example, against a set of business rules (in a database?) that stores information about which sums need to match.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Nagaraj
Premium Member
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

This can be usually handled with the help of Shell scripts.....!
we have implemented something like this in our project, where we get the counts and no of records processed.

soruce to temp counts, temp to stg counts, and stg to main tables count.
report will be something like this.
slno src_tbl Src_count Tgt_table Tgt_count Count_Diff
------------------------------------------------------------------
1 src1 10 Stg1 10 0
2 src 2 11 Stg2 11 0

slno STg_table Count Main_table Inserts Updates Deleted Count_dif
------------------------------------------------------------------------------------
1 Stg1 100 M_tbl1 25 25 50 0

Thanks
-Raj
Post Reply