
Balancing of jobs

Posted: Wed Oct 22, 2008 9:11 pm
by bluezed28
I'd like some advice from anyone whose shop or project had a requirement to ensure that all records are accounted for during processing, i.e. records input - rejects/dropped = records extracted. How was this done (DataStage job, SQL, scripts, etc.)? If it has not been done, what do you have in place to ensure processing integrity?

One of my customers requires it, but given the complexity and number of different jobs there are a lot of rejects (for business reasons, not necessarily data issues), duplicate removals, aggregations, record creation, etc., which makes a single generic formula for all jobs impractical. The main goal is to prove to auditors and downstream consumers of the data that there is processing integrity on the system, and to provide a report if needed.

Thanks,

Posted: Wed Oct 22, 2008 11:21 pm
by ray.wurlod
Welcome aboard. This is usually done post-run with a routine that invokes DSGetLinkInfo() enough times to get the row count for each link. Those counts can then be matched, for example, against a set of business rules (in a database?) that records which sums need to match.
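As a minimal sketch, one form this can take is an after-job subroutine; the stage and link names ("xfm_Main", "lnk_Input", "lnk_Output", "lnk_Reject") are placeholders, not names from any real job:

   * Minimal sketch of an after-job subroutine (arguments InputArg, ErrorCode).
   * Stage and link names below are placeholders for illustration only.
         RowsIn  = DSGetLinkInfo(DSJ.ME, "xfm_Main", "lnk_Input",  DSJ.LINKROWCOUNT)
         RowsOut = DSGetLinkInfo(DSJ.ME, "xfm_Main", "lnk_Output", DSJ.LINKROWCOUNT)
         RowsRej = DSGetLinkInfo(DSJ.ME, "xfm_Main", "lnk_Reject", DSJ.LINKROWCOUNT)

         ErrorCode = 0
         If RowsIn # RowsOut + RowsRej Then
            Call DSLogWarn("Counts do not balance: ":RowsIn:" <> ":RowsOut:" + ":RowsRej, "BalanceCheck")
            ErrorCode = 1   ;* non-zero reports the after-job subroutine as failed
         End

In practice you would drive the stage and link names (and which sums must match) from your business-rules table rather than hard-coding them.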

Posted: Sat Oct 25, 2008 3:00 pm
by Nagaraj
This can usually be handled with the help of shell scripts.
We have implemented something like this in our project, where we capture the counts and the number of records processed:

source-to-temp counts, temp-to-staging counts, and staging-to-main-table counts.
The report looks something like this:
slno  src_tbl  Src_count  Tgt_table  Tgt_count  Count_Diff
-----------------------------------------------------------
1     src1     10         Stg1       10         0
2     src2     11         Stg2       11         0

slno  Stg_table  Count  Main_table  Inserts  Updates  Deletes  Count_Diff
---------------------------------------------------------------------------
1     Stg1       100    M_tbl1      25       25       50       0

Thanks
-Raj