Pre-row and Post-row Counts

Post by **iq_etl** » Fri Jul 19, 2013 9:03 am

For each job that we run, we want to first collect the number of rows in the table, then, after the job has run, collect the number of rows again. We'd then compare the two counts and, if the difference is large enough, send an email so the ETL developers are notified.

So, here's my question. Which of the following two approaches would be the better practice?

1. Container Stages so the same logic can be used in multiple jobs.

2. Create a pre-row count job, a post-row count job, and a comparison job, then put those jobs in a Job Sequence.

I understand Container Stages are for logic that is reused across multiple jobs, whereas Job Sequences are more for collecting multiple jobs into an application. With that understanding, I'm leaning towards option one, using Containers.

Thoughts?

(This is actually for 9.1, not 8.x.)

Post by **ray.wurlod** » Fri Jul 19, 2013 3:59 pm

You might also consider before-job and after-job subroutines.

Shared Containers could be used. They would have to be the first and last stages in each job design.

Jobs invoked from sequences could be used. I'd be using server jobs here, since only one row (the count) needs to be processed.

They're all valid approaches.
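
Whichever mechanism is chosen, the comparison step from the original question is the same: take the two counts, decide whether the change is large enough, and build the notification. As a rough sketch only, here is that logic in plain Python rather than in DataStage itself. The function names, the percentage threshold, and the zero-baseline rule are all assumptions made for illustration; the thread does not say how "sufficient difference" is defined.

```python
from email.message import EmailMessage


def count_change_exceeds(pre_count, post_count, threshold_pct):
    """Return True when the row count changed by more than threshold_pct percent.

    threshold_pct is a hypothetical parameter; the thread does not define
    what counts as a sufficient difference.
    """
    if pre_count == 0:
        # Assumption: with no baseline to take a percentage of,
        # treat any rows at all as a notable change.
        return post_count > 0
    change_pct = abs(post_count - pre_count) / pre_count * 100
    return change_pct > threshold_pct


def build_alert(job_name, pre_count, post_count, recipients):
    """Build (but do not send) a notification email for the ETL developers."""
    msg = EmailMessage()
    msg["Subject"] = f"Row count alert for {job_name}"
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Job {job_name}: {pre_count} rows before, {post_count} rows after "
        f"(difference {post_count - pre_count:+d})."
    )
    return msg


# Example with made-up values: a 20% jump against a 10% threshold.
if count_change_exceeds(1000, 1200, threshold_pct=10):
    alert = build_alert("load_orders", 1000, 1200, ["etl@example.com"])
```

Keeping the threshold check separate from the message construction mirrors the split in option 2 (count, count, compare), and the same two pieces could equally sit behind a before-job and after-job subroutine pair.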