Page 1 of 1

Implementing backdated processing

Posted: Tue Jul 07, 2009 9:38 pm
by sbass1
Hi,

I have to implement processing where, if a "backdated" record occurs, update a column for all records greater than or equal to the backdate.

Say I have this data in my target table:

Code: Select all

KEY  DATE      FLAG_VAR
1    20090701  T
1    20090702  T
1    20090703  T
1    20090704  T
2    20090701  F
2    20090702  F
2    20090703  F
2    20090704  F
Now say I have this data in my source table:

Code: Select all

BACKDATED_FLAG KEY  DATE      FLAG_VAR
T              1    20090702  F
F              3    20090705  F
F              4    20090705  T
My desired output is:

Code: Select all

KEY  DATE      FLAG_VAR
1    20090701  T
1    20090702  F
1    20090703  F
1    20090704  F
2    20090701  F
2    20090702  F
2    20090703  F
2    20090704  F
3    20090705  F
4    20090705  T
IOW, the non-backdated records were added, and the backdated records were updated from the date of the backdate and all records forward. Note lines 2-4 in the desired output.

The target updates are split based on Backdated_Flag (i.e. two output links to the target table). Non-backdated data is Update or Insert (and updates many more columns than the dummy data above), and backdated data is Update Existing Row Only (any only updates a select number of columns).

That's the dummy data to illustrate the problem. My "real" generated SQL for the backdated link is:

Code: Select all

UPDATE Source.Account SET AccountOptedInOut=? WHERE SourceSystem=? AND EffectiveDate=? AND AccountNumber=?
This would do exactly what I want, if I could coax the generated code to be:
UPDATE Source.Account SET AccountOptedInOut=? WHERE SourceSystem=? AND EffectiveDate>=? AND AccountNumber=?
I'm pretty sure I can't use the "?" syntax in user written SQL code.

The target database is SQL Server.

So, any ideas on how to accomplish this?

Thanks,
Scott

Posted: Tue Jul 07, 2009 9:52 pm
by ray.wurlod
Use an immediately updated hashed file in the process, to detect whether the record has already been processed (and which will always contain the most recently used value), then Insert or Update into the target. If you prefer, interpose an Aggregator stage to group by key and only deliver the Last record of each group.

Posted: Wed Jul 08, 2009 1:38 am
by sbass1
ray.wurlod wrote:Use an immediately updated hashed file in the process, to detect whether the record has already been processed (and which will always contain the most recently used value), then Insert or Update into the target. If you prefer, interpose an Aggregator stage to group by key and only deliver the Last record of each group.
Hi Ray,

I don't understand this, nor am I sure it solves the stated problem. But I'm probably just being dense.

I need to update data from a particular date forward for backdated records. I don't know how to get DS to do this without user generated code.

Having said that, I think I may have a workaround. I'll try to explain...

I have a transform that splits out two links, NotBackDated and BackDated. NotBackDated updates or inserts the target table. BackDated updates a scratch table in SQL Server. This link "writes" to /dev/null, then /dev/null feeds back into the target table. This ensures that both links finish before user-generated code executes.

It looks something like:

Code: Select all

xfm --> NotBackDated --------------------------------> TargetTable
 |                                                          |
 |                                                          | (user SQL)
 + ---> BackDated --> TempTable --> SeqFile (/dev/null) ----+
This user-generated code is:

Code: Select all

UPDATE Source.Account
SET EODBalanceNaturalCcy = S.AccountEODBalanceNaturalCcy
FROM Source.Account T, Temp.Account_Backdated S
WHERE
       T.EffectiveDateSK  = S.EffectiveDateSK
   AND T.AccountMonthlySK = S.AccountMonthlySK

UPDATE Source.Account
SET OptedInOut = S.AccountOptedInOut
FROM Source.Account T, Temp.Account_Backdated S
WHERE
       T.EffectiveDateSK  >= S.EffectiveDateSK
   AND T.AccountMonthlySK  = S.AccountMonthlySK
If the EODBalanceNaturalCcy is backdated, I only update the single record. If the OptedInOut flag is updated, I need to update OptedInOut from that date forward.

Using user-generated SQL is somewhat problematic, as the business wants per-record warnings if any rejects occur. I don't know how to retrieve this information from SQL Server using user-generated code.

Sorry this is so vaguely described :? If necessary I can put a picture and DSX on Google Docs.

Cheers,
Scott

P.S.: Hope Melbourne is treating you well...

Re: Implementing backdated processing

Posted: Wed Jul 08, 2009 6:29 am
by chulett
sbass1 wrote:This would do exactly what I want, if I could coax the generated code to be:
UPDATE Source.Account SET AccountOptedInOut=? WHERE SourceSystem=? AND EffectiveDate>=? AND AccountNumber=?
I'm pretty sure I can't use the "?" syntax in user written SQL code.
Of course you can! Switch the stage to user-defined / custom sql and change the "=" to ">=" just like you've noted and you'll be fine.

Re: Implementing backdated processing

Posted: Wed Jul 08, 2009 11:18 pm
by sbass1
chulett wrote:Of course you can! Switch the stage to user-defined / custom sql and change the "=" to ">=" just like you've noted and you'll be fine.
Doh! Should have read the help before posting (sorry).

Yep, that fixed it. Thanks for the help!