Conditional Deletion

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Conditional Deletion

Post by chandra.shekhar@tcs.com »

Hi,
I have a requirement to load data into a table daily after deleting the previous day's data, based upon the column TRAN_DT (Transaction Date).
While deleting I have to make sure that I delete only the previous day's data and not today's or any future day's data.
I have done this through the "Before SQL" property available in the target DB2 Connector stage, using the query below.

Code:

DELETE FROM TABLE WHERE TRAN_DT < #BusinessDate#
The data which gets deleted daily is almost 30 million rows, so sometimes the job gets aborted due to "transaction log full".
Is there any other approach, short of creating another job?
Thanx and Regards,
ETL User
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well... you're not deleting just the "previous day's data" with that statement, you are deleting ALL dates less than the business date. Are those two equivalent? Is everything less than the business date all from the previous day?

ps. Creating another job won't change how a 30M record transactional delete behaves.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Yeah Craig, when I said "previous day's data" I meant the previous business day's data; here they are the same thing.
Everything from the previous day or older has to be deleted.
Thanx and Regards,
ETL User
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

If the amount of data you are keeping is a very small portion of the data to be deleted, you may want to look into the old trick of copying just the data to keep into a new table and then doing a drop and swap (a sketch follows).
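
A minimal sketch of that drop-and-swap in DB2 SQL, assuming the rows to keep are the business date and later; TRAN_TABLE and TRAN_KEEP are placeholder names, and any indexes, constraints and grants would still have to be recreated on the new table separately:

Code:

-- create an empty copy of the table structure (placeholder names)
CREATE TABLE TRAN_KEEP LIKE TRAN_TABLE;

-- copy only the small set of rows that must survive
INSERT INTO TRAN_KEEP
    SELECT * FROM TRAN_TABLE WHERE TRAN_DT >= #BusinessDate#;

-- swap: drop the original and rename the copy into its place
DROP TABLE TRAN_TABLE;
RENAME TABLE TRAN_KEEP TO TRAN_TABLE;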
-craig

"You can never have too many knives" -- Logan Nine Fingers
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

If you have access rights you can alter the table with NOT LOGGED INITIALLY prior to starting the deletion.

Refer to this: http://www-01.ibm.com/support/docview.w ... wg21215818

This turns off transaction logs during the process. Not great, because recovery can be problematic, but better than filling up the log and making the DBA unhappy.

Note: Don't forget to use the After SQL to turn logging back on!
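
A minimal sketch of how that Before SQL might look; TRAN_TABLE is a placeholder name, and the exact requirements are in the technote linked above (on DB2 the NOT LOGGED INITIALLY state applies to the unit of work in which it is activated, so check the technote for how commits and restoring normal logging should be handled):

Code:

-- Before SQL in the DB2 Connector (placeholder table name)
ALTER TABLE TRAN_TABLE ACTIVATE NOT LOGGED INITIALLY;
DELETE FROM TRAN_TABLE WHERE TRAN_DT < #BusinessDate#;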
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Craig,
I tried to do that, but to dump that small portion of data I would need to create a separate job for it. Can't there be something else?

@Andy
Ok, I'll try it, but one query:
DELETE is a reversible process, so will it still create transaction logs all the time, or will there then be no difference between TRUNCATE and DELETE?
Can you tell me the exact syntax for turning logging back on?
And will it turn off logging for the whole database or for that table only?
Thanx and Regards,
ETL User
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

chandra.shekhar@tcs.com wrote:I tried to do that, but to dump that small portion of data I would need to create a separate job for it.
So? You'd just need to run it first. You haven't clarified for us how much data is "left" after the delete (or whether you could get the grants to do that) to help decide if the approach I mentioned is even appropriate. However, if it is, who cares if that means another job? :?
chandra.shekhar@tcs.com also wrote:DELETE is a reversible process, so will it still create transaction logs all the time, or will there then be no difference between TRUNCATE and DELETE?
First you'd need to get the permissions to alter the table (twice), and I'd wager that is not going to happen. And of course there's a difference between a truncate and a delete. The former empties the table and is not reversible. The latter removes targeted rows and can be rolled back - unless you turn off logging.

IF you can truncate the table rather than delete, that would be the preferred approach. If.
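
For reference, a minimal sketch of what that would look like on DB2 (9.7 or later), assuming the whole table really can be emptied; TRAN_TABLE is a placeholder name:

Code:

-- empties the table without logging each row;
-- DB2 requires TRUNCATE to be the first statement in its unit of work
TRUNCATE TABLE TRAN_TABLE IMMEDIATE;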
-craig

"You can never have too many knives" -- Logan Nine Fingers
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

@Craig,
That's the thing I don't want to do.
I have kept that as the last option; I was hoping I could do something within the same job.
Thanx and Regards,
ETL User
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

If you want to keep logging on, then the only solution is to ask your DBA to increase the space available for logging. That's your only option.

There's no way to compress or limit the amount of logging; that is controlled by the database, not DataStage. You can either turn it off on a per-table basis or keep it on and live with the log space that full recoverability requires.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
MT
Premium Member
Posts: 198
Joined: Fri Mar 09, 2007 3:51 am

Post by MT »

Hi

One way is to create a stored procedure which handles the deletion for you and adds some intermediate commits, so the log-full condition does not occur (a sketch follows).
This is the way to go (and we do it this way) if you need the deletes logged.
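
A minimal sketch of such a procedure in DB2 SQL PL, committing every 100,000 rows; the procedure, table and column names are placeholders, not something from this thread:

Code:

-- Delete in limited slices, committing after each one, so the
-- transaction log never has to hold the whole 30M-row delete.
CREATE OR REPLACE PROCEDURE PURGE_OLD_TRAN (IN P_BUSINESS_DT DATE)
LANGUAGE SQL
BEGIN
    DECLARE V_ROWS INTEGER DEFAULT 1;
    WHILE V_ROWS > 0 DO
        DELETE FROM (SELECT 1 FROM TRAN_TABLE
                     WHERE TRAN_DT < P_BUSINESS_DT
                     FETCH FIRST 100000 ROWS ONLY);
        GET DIAGNOSTICS V_ROWS = ROW_COUNT;
        COMMIT;
    END WHILE;
END

The job could then call it from the Before SQL, e.g. CALL PURGE_OLD_TRAN('#BusinessDate#'), instead of issuing the raw DELETE, assuming the job parameter resolves to a valid date literal.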

For any other solution I would like to know more details, but in general talking to the DBA is always a good idea.
regards

Michael
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Hi All,
Based upon the responses I got, the following are the possible options for me:

1) Create a separate job, as suggested by Craig.
2) Use the no-logging option, as suggested by asorrell (not feasible, since according to our DBA that will make the table inaccessible if the job aborts during the deletion, and recreating the table is the only way to get it back).
3) Ask the DBA to increase the logging space (not feasible either, since it'll involve approval from the client and a lot of standard processes would have to be followed :( ).
4) Create a procedure which does intermediate commits, as suggested by MT.

I think I have to create a separate job :(
I was a bit reluctant to do it because I already have 3 jobs populating this table (and now a 4th :cry:).
Thanks guys for your support
Last edited by chandra.shekhar@tcs.com on Wed Apr 17, 2013 12:45 am, edited 1 time in total.
Thanx and Regards,
ETL User
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Create a sequence to run them all, then the client has only the one "job" with which to interact.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

We normally do not create sequences here at my project :(.
I know it sucks...
Thanx and Regards,
ETL User
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

:idea: "normally do not" does not preclude doing one.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

:)
Thanx and Regards,
ETL User