Database Insert/Update taking long time in datastage server

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
HemaV
Participant
Posts: 63
Joined: Wed Jan 09, 2008 1:38 am
Location: Bangalore

Database Insert/Update taking long time in datastage server

Post by HemaV »

I am running job in Datastage server 7.5 version

Table_lookup Stage:
select Column1,Column2 from tablename;

SequenceOracle Stage:
select seqname.nextval as Column1 from dual;

SequentialSourceFile--------> Trans1
Table_lookup Stage-------->HashLookupFile------->Trans1
SequenceOracle Stage -------------> Trans1

In transformer (Trans1) the logic is as follows:
Constraint for insert is: if HashLookupFile results zero records then insert the records to the table.
Constraint for Update is: If HashLookupFile results some records then Update the records to the table.

Collect Column1 values for the records that got inserted or updated to the table in a target sequential file for further processing.

Here the problem what I am facing is,
Whenever the records insert or update it is taking very long time due to the SequenceOracle Stage. the records inserts or updates 9rows/sec.

When I include the "seqname.nextval" in the insert table stage as
{{
insert into tablename(empid,empname) values (nvl(:1,seqname.nextval),:2)
rather than
insert into tablename(empid,empname) values (:1,:2)
}}
and delete the "SequenceOracle" Stage it is taking very less time but I am unable to collect those records which got inserted in this way to the target sequential file.

Can someone please suggest me is there any way to collect the Column1 values which are getting inserted into the table into the target file?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So... it's not the "Insert/Update" that is taking a long time but the reference lookup to Oracle. And when you assign a new surrogate key value in the inserts, you want to capture those values in a flat file for "further processing". Yes?

How enamored are you of this database sequence? There are other, more... "DataStage centric" approaches to surrogate key generation which would simplify your life greatly and be speedy - say, for example, the KeyMgmt routines in the sdk. Or a simple counter in your job, seeded by a job parameter starting point, which works great as long as there's no concurrency issues. All of which can be easily diverted to a flat file.

For the Oracle sequence without a reference lookup, you'll need something on the database side to capture it... a trigger, for example, to write records to a work table that you use for your "further processing" and then clear when complete. Or you mark the records you insert in such a way (batch number? timestamp?) so that you can uniquely indentify them after the job completes and then select your list of new surrogates from the target to drive whatever processing you need to do next.

Things of that nature.
-craig

"You can never have too many knives" -- Logan Nine Fingers
HemaV
Participant
Posts: 63
Joined: Wed Jan 09, 2008 1:38 am
Location: Bangalore

Post by HemaV »

Can you please give me an idea how to use the triggers in my job design for datastage server?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:? Triggers are database objects so you'd need to talk to your DBA for specifics on that. All I'm saying is you could perhaps use an insert trigger to put your data into a work table (rather than a flat file) and then drive any further processing from that work table.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Even though trigger can be another design, I doubt it will increase the speed efficiency as it will be another SQL fired for each row.

Alternative solution may be to identify currval of sequence number before and after the run and use it to obtain list of inserted values.

This is just a starting point as you will have to design around it.
Post Reply