Performance Issue (DB2/400 to Oracle 9i) Extraction

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
sreekanth kutumbaka
Participant
Posts: 16
Joined: Wed May 31, 2006 10:57 pm
Location: Hyderabad

Performance Issue (DB2/400 to Oracle 9i) Extraction

Post by sreekanth kutumbaka »

Hi,

I facing performance problem while extracting the data from DB2/400 using datastage ODBC connection.

----- Enviorment -- -------------
Source System
JD Edwards (ERP APPLICATION)
OS: AS/400
DB: DB2/400
----------------------------------------------
Target Database:
Oracle 9i
------------------------------------------------

The requirement is that we need to do extract the data from JD Edwards (DB2/400) database on a daily basis.

All the dates in Source system are in Julian Date formate

To extract the delta records from source system we are doing a lookup with a Current_status_table from oracle 9i (which keep track of load) with the DB2/400 table (eg.F0005 ) compare with the respective julian date field by doing a lookup. This is design is having poor performance issues.

Eg. some of the source system table has more than 40Lakhs of records out of which we may need to extract only 1000+ rows per day as per the date filter conditions. and in some case out of 40Lakhs record there no transaction on that particular day. In such case this lookup filter is checking each and every records and then after 2:30 hours its populating zero records.

To avoid this situation if we directly apply the date filter condition in the UserDefine SQL at the ODBC Stage Level its very very effective and fast.

But in need to assign this Job paramter at run time bcos this contains dynamic values keep changing on daily basis.

Please provide some effective solution how to assgin the run the parameter in datastage.


------------------------------------------------
SELECT FIELD1,FIELD2,FIELD3,FIELD4,FIELD5,FIELD5 FROM F0005
WHERE DRUPMJ > P_ST_DT AND DRUPMJ <= P_END_DT
-------------------------------------------------------------

DRUPMJ is the julian format date in (DB2/400)

P_ST_DT = 106001 julian format date ('01-JAN-2006')
P_END_DT = 106005 Julian format date ('05-JAN-2006')

The above two parameter values needs to be assign at run time.

Please help...

Regards,
Srikanth






--------------------------------------------------
Sreekanth
DS_SUPPORT
Premium Member
Premium Member
Posts: 232
Joined: Fri Aug 04, 2006 1:20 am
Location: Bangalore

Post by DS_SUPPORT »

You can assign those values as Job Parameter and use them in ur Source Query.

Create two job parameters PS_START_DT and PS_END_DT and change your quey like

Code: Select all

SELECT FIELD1,FIELD2,FIELD3,FIELD4,FIELD5,FIELD5 FROM F0005 
WHERE DRUPMJ > #PS_START_DT# AND DRUPMJ <= #PS_END_DT#
sreekanth kutumbaka
Participant
Posts: 16
Joined: Wed May 31, 2006 10:57 pm
Location: Hyderabad

re.Performance Issue (DB2/400 to Oracle 9i) Extraction

Post by sreekanth kutumbaka »

Hi,

Thanks for you reply.

Our requirement is to pass those parameter
(#PS_START_DT#,#PS_END_DT#) dynamically by reading from the database during run time.

------------------------------------------------------------------------------------
SELECT FIELD1,FIELD2,FIELD3,FIELD4,FIELD5,FIELD5 FROM F0005
WHERE DRUPMJ > #PS_START_DT# AND DRUPMJ <= #PS_END_DT#
------------------------------------------------------------------------------------

We have a table called DWTP_CURRENT_STATUS where we store the information of last run date and current run date.

SAMPLE DATA CONTAINS IN DWTP_CURRENT_STATUS
COLUMN COLUMN
NAME VALUE
~~~~ ~~~~~~~~
SRCID =JDE
BUS_LOAD_DT =6/10/2006
PRV_LOAD_JULIAN_DT =106001 -- assign value (#PS_START_DT#)
CUR_LOAD_JULIAN_DT =106282 --assign value(#PS_END_DT#)
PREV_LOAD_DT =1/1/2006


It would be appreaciate if you guys can let us know how to pick the value from the about table at run time and assign it to the datastage Job_parameter.

Regards,
srikanth.
Sreekanth
DS_SUPPORT
Premium Member
Premium Member
Posts: 232
Joined: Fri Aug 04, 2006 1:20 am
Location: Bangalore

Post by DS_SUPPORT »

Is it possible to do something like joinig the tables in the query itself

Code: Select all

SELECT FIELD1,FIELD2,FIELD3,FIELD4,FIELD5,FIELD5 FROM F0005 
WHERE DRUPMJ > (select PRV_LOAD_JULIAN_DT from DWTP_CURRENT_STATUS where condition) AND DRUPMJ <= (select CUR_LOAD_JULIAN_DT from DWTP_CURRENT_STATUS where condition)
If you are able to do this you dont require Job parameters itself
DS_SUPPORT
Premium Member
Premium Member
Posts: 232
Joined: Fri Aug 04, 2006 1:20 am
Location: Bangalore

Post by DS_SUPPORT »

If the above method is not possible, then the thing comes into my mind is

1) Create a Server Job which will read the data from DWTP_CURRENT_STATUS and write it to a Hashed file

2) Create a routine which will read the value from the Hashed File (Look into Routines/SDK for examples)

3) Use the routine activity in a Sequence Job and get the value from the routine and pass it to the server job.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Why do you assert that the performance issue (whatever is meant by "performance") is with the extraction phase? Have you proven this?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply