Posted: Wed May 09, 2001 3:32 am
Get Oracle to select just the required records.
Use a job parameter in the WHERE clause of user-defined SQL, for example:
SELECT col1, col2, ... FROM table WHERE INSERT_DATETIME > #LastRetrieveDate#;
Create a job control routine that sets the controlled job's parameter after reading the value from wherever it is stored.
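The two steps above amount to a "high-water mark" pattern: read the last retrieval timestamp from its storage, substitute it into the extraction SQL as a parameter, and advance the mark only after a successful run. A minimal sketch of that pattern, illustrated here in Python for clarity (in DataStage itself this would be a BASIC job control routine); the file name, table, and column list are illustrative assumptions, not part of the original job:

```python
from pathlib import Path

MARK_FILE = Path("last_retrieve_date.txt")  # assumed storage location for the mark

# Parameterised extraction SQL; {mark} plays the role of #LastRetrieveDate#
SQL_TEMPLATE = (
    "SELECT col1, col2 FROM src_table "
    "WHERE INSERT_DATETIME > TO_DATE('{mark}', 'YYYY-MM-DD HH24:MI:SS')"
)

def read_mark(path: Path = MARK_FILE) -> str:
    """Return the stored high-water mark, or an epoch default on the first run."""
    if path.exists():
        return path.read_text().strip()
    return "1970-01-01 00:00:00"

def build_query(mark: str) -> str:
    """Substitute the mark into the WHERE clause, as DataStage does with a job parameter."""
    return SQL_TEMPLATE.format(mark=mark)

def advance_mark(new_mark: str, path: Path = MARK_FILE) -> None:
    """Persist the new high-water mark only after the job completes successfully."""
    path.write_text(new_mark)

# Each 15-minute run: build the query from the stored mark, run the job,
# then advance the mark to the run's start time.
query = build_query(read_mark())
```

Because the filtering now happens in the WHERE clause, Oracle (ideally with an index on INSERT_DATETIME) returns only the new rows, instead of DataStage scanning all 5 million records and discarding most of them in a transformer constraint.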
-----Original Message-----
From: Melvin Thong [mailto:melvin@adtel.esb.com.my]
Sent: Wednesday, 09 May 2001 13:39
To: informix-datastage@oliver.com
Cc: ray_wurlod@informix.com
Subject: Problem extracting records within time frame
Hello,
I am sorry for previously posting this message with the wrong subject. I am re-sending the email; my problem description is below.
I am encountering a problem with a DataStage job that I have developed. The job is scheduled to run every 15 minutes to extract data from an Oracle table and insert it into a Unidata table.
The job extracts new Oracle records based on the timestamp of a field called INSERT_DATETIME in the Oracle table. The extraction comparison in the transformer's Constraints is something like this: INSERT_DATETIME >= LAST_RETRIEVE_DATE
LAST_RETRIEVE_DATE is the last retrieval datetime, written to a file. When the job starts executing, it reads the last retrieval date from that file.
The problem I am facing is that the records in the Oracle table keep growing; there are currently more than 5 million. I noticed that the DataStage job actually reads through the whole table, from record 1 to record 5 million, and each retrieved record's INSERT_DATETIME has to be compared with LAST_RETRIEVE_DATE.
This has hurt the performance of the DataStage job, which is supposed to complete within 15 minutes; it now exceeds that 15-minute window.
My question is: does DataStage have a way to recognise newly inserted records in the Oracle table without scanning the Oracle table from record 1 again when doing the comparison?
Our customer has been complaining about this performance issue, since the job cannot complete within the 15-minute window.
I hope you can offer some suggestions on how to rectify this problem. I would appreciate your help. Thank you.
Regards,
Melvin