Records not getting extracted from Oracle

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
rohanf
Participant
Posts: 30
Joined: Fri Oct 07, 2005 5:39 am
Location: pune india

Records not getting extracted from Oracle

Post by rohanf »

Hi,

We have a strange issue that has come up as soon as we switched to a new database.

We have simple extraction jobs with design as below:-

Oracle Enterprise -----> Copy (Force = True) -----> Dataset

However when we run these there are a few records that do not get extracted. This issue moves from one job to another i.e. one day say Table1 had issue while extraction, and next day table2 had the same issue and Table1 did not have this issue.

We have no warnings except for the 2 NLS warnings that have been supressed.

Moreover when we rerun the job the extraction is fine i.e. the count is same as we have in the database.

Also all the records seems to belong to a particular time frame.

Thanks & Regards
Rohan
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Did you check the row count in director against the rows in the DataSet ?

If they belong to a particular timeframe, then - maybe your scheduler is not covering that period.

For example, you schedule the extract date range from hh:mi:ss to sysdate and then record the sysdate to your control tables for your next load, maybe the delay between extraction and control table load is wide enough to loose some records.

If you have any history, you can identify min and max time values to check this theory.

I am not saying that this is your exact problem...but it may be something alone this line.
rohanf
Participant
Posts: 30
Joined: Fri Oct 07, 2005 5:39 am
Location: pune india

Records not getting extracted from Oracle

Post by rohanf »

Hi Sainath,

We are doing a full extract everytime. Also apologies, my initial assumption of records belonging to a particular time frame was wrong, it was based on looking at a sample of data.

Moreover to add to my previous post, we have also checked and confirmed that there were no parallel Update scripts on the table while we were extracting.

Director shows record count corresponding to the final dataset record count. i.e. changes as the count in dataset changes.

Thanks & Regards
Rohan
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Re: Records not getting extracted from Oracle

Post by Sainath.Srinivasan »

rohanf wrote: Director shows record count corresponding to the final dataset record count. i.e. changes as the count in dataset changes.
What do you mean by this statement ?

I assume there must be a 'read' count from the database. Is that same as the 'write' count into the dataset ?

Does it match to rows
a.) you expect to extract ?
b.) present in your DataSet after extract finishes ?
rohanf
Participant
Posts: 30
Joined: Fri Oct 07, 2005 5:39 am
Location: pune india

Records not getting extracted from Oracle

Post by rohanf »

Hi Sainath,

By saying 'changes as the count in dataset changes' I am referring to my comment in the first post message:

"However when we run these there are a few records that do not get extracted. This issue moves from one job to another i.e. one day say Table1 had issue while extraction, and next day table2 had the same issue and Table1 did not have this issue."

Quoting actual runtime scenario below:

Say Table1 count is 4228279 which is static.

on Run 1:
Count in Dataset: 4050765 (Incorrect count)
Count in Director: 4050765 (Incorrect count)

on Run 2:
Count in Dataset: 4228279 (Correct count)
Count in Director: 4228279 (Correct count)

Regards
Rohan
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

And you claim to have checked
1.) Accuracy of SQL statement
2.) Parameters passed - if any
3.) Log messages and errors
4.) Record count outside DataStage reflecting the correct value when DataStage says incorrect numbers
5.) Table is not "appended" or "updated" by any other sources
6.) No rejects from any stage
rohanf
Participant
Posts: 30
Joined: Fri Oct 07, 2005 5:39 am
Location: pune india

Records not getting extracted from Oracle

Post by rohanf »

Hi Sainath,

Yes we have done all the below checks to the best of our knowledge:

Validity/Accuracy of SQL: The sql has been running for a long time now, the code has been Unit/system tested and is in mid of UAT.

Parameters passed: we have checked this as well.

Logs as mentioned has only two NLS warnings that have been suppressed.

We have checked the record count outside datastage as well.

For your refference I am attaching the sql used:-

SELECT qpt.quote_policy_link_identifier, qpl.qp_reference_number,
qpl.product_identifier, qpl.policy_number,
qpl.quote_audit_trail_identifier, qpl.policy_audit_trail_identifier,
qpl.policy_activation_date, qpl.create_source_table,
qpl.create_source_system, qpt.qptransaction_identifier,
qpt.audit_trail_identifier,
qpt.version_effective_date AS version_effective_date_qpt,
qpt.policy_year_number, qpt.policy_amendment_sequence,
qpt.effective_start_date, qpt.effective_end_date,
qpt.transaction_staff_identifier, qpt.endorsement_reason_identifier,
qpt.fee_amend_type_identifier, qpt.total_premium, qpt.total_taxes,
qpt.total_fees, qpt.total_cost, qpt.total_premium_prorata,
qpt.total_taxes_prorata, qpt.total_cost_prorata, qpt.gross_before_ncd,
qpt.ncd_amount, qpt.total_discount, qpt.discount_clawback,
qpt.customer_permanent_load, qpt.customer_temporary_load_refer,
qpt.penalty_points_amount, qpt.customer_demerit_points,
qpt.fee_original_amount, qpt.fee_overwrite_reason, qpt.active_sequence,
pr.product_code,qpt.tx_bonus_years,
CAST((SELECT MAX (qp_reference_number) AS max_qp_ref_no
FROM quote_policy_link) AS DECIMAL (38, 0)) AS max_qp_ref_no,
qpt.total_cost_old_rates,qpl.WEB_REFERENCE_NUMBER
FROM quote_policy_transaction qpt, quote_policy_link qpl, product pr
WHERE qpt.quote_policy_link_identifier = qpl.quote_policy_link_identifier
AND pr.product_identifier = qpl.product_identifier
AND create_source_table <> 'QQ'
AND policy_year_number > 0
AND version_effective_date >
TO_TIMESTAMP ('01/01/1900 00:00:00.000000',
'DD/MM/YYYY HH24:MI:SS.FF'
)
AND qpt.version_effective_date <=
TO_TIMESTAMP ('31/12/2050 00:00:00.000000',
'DD/MM/YYYY HH24:MI:SS.FF'
)


Also note that we have re-run the ETL(No Changes) and we are not getting this issue. Nevertheless we do require to investigate this since this has repeated twice and may have a potential to repeat during Production.

Thanks & Regards
Rohan
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I don't see anything patently wrong in the SQL. Is there a chance that the quote_policy_link table might have uncommitted transactions at runtime? If this happens again, the quickest way to debug the issue would be to look at the 2 results and see if you can find some rule in the delta between the two, starting with any columns used as part of the select.
Post Reply