Strange problem while doing Update then Insert

Raghava
Participant
Posts: 13
Joined: Tue Jan 24, 2006 1:20 am
Contact:

Strange problem while doing Update then Insert

Post by Raghava »

Hi Gurus,

We are migrating a Server job to PX.

My job flow is: Oracle Enterprise stage ----> Transformer ----> Oracle Enterprise stage (update then insert).

This is the update and insert SQL (Server job):

UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION=:3,DESCRIPTION_A=:4,SIGN_IND=:5 WHERE LEAF_NUM_ID=:1 AND LEAF_NODE=:2;
INSERT INTO OFSD.OFSA_LEAF_DESC (LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND) VALUES (:1,:2,:3,:4,:5)

Run stats (Server job): source = 615 rows, target = 578 rows.

I am implementing the same logic in PX using the Oracle Enterprise stage (update then insert).

The PX job gives the same number of records as output, but:

Say I have input rows like this:
LEAF_NUM_ID   DESCRIPTION   LEAF_NODE
100           ABC           6
100           XYZ           6


My Server job inserts (100, ABC, 6) into the target table.
My PX job inserts (100, XYZ, 6) into the target.

For some records it is the other way around, which is why this is so strange; both jobs must produce the same output before UAT can be signed off.
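If I trace what "update then insert" does with those two rows (this is my understanding of the stage's behaviour; please correct me if it is wrong), the duplicate that is processed last supplies the final value:

-- Row (100, ABC, 6) processed first:
UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION='ABC'
WHERE LEAF_NUM_ID=100 AND LEAF_NODE=6;   -- 0 rows updated, so the stage inserts:
INSERT INTO OFSD.OFSA_LEAF_DESC (LEAF_NUM_ID,LEAF_NODE,DESCRIPTION) VALUES (100,6,'ABC');

-- Row (100, XYZ, 6) processed second:
UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION='XYZ'
WHERE LEAF_NUM_ID=100 AND LEAF_NODE=6;   -- 1 row updated: 'ABC' is overwritten

So whichever job processes XYZ last ends up with XYZ in the table, and row order differs between Server (a single stream) and PX (parallel partitions).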


Note, we have already tried:
1. A Remove Duplicates stage before loading into the target.
2. Running the target stage in sequential mode.
3. The DRS stage.

None of these met our requirement.



Can anybody please give some input to resolve this strange issue?

What might be the reason behind this?

Thanks in advance!

Rag
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

From the above post, I understand that:

LEAF_NUM_ID and LEAF_NODE form the primary key.

Now, look at the data you are trying to insert:
-----------------------------------------
LEAF_NUM_ID   DESCRIPTION   LEAF_NODE
100           ABC           6
100           XYZ           6
-----------------------------------------
You have two records with the same primary key. This should never happen; you should not have any duplicates in the source when you insert into a database.

You can remove duplicates on your primary key before you insert into the database.
What problem did you get when you tried this?
What is your criterion for choosing which record to process when you have duplicates?
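If it helps, a query along these lines (just a sketch; SOURCE_TABLE is a placeholder for whatever actually feeds your job) will list the keys that arrive more than once:

-- Show the key pairs that occur more than once in the input feed
SELECT LEAF_NUM_ID, LEAF_NODE, COUNT(*) AS occurrences
FROM SOURCE_TABLE
GROUP BY LEAF_NUM_ID, LEAF_NODE
HAVING COUNT(*) > 1;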
Minhajuddin

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The Remove Duplicates stage gives you the choice of retaining the first or last record in each group. Sort your data by LEAF_NUM_ID, LEAF_NODE and DESCRIPTION (and hash partition by LEAF_NUM_ID and LEAF_NODE) ahead of your Remove Duplicates stage.
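For comparison, the same deterministic rule expressed in the extract SQL would look something like this (a sketch only; SOURCE_TABLE stands for whatever feeds the stage, and it assumes your Oracle version supports analytic functions):

-- Keep exactly one row per (LEAF_NUM_ID, LEAF_NODE) key, chosen by a fixed
-- ordering on DESCRIPTION so that every run picks the same survivor
SELECT LEAF_NUM_ID, LEAF_NODE, DESCRIPTION, DESCRIPTION_A, SIGN_IND
FROM (
  SELECT s.*,
         ROW_NUMBER() OVER (PARTITION BY LEAF_NUM_ID, LEAF_NODE
                            ORDER BY DESCRIPTION) AS rn   -- ORDER BY ... DESC keeps the "last"
  FROM SOURCE_TABLE s
)
WHERE rn = 1;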
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
laxman.ds
Premium Member
Posts: 66
Joined: Thu Mar 02, 2006 9:00 am

Post by laxman.ds »

Thanks, Ray, for your inputs; we will try a workaround based on them.

Minhajuddin: thanks for your response!


In the target, LEAF_NUM_ID and LEAF_NODE are only NOT NULL columns (there is no declared primary key), but there is a unique index on both columns. I have checked the constraints; please find the script below:


CREATE TABLE OFSA_LEAF_DESC
(
  LEAF_NUM_ID    NUMBER(5)          NOT NULL,
  LEAF_NODE      NUMBER(14)         NOT NULL,
  DESCRIPTION    VARCHAR2(80 BYTE),
  DESCRIPTION_A  VARCHAR2(80 BYTE),
  SIGN_IND       NUMBER(1)          DEFAULT 0
)
TABLESPACE TS_OFDMPX
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
LOGGING
NOCOMPRESS
NOCACHE
NOPARALLEL
MONITORING;


CREATE UNIQUE INDEX LEAF_DESC ON OFSA_LEAF_DESC
(LEAF_NUM_ID, LEAF_NODE)
LOGGING
TABLESPACE TS_OFDMPX
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
NOPARALLEL;


CREATE INDEX LEAF_DESC_NODE1 ON OFSA_LEAF_DESC
(LEAF_NODE)
LOGGING
TABLESPACE TS_OFDMPX
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
NOPARALLEL;


CREATE INDEX LEAF_DESC_NUMID ON OFSA_LEAF_DESC
(LEAF_NUM_ID)
LOGGING
TABLESPACE TS_OFDMPX
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
NOPARALLEL;
2 B 1 4 ALL
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

@Laxman.ds: Are you the OP?

If so, did you get the problem resolved?
If not, please answer the questions in my previous post for a more elaborate solution.
Minhajuddin

laxman.ds
Premium Member
Posts: 66
Joined: Thu Mar 02, 2006 9:00 am

Post by laxman.ds »

I am the OP, Minhajuddin. Sorry for the confusion!

ray.wurlod wrote: "The Remove Duplicates stage gives you the choice of retaining the first or last record in each group. Sort your data by LEAF_NUM_ID, LEAF_NODE and DESCRIPTION (and hash partition by LEAF_NUM_ID and LEAF_NODE) ahead of your Remove Duplicates stage."


Ray: we tried it both ways (Remove Duplicates retaining the first/last record in each group) and the problem is still there.


Let me elaborate on the problem.

Server job flow:

Oracle_A ---\
             Link Collector (round robin) ----> Oracle stage (update then insert)
Oracle_B ---/

Step 1:
Extract queries in the Oracle stages:
(select LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND from A)

(select LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND from B)

Step 2:
Key columns in the Link Collector: (LEAF_NUM_ID, LEAF_NODE)

Step 3:
Target option "Update existing rows or insert new rows".
Query:
UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION=:3,DESCRIPTION_A=:4,SIGN_IND=:5 WHERE LEAF_NUM_ID=:1 AND LEAF_NODE=:2;
INSERT INTO OFSD.OFSA_LEAF_DESC (LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND) VALUES (:1,:2,:3,:4,:5)


I gave the target table information in my previous reply.


Data (LEAF_NUM_ID, LEAF_NODE, DESCRIPTION, DESCRIPTION_A):

Row 1: 6, 118406, HUSSAIN KATTAN, HUSSAIN KATTAN
Row 2: 6, 118406, NULL, HUSSAIN KATTAN

Say these records come out of the Link Collector. Which record will the Server job insert into the target? Does that depend on the first occurrence, or on some other ordering?

That would give us a clue on how to proceed.

The issue is that we need the output to match between Server and PX.

The Server job inserts row 1 while the PX job inserts row 2, and in some cases it is vice versa. We have not been able to find a solution.

Thanks for your time!
2 B 1 4 ALL
dinthat
Participant
Posts: 22
Joined: Tue Sep 25, 2007 8:43 am
Location: Kingdom of Saudi Arabia

Post by dinthat »

I am also facing the same issue while performing update then insert in parallel jobs.

Is there any workaround or solution for this issue? Please help.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Check that the partitioning is correct. Each key (combination of key columns) must occur on one processing node only.
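As an illustration of the idea (a sketch only: ORA_HASH is merely an analogy for DataStage's internal hash partitioner, and the 4-node configuration is an assumption):

-- Every row with the same key pair must hash to the same node; otherwise
-- two nodes can race on the same key during update then insert
SELECT LEAF_NUM_ID, LEAF_NODE,
       ORA_HASH(LEAF_NUM_ID || '|' || LEAF_NODE, 3) AS node   -- buckets 0-3 for a 4-node config
FROM OFSD.OFSA_LEAF_DESC;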
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.