Strange problem while doing Update then Insert

Raghava
Participant
Posts: 13
Joined: Tue Jan 24, 2006 1:20 am
Contact:

Strange problem while doing Update then Insert

Post by Raghava »

Hi Gurus,

We are migrating a Server job to PX.

My job flow is: Oracle Enterprise stage ----> Transformer ----> Oracle Enterprise stage (update then insert).

This is the update and insert SQL (Server job):

UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION=:3,DESCRIPTION_A=:4,SIGN_IND=:5 WHERE LEAF_NUM_ID=:1 AND LEAF_NODE=:2;
INSERT INTO OFSD.OFSA_LEAF_DESC (LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND) VALUES (:1,:2,:3,:4,:5)

Run stats (Server job): source = 615 rows, target = 578 rows.

I am implementing the same logic in PX using the Oracle Enterprise stage (update then insert).

The PX job gives the same number of records as output, but:

Say I have input rows like this:
LEAF_NUM_ID   DESCRIPTION   LEAF_NODE
100           ABC           6
100           XYZ           6


My Server job inserts (100, ABC, 6) into the target table.
My PX job inserts (100, XYZ, 6) into the target.

For some records it is the other way around, which is why this is so strange; both jobs must produce the same output before UAT can be signed off.
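If I trace what "update then insert" does with those two rows (this is my understanding of the stage's behaviour; please correct me if it is wrong), the duplicate that is processed last supplies the final value:

-- Row (100, ABC, 6) processed first:
UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION='ABC'
WHERE LEAF_NUM_ID=100 AND LEAF_NODE=6;   -- 0 rows updated, so the stage inserts:
INSERT INTO OFSD.OFSA_LEAF_DESC (LEAF_NUM_ID,LEAF_NODE,DESCRIPTION) VALUES (100,6,'ABC');

-- Row (100, XYZ, 6) processed second:
UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION='XYZ'
WHERE LEAF_NUM_ID=100 AND LEAF_NODE=6;   -- 1 row updated: 'ABC' is overwritten

So whichever job processes XYZ last ends up with XYZ in the table, and row order differs between Server (a single stream) and PX (parallel partitions).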


Note, we have already tried:
1. A Remove Duplicates stage before loading into the target.
2. Running the target stage in sequential mode.
3. The DRS stage.

None of these met our requirement.



Can anybody please give some input to resolve this strange issue?

What might be the reason behind this?

Thanks in advance!

Rag
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

From the above post, I understand that:

LEAF_NUM_ID and LEAF_NODE form the primary key.

Now, look at the data you are trying to insert:
-----------------------------------------
LEAF_NUM_ID   DESCRIPTION   LEAF_NODE
100           ABC           6
100           XYZ           6
-----------------------------------------
You have two records with the same primary key. This should never happen; you should not have any duplicates in the source when you insert into a database.

You can remove duplicates on your primary key before you insert into the database.
What problem did you get when you tried this?
What is your criterion for choosing which record to process when you have duplicates?
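If it helps, a query along these lines (just a sketch; SOURCE_TABLE is a placeholder for whatever actually feeds your job) will list the keys that arrive more than once:

-- Show the key pairs that occur more than once in the input feed
SELECT LEAF_NUM_ID, LEAF_NODE, COUNT(*) AS occurrences
FROM SOURCE_TABLE
GROUP BY LEAF_NUM_ID, LEAF_NODE
HAVING COUNT(*) > 1;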
Minhajuddin

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The Remove Duplicates stage gives you the choice of retaining the first or last record in each group. Sort your data by LEAF_NUM_ID, LEAF_NODE and DESCRIPTION (and hash partition by LEAF_NUM_ID and LEAF_NODE) ahead of your Remove Duplicates stage.
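For comparison, the same deterministic rule expressed in the extract SQL would look something like this (a sketch only; SOURCE_TABLE stands for whatever feeds the stage, and it assumes your Oracle version supports analytic functions):

-- Keep exactly one row per (LEAF_NUM_ID, LEAF_NODE) key, chosen by a fixed
-- ordering on DESCRIPTION so that every run picks the same survivor
SELECT LEAF_NUM_ID, LEAF_NODE, DESCRIPTION, DESCRIPTION_A, SIGN_IND
FROM (
  SELECT s.*,
         ROW_NUMBER() OVER (PARTITION BY LEAF_NUM_ID, LEAF_NODE
                            ORDER BY DESCRIPTION) AS rn   -- ORDER BY ... DESC keeps the "last"
  FROM SOURCE_TABLE s
)
WHERE rn = 1;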
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
laxman.ds
Premium Member
Posts: 66
Joined: Thu Mar 02, 2006 9:00 am

Post by laxman.ds »

Thanks, Ray, for your inputs; we will try a workaround based on them.

Minhajuddin: thanks for your response!


In the target, LEAF_NUM_ID and LEAF_NODE are only NOT NULL columns (there is no declared primary key), but there is a unique index on both columns. I have checked the constraints; please find the script below:


CREATE TABLE OFSA_LEAF_DESC
(
  LEAF_NUM_ID    NUMBER(5)          NOT NULL,
  LEAF_NODE      NUMBER(14)         NOT NULL,
  DESCRIPTION    VARCHAR2(80 BYTE),
  DESCRIPTION_A  VARCHAR2(80 BYTE),
  SIGN_IND       NUMBER(1)          DEFAULT 0
)
TABLESPACE TS_OFDMPX
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
LOGGING
NOCOMPRESS
NOCACHE
NOPARALLEL
MONITORING;


CREATE UNIQUE INDEX LEAF_DESC ON OFSA_LEAF_DESC
(LEAF_NUM_ID, LEAF_NODE)
LOGGING
TABLESPACE TS_OFDMPX
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
NOPARALLEL;


CREATE INDEX LEAF_DESC_NODE1 ON OFSA_LEAF_DESC
(LEAF_NODE)
LOGGING
TABLESPACE TS_OFDMPX
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
NOPARALLEL;


CREATE INDEX LEAF_DESC_NUMID ON OFSA_LEAF_DESC
(LEAF_NUM_ID)
LOGGING
TABLESPACE TS_OFDMPX
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE (
  INITIAL      1M
  MINEXTENTS   1
  MAXEXTENTS   2147483645
  PCTINCREASE  0
  BUFFER_POOL  DEFAULT
)
NOPARALLEL;
2 B 1 4 ALL
Minhajuddin
Participant
Posts: 467
Joined: Tue Mar 20, 2007 6:36 am
Location: Chennai
Contact:

Post by Minhajuddin »

@Laxman.ds: Are you the OP?

If so, did you get the problem resolved?
If not, please answer the questions in my previous post for a more elaborate solution.
Minhajuddin

laxman.ds
Premium Member
Posts: 66
Joined: Thu Mar 02, 2006 9:00 am

Post by laxman.ds »

I am the OP, Minhajuddin. Sorry for the confusion!

ray.wurlod wrote: "The Remove Duplicates stage gives you the choice of retaining the first or last record in each group. Sort your data by LEAF_NUM_ID, LEAF_NODE and DESCRIPTION (and hash partition by LEAF_NUM_ID and LEAF_NODE) ahead of your Remove Duplicates stage."


Ray: we tried it both ways (Remove Duplicates retaining the first/last record in each group) and the problem is still there.


Let me elaborate on the problem.

Server job flow:

Oracle_A ---\
             Link Collector (round robin) ----> Oracle stage (update then insert)
Oracle_B ---/

Step 1:
Extract queries in the Oracle stages:
(select LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND from A)

(select LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND from B)

Step 2:
Key columns in the Link Collector: (LEAF_NUM_ID, LEAF_NODE)

Step 3:
Target option "Update existing rows or insert new rows".
Query:
UPDATE OFSD.OFSA_LEAF_DESC SET DESCRIPTION=:3,DESCRIPTION_A=:4,SIGN_IND=:5 WHERE LEAF_NUM_ID=:1 AND LEAF_NODE=:2;
INSERT INTO OFSD.OFSA_LEAF_DESC (LEAF_NUM_ID,LEAF_NODE,DESCRIPTION,DESCRIPTION_A,SIGN_IND) VALUES (:1,:2,:3,:4,:5)


I gave the target table information in my previous reply.


Data (LEAF_NUM_ID, LEAF_NODE, DESCRIPTION, DESCRIPTION_A):

Row 1: 6, 118406, HUSSAIN KATTAN, HUSSAIN KATTAN
Row 2: 6, 118406, NULL, HUSSAIN KATTAN

Say these records come out of the Link Collector. Which record will the Server job insert into the target? Does that depend on the first occurrence, or on some other ordering?

That would give us a clue on how to proceed.

The issue is that we need the output to match between Server and PX.

The Server job inserts row 1 while the PX job inserts row 2, and in some cases it is vice versa. We have not been able to find a solution.

Thanks for your time!
2 B 1 4 ALL
dinthat
Participant
Posts: 22
Joined: Tue Sep 25, 2007 8:43 am
Location: Kingdom of Saudi Arabia

Post by dinthat »

I am also facing the same issue while performing update then insert in parallel jobs.

Is there any workaround or solution for this issue? Please help.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Check that the partitioning is correct. Each key (combination of key columns) must occur on one processing node only.
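As an illustration of the idea (a sketch only: ORA_HASH is merely an analogy for DataStage's internal hash partitioner, and the 4-node configuration is an assumption):

-- Every row with the same key pair must hash to the same node; otherwise
-- two nodes can race on the same key during update then insert
SELECT LEAF_NUM_ID, LEAF_NODE,
       ORA_HASH(LEAF_NUM_ID || '|' || LEAF_NODE, 3) AS node   -- buckets 0-3 for a 4-node config
FROM OFSD.OFSA_LEAF_DESC;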
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.