Change Capture with sort stage/key partition and sorted



karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

All,

I have designed a simple job to capture the changed data.


Content of File 1:

col1 col2
1,a
2,b
3,c
4,d
5,e
6,f
7,g
8,h
9,i

Content of File 2:

col1 col2
1,a

3,c
4,d
10,e
6,f
7,g
12,h
9,i
1111,k


File1 ---------->Sort -------> 
                                        Change Capture ---------------> Output File
File2 ---------->Sort ------->

Sort key = Col1

Key = col1
change value = col2

Output:

col1 col2 change_code
2 b 2
5 e 2
8 h 2
10 e 1
12 h 1
1111 k 1

This is correct, right?

I then tried the alternative way of doing the same thing, as described in the documentation:
The stage assumes that the incoming data is key-partitioned and sorted in ascending order. The columns the data is hashed on should be the key columns used for the data compare. You can achieve the sorting and partitioning using the Sort stage or by using the built-in sorting and partitioning abilities of the Change Capture stage.
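
To illustrate why both inputs should be hashed on the compare key, here is a minimal Python sketch (not DataStage code). The names `partition_of` and `hash_partition_and_sort` are hypothetical, and a simple modulus stands in for the real hash function. The point is only that rows with the same key end up in the same partition on both inputs, each partition sorted in ascending key order.

```python
def partition_of(key, partitions=2):
    # Stand-in for a hash function: the same key always maps to the same partition.
    return key % partitions


def hash_partition_and_sort(rows, partitions=2):
    """Distribute (key, value) rows by key, then sort each partition ascending by key."""
    parts = {p: [] for p in range(partitions)}
    for key, value in rows:
        parts[partition_of(key, partitions)].append((key, value))
    for p in parts:
        # Python's sort is stable, so rows with equal keys keep their input order.
        parts[p].sort(key=lambda row: row[0])
    return parts


# Small illustrative inputs (not the full files from this post).
before = [(3, "c"), (1, "a"), (2, "b"), (4, "d")]
after = [(4, "d"), (10, "e"), (1, "a"), (3, "c")]

before_parts = hash_partition_and_sort(before)
after_parts = hash_partition_and_sort(after)

for p in sorted(before_parts):
    print(p, before_parts[p], after_parts[p])
```

Because key 1 lands in the same partition on both sides, a per-partition compare can match it. If the two inputs were partitioned on different columns, matching keys could sit in different partitions and be reported as changes.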


I used

a) Hash partition
b) col1 as the key
c) Perform Sort with Stable option

in the Change Capture stage, and I removed the Sort stages. I ran the job and got the output below. The change codes are the reverse of the first method, which is not correct.

Output:

col1 col2 change_code
2 b 1
5 e 1
8 h 1
10 e 2
12 h 2
1111 k 2

I don't know what is happening. Expert input is welcome!
Karthik
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

You probably messed up your link order. Inserts and deletes are reversed when you reverse the before/after datasets.
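
A rough Python sketch of the effect, using the two files from the original post. This is not how the stage is implemented; `change_capture` is a hypothetical helper, codes 1 and 2 follow the output shown above, and 3 for an edit is assumed to be the stage default.

```python
INSERT, DELETE, EDIT = 1, 2, 3  # 1 and 2 as seen in the thread; 3 is an assumed default


def change_capture(before, after):
    """Compare two lists of (col1, col2) rows on col1 and emit only the changed rows."""
    before_map = dict(before)
    after_map = dict(after)
    changes = []
    for key in sorted(set(before_map) | set(after_map)):
        if key not in before_map:
            changes.append((key, after_map[key], INSERT))   # only in the after data
        elif key not in after_map:
            changes.append((key, before_map[key], DELETE))  # only in the before data
        elif before_map[key] != after_map[key]:
            changes.append((key, after_map[key], EDIT))     # same key, different col2
    return changes


file1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"),
         (6, "f"), (7, "g"), (8, "h"), (9, "i")]
file2 = [(1, "a"), (3, "c"), (4, "d"), (10, "e"), (6, "f"),
         (7, "g"), (12, "h"), (9, "i"), (1111, "k")]

intended = change_capture(before=file1, after=file2)
swapped = change_capture(before=file2, after=file1)

print(intended)
print(swapped)
```

With File1 as the before link, rows 2, 5 and 8 come out as deletes and rows 10, 12 and 1111 as inserts, which matches the first output. With the links swapped, the same rows come out with the codes reversed, which matches the second output.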

Mike