Page 1 of 1
Scenario: Identify duplicated in a column
Posted: Tue Jul 22, 2014 4:39 am
by nagadastagirireddy
Hi,
I have a scenario in my project, in a column there are two values separated by comma as shown below.
Input
-----------
AUS,IND
AUS,PAK
IND,AUS
IND,PAK
PAK,AUS
PAK,IND
If you observe in the above data, there a combinations of "AUS,IND" and "IND,AUS". In this case I have to consider them as duplicates and load any one of them in table.
My output should look as below.
Expected Result
--------------------
AUS,IND
AUS,PAK
IND,PAK
PAK,AUS
Please help me with this scenario.
Many Thanks in advance.
Posted: Tue Jul 22, 2014 5:35 am
by stuartjvnorton
Transformer: split using Field and put into alphabetical order, eg: IND,AUS -> AUS,IND
Sort with key change
Filter by key change= 1
Posted: Tue Jul 22, 2014 9:29 am
by nagadastagirireddy
Hi,
If I split and sort the data then it will look like below.
After applying split and sort
-------------------------------
AUS,AUS
AUS,AUS
IND,IND
IND,IND
PAK,PAK
PAK,PAK
This might not give expected result.
Posted: Tue Jul 22, 2014 4:19 pm
by yugee
Simple, but works as required.
Transformer:
COL1 --> is your input with two fields separated by ','
Declare Below Stage Variables:
svIndex - Index(DSLink1.COL1,',',1) This will find the position of ','
svLength - Len(DSLink1.COL1) this is to find the length of your column
svField1 - Left(DSLink1.COL1,svIndex-1) Left portion of your input column
svField2 - Right(DSLink1.COL1,svLength-svIndex) Right portion of your input
Declare an output column:
MERGEKEY -
If Compare(svField1,svField2) > 0 Then
svField1:svField2
Else
svField2:svField1
Finally Have a Sort Stage - to remove duplicates on MERGEKEY
Let me know if it solves your requirement.
Posted: Tue Jul 22, 2014 10:37 pm
by nagadastagirireddy
Thank a lot Yugee, that works perfect.
But one small check, we can directly apply field function to seperate the string, instead why are you using Index and field?
Posted: Wed Jul 23, 2014 12:25 pm
by yugee
Good to know...
No specific reason for using Index - I was testing this function as part of some other issue...you can use any other functions to split the field into two...
Posted: Wed Jul 23, 2014 1:32 pm
by ssnegi
Read the input twice. Number the rows @INROWNUM.
Join COL1 to COL2 and COL2 to COL1 in lookup stage.
So there will be 4 cols col1,col2,col3,col4.
AUS,IND,IND,AUS,1,3
IND,AUS,AUS,IND,3,1
Then add Num1+Num2
AUS,IND,IND,AUS,4
IND,AUS,AUS,IND,4
Then do remove duplicate on Num.