Hi All,
This is something wierd. I need to compare two columns having character strings. The problems is, If they match 75% or more then change them to same as first column. If they match less than 75% then keep them as they are. I was wondering how will this be possible in Datastage. Any help will be appreciated.
Thanks in advance.
Sylvan
Comparing columns
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 47
- Joined: Wed Apr 12, 2006 12:13 pm
Comparing columns
sylvan rydes
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Define "match by 75%". Sounds like the kind of probabilistic match that QualityStage (perhaps invoked through a QualityStage plug-in stage in your DataStage job) is ideally suited to do.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 47
- Joined: Wed Apr 12, 2006 12:13 pm
Hi Ray,
Thanks for your reply. Well...I know that it is possible with quality stage. Even I have a PL/SQL function ready to do that. But just want to know if it could be done using Datastage in a simple way(using some built in function).
Now for example we have Examination type and Exam type. Both are same so must be replaced with the first one.
Thanks again.
Sylvan
Thanks for your reply. Well...I know that it is possible with quality stage. Even I have a PL/SQL function ready to do that. But just want to know if it could be done using Datastage in a simple way(using some built in function).
Now for example we have Examination type and Exam type. Both are same so must be replaced with the first one.
Thanks again.
Sylvan
sylvan rydes
You can probably write a basic routine. Do a character by character comparison. If they match store it in one variable and if it does not, store it in another. Get the size of both these variables and mainpulate the percentage to get the percentage of match and return the appropriate value.
Reference this routine in a stage variable. Pass Col1 as Arg1 and Col2 as Arg2. And in the derivation of the second column check if the stage variable is @True put Col1 else Col2. That routine was off of my head. It needs testing. You can also add code to handle null values and empty columns.
Code: Select all
FUNCTION PercentileMatch(Arg1,Arg2)
in.Col1=trim(Arg1)
in.Col2=trim(Arg2)
Matched = ''
If Len(in.Col1) >= Len(in.Col2)
Then
MaxLen=Len(in.Col1)
End
Else
MaxLen=Len(in.Col2)
End
For i=1 To MaxLen
If in.Col1[i,1] = in.Col2[i,1]
Then
Matched:=in.Col1[i,1]
End
Next i
Matched.Percentage=(Len(Matched)/MaxLen)*100
If Matched.Percentage >= 75
Then
Ans=@True
End
Else
Ans=@False
End
Last edited by DSguru2B on Thu Dec 28, 2006 8:48 am, edited 3 times in total.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
-
- Participant
- Posts: 47
- Joined: Wed Apr 12, 2006 12:13 pm