Sort String
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 7
- Joined: Wed Oct 10, 2007 1:03 pm
Sort String
Hello,
I have a requirement where I need to do a lookup on a field. Before doing the lookup I need to sort the incomming data.
I need to sort it in the ascending order before doing the lookup (The hash file has the values in the sorted manner)
Texas,New York,California --> California,New York,Texas
Vermont,Alaska,New Mexico, New Jersey --> Alaska,New Jersey,New Mexico, Vermont.
Should I write a routine to do this or Can I do it on the transformer?
Please guide..
I have a requirement where I need to do a lookup on a field. Before doing the lookup I need to sort the incomming data.
I need to sort it in the ascending order before doing the lookup (The hash file has the values in the sorted manner)
Texas,New York,California --> California,New York,Texas
Vermont,Alaska,New Mexico, New Jersey --> Alaska,New Jersey,New Mexico, Vermont.
Should I write a routine to do this or Can I do it on the transformer?
Please guide..
-
- Premium Member
- Posts: 503
- Joined: Wed Jun 29, 2005 8:14 am
Re: Sort String
What is the source of your data? If the data is coming from a database you can put order by in the data source.
Thanks
Deepak Patil
Convince Them Confuse Them .. What's the difference?
Deepak Patil
Convince Them Confuse Them .. What's the difference?
-
- Participant
- Posts: 7
- Joined: Wed Oct 10, 2007 1:03 pm
-
- Premium Member
- Posts: 503
- Joined: Wed Jun 29, 2005 8:14 am
-
- Participant
- Posts: 7
- Joined: Wed Oct 10, 2007 1:03 pm
-
- Charter Member
- Posts: 822
- Joined: Sat Sep 17, 2005 5:25 pm
- Location: USA
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Re: Sort String
No it doesn't.datastager wrote:Hello,
I have a requirement where I need to do a lookup on a field. Before doing the lookup I need to sort the incomming data.
I need to sort it in the ascending order before doing the lookup (The hash file has the values in the sorted manner)
Storage in hashed files ("Hashed", you will notice, not "hash") is entirely governed by the hashed file's hashing algorithm.
The hashing algorithm determines the physical location of a record based upon its key value. You do not need to sort anything. A single I/O operation will retrieve any record from a hashed file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 7
- Joined: Wed Oct 10, 2007 1:03 pm
Sorry if i did not put forth my question clearly.
Actually I need to sort data inside a field, not the field itself.
Texas,New York,California Needs to be sorted to California,New York,Texas
and
Vermont,Alaska,New Mexico, New Jersey Needs to be sorted to Alaska,New Jersey,New Mexico, Vermont
and so on....
Actually I need to sort data inside a field, not the field itself.
Texas,New York,California Needs to be sorted to California,New York,Texas
and
Vermont,Alaska,New Mexico, New Jersey Needs to be sorted to Alaska,New Jersey,New Mexico, Vermont
and so on....
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You need a routine. It's not hard; an insert sort will do it.
Code: Select all
FUNCTION SortList(TheList, Delimiter)
* Change to undefine to perform unique sort (remove duplicates).
$DEFINE PreserveDuplicates
* Validate input arguments.
If UnAssigned(TheList) Or UnAssigned(TheDelimiter) Then RETURN(@NULL)
If IsNull(TheList) Or IsNull(TheDelimiter) Then RETURN(@NULL)
Ans = ""
Temp = Convert(Delimiter, @FM, (TheList))
Loop
* Get next element from dynamic array.
Remove Element From Temp Setting MoreElements
Locate(Element, Ans; WhereFound; "AL")
$IFDEF PreserveDuplicates
Then
Ins Element Before(Ans<WhereFound>)
End
$ENDIF
Else
Ins Element Before(Ans<WhereFound>)
End
While MoreElements
Repeat
* Reinstate the original delimiter character.
Convert @FM To Delimiter In Ans
RETURN(Ans)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 7
- Joined: Wed Oct 10, 2007 1:03 pm
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I made it up sitting in an airport lounge. What specific warnings did you get? How did you try to create the routine? Irrespective, if does have some minor errors. Here's a working one.
Code: Select all
FUNCTION SortList(TheList, TheDelimiter)
* Change to undefine to perform unique sort (remove duplicates).
$DEFINE PreserveDuplicates
* Validate input arguments.
If UnAssigned(TheList) Or UnAssigned(TheDelimiter) Then RETURN(@NULL)
If IsNull(TheList) Or IsNull(TheDelimiter) Then RETURN(@NULL)
Ans = ""
Temp = Convert(TheDelimiter, @FM, (TheList))
Loop
* Get next element from dynamic array.
Remove Element From Temp Setting MoreElements
Locate(Element, Ans; WhereFound; "AL")
Then
$IFDEF PreserveDuplicates
Ins Element Before Ans<WhereFound>
$ENDIF
End
Else
Ins Element Before Ans<WhereFound>
End
While MoreElements
Repeat
* Reinstate the original delimiter character.
Convert @FM To TheDelimiter In Ans
RETURN(Ans)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 7
- Joined: Wed Oct 10, 2007 1:03 pm
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I assume that the statements that idiot-proof the arguments are OK.
The main logic converts the delimited string to a dynamic array (which in one sense is just a delimited string with a different delimiter, but one for which there are some useful tools). The delimiter I used is a field mark, which may be represented by the system variable @FM.
The Remove statement takes the next element from a dynamic array. This is executed in a loop, processing (in your case) one city name at a time.
As each city name is found, it is compared with the result (which was initialized to an empty list). The result is asserted to be sorted in ascending left-justified order, by the "AL" argument in the Locate statement. The Locate statement returns the field number within the Ans dynamic array where it stopped searching - that is, the next field number past where the city name was found or ought to have been found.
The city name is then inserted into that location, so that the Ans dynamic array remains sorted. (That's why it's called an insertion sort.)
Finally, once all cities have been processed, the field marks in the Ans dynamic array are converted back to the original delimiter.
Consult the DataStage BASIC manual or on-line help for more information on each of the statements used.
The main logic converts the delimited string to a dynamic array (which in one sense is just a delimited string with a different delimiter, but one for which there are some useful tools). The delimiter I used is a field mark, which may be represented by the system variable @FM.
The Remove statement takes the next element from a dynamic array. This is executed in a loop, processing (in your case) one city name at a time.
As each city name is found, it is compared with the result (which was initialized to an empty list). The result is asserted to be sorted in ascending left-justified order, by the "AL" argument in the Locate statement. The Locate statement returns the field number within the Ans dynamic array where it stopped searching - that is, the next field number past where the city name was found or ought to have been found.
The city name is then inserted into that location, so that the Ans dynamic array remains sorted. (That's why it's called an insertion sort.)
Finally, once all cities have been processed, the field marks in the Ans dynamic array are converted back to the original delimiter.
Consult the DataStage BASIC manual or on-line help for more information on each of the statements used.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hey Datastager,
Just out of curiosity: how long is it taking the code Ray provided for you to order the information for you? I know I have added this code to my collection of interesting bits and I'm just wondering how it compares to an external call using sed, awk or Perl.
As usual, Ray Rocks!
Bestest!
Just out of curiosity: how long is it taking the code Ray provided for you to order the information for you? I know I have added this code to my collection of interesting bits and I'm just wondering how it compares to an external call using sed, awk or Perl.
As usual, Ray Rocks!
Bestest!
Bestest!
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
John Miceli
System Specialist, MCP, MCDBA
Berkley Technology Services
"Good Morning. This is God. I will be handling all your problems today. I will not need your help. So have a great day!"
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
f(kN), where k is some constant and N is the number of items in the delimited list. That is, time taken is a linear function of number of items, for sufficiently small items (such as city names).
Remember that this function is sorting a delimited list of city names for each record processed - it is not sorting the file.
Remember that this function is sorting a delimited list of city names for each record processed - it is not sorting the file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.