Zip_Code, is obtained by expanding the zip-code range starting from the MIN_ZIP_CODE, incrementing it by 1 until the MAX_ZIP_CODE value is reached.
Any idea of how this can be achieved? My brain's half dead. Any ideas/suggestions/pointers are greatly appreciated.
Thanks,
Whale.
Last edited by I_Server_Whale on Mon Jan 14, 2008 9:37 pm, edited 2 times in total.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
4 not 14? The reason is already in original post: "brain's half dead".
External filter stage that runs a shell script to iterate through the values is one way, if you're comfortable with scripting.
Build stage to iterate through the values is another way, if you're comfortable with C++ coding.
There's nothing out of the box for generating multiple ranges of rows.
With only one range you could, of course, use a Row Generator stage.
There is an easy solution using a pair of jobs, a server job calling UtilityRunJob and passing the limits and a parallel job invoked from it that uses a Row Generator stage.
Just some ideas.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Oops!! Sorry, Folks. Had a long day. Yes. It should have been 04 and not 14. I don't think range lookup will be of help.
I have corrected the previous post.
Thank you, Ray!. I will try your suggestions. What about using stage variables to that effect? Read in the two columns into two stage variables and generate records based on the logic and send them to the output link.
Just wondering. But thanks again for your valuable ideas. Will try them and get back to you.
Best Regards,
Whale.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
If you can tell me how to generate an arbitrary number of rows out of a Transformer stage for each row in, I'd be happy to know it.
There is a kluge where you use - typically - a routine to generate a long string with a line terminator between each generated "output row", and output that to a sequential file so that the line terminators turn into, umm..., line terminators, but that's just a kluge and you would still need a separate job to read the sequential file (since "blocking" - passive stage with input and output links - is expressly forbidden in parallel jobs.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I had a similar issue few weeks ago. Here is how I got around the problem.
1. Take the source file in to a transform. In the transform add a field call dummy key, place a constant value in it.
2. Use the row generator with a counter say 1 to 100. The output of the stage will have the counter plus a dummy key field with the same value as in step one.
3. Take the output of both 1 and 2 and join them together in a join stage. Each source file record will now have 100 records.
4. Send the output of the join stage into another transform to removed all the extra fields (dummy key, counter) and possible remove any records you don't want in the final output.
I had a similar issue few weeks ago. Here is how I got around the problem.
1. Take the source file in to a transform. In the transform add a field call dummy key, place a constant value in it.
2. Use the row generator with a counter say 1 to 100. The output of the stage will have the counter plus a dummy key field with the same value as in step one.
3. Take the output of both 1 and 2 and join them together in a join stage. Each source file record will now have 100 records.
4. Send the output of the join stage into another transform to removed all the extra fields (dummy key, counter) and possible remove any records you don't want in the final output.
That technique would work, but it's wasteful, and would require a priori knowledge of the maximum number of rows that might need to be generated from any one range (in your case 100 was sufficient - or was it ?!! How would you know (in all cases)?).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod wrote:That technique would work, but it's wasteful, and would require a priori knowledge of the maximum number of rows that might need to be generated from any one range (in your case 100 was sufficient - or was it ?!! How would you know (in all cases)?).
Exactly! ATBolton's solution wouldn't work as Ray pointed out. Kumar's solution points to Server solution which I'm trying to avoid. I'm on my way to building a build-op (one of Ray's suggestion).
Will let you know how it goes!
Thanks for all the help rendered! Appreciate it!
Whale.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
Not an excellent way of doing it but would like to suggest an option below. else i thibnk you have to go for custom stage.
1> Create a file in O/S (UNIX) to be used as lookup with layout as below.
count, records
1,1
2,1
2,2
3,1
3,2
3,3
and so on
2> Use this file as multiple lookup to generate your output. i.e.
a> In your original file , generate a new column call record_count = max_zip_code - min_zip_code
b> Do multiple lookup of record_count(source) with count(step 1 lookupfile)
santosh wrote:Not an excellent way of doing it but would like to suggest an option below.
Santosh,
Can you explain why a "Build-Op" is not an excellent way of doing it compared to your round about method?
Thanks,
Whale.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE