How to do iterative lookups

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

How to do iterative lookups

Post by tonystark622 »

Hi all,

I've got a situation where I need to calculate the number of parts in a product. The product consists of a number of assemblies. Each assembly may consist of other assemblies, and each of these assemblies will consist of one or more parts.

From the part perspective I have a quantity per assembly (QPA), and I know which part or parts are this part's next higher assembly (NHA).

What I've been told I need to do is: for this part, take its QPA, multiply it by its NHA's QPA, then multiply that by the next higher assembly's NHA QPA, and repeat until I arrive at the top assembly (no more next higher assemblies). I have no idea how many levels deep these can be, and the number of levels between any given part and the top assembly varies, so I can't just string a fixed number of transformers and lookup stages together and be sure I've covered them all.

I have no idea how to do this without coding it procedurally in a DS routine, and I would really rather avoid coding this in a Routine. Does anyone have any suggestions on how I might accomplish this?

Thanks for your help,
Tony
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry, not going to help much here I'm afraid, but...

I don't recall what DB you have, but can't you do this through SQL? Some sort of recursive or hierarchical query... heck, I'm trying to remember some of the odd SQL syntax you typically only use in a test. [:I] How many tables are involved here? Only one? [thinking self-join] More than one?
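If memory serves, the Oracle flavour of that is a CONNECT BY hierarchical query rather than a recursive union. A rough, untested sketch, assuming a single hypothetical table PART_BOM with PART_ID, NHA_ID and QPA columns (NHA_ID null at the top assembly), so adjust it to whatever your table really looks like:

    -- walk from one part up through its NHAs to the top assembly
    SELECT LEVEL, part_id, nha_id, qpa
      FROM part_bom
     START WITH part_id = :the_part
    CONNECT BY PRIOR nha_id = part_id;

SQL has SUM but no PRODUCT aggregate, so to roll the QPAs up into one number you'd wrap that in something like EXP(SUM(LN(qpa))), which works as long as QPA is always positive.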

Failing that, in Oracle I would think you could write a PL/SQL package or function, use it in a lookup, and have it return the rolled-up value you are looking for.
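A bare-bones sketch of that idea, again against the hypothetical PART_BOM table above and assuming each part has only one parent row (completely untested):

    -- climbs from the given part to the top assembly, multiplying QPAs as it goes
    CREATE OR REPLACE FUNCTION rolled_up_qpa (p_part_id IN part_bom.part_id%TYPE)
       RETURN NUMBER
    IS
       v_part  part_bom.part_id%TYPE := p_part_id;
       v_nha   part_bom.nha_id%TYPE;
       v_qpa   part_bom.qpa%TYPE;
       v_total NUMBER := 1;
    BEGIN
       LOOP
          SELECT qpa, nha_id
            INTO v_qpa, v_nha
            FROM part_bom
           WHERE part_id = v_part;

          v_total := v_total * v_qpa;
          EXIT WHEN v_nha IS NULL;   -- no more next higher assemblies
          v_part  := v_nha;
       END LOOP;

       RETURN v_total;
    END rolled_up_qpa;
    /

You could then call it in the SQL behind your lookup, e.g. SELECT part_id, rolled_up_qpa(part_id) FROM part_bom.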

I'm not sure how you'd do this inside DataStage. I usually try to fall back to the database, if at all possible, for stuff like this.

-craig
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Thanks, Craig.

We're using Oracle. I'm not very experienced with Oracle, but the suggestions you've made make good sense to me, and I'll look into them. It is only one table (I think). A function or stored procedure sounds like a likely solution.

I guess I was in the "if all I have is a hammer, everything looks like a nail" mindset. [:D]

Have a good weekend.

Tony
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

The Oracle procedure is a good option; the major drawback is that fumbling your way through PL/SQL will be time consuming.

There is a way of doing it within a DataStage server job which is not pretty, but it is very sneaky and it requires no code! First you have to find the longest chain so you know how many levels of NHA you can have. Then you create a server job with enough transformers to reach the top of that chain.

You start with an input stage pulling in the parts list. This input is fed to a chain of transformers, let's say there are 10 of them. Each transformer has a lookup retrieving the next NHA and the QPA for that NHA. If an NHA is found, it multiplies the lookup QPA by the passed-in QPA, so the value grows with each transformer. If you reach a transformer where no NHA is found, you exit the chain, write out the record and move on to the next row.

The thing that makes it maintainable is the rather nice feature of shared containers that lets you add multiple copies of a container to a job. First create the job with just one transformer with two output links leading to your database/file output stage (as though you only had one level): one output for when an NHA is found and one for when it is not. Now highlight the transformer, the lookup and the "NHA not found" output stage and turn them into a shared container. Then find that shared container in your repository browser and drag a few copies of it onto your job. Join them together in a chain, with the last one leading to an output stage. This job should now process each input row and output a calculated QPA result.

Obviously if you have 100 levels of assemblies to traverse then it starts to look a bit silly and can take too long to process. If a row passes through your chain and out the other end without reaching the top, you don't have enough transformers, and you should produce a warning or error for that row.

Vincent McBurney
Data Integration Services
www.intramatix.com
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Hi Vincent!

I am considering lookup stages for each level of NHA. I hadn't considered shared containers (I've never made one work in v6.0). That's something to think about... I'm not sure of the maximum number of levels at this point. If I can get my users and analysts to admit to a reasonable number, this may be the solution I end up with. As I said before, I'm not looking forward to coding this procedurally.

I've been seeing some weird behavior when I group stages in a job and create a shared container: when I open the container, nothing's there. Is this normal? The stages I selected before creating the shared container disappeared from the job and a container stage appeared, but the input and output links that were originally connected to the selected stages aren't connected to anything any more, and the shared container itself is empty when I open it.

I appreciate your time and your suggestions.

Tony
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I'd get onto Ascential technical support about that one; it does not sound right at all. Sounds like your shared container is a shared black hole. Try to reproduce it in a simpler scenario, for example a job where a sequential file feeds one transformer that writes to another sequential file, with just one field. Turn the transformer stage into a shared container and see if the same thing happens. If it does, you have a problem somewhere in your server installation.

Vincent McBurney
Data Integration Services
www.intramatix.com
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Thanks. As soon as I have a minute, I'll look into it.

Tony
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Tony,

Do you have a way to identify the bottom-level parts? Does each part row in your DB have only one parent (NHA), leading to multiple rows for parts that belong to multiple assemblies?

If you can say yes to both of those questions, then the logic is fairly simple. Initialize the QPA for all base-level parts to one (1) and create a sequential file unload of your table. Create a DS job which reads this sequential file and does a lookup to your table, then splits the stream into zero (0) and non-zero QPA streams. The zero-QPA stream should be written to a sequential file (to be used by subsequent iterations of this DS job). The non-zero-QPA stream should go into another transformer lookup to your table; this lookup should find the keys to the parent part/assembly and add (not multiply) the current stream QPA to the parent part/assembly QPA. Pass the parent key and the new QPA to an update of the part table, and update the parent row. Now create a looping structure in your job control which executes this update process and then renames the zero-QPA sequential file to the name of the original file. Have it run this loop until the sequential file is empty.


This is just a quick suggestion and has not been tested. I'm not sure about timing issues with doing the lookup and the update against the same table; you may have to expand on this to cover such problems.

Hopefully this might get you on the path.

Steve
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Hi Steve,

Thanks for the reply.

Unfortunately, each part can have multiple next higher assemblies. And at this point, at least, I'm not sure how to tell whether I have a bottom-level part or not.

I'll read through your post a couple of times and make sure that I understand it, to see if I can adapt it to my situation.

Thanks for your time.

Tony
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Tony,
If you have no way of telling when you are at the bottom level (and able to look up) or at the top level (and able to look down), then you can't easily do what you're trying to do. Cartesian product?

Steve
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Tony,
Sorry to keep rambling on here, but this might help.

How about establishing your bottom-level parts by selecting all parts which are not within a sub-select of parent parts? That is, obtain a list of all parts which are not the parent of any other part.
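In SQL that would look something like this, again using hypothetical PART_BOM / PART_ID / NHA_ID names, so adjust them to your real table:

    -- bottom-level parts: parts that never appear as anyone's NHA
    SELECT DISTINCT part_id
      FROM part_bom
     WHERE part_id NOT IN (SELECT nha_id
                             FROM part_bom
                            WHERE nha_id IS NOT NULL);  -- the NOT NULL keeps NOT IN honest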

Then try what I listed above.

Steve
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Well, as of a few minutes ago, the requirement to calculate the fully rolled-up QPA for the entire product was removed. Apparently the data is incomplete, so all they want is the current part QPA * the NHA QPA. I can do that.

For the archives: I decided to take the original part info and put it in a hash file with the original QPA. I would also write some of the original part info, the original part QPA, and which NHA part to look for next into a flat file.

A separate job (Job B) would read the flat file, do the NHA lookup, and update the original part row in the hash file with the running product of QPA * NHA QPA, then write the original part and which NHA part to look for next to a new flat file. A batch job or job sequence would look at this file and run Job B repeatedly until no more rows needed the NHA lookup (that is, until no rows were written to the file). Once no more NHA lookups were needed, the batch job or job sequence would run the job that finished the processing (Job C).

Sorry if this is confusing. I hadn't nailed down all the implementation details when they changed the requirement, so this is as far as it goes for now.

Hope this helps someone,
Tony
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Quick, deliver it before they change their minds. Once again crap data comes to the rescue.

Vincent McBurney
Data Integration Services
www.intramatix.com
tonystark622
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

Too late! They put it back in... kinda.

I'm going to have to study this and make sure I know EXACTLY what they want before I start coding...

Tony