Limit of Stage Variables in version 8

divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

What is the limit of stage variables in version 8?

I want to perform the following task:

Replace each string in a list with its defined replacement, wherever it occurs in a given string.

For example:

List:
Pedro Tom
Loc Localidad
San Man
Col Colonia

Hence the string
Loc San Pedro Arriba La Rosa Del Col de Calle
should become
Localidad Man Tom Arriba La Rosa Del Colonia de Calle


In the example above the list has only 4 entries. If the list has 200-300 entries, can this still be done using stage variables? What is the limit on the number of stage variables in QS v8.0?

I tried it with 4 strings and it works (I use a stage variable repeatedly to replace each entry in the above list). But with 50 it doesn't.



If it is not possible with stage variables, will it work in a function routine?

Is a routine with the following syntax correct for the above task?

Replace(Arg1,"Pedro","Tom")
Replace(Arg1,"Loc","Localidad")
Replace(Arg1,"San","Man")
Replace(Arg1,"Col","Colonia")
Replace(Arg1,"Entry5","replace5")
Replace(Arg1,"Entry6","replace6")
.
.
.
.
Replace(Arg1,"Entry50","replace50")
Ans=Arg1
Divya
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

We are working with more than 50 variables. What I am trying to convey is that the problem has nothing to do with the number of variables. The job takes longer to compile, that's all, but the functionality works.

There is something else in the code whereby the values held by the variables are forming a circular loop. Add another transformer to remove this circular "strain".

Regards
Sreeni
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There is no limit on the number of stage variables until you run out of memory. I have seen jobs with a couple of hundred stage variables (legitimately) that worked fine.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
JRodriguez
Premium Member
Posts: 425
Joined: Sat Nov 19, 2005 9:26 am
Location: New York City

Post by JRodriguez »

The limit on stage variables, as other posters stated, depends on your system resources, and your initial list of strings could be implemented using stage variables.

But unless you have a static source, new strings will arrive soon after the code goes into production, and for each wave of strings you would need to modify the code.

If you would like to stay with a DataStage approach, a lookup table should add flexibility: adding new strings should be fairly simple, with no code promotion needed.

If your input is one long string, as you show in your sample, the parsing capability, plus the classification and lookup tables, make QualityStage the way to go for this requirement.
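
A minimal sketch of the lookup-table idea, written in C++ only for illustration (it is not DataStage or QualityStage code). It assumes the input can be split on whitespace and that each word is matched exactly and case-sensitively. `replaceWords` and the table contents are hypothetical, with the table standing in for a reference data set that could be maintained without touching the job.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper: replace each whitespace-separated word that appears
// as a key in `table`; words that are not in the table pass through unchanged.
// Runs of whitespace in the input are collapsed to a single space.
std::string replaceWords(const std::string& input,
                         const std::unordered_map<std::string, std::string>& table) {
    std::istringstream words(input);
    std::string word;
    std::string result;
    while (words >> word) {
        auto hit = table.find(word);
        if (!result.empty()) {
            result += ' ';
        }
        result += (hit != table.end()) ? hit->second : word;
    }
    return result;
}
```

Because whole words are looked up, a replacement such as "Colonia" is never matched again against "Col", and adding an entry means adding a row to the table rather than changing code.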
Last edited by JRodriguez on Mon Aug 03, 2009 11:12 am, edited 1 time in total.
Julio Rodriguez
ETL Developer by choice

"Sure we have lots of reasons for being rude - But no excuses"
divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

Sreenivasulu wrote: We are working with more than 50 variables. What I am trying to convey is that the problem has nothing to do with the number of variables. The job takes longer to compile, that's all, but the functionality works.

There is something else in the code whereby the values held by the variables are forming a circular loop. Add another transformer to remove this circular "strain".

Regards
Sreeni

But how would one know at which stage variable the "circular strain" originates, and where to break it?

I have now tried putting in all 440 stage variables (the requirement needs 440 replacements).
I had a doubt because version 7.5 had the limitation of 20 stage variables.

I do not know why it is not working.
Divya
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

divstands wrote: I had a doubt because version 7.5 had the limitation of 20 stage variables.
Never, ever heard such a thing and stage variables came into the product well before 7.5. And sorry, can't really help with the "circular strain" question as I have no idea what that means. Never mind the fact that all you've said is when you go from 4 to 50 variables it "doesn't work". How can anyone on the other side of the glass help you when all we have to work with is "it doesn't work"? :?

Me, I would never consider doing something like this with a crapload of stage variables, this should be encapsulated into a routine of some kind. IMHO.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

divstands wrote: I had a doubt because version 7.5 had the limitation of 20 stage variables.
There has never been any limit on the number of stage variables (not counting very early versions of DataStage in which there were no stage variables at all).

Perhaps you are confusing this with the number of intermediate result variables (@1 through @20) in UniVerse I-descriptor expressions?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Based on your rather loose description it certainly looks like the best - most flexible - solution is a lookup table of some kind.

You could do it in a routine but, being a parallel job, the routine would have to be in C++ and, therefore, would not have the Replace() function (unless you created that also). So a huge dispatch table of some kind would probably form the fundamental design of your routine. Or perhaps two arrays, of old and replacement values, the first of which could be searched.

Stick with a lookup table.
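
The two-array design mentioned above might look roughly like the following sketch. It is plain C++ and ignores how a parallel routine is actually declared and linked into a job. `replaceAll` and `applyReplacements` are hypothetical names, and matching is assumed to be case-sensitive substring matching.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`, scanning left to right.
std::string replaceAll(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) {
        return text;  // an empty search string would never advance
    }
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // resume after the inserted text within this pass
    }
    return text;
}

// Apply the pairs in array order: oldValues[i] is replaced by newValues[i].
std::string applyReplacements(std::string text,
                              const std::vector<std::string>& oldValues,
                              const std::vector<std::string>& newValues) {
    assert(oldValues.size() == newValues.size());
    for (std::size_t i = 0; i < oldValues.size(); ++i) {
        text = replaceAll(text, oldValues[i], newValues[i]);
    }
    return text;
}
```

The pairs are applied in array order, so a later pair can act on text produced by an earlier one (for example "San" to "Man" followed by "Man" to some other value). The order of the two arrays therefore matters.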
divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

chulett wrote:
divstands wrote: I had a doubt because version 7.5 had the limitation of 20 stage variables.
Never, ever heard such a thing and stage variables came into the product well before 7.5. And sorry, can't really help with the "circular strain" question as I have no idea what that means. Never mind the fact that all you've said is when you go from 4 to 50 variables it "doesn't work". How can anyone on the other side of the glass help you when all we have to work with is "it doesn't work"? :?

Me, I would never consider doing something like this with a crapload of stage variables, this should be encapsulated into a routine of some kind. IMHO.
Please read the posts above for "circular strain".
Divya
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What makes you think I haven't? :?
-craig
divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

chulett wrote: What makes you think I haven't? :?
Hi Chulett

The question was valid. I tried splitting the logic across two transformers and part of it works. So, as Sreeni mentioned, it is rather an issue of the sequencing/logic of the list (the strings to be checked and replaced).

I am investigating the sequence further and will let you know where I was going wrong.
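
One way to take list order out of the picture, sketched here in C++ under the assumption that substring matching is what is wanted: scan the input once and, at each position, apply the longest matching entry, so that replacement text is never scanned again. `replaceSinglePass` is a hypothetical name, not a DataStage function.

```cpp
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<std::string, std::string>;  // (old value, replacement)

// Single left-to-right pass: output text is appended to `result` and never
// re-examined, so one replacement cannot trigger another.
std::string replaceSinglePass(const std::string& text, const std::vector<Pair>& pairs) {
    std::string result;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const Pair* best = nullptr;
        for (const Pair& p : pairs) {
            const bool matches = !p.first.empty() &&
                                 text.compare(pos, p.first.size(), p.first) == 0;
            if (matches && (best == nullptr || p.first.size() > best->first.size())) {
                best = &p;  // prefer the longest old value that matches here
            }
        }
        if (best != nullptr) {
            result += best->second;
            pos += best->first.size();
        } else {
            result += text[pos];
            ++pos;
        }
    }
    return result;
}
```

With the pairs "San" to "Man" and "Man" to "Hombre" (the second pair is invented for illustration), the input "San Man" becomes "Man Hombre": the "Man" produced by the first pair is not replaced again.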
Divya
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Perhaps so, but I can't comment on a term that someone made up and "circular strain" falls squarely into that camp. Seems pretty clear that a best guess would be that you had some kind of sequencing issue and while I sincerely doubt you had to split things up between transformers it may help "simplify" (if you can use that term with that many in use) things so you can find the remaining issue or issues.

And unfortunately, with an issue like that, you're pretty much on your own to solve it as no-one here will know precisely where you are going wrong.
-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

My take on the term is circular reference, for example where variable svOne depends on variable svFour for its definition while at the same time variable svFour depends on variable svOne for its definition. In practice this ought not to present a problem provided both have been initialized because, when svOne is being evaluated, svFour contains its value from the previous row (or its initial value for row #1).
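
That evaluation order can be modeled in a few lines. The sketch below is C++ rather than Transformer syntax, and the two derivations are invented for illustration; only the names svOne and svFour come from the description above.

```cpp
// Hypothetical model of two stage variables that refer to each other.
struct StageVars {
    int svOne = 0;   // initial value
    int svFour = 0;  // initial value
};

// Evaluate the variables for one row, top to bottom.
void processRow(StageVars& sv, int input) {
    sv.svOne = sv.svFour + input;  // svFour still holds the previous row's value
    sv.svFour = sv.svOne * 2;      // svOne already holds this row's value
}
```

Starting from initial values of 0, rows with inputs 1 and 3 leave svOne at 5 and svFour at 10: each row is well defined, even though the two variables refer to each other.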

Sreeni, what did you mean by the term circular strain?
divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

chulett wrote: Perhaps so, but I can't comment on a term that someone made up and "circular strain" falls squarely into that camp. Seems pretty clear that a best guess would be that you had some kind of sequencing issue and while I sincerely doubt you had to split things up between transformers it may help "simplify" (if you can use that term with that many in use) things so you can find the remaining issue or issues.

And unfortunately, with an issue like that, you're pretty much on your own to solve it as no-one here will know precisely where you are going wrong.
Yes, I understand. But my doubt is: will the lookup table work for partial replacement of strings?

I have seen it used where, say, DataElement=1 returns "Amsterdam" and DataElement=2 returns "Sydney", and so on, but does it work for partial replacement?
Divya
divstands
Participant
Posts: 128
Joined: Wed Jun 03, 2009 9:48 am

Post by divstands »

Also, can we use the CONVERT or EREPLACE function in combination with a lookup table?

For example, will this command work in a transformer:

CONVERT REFLOC@LOOKUP_1

where REFLOC is the field name and LOOKUP_1 is the lookup table name.
Divya