Page 1 of 2

Selecting distinct values and creating a list/array

Posted: Wed Aug 10, 2011 3:22 pm
by pdntsap
Hello,

The input is a flat file that has 'n' columns. I need to create an array/list to store the distinct values of one of the columns. I then need to execute a sequence of steps for each distinct value in the array/list. My questions are:

1. How can I create an array/list based on the distinct values of a column?
2. Can I then use a Start Loop End Loop acitvity and execute the sequence of steps for each value in the array/list?

Thanks for the help.

Posted: Wed Aug 10, 2011 3:48 pm
by ray.wurlod
1. Vertical pivot.
2. Yes. Use a "list of things" loop.

Posted: Thu Aug 11, 2011 9:51 am
by pdntsap
I have a sequential file input, then a Transformer Stage that uses stage variables to implement the pivot logic. Stage variables will be processed and output for each input row.
As an example:

Input Stage
column Variable
  • 1 1
    1 1
    2 1,2
    2 1,2
    3 1,2,3
    4 1,2,3,4
Howcan I just store the final Stage variable value (in this case 1,2,3,4) into a project level varaible so that it can be accessed in future jobs?

Thanks.

    Posted: Thu Aug 11, 2011 10:19 am
    by arunkumarmm
    Set the stage variable as user status of the job.

    Posted: Thu Aug 11, 2011 11:56 am
    by pdntsap
    Can user status be set for a parallel job? I understand that user status can be set only for server jobs.

    Thanks.

    Posted: Thu Aug 11, 2011 12:21 pm
    by arunkumarmm
    I never tried it before but cant you use a BASIC transformer and set the status?

    Posted: Thu Aug 11, 2011 3:03 pm
    by pdntsap
    Yes Arun. You are right.

    I created the following routine

    Code: Select all

    FUNCTION UserStatus(Arg) 
      Call DSSetUserStatus((Arg)) 
      Ans=Arg 
    RETURN(Ans) 
    and assigned UserStatus(StageVarn) to StageVarm. Now the problem is the routine gets called for every input row but I need it to be called only once at the very end.
    The thread below: viewtopic.php?t=100809 talks about calling it once either before or after the transform stage. I am looking into doing this but any help will be great.

    Thanks.

    Posted: Fri Aug 12, 2011 7:46 am
    by pdntsap
    Hello,

    I have the following routine

    Code: Select all

    FUNCTION UserStatus(Arg) 
      Call DSSetUserStatus((Arg)) 
      Ans=Arg 
    RETURN(Ans) 
    I would like to set the UserStatus to the final value of a Stage Variable used in the transformer. I believe I can use the After-stage subroutine in the Transformer stage. My question is how what should I choose from the drop-down list of After-stage subroutine so that

    Code: Select all

    UserStatus(StageVariable)

    gets called.

    Thanks for the help.[/code]

    Posted: Fri Aug 12, 2011 8:11 am
    by chulett
    If you have created it as a 'before/after' routine, it will show up in the drop down.

    Posted: Fri Aug 12, 2011 8:19 am
    by pdntsap
    But from the thread below:
    viewtopic.php?t=107023
    I thought that the routine must be created of type 'transform function'.

    I changed it to type 'before/after routine' but it does not return any value.

    Thanks.

    Posted: Fri Aug 12, 2011 9:44 am
    by chulett
    For one used in a field derivation, yes. For use before/after use you must create it as such... and it doesn't need to "return" anything other than success, it just needs to put a value in USERSTATUS.

    Also realize that it uses a hashed structure to store it, so it can only store one value and it's the normal 'last one in wins' rule as with hashed files, so no matter how many values you send all you'll end up with is the last one.

    Posted: Fri Aug 12, 2011 11:52 am
    by pdntsap
    I created the routine below as before/after subroutine

    Code: Select all

    SUBROUTINE UserStatus(InputArg,ErrorCode) 
      Call DSSetUserStatus((InputArg)) 
      ErrorCode = 0
      Return
    
    Now UserStatus shows in the before/after stage routine of the transform stage. I seLECT UserStatus in the After-stage routine and pass input values as StageVarn, Errornum. But the value of StageVarn is not being passeD to the subroutine but the literal text StageVarn. Please let me know what I am doing wrong here.


    I remember reading in the forum about the 'last one in wins' issue and to overcome that I believe I need to store the values using a delimiter and then read the individual values using the filed() function. But that is a challenege is for another day I guess.

    Posted: Fri Aug 12, 2011 1:28 pm
    by chulett
    What is "StageVarn"? Realize that after stage (or after job) all you'll have access to are job parameters, anything else is out of scope. If you need to pass out processing or derived data, you'll need to go back to a transform function inside a Transformer.

    Posted: Fri Aug 12, 2011 2:13 pm
    by pdntsap
    StageVarn was supposed to represent Stage variables, for example, StageVar1, StageVar2. Sorry that I was not more clear.

    So, like you said,if I need the stage variables to be stored and passed onto subsequent jobs/stages I need to go back to a Transform function and assign the transform function to a stage variable in a Transformer stage which will result in the transform function being exected for each input row.

    Thanks.

    Posted: Fri Aug 12, 2011 4:42 pm
    by chulett
    Correct... unless you can conditionally make the call. I've done something similar in the past where I call it once when the first record comes through and then never again as I know it would be a waste of time. To do this on the last record, however, would be a little more problematic unless you are running 8.5 which (from what I've read) has a new system variable (@EOD?) that is set to true on the last record.