Page 1 of 2

stage variables

Posted: Tue Apr 04, 2006 11:08 am
by ds1user1
Hi can someone help me in defining stage variables for a simple logic.

I am having the input as:

No Pos
1 A
1 A
1 A
1 A
2 B
2 C
3 D
3 D
3 D
4 C

And i need o/p to be

No Pos
1 A
1 A
1 A
1 A
3 D
3 D
3 D
4 C

Reject
2 B
2 C

I am pretty new to datastage.Anyhelp is appreciated.

Thanks
siri

Re: stage variables

Posted: Tue Apr 04, 2006 12:00 pm
by deployDS
Please explain your logic. What is the basis for rejecting those 2 records?

Re: stage variables

Posted: Tue Apr 04, 2006 12:10 pm
by ds1user1
deployDS wrote:Please explain your logic. What is the basis for rejecting those 2 records?
I am rejecting those two rows because the POS is not same for that particular ID.

For the given ID the positions should be same orelse reject that complete set.

Thanks

Posted: Tue Apr 04, 2006 12:29 pm
by pavankvk
i have another solution for this problem.i will not use stage variables though.

Since u said ur input will have same IDs and Pos and u need to filter out the odd ones, i suggest you use remove duplicate stage with keys id and pos to identify all the records that you want to keep. Collect them into a dataset.

Now take ur original input and do a left join and check for records which are in the input and not in ur desired dataset. that eliminates the odd records from ur file and u can have a final output with the desired data.

hth

Posted: Tue Apr 04, 2006 12:58 pm
by ds1user1
pavankvk wrote:i have another solution for this problem.i will not use stage variables though.

hth
Its not working that way pavan.

Thanks

Posted: Tue Apr 04, 2006 1:02 pm
by pavankvk
ds1user1 wrote:
pavankvk wrote:i have another solution for this problem.i will not use stage variables though.

hth
Its not working that way pavan.

Thanks
i am sorry..i will give another logic..even i realised it just now..thats not duplicate stage,i need to use aggregator

Posted: Tue Apr 04, 2006 1:09 pm
by pavankvk
Can you try this.

use an aggregator on id to filter out the single occurances.since they will be part of the final out put.

this identifies records like 4,C

now filter out these unique occurances from ur input and then use aggregator again on id and pos and to get records with count=1.this identifies the records like

2,B
2,C

Now for the final output concat 4,C records and records with ID,POS count > 1.

that shud give the final output.

Posted: Tue Apr 04, 2006 2:56 pm
by pavankvk
i tested this and its working..it may not be a elagant solution..but i go with it because i dont like to use stage variables.

Posted: Tue Apr 04, 2006 3:16 pm
by ds1user1
pavankvk wrote:i tested this and its working..it may .

Hi Pavan,

I am having like 20000 rows in the input and like 30 columns.The method you had shown me seems to be little bit complicated thats y i preferred asking for stage variables.Anyway can u just give me some more idea like while aggregating what should be my key and how to filter the data.

Thanks

Posted: Tue Apr 04, 2006 3:17 pm
by ray.wurlod
An approach using stage variables may be found here

Posted: Tue Apr 04, 2006 3:20 pm
by ds1user1
ray.wurlod wrote:An approach using stage variables may be found here
I am unable to see the replies.

Thanks

Posted: Tue Apr 04, 2006 3:41 pm
by pavankvk
ok,

give me the details like ur key columns..better give the lay out and tell me what u r trying to do..

its not complicated..if u can give details,i can explain step by step

Posted: Tue Apr 04, 2006 3:55 pm
by pavankvk
Here is ur file

ID,POS
1,A
1,A
1,A
1,A
2,B
2,C
3,D
3,D
3,D
4,C

Now to identify single occurence records like 4,C use an aggregator stage.
In the stage specify the
group as ID
Aggregation type: count rows

give some name to output column..say ID_Count

Have a transformer after the aggregator and check for ID_count=1 and ID_Count > 1 in the constraint. u get 2 links as output from transformer.

Take the link with id_count=1 and do a left outer with the input. Left being input stream.You can use a copy stage to get ur input stream again for join.
After the left outer join,u can check for null value in the non key columns of the right stream and then output the non-null stream to a funnel.----> 1

take the null stream to a aggregator again and this time, ur keys are ID,POS.
now do the same as u did earlier..check for count=1 and count>1

This time count =1 will be 2,b and 2,c.

Inner Join the count>1 with the input to get all the records ----> 2

Funnel 1 and 2 to get the final output and collect the rejects in another file.

Posted: Tue Apr 04, 2006 3:56 pm
by ds1user1
pavankvk wrote:ok,i can explain step by step
Thanks pavan

I am having 30 columns.Out of these,i am mainly concerned with 2 columns,but in output i should have all these columns.

ID is the main key column.

POS is the column on which my transformation depends.

I am having multiples rows for the ID like he use to hold different POS in the company.

I need to check that for a particular ID there should be only one POS allotted.

if any ID is assigned with more than one POS then i had to capture these rows in the Reject file.

Here for the ID=1 the POS in all the rows is A thats y it is there in the output.Whereas for ID=2 the POS is B and C which are different so i had to reject this whole set.

I think this is clear.

Thanks

Posted: Tue Apr 04, 2006 4:00 pm
by ray.wurlod
ds1user1 wrote:
ray.wurlod wrote:An approach using stage variables may be found here
I am unable to see the replies.

Thanks
For less than $1 per week (which goes towards paying for this site's bandwidth) you can.