Happy Thanksgiving, guys,
In one of my jobs I am getting 50 rows for each store/date combination into a Transformer. I have to check the 50 rows against one condition, and I can output them only if 80% of the 50 rows satisfy that condition. The problem is that I cannot do this with a single Transformer stage: I can only decide whether to output the 50 rows after I have read all of them, so I would have to buffer the rows instead of writing them straight to the output. I need suggestions on how to use the Transformer stage for this purpose. If anyone can suggest a better approach, I can change the logic.
Thank you,
Ravi
how to check the set of 50 rows
-
- Participant
- Posts: 38
- Joined: Thu Nov 04, 2004 10:36 am
-
- Premium Member
- Posts: 66
- Joined: Wed Mar 05, 2003 6:03 pm
- Location: Brisbane, Australia
- Contact:
Hi Ravi,
I would suggest you do this as a two-part process. The first part takes your input data, passes it through a Transformer, and gives a rank/weight to each record based on your condition. Then load this rank along with your keys (store/date) into a UV table.
The second part again takes your input data into a Transformer, this time with a lookup to the pre-loaded UV table. Use a group by on the UV table with a sum on the rank to get the total rank for each store/date combination. Within the Transformer, match your input data to the lookup data and place a constraint on the output link that only passes rows through if the total rank from the lookup is >= 80% (for example, at least 40 out of 50 if each row that satisfies the condition gets a rank of 1).
If you have a large volume of data you may find the lookup is slow, so it would be worthwhile to aggregate the data before you load the lookup.
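To make the two-part idea concrete, here is a minimal sketch in Python rather than DataStage. It only illustrates the logic: the dictionary stands in for the UV lookup table, and the `sales` column and the `condition` function are made-up placeholders for whatever your real check is.

```python
from collections import defaultdict


def condition(row):
    # Hypothetical condition (an assumption): a row passes when sales > 0.
    return row["sales"] > 0


def build_lookup(rows):
    """Part 1: per (store, date), count passing rows and total rows.

    The returned dict plays the role of the pre-loaded lookup table.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [passed, seen]
    for row in rows:
        key = (row["store"], row["date"])
        totals[key][0] += 1 if condition(row) else 0
        totals[key][1] += 1
    return totals


def filter_rows(rows, lookup, threshold_pct=80):
    """Part 2: re-read the input and keep rows whose group meets the threshold."""
    out = []
    for row in rows:
        passed, seen = lookup[(row["store"], row["date"])]
        # Integer comparison avoids floating-point rounding on the 80% test.
        if passed * 100 >= threshold_pct * seen:
            out.append(row)
    return out


if __name__ == "__main__":
    rows = [{"store": "A", "date": "d1", "sales": s} for s in (1, 2, 3, 4, 0)]
    rows += [{"store": "B", "date": "d1", "sales": s} for s in (1, 2, 3, 0, 0)]
    kept = filter_rows(rows, build_lookup(rows))
    print(sorted({r["store"] for r in kept}))
```

Because the lookup is already aggregated to one entry per store/date, the second pass does a single cheap lookup per row, which is the point of aggregating before loading.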
Sorry if this isn't what you were after, but at least it should give you another way of approaching the problem.
Cameron Boog
-
- Participant
- Posts: 38
- Joined: Thu Nov 04, 2004 10:36 am
can we do it without storing
Hi, thanks for your reply.
Can we do the job without storing the data in a database? That takes a lot of time.
Input -> Transformer -> Sort -> Transformer -> Output.
First transformer = use a counter method (search the forum for it) that handles your condition. Pass the counter value through appended to each record.
Sort the data so that, within each key, the row with the maximum counter value comes first.
Second transformer would handle the logic of "if this key group has a counter value above the threshold, trigger this behavior (send the record to output 1 using a constraint)."
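A rough Python sketch of that Transformer -> Sort -> Transformer flow, just to show the logic. The column names, the `condition` argument, and the fixed `group_size` of 50 are assumptions for illustration; `prev_key` and `keep` play the role of stage variables in the second transformer.

```python
def add_running_counter(rows, condition):
    """First transformer: append a running per-key count of passing rows."""
    counts = {}
    out = []
    for row in rows:
        key = (row["store"], row["date"])
        counts[key] = counts.get(key, 0) + (1 if condition(row) else 0)
        out.append({**row, "counter": counts[key]})
    return out


def filter_groups(rows, condition, group_size=50, threshold_pct=80):
    counted = add_running_counter(rows, condition)
    # Sort stage: by key, with the highest counter first inside each key.
    counted.sort(key=lambda r: (r["store"], r["date"], -r["counter"]))

    # Second transformer: the first row of each key carries the group's
    # final count, so the keep/drop decision is made once per key.
    out = []
    prev_key, keep = None, False
    for row in counted:
        key = (row["store"], row["date"])
        if key != prev_key:
            keep = row["counter"] * 100 >= threshold_pct * group_size
            prev_key = key
        if keep:
            out.append(row)
    return out
```

Note that the rows come out in sorted order (highest counter first within each key), not in their original order, and each row still carries the extra `counter` column unless you drop it on the output link.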
Good luck.