Transformer loop, cache or anything else

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
deps1972
Premium Member
Premium Member
Posts: 12
Joined: Thu Sep 16, 2010 10:42 pm

Transformer loop, cache or anything else

Post by deps1972 »

Hi Everyone,

I have complex case of transactions where I have to match sale to the purchase transactions by say starting with a sale transaction and compare to the purchase transaction for units sold . Based on the units sold GT or LT units purchased I have to split the sale or the corresponding buy transactions. For Example

Code: Select all

"contract_id"	"asset_id"	"tran_type_flag"	"process_orderKey"	"salelog_unitssold"	"allocatedSaleUnits"	"sale_balance"	"parcel_units"	"allocatedPurUnits"	"pur_balance"
"IH00049955"	"IP00000722"	"SALE"	"1"	"4500.0000000"	"0.0000000"	"0.0000000"	\N	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"SALE"	"2"	"7690.0000000"	"0.0000000"	"0.0000000"	\N	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"SALE"	"3"	"7423.1898000"	"0.0000000"	"0.0000000"	\N	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"1"	\N	"0.0000000"	"0.0000000"	"449.0203000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"2"	\N	"0.0000000"	"0.0000000"	"603.9952000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"3"	\N	"0.0000000"	"0.0000000"	"499.6607000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"4"	\N	"0.0000000"	"0.0000000"	"470.5078000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"5"	\N	"0.0000000"	"0.0000000"	"7398.2733000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"6"	\N	"0.0000000"	"0.0000000"	"2129.2241000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"7"	\N	"0.0000000"	"0.0000000"	"1868.8725000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"8"	\N	"0.0000000"	"0.0000000"	"1463.5798000"	"0.0000000"	"0.0000000"
"IH00049955"	"IP00000722"	"PURC"	"9"	\N	"0.0000000"	"0.0000000"	"4730.0561000"	"0.0000000"	"0.0000000"
So here I will have to take 1st SALE transaction start comparing to PURC TRANSACTIONS, COMPARE SALE amd PURC UNITS and allocate values in the rest of the columns. IF my PURC units are more than SALE units I need to split PURC Transaction until SALE UNITS matches with PURC. Then again start this process from SALE 2 and so on. These transactions need to be processed in a certain order and can't be separated.
I am just wondering if some has done something like that in DS Transformer or any other way in Datastage or it is case to be done outside Datastage.

Any comments I would appreciate
dj
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I confess to not understanding what you are trying to do, despite the good description. Could you explain in a bit more detail the steps that you need to process for "SALE" "1"?
deps1972
Premium Member
Premium Member
Posts: 12
Joined: Thu Sep 16, 2010 10:42 pm

Post by deps1972 »

it is difficult to explain exactly in words what I need to do. let me try again.

1. Read the 1st Record in the example above (which is 1st SALE transaction) and get the value from column salelog_unitssold
2. Go to 1st PUR Transaction which is the 4th record in the example
3. IF salelog_unitssold > parcel_units set allocatedSaleUnits, allocatedPurUnits = parcel_units and sale_balance = salelog_unitssold - allocatedSaleUnits
4. Now take the sale_balance from step 3 and keep doing step 3 for next PUR transaction until sale_balance = 0

5 when doing step 4 if sale_balance < parcel_units then start doing the above mentioned steps for SALE transaction with the parcel units

I am not sure if I am confusing you more or making some sense but basically I need to have SALE transaction and PURC transaction side by side and based on sale units and purchase units (which ever is bigger) I need to allocate sale units to purchase transaction and vice versa. Also while allocating units across the transactions say if sale_balance < parcel_units(PURC transaction) I will have to create multiple transactions for PURC transaction so that sale_balance = parcel_units. So I have move in between SALE and PURC transactions.

I hope it make some sense
dj
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Hmm... I'm still not quite sure of what your goal is here.

1. SALE 1 Units Sold 4500
2.
3. Record 4 PUR 1 has 449.0203000

So it SALE 1 written now with the modified value of 4050.9797? Or is the second PUR record now processed and so on?

It would seem, at first, that this isn't really possible within a transform stage. Although at 8.7 with loops you might reach the PUR records first, adding their column values to a stage variable, and then looping through that list once you read the associated SALE record.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Transformer looping was introduced in 8.5, was it not?
-craig

"You can never have too many knives" -- Logan Nine Fingers
deps1972
Premium Member
Premium Member
Posts: 12
Joined: Thu Sep 16, 2010 10:42 pm

Post by deps1972 »

3. 1st PUR record will get a value 4050.9797 - 449.0203 in the column allocated_saleunits and then balance will be taken to the next PUR record for setting value in allocated_saleunits column of PUR record. This will continue until the sale_unit_balance is 0. This is the 1st step
When sale_unit_balance is 0 I need to pick up the pur_unit_balance and do the same process for NEXT SALE record (Record 2 in my example above) until my pur_unit_balance = 0. So it will be a new loop inside the 1st loop I guess.

We are using 8.5 and I was playing around with the examples given in the Developer guide for looping in Transformer how ever in my case it is not simply adding up the column value for a key change and doing some column calculation, it is bit more complicated than that and that's why I was wondering if anyone has experienced doing similar stuff in data stage in transformer or in any other stage
dj
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

So you are attempting to generate a running record of the "SALE" rows as they are reduced by "PURC" rows? If that is correct, then you expect 5 rows to be created to account for "SALE" "1", right?

If I'm still somewhere near the requirement, you will need to split the SALE from PURC records. The PURC becomes the driving record and SALE a "lookup". A server job with a self-managed hashed file may do the trick, storing the SALE records and then requesting the current record back for each PURC record (via a routine).

Or maybe a running total within each PURC record holding the pre_sub_total and post_sub_total.
PURC 1 449.020... 0 449.020...
PURC 2 603.995... 449.020... 1053...
PURC 3 499.660... 1053... 1500+...

and a running total of SALE
SALE 1 4500... 0 4500...
SALE 2 7690 4500 12190

then join the two streams together on PURC.pre >= SALE.pre and PURC.post < SALE.post. Something of that ilk?
deps1972
Premium Member
Premium Member
Posts: 12
Joined: Thu Sep 16, 2010 10:42 pm

Post by deps1972 »

Yes pretty much what you understood, I need to do. So you are suggesting that no options in Parallel jobs and I may be looking at writing custom code. Is that correct?
dj
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

You can do the running total method in Enterprise (assuming you partition appropriately)... it's an idea, can't say it is fool proof as haven't put a lot of thought in to the ins and outs
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I can't think of a viable means of doing this efficiently in PX without resorting to writing a bit of code, unfortunately. If doing it in server without resorting to hashed files a routine that stores the records between calls (using a COMMON block) would work quite well and not be too complicated to code; doing it in c++ as an external call would also be viable.
Post Reply