Assign a value by weight~

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
gwen
Participant
Posts: 16
Joined: Fri Mar 11, 2005 4:16 am

Assign a value by weight~

Post by gwen »

Hello Again
:oops:

there is another job I have to modify but I'm still wondering if I can do it in DS job. I have 1000 person_id in a file and 3 ao_code in another file.
Each ao_code has its weight.
For example:
ao_code weight
______________
901 1
902 2
903 3

After the caculation,1/6 of 1000 person_id are assigned to 901,2/6 of 1000 person_id are assigned to 902 and 3/6 of 1000 person_id are assigned to 903.

Is any simple way to do this kind of job?So far I just caculated number of person_id for each ao_code, how do I find out one of the ao_codes is full and then run to next ao_code to be assigned?

thanks~
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Hi,
Is the number of records is constant? is the weight is constant? is it only 3 in number?
i have a solution, if requirement varies, change as required...
as u have already know to calculate the number of records per weight
( if not can be calculated by using @INROWNUM/@OUTROWNUM function. Maintain a new column say weight_col. Use transformer to populate that column as

Code: Select all

weight_col = if @INROWNUM < 1/6*1000 Then 1 else if @INROWNUM < 2/6*1000 Then 2 else 3.
)
and do a lookup to the file which has the ao_code to get the corespoding output.

hope this help u
regards
kumar
gwen
Participant
Posts: 16
Joined: Fri Mar 11, 2005 4:16 am

Post by gwen »

THANK YOU~KUMAR :)

Actually,this job is more complicate.I need to look up the ao_code by branch_nbr and weight
Table 1 has fields: branch_nbr ,person_id
Table 2 has fields:branch_nbr,ao_code,weight
Table 3 has fields:branch_nbr ,person_id,ao_code

for example:

Table1
branch_nbr person_id
________________________
200 19631025PI
200 A101160444
200 A103291755
200 A102040049
200 A103579754
200 A104066827
200 A104273377
200 A111209967
200 A120839464
306 H121599635
306 G221549430
306 F224525208
306 F221589880
306 F220783124
306 F220751819
306 F123168190
306 F122215949
306 F120493383
306 E222705529
306 A226688641
306 Y200025116
306 T221886536
306 R221636776
306 R122089895

Table 2
branch_nbr ao_code weight portion
_______________________________
200 901 1 1/6
200 902 2 2/6
200 903 3 3/6
306 909 1 1/3
306 907 2 2/3

use branch_nbr as key to lookup table 2 and caculate the portion by weight to assignt the match ao_code and insert the result to Table 3.
According to Table2,I should assign 1/6 of the person_ids at branch_nbr 200 to 901,2/6 of the person_ids at branch_nbr 200 to 902 ,3/6 of the person_ids at branch_nbr 200 to 903 and so on for branch_nbr 306.
From Kumar's solution, I need to set @INROWNUM to 0 when the job switch to assign another branch_nbr.
How can I check branch_nbr of the current row is different from the last row and set @INROWNUM to 0?

Gwen
clarcombe
Premium Member
Premium Member
Posts: 515
Joined: Wed Jun 08, 2005 9:54 am
Location: Europe

Post by clarcombe »

Not my solution but I found this from Ray W about how to compare two rows

You could probably find the technique by searching the Forum. The trick is that stage variables are evaluated in the order in which they are defined, so that, if SV1 access the value of SV3, it is accessing the value from the previous row.
Code:
SV1 derivation InLink.C1 <> SV3 And @INROWNUM >= 1
SV2 derivation InLink.C2 <> SV4 And @INROWNUM >= 1
SV3 derivation InLink.C1
SV4 derivation InLink.C2
Colin Larcombe
-------------------

Certified IBM Infosphere Datastage Developer
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It might be clearer - easier to maintain - if you take two passes through the data.

The first pass populates a hashed file after totalling (summing) the ao_code for each branch number. Depending on source, you can get this via SQL, or via an Aggregator stage.

The second pass then looks up the total ao_code for the current branch number from the hashed file, divides this into the ao_code of the current row to get the proportion.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
gwen
Participant
Posts: 16
Joined: Fri Mar 11, 2005 4:16 am

Post by gwen »

THANKS ALL~

From all your suggetions, I found some problems to acheive this job.

Code: Select all

weight_col = if @INROWNUM < 1/6*1000 Then 1 else if @INROWNUM < 2/6*1000 Then 2 else 3.
Table 2
branch_nbr ao_code weight portion
_______________________________
200 901 1 1/6
200 902 2 2/6
200 903 3 3/6
306 909 1 1/3
306 907 2 2/3

since the propotion is in the same column and dymanic, how do I put in the code above?
dzdiver
Participant
Posts: 36
Joined: Tue May 25, 2004 8:55 am
Location: global

Re: Assign a value by weight~

Post by dzdiver »

gwen wrote: there is another job I have to modify but I'm still wondering if I can do it in DS job. I have 1000 person_id in a file and 3 ao_code in another file.
Each ao_code has its weight.
For example:
ao_code weight
______________
901 1
902 2
903 3

After the caculation,1/6 of 1000 person_id are assigned to 901,2/6 of 1000 person_id are assigned to 902 and 3/6 of 1000 person_id are assigned to 903.
Hi Gwen,
As you use Oracle 9i, you could use 9i's NTILE(n) function which will divide query result into n buckets. eg if n=4 get quartiles, and if n=100, get percentiles. The buckets will all have roughly same number of rows.

e.g.
person_id ntile
1 1
2 1
...
34 2
35 2
...
66 2
67 3
...
99 3

(I know you mentioned 1/6, 2/6, 3/6 but whats about other 3 * 1/6 ?
did you mean 1/3, 2/3, 3/3? Or just use ntile(6) and multiply by 2?)

Alternatively, should you wish all buckets to have a same range, you can use WIDTH_BUCKET() function. Each bucket has same range but different num of rows.

You can then join the ntile() or width_bucket() result back to the weight table.

Hope thats of interest.

rgds,
B.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Red Brick also has NTILE - has had since day 1 - it's an idea that Oracle "borrowed".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bcarlson
Premium Member
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Re: Assign a value by weight~

Post by bcarlson »

dzdiver wrote: Alternatively, should you wish all buckets to have a same range, you can use WIDTH_BUCKET() function. Each bucket has same range but different num of rows.

I am looking for solutions using width_bucket. Does anyone know if DB2 can do this as well? Has it caught up to the rest of the pack when it comes to ANSI standard window aggregates?
Post Reply