Process

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
shrey3a
Premium Member
Premium Member
Posts: 234
Joined: Sun Nov 21, 2004 10:41 pm

Process

Post by shrey3a »

Hi,
I've a scenario:
For example, say a student id record has been received that has an id#: 123 in Table A and a duration of 5 hours.
123 has 2 courses QW, PO associated with it in Table B. The record for student id: 123 will therefore be divided into 2 training records;

Record1: QW with Duration: 2 hrs 30 mins
Record2: PO with duration: 2 hrs 30 mins.

I'm not able to to generate the Distribution Duration field which has logic divide the field with number of reecords found for particular student id
below is an example

Input

Table A
Student id | Duration
123 | 5
124 | 6

Table B

Student id | Course
123 | QW
123 | PO
124 | AWER
124 | ASD

Output

Student id | Course | Distributed Duration
123 | QW | 2.5
123 | PO | 2.5
124 | AWER | 3
124 | ASD | 3

Thanks in Advance

Shreya
kris
Participant
Posts: 160
Joined: Tue Dec 09, 2003 2:45 pm
Location: virginia, usa

Re: Process

Post by kris »

Hi Shreya,
shrey3a wrote: Input

Table A
Student id | Duration
123 | 5
124 | 6

Table B

Student id | Course
123 | QW
123 | PO
124 | AWER
124 | ASD

Output

Student id | Course | Distributed Duration
123 | QW | 2.5
123 | PO | 2.5
124 | AWER | 3
124 | ASD | 3
Here is one solution to this problem.

First you need to find out how many different courses are there for each Student id. so you can refernce it and calculate distributed duration for each student id.

May be you can do all in one sql but am giving you this simple solution on top of my head.

Since you have those records in Tables. Do this userdefined sql in your database stage and write records to hashfile call it as 'tableC' with student_id as key. this can be a separate job or can do it in the main job.

Code: Select all

  select student_id, count(*) as HOWMANY from TableB group by student_id;
Now the main job:

Join TableA and TableB(inner join on student_id) in database stage and drive student_id, course, duration into transformer and do a lookup on TableC. You can calculate distributed duration (which would be duration/HOWMANY) in derivation and map it into the target. Don't forget to put constraint to make your lookup as inner join which you can give in different ways....................................one would be

Code: Select all

 NOT(IsNull(hashfilelink.student_id)) 
:) Should work for you.

Kris~
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

select s.student_id, m.course_id, m.marks / r.reccount
from students s, marks m,
(select student_id, count(*) reccount from marks group by student_id) r
where s.student_id = m.student_id
and s.student_id = r.student_id

You may need outer join if the links are not always there.
Post Reply