Represent multiple records into one with conditions

gateleys · Post by **gateleys** » Tue Feb 07, 2006 9:16 am

I have a file which maintains data of StudentID and Grade(A,B,C,D or F).

StudentID Grade
-----------  ------
111        B
222        A
222        C
111        B
111        A

Note: Duplicate rows are permitted.
Now, I need to transform the file into -

Code: Select all

StudentID Grade
-----------  ------
111        B B A
222        A C

Essentially, get one record for each student and concatenate all his grades in the same record.

I have tried to tackle the problem by sorting my data, then using stage variables to keep record of previous values. Use RowProcCompareWithPreviousVal() function, then pass it through Aggregator, GROUP BY StudentID and return the Last(Grade). But it seems to fail when the number of records are more than two. Please suggest.

kalpna · Post by **kalpna** » Tue Feb 07, 2006 10:07 am

u can do it these 2 ways..
First:
1) sort ur data on student id
2) take 4 stage variables in transformer in the order of
Curr = StudentID
temp = if Curr = Prev then 1 else 0
Prev = Curr
Grade = if temp = 1 then Grade:StudentID else StudentID
Assign this Grade value to Grade
3)Take an aggregator. group by StudentID take the Last Grade. Or u can directly write to hash file.

Second:
U can use Hash file with StudentID as key and take Grade as a multi valued column. then u can use Feild marks as seperators.

Cheers
kalpna

gateleys · Post by **gateleys** » Tue Feb 07, 2006 10:21 am

Thanks Kalpana,
I did it using a much simpler way. Sorted input based on StudentID. I declared a stage variable 'stgPrevValue' whose derivation was-

Code: Select all

If(RowProcCompareWithPreviousValue(InLink.StudentID)) Then stgPrevValue:InLink.Grade Else InLink.Grade

Then voila!! All I needed to do next was to group by StudentID and get Last row for Grade.

Thanks.
gateleys

Thibal · Post by **Thibal** » Tue Feb 07, 2006 10:22 am

You can also use the same Hashed files for reference and in target (I hope this sentence looks like english

)

-> Hash_ref and Hash correspond to the same Hashed File

In your transformer, when a line is looked-up : write.GRADE = look.GRADE : " " : read.GRADE
Otherwise : write.GRADE = read.GRADE

At the start, the Hashed file must exist, use Validate to create an empty Hashed file. (In every case, you must empty your hashed file before writing).

sjhouse · Post by **sjhouse** » Tue Feb 07, 2006 2:28 pm

I do this regularly using hash files and stage variables (sv).

sv1-NewRec; Derivation1-If STUDENT_ID = LastId Then 0 Else 1
sv2-Grade; Derivation2-If NewRec = 1 Then STUDENT_GRADE Else Grade : @VM : STUDENT_GRADE
sv3-LastId; Derivation3-STUDENT_ID

Write the results to a hash file (or other database file) with STUDENT_ID as the key field. The resulting record would be the following:

KeyFld - STUDENT; Derivation - STUDENT_ID
Field 1 - GRADE; Derivation - Grade

With STUDENT as the key field, each record will be overwritten until you get to the last record for that student. Also, you can use any delimiter you would like within the field. When the hash file is populated, you can read from it and write it to the desired location.

Stephen

acavarra · Post by **acavarra** » Fri Aug 20, 2010 7:24 am

Once I have the hashed file populated and read it as a reference lookup, how do I take a record out of the hash file like...

StudentID Grade
1 A@vmB@vmC
2 D@vmD

and populate a single row with multiple columns:

StudentID Grade1 Grade2 Grade3
1 A B C
2 D D

chulett · Post by **chulett** » Fri Aug 20, 2010 7:29 am

If you are using "value marks" between the fields you should (from what I recall) be able to "unnest" that field during the select of that data and automatically get all values in separate columns. Having it in a lookup may complicate that, I don't know off the top of my head and have no way to test it.

Otherwise, simply take the combined field from the lookup and use the Field() function to parse out the three values, one to each output column.

acavarra · Post by **acavarra** » Fri Aug 20, 2010 8:45 am

I tried the FIELD function, but the @VM was not recognized as the delimiter so it did not parse the data.

The entire source column was written to the first target column.

Any ideas?

Thanks.

chulett · Post by **chulett** » Fri Aug 20, 2010 8:50 am

Not sure why @VM would not work as the delimiter. Why not use a different delimiter then, something 'simple' like a pipe for instance?