Loop Conditions

New2DS · Post by **New2DS** » Wed Mar 09, 2005 10:06 pm

Hello All,

My source table have 10 columns and I have to add a new column count in my trasformation to load into the target table which have 11 columns.

I need some idea in implementing the logic below.

I have to compare the first 5 columns with the previous rows and if they are equal loop through the 6th column which is code and check the conditions. the rest of the columns will load straight into the target without any logic.

if code = 20 and 30 then delete row of code = 20 and count should equal to 1

example

if
a,b,c,d,e,20
a,b,c,d,e,30

then delete record of code =20 (or take only record code = 30)

output: a,b,c,d,e,30

some more conditions
if code =4 then delete records of code = 0 and 2
if code <> 4 and code = 0 and 2 then take record of code = 02.

IF code =8 occured for the same person in more than one time during the year count only the first occurence.

I desperately need your help.

thanks,

kcbland · Post by **kcbland** » Wed Mar 09, 2005 10:17 pm

You will need to do two passes thru the source data.

The first pass will establish the highest importance code encountered for a group. You will want to write that to a hash file with a group key and the highest importance code encountered. Your job will reference the same hash file it is writing to, first to check if the row is already in the hash file and then to write to the hash file if a more important code is found. Be sure to use separate hash file icons, and manually create the hash file first.

Then, in the second pass at the source data, reference the hash file using the group key and with a constraint throw away the appropriate rows and let pass thru the others.

In your example, a code 30, if found, means throw away all rows that aren't the 30. In the first pass you can establish that for one group, the most important code found is a 30, whereas another group only 20's were found. In another group, you found a 0, but then found a 2, which overwrites the 0, but then in the next row you found a 4, which is allowed to overwrite the 2. When the first pass is done, your hash file has the greatest importance code for each group.

New2DS · Post by **New2DS** » Wed Mar 09, 2005 10:25 pm

Thanks for the quick reply. As I am new to this can you please briefly explain what you exactly mean by highest importance code encountered

kcbland wrote:You will need to do two passes thru the source data.

The first pass will establish the highest importance code encountered for a group. You will want to write that to a hash file with a group key and the highest importance code encountered.

and how we compare with the previous rows (only 5 columns out of 10 and code is not part of these columns)

thanks

kcbland · Post by **kcbland** » Wed Mar 09, 2005 10:46 pm

Look back at your examples in your post. You refer to "code" as the column that indicates what should be done with the rows.

You gave some sketchy rules, such as a code 30 means throw away all rows with a code 20. That tells me that 30 is important. You gave more rules such as a code 4 means throw away rows with code 0 and 2.

So, you have to determine which code is the one of most importance for a group of rows.

Your last statement about code is not part of the columns, well, where is it? You know we can only be helpful if you clearly articulate your issue and use real examples. Your a,b,c,d,e,20 example has probably not helped if it truly doesn't represent the situation. To me, I would group all rows with "a,b,c,d,e" and take the code value of highest importance according to the rules you haven't even posted. What constitutes importance you have to decide.

New2DS · Post by **New2DS** » Wed Mar 09, 2005 11:44 pm

Sorry for the inconvinence. Here is the example

Names of the 10 columns

SR_ID,PAT_ID,TYP_CD,CNT_NBR,MB_NBR, EFF_DT,PR_CD,AGE, GRP_NBR,DIV_NBR,PR_CODE

If first six columns have the same values with different PR_CODE then based on the rules delete the complete records from the list.

If PR_CODE = 4 then do not count PR_CODE = 0 or 2

example:

Code: Select all

SR_ID,PAT_ID,TYP_CD,CNT_NBR,MB_NBR, EFF_DT,PR_CODE
123,H,CD,5986,666622,2004-12-31, 4
123,H,CD,5986,666622,2004-12-31, 0
123,H,CD,5986,666622,2004-12-31, 2

Output file should contain only the record below
123,H,CD,5986,666622,2004-12-31, (remaining columns), 4

Some of the rules:
If pr_code <> 4 and pr_code = 0 and 2 then count 2( meaning taking only the row of pr_code =2)

if code =8 and 6 then count 8
if code 32 and 10 then count 10

If PR_CODE = 8 occured for the same person more than one time during the reporting year count only the first occurence.

If PR_CODE = 32 and pr_code=10 occured for the same person more than one time during the reporting year count only the first occurence.

Is it a good idea to take only the six columns into a hash file with the key columns CNT_NBR,MB_NBR.

Code: Select all

SELECT SR_ID,PAT_ID,TYP_CD,CNT_NBR,MB_NBR, EFF_DT, PR_CODE
where PR_CODE in ('4','2','8','10'....)
group by 1,2,3,4,5,6

Need help in performing the conditions.

thanks

kcbland · Post by **kcbland** » Thu Mar 10, 2005 12:09 pm

You need to construct a truth table. One of your codes is the highest pecking order value. Another code is the second, another the third.

For example:

A > B > C > D > E > F

If you read a row for a group, and its code is E, then lookup the current value in your hash file. If the current value is less important than E (like F), then overwrite the row in the hash lookup with the E value. On the next row, the same comparison will be performed. If the incoming value is more important than the current reference value, then overwrite it, otherwise you can throw it away.

You need to write a Function that establishes this truth table. It will be a big case statement, like:

Code: Select all

IncomingValue = Arg1
CurrentValue = Arg2
BEGIN CASE
   Case IncomingValue = 'A' and (CurrentValue = 'B' OR CurrentValue = 'C' OR CurrentValue = 'D' OR CurrentValue = 'E' OR CurrentValue = 'F')
      Ans = IncomingValue
   Case IncomingValue = 'B' and (CurrentValue = 'C' OR CurrentValue = 'D' OR CurrentValue = 'E' OR CurrentValue = 'F')
      Ans = IncomingValue
   Case IncomingValue = 'C' and (CurrentValue = 'D' OR CurrentValue = 'E' OR CurrentValue = 'F')
      Ans = IncomingValue
   Case IncomingValue = 'D' and (CurrentValue = 'E' OR CurrentValue = 'F')
      Ans = IncomingValue
   Case IncomingValue = 'E' and (CurrentValue = 'F')
      Ans = IncomingValue
   Case @TRUE
      Ans = CurrentValue
END CASE

Now in your transformation job, reference that hash file and reference the hash file. Depending on the value of the code in the hash file, act accordingly to the row.

New2DS · Post by **New2DS** » Thu Mar 10, 2005 12:25 pm

Kenneth, thanks for the help.

ray.wurlod · Post by **ray.wurlod** » Thu Mar 10, 2005 6:59 pm

Could that be done with input sorted by PR_CODE (ascending) and running through an Aggregator stage, grouping by the other columns (which seem to be constant for variations in PR_CODE), and using Last as the aggregate function on the PR_CODE column?

If my assumption holds, this should offer a simpler solution than writing a routine.