Re: How to Select distinct records from Files

Posted: Wed Dec 21, 2005 3:44 am
by anil kumar
Hi, thanks for your quick response.
But I still have one doubt. In that scenario it will work if there are duplicate records, but as you explained in the example, what happens to the first record itself? The condition will not be satisfied for it, right? For the second record it is OK.

Please clarify.

Thanks

Re: How to Select distinct records from Files

Posted: Wed Dec 21, 2005 4:53 am
by DataStageCnu
anil kumar wrote:
Hi, thanks for your quick response.
But I still have one doubt. In that scenario it will work if there are duplicate records, but as you explained in the example, what happens to the first record itself? The condition will not be satisfied for it, right? For the second record it is OK.

Please clarify.

Thanks
You can do it by using a Hash (hashed file) stage, or by loading the data into any database stage and then using SELECT DISTINCT.

Does that make sense? Let me know if you have any questions.
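
A rough Python analogy for the two suggestions above, not DataStage code: a dictionary keyed on the record key stands in for the hashed file (writing the same key again overwrites the earlier row), and an in-memory SQLite table stands in for the database stage. The sample rows, the choice of the first column as the key, and the table name `records` are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, city). Not taken from the thread.
rows = [(1, "Pune"), (2, "Delhi"), (1, "Pune"), (3, "Chennai"), (2, "Delhi")]


def dedupe_by_key(records):
    """Keep one row per key, like writing rows to a file keyed on column 0."""
    by_key = {}
    for rec in records:
        # A later row with the same key overwrites the earlier one.
        by_key[rec[0]] = rec
    return list(by_key.values())


def dedupe_select_distinct(records):
    """Load rows into a scratch table and let the database drop duplicates."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE records (customer_id INTEGER, city TEXT)")
        conn.executemany("INSERT INTO records VALUES (?, ?)", records)
        cur = conn.execute(
            "SELECT DISTINCT customer_id, city FROM records ORDER BY customer_id"
        )
        return cur.fetchall()
    finally:
        conn.close()


hashed = dedupe_by_key(rows)
distinct = dedupe_select_distinct(rows)
print(hashed)
print(distinct)
```

Note that the two are not quite equivalent: the keyed approach keeps one row per key even when the other columns differ, while SELECT DISTINCT only removes rows that match in every selected column.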

Posted: Wed Dec 21, 2005 7:00 am
by Tasneem
If you are sorting the records using the Sort stage, use the cluster key change feature to remove duplicates.
This works on the same logic as the stage variables. :)
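
To make the key change idea concrete, here is a small Python sketch of the logic only; it does not reproduce the Sort stage itself. Each sorted row gets a flag that is 1 on the first row of a key group and 0 otherwise, and a downstream filter keeps the rows flagged 1. The function name `add_key_change` and the sample data are hypothetical.

```python
def add_key_change(sorted_records, key):
    """Yield (record, flag) pairs; flag is 1 on the first row of each key group."""
    first = True
    previous_key = None
    for rec in sorted_records:
        current_key = key(rec)
        # The very first row always starts a new group.
        flag = 1 if first or current_key != previous_key else 0
        yield rec, flag
        previous_key = current_key
        first = False


# Hypothetical rows: (key, value). Sorting brings equal keys together.
records = sorted([("B", 20), ("A", 10), ("B", 21), ("A", 11), ("C", 30)])
flagged = list(add_key_change(records, key=lambda r: r[0]))

# Downstream constraint: keep only rows where the key changed.
unique = [rec for rec, flag in flagged if flag == 1]
print(unique)
```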

Posted: Wed Dec 21, 2005 7:07 am
by ravij
Hello Kumar,

Where can I define the sort command?

Thanks in advance,
RK

Posted: Wed Dec 21, 2005 1:40 pm
by ray.wurlod
Anywhere you like, but typically in the Input Values field when using ExecSH as a before-job or before-stage subroutine.

Re: How to Select distinct records from Files

Posted: Wed Dec 21, 2005 6:28 pm
by yaminids
Anil,

When you define the stage variables, assign them an initial value which won't appear in the input data.

Yamini
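
The first-record question comes down to what the "previous value" holds before any row has been read. A minimal Python sketch of that logic, assuming the input is already sorted on the key; the names `SENTINEL` and `distinct_sorted` are made up for illustration and the loop only imitates how a stage variable carries a value from row to row:

```python
# Assumed never to occur in the real data (the "initial value" of the variable).
SENTINEL = "\x00<no previous key>"


def distinct_sorted(keys, initial=SENTINEL):
    """Return the distinct keys from a sorted sequence using a previous-value check."""
    prev = initial  # plays the role of the stage variable's initial value
    out = []
    for k in keys:
        # Compare against the previous row's key before updating it.
        if k != prev:
            out.append(k)
        prev = k
    return out


# With a safe initial value the first record always passes the check.
with_sentinel = distinct_sorted(["", "", "A"])

# If the initial value can occur in the data, the first record is lost.
with_collision = distinct_sorted(["", "", "A"], initial="")

print(with_sentinel)
print(with_collision)
```

In the second call the empty-string key is dropped entirely, which is exactly the situation the advice above avoids.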

Re: How to Select distinct records from Files

Posted: Thu Dec 22, 2005 2:07 am
by anil kumar
Thank you very much.

It is working fine.

Use the Sort stage or a sort command, then apply the stage variable

Posted: Thu Dec 22, 2005 2:20 am
by sumeet
I agree that a stage variable can be used to remove duplicate records, but a stage variable only stores the previous row's field value. So it is better to sort the data first, so that duplicate records are grouped together, and then apply the stage variable approach.
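
A short Python sketch of why the sort matters, under the assumption that the check only ever sees the immediately preceding row. The function name and sample keys are hypothetical:

```python
def drop_adjacent_duplicates(values):
    """Drop a value only when it equals the value on the row just before it."""
    prev = object()  # unique placeholder, never equal to any real value
    kept = []
    for v in values:
        if v != prev:
            kept.append(v)
        prev = v
    return kept


unsorted_keys = ["A", "B", "A", "C", "B"]

# Duplicates are not next to each other, so none of them are caught.
without_sort = drop_adjacent_duplicates(unsorted_keys)

# Sorting groups equal keys together, so the same check now removes them.
with_sort = drop_adjacent_duplicates(sorted(unsorted_keys))

print(without_sort)
print(with_sort)
```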