
SELECTING ROWS FROM A FILE

Posted: Thu Jun 29, 2006 6:05 am
by yalamanchili
Hi All,

While loading data from an input file to ODBC, I want to select rows based on distinct values of a column.

In an Oracle database, if we want to select distinct records we would use "SELECT DISTINCT COLUMN_NAME FROM TABLE_NAME".

How can I apply this logic while loading data from a file to ODBC?

Waiting for your reply.

Posted: Thu Jun 29, 2006 6:17 am
by sb_akarmarkar
Use an Aggregator stage.


Thanks,
Anupam

Posted: Thu Jun 29, 2006 6:20 am
by kduke
If you are talking about removing duplicates then you can search for this; there are lots of posts on removing duplicates. You can do it using a hashed file, or land your data in a staging table and then select distinct.

Posted: Thu Jun 29, 2006 6:22 am
by chulett
1) Load it into a work table first, then 'select distinct' from there to your target table.

2) Sort your data based on that field, then use stage variables to pass only the row with the first occurrence of each 'column name' value.

3) Load into a hashed file keyed on 'column name' then source to ODBC from there.

4) Use a hashed reference with 'column name' as key. Lookup and also write to it. Miss = record to ODBC and hashed file. Hit = duplicate to pitch.

There's four ways off the top of my head...
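Option 2 above (sort on the key, then let only the first occurrence through) can be sketched in Python. This is an illustrative sketch, not DataStage code; the column names and sample rows are invented.

```python
def first_per_key(rows, key):
    """Yield only the first row seen for each distinct value of `key`.

    Mirrors the stage-variable trick: assumes `rows` is already
    sorted on `key`, so duplicates arrive as adjacent rows.
    """
    previous = object()  # sentinel that never equals a real key value
    for row in rows:
        if row[key] != previous:
            previous = row[key]
            yield row

# Invented sample data: sort first, as the post requires.
rows = sorted(
    [
        {"cust_id": 2, "city": "Pune"},
        {"cust_id": 1, "city": "Delhi"},
        {"cust_id": 1, "city": "Mumbai"},  # duplicate cust_id, dropped
    ],
    key=lambda r: r["cust_id"],
)
deduped = list(first_per_key(rows, "cust_id"))
```

The comparison against the previous row's key is exactly what a stage variable holds between rows in a Server job transformer.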

Posted: Thu Jun 29, 2006 6:22 am
by yalamanchili
Hi,

Thank you for your response. Can you please explain to me briefly how to use the Aggregator stage?

Posted: Thu Jun 29, 2006 6:26 am
by sb_akarmarkar
yalamanchili wrote:Hi,

Thank you for your response. Can you please explain to me briefly how to use the Aggregator stage?
Pass the rows through the Aggregator stage: mark every column as "group by" except one, and for that remaining column pick FIRST/LAST/MAX/MIN as the derivation in the stage.

Thanks,
Anupam
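The Aggregator approach described above is essentially a group-by that keeps one representative row per group. A minimal Python sketch of that idea, with invented column names and data:

```python
def aggregate_first(rows, group_cols, agg_col):
    """For each distinct combination of `group_cols`, keep the FIRST
    value of `agg_col` seen, like an Aggregator stage set to FIRST.

    (LAST/MAX/MIN would just change which value per group survives.)
    """
    seen = {}  # dicts preserve insertion order in Python 3.7+
    for row in rows:
        key = tuple(row[c] for c in group_cols)
        if key not in seen:
            seen[key] = dict(row)  # first row for this group wins
    return list(seen.values())

# Invented sample data: the second row duplicates the first group.
rows = [
    {"dept": "HR", "region": "East", "salary": 100},
    {"dept": "HR", "region": "East", "salary": 200},
    {"dept": "IT", "region": "West", "salary": 300},
]
result = aggregate_first(rows, ["dept", "region"], "salary")
```

Unlike the stage-variable method, this does not require the input to be sorted, at the cost of holding one row per group in memory.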

Posted: Thu Jun 29, 2006 6:51 am
by ray.wurlod
When reading the file, use sort -u as a filter command in the Sequential File stage.

Or, if you want a really slow solution, use the ODBC driver for text files, and use your SQL statement from an ODBC stage.
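The sort -u filter above emits each distinct input line once, in sorted order. The same effect modelled in Python on an in-memory list (the sample lines are invented):

```python
# What `sort -u` does as a filter command: distinct lines, sorted.
lines = ["b,2\n", "a,1\n", "b,2\n", "c,3\n"]  # "b,2" appears twice
unique_sorted = sorted(set(lines))
```

Note that sort -u deduplicates on the whole line; to take distinct values of a single comma-delimited field you would restrict the sort key, e.g. sort -t, -k1,1 -u, since -u then compares only the key fields.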

Posted: Thu Jun 29, 2006 7:02 am
by DSguru2B
Or, load the data into a hashed file (in the account) with a dummy key that's nothing but a sequential number. Then access that hashed file via the UniVerse stage and run the SQL query on it.

Posted: Thu Jun 29, 2006 7:04 am
by chulett
Or... well, never mind. That's about enough different ways for one day.

Posted: Thu Jun 29, 2006 7:26 am
by DSguru2B
I agree, I guess that's enough options for the OP to pick and choose from.

Posted: Thu Jun 29, 2006 11:44 am
by kumar_s
kduke wrote:If you are talking about removing duplicates then you can search for this. There are lots of posts on removing duplicates. You can do it using a hash file or land your data in a staging table and then select distinct.
Kim has mentioned the options available in both Server and PX. You are still left with the manual option: open the file in a text editor and pick out the required column. :lol: