SELECTING ROWS FROM A FILE

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
yalamanchili
Participant
Posts: 30
Joined: Fri May 26, 2006 6:01 am

SELECTING ROWS FROM A FILE

Post by yalamanchili »

Hi All,

while loading data from input file to ODBC i want to select rows basing on distinct values from a column.

In oracle data base if we want to select the records from a data base we will use "SELECT DISTINCT COLUMN_NAME FROM TABLE_NAME".

how can i apply this logic while loading data from file to odbc.

Waiting for your reply
sb_akarmarkar
Participant
Posts: 232
Joined: Fri Sep 30, 2005 4:52 am
Contact:

Post by sb_akarmarkar »

Use aggregator stage.....


Thanks,
Anupam
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

If you are talking about removing duplicates then you can search for this. There are lots of posts on removing duplicates. You can do it using a hash file or land your data in a staging table and then select distinct.
Mamu Kim
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

1) Load it into a work table first, then 'select distinct' from there to your target table.

2) Sort your data based on that field, then use stage variables to only pass the row with the first occurance of 'column name'.

3) Load into a hashed file keyed on 'column name' then source to ODBC from there.

4) Use a hashed reference with 'column name' as key. Lookup and also write to it. Miss = record to ODBC and hashed file. Hit = duplicate to pitch.

There's four ways off the top of my head...
-craig

"You can never have too many knives" -- Logan Nine Fingers
yalamanchili
Participant
Posts: 30
Joined: Fri May 26, 2006 6:01 am

Post by yalamanchili »

Hi,

Than You fro your response.Can You Please Expalin to Me in Breif how to use Aggregator Stage.
sb_akarmarkar
Participant
Posts: 232
Joined: Fri Sep 30, 2005 4:52 am
Contact:

Post by sb_akarmarkar »

yalamanchili wrote:Hi,

Than You fro your response.Can You Please Expalin to Me in Breif how to use Aggregator Stage.
Allow row pass through aggregator make all column as group by except one column make that as FIRST/LAST/MAX/MIN in stage.....

Thanks,
Anupam
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

When reading the file, use sort -u as a filter command in the Sequential File stage.

Or, if you want a really slow solution, use the ODBC driver for text files, and use your SQL statement from an ODBC stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Or, load the data into a hashed file (in account) with a dummy key thats nothing but a sequential number. Then access that hashed file via universe stage and run the sql query on it.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Or... well, never mind. That's about enough different ways for one day.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I agree, i guess thats enough options for the OP to pick and choose from.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kumar_s
Charter Member
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

kduke wrote:If you are talking about removing duplicates then you can search for this. There are lots of posts on removing duplicates. You can do it using a hash file or land your data in a staging table and then select distinct.
Kim has mentioned the option available in both Server and PX. You still left out with manual option. Open in a text editior and pick up the required column. :lol:
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Post Reply