Hi All,
I have a requirement to get only distinct values from the flat file and to load the distinct into Database. The source file contains multiple records.
GD
Distinct values
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
There are a few options. You could extract it into a local hash file then put it into the database. The primary key on the hash file should ensure you only get distinct rows added to the database.
If you are using Oracle you could use the option to only add new rows and not update existing rows.
You could follow the advice in Ray's recent email, presort the file by primary key, use stage variables to identify and ignore duplicate rows.
Vincent McBurney
Data Integration Services
www.intramatix.com
If you are using Oracle you could use the option to only add new rows and not update existing rows.
You could follow the advice in Ray's recent email, presort the file by primary key, use stage variables to identify and ignore duplicate rows.
Vincent McBurney
Data Integration Services
www.intramatix.com
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi Ray,
To test this i just added a source ODBC stage to pick data of single column which is not unique followd by a pass through transformer stage which iam just using to connect to the target ODBC stage for writing unique values to oracle 8i database.
I see that the target ODBC stage does have the option "Insert new/update existing row " but no "insert new only option". This is in version 6.0. Please let me know if i am missing something.
Thanx
Rahul
To test this i just added a source ODBC stage to pick data of single column which is not unique followd by a pass through transformer stage which iam just using to connect to the target ODBC stage for writing unique values to oracle 8i database.
I see that the target ODBC stage does have the option "Insert new/update existing row " but no "insert new only option". This is in version 6.0. Please let me know if i am missing something.
Thanx
Rahul
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
Do you have an Oracle client installed on your datastage server? If so then you should be able to use the Oracle OCI stage instead of the ODBC stage. This will give you more update options and should have faster performance.
There are some configuration instructions in the PDF Upgrade guide and the install readme that will help you get this stage working. There are also quite a few threads in the forum archives on this subject.
Vincent McBurney
Data Integration Services
www.intramatix.com
There are some configuration instructions in the PDF Upgrade guide and the install readme that will help you get this stage working. There are also quite a few threads in the forum archives on this subject.
Vincent McBurney
Data Integration Services
www.intramatix.com
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
If you're using ODBC as source you could, of course, use user-defined SQL including the DISTINCT keyword.
If you have Microsoft Office, you have an ODBC driver for text files; would not like to predict the performance (!) of SELECT DISTINCT though!
I'm very surprised that you say that the 6.0 ODBC stage lacks "insert new rows only" as an update option. Right now I do not have access to a 6.0 system, so cannot check and advise. Somebody else?
If you have Microsoft Office, you have an ODBC driver for text files; would not like to predict the performance (!) of SELECT DISTINCT though!
I'm very surprised that you say that the 6.0 ODBC stage lacks "insert new rows only" as an update option. Right now I do not have access to a 6.0 system, so cannot check and advise. Somebody else?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: