Joining the Hash file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
MOHAMMAD.ISSAQ
Participant
Posts: 78
Joined: Fri Mar 02, 2007 4:54 am
Location: CHENNAI

Joining the Hash file

Post by MOHAMMAD.ISSAQ »

I have a text file as source and hash file as reference.
The key fields in hash file are k1,k2.
But in source file only one key field is present i,e. k1.

How can i join the hash file with the source file?

Please help me out..
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You can't, using a Hashed File stage.

And it's "hashed" file, not "hash" file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
MOHAMMAD.ISSAQ
Participant
Posts: 78
Joined: Fri Mar 02, 2007 4:54 am
Location: CHENNAI

Post by MOHAMMAD.ISSAQ »

Thanks ray for the correction.

Can you please suggest any other possible way to do it.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You can use a UniVerse stage. If the hashed file was created in the account (project) that's fine. If the hashed file was created in a directory you will need to create a pointer in VOC. Search the forum for the SETFILE command and how to use it. The UniVerse stage also supports multi-row return from a lookup - search on-line help or, again, DSXchange for more information.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Or build a version of the hashed file with only the one key that is appropriate for this task.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Your described "join" will create a product, unless you resolve your lookup to 1:1. Is this your desire? If you can create a hashed file that gives a 1:1 result then there's no issue. You need to decide if every source row needs to produce as many output rows as there are matching k1 rows in the reference data set.

The hashed file stage is inappropriate for your needs if you need a product because it must be a 1:1 reference. The UV/ODBC stage is a possibility with its multi-row result capability on reference, but you may be better served doing this inside a database with a work table.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
MOHAMMAD.ISSAQ
Participant
Posts: 78
Joined: Fri Mar 02, 2007 4:54 am
Location: CHENNAI

Post by MOHAMMAD.ISSAQ »

You're right kcbland, my hash file has only 1:1 reference.
I have two jobs:
1) In job1 one i'm creating a Hashed file with source as a sequential file.
There i have two columns say k1,k2.
To fetch the complete set of records from input i've set them as keys k1,k2.If i set k2 as the key only i'm not able to get the complete set.

2) In job2 i'm using the hashed file as a referenece to the another
source file(.txt).I'm reading the source file as a single output column.
The length of the column i've given around 50000.

In this source file the text from 1285 to 1295 say as 'key' will match with that of k2 of Hashed file.But k1 is not at all present in my source file.

So, how can i join this two files to extract the data..
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Load your input file into a work table in your database of choice, then use sql to create your output. Load both if the hashed file source isn't present there.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Re-read the the posts from Ray, Craig and myself. You do not understand the terms 1:1 or the difference between a reference and a join.

A reference returns 0 or 1 rows only. A join can have a multiplying effect. A primary input stream only does reference lookups, unless you're using a stage that can produce a multi-row return set which gives the functionality of a join. A Hashed stage is used for reference only. You need database functionality to get to multi-row return set. The UV/ODBC (Universe) stage gives you database functionality on a hashed file stage but with a performance penalty.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
MOHAMMAD.ISSAQ
Participant
Posts: 78
Joined: Fri Mar 02, 2007 4:54 am
Location: CHENNAI

Post by MOHAMMAD.ISSAQ »

Craig as you've said to load data into the database i've done for the input source of hash file.But for the source of job2 i'm not able to create a table in oracle 9i of a single column of length around 50000 and datatype is CHAR.

How can i create a table of a single of around 50000 and CHAR datatype?

I've done in other way also:
As i said i've loaded data for the input source of Hashed file.Now my Job1 will be:
Input - .txt file and Ouput - Database lookup.
Job2:
Now here instead of using a hashed lookup in job2 i'm using database lookup.The job is running fine but still i'm not getting all the required rows.
My ouput for Job2 is sequential file.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

MOHAMMAD.ISSAQ wrote:Craig as you've said to load data into the database i've done for the input source of hash file.But for the source of job2 i'm not able to create a table in oracle 9i of a single column of length around 50000 and datatype is CHAR.
Of course not, there's no such thing. That's even too large for a VARCHAR2 which has a 4K limit. You'd need to use a CLOB and all of the problems that come with that.

:? But do you really need to load the whole dang thing? Why not just "the text from 1285 to 1295"?
-craig

"You can never have too many knives" -- Logan Nine Fingers
MOHAMMAD.ISSAQ
Participant
Posts: 78
Joined: Fri Mar 02, 2007 4:54 am
Location: CHENNAI

Post by MOHAMMAD.ISSAQ »

I've used LONG and created the table also.Now i have database as a source file and a database lookup as reference.My reference lookup has total 304 rows and i have selected only one key here as k2.But there are so many duplicates key.For e.g:
k1 k2
Text_Id Key_1
Text_Name Key_1
Term Key_1
But my output i'm getting as the first row not 2nd and 3rd.
I didn't put any primary key in the table, but in Datastage i've used k2 as key.

Please help me out...
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:shock: Oh Gawd, not a LONG. Use a CLOB instead, unless your version of Oracle is so old it doesn't support them?
-craig

"You can never have too many knives" -- Logan Nine Fingers
MOHAMMAD.ISSAQ
Participant
Posts: 78
Joined: Fri Mar 02, 2007 4:54 am
Location: CHENNAI

Post by MOHAMMAD.ISSAQ »

Thanks chulett
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Again - do you really need to load the entire record into the work table for this join? If yes then carry on, just wanted to make sure.

As to your problem, I'm not understanding your issue so can't give much advice - other than to ask you to try to explain it again. Perhaps you could post the table definitions and your sql? Or perhaps someone else could offer something up?
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply