Lookup problem with a multi-instance job - Strange!

Post questions here related to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I think your key to the hash file needs to be just the first field, not the first 3 fields.
Mamu Kim
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kduke wrote: I think your key to the hash file needs to be just the first field, not the first 3 fields.
I need to determine whether the record exists for the given security, start date, and type, and based on that either insert or update the record. That is the reason I have the 3 fields as keys.
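
In pseudocode terms, the intended routing works roughly like this (a minimal Python sketch; the field names are hypothetical, and in the real job this is a Hashed File stage lookup rather than code):

# Reference set keyed on all 3 fields, pre-loaded from the target table.
existing = {
    ("101", "1/1/2001 00:00:00", "MATURITY"),
}

def route(row):
    # Return "update" if the composite key already exists in the target,
    # otherwise "insert".
    key = (row["security"], row["start_date"], row["type"])
    return "update" if key in existing else "insert"

print(route({"security": "101", "start_date": "1/1/2001 00:00:00", "type": "MATURITY"}))  # update
print(route({"security": "102", "start_date": "1/1/2001 00:00:00", "type": "MATURITY"}))  # insert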
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I do not get it. It looks like it is working perfectly then. What is it that we do not understand?
Mamu Kim
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kduke wrote: I do not get it. It looks like it is working perfectly then. What is it that we do not understand?
I explained in the initial post.
I am having a strange problem with one particular security. There are 8 records for this security in my main input file. When all 8 records fall into the same split file, the lookup does not find the record in the hash file and sends it as an insert instead of an update. However, when I trim my main input file to keep only this security, the 8 records are scattered across 8 different files, the lookup finds the record in the hash file, and it is processed as an update.

Scenario One -
Before Split - Initial file = 1.6M rows.
After Split - 8 files of 200K rows each.
All 8 records belonging to security 101 are placed in ONE split file.
Each split file is processed by a separate job thread.
In this case all 8 records for security 101 are processed by the SAME job thread.
The lookup DOES NOT find the record with Security = 101 + Date = 1/1/2001 00:00:00 + Type = Maturity.

Scenario Two -
Before Split - Initial file = 8 rows (only rows belonging to security 101).
After Split - 8 files of 1 record each.
All 8 records belonging to security 101 are placed in DIFFERENT split files.
Each split file is processed by a separate job thread.
In this case the 8 records for security 101 are processed by DIFFERENT job threads.
The lookup DOES find the record with Security = 101 + Date = 1/1/2001 00:00:00 + Type = Maturity.
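
One way to see why the distribution differs (a rough Python sketch, assuming the splitter deals out contiguous blocks of rows; the splitting rule and the data here are assumptions, not taken from the actual job):

def block_split(rows, n_files):
    # Deal rows into n_files contiguous chunks: chunk i gets
    # rows[i*size : (i+1)*size], like a fixed-size file split.
    size = max(1, len(rows) // n_files)
    return [rows[i * size:(i + 1) * size] for i in range(n_files)]

# Scenario one: 8 adjacent rows for security 101 inside a larger file
# all land in the same chunk, so one job thread sees all of them.
big_file = [("101", i) for i in range(8)] + [("999", i) for i in range(792)]
print([sum(1 for r in c if r[0] == "101") for c in block_split(big_file, 8)])
# -> [8, 0, 0, 0, 0, 0, 0, 0]

# Scenario two: only the 8 rows exist, so each chunk holds exactly one
# and every thread sees a single record for security 101.
print([len(c) for c in block_split([("101", i) for i in range(8)], 8)])
# -> [1, 1, 1, 1, 1, 1, 1, 1]

Either way, the lookup result should not depend on which thread processes the row, which is what makes the symptom strange.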
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

When you insert a record, do you also update your hash file?

Ken is correct: cache needs to be off. All of these need to be in one split file as well.
Mamu Kim
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kduke wrote: When you insert a record, do you also update your hash file?

Ken is correct: cache needs to be off. All of these need to be in one split file as well.
I don't update the hash file in the job. It is only read. The hash file is created by another job that runs before this job.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

There is your problem. The hash file needs to reflect all the keys in the target; otherwise, what your job flags as an insert really needs to be an update.
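
Put another way (a small Python sketch with made-up keys), if the hash file load misses any key that exists in the target, the corresponding row is misrouted:

target_keys = {("101", "1/1/2001 00:00:00", "MATURITY"),
               ("102", "1/1/2001 00:00:00", "MATURITY")}
hash_file = {("102", "1/1/2001 00:00:00", "MATURITY")}  # incomplete load

for key in sorted(target_keys):
    action = "update" if key in hash_file else "insert"
    print(key[0], "->", action)
# 101 -> insert  <- misrouted: the row already exists in the target
# 102 -> update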
Mamu Kim
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kduke wrote: There is your problem. The hash file needs to reflect all the keys in the target; otherwise, what your job flags as an insert really needs to be an update.
Kim - I don't need to update the hash file. My objective is to find the action (insert or update) for the incoming record based on the records already existing in the target.

For example, if there is already a record with security number = 101, Date = 1/1/2001 00:00:00, and Type = MATURITY, I need not insert it; I would update the existing record in the database.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

What if you inserted 2 records before? They are not in the hash file. Your job says it is an insert. Really it is an update.
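
The failure mode being described looks like this (a Python sketch of the logic only; it assumes duplicate keys can arrive within one run, which is the point in dispute below):

seen = set()  # stands in for the pre-loaded hash file, empty for this key

def route(key, write_back):
    # Classify the row; optionally reflect inserts back into the reference.
    action = "update" if key in seen else "insert"
    if action == "insert" and write_back:
        seen.add(key)
    return action

key = ("101", "1/1/2001 00:00:00", "MATURITY")

# Without write-back, a second row with the same key is misclassified:
print(route(key, write_back=False))  # insert
print(route(key, write_back=False))  # insert  <- should be update

# With write-back, the second row is routed correctly:
print(route(key, write_back=True))   # insert (key added to the reference)
print(route(key, write_back=True))   # update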
Mamu Kim
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kduke wrote: What if you inserted 2 records before? They are not in the hash file. Your job says it is an insert. Really it is an update.

I will ALWAYS be getting a single record for a given security number + start date + type combination.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

How can you update anything then if you only have one of these?
Mamu Kim
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kduke wrote: How can you update anything then if you only have one of these?
If there is a change to an existing record (a record that was written during a previous day's batch), I update it.

The hash file is created by a job that reads from the database and populates it. This job runs at the beginning of the batch, before any of the other jobs start.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Verify the path to the hash file. If you're using the project as the default, do a sanity check and View Data. Sometimes the hash file being written to is not the same one being referenced. Also double-check that the filename is still the same; DataStage Server has this "feature" of sometimes updating the hash file name when you rename the link connected to the stage. Could you be running the job differently under multi-instance mode (intelligent job control), maybe using a different path to the hash file?
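
To illustrate the kind of path divergence being suggested (a Python sketch; the directory scheme and file name here are hypothetical, not actual DataStage settings):

import os

def hash_file_path(project_dir, instance_id=None):
    # If the path is built from an instance-specific parameter, each
    # invocation can silently reference its own copy of the hash file.
    if instance_id:
        return os.path.join(project_dir, instance_id, "SEC_LOOKUP")
    return os.path.join(project_dir, "SEC_LOOKUP")

print(hash_file_path("/dsprojects/prod"))            # /dsprojects/prod/SEC_LOOKUP
print(hash_file_path("/dsprojects/prod", "INST_3"))  # /dsprojects/prod/INST_3/SEC_LOOKUP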

You've made me a believer that you know what you're doing; it's just something pesky somewhere. It seems you've constructed things correctly.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Anjan Roy
Participant
Posts: 46
Joined: Mon Apr 12, 2004 9:51 am
Location: USA

Post by Anjan Roy »

kcbland wrote: Verify the path to the hash file. If you're using the project as the default, do a sanity check and View Data. Sometimes the hash file being written to is not the same one being referenced. Also double-check that the filename is still the same; DataStage Server has this "feature" of sometimes updating the hash file name when you rename the link connected to the stage. Could you be running the job differently under multi-instance mode (intelligent job control), maybe using a different path to the hash file?

You've made me a believer that you know what you're doing; it's just something pesky somewhere. It seems you've constructed things correctly.
I verified the path and data. It all looks correct. Maybe I am missing something here. Seems like I will have to take it as one of those DataStage 'don't-know-why-it-happens' "features"... :roll: :roll:
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Anjan Roy wrote: Seems like I will have to take it as one of those DataStage 'don't-know-why-it-happens' "features"... :roll: :roll:

NO.

I have trained, taught, and used this product since 1998. I am certified in deploying this product (I've got the paper to prove it) and served over 4 years with Ascential as a consultant. I have never encountered your error, and after deploying thousands of jobs over 7+ years in mission-critical environments, I should have.

You need to methodically trace down, from the beginning, how this hash file is created and referenced. You may consider going into a separate project, using a reduced dataset and a separate Unix work area. You will find it, and you will be angry when you do. I seriously doubt this is a bug.

Please let us know when you do find it, as it will bother some of us (like me, obviously) that you may give up and attribute it to "one of those things". If you need help, export the jobs involved and send them to me and I'll take a quick look: Ken@KennethBland.com
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle