Maximum number of KEY Fields in a Hash File

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Maximum number of KEY Fields in a Hash File

Post by bapajju »

Hello,

We are writing records to a hash file that has 70 KEY fields, but we are unable to write data to it. We get the error messages below.

1. EDWSTGCsksINS_debugjob..Hashed_File_42.All_Ins_to_HASH1: ds_uvput() - Write failed for record id '500.....

and

2. Program "JOB.618388182.DT.1418661486.TRANS1": Line 508, Attempted WRITE with record ID larger than file/table maximum record ID size of 768 characters.

Any clue on this? Is there a limit on the size or number of key fields?


Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yes, there is a maximum set in the uvconfig file and guess what? It's 768. :shock:

Code:

# MAXKEYSIZE - This sets the maximum number of characters that can
#   be used within a primary key.  The minimum value of 255 retains
#   the historical maximum key length as well as the historical encoding
#   of record locks within a 64-byte entry.  Any other value, which must
#   be a multiple of 64 between 256 and 2048, will extend the maximum key
#   length to that value.  Additionally, the full record id will be stored
#   within the record lock entry.  This value, once raised, can only be
#   reset by calling Ascential Support. Do NOT change this from the default
#   value of 768 without understanding its effect on the record lock table.
MAXKEYSIZE 768
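
For a sense of scale: as I understand it, the record id of a hashed file is all of the key columns concatenated together (plus separator characters when there is more than one key), so with 70 key columns you have an average budget of roughly 768 / 70, about 10 characters per key value, before the write blows up.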
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

... and can not be tuned any higher.

Prove to me that you can not obtain uniqueness with fewer than 70 columns. This is a ridiculous amount! The most I've ever seen in very many years in database technology is nine. Sheesh! :roll:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Post by bapajju »

Thanks for the response, Ray. I do understand what you are saying.

IT IS RIDICULOUS :)

I am not using the hash file for uniqueness. I am trying to use the hash file TO REPLACE the Link Collector, which sometimes throws a "MUTEX" error, and I am trying to avoid that. I am bringing data from two transformer stages, collecting all the records, and passing them to a single target. To pass all the records from both sources I have made all the fields key fields; in this way I will not eliminate any record.

Is there any other way to replace link collector?

Thanks in Advance
ray.wurlod wrote:... and can not be tuned any higher.

Prove to me that you can not obtain uniqueness with fewer than 70 columns. This is a ridiculous amount! The most I've ever seen in very many years in database ...
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

bapajju wrote:
Is there any other way to replace link collector?

Thanks in Advance
You could just write them to sequential files and concatenate them in UNIX.
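A minimal sketch of that approach as a DS BASIC after-job subroutine, assuming hypothetical file paths (DSExecute and DSLogWarn are standard server routine calls; everything else here is a placeholder):

Code:

* Sketch of an after-job subroutine body: concatenate the two
* extracts into a single file for loading. Paths are hypothetical.
Cmd = "cat /data/stage/src1.out /data/stage/src2.out > /data/stage/merged.out"
Call DSExecute("UNIX", Cmd, Output, SysRet)
If SysRet <> 0 Then
   * Log a warning with the captured shell output if cat failed
   Call DSLogWarn("Concatenation failed: " : Output, "ConcatExtracts")
End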

And besides, surely you don't need all the columns to be keys to make a record unique - I think that is what Ray was alluding to.
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Post by bapajju »

Out of the 70 fields, any one field (except for the first) can change while the rest remain the same, and I do not know which field might change.
ShaneMuir wrote:
bapajju wrote:
Is there any other way to replace link collector?

Thanks in Advance
You could just write them to sequential files and concatenate them in UNIX.

And besides, surely you don't need all the columns to be keys to make a record unique - I think that is what Ray was alluding to.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Do you have an English language specification of what you are trying to accomplish, that you could share with us?

There's usually another way, but we don't want to waste time suggesting things that aren't what's needed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bapajju
Participant
Posts: 82
Joined: Wed Nov 19, 2003 11:58 pm

Post by bapajju »

Ray,

Let me explain what I am trying to achieve.

1. I have two sources.
2. I am trying to process these records into a single target table (TERADATA) through one DataStage job.
3. I am trying to merge these records and push them into the target table.
4. There might be a case where the two sources send exactly the same record, but I do not want to eliminate either of them. I want to process both records.

Please let me know if I can achieve this through a single job.

Thanks in advance


ray.wurlod wrote:Do you have an English language specification of what you are trying to accomplish, that you could share with us?

There's usually another way, but we don't want to waste time suggesting things that ...
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

bapajju wrote:
I am not using the hash file for uniqueness. I am trying to use the hash file TO REPLACE the Link Collector, which sometimes throws a "MUTEX" error, and I am trying to avoid that. I am bringing data from two transformer stages, collecting all the records, and passing them to a single target. To pass all the records from both sources I have made all the fields key fields; in this way I will not eliminate any record.
If you want to replace the Link Collector, that is the worst way to do it. Not only is there a size limitation on the record ID, but writing to a hashed file is also an expensive affair: every record has to go through the hashing algorithm, which takes a long time with so many keys, and that time grows with the number of records. Secondly, hashed files have a 2GB size limitation by default...

Suggestion 1: Enable row buffering in inter-process mode and increase the time-out to a higher value. You will have to do some R&D to find the optimum time-out.

Suggestion 2: Write to two sequential files and, in the after-job script or an after-stage routine, append one file to the other.
Shantanu Choudhary
ShaneMuir
Premium Member
Posts: 508
Joined: Tue Jun 15, 2004 5:00 am
Location: London

Post by ShaneMuir »

bapajju wrote:Ray,

Let me explain what I am trying to achieve.

1. I have two sources.
2. I am trying to process these records into a single target table (TERADATA) through one DataStage job.
3. I am trying to merge these records and push them into the target table.
4. There might be a case where the two sources send exactly the same record, but I do not want to eliminate either of them. I want to process both records.

What is the key on the target table? Surely if you have two records the same they can't both be loaded? What processing are you doing?

It would seem to me that you need to create a new key (e.g. a generated number) to append to the front of the row, if you want to keep duplicates.
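A minimal sketch of that idea, with hypothetical column and prefix names; @OUTROWNUM is a standard server-job system variable that counts the rows written down the link:

Code:

* Hypothetical output-column derivations, one per Transformer.
* The source prefix keeps the two streams unique even when all
* the data columns are identical.
* Transformer1, derivation for generated column GEN_KEY:
"SRC1-" : @OUTROWNUM
* Transformer2, derivation for generated column GEN_KEY:
"SRC2-" : @OUTROWNUM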
Yuan_Edward
Participant
Posts: 73
Joined: Tue May 10, 2005 6:21 pm
Location: Sydney

Post by Yuan_Edward »

A simple way (probably not a good way) is to put two input links to one TeraData OCI, something like:

source1=>Transformer1=>
TeraData OCI
source2=>Transformer2=>
bapajju wrote:Ray,

Let me explain what I am trying to achieve.

1. I have two sources.
2. I am trying to process these records into a single target table (TERADATA) through one DataStage job.
3. I am trying to merge these records and push them into the target table.
4. There might be a case where the two sources send exactly the same record, but I do not want to eliminate either of them. I want to process both records.

Please let me know if I can achieve this through a single job.

Thanks in advance


ray.wurlod wrote:Do you have an English language specification of what you are trying to accomplish, that you could share with us?

There's usually another way, but we don't want to waste time suggesting things that ...
Edward Yuan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Simple is often the most effective. But, with this design, how do you handle the duplicate key? Or are you generating surrogate (artificial) keys into the target table?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
talk2shaanc
Charter Member
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

Yuan_Edward wrote:A simple way (probably not a good way) is to put two input links to one TeraData OCI, something like:

source1=>Transformer1=>
TeraData OCI
source2=>Transformer2=>
I don't know about Teradata, but DB2 and Oracle establish a lock before writing to the table. So in your scenario there will be contention for that lock: while one link (process) is writing to the table under its lock, the other will wait, and it may abort after some time.
There is also a chance that your table or tablespace ends up in an unusable state; in DB2 it would go into Load Pending state, and I don't remember what the equivalent is called in Oracle.
Shantanu Choudhary
DeepakCorning
Premium Member
Posts: 503
Joined: Wed Jun 29, 2005 8:14 am

Post by DeepakCorning »

I do not think declaring all the columns as keys is the solution here. The simplest approach I can think of is to declare an extra column in the target and assign it a counter (maybe a row counter), the same concept as a surrogate key generator.
So if source 1 has 10 rows and source 2 has 10 rows, the target will have 20 rows with unique counter values.