Update hash file / Update hashed file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Update hash file / Update hashed file

Post by jonathanhale »

I have already carefully read these posts: Hashed file update - only to *some* existing columns; Delete specific records from hashfile / hashed-file; and Usage Of Universe Stage.

I have managed to get the UV stage working: it can reach the hashed file and show its data.

Where I'm struggling is in understanding the relationship between the columns defined on the Columns tab of the UV stage's Output tab and any columns referenced in the user-defined SQL.

e.g. to execute the following statement:

Update HF_COP_Process set Status_Code = 'Test2!!!' where Day_Number = 2 and Partition_Number = 1 and Container_ID = 'XXXYYYZZZ'

(the columns in the WHERE clause are keys)

Which columns should I define on the Columns tab of the UV stage's Output tab?

Also, as this is the first time I've tried to use UV stages (thanks, Ray, for the info in the above-referenced posts), I'm not sure of the best way of using them.

If all I want to do is update one row once, how should the UV stage be implemented? Do I just make a tiny Server job with one UV stage and a dummy output?

Are there any UV Stage properties I should be paying special attention to?

Many thanks for any additional information on updating a hashed file.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Not sure what you're asking re: the columns. Once you've got the UniVerse stage talking to the hashed file you built, it isn't treated any differently than any other SQL against a 'normal' table would be. Mention the columns you need to mention. Mark the columns used in the WHERE clause as key columns and the data columns as non-key. Not sure if that helps or not...
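
For example, for the statement you posted, the grid might look something like this (a sketch only; the SQL types are my assumption, so match them to however you actually built the file):

Column Name        Key  SQL Type
Day_Number         Yes  Integer
Partition_Number   Yes  Integer
Container_ID       Yes  VarChar
Status_Code        No   VarChar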

And yes, a one-time update could be as simple as a Server job with a UV stage sourced from a Transformer constrained to send one row to it with the proper metadata.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

An Output tab is only pertinent for a SELECT. If you want to be updating, then your stage requires an Input tab.

If the chosen action is update, then any columns that you define in the columns grid of your job design go:
  • into the SET clause of the UPDATE statement if they are not marked as Key
  • into the WHERE clause if they are marked as Key
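
With the columns from the original example, that rule would generate something like (a sketch, not the stage's literal output):

UPDATE HF_COP_Process SET Status_Code = ? WHERE Day_Number = ? AND Partition_Number = ? AND Container_ID = ?;
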
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jonathanhale
Premium Member
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

UV Stage Update needs committing?

Post by jonathanhale »

Hi Guys,

I now have a mini-job as described: it reads one row into a Transformer, and the Transformer output then feeds the UV stage.

The UV stage user-SQL is:

Update HF_COP_Process_#COP_PS.COP_DN#_#COP_PS.COP_PN# set Docs_Extracted = ? where Day_Number = ? and Partition_Number = ? and Container_ID = ?

Docs_Extracted is the only non-key column. All is correct in the columns grid (column order, column names, and data types).

Seems to work... i.e. Director reports 1 row read, 1 row written, everything is green, and I'm certain that the data values are being passed correctly.

The problem is that when I manually re-open the "updated" hashed file in another job, nothing has been updated.

Is there a commit that I'm missing? Any other mistake you can imagine I might be making?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

There's a terminator (";") missing from the SQL. Apart from that the SQL looks OK. Is there a VOC pointer to the hashed file, and is the hashed file name being resolved correctly from the job parameters? (Look in the job log for the answer to this question.) The commit is managed automatically; the UV stage uses auto-commit.
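
If you want to check, the Administrator client's Command window will show you whether a VOC entry exists (a sketch; substitute your actual resolved file name, which I'm only guessing at here):

CT VOC HF_COP_Process_2_1

and, for a hashed file created in a directory path rather than in the account, you can create a pointer with:

SETFILE /path/to/HF_COP_Process_2_1 HF_COP_Process_2_1 OVERWRITING
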
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Post by jonathanhale »

Do I need a VOC pointer if the target hashed file is "Use Account name" and the UV stage data source name is "localuv"?

With or without the SQL terminator ";", the behaviour is the same. Could that mean that the SQL statement is not being processed?

The entire job is:

Hashed File ----- Transformer ----- Universe Stage

Do I need an Output from the UV?

I have verified that the file name is resolving correctly and that the data items in the columns are correct, although I had to "break" the job to see any detail in the log.

If the job completes OK, I get no detail of the processed SQL statement in Director.

Below are the logs from the broken runs used to verify file name resolution and column content:

testUV_inp..Transformer_4.DSLink6: DSD.BCIPut call to SQLExecute failed.
SQL statement:Update HF_COP_Process_2_1 set Docs_Extracted = ? where Day_Number = ? and Partition_Number = ? and Container_ID = ?
SQLSTATE=S1000, DBMS.CODE=930164
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: Parameter marker 2 does not match expected data type.
SQLSTATE=S1000, DBMS.CODE=930121
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: CursorDefineParam failure.

Day_Number = "1"
Partition_Number = 2
Container_ID = "S0P18HEP60000LGJ"
Docs_Extracted = 100

testUV_inp..Transformer_4: |testUV_inp..UniVerse_39.DSLink6: DSD.BCIOpenW call to SQLPrepare failed.
Update HF_COP_Process_2_#1COP_PS.COP_PN# set Docs_Extracted = ? where Day_Number = ? and Partition_Number = ? and Container_ID = ?
SQLSTATE=S0002, DBMS.CODE=950390
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: Table "HF_COP_Process_2_" does not exist.|
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Hmmm... three questions, all with the same answer: no.

Not sure what's going on, but I'm curious - why are you going to all this trouble when you could simply do a reference lookup to the hashed file, get all of the current values for that key, update the ones you need, and then write the entire record back out? Sure, you need to play games like this for a delete, as that functionality does not exist natively in a hashed file, but this seems like much brain-cell death for very little (if any) gain. Just wondering.
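
Something like this (a sketch; the source stage is whatever drives your one row):

                 Hashed File (reference)
                           |
Source ----- Transformer ----- Hashed File (target)

Look the record up by its key in the Transformer, overwrite just the columns you need to change, and write the whole record back to the same hashed file.
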
-craig

"You can never have too many knives" -- Logan Nine Fingers
jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Post by jonathanhale »

Hi Craig, very valid point.

But I am also interested in getting this UV stage to work as an academic exercise.

Which method would you guess to be the best performer?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Was wondering if the exercise was purely academic. Nothing wrong with that in the least; learning new stuffs is a good thing.

I've never seen a need to clock one against the other; the normal lookup / overwrite process has always been a good performer for me over the years. Besides, we "path" all of our hashed files, and the thought of playing VOC games with them or moving them into the Project... not very high up on my list of Things To Try When I Get Bored. :wink:

I'm curious, though, what the resolution to your issue turns out to be. I thought perhaps your bind parameters (I really don't like the positional question marks, and far prefer the numbered ones like OCI uses, e.g. :1, :2) weren't quite lining up with your columns the way you thought they were.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Post by jonathanhale »

I'm as sure as I can be that the columns are lined up correctly.

Perhaps you can verify my interpretation of the log extracts I put in a previous post?

In the first log, I had deliberately set the data type of Day_Number to an incorrect VarChar.

My interpretation of the log message is that "Parameter marker 2 does not match expected data type" refers to the 2nd question mark in the SQL, which is correct. (The same destructive testing also confirms the correct placement of the other two key columns.)

In the columns grid of the UV stage I have the column order as Day_Number, Partition_Number, Container_ID, Docs_Extracted.

I also tried re-ordering with Docs_Extracted, Day_Number, Partition_Number, Container_ID.

Also doesn't work... :(
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Import the table definition and use whatever is imported. When there is more than one key column, you have to get the order of the columns within the key correct, as well as the order of the columns within the data record. Storage in a hashed file record is by field number, not by name, although the UniVerse stage (which uses SQL) *should* be able to resolve the column names no matter where/how they are stored. Further investigation is required.
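
If in doubt, you can check the stored field numbers from the Command window with something like (a sketch; substitute your actual file name):

LIST DICT HF_COP_Process_2_1

which lists each column's type and location (field number) in the record.
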
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jonathanhale
Premium Member
Posts: 33
Joined: Tue Nov 06, 2007 1:09 pm

Post by jonathanhale »

Do all columns in the table (hashed file) need to be addressed in the UPDATE statement?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

No. A hashed file is simply a mechanism by which a database table is implemented. Any valid UPDATE statement will work, provided no table security or integrity constraints are violated (and, by default, there aren't any of these).
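
For example, reusing the columns and values from earlier in this thread, a statement that touches only one data column is perfectly valid (a sketch):

UPDATE HF_COP_Process_2_1 SET Docs_Extracted = 100 WHERE Day_Number = 2 AND Partition_Number = 1 AND Container_ID = 'XXXYYYZZZ';
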
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.