Delete specific records from hashfile / hashed-file
Moderators: chulett, rschirm, roy
Delete specific records from hashfile / hashed-file
Hi there,
I have a flat-file A and a hashed file B. Records in A are a subset of records in B. I need to delete those records in the hashfile B that are present in flat-file A. Can anyone help me on how to go about it in a server job/routine?
Thanks,
Kaus
I have a flat-file A and a hashed file B. Records in A are a subset of records in B. I need to delete those records in the hashfile B that are present in flat-file A. Can anyone help me on how to go about it in a server job/routine?
Thanks,
Kaus
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Use a UV stage with user-defined SQL containing a DELETE statement such as Provide the key value from your text file.
If the hashed file has been created in a directory you will need a VOC pointer to it. Search the forum for the SETFILE command.
Code: Select all
DELETE FROM tablename WHERE key = ?
If the hashed file has been created in a directory you will need a VOC pointer to it. Search the forum for the SETFILE command.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Did you mean the UniVerse stage? I tried using the stage but am not sure what is to be filled in the Data Source Name etc. I tried filling in the name of the hashed-file (images_metadata_hashed) but it didn't work. I did a view data by putting in a SELECT SQL and got the following error:
Code: Select all
JB_SVTT_Delete_HashedFile_UV..UniVerse_1: DSD.BCIConnect connecting to images_metadata_hashed, call to SQLConnect failed.
SQLSTATE=IM002, DBMS.CODE=0
[DataStage][SQL Client][ODBC][DataDirect][ODBC lib] Data source name not found and no default driver specified
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The data source name needs to be localuv which is pre-defined in your uvodbc.config file. User name and password will be disabled, because localuv is a pre-established connection to the DataStage project itself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks Ray. I tried with localuv after confirming that it is defined in the uvodbc.config file. I gave the hashed-file name in the Table Name field of the Outputs-> General tab. Now I'm getting error "table does not exist"
Code: Select all
JB_SVTT_Delete_HashedFile_UV..UniVerse_1.DSLink2: DSD.BCIOpenR call to SQLExecDirect failed.
Statement was:SELECT "ship_code", "doc_key", "piece_id", "event_key", "data_source_type_id", "coda_timestamp", "etl_timestamp", "pcdoc_sent_timestamp" FROM "images_metadata_hashed"
SQLSTATE=S0002, DBMS.CODE=950390
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: Table "images_metadata_hashed" does not exist.
Another thought - why not logically delete the records? Rather than futz with all this, add a column and update it for the 'deleted' records. Check for the flag whenever you access the hashed file, treat it like a hashed miss when you find that it is set.
That will keep your account based hashed file out of your 'account' aka Project and put it back in the original directory without any extra shenanigans.
That will keep your account based hashed file out of your 'account' aka Project and put it back in the original directory without any extra shenanigans.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
I had to rule that out because the size of my hashed-file would keep on increasing with logical deletes. My process might encounter about a million records a day.
Now, when I try deleting, I am giving the query as below (user-defined) but it is not working. What is the correct way of specifying the input column? I am getting event_key and doc_key from my flat-file
Now, when I try deleting, I am giving the query as below (user-defined) but it is not working. What is the correct way of specifying the input column? I am getting event_key and doc_key from my flat-file
Code: Select all
DELETE FROM "images_metadata_hashed" WHERE "event_key" = :event_key AND "doc_key" = :doc_key;
First generate sql to get the syntax right, then tweak from there. For example 'Replace existing rows completely' includes a delete. I'll let others chime in on the space issue but I don't believe deletes like this will actually get you any space back.
Not sure because I've never had a need to do this, I'll rebuild a hashed file before I'd ever try deleting anything from it.
Not sure because I've never had a need to do this, I'll rebuild a hashed file before I'd ever try deleting anything from it.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Thanks for the hint
For the record, the query needed to be:
Can someone throw more light on whether or not I will be able to reclaim space this way?
For the record, the query needed to be:
Code: Select all
DELETE FROM "images_metadata_hashed" WHERE ("doc_key" = ? AND "event_key" = ?);
Why dont you check it out yourself by running 'ANALYZE.FILE images_metadata_hashed' command on your hashed file before and after your delete and check out the size.kausmone wrote: Can someone throw more light on whether or not I will be able to reclaim space this way?
This command can be run from the administrator.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You do not reclaim space through DELETE operations. Free space is preserved in hashed files against the likelihood that it will be needed again in the future.
The only command that can reclaim space (apart from deleting and re-creating the hashed file) is RESIZE.
The only command that can reclaim space (apart from deleting and re-creating the hashed file) is RESIZE.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
If the freed space is going to be reclaimed by the hashed-file, it still serves the purpose, since my hashed-file is going to get appended to every hour or so. I think I should stick to this approach of deleting records 'physically' instead of logical deletes, where there will be no reclaimable space?