dumping the hash file data to flat file easily

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

pxr87
Premium Member
Posts: 16
Joined: Thu Oct 27, 2005 9:19 am

dumping the hash file data to flat file easily

Post by pxr87 »

I need to dump hashed file data to flat files, and I can't create a DataStage job for every file just to dump it. Is there any way to do this by running a uv command or something like that, to create the flat files in UNIX? We have hundreds of hashed files and we can't create a job for each one. Please suggest.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I doubt you can do a bulk dump to a text file (someone correct me if I am wrong). You can certainly read it row by row depending upon what key you pass in, but you would also need to know all the keys for all the hashed files. I think you will be better off creating jobs.
You can also pass a select command through DSExecute() in a routine and get all the values of that particular hashed file, but for that you need the column names of all the hashed files. You cannot just do select * ...
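For what it's worth, here is a rough sketch of that idea as a server routine. Everything in it (the routine name, arguments, hashed file, column list and target path) is made up for illustration; it only shows the DSExecute() pattern, not a finished solution.

Code: Select all

      Subroutine DumpHashedFile(ErrorCode, HashedFile, ColumnList, TargetFile)
* Sketch only: run a UniVerse query against one hashed file via DSExecute()
* and write the captured output, line by line, to a text file.
      ErrorCode = 0

* Build and run the query; the column names still have to be supplied.
      Cmd = "LIST " : HashedFile : " " : ColumnList
      Call DSExecute("UV", Cmd, Output, SystemReturnCode)

* Write each captured output line to the target file, creating it if needed.
      OpenSeq TargetFile To SeqFile Else
         Create SeqFile Else
            ErrorCode = 1
            Return
         End
      End
      Loop
         Remove Line From Output Setting Delim
         WriteSeq Line To SeqFile Else ErrorCode = 2
      While Delim Do Repeat
      CloseSeq SeqFile

      Return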
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Do all your hashed files have the same or similar numbers of columns? If so, it is very easy to write a generic job to dump all hashed file contents to flat files.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Why do you want to do this?

There is a bulk unloader, called UNLOAD.FILE, that creates a text file that can be used by the bulk loader loadfile.

Before you can use UNLOAD.FILE it needs to be cataloged.

However, the format of the text file is specific to loadfile, and contains tabs and other markers to indicate things like multi-valued fields.

It may not be the tool you require.

Another possibility is "query to disk". You do this by setting up a print channel in mode 3 (to disk), into a file in the &HOLD& directory. For example

Code: Select all

SETPTR 0,132,2000,0,0,3,BANNER NEXT Hashed,BRIEF
Then add the keyword LPTR into any query you issue against the hashed file. The result will be in a file called Hashed_nnnn in the &HOLD& directory.
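To illustrate (the hashed file and column names below are placeholders, not anything from this thread), the whole sequence at the UniVerse prompt might look something like:

Code: Select all

SETPTR 0,132,2000,0,0,3,BANNER NEXT Hashed,BRIEF
LIST MyHashedFile COL1 COL2 COL3 LPTR
with the output then appearing as the Hashed_nnnn hold file described above.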

There are other options. But I can't think of a good reason to want to do this, or why you couldn't craft a DataStage job template and just load the different metadata into jobs created from that.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pxr87
Premium Member
Posts: 16
Joined: Thu Oct 27, 2005 9:19 am

Post by pxr87 »

ray.wurlod wrote:Why do you want to do this?
We have literally hundreds of temporary hashed files, which in some cases are very large (200 MB), and often we need to debug the data based on one field or another that matches in multiple rows, or thousands of rows. A simple search on View Data in the stage is not possible when multiple rows are involved, so I was looking for some uv/BASIC script or general DataStage job that can dump to a file, so that I can do a simple UNIX search.

ray.wurlod wrote:There is a bulk unloader, called UNLOAD.FILE, that creates a text file that can be used by the bulk loader loadfile.

Before you can use UNLOAD.FILE it needs to be cataloged.

However, the format of the text file is specific to loadfile, and contains tabs and other markers to indicate things like multi-valued fields.
It may not be the tool you require.
As we have not used multi-valued fields much, having the markers may not be a problem, but creating a loadfile for each hashed file is also not practicable unless there is a dynamic loadfile creation script available.
ray.wurlod wrote:Another possibility is "query to disk". You do this by setting up a print channel in mode 3 (to disk), into a file in the &HOLD& directory. For example

Code: Select all

SETPTR 0,132,2000,0,0,3,BANNER NEXT Hashed,BRIEF
Then add the keyword LPTR into any query you issue against the hashed file. The result will be in a file called Hashed_nnnn in the &HOLD& directory.
This is something like what I want, but I am not sure about the procedure you mentioned. Where do I need to run these commands (the UNIX uv prompt, the Administrator command window, or some other place)? When you say 'sql', I believe that is UV SQL rather than ANSI SQL or SQL*Plus? I am not sure how to write this SQL; instead I will look at the other possibility.

ray.wurlod wrote:There are other options. But I can't think of a good reason to want to do this, or why you couldn't craft a DataStage job template and just load the different metadata into jobs created from that.
I am not sure how a template can help. For example, if I create a template job with some generic hashed file name as input and some dynamic output flat file name, how can I populate the input column names (or metadata) dynamically? Or maybe I need to write some script that identifies the column names from each hashed file, concatenates all the column names with some delimiter (like '~'), and dumps them to a sequential file with some generic file name. Is this possible? Is there anything I can find on the forum?
Please suggest.
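As a purely illustrative sketch of the "identify the column names" part (the file and variable names are made up, and it assumes the hashed file's dictionary holds the usual D-type data descriptors), something along these lines in BASIC would build a '~'-delimited list of column names:

Code: Select all

* Sketch: build a '~'-delimited list of column names by reading a hashed
* file's dictionary. "MyHashedFile" is a placeholder name.
      HashedFile = "MyHashedFile"
      Header = ""
      Open "DICT", HashedFile To DictFvar Else
         Print "Cannot open DICT ":HashedFile
         Stop
      End
      Select DictFvar
      Loop
         ReadNext DictId Else Exit
         Read DictRec From DictFvar, DictId Else Continue
* Keep only D-type (data descriptor) records
         If DictRec<1>[1,1] = "D" Then
            If Header # "" Then Header := "~"
            Header := DictId
         End
      Repeat
      Print Header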
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

I am trying to do the same thing, i.e. dumping the hashed file data.
A)

Code: Select all

SETPTR 0,132,2000,0,0,3,BANNER Hashed,BRIEF
i) This option is too slow (especially for huge files), almost 20 times slower than unloading with a DataStage job.
ii) The page depth also limits the number of records that can be unloaded.

Is it the same for everyone or just me?

B) When I catalog the UNLOAD command and then try to unload a hashed file

Code: Select all

CATALOG APP.PROGS UNLOAD UNLOAD.FILE.B LOCAL 
UNLOAD hashedfilename /somedirectory/Extfilename
I get an error

Code: Select all

Program "UNLOAD": pc = 204, Improper data type.
Last edited by mystuff on Wed Dec 31, 2008 2:33 pm, edited 2 times in total.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Just you. :wink:

Doesn't really surprise me, seeing as how you are basically printing a 'report' to disk. Those numbers in the SETPTR function are line length, page length, top margin, bottom margin, etc. The UniVerse User Reference has all of the gory details.
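For anyone trying to map those positions, here is my reading of the argument order, as a sketch only (based on the documented SETPTR syntax of channel, width, length, top margin, bottom margin, mode, options; the UniVerse manual is the authority):

Code: Select all

* Sketch only: labelling the positional arguments from the SETPTR example above.
      Channel = 0          ;* print channel, the one the LPTR keyword uses
      Width   = 132        ;* line length
      Length  = 2000       ;* page length (the "page depth" mentioned above)
      TopMarg = 0          ;* top margin
      BotMarg = 0          ;* bottom margin
      Mode    = 3          ;* mode 3 = output to a file in the &HOLD& directory
      Options = "BANNER NEXT Hashed,BRIEF"   ;* gives the Hashed_nnnn hold file name
      Cmd = "SETPTR ":Channel:",":Width:",":Length:",":TopMarg:",":BotMarg:",":Mode:",":Options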

Me, I'd just stick to using a job.
-craig

"You can never have too many knives" -- Logan Nine Fingers
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

chulett wrote:Just you. :wink:

Doesn't really surprise me, seeing as how you are basically printing a 'report' to disk. Those numbers in the SETPTR function are line length, page length, top margin, bottom margin, etc. The UniVerse User Reference has all of the gory details.

Me, I'd just stick to using a job.
I just edited my post, could you look at that?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry, for what you've added you're going to have to wait for Grognard help.

I'm curious why all the shenanigans, why not just stick with the job(s) you've obviously already built and which are apparently 20x faster?
-craig

"You can never have too many knives" -- Logan Nine Fingers
mystuff
Premium Member
Posts: 200
Joined: Wed Apr 11, 2007 2:06 pm

Post by mystuff »

chulett wrote:Sorry, for what you've added you're going to have to wait for Grognard help.

I'm curious why all the shenanigans, why not just stick with the job(s) you've obviously already built and which are apparently 20x faster?
The idea is to avoid having to create a job every time a hashed file has to be accessed.