Multiple seq file to Hash file conversion

Posted: Wed Dec 10, 2003 7:30 am
by raj_cipher
Hi,

I want to convert about 100 sequential files to 100 hashed files. I found that I can convert them one by one. Is there any way to do them all in one go?

Posted: Wed Dec 10, 2003 8:23 am
by ray.wurlod
Do they all have the same record (line) layout? If so you can create a multi-instance job with the file name and hashed file name parameterized, and fire off all instances out of a job control that uses a loop of some kind to deliver parameter values to the separate instances.
If they have differing layouts, or even have to have different column names, you're pretty much stuck with the one at a time approach. :cry:

Posted: Wed Dec 10, 2003 8:27 am
by chulett
Well, you don't need 100 jobs to do this. You could do it in one or perhaps a small number of jobs by dropping multiple "copies" into the same job. A series of streams in the same job will (essentially) run in parallel.

I'm assuming here that the 100 files are not identically formatted. If they are, you could build a looping construct in a Batch to iterate through the files one by one and call a single parameterized job. That assumes, again, that the files are identical from a metadata standpoint and that you could generate (parameterize) the names of the hash files to build, perhaps from the filename.
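
As a rough illustration of that loop, here is a minimal Python sketch that derives a hashed file name from each sequential file name and builds one `dsjob -run` command line per file. It only builds the commands; it does not run them. The job parameter names `SourceFile` and `HashedFile`, the `HASH_` prefix, and the project and job names are hypothetical, so substitute whatever your parameterized job actually defines.

```python
from pathlib import Path


def hashed_file_name(seq_path, prefix="HASH_"):
    """Derive a hashed file name from a sequential file name.

    The prefix-plus-uppercase-stem convention is an assumption for
    illustration, not something the job requires.
    """
    return prefix + Path(seq_path).stem.upper()


def build_run_commands(seq_paths, project, job):
    """Build one dsjob command line (as an argument list) per input file.

    SourceFile and HashedFile are hypothetical job parameter names.
    -wait makes each call block until the job finishes, so running the
    commands in order processes the files one by one.
    """
    commands = []
    for seq_path in seq_paths:
        commands.append([
            "dsjob", "-run", "-wait",
            "-param", f"SourceFile={seq_path}",
            "-param", f"HashedFile={hashed_file_name(seq_path)}",
            project, job,
        ])
    return commands


if __name__ == "__main__":
    files = ["/data/in/customers.txt", "/data/in/orders.txt"]
    for cmd in build_run_commands(files, "MyProject", "SeqToHash"):
        print(" ".join(cmd))
```

Each argument list could be passed to `subprocess.run` on the DataStage server. The same loop could equally be written in job control code inside a Batch.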

Posted: Wed Dec 10, 2003 10:57 am
by inter5566
Why do they have to be put into 100 hash files rather than just one hash file?
If all file layouts are equal, then maybe you could tack on an indicator to each row, concatenate the files together and load into one hash file.
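
A minimal Python sketch of that idea, assuming comma-delimited files with no header rows and using each file's base name as the indicator (both are assumptions, adjust to your data):

```python
from pathlib import Path


def concatenate_with_indicator(seq_paths, out_path, delimiter=","):
    """Write every row of every input file to out_path, prefixed with an
    indicator column taken from the source file's base name.

    Returns the number of rows written. Blank lines are skipped.
    """
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        for seq_path in seq_paths:
            indicator = Path(seq_path).stem  # e.g. "customers" for customers.txt
            with open(seq_path, newline="") as src:
                for line in src:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue
                    out.write(f"{indicator}{delimiter}{line}\n")
                    rows_written += 1
    return rows_written
```

If the same key values can appear in more than one source file, the indicator would presumably need to be part of the hash file key; otherwise later rows would overwrite earlier ones with the same key.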

Just thinking out loud.
Steve