How to read a zip file?

Posted: Fri Mar 15, 2013 2:52 am
by pkll
Hi,
I have a requirement to read a zip file as a sequential file source. The data size is more than 4GB.

Could you please help me read the zip file?

Posted: Fri Mar 15, 2013 2:54 am
by prasannakumarkk
In the before-job subroutine, use gunzip or another unzip utility to decompress the file, then read the resulting sequential file. As far as I know, you cannot read the zip file directly. If there is a way, it would be a great surprise to me.

Posted: Fri Mar 15, 2013 3:15 am
by ray.wurlod
Easiest is to unzip; you can make this a Filter command in the Sequential File stage and read the output of the unzip utility.
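
As a rough illustration of what such a filter does, here is a minimal Python sketch that streams one member of a zip archive to an output stream in chunks, so the 4GB of data never has to fit in memory. The function name `stream_zip_member` and its parameters are hypothetical and not part of DataStage; it assumes the archive is a standard zip that Python's `zipfile` module can open.

```python
import shutil
import zipfile


def stream_zip_member(zip_source, out, member=None, chunk_size=1024 * 1024):
    """Copy one archive member to `out` in chunks and return its name."""
    with zipfile.ZipFile(zip_source) as archive:
        # Assumption: when no member is named, the first entry is the data file.
        name = member if member is not None else archive.namelist()[0]
        with archive.open(name) as src:
            # copyfileobj reads and writes chunk_size bytes at a time.
            shutil.copyfileobj(src, out, chunk_size)
    return name
```

Called as `stream_zip_member("/path/to/source.zip", sys.stdout.buffer)` from a small script (with `import sys`), this writes the decompressed records to standard output, which is the shape of output a filter command is expected to produce.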

Another possibility, if you have a Java class that can read the zip file (which assumes a known format), is to use the Java Pack in DataStage to read the file.

Posted: Sun Mar 17, 2013 8:51 pm
by pkll
Thanks for your reply.

However, I have already tried a before-job subroutine, a Filter command in the Sequential File stage, and a command activity. These only work for data below 2GB. With 4GB of data the job aborts.

Error is :
Sequential_File_0,0: Consumed more than 100,000 bytes looking for record delimiter; aborting.

Could you please help me with this?

Posted: Sun Mar 17, 2013 11:18 pm
by chulett
What exactly is in your zip file? 4GB of what? That just means it couldn't find the end of the first record... assuming you even have records in there.
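
One way to check this outside DataStage is to scan the start of the unzipped data for the record delimiter. The sketch below is a hypothetical helper, not DataStage's own logic; the 100,000-byte limit simply mirrors the number in the error message, and the newline delimiter is an assumption that should be replaced with whatever the stage's Format tab specifies.

```python
def first_record_length(stream, delimiter=b"\n", limit=100_000, chunk_size=8192):
    """Return the length of the first record (without its delimiter),
    or None if no delimiter appears within the first `limit` bytes."""
    seen = b""
    while len(seen) < limit:
        # Never read past the limit, so the scan stays bounded.
        chunk = stream.read(min(chunk_size, limit - len(seen)))
        if not chunk:
            break  # end of data reached before any delimiter
        seen += chunk
        pos = seen.find(delimiter)
        if pos != -1:
            return pos
    return None
```

A result of `None` on the data the stage actually receives would be consistent with the error above: either the stage is being fed still-compressed bytes, or the delimiter setting does not match the file.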

Posted: Sun Mar 17, 2013 11:28 pm
by pkll
Hi Chulett,

My source is a sequential file. I have to read 4GB of zipped data from the source. I am able to read 2GB of zipped data, but I am unable to read more than 4GB of zipped data.
Could you please help me read more than 4GB of zipped data from the source?

Posted: Sun Mar 17, 2013 11:35 pm
by prasannakumarkk
Can you please specify what type of file it is: fixed length or delimited?
What is the total length of all columns? Do all the records in the file have this length?
Please list all the properties set on the Format tab.


Also try specifying the "Number of readers per node" property in the Sequential File stage.

Posted: Sun Mar 17, 2013 11:55 pm
by jwiles
Have you examined the unzipped data from the larger (>4GB) files to verify that the data is in the expected format?

Perhaps the larger files contain multiple smaller files which are also zipped?
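
A quick way to check for this is to list the archive's entries before unzipping anything. The following is a minimal sketch using Python's standard `zipfile` module; the function name `describe_archive` is hypothetical, and flagging nested archives by file extension (`.zip`, `.gz`) is only a heuristic assumption.

```python
import zipfile


def describe_archive(zip_source):
    """Return (name, uncompressed_size, looks_nested) for each archive entry."""
    nested_suffixes = (".zip", ".gz")  # assumption: nested archives keep these extensions
    with zipfile.ZipFile(zip_source) as archive:
        return [
            (
                info.filename,
                info.file_size,  # uncompressed size in bytes
                info.filename.lower().endswith(nested_suffixes),
            )
            for info in archive.infolist()
        ]
```

More than one entry, or any entry flagged as nested, would mean a single unzip-to-stdout step does not yield one clean stream of records.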

Regards,

Posted: Sun Mar 17, 2013 11:58 pm
by ray.wurlod
Another possibility is that you have a 32-bit unzip utility on your Engine. To unzip more than 2GB you require a 64-bit-capable unzip utility.
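
The arithmetic behind that 2GB figure can be sketched as follows. A signed 32-bit offset tops out at 2^31 - 1 bytes, just under 2GB, and the size fields of the classic zip format (without the ZIP64 extension) are 32-bit unsigned, just under 4GB. The helper name `size_limits_crossed` is hypothetical, and whether a given unzip build is actually bound by either limit is something to verify on the Engine itself.

```python
SIGNED_32_MAX = 2**31 - 1    # 2,147,483,647: largest signed 32-bit file offset
UNSIGNED_32_MAX = 2**32 - 1  # 4,294,967,295: largest classic (non-ZIP64) zip size field


def size_limits_crossed(size_in_bytes):
    """Report which 32-bit limits a file of the given size exceeds."""
    return {
        "exceeds_signed_32bit": size_in_bytes > SIGNED_32_MAX,
        "exceeds_unsigned_32bit": size_in_bytes > UNSIGNED_32_MAX,
    }
```

A 4GB file crosses both limits, so under these assumptions it needs both large-file support in the utility and a ZIP64-capable archive format.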

Posted: Mon Mar 18, 2013 6:50 am
by venkateshrupineni
I think this might be useful:
gunzip the file and split it into multiple files, then convert the job into a multiple-instance job, parameterizing the source file name
and passing the source file parts in at the sequence level.
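
The splitting step could look something like the sketch below, which cuts the unzipped data into parts on line boundaries so that no record is broken across files. The function name `split_by_lines`, the `part_` prefix, and the assumption of newline-terminated records are all illustrative and not taken from the job in question.

```python
import os


def split_by_lines(src, out_dir, lines_per_part, prefix="part_"):
    """Write consecutive groups of `lines_per_part` lines from the binary
    stream `src` into numbered files and return their paths in order."""
    paths = []
    out = None
    for count, line in enumerate(src):
        if count % lines_per_part == 0:
            # Start a new part file every `lines_per_part` lines.
            if out is not None:
                out.close()
            path = os.path.join(out_dir, f"{prefix}{len(paths):04d}")
            out = open(path, "wb")
            paths.append(path)
        out.write(line)
    if out is not None:
        out.close()
    return paths
```

Each returned path can then be passed as the source file name parameter to one instance of the job.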