How to read a zip file?

pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

How to read a zip file?

Post by pkll »

Hi,
I have a requirement to read a zip file as the source of a Sequential File stage. My data size is more than 4GB.

Could you please help me with how to read the zip file?
prasannakumarkk
Participant
Posts: 117
Joined: Wed Feb 06, 2013 9:24 am
Location: Chennai,TN, India

Post by prasannakumarkk »

In the before-job subroutine, use gzip or another zip utility command to unzip the file, then read the resulting sequential file. You cannot read the zip file directly. If there is a way, it would be a great surprise to me.
Thanks,
Prasanna
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Easiest is to unzip; you can make this a Filter command in the Sequential File stage and read the output of the unzip utility.

Another possibility, if you have a Java class that can read the zip file (which assumes a known format), is to use the Java pack in DataStage to read the file.
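
To illustrate what a filter does, here is a minimal Python sketch, assuming the source is gzip-compressed. It decompresses the file in fixed-size chunks and writes the bytes to standard output, so the full data set never has to fit in memory or be written to disk first. The function name `stream_gzip` and the chunk size are illustrative assumptions, not anything DataStage provides; in practice a stock utility such as `gunzip -c` (or `unzip -p` for .zip archives) would normally do this job.

```python
import gzip
import shutil
import sys


def stream_gzip(src_path, out=None, chunk_size=1024 * 1024):
    """Decompress a gzip file chunk by chunk into a binary output stream.

    `out` defaults to standard output, which is what a filter-style
    command would write to. Hypothetical helper for illustration only.
    """
    if out is None:
        out = sys.stdout.buffer
    with gzip.open(src_path, "rb") as compressed:
        # copyfileobj reads and writes chunk_size bytes at a time
        shutil.copyfileobj(compressed, out, chunk_size)
    out.flush()
```

Calling `stream_gzip("/path/to/source.gz")` from a small script would emit the uncompressed records on stdout; the path here is a placeholder.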
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

Thanks for your reply.

I have tried a before-job subroutine, the Filter command in the Sequential File stage, and an Execute Command activity. These work only for less than 2GB of data; they do not work for 4GB. When I tried with 4GB of data, the job aborted.

The error is:
Sequential_File_0,0: Consumed more than 100,000 bytes looking for record delimiter; aborting.

Could you please help me with this?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What exactly is in your zip file? 4GB of what? That just means it couldn't find the end of the first record... assuming you even have records in there.
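
One way to check this outside DataStage is to look at the first 100,000 decompressed bytes (the limit quoted in the error) and count the candidate record delimiters. The sketch below assumes a gzip-compressed source; `sniff_delimiters` is a hypothetical helper name. If every count comes back zero, the stage has nothing to split records on with the usual newline settings.

```python
import gzip


def sniff_delimiters(path, sample_size=100_000):
    """Count common record delimiters in the first sample_size decompressed bytes."""
    with gzip.open(path, "rb") as fh:
        sample = fh.read(sample_size)
    crlf = sample.count(b"\r\n")
    return {
        "bytes_sampled": len(sample),
        "crlf": crlf,                              # DOS-style line ends
        "lf_only": sample.count(b"\n") - crlf,     # UNIX-style line ends
        "cr_only": sample.count(b"\r") - crlf,     # bare carriage returns
    }
```

The result can then be compared with the record delimiter set on the stage's Format tab.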
-craig

"You can never have too many knives" -- Logan Nine Fingers
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

Hi Chulett,

My source is a sequential file. I have to read 4GB of zipped data from the source. I am able to read 2GB of zipped data, but I am unable to read more than 4GB.
Could you please help me with how to read more than 4GB of zipped data from the source?
prasannakumarkk
Participant
Posts: 117
Joined: Wed Feb 06, 2013 9:24 am
Location: Chennai,TN, India

Post by prasannakumarkk »

Can you please specify what type of file it is: fixed length or delimited?
What is the total length of all columns? Do all the records in the file have this length?
Please list all the properties set on the Format tab.

Also try specifying the "Number of readers per node" property in the Sequential File stage.
Thanks,
Prasanna
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Have you examined the unzipped data from the larger (>4GB) files to verify that the data is in the expected format?

Perhaps the larger files contain multiple smaller files which are also zipped?
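
A quick way to answer that question, assuming the source really is a .zip archive rather than a gzip file, is to list the archive members without extracting them. This is only a sketch; `list_members` is a hypothetical helper, and the "nested" flag is a simple guess based on the file extension.

```python
import zipfile

# Extensions treated as "probably another archive" (an assumption)
NESTED_SUFFIXES = (".zip", ".gz", ".tgz")


def list_members(path):
    """Return (name, uncompressed_size, looks_nested) for each archive member."""
    with zipfile.ZipFile(path) as zf:
        return [
            (info.filename, info.file_size,
             info.filename.lower().endswith(NESTED_SUFFIXES))
            for info in zf.infolist()
        ]
```

More than one member, or a member flagged as nested, would mean the unzip output is not a single stream of records in the expected layout.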

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Another possibility is that you have a 32-bit unzip utility on your Engine. To unzip more than 2GB you require a 64-bit-capable unzip utility.
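
As a rough check of which size limits apply, the sketch below classifies each member of a .zip archive by its uncompressed size. The two thresholds are the largest value that fits in a signed 32-bit offset (about 2 GiB, the usual ceiling for tools built without large-file support) and the largest size the classic zip format can record without its ZIP64 extensions (about 4 GiB). The helper names are hypothetical, and the sketch assumes a .zip archive, not gzip.

```python
import zipfile

LIMIT_SIGNED_32BIT = 2**31 - 1   # ceiling for signed 32-bit file offsets
LIMIT_CLASSIC_ZIP = 2**32 - 1    # ceiling for sizes stored without ZIP64


def classify_size(size):
    """Label a byte count according to the two limits above."""
    if size > LIMIT_CLASSIC_ZIP:
        return "needs ZIP64"
    if size > LIMIT_SIGNED_32BIT:
        return "over 2 GiB"
    return "ok"


def report(path):
    """Map each archive member name to its size classification."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: classify_size(info.file_size)
                for info in zf.infolist()}
```

Any member not labelled "ok" is a candidate for failing with an older or 32-bit unzip build.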
venkateshrupineni
Participant
Posts: 15
Joined: Wed May 02, 2012 3:38 am

Post by venkateshrupineni »

I think this might be useful:
gunzip the file, split it into multiple files, and convert the job into a multiple-instance job, parameterizing the source file name and passing the source file parts at the sequence level.
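
A minimal sketch of the split step, assuming a gzip-compressed source with newline-delimited records. It streams the decompressed data and starts a new part file every `lines_per_part` records, so no record is cut in half. The function name, the part-file naming scheme, and the default part size are all illustrative assumptions.

```python
import gzip


def split_gzip(src_path, prefix, lines_per_part=1_000_000):
    """Stream-decompress src_path into numbered part files; return their paths."""
    part_paths = []
    out = None
    with gzip.open(src_path, "rb") as fh:
        for n, line in enumerate(fh):
            if n % lines_per_part == 0:
                # Close the previous part (if any) and open the next one
                if out is not None:
                    out.close()
                path = f"{prefix}.part{len(part_paths):04d}"
                part_paths.append(path)
                out = open(path, "wb")
            out.write(line)
    if out is not None:
        out.close()
    return part_paths
```

Each returned path could then be passed as the source file name parameter to one instance of the job.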