Need to Untar the tar files

Posted: Wed Feb 08, 2012 4:21 am
by deepa.y
Hi,
I have tar files that I need to untar and read without landing them in any directory. How can I read those files using a parallel job?

Posted: Wed Feb 08, 2012 4:30 am
by pandeesh
Why do you want to accomplish this in a PX job rather than with a UNIX command?
Check the Compress and Expand stages to see whether they satisfy your requirement.

Posted: Wed Feb 08, 2012 4:34 am
by deepa.y
I checked the Expand stage, but it works only with Data Sets.

Posted: Wed Feb 08, 2012 4:39 am
by kandyshandy
Untar files in DS? :shock: I guess you need to rely on some UNIX commands to achieve this.

I am not sure whether there is a new feature in 8.7!!

Posted: Wed Feb 08, 2012 4:41 am
by deepa.y
I came across some posts showing that it could be done using named pipes in server jobs. Is there any way to do that in a parallel job?
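
For readers unfamiliar with the named-pipe idea, here is a minimal sketch at the shell level. It is not taken from the posts mentioned above and is not a DataStage job design: the archive, file names, and the pipe name `feed` are all hypothetical, and it assumes a tar that supports -O (GNU tar and bsdtar do). The writer streams the archive contents into the pipe while a reader consumes them, so no extracted file lands on disk.

```shell
#!/bin/sh
# Hypothetical throwaway archive, built only so the sketch is self-contained.
workdir=$(mktemp -d)
printf 'row1\nrow2\n' > "$workdir/data.txt"
tar -cf "$workdir/demo.tar" -C "$workdir" data.txt

# Create the named pipe (FIFO); it holds no data on disk.
mkfifo "$workdir/feed"

# Writer: extract to standard output (-O) and redirect into the pipe.
# It runs in the background because it blocks until a reader opens the pipe.
tar -xOf "$workdir/demo.tar" > "$workdir/feed" &

# Reader: any process that reads the pipe as if it were a file.
rows=$(cat "$workdir/feed")
wait

printf '%s\n' "$rows"
rm -rf "$workdir"
```

In a job, the reader side would be whatever stage or command is pointed at the pipe in place of `cat`.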

Posted: Wed Feb 08, 2012 6:33 am
by ray.wurlod
What if you were to use tar as the filter command in a Sequential File stage?

Posted: Wed Feb 08, 2012 7:13 am
by deepa.y
Hi Ray,
Each tar file contains multiple folders, and each folder in turn contains a few files. The filter command does not work in this case, as it cannot be read like a file.

Posted: Wed Feb 08, 2012 6:33 pm
by ray.wurlod
You neglected to mention that little fact.

I don't believe there will be a solution that does not involve saving the content of the archive to disk.

Posted: Fri Feb 10, 2012 2:15 pm
by qt_ky
tar is a UNIX command, and there is no out-of-the-box stage that will read or write tar files without actually calling the tar command.

Could you share the links you found about using tar with named pipes?

Posted: Thu Feb 16, 2012 8:47 pm
by anriliu
Why not consider an embedded Perl/Python script in a prerun step?

Posted: Thu Feb 16, 2012 9:51 pm
by qt_ky
Why not land the data? Is it to help performance or a lack of disk space?

Have you asked your UNIX admin about the named pipe method?

Posted: Fri Feb 17, 2012 12:32 am
by deepa.y
Hi,
As we have huge files, we cannot store them on disk. The issue is resolved. We are using an External Source stage to read the tar file to standard output and processing the data from there.
tar -xOvf <path of file>

Posted: Fri Feb 17, 2012 7:08 am
by qt_ky
It costs less to store huge files on disk than in memory. I'm just saying, that's what disk is for...

I don't see any tar -O option on my flavor of UNIX... What does -O do?

Posted: Fri Feb 17, 2012 7:18 am
by deepa.y
qt_ky wrote: I don't see any tar -O option on my flavor of UNIX... What does -O do?
-O writes the extracted files to standard output instead of to disk.
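
To make the behaviour concrete, here is a small sketch of what -O does with an archive that contains several folders. The archive and file names are made up for illustration, and it assumes a tar that supports -O (GNU tar documents it as --to-stdout; availability on other UNIX flavors varies, as noted above). The contents of every member are written one after another to standard output, which is the stream a source command would hand to the job.

```shell
#!/bin/sh
# Hypothetical archive with two folders, built only for this sketch.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/a" "$workdir/src/b"
printf 'id,name\n1,alpha\n' > "$workdir/src/a/one.csv"
printf '2,beta\n' > "$workdir/src/b/two.csv"
tar -cf "$workdir/demo.tar" -C "$workdir/src" a/one.csv b/two.csv

# -x extract, -O send member contents to standard output, -f archive file.
# No extracted file is created; the members arrive as one concatenated stream.
streamed=$(tar -xOf "$workdir/demo.tar")

printf '%s\n' "$streamed"
rm -rf "$workdir"
```

Because the members are simply concatenated, this only suits archives whose files share one record layout; the stream itself carries no marker of where one file ends and the next begins. The sketch leaves out v so that the captured stream contains file contents only.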