
Possibility of partial reading of flat file

Posted: Tue Mar 20, 2007 7:07 am
by vnspn
Hi,

We are working in version 7.1. In our DS Job, we would be reading data from a huge file, using either an FTP stage or a Sequential File stage.

The file could contain about 40 million records, so I would like to know whether it is possible to read parts of the file either through different Jobs or through different flows on the same canvas. That is, is it possible to read only records 1 to 10 million in the first Job, then records 10,000,001 to 20 million in the second Job, and so on?

Please let me know if this is possible to implement. Thanks.

Posted: Tue Mar 20, 2007 7:11 am
by DSguru2B
Sure you can. You can have each instance read a different range of records by running a multi-instance job. Look into @INROWNUM.
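To make the idea concrete, here is a rough Python analogue of that per-instance range constraint (the function name, the instance numbering, and the 10-million default chunk size are illustrative assumptions, not DataStage names): every instance scans the file from the top, but keeps only the rows in its assigned range.

```python
def rows_for_instance(path, instance, chunk=10_000_000):
    """Yield only the rows that belong to this instance's range.

    `instance` is a 0-based stand-in for the job's invocation number;
    `rownum` plays the role of @INROWNUM in the constraint.
    """
    start = instance * chunk      # first row index (0-based) this instance keeps
    end = start + chunk           # one past the last row it keeps
    with open(path) as f:
        for rownum, line in enumerate(f):
            if rownum >= end:
                break             # past our range: stop reading early
            if rownum >= start:
                yield line
```

Note that every instance still reads the file from the beginning up to the end of its own range, which is the point chulett makes below about sequential media.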

Posted: Tue Mar 20, 2007 7:21 am
by chulett
Actually, no you can't. Not technically. That's the nature of sequential media.

You'll need to read the entire contents of the file each time. However, each instance can choose to process a distinct portion of the file in parallel. Or you can chunk up the original file and then process the chunks in parallel.
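The "chunk up the original file" option can be sketched like this in Python (a minimal stand-in for Unix `split -l`; the `.partN` naming scheme is just an assumption for illustration): one sequential pass writes fixed-size pieces that downstream jobs can then process in parallel.

```python
def split_file(path, lines_per_chunk):
    """Write path.part0, path.part1, ... each holding lines_per_chunk lines,
    and return the list of chunk file names."""
    part, count, out = 0, 0, None
    chunks = []
    with open(path) as f:
        for line in f:
            if count % lines_per_chunk == 0:
                if out:
                    out.close()          # finished the previous chunk
                name = f"{path}.part{part}"
                chunks.append(name)
                out = open(name, "w")
                part += 1
            out.write(line)
            count += 1
    if out:
        out.close()
    return chunks
```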

Or go under the covers and write something to seek to the appropriate starting point, read a number of records and then get out.
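A sketch of that seek-and-read idea, in Python for illustration: because line lengths vary, you would first need one full pass to record the byte offset of each chunk boundary, but after that every job can jump straight to its starting point and read only its own records. The function names and the `every` parameter are assumptions for this example.

```python
def build_offsets(path, every=10_000_000):
    """One full pass: record the byte offset of every `every`-th line boundary."""
    offsets = [0]
    with open(path, "rb") as f:          # binary mode so tell() is usable per line
        for rownum, line in enumerate(f, start=1):
            if rownum % every == 0:
                offsets.append(f.tell())
    return offsets

def read_chunk(path, offsets, chunk_index, every=10_000_000):
    """Seek to a chunk's starting offset, read at most `every` lines, and get out."""
    lines = []
    with open(path, "rb") as f:
        f.seek(offsets[chunk_index])     # jump straight to the chunk boundary
        for _ in range(every):
            line = f.readline()
            if not line:                 # end of file
                break
            lines.append(line)
    return lines
```

The offset table only has to be rebuilt when the file changes, so repeated partial reads avoid rescanning the whole file.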

Posted: Tue Mar 20, 2007 7:36 am
by DSguru2B
Well yes, that's what I meant. The Sequential File stage will read the entire file, but the job can be written to then, depending upon the instance, process a different chunk of records. I have done something like this in the past, using the instance number and @INROWNUM.
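For comparison with the contiguous-range approach above, the classic instance-number constraint interleaves rows instead, along the lines of Mod(@INROWNUM, N) = instance number. Here is a hedged Python analogue (names are illustrative, not DataStage BASIC):

```python
def interleaved_rows(path, instance, num_instances):
    """Yield every num_instances-th row, offset by this instance's 0-based number.

    The modulo test on `rownum` mirrors a Mod(@INROWNUM, N) = instance constraint.
    """
    with open(path) as f:
        for rownum, line in enumerate(f):
            if rownum % num_instances == instance:
                yield line
```

Unlike the range version, every instance has to read to the end of the file, but the work stays evenly balanced even if record sizes vary across the file.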

Posted: Wed Mar 21, 2007 4:14 am
by thurmy34
Hi,

Have you tried the Complex Flat File stage, which is designed to handle special file layouts? In that plugin you can specify the start and end rows.