Hi,
We are working with version 7.1. In our DataStage job, we will be reading data from a huge file, either through an FTP stage or a Sequential File stage.
The file could contain about 40 million records, so I would like to know whether it is possible to read parts of the file either through different jobs or through different flows on the same canvas. That is, is it possible to read only records 1 to 10,000,000 in the first job, then records 10,000,001 to 20,000,000 in the second job, and so on?
Please let me know if this is possible to implement. Thanks.
Possibility of partial reading of flat file
Actually, no you can't. Not technically. That's the nature of sequential media.
You'll need to read the entire contents of the file each time. However, each instance can choose to process a distinct portion of the file in parallel. Or you can chunk up the original file and then process the chunks in parallel.
Or go under the covers and write something to seek to the appropriate starting point, read a number of records and then get out.
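For the "under the covers" route, if the records are fixed-width you can seek straight to a byte offset and read a bounded number of records without scanning what came before. A minimal Python sketch of that idea (the function name, record length, and chunk boundaries are illustrative assumptions, not DataStage syntax):

```python
def read_records(path, record_len, start_record, num_records):
    """Seek directly to start_record (0-based) in a fixed-width file
    and return up to num_records raw records."""
    records = []
    with open(path, "rb") as f:
        f.seek(start_record * record_len)   # jump straight to the chunk
        for _ in range(num_records):
            rec = f.read(record_len)
            if len(rec) < record_len:       # short read: end of file
                break
            records.append(rec)
    return records
```

Each job (or job invocation) would call something like this with its own start_record, so no invocation has to reread the earlier portion of the file. Variable-length (delimited) records would still force a scan from the top, which is why the chunk-the-file-first option is often simpler.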
-craig
"You can never have too many knives" -- Logan Nine Fingers
Well yes, that's what I meant. The Sequential File stage will read the entire file, but the process can then be written so that, depending on the instance, it processes a different chunk of records.
I have done something like this in the past, using the instance number and @INROWNUM.
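The instance-number/@INROWNUM approach amounts to a modulus constraint in each instance's Transformer, along the lines of MOD(@INROWNUM, number-of-instances). Here is a rough Python illustration of that partitioning logic (the names are mine, not DataStage syntax; every instance still reads all rows but keeps only its share):

```python
def rows_for_instance(rows, instance_num, num_instances):
    """Yield only the rows this instance should process, mimicking a
    Transformer constraint like MOD(@INROWNUM, num_instances) = instance_num.

    instance_num is assumed to be 0 .. num_instances - 1.
    """
    for rownum, row in enumerate(rows, start=1):  # @INROWNUM is 1-based
        if rownum % num_instances == instance_num:
            yield row
```

Run the same job as N instances, pass each one a distinct instance number, and between them the instances cover every row exactly once, even though each instance reads the whole file.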
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.