How to import data files in parallel

Posted: Mon Dec 28, 2015 8:09 pm
by Johnny0638
We use osh scripts to run our DS jobs, and we have configured the config.apt file to run them on 3 servers with 6 nodes.
But we found that the import process runs on only 1 server, while the other operators run on all 3 servers.
We have huge data files, and the import takes too much time; I think it is the performance bottleneck.
How to import data files in parallel?
Thanks!

Posted: Tue Dec 29, 2015 7:26 am
by ArndW
Unless your input sequential file has a fixed line length, your reader process must run sequentially and can only run on one server and on one node.
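
To illustrate why a fixed line length matters: when every record occupies the same number of bytes, each reader can work out where its share of the file starts and ends without scanning for line terminators. The following is only a rough Python sketch of that arithmetic, not how the import operator is actually implemented; the function name and parameters are hypothetical, and the record length is assumed to include any line terminator.

```python
def reader_byte_ranges(file_size, record_length, num_readers):
    """Split a fixed-length-record file into contiguous byte ranges.

    Hypothetical helper for illustration only. Returns a list of
    (start, end) byte offsets, one per reader, with end exclusive.
    record_length is assumed to include the line terminator, if any.
    """
    if record_length <= 0 or num_readers <= 0:
        raise ValueError("record_length and num_readers must be positive")
    if file_size % record_length != 0:
        raise ValueError("file size is not a whole number of records")

    total_records = file_size // record_length
    # Spread records as evenly as possible; the first `extra` readers
    # each take one additional record.
    base, extra = divmod(total_records, num_readers)

    ranges = []
    start_record = 0
    for reader in range(num_readers):
        count = base + (1 if reader < extra else 0)
        start = start_record * record_length
        end = (start_record + count) * record_length
        ranges.append((start, end))
        start_record += count
    return ranges


if __name__ == "__main__":
    # Example: a 1000-byte file of 100-byte records shared by 3 readers.
    for start, end in reader_byte_ranges(1000, 100, 3):
        print(start, end)
```

With variable-length lines no such calculation is possible, because a reader cannot know where a record begins without reading everything before it.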

Posted: Tue Dec 29, 2015 11:49 am
by roy
Have you thought of taking a sample file for the import table definition?

something like:

head -1000 source_file.txt > source_sample.txt
IHTH (I Hope This Helps),

Posted: Tue Dec 29, 2015 5:16 pm
by chulett
Import as in the operator, Roy... i.e. read the file.

Re: How to import data files in parallel

Posted: Fri Jan 01, 2016 1:23 am
by ray.wurlod
Johnny0638 wrote: How to import data files in parallel?
Specify the multiple readers per node property.

Posted: Fri Jan 01, 2016 10:25 am
by qt_ky