Search found 65 matches

by wuruima
Wed Jul 30, 2014 10:21 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: Q. apt file Node setting
Replies: 3
Views: 1901

Q. apt file Node setting

Dear all, may I ask a quick question? If I have 4 file systems (each one is 10G), and I set the APT file like this: node "node1" { fastname "xxx" pools "" resource disk "/Node1/DataSets81" {pools ""} resource scratchdisk "/Node1/Scratch81" {p...
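A minimal sketch of how the full 4-node configuration file might look, assuming the remaining three file systems follow the same /NodeN/DataSets81 and /NodeN/Scratch81 path pattern and that all four logical nodes run on the same server (the fastname "xxx" is the placeholder from the post; everything beyond node1 is an assumption, not taken from it):

{
  node "node1"
  {
    fastname "xxx"
    pools ""
    resource disk "/Node1/DataSets81" {pools ""}
    resource scratchdisk "/Node1/Scratch81" {pools ""}
  }
  node "node2"
  {
    fastname "xxx"
    pools ""
    resource disk "/Node2/DataSets81" {pools ""}
    resource scratchdisk "/Node2/Scratch81" {pools ""}
  }
  node "node3"
  {
    fastname "xxx"
    pools ""
    resource disk "/Node3/DataSets81" {pools ""}
    resource scratchdisk "/Node3/Scratch81" {pools ""}
  }
  node "node4"
  {
    fastname "xxx"
    pools ""
    resource disk "/Node4/DataSets81" {pools ""}
    resource scratchdisk "/Node4/Scratch81" {pools ""}
  }
}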
by wuruima
Thu Nov 07, 2013 8:20 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: datastage sort best performance
Replies: 9
Views: 11086

ray.wurlod wrote: Did you increase the memory using the Sort stage? ...
:o I didn't change any stage settings for my jobs, let me have a try...
by wuruima
Wed Nov 06, 2013 3:01 am
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: datastage sort best performance
Replies: 9
Views: 11086

You could avoid splitting the file by using the "multiple readers per node" capability. Use a Sort stage to sort the individual partitions (partition by the first sort key so that the results are correct), ... I find that using the sort function in the Aggregator instead of a separate Sort stage, perfo...
by wuruima
Tue Nov 05, 2013 9:47 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: datastage sort best performance
Replies: 9
Views: 11086

Thanks for your reply; yes, the big file is a sequential file. I did some testing to find the best practice, and found that if I split this big file into 4 smaller files, use 4 Aggregator stages to do the pre-sort/sum for each file after reading, and then use a Funnel to collect all 4 links and use t...
by wuruima
Mon Nov 04, 2013 10:33 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: datastage sort best performance
Replies: 9
Views: 11086

datastage sort best performance

Dear all, I would like to sort a file of more than 80,000,000 records. I find that if I use one file (named file1) to hold all these records, and build a job (4 nodes in the config file) to read, sort, sum the results, and output, the job takes a long time to read. However, if I split this file1 into 4 files (file1, file2, fil...
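Not DataStage code, but a minimal Python sketch of the pattern described in this thread, assuming a hypothetical comma-separated key,amount record layout and four split files named file1..file4: each chunk is pre-aggregated in parallel (the role of the 4 Aggregator stages), the partial results are funnelled together, and only the much smaller summed set is sorted at the end.

from collections import Counter
from multiprocessing import Pool

# Hypothetical split files standing in for file1..file4 from the post.
CHUNKS = ["file1", "file2", "file3", "file4"]

def pre_aggregate(path):
    # Sum the amount per key within one chunk, like one Aggregator stage.
    totals = Counter()
    with open(path) as f:
        for line in f:
            key, amount = line.rstrip("\n").split(",")
            totals[key] += float(amount)
    return totals

if __name__ == "__main__":
    # Pre-aggregate the four chunks in parallel (the four input links).
    with Pool(len(CHUNKS)) as pool:
        partials = pool.map(pre_aggregate, CHUNKS)

    # "Funnel" the partial results together and do the final sum.
    final = Counter()
    for part in partials:
        final.update(part)  # Counter.update adds the per-key totals

    # The final sort now runs over distinct keys, not 80,000,000 rows.
    for key in sorted(final):
        print(key, final[key])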