PX with Sequential File
If I have a parallel job that uses a Sequential File stage both to read and to write data, I get no better performance than with a server job, do I?
regards,
Cristina
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
What's happening in between reading and writing? Parallel processing may assist if some heavy duty transformation is occurring.
You can allocate multiple readers to the file - particularly effective if the file has fixed width records - and achieve parallelism in reading.
Unfortunately the operating system limits what we can do at the other end - "one file, one writer" is the rule here.
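To illustrate why fixed-width records suit multiple readers, here is a minimal Python sketch. It is not how DataStage implements the feature, only a model of the idea: when every record has the same length, each reader can compute its starting byte offset by arithmetic and seek straight to it, with no need to scan for record delimiters. The record width and the names `reader_ranges`, `read_range` and `parallel_read` are assumptions made for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

RECORD_LEN = 16  # hypothetical fixed record width in bytes, newline included


def reader_ranges(file_size, record_len, readers):
    """Split a fixed-width file into one contiguous record range per reader."""
    total = file_size // record_len
    base, extra = divmod(total, readers)
    ranges, start = [], 0
    for i in range(readers):
        # The first `extra` readers take one additional record each.
        count = base + (1 if i < extra else 0)
        ranges.append((start * record_len, count))
        start += count
    return ranges  # list of (byte offset, record count)


def read_range(path, offset, count, record_len):
    """Read `count` records starting at `offset`, using an independent file handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(count * record_len)
    return [data[i:i + record_len] for i in range(0, len(data), record_len)]


def parallel_read(path, record_len, readers):
    """Read the whole file with several concurrent readers, preserving record order."""
    ranges = reader_ranges(os.path.getsize(path), record_len, readers)
    with ThreadPoolExecutor(max_workers=readers) as pool:
        parts = pool.map(lambda r: read_range(path, r[0], r[1], record_len), ranges)
        return [rec for part in parts for rec in part]
```

Several readers can open the same file safely because none of them modifies it. Writing is different: concurrent writers to a single file would need to coordinate their positions, which is the "one file, one writer" limit mentioned above.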
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I have 2 nodes in the configuration file.
I have 4 CPUs.
If I don't have many transformations to do, is it better to use parallel than server?
I think that if the job imports data into memory, does the transforms, and writes the data to a sequential file, then performance is no better, because it is working in sequential mode.
regards,
Cristina
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
If you don't mind supporting both server and parallel jobs on your site, the server job will be the low-fuss method, with fewer warning messages for sequential file data. A parallel job becomes the better choice if you have a sort requirement or more than a couple of stages between the input and output.
2 nodes and 4 CPUs? Shouldn't you at least have one node per CPU?
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Two nodes is exactly right for a development environment. If it works on two it works on more. Number of CPUs is irrelevant - sufficiently complex jobs will use all of them even with two nodes, in an SMP ("shared everything") environment.
Use two readers per node when reading the file. For larger files you will notice (a) parallelism in the Sequential File stage (check that this occurs) and (b) faster completion time.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
A "processing node" is a logical concept - a subset of your available processing power and resources such as memory and disk. The degree of parallelism is determined by the number of nodes defined in the current configuration file.
It is unrelated to the number of CPUs - it may be less, it may be slightly more. The number you choose will be a function of the resources demanded by the composed version of your job design, and the number of jobs that you might want to run at once.
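As a toy model of that point (it is not the DataStage engine, and worker threads merely stand in for the per-node processes), the Python sketch below takes the degree of parallelism from a configured node count and never consults the CPU count when partitioning. The names `partition_round_robin` and `run_on_nodes`, and the value of `node_count`, are assumptions made for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(rows, node_count):
    """Deal rows out to `node_count` logical partitions, one row at a time."""
    partitions = [[] for _ in range(node_count)]
    for i, row in enumerate(rows):
        partitions[i % node_count].append(row)
    return partitions


def run_on_nodes(rows, node_count, transform):
    """Apply `transform` to each partition, with one worker per logical node."""
    partitions = partition_round_robin(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        return list(pool.map(lambda part: [transform(r) for r in part], partitions))


if __name__ == "__main__":
    node_count = 2  # hypothetical: the number of nodes in the configuration file
    print("CPUs reported by the OS:", os.cpu_count())
    print("Logical partitions:", run_on_nodes(range(6), node_count, lambda r: r * 10))
```

Changing `node_count` changes how many partitions the data is split into, whatever `os.cpu_count()` reports; the operating system then schedules the workers across whichever CPUs are available.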
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Please, Ray,
I can't see your entire message.
Could you please post it again?
Regards,
Cristina
ray.wurlod wrote:A "processing node" is a logical concept - a subset of your available processing power and resources such as memory and disk. The degree of parallelism is determined by the number of nodes defined in ...
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
For only a few cents per day you can purchase a premium membership that allows you to see the premium posts in full, and helps to fund the bandwidth required to sustain DSXchange. Maybe your employer would buy it for you. There is a link on the home page to a page on which corporate discounts can be found.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.