Runtime column propagation
What is runtime column propagation (RCP), and what is it used for?
Hello Venkatesh,
The Parallel Job Developer's Guide installed on your PC explains this in great detail.
In short, you can write jobs that contain no column metadata and that still work, through column propagation, on many different input formats. For example, a single job can process input from different files with different numbers of columns and fields, something that cannot be done that way in Server.
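To make the idea concrete, here is a rough sketch in plain Python (not DataStage or OSH code). The column name `name` and the helpers `uppercase_name` and `run_job` are made up for illustration. The "job" only knows about the one column it transforms; any other columns in the input are carried through unchanged, however many there are.

```python
import csv
import io

def uppercase_name(rows):
    """Transform only the 'name' column; all other columns pass through as-is."""
    for row in rows:
        out = dict(row)                    # copy every incoming column, known or not
        out["name"] = out["name"].upper()  # the only column this step refers to
        yield out

def run_job(text):
    """Read delimited text with a header row and apply the transform."""
    reader = csv.DictReader(io.StringIO(text))
    return list(uppercase_name(reader))

# Two inputs with different numbers of columns, handled by the same job.
two_cols = "id,name\n1,ann\n"
four_cols = "id,name,city,zip\n2,bob,Oslo,0150\n"

result_two = run_job(two_cols)
result_four = run_job(four_cols)
```

The same function handles both inputs because it never lists the full record layout, which is roughly the kind of flexibility described above.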
This illustrates the problem with putting a GUI on top of the original Orchestrate engine.
In Orchestrate, all you really have to do is define the input and the output; throughout the OSH code in between, you define only the fields you need to transform. However, that metadata does not show up in the GUI, so it is difficult to manage column propagation that way.
However, turning RCP 'off' does not really turn off the engine's RCP. It just places a modify stage on every single stage that you use in the job. A job definitely runs faster with RCP on than with it off.
There are tradeoffs, and hopefully Ascential will eventually come up with a better compromise in upcoming versions.
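As a conceptual model only, again in plain Python rather than OSH, the difference might be pictured like this. The names `project` and `stage` and the columns `id`, `amount` and `note` are hypothetical, and this is not a description of how the engine or the modify stage is actually implemented. With propagation, a step passes along whatever columns arrive; without it, an extra projection step keeps only the declared columns.

```python
def project(row, declared):
    """Keep only the declared columns (an explicit per-stage column mapping)."""
    return {col: row[col] for col in declared}

def stage(row, declared=None):
    """Double 'amount'; optionally restrict the output to declared columns."""
    out = dict(row)                       # start from all incoming columns
    out["amount"] = out["amount"] * 2     # the only column this step transforms
    if declared is not None:
        out = project(out, declared)      # extra work done when propagation is 'off'
    return out

row = {"id": 1, "amount": 10, "note": "extra"}

with_propagation = stage(row)
without_propagation = stage(row, declared=["id", "amount"])
```

In this toy version the undeclared `note` column survives only in the first call, and the second call does one more step per record. That is only an analogy for the extra modify step described above, not a measurement.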
T42,
That was very interesting information about RCP and runtime performance. I've been turning it off for the jobs where I don't use it, even though I was coding in such a way that it would work. Now I'll reactivate it for performance's sake!
-Arnd.