I am reading the Parallel Job Developer's Guide. For the Data Set stage,
I came across this sentence: "The Data Set stage allows you to store data
in a persistent form, which can then be used by other DataStage jobs".
I also searched through this forum but did not get an answer.
What does "persistent form" mean in this context?
If you build a reference lookup set of data that multiple jobs could share, write it as a persistent data set (.ds) so that all of those jobs can use it. Otherwise, every job that needs the data must contain all the logic to acquire it, and each one redundantly spends resources acquiring the same data again (and potentially transforming, sorting, and partitioning it).
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
From your manuals under the Start button:
You can use the Data Set stage:
The Data Set stage is a file stage. It allows you to read data from or
write data to a data set. The stage can have a single input link or a
single output link. It can be configured to execute in parallel or
sequential mode.
What is a data set? DataStage parallel extender jobs use data sets to
manage data within a job. You can think of each link in a job as
carrying a data set. The Data Set stage allows you to store data being
operated on in a persistent form, which can then be used by other
DataStage jobs. Data sets are operating system files, each referred to by
a control file, which by convention has the suffix .ds. Using data sets
wisely can be key to good performance in a set of linked jobs. You can
also manage data sets independently of a job using the Data Set
Management utility, available from the DataStage Designer, Manager,
or Director, see Chapter 57.
Or the Lookup File Set stage:
The Lookup File Set stage is a file stage. It allows you to create a lookup
file set or reference one for a lookup. The stage can have a single input link
or a single output link. The output link must be a reference link. The stage
can be configured to execute in parallel or sequential mode when used
with an input link.
When creating Lookup file sets, one file will be created for each partition.
The individual files are referenced by a single descriptor file, which by
convention has the suffix .fs.
When performing lookups, Lookup File stages are used in conjunction
with Lookup stages.
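To make the "control file plus one file per partition" idea concrete, here is a rough Python analogy. It is only a sketch under my own assumptions: the JSON layout, the `.partN.json` file names, and the functions `write_dataset` and `read_dataset` are all made up for illustration and have nothing to do with the real on-disk format of a .ds or .fs file. The point is just that one job writes the data once and any later job can read it back through the descriptor without repeating the extraction logic.

```python
import json
import os
import tempfile


def write_dataset(descriptor_path, rows, num_partitions=2):
    """Spread rows round-robin over one file per partition, then write a
    descriptor file that lists the partition files (hypothetical layout)."""
    base = os.path.splitext(descriptor_path)[0]
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)

    part_files = []
    for p, part_rows in enumerate(partitions):
        part_path = f"{base}.part{p}.json"
        with open(part_path, "w") as f:
            json.dump(part_rows, f)
        part_files.append(part_path)

    # The descriptor plays the role of the control file: it holds no rows,
    # only references to the files that do.
    with open(descriptor_path, "w") as f:
        json.dump({"partitions": part_files}, f)


def read_dataset(descriptor_path):
    """Read every partition referenced by the descriptor and return all rows."""
    with open(descriptor_path) as f:
        descriptor = json.load(f)
    rows = []
    for part_path in descriptor["partitions"]:
        with open(part_path) as f:
            rows.extend(json.load(f))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ds = os.path.join(tmp, "customers.ds")
        # "Job 1" builds the shared reference data once.
        write_dataset(ds, [{"id": 1}, {"id": 2}, {"id": 3}])
        # "Job 2" (and any other job) just reads it back.
        print(sorted(row["id"] for row in read_dataset(ds)))
```

Note that the rows come back grouped by partition, not in the original order, which is why the example sorts them before printing.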
Kenneth Bland
splayer wrote:
Kenneth, so are you saying that we can choose when creating a data set whether we want it persistent or not? Where can we make that choice?
You don't get any choice. If you create a Data Set - using a Data Set stage - then it's persistent.
Only running jobs create virtual Data Sets. You can inspect the generated OSH to determine the names of their control files, which always end in ".v".
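If you want to pull those names out without reading the OSH by eye, something like the following Python sketch could work. It assumes you have copied the generated OSH into a string and that the virtual Data Set names appear in single quotes. The `SAMPLE_OSH` text below is a hand-written illustration, not real output from any job, and `virtual_dataset_names` is a hypothetical helper, so adjust the pattern to whatever your generated OSH actually looks like.

```python
import re

# Hand-written, illustrative OSH fragment (an assumption, not real job output).
SAMPLE_OSH = """
copy
## Inputs
0< [] 'Row_Generator_0:DSLink2.v'
## Outputs
0> [] 'Copy_1:DSLink4.v'
;
"""


def virtual_dataset_names(osh_text):
    """Return the distinct single-quoted names ending in '.v', sorted."""
    names = re.findall(r"'([^']+\.v)'", osh_text)
    return sorted(set(names))


if __name__ == "__main__":
    for name in virtual_dataset_names(SAMPLE_OSH):
        print(name)
```

Persistent data sets written by a Data Set stage would show up with a .ds name instead, so they are not matched by this pattern.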
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.