What is "persistent form" in this context?

splayer
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

I am reading the Parallel Job Developer's Guide. For the Data Set stage,
I came across this sentence: "The Data Set stage allows you to store data
in a persistent form, which can then be used by other DataStage jobs".

I also searched through this forum but did not get an answer.

What does "persistent form" mean?
seanc217
Premium Member
Posts: 188
Joined: Thu Sep 15, 2005 9:22 am

Post by seanc217 »

It means the data is stored on disk instead of in memory.

HTH.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

If you build a set of reference lookup data that multiple jobs could share, write it as a persistent data set (.ds) so that all of them can use it. Otherwise, every job that needs that data must carry all the logic to fetch it, and each one redundantly spends resources fetching the same data again (and potentially transforming, sorting, and partitioning it).
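
As a rough analogy only (this is plain Python, not DataStage, and every name and value in it is hypothetical), the "build once, persist, reuse" idea might be sketched like this. One producer job does the expensive work and writes the result to disk; later jobs simply read it back.

```python
import json
import tempfile
from pathlib import Path


def build_reference_data():
    """Stand-in for the expensive extract/transform/sort step (made-up rows)."""
    rows = [
        {"cust_id": 3, "region": "EMEA"},
        {"cust_id": 1, "region": "APAC"},
        {"cust_id": 2, "region": "AMER"},
    ]
    return sorted(rows, key=lambda r: r["cust_id"])


def producer_job(path: Path) -> None:
    """Runs once: builds the reference data and persists it to disk."""
    path.write_text(json.dumps(build_reference_data()))


def consumer_job(path: Path, cust_id: int) -> str:
    """Any later job: reads the persisted copy instead of rebuilding it."""
    lookup = {r["cust_id"]: r["region"] for r in json.loads(path.read_text())}
    return lookup[cust_id]


workdir = Path(tempfile.mkdtemp())
shared = workdir / "customer_lookup.json"  # plays the role of the .ds file
producer_job(shared)

# Two independent "jobs" reuse the same persisted data.
region_a = consumer_job(shared, 1)
region_b = consumer_job(shared, 3)
print(region_a, region_b)
```

The consumers contain no extraction or sorting logic of their own, which is the saving described above.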
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
splayer
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

Kenneth, so are you saying that we can choose when creating a data set whether we want it persistent or not? Where can we make that choice?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

From your manuals under the Start button:

You can use the Data Set stage:

The Data Set stage is a file stage. It allows you to read data from or
write data to a data set. The stage can have a single input link or a
single output link. It can be configured to execute in parallel or
sequential mode.

What is a data set? DataStage parallel extender jobs use data sets to
manage data within a job. You can think of each link in a job as
carrying a data set. The Data Set stage allows you to store data being
operated on in a persistent form, which can then be used by other
DataStage jobs. Data sets are operating system files, each referred to by
a control file, which by convention has the suffix .ds. Using data sets
wisely can be key to good performance in a set of linked jobs. You can
also manage data sets independently of a job using the Data Set
Management utility, available from the DataStage Designer, Manager,
or Director, see Chapter 57.

Or the Lookup File Set stage:

The Lookup File Set stage is a file stage. It allows you to create a lookup
file set or reference one for a lookup. The stage can have a single input link
or a single output link. The output link must be a reference link. The stage
can be configured to execute in parallel or sequential mode when used
with an input link.

When creating Lookup file sets, one file will be created for each partition.
The individual files are referenced by a single descriptor file, which by
convention has the suffix .fs.

When performing lookups, Lookup File stages are used in conjunction
with Lookup stages.
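
To make the "one descriptor file pointing at per-partition files" layout concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the real .ds or .fs on-disk format: the JSON layout, the `.partN` file names, and the modulus partitioning are all invented for the example.

```python
import json
import tempfile
from pathlib import Path


def write_data_set(descriptor: Path, rows: list, partitions: int) -> None:
    """Write one data file per partition plus a descriptor that lists them."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        # Simple modulus partitioning on an integer key (assumed for the demo).
        buckets[row["key"] % partitions].append(row)

    files = []
    for i, bucket in enumerate(buckets):
        part = descriptor.parent / f"{descriptor.stem}.part{i}"
        part.write_text(json.dumps(bucket))
        files.append(part.name)

    # The descriptor holds no rows, only references to the data files.
    descriptor.write_text(json.dumps({"partitions": files}))


def read_data_set(descriptor: Path) -> list:
    """Follow the descriptor to every partition file and collect the rows."""
    meta = json.loads(descriptor.read_text())
    rows = []
    for name in meta["partitions"]:
        rows.extend(json.loads((descriptor.parent / name).read_text()))
    return rows


workdir = Path(tempfile.mkdtemp())
descriptor = workdir / "orders.ds"  # hypothetical name
write_data_set(descriptor, [{"key": k} for k in range(5)], partitions=2)
restored = read_data_set(descriptor)
```

Because readers only ever open the descriptor, the data files can be spread over several disks without the reading job needing to know where they are. This is also why copying or deleting just the descriptor does not copy or delete the data.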
Kenneth Bland

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

splayer wrote: Kenneth, so are you saying that we can choose when creating a data set whether we want it persistent or not? Where can we make that choice?

You don't get any choice. If you create a Data Set - using a Data Set stage - then it's persistent.

Only running jobs create virtual Data Sets. You can inspect the generated OSH to find the names of their control files, which always end in ".v".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

It's better to search for 'Virtual Dataset' and read up on it. That way you can easily see the difference and understand what a persistent data set is.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'