Staging Data In Sequential File

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
speedsterrules
Participant
Posts: 28
Joined: Thu Sep 29, 2005 1:50 pm

Staging Data In Sequential File

Post by speedsterrules »

Is staging data in a Sequential File a good practice?

In other words, is it resonable to create a sequential file during processing and use it to load the table or should we directly load the table?

I guess the amount of data can be a factor but am not sure if we have any related standards.

Thanks
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Its not against best practices. I use it all the time. But if the data size is huge, it might not be a good practice. Therefore it depends. But then again you should also have a cleanup process in place that cleans out files so that huge files dont just keep sitting there for months.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Ralph Kimball recommends it in his books.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
speedsterrules
Participant
Posts: 28
Joined: Thu Sep 29, 2005 1:50 pm

Post by speedsterrules »

Thanks for the prompt responses.

The file gets over-written every time the job is executed. I am doing it even for huge data size since it enhances the processing time.

Do you think it is advisable?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yes.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Its a three time 'Yes' from Craig. Its stone written now. :wink:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Too slow. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

It all depends on what you're doing. If you're building a data warehouse, with staging, storage, and presentation layers, your ETL coe will look differently for each aspect. If you're a DBA and just moving data around, you'll question even using an ETL tool.

In the straight performance question, copying data between two tables within the same database instance is faster to stay within the database and use parallel DML.

If you are going between two disparate databases, you're better served by extracting will as many parallel processes as possible, transforming with as many parallel processes as possible, and then loading as efficiently as possible. By building sequential "staging" files within that processing, you have the ability to shift parallel processing degrees, as well as have restart points and audit trails. You can also bulk load the results for fastest loading allowed. In addition, the database target server may be elsewhere, so that prepared load files can be moved to the database for localized loading.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

chulett wrote:Too slow. :wink:
Alas, my slow fingers.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
speedsterrules
Participant
Posts: 28
Joined: Thu Sep 29, 2005 1:50 pm

Post by speedsterrules »

I got all the information I need through everyone's comments. Thanks everyone and especially Ken for the elaborate description.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

You can rate everyone's replies depending upon how helpful the comments were.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes.

:lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply