Please provide alternative method of loading

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

rhaddur
Participant
Posts: 52
Joined: Mon Mar 13, 2006 7:33 am
Location: mumbai

Please provide alternative method of loading

Post by rhaddur »

Hi Gurus

1) I have a multi-instance job that runs on an hourly basis. It creates 24*4 = 96 flat files (month-wise). We used to load the data with sqlldr using the conventional path.

2) The data volume was 80 GB to 100 GB.

3) The data arrives every day, with one field holding the day timestamp. Sometimes we get backdated data for the current month plus the past 3 months. Our target is an Oracle table partitioned by day.

4) Now the volume has increased to 100 GB to 150 GB, so we are using the sqlldr direct path.

5) But when backdated data arrives, the indexes for the backdated data go into an invalid state.

6) So I load each day's data into a temp table. At the end of the business day, once loading has finished, I exchange that partition and copy the small amount of backdated data into my original table.

7) We are unable to separate the backdated data day-wise because the XFM stage is already overloaded.

Please suggest an alternative method of loading this data that reduces the manual intervention of exchanging the partition and copying the backdated data.
Rhaddur
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Welcome Aboard,
Don't overload the Transformer then. Modularize your job.
I was not able to understand what you were saying about the back end date. If you mean that the data is not ordered, then you can always sort it on the date field and then load it.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You can add a second Transformer stage to your job design to filter the backdated records onto a separate stream before progressing with whatever processing is currently in your job. The backdated records could be processed in the same job, or written to a staging file for processing by a subsequent job conditionally started if there are any backdated records: several approaches are available for the latter.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
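
To make the suggestion above concrete, here is a minimal sketch in plain Python (not DataStage BASIC) of diverting backdated records onto a separate stream before the existing processing runs. The field name `event_day`, the sample dates, and the function names are hypothetical; the point is only that the input is read once and each row goes to exactly one stream.

```python
from datetime import date

def divert_backdated(rows, business_day, staging):
    """First stage: send backdated rows to a staging list, pass the rest on."""
    for row in rows:
        # 'event_day' is a hypothetical name for the day timestamp field.
        if row["event_day"] < business_day:
            staging.append(row)
        else:
            yield row

def existing_processing(rows):
    """Second stage: stand-in for whatever the job already does per row."""
    for row in rows:
        yield {**row, "processed": True}

# Hypothetical sample data.
rows = [
    {"id": 1, "event_day": date(2007, 3, 15)},
    {"id": 2, "event_day": date(2007, 1, 2)},
    {"id": 3, "event_day": date(2007, 3, 15)},
]

staging = []
loaded = list(existing_processing(
    divert_backdated(rows, date(2007, 3, 15), staging)))

# A follow-up job would be started only if anything was staged.
run_backdated_job = bool(staging)
```

Because the two stages are chained, no row is read twice; the backdated rows simply never reach the second stage.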
rhaddur
Participant
Posts: 52
Joined: Mon Mar 13, 2006 7:33 am
Location: mumbai

thanx Ray

Post by rhaddur »

ray.wurlod wrote:You can add a second Transformer stage to your job design to filter the backdated records onto a separate stream before progressing with whatever processing is currently in your job. The backdated re ...
Every day I get 300 million records, and each record is 300 to 400 bytes. If I add another transformation, my transformation will take double the time.
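
As a quick sanity check on these figures, the sketch below works out the implied daily volume. It assumes decimal gigabytes (10^9 bytes) and ignores delimiters and file overhead, so treat the result as a rough estimate.

```python
RECORDS_PER_DAY = 300_000_000
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes

def daily_volume_gb(record_bytes):
    """Approximate daily volume in GB for a given average record size."""
    return RECORDS_PER_DAY * record_bytes / BYTES_PER_GB

low = daily_volume_gb(300)
high = daily_volume_gb(400)
print(f"{low:.0f} GB to {high:.0f} GB per day")  # 90 GB to 120 GB per day
```

That range is broadly in line with the 100 GB to 150 GB quoted in the first post.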

In my transformation I have 5 constraints (current month + past 3 months, and one for errors) to filter each month's records into a separate file.

It's better to go for an alternative loading method.
Rhaddur
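
The five constraints described in this post can be pictured roughly as follows. This is a plain Python sketch, not Transformer constraint syntax; the field name `event_day` and the sample dates are hypothetical, and rows with a missing, future, or too-old date are assumed to go to the error output.

```python
from datetime import date

def month_offset(event_day, business_day):
    """Whole months between the record's month and the business month."""
    return ((business_day.year - event_day.year) * 12
            + (business_day.month - event_day.month))

def route_by_month(rows, business_day):
    """Send each row to one of five outputs: months 0..3 back, or error."""
    outputs = {0: [], 1: [], 2: [], 3: [], "error": []}
    for row in rows:
        day = row.get("event_day")  # hypothetical field name
        offset = month_offset(day, business_day) if day is not None else None
        key = offset if offset in (0, 1, 2, 3) else "error"
        outputs[key].append(row)
    return outputs

# Hypothetical sample data.
rows = [
    {"id": 1, "event_day": date(2007, 3, 10)},   # current month
    {"id": 2, "event_day": date(2007, 1, 31)},   # 2 months back
    {"id": 3, "event_day": date(2006, 12, 1)},   # 3 months back
    {"id": 4, "event_day": date(2006, 11, 30)},  # too old
    {"id": 5, "event_day": None},                # missing date
]
outputs = route_by_month(rows, date(2007, 3, 15))
```

Each row is tested once and lands in exactly one output, which is why the routing itself is a single pass over the data.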
rhaddur
Participant
Posts: 52
Joined: Mon Mar 13, 2006 7:33 am
Location: mumbai

thanx DSguru2B

Post by rhaddur »

DSguru2B wrote:Welcome Aboard,
Don't overload the Transformer then. Modularize your job.
I was not able to understand what you were saying about the back end date. If you mean that the data is not ordered, then you can always sort it on the date field and then load it.
Please do read it again.
Rhaddur
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: thanx Ray

Post by ray.wurlod »

rhaddur wrote:If I add another transformation, my transformation will take double the time.
That is not correct, and shows a weak understanding of how DataStage works.

Without buffering, the Transformer stages will execute in the same process, and therefore add nothing to the overall run time - if you are performing the same processing in the two that you were performing in the one.

If you enable inter-process row buffering, or explicitly use an IPC stage, then the Transformer stages will be forced to execute in separate processes, but rows will be transferred a buffer-full at a time rather than one at a time (this is called "pipeline parallelism").
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
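
The buffering behaviour described above can be illustrated with a small Python analogy. This is not how DataStage is implemented: threads stand in for the separate processes, a bounded queue stands in for the inter-process buffer, and the batch size and per-row operations are hypothetical. It shows two stages working on different batches at the same time, with rows handed over a batch at a time.

```python
import queue
import threading

BATCH_SIZE = 3   # hypothetical buffer size, in rows
SENTINEL = None  # marks end of data

def batches(rows, size):
    """Group rows into lists of at most 'size' rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def stage_one(rows, out_q):
    """First stage: transform rows, hand them on a buffer-full at a time."""
    for batch in batches(rows, BATCH_SIZE):
        out_q.put([r * 2 for r in batch])
    out_q.put(SENTINEL)

def stage_two(in_q, results):
    """Second stage: consume batches while stage one keeps producing."""
    while True:
        batch = in_q.get()
        if batch is SENTINEL:
            break
        results.extend(r + 1 for r in batch)

buffer = queue.Queue(maxsize=2)  # bounded, so the producer cannot run far ahead
results = []
producer = threading.Thread(target=stage_one, args=(range(10), buffer))
consumer = threading.Thread(target=stage_two, args=(buffer, results))
producer.start()
consumer.start()
producer.join()
consumer.join()
```

With one producer and one consumer on a FIFO queue, row order is preserved, and the second stage starts as soon as the first batch is available instead of waiting for the whole input.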
rhaddur
Participant
Posts: 52
Joined: Mon Mar 13, 2006 7:33 am
Location: mumbai

Re: thanx Ray

Post by rhaddur »

ray.wurlod wrote:
rhaddur wrote:If I add another transformation, my transformation will take double the time.
That is not correct, and shows a weak understanding of how DataStage works.

Without bufferi ...
If I re-process the same files in a separate transformation, how will the time not increase?

Anyhow, where do you suggest adding another transformation?

My job works like this:

sorc_file--------XFM------Target files
Rhaddur
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Strongly suggest you read the Server Job Developer's Guide pdf that is installed on your PC. Particularly:

Chapter 2: Optimizing Performance in Server Jobs

Chapter 12: Inter-Process Stages

The former will give you a better understanding of performance in general, and the latter covers the specifics of how IPC stages can help with "pipeline parallelism". You'll also find some caveats and issues with the IPC stage if you search the forum.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Re: thanx Ray

Post by DSguru2B »

rhaddur wrote:
Anyhow, where do you suggest adding another transformation?
My job works like this:
sorc_file--------XFM------Target files
Two Transformers next to each other. If you want, you can even split the load between the two Transformers, so that each takes an equal share.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
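
The post above does not say how the load would be divided. One simple way to give two streams an equal share is round-robin, sketched here in plain Python as an assumption rather than as a DataStage feature; the function name is hypothetical.

```python
def split_round_robin(rows, n):
    """Deal rows out to n streams in turn so each gets an equal share."""
    streams = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        streams[i % n].append(row)
    return streams

first, second = split_round_robin(list(range(7)), 2)
```

With an odd number of rows the streams differ by at most one row, so neither side ends up carrying noticeably more work.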