Please provide alternative method of loading

rhaddur · Post by **rhaddur** » Fri Mar 02, 2007 6:18 am

Hi Gurus

1) I have a multi-instance job that runs on an hourly basis. It creates 24*4 = 96 (MONTH) flat files. We used to load the data with sqlldr using the conventional method.

2) The data volume is 80 GB to 100 GB.

3) We receive data every day, with one field holding the day timestamp. Sometimes we get backdated data for the current and past 3 months. Our target is an Oracle table partitioned by day.

4) The volume has now increased to 100 GB to 150 GB, so we are using the sqlldr direct method.

5) But when backdated data arrives, the indexes for the backdated data go into an invalid state.

6) So I load each day's data into a temp table. At the end of the business day, once loading has finished, I exchange that partition and copy the small amount of backdated data into my original table.

7) We are unable to separate the backdated data day-wise because the XFM stage is already overloaded.

Please suggest an alternative method of loading this data that reduces the manual intervention of exchanging the partition and copying the backdated data.

DSguru2B · Post by **DSguru2B** » Fri Mar 02, 2007 8:59 am

Welcome aboard.
Don't overload the Transformer then. Modularize your job.
I was not able to understand what you were saying about back end date. If you mean that it's not ordered, then you can always sort your data on the date field and then load it.

ray.wurlod · Post by **ray.wurlod** » Fri Mar 02, 2007 3:05 pm

You can add a second Transformer stage to your job design to filter the backdated records onto a separate stream before progressing with whatever processing is currently in your job. The backdated records could be processed in the same job, or written to a staging file for processing by a subsequent job conditionally started if there are any backdated records: several approaches are available for the latter.
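As a rough illustration of the split described above, here is a minimal Python sketch (not DataStage code). It routes each record to a "current" or a "backdated" stream in a single pass. The field name `event_day` and the rule "anything older than the load day is backdated" are assumptions made for the example, not details from the original job.

```python
from datetime import date


def split_backdated(records, load_day):
    """Route each record to the current or the backdated stream in one pass."""
    current, backdated = [], []
    for rec in records:
        # 'event_day' is a hypothetical field holding the record's day timestamp.
        if rec["event_day"] < load_day:
            backdated.append(rec)   # would go to the staging file / second stream
        else:
            current.append(rec)     # continues through the normal processing
    return current, backdated


rows = [
    {"id": 1, "event_day": date(2007, 3, 2)},
    {"id": 2, "event_day": date(2007, 1, 15)},
    {"id": 3, "event_day": date(2007, 3, 2)},
]
current, backdated = split_backdated(rows, load_day=date(2007, 3, 2))
```

Each record is still read only once; the extra stage only adds one comparison per row before the existing logic runs.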

rhaddur · Post by **rhaddur** » Sat Mar 03, 2007 4:29 am

ray.wurlod wrote:You can add a second Transformer stage to your job design to filter the backdated records onto a separate stream before progressing with whatever processing is currently in your job. The backdated re ...

Every day I get 300 million records, and each record is 300 to 400 bytes. If I add another transformation, my transformation will take double the time.

In my transformation I have 5 constraints (the current month, the past 3 months, and one for errors) to filter each month's records into a separate file.

It would be better to go for an alternative loading method.

rhaddur · Post by **rhaddur** » Sat Mar 03, 2007 4:37 am

DSguru2B wrote:Welcome Aboard,
Don't overload the Transformer then. Modularize your job.
I was not able to understand what you were saying about back end date. If you mean that it's not ordered, then you can always sort your data on the date field and then load it.

Please read it again.

ray.wurlod · Post by **ray.wurlod** » Sat Mar 03, 2007 5:51 am

rhaddur wrote:If I add another transformation, my transformation will take double the time.

That is not correct, and shows a weak understanding of how DataStage works.

Without buffering, the Transformer stages will execute in the same process, and therefore add nothing to the overall run time - if you are performing the same processing in the two that you were performing in the one.

If you enable inter-process row buffering, or explicitly use an IPC stage, then the Transformer stages will be forced to execute in separate processes, but rows will be transferred a buffer-full at a time rather than one at a time (this is called "pipeline parallelism").
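To make the buffer-at-a-time hand-off concrete, here is a small Python sketch of two stages connected by a bounded queue. It is only an analogy: DataStage uses separate processes and its own buffers, whereas this uses threads, and `BATCH_SIZE`, the queue size, and the arithmetic in each stage are made-up stand-ins.

```python
import queue
import threading

BATCH_SIZE = 3    # hypothetical buffer size, in rows
DONE = object()   # sentinel marking the end of the data


def stage_one(rows, out_q):
    """First stage: transform rows and pass them on a buffer-full at a time."""
    batch = []
    for row in rows:
        batch.append(row * 2)        # stand-in for real transformation logic
        if len(batch) == BATCH_SIZE:
            out_q.put(batch)
            batch = []
    if batch:
        out_q.put(batch)             # flush the final, partly filled buffer
    out_q.put(DONE)


def stage_two(in_q, results):
    """Second stage: runs concurrently, consuming buffers as they arrive."""
    while True:
        batch = in_q.get()
        if batch is DONE:
            break
        results.extend(r + 1 for r in batch)


def run_pipeline(rows):
    q = queue.Queue(maxsize=4)       # bounded, like a fixed-size row buffer
    results = []
    t1 = threading.Thread(target=stage_one, args=(rows, q))
    t2 = threading.Thread(target=stage_two, args=(q, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


output = run_pipeline(range(7))
```

The second stage starts working as soon as the first buffer arrives, so the two stages overlap in time instead of running one after the other.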

rhaddur · Post by **rhaddur** » Sat Mar 03, 2007 6:20 am

ray.wurlod wrote:
rhaddur wrote:If I add another transformation, my transformation will take double the time.
That is not correct, and shows a weak understanding of how DataStage works.

Without bufferi ...

If I reprocess the same files in a separate transformation, how will the time not increase?

Anyhow, where do you suggest adding another transformation?

My job works like this:

sorc_file--------XFM------Target files

chulett · Post by **chulett** » Sat Mar 03, 2007 8:06 am

Strongly suggest you read the Server Job Developer's Guide pdf that is installed on your PC. Particularly:

Chapter 2: Optimizing Performance in Server Jobs

Chapter 12: Inter-Process Stages

The former will get you a better understanding of performance in general and the latter specifics on how IPC stages can help with "pipeline parallelism". You'll also find some caveats and issues with them (the IPC stage) as well if you search the forum.

DSguru2B · Post by **DSguru2B** » Sat Mar 03, 2007 10:59 am

rhaddur wrote:
Anyhow, where do you suggest adding another transformation?
My job works like this:
sorc_file--------XFM------Target files

Two Transformers next to each other. If you want, you can even split the load between the two Transformers so that both of them take an equal share.
