Data Loading Based On Size of the Source

Post questions here relating to DataStage Enterprise/PX Edition, covering areas such as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

shivakumar
Participant
Posts: 31
Joined: Wed Mar 17, 2004 3:33 am

Data Loading Based On Size of the Source

Post by shivakumar »

Hi,

I have a requirement to load data into the target in portions based on the size of the source, executing the job once per portion.

For example, if my source table is 100 MB, then I have to run my job 10 times, loading 10 MB of data into the target table on each run.

If the job fails on the first run, then I have to load that first 10 MB of data again. Once the full 100 MB has been loaded, I have to start again from the beginning on the 11th day.

Can anyone help me with this?

Thanks and Regards
Siva
Maveric
Participant
Posts: 388
Joined: Tue Mar 13, 2007 1:28 am

Post by Maveric »

Think of a multi-instance job. Determine the range of key values in the source data and restrict each run with that range in the SQL WHERE clause, passing either the WHERE clause or the whole SQL statement in as a job parameter.
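
A minimal sketch of that approach from the command line, assuming the job's source stage uses user-defined SQL that references a parameter (here called CHUNK_WHERE); the project, job, column, and parameter names are placeholders, not anything from an actual project:

[code]
#!/bin/sh
# Sketch: run one chunk of the load by passing a range-restricted WHERE
# clause as a job parameter. All names here are hypothetical.

LOW=$1     # e.g. 1
HIGH=$2    # e.g. 100000

dsjob -run \
      -param CHUNK_WHERE="WHERE key_col BETWEEN ${LOW} AND ${HIGH}" \
      -jobstatus \
      MyProject LoadTargetJob
[/code]

With -jobstatus, dsjob waits for the job to finish and reflects the result in its exit status (the exact code mapping varies by release, so check your dsjob documentation), which lets a wrapper script decide whether to repeat the same range or move on to the next one.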
shivakumar
Participant
Posts: 31
Joined: Wed Mar 17, 2004 3:33 am

Post by shivakumar »

[quote="Maveric"]Think of a multi instance job. determine the range of values in the source data. Give the range in the SQL where clause. Probably pass the where clause as parameter or the SQL statement as parameter.[/quote]


Hi,

Actually, the requirement here is that I have to run the same job one run after another, because as per the requirement I have to load the first 10 MB of data, then the next 10 MB, and so on.

If I create multiple instances, then the jobs will run in parallel, not sequentially.

Regards
Siva
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Not true. Multi-instance jobs run as and when you request.

That said, you don't need a multi-instance job to run the same job over and over with only one instance running at any one time.

Why do you have this strange requirement to load only 10 MB at a time?
Is this your design, or an imposed requirement? If the latter, require them to justify it; it defeats the whole purpose of an ETL tool.
Last edited by ray.wurlod on Thu Aug 02, 2007 3:12 pm, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well, that sure gets an honorary membership into our Hall of Very Odd Requirements. While you may be able to limit something based on a record count, I have no clue how you'd do the same for 'source size' unless you compute the average record length and turn that into a record-count limit on a case-by-case basis. :?

Sounds like a job for a Looping Sequence.
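
A rough sketch of that size-to-rows arithmetic, assuming the average record length has been obtained from DBMS statistics or a sampled query; the numbers and names below are purely illustrative:

[code]
#!/bin/sh
# Sketch: turn a 10 MB size budget into a record-count limit.
# AVG_ROW_BYTES is a hypothetical figure; in practice it would come
# from DBMS statistics or a sampled average record length.

CHUNK_BYTES=10485760        # 10 MB
AVG_ROW_BYTES=512           # assumed average record length

ROWS_PER_CHUNK=$((CHUNK_BYTES / AVG_ROW_BYTES))
echo "Limit each run to ${ROWS_PER_CHUNK} rows"

# That count could then be passed to the job as a parameter, or (if my
# memory of the dsjob options is right) imposed as a row limit directly:
# dsjob -run -rows ${ROWS_PER_CHUNK} MyProject LoadTargetJob
[/code]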
-craig

"You can never have too many knives" -- Logan Nine Fingers
Maveric
Participant
Posts: 388
Joined: Tue Mar 13, 2007 1:28 am

Post by Maveric »

Making the job multi-instance gives you the flexibility of running it simultaneously under different instance IDs, but you can still run it one instance after another. If you are scheduling the runs, then it is also easier to identify which instance failed and re-run just that instance.
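
To make that sequential, restartable behaviour concrete, here is a sketch of a driver script that runs ten chunk instances one after another and remembers the last chunk that succeeded; the job name, instance IDs, key ranges, and checkpoint file are all invented for illustration:

[code]
#!/bin/sh
# Sketch: run ten chunks sequentially as instances of a multi-instance job
# (job.invocationid). On failure, stop; a re-run resumes at the failed
# chunk. After all ten chunks, the checkpoint is cleared so the next run
# starts from chunk 1 again. All names are hypothetical.

CHECKPOINT=/tmp/load_checkpoint
START=$(cat "$CHECKPOINT" 2>/dev/null || echo 0)

for i in $(seq $((START + 1)) 10); do
    LOW=$(( (i - 1) * 100000 + 1 ))
    HIGH=$(( i * 100000 ))

    dsjob -run \
          -param CHUNK_WHERE="WHERE key_col BETWEEN ${LOW} AND ${HIGH}" \
          -jobstatus \
          MyProject LoadTargetJob.chunk${i}

    # Treating any non-zero exit as failure; the exact -jobstatus
    # exit-code mapping varies by release, so verify against your docs.
    if [ $? -ne 0 ]; then
        echo "Chunk ${i} failed; re-run this script to retry it." >&2
        exit 1
    fi
    echo "$i" > "$CHECKPOINT"
done

rm -f "$CHECKPOINT"
[/code]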