Mload Vs Enterprise stages while using Teradata as target DB
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
Hello,
It may be a very basic question for most of you.
When do you use MLoad vs. the Enterprise stage when loading data into Teradata?
My situation is that I have 750M rows to be inserted into an existing table that already has 2 billion rows. I also need to pre-delete (based on the run date) before inserting the 750M rows. I do have a PPI defined on the date column for this table.
Thanks
-
- Participant
- Posts: 64
- Joined: Fri Jan 25, 2008 4:41 am
There is no simple answer as to when to use which, but if you want to run the stage in parallel you should use the Enterprise stage, although the Enterprise stage does have some bugs in certain scenarios. Here is how you would do it:
Enterprise stage:
1) Run in parallel
2) Change the write mode to Upsert
3) Upsert mode should be Delete then insert (you need to check whether the delete runs per record or per run; I am not sure)
4) Give your queries and it will work
5) The logs can be found in the Terasync table
6) You can't provide the error table names (disadvantage)
Mload stage:
1) Runs in sequential mode
2) You can give the delete statement in the before-script part
3) To get details of the log you need to open the MLoad log file (disadvantage)
4) You can give details of error tables, log tables, etc.
If you have any particular situation, let us know and we will tell you which option is best
shershahkhan wrote: Mload stage:
1) Run in sequential mode
Doesn't the number of sessions we specify in the mload script represent parallelism?
For such a big load of 750 million rows with pre-deletes, I would rather take the MultiLoad path than the Enterprise stage, as I can do both in one mload script.
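For reference, a standalone MultiLoad script along those lines might look like the sketch below. This is only an illustration: the database, table, column, layout, and file names are all placeholders, and your record layout will differ. The point is that the pre-delete runs as a plain SQL statement in the support environment before the IMPORT task begins.

```
.LOGTABLE qascst.sales_mload_log;
.LOGON tdpid/username,password;

/* pre-delete for the run date (placeholder table and column names) */
DELETE FROM qascst.SALES_RECOVERED_STORES
    WHERE sale_date > DATE '2008-06-30';

.BEGIN IMPORT MLOAD
    TABLES qascst.SALES_RECOVERED_STORES
    WORKTABLES qascst.SALES_WT
    ERRORTABLES qascst.SALES_ET qascst.SALES_UV;

.LAYOUT sales_layout;
    .FIELD store_id   * INTEGER;
    .FIELD sale_date  * DATE;
    .FIELD amount     * DECIMAL(18,2);

.DML LABEL ins_sales;
    INSERT INTO qascst.SALES_RECOVERED_STORES
        VALUES (:store_id, :sale_date, :amount);

.IMPORT INFILE sales.dat
    LAYOUT sales_layout
    APPLY ins_sales;

.END MLOAD;
.LOGOFF;
```

When the MLoad stage generates the script for you, the DELETE would instead go into the before-script section mentioned above.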
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
I am currently using the MLOAD with a Before Script to delete everything from the table with a table.date > run-date.
As I have communicated before, the destination table would still have a few billion rows even after the above delete statement.
Later I am using MLOAD Insert to insert the 750M rows into the table.
It has been suggested that I do the following for better performance:
1) From the sequence, execute a script to pre-delete the rows where Date > run-date
2) Call the load job, which uses Enterprise - Insert stage to load the rows.
I just wanted to know whether this is a better option than what I already have, and why.
Thanks again for all your responses.
Your best option is to use the Teradata Enterprise stage. Since you have a range of values you want to pre-delete, you can do that in the -open option of Teradata Enterprise. You should be able to pass the constraint value via a job parameter.
After the pre-delete, it's preferable to use Teradata Enterprise for the load since you are only appending rows. Teradata Enterprise can run in parallel, whereas MultiLoad is not truly a parallel process. Although MultiLoad allows you to use multiple sessions, MultiLoad itself runs as a single process on one node, it reads from the named pipe sequentially, and it spreads that data over the multiple sessions. Teradata Enterprise will run in multiple processes that can be spread over multiple nodes, and each of those processes can spread the data over multiple sessions (determined by the sessionsperplayer option).
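As a concrete sketch of that approach, the Open Command could be a single-line DELETE with the cutoff supplied through a DataStage job parameter. The table, column, and parameter names here are placeholders:

```
DELETE FROM qascst.SALES_RECOVERED_STORES WHERE sale_date > DATE '#RunDate#'
```

Keeping the whole statement on one line matters, since the CLIv2 interface rejects embedded line-feed characters (error 3704).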
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
Doesn't the number of sessions we specify in the mload script represent parallelism?
Nope. The number of sessions defines connections to Teradata, not how parallelism or the degree of parallelism is controlled and achieved. If "sessions" were controlling parallelism, one could use the MLoad stage in server jobs and run them in parallel, which, however, is not the case.
In Teradata it is done through AMP processing. DataStage has its own way of implementing parallelism; you can find the details in its documentation.
Teradata Certified Master V2R5
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
[quote="toshea"]Open the Teradata Enterprise stage. From the Properties tab of the Input page, click on Options folder and you'll see the Open Command option in the window of "Available properties to add". Click on it to add that option, type in your DELETE statement in the Open Command field.[/quote]
Toshea, when I use the open command in the Enterprise stage, to pre-delete the rows, I get the following error:
TDEnt_Ext_Recov_Strs_STID: TeraUtils:DB Call Failure(success check) Info = 43, Code = 3704, Message = ' ('0A'X) is not a valid Teradata SQL token.
TDEnt_Ext_Recov_Strs_STID: DBCAREA msg_txt: Completed successfully.
TDEnt_Ext_Recov_Strs_STID: TeraUtils:ExecuteImmediate failed for 'DELETE from qascst.SALES_RECOVERED_STORES
'. Error Code = 3704
Please assist.
0A is the ASCII hex value for a line-feed character, which the Teradata CLIv2 does not like. Try taking out any carriage returns or line feeds in your statement. Does your DELETE statement span multiple lines? If you did a cut-and-paste from somewhere, make sure a terminating line feed was not included in the cut. Only highlight the statement text itself, not the whole line.
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
[quote="toshea"]0A is the ASCII hex value for a line feed character which the Teradata CLIv2 does not like. Try taking out any carriage returns in your statement. Does your DELETE statement span multiple lines? If you did a cut-and-paste from somewhere, make sure any terminating line feed was not included in the cut. Only highlight the statement text itself and not the whole line.[/quote]
Okay, I will try that. Can the SQL be a multi-statement request? If so, do we just add a semicolon and keep going without any line-feed character?