Mload Vs Enterprise stages while using Teradata as target DB
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
Hello,
It may be a very basic question for most of you.
When do you use MLoad vs. the Enterprise stage when loading data into Teradata?
My situation is that I have 750M rows to be inserted into an existing table that already has 2 billion rows. I also need to pre-delete (based on the run date) before inserting the 750M rows. I do have a PPI defined on the date column for this table.
Thanks
-
- Participant
- Posts: 64
- Joined: Fri Jan 25, 2008 4:41 am
There is no simple answer as to when to use which, but if you want to run the stage in parallel you should use the Enterprise stage, although the Enterprise stage does have some bugs in certain scenarios. Here is how you would do it:
Enterprise stage:
1) Run in parallel
2) Change the write mode to Upsert
3) Upsert mode should be Delete then insert (you need to check whether the delete runs per record or per run; I am not sure)
4) Give your queries and it will work
5) The logs can be found in the Terasync table
6) You can't provide the error table names (disadvantage)
Mload stage:
1) Runs in sequential mode
2) You can give the delete statement in the before-script part
3) To get details of the log you need to open the MLoad log file (disadvantage)
4) You can give details of error tables, log tables, etc.
If you have any particular situation, let us know and we will tell you which option is best
shershahkhan wrote: Mload stage:
1) Run in sequential mode
Doesn't the number of sessions we specify in the mload script represent parallelism?
For such a big load of 750 million rows with pre-deletes, I would rather take the MultiLoad path than the Enterprise stage, as I can do both in one mload script.
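For reference, a standalone MultiLoad script along those lines might look like the sketch below. This is only an illustration: the database, table, column, layout, and file names are all placeholders, and your record layout will differ. The point is that the pre-delete runs as a plain SQL statement in the support environment before the IMPORT task begins.

```
.LOGTABLE qascst.sales_mload_log;
.LOGON tdpid/username,password;

/* pre-delete for the run date (placeholder table and column names) */
DELETE FROM qascst.SALES_RECOVERED_STORES
    WHERE sale_date > DATE '2008-06-30';

.BEGIN IMPORT MLOAD
    TABLES qascst.SALES_RECOVERED_STORES
    WORKTABLES qascst.SALES_WT
    ERRORTABLES qascst.SALES_ET qascst.SALES_UV;

.LAYOUT sales_layout;
    .FIELD store_id   * INTEGER;
    .FIELD sale_date  * DATE;
    .FIELD amount     * DECIMAL(18,2);

.DML LABEL ins_sales;
    INSERT INTO qascst.SALES_RECOVERED_STORES
        VALUES (:store_id, :sale_date, :amount);

.IMPORT INFILE sales.dat
    LAYOUT sales_layout
    APPLY ins_sales;

.END MLOAD;
.LOGOFF;
```

When the MLoad stage generates the script for you, the DELETE would instead go into the before-script section mentioned above.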
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
I am currently using the MLOAD with a Before Script to delete everything from the table with a table.date > run-date.
As I have communicated before, the destination table would still have a few billion rows even after the above delete statement.
Later I am using MLOAD Insert to insert the 750M rows into the table.
It has been suggested that I do the following for better performance:
1) From the sequence, execute a script to pre-delete the rows where Date > run-date
2) Call the load job, which uses Enterprise - Insert stage to load the rows.
I just wanted to know whether this is a better option than what I already have, and why.
Thanks again for all your responses.
Your best option is to use the Teradata Enterprise stage. Since you have a range of values you want to pre-delete, you can do that in the -open option of Teradata Enterprise. You should be able to pass the constraint value via a job parameter.
After the pre-delete, it's preferable to use Teradata Enterprise for the load since you are only appending rows. Teradata Enterprise can run in parallel, whereas MultiLoad is not truly a parallel process. Although MultiLoad allows you to use multiple sessions, MultiLoad itself runs as a single process on one node, it reads from the named pipe sequentially, and it spreads that data over the multiple sessions. Teradata Enterprise will run in multiple processes that can be spread over multiple nodes, and each of those processes can spread the data over multiple sessions (determined by the sessionsperplayer option).
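As a concrete sketch of that approach, the Open Command could be a single-line DELETE with the cutoff supplied through a DataStage job parameter. The table, column, and parameter names here are placeholders:

```
DELETE FROM qascst.SALES_RECOVERED_STORES WHERE sale_date > DATE '#RunDate#'
```

Keeping the whole statement on one line matters, since the CLIv2 interface rejects embedded line-feed characters (error 3704).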
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
Doesn't the number of sessions we specify in the mload script represent parallelism?
Nope. The number of sessions defines connections to Teradata, not how parallelism or the degree of parallelism is controlled and achieved. If "sessions" were controlling parallelism, one could use the MLoad stage in server jobs and run them in parallel, which, however, is not the case.
In Teradata it is done through AMP processing. DataStage has its own way of implementing parallelism; you can find the details in its documentation.
Teradata Certified Master V2R5
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
[quote="toshea"]Open the Teradata Enterprise stage. From the Properties tab of the Input page, click on Options folder and you'll see the Open Command option in the window of "Available properties to add". Click on it to add that option, type in your DELETE statement in the Open Command field.[/quote]
Toshea, when I use the open command in the Enterprise stage, to pre-delete the rows, I get the following error:
TDEnt_Ext_Recov_Strs_STID: TeraUtils:DB Call Failure(success check) Info = 43, Code = 3704, Message = ' ('0A'X) is not a valid Teradata SQL token.
TDEnt_Ext_Recov_Strs_STID: DBCAREA msg_txt: Completed successfully.
TDEnt_Ext_Recov_Strs_STID: TeraUtils:ExecuteImmediate failed for 'DELETE from qascst.SALES_RECOVERED_STORES
'. Error Code = 3704
Please assist.
0A is the ASCII hex value for a line-feed character, which the Teradata CLIv2 does not like. Try taking out any carriage returns or line feeds in your statement. Does your DELETE statement span multiple lines? If you did a cut-and-paste from somewhere, make sure a terminating line feed was not included in the cut. Only highlight the statement text itself, not the whole line.
-
- Participant
- Posts: 54
- Joined: Mon Dec 24, 2007 9:27 am
[quote="toshea"]0A is the ASCII hex value for a line feed character which the Teradata CLIv2 does not like. Try taking out any carriage returns in your statement. Does your DELETE statement span multiple lines? If you did a cut-and-paste from somewhere, make sure any terminating line feed was not included in the cut. Only highlight the statement text itself and not the whole line.[/quote]
Okay, I will try that. Can the SQL be a multi-statement request? If so, do we just add a semicolon and keep going without any line-feed character?