Mload Vs Enterprise stages while using Teradata as target DB

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Mload Vs Enterprise stages while using Teradata as target DB

Post by kishorenvkb »

Hello,

It may be a very basic question for most of you.

When do you use Mload Vs Enterprise Stage when loading the data into Teradata.

My situation is that I have 750M rows to be inserted into an existing table that has already 2Billion rows. I also need to pre-delete (based on the run-date before inserting the 750M rows). I do have the PPI defined on the date column for this table.

Thanks
keshav0307
Premium Member
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

using Teradata Enterprize stage,you can not delete record based on a condition in the input row.

for only Insert Teradata Enterprize stage is the fatstest.

then LOAD for update, insert and/or delete.
shershahkhan
Participant
Posts: 64
Joined: Fri Jan 25, 2008 4:41 am

Post by shershahkhan »

There is no simple answer when to use which, but if you want to run the stage in parallel you should use Enterprise stage although Enterrpise stage do have some bugs but that are in certain senarios, Here is how you should to it

Enterprise stage:
1) Run in parallel
2) Change the write mode to Upsert
3) Upsert mode should be Delete then insert(you need to check if the delete is run per record or per run, ,i am not sure)
4) Give your queries and it will work
5) The logs can be found from the Terasync table
6) you can't provide the error tables names(disadvantage)

Mload stage:
1) Run in sequencial mode
2) you can give the delete statment in the before Script part
3) to get details of log you need to open the Mload log file (disadvantage)
4) You can give details of Error tables, log tables etc

If you have any particular satuation let us know and we will tell you which option is the best
Nagin
Charter Member
Charter Member
Posts: 89
Joined: Thu Jan 26, 2006 12:37 pm

Post by Nagin »

shershahkhan wrote: Mload stage:
1) Run in sequencial mode
Doesn't the number of sessions we specify in mload script represent parallelism???.

For such a big load of 750 million rows with pre delets I would rather take the multi load path rather than the Enterprise stage as I can do both in one mload script.
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

I am currently using the MLOAD with a Before Script to delete everything from the table with a table.date > run-date.

As I have communicated before, the destination table would still have few billion rows even after the above delete statement.

Later I am using MLOAD Insert to insert the 750M rows into the table.

I have been suggested to do the following for better performance:
1) From the sequence, execute a script to pre-delete the rows where Date > run-date
2) Call the load job, which uses Enterprise - Insert stage to load the rows.

I just wanted to know if this a better option that I have already and why.

Thanks again for all your responses.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Your best option is to use the Teradata Enterprise stage. Since you have a range of values you want to pre-delete, you can do that in the -open option of Teradata Enterprise. You should be able to pass the constraint value via a job parameter.

After the pre-delete, it's preferrable to use Teradata Enterprise for the load since you are only appending rows. Teradata Enterpise can run in parallel, whereas MultiLoad is not truly a parallel process. Although MultiLoad allows you to use multiple sessions, MultiLoad itself runs as a single process on one node, it reads from the named pipe sequentially, and it spreads that data over the multiple sessions. Teradata Enterprise will run in multiple processes that can be spread over multiple nodes, and each of those processes can spread the data over multiple sessions (determined by the sessionsperplayer option).
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

Wonderful. Thanks.

Can you please expand on -open option of Enterprise Stage?
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Open the Teradata Enterprise stage. From the Properties tab of the Input page, click on Options folder and you'll see the Open Command option in the window of "Available properties to add". Click on it to add that option, type in your DELETE statement in the Open Command field.
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

Doesn't the number of sessions we specify in mload script represent parallelism???.
Nope.The number of sessions define connections to TD, not how parallelism or degree of parallelism is controled and achieved. IF "sessions" were controlling parallelism one could use Mload stage in server jobs and run them in parallel which, however, is not the case.

In TD its done through the AMPS processing.DS has its own way of implementing parallelism. You can find details in its documentation.
Teradata Certified Master V2R5
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

Thanks to "TOSHEA" and all other responses.
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

[quote="toshea"]Open the Teradata Enterprise stage. From the Properties tab of the Input page, click on Options folder and you'll see the Open Command option in the window of "Available properties to add". Click on it to add that option, type in your DELETE statement in the Open Command field.[/quote]

Toshea, when I use the open command in the Enterprise stage, to pre-delete the rows, I get the following error:

TDEnt_Ext_Recov_Strs_STID: TeraUtils:DB Call Failure(success check) Info = 43, Code = 3704, Message = ' ('0A'X) is not a valid Teradata SQL token.

TDEnt_Ext_Recov_Strs_STID: DBCAREA msg_txt: Completed successfully.

TDEnt_Ext_Recov_Strs_STID: TeraUtils:ExecuteImmediate failed for 'DELETE from qascst.SALES_RECOVERED_STORES
'. Error Code = 3704

Please assist.
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

[quote="toshea"]Open the Teradata Enterprise stage. From the Properties tab of the Input page, click on Options folder and you'll see the Open Command option in the window of "Available properties to add". Click on it to add that option, type in your DELETE statement in the Open Command field.[/quote]

Toshea, when I use the open command in the Enterprise stage, to pre-delete the rows, I get the following error:

TDEnt_Ext_Recov_Strs_STID: TeraUtils:DB Call Failure(success check) Info = 43, Code = 3704, Message = ' ('0A'X) is not a valid Teradata SQL token.

TDEnt_Ext_Recov_Strs_STID: DBCAREA msg_txt: Completed successfully.

TDEnt_Ext_Recov_Strs_STID: TeraUtils:ExecuteImmediate failed for 'DELETE from qascst.SALES_RECOVERED_STORES
'. Error Code = 3704

Please assist.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

0A is the ASCII hex value for a line feed character which the Teradata CLIv2 does not like. Try taking out any carriage returns in your statement. Does your DELETE statement span multiple lines? If you did a cut-and-paste from somewhere, make sure any terminating line feed was not included in the cut. Only highlight the statement text itself and not the whole line.
kishorenvkb
Participant
Posts: 54
Joined: Mon Dec 24, 2007 9:27 am

Post by kishorenvkb »

[quote="toshea"]0A is the ASCII hex value for a line feed character which the Teradata CLIv2 does not like. Try taking out any carriage returns in your statement. Does your DELETE statement span multiple lines? If you did a cut-and-paste from somewhere, make sure any terminating line feed was not included in the cut. Only highlight the statement text itself and not the whole line.[/quote]

Okay. Will try that. Can the SQL be a multi statement? If so, do we need to just add a ; and keep going without any line feed character?
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Yes, you can specify multiple statements separated by semicolon characters. Just don't use a line feed.
Post Reply