Teradata Enterprise Stage vs Teradata Multiload Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

gsherry1
Charter Member
Posts: 173
Joined: Fri Jun 17, 2005 8:31 am
Location: Canada

Teradata Enterprise Stage vs Teradata Multiload Stage

Post by gsherry1 »

Hello Forum,

I have been using the MultiLoad stage for situations where I require upserts, which I have been unable to do in the Enterprise stage. Is there any other advantage to using the MultiLoad stage over the Enterprise stage?
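
For reference, by "upsert" I mean the paired UPDATE/INSERT DML that MultiLoad applies in a single pass. A minimal sketch of the kind of script this corresponds to (all database, table, column and logon names here are invented placeholders; the stage normally generates this for you):

    .LOGTABLE mydb.ml_restartlog_customer;
    .LOGON tdpid/your_user,your_password;
    .BEGIN MLOAD TABLES mydb.customer
        WORKTABLES  mydb.WT_customer
        ERRORTABLES mydb.ET_customer mydb.UV_customer;
    .LAYOUT inrec;
        .FIELD cust_id   * VARCHAR(10);
        .FIELD cust_name * VARCHAR(50);
    .DML LABEL upsert_customer
        DO INSERT FOR MISSING UPDATE ROWS;
        UPDATE mydb.customer
           SET cust_name = :cust_name
         WHERE cust_id   = :cust_id;
        INSERT INTO mydb.customer (cust_id, cust_name)
        VALUES (:cust_id, :cust_name);
    .IMPORT INFILE customer.dat
        FORMAT VARTEXT '|'
        LAYOUT inrec
        APPLY upsert_customer;
    .END MLOAD;
    .LOGOFF;

The UPDATE comes first under the DML label; the INSERT is applied only when the UPDATE finds no matching row.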

Thanks,

Greg
sanjeev_sithara
Participant
Posts: 15
Joined: Wed May 26, 2004 6:30 am

Post by sanjeev_sithara »

MultiLoad is very efficient when you are doing maintenance activities on multiple large tables. The MultiLoad stage can perform inserts/updates on up to five different tables in one pass, as in the sketch below.

If you use the Enterprise stage for this, you will need five separate instances of the Enterprise stage.
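
As a rough sketch (all names invented), a single MultiLoad task can list several target tables in its .BEGIN MLOAD statement and route rows to them with conditional APPLY clauses:

    .LOGTABLE stage.ml_restartlog_sales;
    .LOGON tdpid/your_user,your_password;
    .BEGIN MLOAD TABLES stage.orders, stage.returns;  /* up to five targets */
    .LAYOUT inrec;
        .FIELD rec_type * VARCHAR(1);
        .FIELD order_id * VARCHAR(10);
        .FIELD amount   * VARCHAR(14);
    .DML LABEL ins_order;
        INSERT INTO stage.orders (order_id, amount) VALUES (:order_id, :amount);
    .DML LABEL ins_return;
        INSERT INTO stage.returns (order_id, amount) VALUES (:order_id, :amount);
    .IMPORT INFILE sales.dat
        FORMAT VARTEXT '|'
        LAYOUT inrec
        APPLY ins_order  WHERE rec_type = 'O'
        APPLY ins_return WHERE rec_type = 'R';
    .END MLOAD;
    .LOGOFF;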
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi

Can anybody suggest how to use the MultiLoad stage with parallel execution mode? I am able to use the MultiLoad stage in sequential execution mode, but when I try to run in parallel mode I get errors such as "work table already exists".

thanks
Nivas
sanjeev_sithara
Participant
Posts: 15
Joined: Wed May 26, 2004 6:30 am

Post by sanjeev_sithara »

Hi,

The work tables are created each time you perform an operation using MultiLoad, and they are automatically dropped once the job has run successfully. However, if the job aborts, the work tables have to be dropped manually before the job is run again. If your job has aborted, please drop the work tables manually and rerun the job.
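
For example, if the target were mydb.customer and the job used default-style auxiliary table names (check the mload log for the actual names your job generated), the manual cleanup from BTEQ or SQL Assistant would look roughly like this:

    DROP TABLE mydb.WT_customer;            /* work table                     */
    DROP TABLE mydb.ET_customer;            /* acquisition-phase error table  */
    DROP TABLE mydb.UV_customer;            /* application-phase error table  */
    DROP TABLE mydb.ml_restartlog_customer; /* restart log table              */
    RELEASE MLOAD mydb.customer;            /* frees the target table if it is
                                               still locked; add IN APPLY if the
                                               job aborted in the apply phase  */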

rgds
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi Sanjeev

When I set the execution mode to sequential there are no errors; only when I set it to parallel does the job abort. The MultiLoad log says the CKP table already exists. In parallel mode the first MultiLoad process creates the CKP table, and then the second MultiLoad process tries to create a CKP table with the same name already used by the first, so the job aborts with "CKP table already exists". This problem does not occur in sequential execution mode because only one MultiLoad process runs. So if we could arrange for each MultiLoad process to create separate CKP, WK, UV and ET tables, this problem would be solved; otherwise there must be another way. Please let me know how you are coding it.
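
To illustrate what I mean (names invented), each parallel invocation would need its own restart log, work and error tables, i.e. the generated script for each player would have to differ in lines like these:

    /* player 0 */
    .LOGTABLE mydb.ml_ckp_target_p0;
    .BEGIN MLOAD TABLES mydb.target
        WORKTABLES  mydb.WT_target_p0
        ERRORTABLES mydb.ET_target_p0 mydb.UV_target_p0;

    /* player 1 would use the same statements with ..._p1 names, and so on */

Even then, as far as I know Teradata allows only one active MultiLoad task per target table, so distinct table names alone may not make truly parallel loads into the same table possible.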

thanks
Nivas
sanjeev_sithara wrote: Hi,

The work tables are created each time you perform an operation using MultiLoad, and they are automatically dropped once the job has run successfully. However, if the job aborts, the work tables have to be dropped manually before the job is run again. If your job has aborted, please drop the work tables manually and rerun the job.

rgds
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

For straight inserts we can load using the Teradata MultiLoad stage at only 15k rows/second (this is with the stage in parallel). That is very slow and uses a lot of CPU (240 out of 400) on the DataStage server. Using the Teradata Enterprise stage we achieve 120k rows/second (using only 60 out of 400). Much more acceptable.
nivas
Participant
Posts: 117
Joined: Sun Mar 21, 2004 4:40 pm

Post by nivas »

Hi Ultramundane

Could you explain how you used MultiLoad in parallel mode? In the node map constraint, did you select only one node? Could you also explain how you set the MultiLoad plug-in properties?

thanks
Nivas
Last edited by nivas on Thu Dec 22, 2005 11:32 am, edited 1 time in total.
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

I'm afraid we did not do anything special. We installed the plug-in when we installed the base product. We did not select one node only, and we have two nodes configured. The job runs, but is slow in both sequential and parallel mode. We set the job up exactly the same as the sequential version except that we selected parallel.

When you look at the processes running on the box, you now see two parallel processes and one mload process. In sequential mode we saw one sequential process and one mload process. I am not sure exactly what it is doing, but it looks like it funnels everything back down to a single mload process when configured in parallel. In addition, the parallel Ascential processes and the sequential process each use 100% of a CPU, whereas the mload process only uses about 1%. I believe this tells me that the bottleneck is the Ascential code, in whatever it is doing to prepare the data for the generated mload process.

As we did not do anything special, maybe it is related to the version of the software? We are on DS 7.5.1A with many patches applied (8 as of this moment, and we are waiting for many more).
pneumalin
Premium Member
Posts: 125
Joined: Sat May 07, 2005 6:32 am

Post by pneumalin »

Ultramundane,
Could you please advise which patches you have applied? We have applied two patches on the server (Solaris) and one patch on the client, and would like to know whether any later patches apply to us.
Can I just apply the latest patches, and will they cover all the older ones?

Thanks for your help and have happy holiday season!

Pneuma.
Ultramundane
Participant
Posts: 407
Joined: Mon Jun 27, 2005 8:54 am
Location: Walker, Michigan

Post by Ultramundane »

We are on AIX, so I'm not sure the patches will be the same. Here is the list of patches from Ascential that we have applied, by ecase number, in the order they were applied.

75003 (The DRS plug-in and the ODBC stage incorrectly place an unqualified table name in the stage when a fully-qualified name is selected; the SQL is wrong because the columns are fully qualified.
Resolution:
Modified server code to correct DatabaseName and Owner for Oracle, DB2 and Access in EventLinkRep. Modified clients to call GetFQTN to get the fully qualified table name.),

76318 (When you import a view from DB2 via ODBC, DataStage uses the real table name as the fully qualified table name. If you look at the Long Description field, it has the fully qualified table name correct, but that is not what is used when loading the metadata into a job. When you use SQL Builder in the DB2 stage, it makes the table name schema.table instead of view.table. For example, I create a table as user db2inst8 and the table name is DL_K. I then create a view called @T00120 from table DL_K. When the metadata is loaded into SQL Builder, the FROM clause has db2inst8.DL_K and not @T00120.DL_K.),

80895 (Importing stored procedure metadata via the Manager is not working for DB2 on z/OS. Column and parameter metadata are not imported),

80559 (After installing the patch for ecase 75003, Version Control errors with "DSR_RECORD Unknown arg: 19". Without the patch there are no errors on promotion to the protected project in Production. After the patch the Version Control version was 7.5.1.39; before, it was 7.5.1.35.
Several problems after the patch: initialized a job from an unprotected project (never initialized before), promoted the job to a protected project, and received the above error; started compiling another job, not the one being promoted, and it errored again; then compiled the promoted job and it errored. Looked in Director in the protected project and the first job was not compiled. The job that compiled in error still had the compile status from the last time it was promoted.),

80036 (Stored procedure import is not working correctly. When importing a stored procedure from DB2 on the mainframe, the fully qualified name (i.e. owner.storedprocname) is not imported, and when it goes against the catalog to get the column information it is unable to capture it. So the user has to enter everything manually in the Stored Procedure stage because the import is incomplete.),

76017 (Using the SYBASEOC stage on the parallel canvas to read data with columns defined as TIMESTAMP results in significant performance degradation. This also occurs with Oracle when the DRS stage is used on the parallel canvas but the degradation is not as severe.
The same degradation does not occur when the stage is used on the Server canvas regardless of the data type.
Resolution:
Cache conversion strings),

20959 #1(Security fix #1 to prevent unencrypted passwords in log),

20959 #2(Security fix #2 to prevent unencrypted passwords in log),

61040 (Allows retrieval of min and max (strings and timestamps) in aggregator)
pneumalin
Premium Member
Premium Member
Posts: 125
Joined: Sat May 07, 2005 6:32 am

Post by pneumalin »

Ultramundane wrote: We are on AIX, so I'm not sure the patches will be the same. Here is the list of patches from Ascential that we have applied... [patch list snipped; see the previous post]

Thanks for your information! 80859, 80036, and 76017 will be helpful to our development environment.
smuppidi
Premium Member
Posts: 11
Joined: Fri Mar 17, 2006 9:00 am

Post by smuppidi »

Hello All,

We are trying to extract/read a huge volume of data from Teradata in DataStage EE and are trying to decide which stage is better: the Teradata Enterprise stage or the Teradata MultiLoad/FastExport stage.

Does anybody have an idea of an approximate cut-off row count (e.g. for < 1 million reads use Teradata Enterprise, for > 1 million use the MultiLoad stage) that can help decide which stage to go for?

Thanks in advance.
Satish.