DataStage Job Design

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

urshit_1983
Participant
Posts: 73
Joined: Wed Jun 28, 2006 3:27 pm
Location: NJ

Post by urshit_1983 »

Welcome aboard.

What does the job log look like in Director? Does it start and never stop, or does it run but pass no rows?
"Nobody is expert in Everything,
But Everybody is expert in Something."
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Is there a possibility that you could combine the records from the three tables using a single query, rather than using three separate ODBC stages?
If yes, run this query outside DataStage and check how long it takes to return rows.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You don't really have 'three source tables'. You have one source table with two ODBC-based reference lookups. That means that for every record coming in from the first table, a query is executed against each of the other two tables.

As Narasimha notes, if these tables are in the same instance then you should be joining all three tables using one stage. If they're not, you should seriously consider loading the two secondary tables into hashed files. Your ODBC lookups are about the slowest way on the planet to build this job.

Are your lookup queries generated? User defined? Well tuned?
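If the three tables really are in the same instance, one source query along these lines replaces both lookups. This is just a sketch with made-up table and column names, since you haven't posted your SQL yet - substitute your real tables and join keys:

-- SRC is the driving table, REF1 and REF2 stand in for the two lookup tables (all names hypothetical)
SELECT SRC.ORDER_ID,
       SRC.CUST_ID,
       REF1.STATUS_CODE,
       REF2.ALLOWANCE_AMT
FROM   SRC
       LEFT OUTER JOIN REF1 ON REF1.ORDER_ID = SRC.ORDER_ID
       LEFT OUTER JOIN REF2 ON REF2.CUST_ID  = SRC.CUST_ID;

The LEFT OUTER JOINs behave like reference lookups - source rows with no match still come through, just with NULLs in the reference columns.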
-craig

"You can never have too many knives" -- Logan Nine Fingers
AthiraM
Participant
Posts: 25
Joined: Tue Feb 27, 2007 4:53 am

Post by AthiraM »

Hi,

The lookup queries are generated.

I tried using the Hashed File, still no response.

http://img90.imageshack.us/my.php?image=dsjob3tv3.jpg
http://img168.imageshack.us/my.php?image=dsjob4vh6.jpg

As seen in the log, should anything be done with the NLS settings?

I tried running this job (without the hashed files) on my local machine, where I have both the DS server and DS client installed.
Though not fast, the job does the work within 8 hours.

When I deployed the same job on the DS server at our client location (I tried it both with and without the hashed files), it shows no response.

Thanks
Athira
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sounds like you may need to tune your first query. Have you checked the explain plan? Gotten help from your DBA on how you can improve the performance of the query? Or asked them to verify that there are no blocking locks when you get 'no response'?
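For example, from SQL*Plus - a sketch, assuming Oracle and that you (or your DBA) have access to PLAN_TABLE and V$SESSION:

-- show the optimizer's plan for the source query (paste the full SELECT in place of the ...)
EXPLAIN PLAN FOR
SELECT ...;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- list blocked sessions and who is blocking them (Oracle 10g and later)
SELECT sid, serial#, blocking_session, event
FROM   v$session
WHERE  blocking_session IS NOT NULL;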
-craig

"You can never have too many knives" -- Logan Nine Fingers
AthiraM
Participant
Posts: 25
Joined: Tue Feb 27, 2007 4:53 am

Post by AthiraM »

http://img90.imageshack.us/my.php?image=dsjob3tv3.jpg
http://img147.imageshack.us/my.php?image=dstransss0.jpg

The SQL queries used in the 3 stages are as follows:

(a user defined query)
SELECT PACKAGE3.CMS_ORDER.ORDERID, PACKAGE3.CMS_ORDER.NINO, PACKAGE3.CMS_ORDER.ORDERED,
       PACKAGE3.CMS_ORDER.ORDSTATUS_TYP2, PACKAGE3.CMS_ORDER.MANUSTATUS_TYP1
FROM   PACKAGE3.CMS_ORDER
WHERE  PACKAGE3.CMS_ORDER.ORDERID LIKE '0%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '1%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '2%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '3%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '4%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '5%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '6%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '7%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '8%'
   OR  PACKAGE3.CMS_ORDER.ORDERID LIKE '9%';

(a generated query)
SELECT PACKAGE3.CMS_CAR.ORDERNO, PACKAGE3.CMS_CAR.GAVEUP, PACKAGE3.CMS_CAR.CHC FROM PACKAGE3.CMS_CAR;

(a generated query)
SELECT PACKAGE3.CMS_MERRIT.NINO, PACKAGE3.CMS_MERRIT.EINNO, PACKAGE3.CMS_MERRIT.ALLOWANCE FROM PACKAGE3.CMS_MERRIT;

I made changes to the query and used the hashed file, but the same condition persists.
Please do comment on this.


Thanks
Athira
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Athira,
start with a very simple job: just your data source to a transformer and output to a dummy sequential file. Get this working. Then add one reference lookup and get that working (again to the dummy output file). Finally add your last reference and, once that is working to your satisfaction, finalize the job to the real output.

It might be that your final output is causing issues. Are all 4 tables different ones? Or is the output table the same as one (or more) of the input/reference tables? If so, depending upon your ODBC options, you might be deadlocking yourself.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ArndW wrote:It might be that your final output is causing issues. Are all 4 tables different ones? Or is the output table the same as one (or more) of the input/reference tables? If so, depending upon your ODBC options, you might be deadlocking yourself.
Asked and Answered, your Honor. In the other thread on this issue. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

:? oops, I didn't realize that! :evil:
AthiraM
Participant
Posts: 25
Joined: Tue Feb 27, 2007 4:53 am

Post by AthiraM »

Hi,

Even though the performance is not laudable, there has been an improvement from the 'no response' status.
When I checked the Oracle database, I found that there was a lock due to an insert statement on the output table. I killed some of the sessions manually.
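(In Oracle that kind of manual kill amounts to something like the following - the sid and serial# pair here is just a placeholder; the real values come from V$SESSION:)

-- kill the blocking session identified in V$SESSION; '137,4521' is a placeholder sid,serial# pair
ALTER SYSTEM KILL SESSION '137,4521';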

http://img224.imageshack.us/my.php?image=dsjobdr0.jpg
http://img132.imageshack.us/my.php?image=dsjobia8.jpg

Please do suggest how I can improve the performance, which is now very, very poor... it took 12 hrs (11.55) to complete this job.

Thanks a lot..
Athira
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Some points I would like to mention here:

- Go for an Oracle OCI stage instead of the ODBC stage (there are more options to tune in the OCI stage).
- Break up your job design. Populate the hashed files in separate jobs and use them as lookups in the current job (this way you can identify the bottleneck easily).
- Tune your main query. Check how long it takes to return rows outside DataStage first, so you get a feel for how long it is going to take to run (see the sketch below).
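For instance, the ten OR'ed LIKE conditions in your user-defined query all say the same thing - ORDERID starts with a digit - so a more compact equivalent is possible. Just a sketch; check the explain plan before and after, because the rewritten predicate is not automatically faster:

SELECT O.ORDERID, O.NINO, O.ORDERED, O.ORDSTATUS_TYP2, O.MANUSTATUS_TYP1
FROM   PACKAGE3.CMS_ORDER O
WHERE  SUBSTR(O.ORDERID, 1, 1) BETWEEN '0' AND '9';

And if every ORDERID in that table already starts with a digit, the WHERE clause filters nothing and could simply be dropped - worth confirming with your DBA.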
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The stage has instructions for configuring it in the 'standard' documentation - both the 'Plug-In Installation and Configuration Guide' and the OCI pdf itself.

On UNIX it's a matter of editing the dsenv file; on Windows I *think* you just need to ensure your Registry entries are correct for Oracle. Either way, you need to have an Oracle Client installed on the DataStage Server. Do you?
-craig

"You can never have too many knives" -- Logan Nine Fingers