
About DataStage PX Resource Consumption

Posted: Thu Dec 11, 2003 8:06 pm
by dhwankim
Hi all,

We built a data warehouse with DataStage PX, and the initial load into the DW is already complete. This Monday we started the daily CDC process, also in DataStage PX; the daily ETL process is driven by a sequence job.

On the first day of CDC, the DataStage jobs ran into trouble: many PX jobs aborted. When I checked the DataStage logs, the problems appeared to be caused by hardware resource limits.

But that is only my guess. My customer and I would like to know how to calculate DataStage PX resource consumption. If I knew that, I could rearrange the job flow.

Please let me know. I would be glad for any related tips or resolutions.

D.H Kim

Re: About DataStage PX Resource Consumption

Posted: Thu Dec 11, 2003 8:50 pm
by Teej
dhwankim wrote:Hi all,
We built a data warehouse with DataStage PX, and the initial load into the DW is already complete. This Monday we started the daily CDC process, also in DataStage PX; the daily ETL process is driven by a sequence job.

On the first day of CDC, the DataStage jobs ran into trouble: many PX jobs aborted. When I checked the DataStage logs, the problems appeared to be caused by hardware resource limits.

But that is only my guess. My customer and I would like to know how to calculate DataStage PX resource consumption. If I knew that, I could rearrange the job flow.
Can you please explain how you came to the conclusion that there is a resource limitation? Some of those error messages would be helpful...

But basically, as any PX consultant would tell you -- follow the 1/2 rule: if you have 8 CPUs, run 4 nodes. That will not max out your machine, but it probably will not kill it either. If you must run 8 nodes, run only one job at a time.
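For illustration, a 4-node configuration file for an 8-CPU SMP box might look something like this (the fastname and the disk/scratch paths are placeholders -- substitute your own):

{
    node "node1" {
        fastname "yourserver"
        pools ""
        resource disk "/data/ds/node1" {pools ""}
        resource scratchdisk "/scratch/ds/node1" {pools ""}
    }
    node "node2" {
        fastname "yourserver"
        pools ""
        resource disk "/data/ds/node2" {pools ""}
        resource scratchdisk "/scratch/ds/node2" {pools ""}
    }
    node "node3" {
        fastname "yourserver"
        pools ""
        resource disk "/data/ds/node3" {pools ""}
        resource scratchdisk "/scratch/ds/node3" {pools ""}
    }
    node "node4" {
        fastname "yourserver"
        pools ""
        resource disk "/data/ds/node4" {pools ""}
        resource scratchdisk "/scratch/ds/node4" {pools ""}
    }
}

Point $APT_CONFIG_FILE at it and your jobs run 4-way.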

But I have dealt with situations where my machine's load went into the high 80s and 90s on an 8-CPU system. It ran SLOW, but it still ran. So...

Could it be that you are not providing enough scratch space?

Could it be that you are not providing enough memory?

I am at a loss to help you without more specific details. There is no one single 'resource' rule; it all varies. Heck, even the Data Warehouse Lifecycle Toolkit's 800+ pages only teach you general concepts, which do squat unless you know what the problem is.

-T.J.

Posted: Thu Dec 11, 2003 10:25 pm
by kcbland
As with any application you might build, resource consumption is usually driven by some or all of the following factors:

1. Row count
2. Row width
3. Average bytes/row

So, if you write a program/job, you have to understand the nature of the data. You should be able to roughly estimate the usage for each program/job. The easiest way is simply to run it in isolation and measure the usage; a back-of-the-envelope calculation like the sketch below can get you started. If you attempted to run everything together for the first time in a production environment, then this should show you the wisdom of doing full load testing during development.
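For example (the figures here are made-up assumptions -- plug in your own counts and widths):

# Back-of-the-envelope sizing for one job. All numbers are
# illustrative assumptions, not measurements from a real system.
rows = 10_000_000          # expected daily row count
avg_bytes_per_row = 200    # average record width in bytes

data_gb = rows * avg_bytes_per_row / 1024**3

# A sort typically needs scratch space on the order of the data it
# sorts, so budget at least that much scratchdisk for a sorting job --
# and sum these estimates across the jobs you intend to run together.
scratch_gb = data_gb

print(f"data: {data_gb:.2f} GB, scratch estimate: {scratch_gb:.2f} GB")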

Without seeing the messages you saw, we have no way of disputing your conclusions. I suspect you probably ran out of temp disk space somewhere. This can be caused by running more processes, with more data, than you have the space to accommodate. You are correct that the resolution is to defer or reorganize the job streams so as to reduce the amount of work going on at any one time.

Re: About DataStage PX Resource Consumption

Posted: Thu Dec 11, 2003 11:48 pm
by dhwankim
Teej wrote:Can you please explain how you came to the conclusion that there is a resource limitation? Some of those error messages would be helpful...

But basically, as any PX consultant would tell you -- follow the 1/2 rule: if you have 8 CPUs, run 4 nodes. That will not max out your machine, but it probably will not kill it either. If you must run 8 nodes, run only one job at a time.

But I have dealt with situations where my machine's load went into the high 80s and 90s on an 8-CPU system. It ran SLOW, but it still ran. So...

Could it be that you are not providing enough scratch space?

Could it be that you are not providing enough memory?

I am at a loss to help you without more specific details. There is no one single 'resource' rule; it all varies. Heck, even the Data Warehouse Lifecycle Toolkit's 800+ pages only teach you general concepts, which do squat unless you know what the problem is.

-T.J.
Hi,

The reason for my conclusion is this situation: if I run each PX job separately, every job finishes without error. But when I run the PX jobs at the same time, some jobs die with a log like the one below:

:
:
Info S1AffDCPtop1001.0: Progress: 10 percent
Info S1AffDCPtop1001.0: Progress: 20 percent
Info S1AffDCPtop1001.0: Progress: 30 percent
Fatal Tr01,1:Failure during execution of operator logic
Info Tr01,1:Output 0 produced 0 records
Fatal Tr01,1:Fatal Error: APT_Decimal:assignFrom: src(2,744,048,048) out of range for decimal with precision 12 and scale 0
Fatal S2AffDCPtop1001,0:Failure during execution of operator logic
Fatal S2AffDCPtop1001,0:Fatal Error: waitForWriteSignal(): Premature EOF on node crmdm No such file or directory
Fatal S1AffDCPtop1001,0:Failure during execution of operator logic
Info S1AffDCPtop1001,0:Output 0 produced 353 records
Fatal S1AffDCPtop1001,0:Fatal Error: Unable to allocate communication resources
Fatal node_node2: Player 1 terminated unexpectedly
Fatal node_node1: Player 1 terminated unexpectedly
Fatal main_program: Unexpected exit status 1
Fatal Tr01,0: Failure during execution of operator logic
:
:
:
We use an SMP machine (a single node). The server runs AIX 5.1 with 8 CPUs and 16 GB of memory, and it hosts both Oracle 9i (one instance) and DataStage.

If my guess is not correct, please give me some advice and a possible resolution.

Thanks

D.H Kim

Re: About DataStage PX Resource Consumption

Posted: Fri Dec 12, 2003 7:36 am
by Teej
dhwankim wrote:The reason for my conclusion is this situation: if I run each PX job separately, every job finishes without error. But when I run the PX jobs at the same time, some jobs die with a log like the one below
This statement really conflicts with what I saw here:
Fatal Tr01,1:Fatal Error: APT_Decimal:assignFrom: src(2,744,048,048) out of range for decimal with precision 12 and scale 0
This tells me that a value (which I believe is the displayed number) is too big for the field as defined. Does the input data contain commas? If so, that is 13 'digits', or rather 13 characters. Does this error still occur if you give that particular field a precision of 13 (or 15)?
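A quick way to test the comma theory outside DataStage (this little script is only an illustration of the idea, not anything PX itself runs):

# Does a source string fit a DECIMAL(precision, scale) target?
# Shows why "2,744,048,048" could be rejected: 13 characters with
# the commas, but only 10 real digits once they are stripped.

def fits_decimal(src: str, precision: int, scale: int,
                 strip_commas: bool = False) -> bool:
    if strip_commas:
        src = src.replace(",", "")
    if not src.isdigit():
        return False  # commas or other junk make it non-numeric
    return len(src) <= precision - scale  # integer digits must fit

value = "2,744,048,048"
print(fits_decimal(value, 12, 0))                     # False: raw string
print(fits_decimal(value, 12, 0, strip_commas=True))  # True: 10 digits fit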

Now why does this fail only when multiple PX jobs are running at the same time -- that is a major puzzle. Are those PX jobs pulling from the same source data? Is this particular job running in multiple invocations (many copies running at the same time)?

-T.J.

About DataStage PX Resource Consumption

Posted: Fri Dec 12, 2003 9:42 am
by bigpoppa
Do your SMP and your MPP have the same C compiler?

- BP

Re: About DataStage PX Resource Consumption

Posted: Mon Dec 15, 2003 3:33 am
by dhwankim
bigpoppa wrote:Do your SMP and your MPP have the same C compiler?

- BP
We use SMP only.

About DataStage PX Resource Consumption

Posted: Mon Dec 15, 2003 4:49 pm
by bigpoppa
Sometimes a really long OSH script (generated by PX, in your case) can lead to random timeout errors.

Two suggestions:

1. Contact your Ascential rep and ask about an undocumented environment variable called APT_TIMEOUT (or something similar). You may be able to set that and run your jobs to completion.

2. Break up your job into two jobs, landing intermediate data to disk. Run both jobs in parallel and see if you get the same error. Roughly, the split looks like the sketch below.
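In osh terms, something like this (the operator names, schema, and file names are placeholders, not your real job):

# Job 1: read the source, do the heavy lifting, land a persistent data set.
osh "import -file input.txt -schema record(id:int32; amount:decimal[12,0];) | your_transform_op > intermediate.ds"

# Job 2: pick up the landed data set and finish the work.
osh "your_load_op < intermediate.ds"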

-BP

Re: About DataStage PX Resource Consumption

Posted: Mon Dec 15, 2003 9:09 pm
by Teej
bigpoppa wrote:1. Contact your Ascential rep and ask about an undocumented environment variable called APT_TIMEOUT (or something similar). You may be able to set that and run your jobs to completion.
$APT_BUFFER_MAXIMUM_TIMEOUT? Set it to 1.
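If you go that route, one way is to export it in dsenv so every job picks it up, or define it for the project in the DataStage Administrator (a sketch -- check your own install):

# in $DSHOME/dsenv
APT_BUFFER_MAXIMUM_TIMEOUT=1
export APT_BUFFER_MAXIMUM_TIMEOUT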

Of course, I would recommend checking the kernel settings first to make sure everything is well tuned for DataStage. The Install and Upgrade Guide PDF provides the full and juicy details.

-T.J.