Page 1 of 1

Double Vs Decimal: Which is faster?

Posted: Mon Feb 11, 2008 1:42 am
by joesat
Hi,
I have a set of 20 to 25 long running PX jobs (each job runs from 3 hours to more than 5 hours). I have been asked to look at ways to improve the performance.

I observed that the Double datatype had been used as a default for most dollar fields in the jobs inorder to avoid arithmetic complications. I was wondering if the Double could be changed to a Decimal type in a transformer stage.

Double uses 8 bytes while Decimal uses only 4. So will this enhance the speed of the job? Please give me your comments and suggestions.

Thanks!

Re: Double Vs Decimal: Which is faster?

Posted: Mon Feb 11, 2008 2:19 am
by priyadarshikunal
joesat wrote:Hi,
I have a set of 20 to 25 long running PX jobs (each job runs from 3 hours to more than 5 hours). I have been asked to look at ways to improve the performance.

I observed that the Double datatype had been used as a default for most dollar fields in the jobs inorder to avoid arithmetic complications. I was wondering if the Double could be changed to a Decimal type in a transformer stage.

Double uses 8 bytes while Decimal uses only 4. So will this enhance the speed of the job? Please give me your comments and suggestions.

Thanks!
How you can say that decimal takes only 4 bytes?

have you checked how much space it takes for a decimal

integer takes 4 bytes not decimal

but yes mapping algorithm of decimal is simpler than double.

hardly matters

look for the other aspects to reduce the time before looking this one.

Regards,

Posted: Mon Feb 11, 2008 2:39 am
by joesat
You are right Priya. I think was acting on assumption that it was 4 bytes. But I tried running a job which gives a 30 GB output with Double, and the same job gives a 17 GB output file with Decimal type. All the data is intact with the Decimal type except that the format is different.

But can someone explain what is given in the documentation:
SQL Type: Decimal Numeric
Underlying Data Type: decimal
Size: (Roundup(p)+1)/2
Description: Packed decimal, compatible with IBM packed decimal format
How is the size calculated for Decimal?

Posted: Mon Feb 11, 2008 4:10 am
by joesat
:)

Posted: Mon Feb 11, 2008 4:14 am
by ArndW
What kind of a file are you writing to? If a text file then your output will be "display" format and not internal format.

Have you identified what your performance bottleneck for this job is?

Posted: Mon Feb 11, 2008 4:54 am
by joesat
Yes I am writing to a sequential output file. Actually, there is no performance bottleneck, per se... but we are just looking at possible areas which can be improved.

Like I said, using Double increases the amount of disk space that we might need. In that case, using Decimal is better as it uses quite less space compared to Double (for huge files in the order of tens of GB).

But I would like to know if using Decimal can speed up the process when we use sequential output...

Posted: Mon Feb 11, 2008 5:10 am
by ArndW
The most efficient in your case will be the datatype which requires the least conversions. You can minimize I/O on output by choosing the smallest possible display mask - the source column type is irrelevant in this case as a conversion needs to be done either way.

Posted: Mon Feb 11, 2008 9:42 am
by sud
joesat wrote:You are right Priya. I think was acting on assumption that it was 4 bytes. But I tried running a job which gives a 30 GB output with Double, and the same job gives a 17 GB output file with Decimal type. All the data is intact with the Decimal type except that the format is different.

But can someone explain what is given in the documentation:
SQL Type: Decimal Numeric
Underlying Data Type: decimal
Size: (Roundup(p)+1)/2
Description: Packed decimal, compatible with IBM packed decimal format
How is the size calculated for Decimal?
Completely agree with Arnd. In fact, the very fact that you are getting a huge difference in the size of the files does not quite indicate anything substantial. The difference, as it seems to me, must have happened merely due to the output formats, like zero filling or more spaces being output(since the output formats of decimal and double indeed vary).

Double Vs Decimal: Which is faster?

Posted: Mon Feb 11, 2008 3:33 pm
by tomengers
joesat:
How is the size calculated for Decimal?
Decimal size calc: If byte count is odd, add 1 and divide by 2. If byte count is even, divide by 2 and add 1.

So Decimal(12) would store as 7 bytes, while Decimal(7) would store as 4.

... tom

Posted: Mon Feb 11, 2008 4:07 pm
by Teej
To answer the question - which is faster, decimals or double:

I must ask:

What do you plan to do?

If you are planning on using native functions to do calculations, then it is faster when you use the original value, instead of having it being converted one way or another.

Landing to disk - both decimals and double are converted to text, so the conversion rate is most likely neglectible (although there have been several improvements on decimal performance since 6.x).

So please answer: What do you plan to do?