# 32 float or 32 int



## MaxR (Jan 15, 2011)

Hello,

First post on this forum.

I recently had a discussion regarding whether the 32 bit floating point format is more suitable than the 32 bit integer format for representing a high dynamic range photograph.

I've always believed that 32-bit float was the better of the two for this purpose. However, I've been presented with a number of arguments that I'm trying to understand, and I'd like your opinion:

Digital images are made up of discrete values.
32-bit float carries only a 24-bit significand, so it can distinguish at most about 2^24 discrete values at any given scale.
32-bit int can represent up to 2^32 discrete values.

Therefore, an image using 32-bit int can represent a richer dynamic range than 32-bit float, simply because it can define the complete range with significantly more values.
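
If I've understood the argument correctly, the 2^24 limit can be checked directly. Here's a quick sketch in Python (assuming NumPy is available):

```python
import numpy as np

# float32 carries a 24-bit significand: integers above 2^24 start to collide
a = np.float32(2**24)        # 16777216
b = np.float32(2**24 + 1)    # rounds back down to 16777216
print(a == b)                # True: float32 can no longer tell them apart

# a 32-bit integer keeps every step distinct across its whole range
x = np.uint32(2**32 - 1)
y = np.uint32(2**32 - 2)
print(x == y)                # False: still distinguishable
```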

Still, intermediate operations on 32-bit int data can introduce rounding errors that erode that 2^32 range; however, if those intermediate operations are done in, say, 64-bit float, the higher number of discrete values in the 32-bit int data is preserved.
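
That intermediate-precision point seems easy to verify: a 64-bit float has a 53-bit significand, so every 32-bit integer should survive a round trip through it exactly. A small sketch (again assuming NumPy):

```python
import numpy as np

# every 32-bit integer fits exactly in a 64-bit float (53-bit significand)
v = np.uint32(2**32 - 1)     # the largest 32-bit value
d = np.float64(v)            # promote for intermediate math
print(np.uint32(d) == v)     # True: the round trip is lossless
```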

Now... 32-bit float can address numbers much higher than 32-bit int, and this, along with the precision supposedly added by float, is what has always made me think 32-bit float better represents the dynamic range of an image. But it seems (and this is what I'm trying to understand) that discretization, together with the usual normalization applied when defining the dynamic range of an image, throws that theory out of the window. In the end, the main advantage of 32-bit float over 32-bit int would lie in the intermediate operations; yet even that advantage disappears if intermediate operations on a 32-bit int image are done in 64-bit float. Processing time would then be the only penalty of using 32-bit int, not precision or "richness" of the defined dynamic range.
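
One wrinkle I noticed on the normalization point: in a [0, 1] normalized range, float32 steps are not uniform, whereas a 32-bit int mapped to the same range steps uniformly. A quick check (assuming NumPy; `np.spacing` gives the distance to the next representable value):

```python
import numpy as np

# float32 spacing across a [0, 1] normalised range is non-uniform:
# very fine near 0, coarsest near 1 (2^-23 at 1.0)
print(np.spacing(np.float32(1.0)))    # 1.1920929e-07, i.e. 2^-23
print(np.spacing(np.float32(0.001)))  # far smaller near zero
# a 32-bit int mapped to [0, 1] steps uniformly by 1/(2^32 - 1)
```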

I do understand how the IEEE 754 32-bit float format works, but I'm still trying to digest putting all this together, as everywhere I look, people praise 32-bit float over 32-bit int. I guess my question could also be: can a particular piece of software make better use of the dynamic range of an image by using 32-bit int than by using the best it can get from 32-bit float?

Thanks!
Max


----------



## Provo (Jan 15, 2011)

Just glancing through what you wrote, I think you're confusing x86 (aka 32-bit) and x64 (aka 64-bit) architectures with working on a 32-bit image for HDR. As I understand it, working in 32-bit would give you a more realistic color space, and it is able to separate the dynamic range data, especially the brightness/contrast. In theory you would have a richer HDR image.
The downside is that not all plugins are designed around 32-bit, so you have to step the image down to 16-bit or 8-bit in order to use them. A prime example: in Unified Color's 32-bit float mode, or even in Photoshop's built-in 32-bit channel mode, I can't use my Nik plugins, which defeats the entire point of even using 32-bit. If what you have looks richer but you then have to step it back down anyway, you might as well work in 16-bit. It's all about color space for dynamic range, but down the road we will be leaning more towards 32-bit.

Debate worth reading 
RAW, 8 bit, 16 bit, and 32 bit explained


----------



## MaxR (Jan 15, 2011)

Hello Provo,

Thank you for taking the time to reply, but I don't think this clarifies my question. Again: which format is more suitable for representing the dynamic range of an image, 32-bit float or 32-bit int? And ultimately, as I said towards the end of my previous post, could a particular piece of software make better use of the dynamic range of an image by using 32-bit INT than by using the best it can get from 32-bit FLOAT?

This is regardless of what Photoshop or other applications offer today (to begin with, Photoshop doesn't even offer a 32 bit int mode).

The link to the article you posted is interesting, but again, it doesn't differentiate 32-bit float from 32-bit int.

Thanks again!
Max


----------



## AverageJoe (Jan 15, 2011)

I think I understand your question but the bigger question is: Does it matter?

A camera will capture an image and record a limited (or discrete) value per pixel. Whether the processing software treats it as floating point or integer doesn't matter, as long as the type is big enough to hold the value coming from the camera.

An analogy would be, why try to hold one gallon of water with a two gallon bucket when the one gallon bucket does it just fine?

Maybe I'm missing something, and I only got a C in discrete mathematics... which is why I'm in advertising


----------



## MaxR (Jan 15, 2011)

AverageJoe said:


> I think I understand your question but the bigger question is: Does it matter?



Knowledge doesn't hurt, does it? 
The debate of whether it does matter or not can also be interesting, although it'd be a parallel discussion. For the purpose of crafting artistic images it probably doesn't matter. If you ask me, I think we all are rather fine with ranges of 10 million values and even less. But in some scientific areas it might be significant.



> A camera will capture an image and apply a limited (or discrete) value per pixel. Whether or not the processing software treats it as floating point or an integer doesn't matter as long as it is big enough to hold the value coming from the camera.


This isn't necessarily true once we start processing our image. Certain operations, such as deconvolution, noise reduction, and dynamic range compression, work better with a larger representation of the dynamic range: an image that started out in, say, 16 bits can start generating truncated values the moment we process it, unless we extend the dynamic range. So what started as one gallon of water may end up needing more than that after certain processing operations, in order to avoid truncation.
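
As a hypothetical illustration of the truncation point (assuming NumPy; the pixel value and gain are made up), a simple brightening step can already clip a 16-bit value unless we promote it to a wider type first:

```python
import numpy as np

# a bright 16-bit pixel pushed past the top of its range by a 1.5x gain
pix = np.uint16(60000)
gain = 1.5

# staying in 16 bits: the result clips at 65535 and the detail is lost
clipped = np.uint16(min(int(pix) * gain, 65535))
print(clipped)               # 65535

# promoting first: the true value 90000.0 survives for later steps
wide = np.float64(pix) * gain
print(wide)                  # 90000.0
```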

What I'm trying to figure out is not so much the practical aspects of using one format versus the other, but whether saying that 32 bit float is a "better" type than 32 bit int is just a myth, or is really true, and why.

Thanks!
Max


----------



## AverageJoe (Jan 15, 2011)

MaxR said:


> Knowledge doesn't hurt, does it?



Yes! I'm sorry, I meant it doesn't matter within the example as I understood it, not that it doesn't matter for the process.

But beyond that I'm not sure. Assuming the two formats (int and float) make a difference in post, what might be interesting is if:

You can tell the difference in the real world application.

And

Which format works better for which application, and that answer will be purely subjective.

I struggle with order of operations when editing a single JPG in Photoshop, i.e. when to sharpen, when to adjust levels, when to adjust color tones/saturation etc.

Interesting discussion nonetheless; there are several other HDR experts who patrol this area, so we'll have to wait and see... I feel like after re-reading this I've entered a black hole. This is why I dropped computer science.


----------



## PASM (Jan 15, 2011)

I thought I read somewhere that integer is the best DCT method, but I don't remember why. I use integer DCT and 1x1,1x1,1x1 subsampling when creating JPEGs.

Sorry if this isn't relevant to your question.


----------



## Garbz (Jan 18, 2011)

MaxR, here's a point to consider. You have come up with reasons why floating point is better suited for math, but you have missed the practicality of the application. Ultimately the biggest point is still that in the end you have a discrete value per pixel, normalised so black is 0 and white is whatever the upper end of the format is.

To that extent, it is not critical how many decimal places a calculation can carry; what is critical is the total number of values that can be represented in the final format, which is an integer anyway. To illustrate at low bit depths: what's the benefit of a 1-bit "float" (it doesn't exist, but bear with me) where the value 1, divided by 2 in some calculation, can actually take the value 0.5, versus doubling the number of discrete values you can represent and ending up with 1 as the final result? Nothing changes here in terms of dynamic range, and both solutions end up showing the same graduated tones in the middle.

The end result is that the format which wins on all accounts, practical, scientific, and accurate, is the integer. A 32-bit integer can represent more values.

Ooooh, but the float can represent fractional values, right? Not quite. If accuracy is key, then float is the last format of choice. For instance, with a 64-bit float, 0.1 + 0.2 = 0.30000000000000004, because even with 64-bit precision the values 0.1 and 0.2 can't be represented exactly. This throws up all sorts of other math problems, which can compound to throw off the result; for example, (a+b)*c is not equal to a*c + b*c.
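
You can reproduce this in any IEEE-754 implementation, e.g. plain Python, whose floats are 64-bit:

```python
# classic IEEE-754 rounding artifacts with 64-bit floats
print(0.1 + 0.2)          # 0.30000000000000004, not 0.3
print(0.1 + 0.2 == 0.3)   # False

# the same rounding breaks algebraic identities, e.g. associativity
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False
```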

It may sound completely irrelevant, but frankly the entire discussion is irrelevant to begin with. You have about 4.295 billion distinct values in a 32-bit integer. There is no technology in existence, neither a measurement sensor nor the stitching of several different exposures, that can accurately distinguish even a small fraction of those values. Any argument for or against a format is completely dwarfed by non-linearities in measurement, source noise, math problems, etc.

In the end, the most practical format is ultimately the one that can be processed the fastest. And really, I barely have the patience to wait for Photoshop to work through 16-bit integer operations, let alone the overhead of floating-point calculations.

By the way, 64 bits can represent 1.8 billion-billion (or million-trillion, or really 1.8x10^19) possible values. We don't need that yet.


----------

