Wednesday, 12 October 2011

Tones and Dynamic Range. Why You Should Shoot RAW

I'm writing this post as a small introduction to how a digital camera sensor reads and stores data, how the human eye perceives that data, and the impact all of this has on your decisions as a photographer. There's some mathematics in it that I'm not willing to avoid: even if photographers shouldn't have to worry about mathematics and such "technicalities", it's important to understand what's going on inside your camera first, and in post-processing later, if you want to get the best out of your images.

How The Sensor Reads and Stores Data

The first thing that's important to understand is that your camera sensor is linear. That's the easiest part of all: the sensor reads light intensity, samples it and stores its value. It's plain old sampling. Cameras typically use three channels of a fixed bit depth to store the light intensities of the three primary colors. The bit depth of the channel depends on the camera, and the wider the channel, the more information can potentially be stored in it.

Photographers usually think in terms of f-stops or zones. Changing the exposure by one f-stop means doubling the light that reaches the sensor, if the compensation is positive, or halving it, if the compensation is negative. Photographers use f-stops because the eye is a logarithmic sensor: within its working range, it is sensitive to relative differences in light intensity, regardless of the absolute value.

Since sensors are linear, how is a specific zone's data distributed in the RAW file? In binary representation, doubling the light intensity (stepping up 1 f-stop) means shifting each pixel's sensor reading one bit to the left (we're using 8 bits as an example):

7 6 5 4 3 2 1 0    7 6 5 4 3 2 1 0
--------------- -> ---------------
0 0 0 0 0 a b c    0 0 0 0 a b c 0

On the other hand, halving the light intensity (stepping down 1 f-stop) means shifting the reading one bit to the right, losing the lowest bit:


7 6 5 4 3 2 1 0    7 6 5 4 3 2 1 0
--------------- -> ---------------
0 0 0 0 0 a b c    0 0 0 0 0 0 a b
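
To make the shift concrete, here's a minimal Python sketch of the idea (the reading value is made up for illustration):

# +1 EV on a linear sensor is a left shift; -1 EV is a right shift.
reading = 0b00000101           # a hypothetical 8-bit sensor reading (5)
plus_one_stop = reading << 1   # doubled intensity: 0b00001010
minus_one_stop = reading >> 1  # halved intensity:  0b00000010 (lowest bit lost)

print(f"{reading:08b} -> {plus_one_stop:08b} / {minus_one_stop:08b}")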


Let's now think in terms of zones. An n-bit channel can store data for up to n zones, and zones don't all get the same number of bits (and thus they cannot store the same level of detail). The highest zone, in this case the 1st, is 7 bits wide, the 2nd is 6 bits wide and so on, down to the 8th, which is 0 bits wide (a single level).

If you're proficient in mathematics, this is in fact pretty obvious and derives from the very nature of binary representation: adding 1 bit to the representation means doubling the range of values you can express with those bits. But stepping up 1 stop from one zone to another means precisely doubling the light intensity. That's why the maximum number of zones you can store in an n-bit number is n, and each zone has half the levels of the previous one (from the lightest to the darkest) to store its information.

You now understand why a camera such as the Nikon D5100, which produces 14-bit RAW files, may have a dynamic range of about 13 EV.

How Many Levels Can Be Stored In Each Zone?

As we've seen, the number of bits available to store each zone's data decreases by one from each zone to the next (from the lightest to the darkest), so the number of levels halves. Since the number of distinct unsigned integer values you can store with an n-bit representation is 2^n, it follows that each zone stores an exponentially decreasing number of levels. For an 8-bit file and a 14-bit file, you will have the following:

Zone  | 8-bit  | 14-bit |
      | Levels | Levels |
------+--------+--------+
    1 |    128 |   8192 |
    2 |     64 |   4096 |
    3 |     32 |   2048 |
    4 |     16 |   1024 |
    5 |      8 |    512 |
    6 |      4 |    256 |
    7 |      2 |    128 |
    8 |      1 |     64 |
    9 |      - |     32 |
   10 |      - |     16 |
   11 |      - |      8 |
   12 |      - |      4 |
   13 |      - |      2 |
   14 |      - |      1 |
------+--------+--------+
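
These numbers are straightforward to reproduce; here's a short Python sketch:

# Zone k (1 = brightest) of an n-bit channel holds 2^(n - k) levels.
def levels_per_zone(bits):
    return [2 ** (bits - zone) for zone in range(1, bits + 1)]

print(levels_per_zone(8))   # [128, 64, 32, 16, 8, 4, 2, 1]
print(levels_per_zone(14))  # [8192, 4096, ..., 4, 2, 1]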

Awful numbers in the 8-bit case, aren't they? If you've heard about the Zone System, you're probably expecting at least 8 zones in your shots. If you're starting to worry and thinking something like "Am I saving my images in 8-bit JPEG files?", then: yes, there are plenty of reasons to worry about this. But wait a few more minutes and read on.

Human Vision

The human eye's response to light intensity is, for all practical purposes, logarithmic across its working range. We have to factor this into the equations to correct the estimates we made in the previous section.

The correction we've got to apply is well known, and you've probably heard of it: it's called gamma correction. I won't go into details in this post; the Wikipedia article on the subject is pretty well done and useful for our introductory purposes. However, it's important to stress that gamma correction partially compensates for the great (exponential) imbalance in the number of levels we can store for each zone.

The gamma corrected intensities are calculated with:

v_o = v_i ^ (1/gamma)

where the gamma exponent commonly takes the value 2.2. The monitor you're using applies the inverse transform, v_o = v_i ^ gamma, right now, as does whatever photographic software you have.

Since we're interested in the zones as we perceive them (v_o in the first equation), we should apply that transformation to the values we've calculated and update our estimates accordingly.

Let's then assign some values to the zones so that we can gamma correct them. We will use the following (rounded to 2 decimal places):

Zone  | 14-bit | Gamma     |
      | Value  | Corrected |
------+--------+-----------+
    1 |   8192 |     60.09 |
    2 |   4096 |     43.85 |
    3 |   2048 |     32.00 |
    4 |   1024 |     23.35 |
    5 |    512 |     17.04 |
    6 |    256 |     12.43 |
    7 |    128 |      9.07 |
    8 |     64 |      6.62 |
    9 |     32 |      4.83 |
   10 |     16 |      3.52 |
   11 |      8 |      2.57 |
   12 |      4 |      1.88 |
   13 |      2 |      1.37 |
   14 |      1 |      1    |
------+--------+-----------+
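
Under these assumptions (gamma = 2.2), the corrected column can be reproduced with a few lines of Python:

GAMMA = 2.2

# Zone values 8192, 4096, ..., 1 and their gamma corrected counterparts.
values = [2 ** (13 - zone) for zone in range(14)]
corrected = [v ** (1 / GAMMA) for v in values]

for zone, (v, c) in enumerate(zip(values, corrected), start=1):
    print(f"zone {zone:2d}: {v:5d} -> {c:6.2f}")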

To determine how the corrected values are distributed in an n-bit channel, we apply a linear transformation to "stretch" them onto the desired interval. For an 8-bit channel the scale factor is 255/60.09, while for a 14-bit channel it is 16383/60.09. The results are:

Zone  | 8-bit   | 14-bit  |
      | Maximum | Maximum |
      | Value   | Value   |
------+---------+---------+
    1 |     255 |   16383 |
    2 |     186 |   11955 |
    3 |     136 |    8724 |
    4 |      99 |    6366 |
    5 |      72 |    4646 |
    6 |      53 |    3390 |
    7 |      39 |    2474 |
    8 |      28 |    1805 |
    9 |      21 |    1317 |
   10 |      15 |     961 |
   11 |      11 |     702 |
   12 |       8 |     512 |
   13 |       6 |     374 |
   14 |       4 |     273 |
------+---------+---------+

It's pretty evident that the widths of the zones are more balanced than they were in the non-gamma-corrected case. The last step is to calculate the number of levels per zone, subtracting from the maximum value of each zone the maximum value of the next one:

Zone  | 8-bit  | 14-bit |
      | Levels | Levels |
------+--------+--------+
    1 |     69 |   4428 |
    2 |     50 |   3231 |
    3 |     37 |   2358 |
    4 |     27 |   1721 |
    5 |     20 |   1256 |
    6 |     14 |    916 |
    7 |     10 |    669 |
    8 |      8 |    488 |
    9 |      6 |    356 |
   10 |      4 |    260 |
   11 |      3 |    190 |
   12 |      2 |    138 |
   13 |      2 |    101 |
   14 |      1 |     74 |
------+--------+--------+
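
A short Python sketch reproduces these level counts (note it extends the scale one zone below the darkest so the last difference is defined, which appears to be how the final row above was computed):

GAMMA = 2.2

# Gamma corrected zone values, extended one zone below the darkest.
corrected = [(2.0 ** (13 - z)) ** (1 / GAMMA) for z in range(15)]

def zone_levels(bits):
    scale = (2 ** bits - 1) / corrected[0]  # 255/60.09 or 16383/60.09
    maxima = [c * scale for c in corrected]
    return [round(a - b) for a, b in zip(maxima, maxima[1:])]

print(zone_levels(8))   # [69, 50, 37, 27, 20, ...]
print(zone_levels(14))  # [4428, 3231, 2358, ...]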

8-bit images aren't that bad, in fact, but aren't that good, either.

It's clear that using RAW is a huge improvement. Also, when converting from RAW to another format, you should try to avoid 8-bit formats such as JPEG, unless you're willing to lose all that information. Try to stick with 16-bit image file formats, although few programs can use them. Notably, Photoshop Elements can open them but not manipulate them. It's a good selling point for Photoshop, if you're a professional.

How Bad Are 8-Bit Image Files?

To fully understand how bad 8-bit images can be, it's necessary to understand how sensitive human eyes are to light intensity. It turns out that this question is answered by the Weber-Fechner law, which states that human eyes can distinguish a difference in light intensity of about 1%.

How many such levels are there in a zone? Using the definition of a zone (an interval over which the intensity of light doubles), you must find the number x such that:

(1.01)^x = 2

Solving gives x = log 2 / log 1.01, which is approximately 70: there are about 70 levels per zone that human eyes can distinguish. Let's look once more at the zone levels in the 8-bit and 14-bit cases. We notice that 8-bit images provide a good level of detail only in the brightest zone; in the darkest ones, artifacts such as banding will easily occur. A 14-bit file, on the other hand, provides a good level of detail down to the darkest zone. That's one more reason why you should always use RAW when shooting and post-processing your images. If you use narrower channels, artifacts will soon pop up.
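
For the record, here's the arithmetic behind the 70-level figure, in Python:

import math

# Number of 1% steps needed to double the intensity (one zone).
steps = math.log(2) / math.log(1.01)
print(round(steps))  # 70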

Useful Tips for Shooting

Now that we've learnt how zones are stored in our image files and the level of detail we can expect from each of them, we can draw some conclusions that may help us shoot the perfect photo.

Although gamma correction introduces some balance in the number of levels that can be stored for each zone, it's clear that the brightest zones can hold much more information than the darkest ones. In this respect, digital sensors like a bit of overexposure. To capture the highest level of detail, you should ensure that the zones you're interested in are exposed to the right. If you slightly overexpose, provided you don't blow out any channel, you can lower the exposure in post-production while retaining the maximum amount of detail.

Even with dynamic ranges as wide as 14 EV, you must be careful not to blow a channel out. When a channel fills up, you start clipping information, and the side effects will be, in order: a partial loss of color saturation, a color drift and, finally, the saturation of all channels to pure white.

Many RAW files will let you recover about 1 f-stop of light, but if you clip the whites, or the blacks, you will lose information. That's why on-camera histograms are a good tool to check your exposure and ensure you're using the channels efficiently.

Do These Recommendations Apply Only to Pros?

No. I'm not a pro, either. However, it's pretty easy to see how artifacts quickly appear in a typical 8-bit image with relatively light post-processing. For example, bands quickly appear in the dark areas, and there isn't enough information to modify the exposure of a shot by even 1 f-stop.

If you can shoot RAW, do it. And if you can distribute and store 16-bit images, do that as well. Today, very few people are going to run out of storage space for their photographs and, at the very least, you should be aware that you're going to lose a great deal of information with an apparently innocuous transformation (8-bit images can encode more than 16 million colors, but now you know that this is insufficient in many cases).
