Colour Spaces

For artists








Written by
Bram Stout






This is my explanation of colour spaces and the surrounding topics based on how I’ve come to understand them. While I believe these explanations are perfectly adequate, especially for artists, others may disagree. Additionally, the terminology used in this document may have different meanings to different people. If you have any ideas on how to improve this web book, please feel free to send feedback to business@bramstout.nl

Copyright © 2021 Bram Stout Productions  –  This work is licensed under CC BY 4.0. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/


Colour and Light

Visible light consists of a bunch of photons of various wavelengths in the visible spectrum, bouncing around until hopefully some of them enter your eye and hit a rod cell or one of the three types of cone cells. Each photon of a different wavelength produces a different colour in our brain. If we map out these colours based on the wavelength, we get something roughly like this.


However, the photons hitting our eyes are rarely all of the same wavelength. Instead, light often consists of a combination of photons of different wavelengths. If we map these out, we can get something like this.


The line around the coloured shape shows the colours made with only one wavelength; it’s the same as the previous chart. The colours inside the shape are all of the colours that you can make by combining photons of different wavelengths. From this you can see that white isn’t some specific wavelength, but rather a combination of wavelengths. One important thing to note is that the colours shown in these charts all have a constant luminance (the amount of photons). By also varying the luminance you can make even more colours.

When looking at this, we can conclude that the distribution of the amount of photons per wavelength in the visible spectrum determines the colour. This distribution is called the spectral power distribution over the visible spectrum.

If we want to store these colours, for example in an image, then storing the amount of photons for every wavelength in the visible spectrum is awfully inefficient and generally impractical. Fortunately, our eyes see colour with the help of three different types of cone cells, each one sensitive to the visible spectrum in its own way. From the responses of these three types of cone cells, our brain interprets the colours. Because of this, instead of storing a value for every wavelength in the visible spectrum, we can abstract away those wavelengths and approximate it by only storing three values: one value for each kind of cone cell, specifying how many photons hit that cone cell (based on how sensitive it is to those photons’ wavelengths). We call this kind of abstraction a colour model.

Colour models

Colour models describe what kind of values we use to specify colours. We can use three values, one for each type of cone cell (called LMS), but we can also use hue, saturation, and brightness to specify colours. Or, we can use one value for luminance and two values for chromaticity. We could use four values: cyan, magenta, yellow, and black (CMYK). We could even use ten different values, each one for a different slice of the visible spectrum. To say it with sciency words: a colour model defines the coordinate space to describe colours.

Arguably the most popular colour model in use today is the RGB colour model. Rather than specifying the amount of light that hits each kind of cone cell, as described above, RGB simply defines a value for red light, a value for green light, and a value for blue light. By making it more generic, the RGB colour model becomes much more useful, especially for devices like digital camera sensors, which don’t respond to light in the same way as the three types of cone cells in our eyes.

There is one problem though. In the RGB colour model, what kind of red is red? But also, what kind of green is green, what kind of blue is blue and what kind of white is white? Similarly, when using hue, saturation, and brightness, what kinds of colours do we have with a hue of 0° or with a hue of 140°? The answer is that we don’t know. We have to define this first.

Colour gamuts

A colour gamut defines what the colour is for the values of a colour model. For the RGB colour model, it defines what kind of red red is, what kind of green green is, what kind of blue blue is, and what kind of white white is.

The gamuts for RGB colour models are often defined as xyY values for the red, green, and blue primaries and the white point. In 1931 the International Commission on Illumination (CIE) created the colour space CIE 1931 XYZ (or XYZ for short), which is a standardised and objective version of LMS (a colour space based on our cone cells, which stores a value for the response of each cone cell). xyY is a special version of XYZ where the x and y values specify the chromaticity (colour without luminance) and the Y value specifies the luminance. Often only the xy values are defined and the Y value is implied to be 1.0. By defining the gamut using xyY values, we know what the actual colours are for the RGB values, which also means that we can convert from one gamut to another.

        Red     Green   Blue    White
x       0.6400  0.3000  0.1500  0.3127
y       0.3300  0.6000  0.0600  0.3290
The xy values of the sRGB gamut

The xy values of the sRGB gamut plotted on an xy chromaticity diagram.

When plotting the gamut on the chromaticity diagram, a triangle shows up. All of the chromaticities (colours without luminance) inside the triangle are the chromaticities that sRGB can reproduce. Every colour space with an RGB colour model has such a gamut, but often with its own xy coordinates for the primaries and white point.
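To make this concrete, here is a minimal sketch (in Python, using NumPy) of how an RGB-to-XYZ matrix can be derived from the xy chromaticities of the primaries and white point, and how two such matrices can be chained to convert a colour from one gamut to another. The function names are my own; a real pipeline would use a colour library instead.

import numpy as np

def xy_to_XYZ(x, y, Y=1.0):
    # Convert an xyY chromaticity to XYZ (Y defaults to 1.0, as in the text).
    return np.array([x * Y / y, Y, (1.0 - x - y) * Y / y])

def rgb_to_xyz_matrix(red_xy, green_xy, blue_xy, white_xy):
    # Columns are the XYZ values of the primaries, scaled so that
    # RGB = (1, 1, 1) maps exactly to the white point.
    primaries = np.column_stack([xy_to_XYZ(*red_xy),
                                 xy_to_XYZ(*green_xy),
                                 xy_to_XYZ(*blue_xy)])
    white = xy_to_XYZ(*white_xy)
    scale = np.linalg.solve(primaries, white)
    return primaries * scale

# sRGB primaries and white point, taken from the table above.
srgb_to_xyz = rgb_to_xyz_matrix((0.64, 0.33), (0.30, 0.60), (0.15, 0.06),
                                (0.3127, 0.3290))

def convert_gamut(rgb, src_to_xyz, dst_to_xyz):
    # Converting between two RGB gamuts goes through XYZ;
    # dst_to_xyz would be built the same way from the other gamut's chromaticities.
    return np.linalg.inv(dst_to_xyz) @ (src_to_xyz @ np.asarray(rgb, dtype=float))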

Colour spaces with a colour model other than RGB still need to define a colour gamut, but they often define it differently. Those other ways are outside the scope of this web book.

Transfer functions and our eyes’ response to brightness

Our eyes do not respond to luminance linearly. If we have a 100 watt light bulb and we add another 100 watt light bulb, the resulting amount of light is equal to a 200 watt light bulb. You’d think the light is now twice as bright, but our eyes don’t see it as twice as bright. Below is a chart with both a gradient increasing linearly and a gradient increasing logarithmically. Each swatch also has the amount of light it emits written down.

As you can see, the top gradient increases by the same amount, but it doesn’t seem to increase in brightness linearly. However, the bottom gradient keeps getting twice as bright with every swatch to the right, but it looks to our eyes like the swatches are increasing in brightness roughly linearly. This is just how our eyes work.

But why is this important? Well, when storing images we’d rather use as little space as possible. Most images today are stored using 8 bits per red, green, and blue value. 8 bits gives us 256 possible values. When we look at the chart again, we can now also see the luminance percentage expressed as an 8 bit number.

With this, we can clearly see a problem. In the top gradient, the first two swatches have a huge difference in brightness and in between them are 25 possible values, but the last two swatches only have a small difference and also 25 possible values. That doesn’t seem very efficient. When we look at the bottom gradient, the first two swatches only have 1 possible value in between them, while the last two swatches have a whopping 128 possible values in between them, even though they both have roughly the same difference in brightness to our eyes.

From this we can conclude that if we transform the values to be closer to how our eyes respond to brightness, we can store them more efficiently using a limited amount of bits. This is something that we can do using a transfer function. A transfer function is just a formula that takes our values and transforms them into values that are more efficient to store using a limited amount of bits. When we want our original values back, we simply use the inverse of our transfer function. To prevent confusion, I’ll call the original values (which can be seen as the amount of light that hits our eyes or the camera sensor, or the amount of light that is emitted from a display) linear values, and the values that come out of a transfer function (which are more efficient for storing in a file or sending to your monitor) encoded values. Linear values are also often called scene-referred values and encoded values display-referred values.

Transfer functions can be anything, but there are two main types of transfer functions that are used a lot: gamma functions and log functions. Gamma functions are formulas where you take the linear value and raise it to some power (calculated as 1.0/gamma). Some gamma functions also have a linear segment, where the curve is a straight line rather than a power function, which then transitions into the power function. Log functions are formulas where you take the log of the linear values and then offset and scale it to fit nicely within the range of values that you have. When applying transfer functions, the linear values and encoded values generally run from 0 to 1, where 0 means 0% luminance and 1 means 100% luminance. Some transfer functions, like perceptual quantizer and log functions, take linear values that can also be above 1 and output encoded values that are between 0 and 1, which is what allows perceptual quantizer to be used for HDR video.
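As an illustration, here is a small Python sketch of both types. The piecewise sRGB curve is a real gamma-style transfer function with a linear segment near black; the log curve below it is a made-up toy with invented parameters, only meant to show the general shape (real log curves use the constants published by the camera vendor).

import math

def srgb_oetf(linear):
    # Gamma-style transfer function with a linear segment near black.
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055

def srgb_eotf(encoded):
    # Inverse of the above: encoded values back to linear values.
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4

def toy_log_oetf(linear, max_stops=10.0):
    # Hypothetical log curve: middle grey (0.18) maps to 0.5 and every stop
    # above or below moves the encoded value by a constant amount.
    # Purely illustrative, and not clamped to the 0-1 range.
    stops = math.log2(max(linear, 1e-6) / 0.18)
    return 0.5 + stops / max_stops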

Let’s take a look at that chart again, but instead of the 8 bit values being linear, it’s going to show the encoded value from a gamma transfer function with a gamma of 2.2.

While still not perfect, you can see that it’s now a lot closer to how our eyes perceive it.

Sometimes we don’t want to transform our values and instead keep them as linear values. A linear transfer function does just that. It simply does nothing, meaning that the encoded values are the exact same as the linear values. If a transfer function isn’t specified, it generally means it’s a linear transfer function.

While the terms transfer function and inverse transfer function are clear enough for most people, they can be ambiguous. Because of this we often use the following terms:

Opto-electronic transfer function (OETF) specifies the transfer function to transform linear values into encoded values.

Electro-optical transfer function (EOTF) specifies the transfer function to transform encoded values into the linear values specifically used by display devices.

Additionally, the inverse of the OETF and EOTF also exist. It might sound odd to have separate terms for transfer functions that simply seem to be each other’s inverse. In most cases the inverse-OETF is the EOTF and the inverse-EOTF is the OETF, but in some cases they are intentionally different, for example to compensate for the viewing environment. I’ll continue to use the terms transfer function and inverse transfer function in this web book.

Bringing it all together

When we take the colour model, which defines a coordinate space used to specify colours, the colour gamut, which defines what the actual colour is for the values in a colour model, and the transfer function, which defines how to transform the values to more efficiently store them, we get a colour space. Colour spaces allow us to define colours and, by having the colour spaces be clearly defined themselves, we can convert colours from one colour space to another.

Popular colour spaces are XYZ, sRGB, Rec.709, Rec.2100, CIELAB and YCbCr (although YCbCr is actually a family of colour spaces). When people say sRGB, some mean the colour space, while others mean the transfer function used in the colour space. In reality, the term sRGB refers to the colour space. If the transfer function is meant, then the term used should be the sRGB transfer function or the sRGB OETF.

Additionally, to be more pedantic (or confusing), Rec.709 and Rec.2100 are technically not colour spaces themselves, but rather standards that define colour spaces along with other things. We just often use the name of the standard to also refer to the colour spaces themselves. And Rec.2100 can have two transfer functions: hybrid log-gamma and perceptual quantizer.

Displays

Displays are devices, like monitors, televisions, projectors, and even printers, that can display images to us. But when we send the values of the image to the display, how does the display know what kind of values they are? Are they RGB, HSV, YUV, CMYK, or some other colour model? We might know, since we made the image, but the display doesn’t. And let’s say that it’s RGB, how does the display know what colours belong to what values? Is the red this kind of red or some other kind of red? Again, the display doesn’t know. And even then, the cable that sends the image to the display device has a limited bandwidth, so we want to be as efficient with it as possible. Encoded values are more efficient than linear values, but how does the display know what transfer function we used to get those encoded values? Once again, the display doesn’t know.

We can send whatever values that we want to the display, but the display just doesn’t know how to interpret those values. We solve this issue by saying that every display needs to have a display colour space. That’s the colour space that the display assumes the values it receives are in. With this, the display knows what the actual colours are that it should show to us. So, if we want to send an image to the display and make sure that the display shows it correctly, we just need to have the image be in that display colour space.

There are a lot of different colour spaces used by various display devices. sRGB is almost always used by computer monitors. Rec.709 by televisions. DCI-P3 and XYZ by cinema projectors. Rec.2100 (often with the PQ transfer function) by HDR computer monitors and HDR televisions. DCI-P3 but with the PQ transfer function by HDR cinema projectors. And, some displays can even support multiple display colour spaces.

However, in practice things are more complicated. A display isn’t always calibrated and would show incorrect colours even if the image is in the right display colour space. A consumer or manufacturer could take some creative liberty and modify how the display shows colours, because they think that looks better. The viewing conditions could also be different. A movie showing on a TV in a brightly lit room looks different from the same movie on the same TV in the same room, but with the curtains closed and the lights turned off. But this is not something that you can fix with colour spaces. Rather the fix would be in the display itself. If the display isn’t calibrated, it should be calibrated. If the consumer or manufacturer has some picture profile that messes with the colours, they should turn it off. If the room is too bright, so now they can’t see in the shadows, they should increase the gamma or brightness on the display. If you try to fix it by modifying the image rather than the display, then it’s going to look right on that one display under those viewing conditions, but send the image to somebody else who has a different display or change the viewing conditions, and the fix doesn’t work anymore.

It doesn’t matter what you do, the images that you create are going to look different for everyone. So, the best that you can do is to make sure that you are using the right colour spaces and hope for the best.

Linear workflows and using the right colour space

Let’s suppose that we want to make an image darker, more specifically, we want it to have half the amount of light (a.k.a. a stop darker). This is equivalent to adding an ND filter with a density of 0.3, halving the shutter angle, halving the ISO or making all of the lights in the scene half as bright. In a mathematical sense, we multiply all our values by 0.5. We can use our linear values, but what happens when we use encoded values?

Here’s a chart with some gradients. The top gradient is the original gradient. The upper middle gradient is made darker by multiplying the linear values by 0.5. The lower middle gradient is made darker by multiplying the encoded values by 0.5. The values are encoded using a gamma transfer function with the gamma set to 2.2. The bottom gradient uses a log transfer function instead of a gamma transfer function.

Just by using different transfer functions, we can already see quite a difference. The upper middle gradient, which uses linear values, emits exactly half the amount of light of the top gradient, and that’s what we want. The lower middle gradient, which uses gamma encoded values, is much darker and only emits 21.8% of the amount of light instead of 50%, so that’s not good. The bottom gradient, which uses log encoded values, is even darker and now the swatches don’t increase linearly in brightness anymore! The bright swatches got much darker than the dark swatches did. Linear values seem to be the winner here.
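Here is a tiny Python check of those numbers: halving the linear values halves the light, while halving gamma-2.2 encoded values does not.

def gamma_encode(linear, gamma=2.2):
    return linear ** (1.0 / gamma)

def gamma_decode(encoded, gamma=2.2):
    return encoded ** gamma

linear = 1.0                       # 100% of the light
half_linear = linear * 0.5         # correct: 50% of the light

encoded = gamma_encode(linear)     # 1.0 encoded
half_encoded = encoded * 0.5       # "half" applied to the encoded values
print(gamma_decode(half_encoded))  # ~0.218, i.e. only 21.8% of the light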

Let’s do another experiment. We have another gradient but one tinted blue and we want to make it into a neutral gradient (a.k.a. white balancing). The brightest blue swatch has RGB values of [0.25, 0.5, 1.0]. We want this to become 1.0 for all, so we need to multiply the red value by 4 and the green value by 2. Here’s the chart for it. The top gradient is the original. The middle gradient multiplies the linear values by the numbers we specified. The bottom gradient multiplies the encoded values by different numbers based on the fourth swatch from the right. The bottom gradient uses a log transfer function to encode the values.

The middle gradient, which uses linear values, looks like what we’d expect. The bottom gradient, however, is far from what we’d want. The darker values are still a bit blue and the brighter values have become a bit orange. Only the fourth swatch from the right, which was used to determine the white balance values, has a neutral colour. Again, the linear values seem to win.
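A minimal Python sketch of the white balancing described above, working on linear values; the helper names are my own.

def white_balance_gains(neutral_patch_rgb):
    # Gains are whatever multipliers turn the chosen neutral patch into
    # equal RGB values (assumes no channel of the patch is zero).
    r, g, b = neutral_patch_rgb
    target = max(r, g, b)
    return (target / r, target / g, target / b)

def apply_gains(rgb, gains):
    return tuple(value * gain for value, gain in zip(rgb, gains))

gains = white_balance_gains((0.25, 0.5, 1.0))   # -> (4.0, 2.0, 1.0)
print(apply_gains((0.25, 0.5, 1.0), gains))     # -> (1.0, 1.0, 1.0)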

We can do even another experiment. This time, we have a coloured gradient, but we want to change the hue by 190°. The top gradient is the original gradient. The middle gradient uses linear values to do the hue change. The bottom gradient uses encoded values and uses a log transfer function.

The middle gradient, which uses linear values, looks perfectly fine and, when measured, its hue has actually shifted by 190°. The bottom gradient has a very different hue: the brighter swatches are a bit purple, while the darker swatches become a tiny bit more blue. The bottom gradient doesn’t look quite right.

From these experiments we can conclude that when talking about the amount of light, we should be using linear values rather than encoded values. To say it with sciency words: the mathematical formulas used to manipulate colours and simulate light bouncing around in a scene specify light in terms of energy (a.k.a. the amount of light there is) and therefore require linear values. Providing these formulas with values that aren’t linear will result in mathematically incorrect results.

If we want to change the exposure of an image, we want to change the amount of light that there is in the image and thus we need to use linear values. If we are rendering a 3D scene, we are simulating light bouncing around and thus we need to use linear values. If an image is made out of focus, the lens spreads out the light over the sensor, blending it in with other light. To simulate that, we’d need to use linear values because we are again talking about the amount of light.

This idea of using linear values when talking about the amount of light is what a linear workflow is all about. It’s about using linear values when you should, so that the results that you get are mathematically correct. However, we still need to know what kind of values we are talking about (colour model) and what kind of colours those values mean (colour gamut). So we need a colour space that uses linear values, which means a colour space with a linear transfer function. Otherwise, we wouldn’t know what kind of colours the linear values actually are. In a linear workflow, we specify such a colour space and we convert all of the colours to that colour space (if they aren’t already in it) for use in all of the mathematical formulas that we use to manipulate colours or to simulate light bouncing around in a scene.

The term linear is often used as a name for such a colour space, but linear in this case is short for linear colour space, which is the term for any colour space with a linear transfer function rather than the name of one specific colour space. We could take the sRGB colour space but replace the transfer function with a linear transfer function and we’d get what we call linear-sRGB, which is a linear colour space. We could also take the Rec.2100 colour space (used for HDR video) and replace its transfer function with a linear one and also get a linear colour space. Or we could take a colour space that already has a linear transfer function, like ACEScg, and that’s a linear colour space as well. So, saying linear can mean a lot of things.

Linear colour spaces are cool and all, but are there times where we might not want to do some colour operation using linear values? Yes, but they do often go into subjective territory. An example would be making a gradient from one colour to another. Below is a chart with multiple gradients. Each starts and ends on the same colours. The colour spaces used are shown on the left side.

The look of the gradients differs vastly based on what colour space was used for the interpolation. When it comes to making gradients like this, the right colour space for the job is simply a matter of personal taste and of what the gradient will be used for.

There are a lot of colour spaces with varying properties, but the one important thing is that you make use of the right colour space (or one of the right colour spaces).

Colour management and handling colours the right way

As described in the previous chapter, when doing things like compositing, 3D rendering, and photo manipulation, we are working with light and thus we should be using linear values. However, if we have a colour in the HSV model and another colour in the RGB model and we multiply them together, we multiply the hue with red, saturation with green, and value with blue, but what kind of values do we end up with? Is the result HSV, RGB, or some other colour model? It’s not HSV or RGB and I’m not aware of any other colour model that it could be, so we can’t just multiply values together like that. We need to make sure that both colours use the same colour model. Besides that, if I have two RGB colours that are pure red (have the values [1.0, 0.0, 0.0]) but with different colour gamuts and I multiply these together, is the colour gamut of the resulting values the gamut of the first colour, the second colour, or some other gamut? Just like with colour models, it’s not the gamut of the first colour or the second colour, and also not any other gamut. These two questions probably sound confusing, and they are. You cannot multiply two sets of values with different colour models and/or different colour gamuts and end up with logical results. We need to make sure that the colour gamuts are the same as well. So, when we are manipulating colours and light (like in compositing, 3D rendering, and photo manipulation) we need to make sure that the colours are all in the same colour space with a linear transfer function.

We call this colour space, that we do all of the mathematical formulas in, the rendering colour space or more broadly a working colour space. If we have a colour (or image) that isn’t in that colour space, then we need to convert it from its colour space to the rendering colour space. When we are done and we want to save it to a file or we want to show it on our display (so that we can see what we are doing while we are working on the image), then we need to convert the colours from the rendering colour space into the display colour space. Even when saving it to a file, we’d still call it a display colour space for the sake of simplicity. In a lot of cases, we’d also want to apply a look and/or rendering to the image, which we’d do just before converting it to the display colour space (and it’s often also grouped together with the conversion to the display colour space). If we want to show the image on another display that uses a different display colour space, then we convert the same image from the rendering space to the new display colour space. This way, we can swap out one display colour space with another and be able to show the same image on any display that we can think of and still have it look the same.

What is described above is called a colour management workflow, also called a colour workflow or colour management. The goal of a colour management workflow is to define how we should handle colours, in order to be as mathematically accurate as possible. This way, we use the right colour spaces for the right things. The colour management workflow described above is a linear colour management workflow (often shortened to linear workflow). Linear colour management workflows aren’t the only kind of colour management workflow that can be used, but for things like compositing, 3D rendering, and photo manipulation you should use a linear colour management workflow if you want to be mathematically accurate.

Here’s a chart of how a linear colour management workflow would work.

The part that converts colours (or images) from some colour space to the rendering colour space is called the input transform (IT) or input device transform (IDT). The part that converts the colours (or images) from the rendering space to the display colour space is called output device transform (ODT). When you combine the look transform, rendering transform, and output device transform you get the output transform (OT). Different people use different terms for this, but these are what I use. The look and rendering transforms will be explained later in this web book. If you want to show the image on a different display, you can simply use a different output device transform appropriate for the new display’s display colour space.
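Here is a purely conceptual Python sketch of that chain of transforms. Every transform is just a placeholder function on RGB values; in a real pipeline each would be a matrix, a LUT, or shader code, and the exact ordering of the look and rendering transforms depends on the workflow.

def input_device_transform(rgb):
    # IDT: source colour space -> rendering colour space (placeholder)
    return rgb

def look_transform(rgb):
    # per-project / per-shot creative adjustment (placeholder)
    return rgb

def rendering_transform(rgb):
    # how the "medium" responds to light (placeholder)
    return rgb

def output_device_transform(rgb):
    # ODT: rendering colour space -> display colour space (placeholder)
    return rgb

def output_transform(rgb):
    # OT = look + rendering + ODT combined
    # (the exact ordering of look and rendering depends on the workflow)
    return output_device_transform(rendering_transform(look_transform(rgb)))

scene_rgb = (0.18, 0.18, 0.18)
display_rgb = output_transform(input_device_transform(scene_rgb))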

All of this “colour management” seems to be a lot of work, and that’s true. Having artists keep track of what colour spaces images are in, making sure that they convert them to the right colour spaces and don’t accidentally overlook something that requires them to redo a lot of work, doesn’t seem like the most artist-friendly way of working. Luckily, it doesn’t have to be like that. Software developers can build the right colour management workflow into their software. All you’d then have to do is tell the software what your working space is, what the colour spaces are for the files that you import, and what the display colour space is. The software then takes care of everything else.

As an example, Autodesk Maya has a linear workflow built-in. In the settings, you can specify the rendering space and the display colour space, and in the file node, you can specify the colour space of the file. Autodesk Maya will take care of the rest.

In DaVinci Resolve, you have something similar. In the project settings, you can specify a colour management workflow and then specify the colour space of your footage (you can also specify it per clip) and the display colour space. All of the colour grading will then be done in the right colour spaces.

The only thing that is left is that when working with other people or companies, you need to make sure that you use the same workflow (or ones that are compatible with each other). You can achieve this by agreeing on a workflow together, but you can also use a pre-existing one like ACES. ACES (Academy Color Encoding System) is a standardised linear colour management workflow designed to be used by all parts of filmmaking and all adjacent fields. It works the same way as the linear colour management workflow described above. The advantage of ACES is that it’s standardised, so when collaborating with other people and companies, you can just say “let’s use ACES!”

Many pieces of software, like Autodesk Maya and Nuke, make use of OpenColorIO for their colour management. OpenColorIO is a library in which you can specify all of the colour spaces, display colour spaces, look transforms, and render transforms, and it will figure out how to convert from one colour space to another. All that programs like Autodesk Maya and Nuke need to do is tell OpenColorIO which colour space the image is in and which colour space it needs to be converted to. What is useful about OpenColorIO is that all of these colour spaces, display colour spaces, look transforms, and render transforms are defined in a .ocio configuration file. This file can be shared across applications that use OpenColorIO and can even be sent to other people and companies, so that they can use the same workflow as you. ACES can be implemented using OpenColorIO, but you can also create your own workflow.
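As a rough illustration, an application might drive OpenColorIO from Python along these lines (this assumes the OCIO v2 Python bindings; the colour space names and the config file name are placeholders that must match whatever the .ocio config actually defines):

import PyOpenColorIO as OCIO

# Load the shared config file that defines the colour spaces of the workflow.
config = OCIO.Config.CreateFromFile("config.ocio")

# Ask OCIO for a processor that converts between two named colour spaces.
processor = config.getProcessor("ACEScg", "Output - sRGB")
cpu = processor.getDefaultCPUProcessor()

# Convert a single pixel from the rendering space to the display space.
pixel = cpu.applyRGB([0.18, 0.18, 0.18])
print(pixel)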

A history of doing it wrong

What is described in the previous chapter sounds great, but if you’ve spent some time on the post side of the filmmaking industry, then you’ll most likely know that this isn’t the case in a lot of software programs. It is very common for software programs to not do any kind of colour management, or to only implement it partially. The main reason for this is that over twenty years ago, when most of these applications were written, computers were very slow. Converting the image to a higher bit depth, then converting it to a linear colour space, doing the operations that you need to do, and lastly converting it back to the original colour space and bit depth was either impossible or unacceptably slow. The developers tried skipping all of that and just using the encoded values rather than linear values. The results, while not mathematically correct, were good enough, and so the developers stuck to using the encoded values and therefore not doing any colour management. Even if you had two separate images with different colour gamuts and transfer functions, the program just didn’t bother with colour spaces and used the encoded values as-is. These programs got popular, and when other programs were developed, they also didn’t do any colour management, because that’s how those popular programs did it.

Now we live in a world where many programs don’t handle colours the mathematically correct way and only some do. Luckily this is changing. More developers are aware that using the right colour spaces is important and are starting to add colour management to their programs. Unfortunately, a lot of the very popular programs are very old and adding colour management to them is a lot of work, so it just doesn’t happen. This means that there will still be programs that don’t handle colours the mathematically correct way.

I once made a video essay about this, using Lumetri as an example. The response was almost universally along the lines of “so that’s why I could never achieve what I wanted to.” Most artists who use programs that don’t do colour management just accept that there is no colour management and even believe that that’s just how it works and is supposed to work. They probably aren’t even aware of colour management. When linear workflows are then explained, they are often explained from the perspective of programs that don’t do colour management, making them sound much more complicated than they really are and scaring artists away. Hopefully, as more developers add colour management to their programs, more artists will become aware of it, which will make it easier for them to achieve the things that they want.

Look and rendering transforms

If we take an image that we have and convert it to the display colour space of our display (in this case sRGB since the computer monitor that you are likely reading this on uses sRGB), then we might get something like this.

The image looks normal. The brightness that we perceive is what we’d expect if we were actually there. However, there are parts of the image that are very bright, so bright that they become solid white, but if we were actually there, we wouldn’t see that. This effect is called clipping and it happens when values become so high that they are simply clamped (clipped) to 100%. You’ve probably seen the same thing when taking pictures with a smartphone camera or DSLR, or on television. Here are two gradients that get more saturated along the horizontal axis and brighter along the vertical axis.

The bottom half of the gradients looks perfectly fine, but the top half transitions rather harshly to pure white, and the more saturated colours shift to some odd hues instead of smoothly transitioning to white.

From this we can conclude that we don’t always want to convert our linear values in our rendering space directly into the display colour space. Sometimes we want to apply a creative manipulation of the colours, in order to make an image more pleasing or have certain characteristics that we want. This is called rendering. You’ve probably heard of rendering in the context of 3D rendering, but there is also a second kind of rendering, which is colour rendering and is what we have here. Colour rendering determines how a medium should respond to light.

When a painter is painting a landscape that they are in, they have to determine the combination of paints that best represents the light that is hitting the painter’s eyes. Colour rendering is like this, except the painter’s eye is a camera and the canvas is a display.

Photographic film also has a rendering that happens when you develop an image. If you film something using photographic film, then project that film, and measure both how much light hit the film during filming and how much light hits the screen on which you project it, the graph would look something like this (in reality there’s much more to it).

If no rendering were to happen, we’d get a straight line from (0, 0) to (1, 1), but with this specific rendering, we get some contrast in the shadows and what we’d call a ”highlight roll-off”, which gives a smoother transition into pure white.

If we’d apply a rendering (in this case based on photographic film) to the two images above, we’d get this.

Instead of having the bright parts immediately become a solid white, it now more slowly transitions into a solid white and we can see much more detail. Additionally, the gradients have smooth transitions as well. In my opinion (and hopefully yours as well) these images with the rendering applied look much nicer than the original images without the rendering. However, if we compare the amount of light that hits the camera against the amount of light that is emitted by the display, they no longer match up. These images are no longer an objective representation, but rather an artistic interpretation.

Colour renderings can be literally anything. You could use a rendering that does nothing, as if there was no rendering being applied at all, or one that mimics that particular type of photographic film that you like. You could use one that tries to preserve hues instead of skewing them. You could use a rendering that gives you black-and-white images, or a rendering that makes saturated colours even more saturated. You could even use camera-manufacturer-supplied log-to-Rec.709 LUTs, which transform the images from the camera’s log colour space to the Rec.709 display colour space, while also applying a rendering. The possibilities are endless.

ACES also comes with a default rendering, which it calls the Reference Rendering Transform (RRT) and is one of the reasons why some people choose ACES over some other colour management workflow.

Tone mapping (also called tone reproduction) is a term often used together with or even in place of the term rendering. Tone mapping is the operation of mapping one value to another value using a curve, often called the tone curve, tone mapping curve, tone reproduction curve, or tone response curve. It can be used in a rendering transform and can even be the only operation in a rendering transform (which is why the term tone mapping is sometimes used instead of the term rendering). The term tone in tone mapping came from black-and-white photography where it essentially was the term for what we now call value. Here are some examples of possible tone mapping curves.
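As a small illustration, here is one of the simplest possible tone curves in Python, the Reinhard curve x / (1 + x). It is only an example of a highlight roll-off, not the curve used by any particular rendering transform; film-like curves also add an S-shaped contrast increase in the shadows.

def reinhard_tone_curve(x):
    # Maps linear scene values to display values; bright values approach,
    # but never quite reach, 1.0, giving a smooth highlight roll-off.
    return x / (1.0 + x)

for value in (0.0, 0.18, 1.0, 4.0, 16.0):
    print(value, "->", round(reinhard_tone_curve(value), 3))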

While colour rendering is about how a medium responds to light, for some images you might want to manipulate some or all of the colours even more. Maybe one shot in a movie needs to be more saturated and another less. This is done using a look or look transform. A rendering transform and a look transform are both creative manipulations of colour, and the terms are sometimes used interchangeably. Which raises the question: why aren’t these the same thing?

The difference is slight, but important: a rendering transform is often generic in its manipulation of colours and doesn’t change from one image to another, while a look transform takes the rendering as a starting point, manipulates the colours further, and can change from one image to another. Additionally, multiple look transforms can be used together, while there can only be a single rendering transform. An example could be a feature film. A rendering transform is used to determine how the image responds to light and is the same for every single shot in the film. Often this is a rendering based on photographic film. A look transform is decided for the entire film (or different look transforms for different portions of the film), which takes that rendering transform and manipulates it further to get a custom look for that film. Another look transform is then made for every shot (generally in the form of a colour grade), which manipulates the colours based on what is right for the story.

Gamut mapping

A big cornerstone of colour management is the conversion from one colour space to the other, which includes converting from one colour gamut to another colour gamut. But, what if we have a colour in a large colour gamut and we want to convert it to a smaller gamut, but the colour lies outside of the smaller gamut? The chart below shows this exact question. We have the large gamut of the ACEScg colour space, the small gamut of the sRGB colour space, and a colour that lies within the large gamut but outside of the small gamut. This is a CIE xy chromaticity diagram. It shows all of the chromaticities (colour but without luminance) that the human eye can see.


The colour is inside of the ACEScg gamut, but outside of the sRGB gamut, so this colour cannot be represented by the sRGB colour space. As far as sRGB is concerned, this is an impossible colour. So, how do we represent this impossible colour in the sRGB colour space?

This is essentially the same as asking: what is the value of a number that does not exist? The answer is that there is no value, since the number doesn’t exist. In the same vein, there is no colour in the sRGB colour space that can represent the colour we picked above. Because we cannot objectively represent that colour in sRGB, we have to get creative. What we can do is pick a colour that is in the sRGB colour space and use that to represent the colour, even though they aren’t actually the same colour at all. Basically, we creatively manipulate the colours until they are all in the sRGB colour gamut.

Converting from one gamut to another is called gamut mapping. There are a lot of methods to choose from, and since this is something creative/subjective, they are all equally correct. Gamut clipping is a simple method, which objectively maps the colours so that they remain the same actual colour, but the colours outside of the gamut are “clipped” to the nearest colour on the edge of the gamut (thus not being objective anymore for out-of-gamut colours). Gamut clipping is probably the most used method and can be performed using just a 3×3 matrix conversion followed by clamping the values to the valid range.
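A minimal Python sketch of gamut clipping, assuming RGB-to-XYZ matrices like the ones derived earlier in this web book: convert through XYZ with the matrices, then clamp anything that ends up outside the destination gamut to its edge.

import numpy as np

def gamut_clip(rgb, src_to_xyz, dst_to_xyz):
    # Convert through XYZ using 3x3 matrices derived from each gamut's
    # primaries and white point.
    xyz = src_to_xyz @ np.asarray(rgb, dtype=float)
    dst_rgb = np.linalg.inv(dst_to_xyz) @ xyz
    # Out-of-gamut values end up below 0 or above 1 and get clipped.
    return np.clip(dst_rgb, 0.0, 1.0)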

Gamut clipping is a clearly defined gamut mapping method. Unfortunately, most other methods are only really defined by their names, leaving them up for interpretation by the developer that has to implement them. So, methods like absolute colorimetric, relative colorimetric, perceptual, and saturation gamut mapping all have varying implementations. This makes it difficult to talk about the different kinds of gamut mapping methods, so I won’t describe them. Just know that there are many methods out there.

I have come up with and implemented my own gamut mapping method. Here are a couple of images to show the difference between gamut clipping and my own method. Both images have the ACES RRT applied. The only difference is the gamut mapping method.

Gamut clipping
My method
Gamut clipping
My method
Gamut clipping
My method

Gamut mapping is a creative manipulation of colours, just like rendering transforms and look transforms are, and it could easily be a part of a rendering transform. However, since it is the method of converting from one gamut to another, it sounds like it should be a part of the output device transform rather than the rendering transform. In truth it can be in either, but my preference is to have it be a part of the output device transform. If the gamut mapping is done in the rendering transform and the display colour space changes, then the gamut mapping needs to map to the new gamut, so you’d need a new rendering transform as well. When changing the display colour space you need a new output device transform anyway, so if the gamut mapping is in there, you can keep the rendering transform and only change the output device transform.

Colour and Colour

In the beginning of this document, I described colour as the spectral power distribution of the visible spectrum. When that light hits our eyes and the signal from our eyes goes to our brain, it gets interpreted by our brain to produce a colour. But that colour isn’t the same kind of colour that I described in the beginning of this document. We have colour in the objective sense (a spectral power distribution of the visible spectrum) and in the subjective sense (interpreted by our brain), yet we use the term colour for both.

At first glance, it might seem like this isn’t a big deal and that both kinds of colour are effectively the same. In reality, they aren’t. Let’s say that you are outside in broad daylight. The light from the sun and the sky looks white. When you walk into your house, close all of the curtains, and turn on the lights, the lights will look orange (but we just call it “warm”). Stay inside for a couple of hours and the lights in the house will end up looking white (or at the very least much less orange). The lights haven’t changed. The distribution of light over the visible spectrum coming from those lights is still the same, yet our brain now interprets the light to be white. Step outside in broad daylight again and the light from the sun and the sky will look blue. Spend a couple of hours outside and your brain will adjust, and the light from the sun and the sky looks white again.

This happens because our brain has an internal “white balance” (called colour constancy), just like a camera has. Even though the distribution of light hitting our eyes remains the same, our brain can interpret it differently depending on what that internal “white balance” is. There are many more effects in play that can change how we perceive light. One example is if we have a distribution of light where the hue and saturation remain constant, but the luminance (amount of light) changes. Because of the way that our brain interprets light, we can perceive a change in the hue and saturation, as well as the change in luminance.

These two definitions of colour are very different things, so calling them both “colour” becomes confusing. When trying to differentiate between the two, colour in terms of a spectral power distribution over the visible spectrum is referred to as “physical colour”, and colour in terms of how our brain interprets the physical colour is referred to as “perceived colour” or “colour appearance”. Some say that physical colour doesn’t exist (it’s just the spectral power distribution) and that the perceived colour is the only true colour (and thus we should only use the word colour to refer to perceived colour and not physical colour), however I find that to be pedantic and not in accordance with how the term colour is used in practice. Since artists mostly work with colour as a spectral power distribution (especially when it comes to the mathematical formulas), I will use “colour” to refer to physical colour and “perceived colour” to refer to perceived colour.

Perceptual colour spaces

So far in this document, all of the colour spaces described are colour spaces that represent physical colours, but what about perceived colours? Perceived colours can be represented using special kinds of colour spaces called perceptual colour spaces, which try to be “perceptually uniform.” Perceptually uniform means that if we change the hue, saturation, or brightness by some constant amount, the difference that we’d perceive stays the same for any starting colour. If we change the hue by 10° from 30° to 40°, we’d perceive that change to be the same amount as going from 230° to 240°. Similarly, going from a saturation of 90% to 70% would be perceived the same as going from 40% to 20%. Going from a saturation of 90% to 70% at a hue of 50° would be perceived the same as the same saturation difference at a hue of 140°.

Converting physical colours to perceived colours (a.k.a. converting from colour spaces representing physical colours to perceptual colour spaces) is done using a colour appearance model. Colour appearance models try to model how our eyes and brain perceive colours, but how our brain perceives colour is complicated, has a level of subjectivity to it, and is dependent on the context (surrounding colours). Most colour appearance models don’t bother to be 100% accurate (since it’s either incredibly difficult or just impossible). Instead, they try to get close and are more focused on providing a practical perceptual colour space.

There are many perceptual colour spaces with their own colour appearance model (or multiple colour appearance models), like CIELAB, RLAB, ITP, ICtCp, CAM02-UCS, and Oklab. But why and when would we want to use a perceptual colour space?

Here is a gradient between two colours, a saturated blue and white, that is created using linear RGB values (which represents physical colours).

This gradient is physically accurate. You could take two objects, one blue and one white, and have some kind of filter in front that blurs them together, and this is the exact result that you would get. The hue of the colour remains the same throughout the gradient, however the hue that we perceive in the centre of the gradient seems to be a bit more purple than the hue of the blue all the way on the left. When we take the same gradient, but create it using the Oklab perceptual colour space (which represents perceived colours), we get this.

Now the hue that we perceive in the centre seems to be about the same hue as the blue all the way on the left. To our eyes, the hue that we perceive remains constant, however when we check the hue of the physical colours in this gradient, it starts at 236° on the left and ends at 219° on the right, just before it turns into full-on white.
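Here is a small Python sketch of the general pattern: interpolate in whatever colour space you choose by converting into that space, mixing, and converting back. The to_space/from_space parameters are placeholders for whatever conversions your pipeline provides (linear RGB, Oklab, or anything else); with the identity defaults shown here, this is the physically accurate linear RGB blend.

def lerp(a, b, t):
    # Linear interpolation between two tuples of values.
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def gradient(colour_a, colour_b, steps, to_space=lambda c: c,
             from_space=lambda c: c):
    # Convert into the chosen space, interpolate there, convert back.
    a, b = to_space(colour_a), to_space(colour_b)
    return [from_space(lerp(a, b, i / (steps - 1))) for i in range(steps)]

# Interpolating the raw linear RGB values (physically accurate blend):
swatches = gradient((0.0, 0.05, 1.0), (1.0, 1.0, 1.0), 5)
print(swatches)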

Now the question is, what colour space should we use? The answer is the same as in the chapter Linear workflows and using the right colour space. If we are talking about the amount of light, we should be using a linear colour space. Otherwise, the right colour space is simply a matter of personal taste. The same operation will give different results in different colour spaces and which one is a subjective and artistic choice.

Colour terminology

So far, I’ve mostly explained concepts and given some terms, but I haven’t defined all of the terms I’ve used and there are more terms that are useful to know. Here’s a list of terms and their definitions.

Luminance (L) is generally used to mean the amount of visible light. More specifically, it’s the amount of visible light per area. The SI unit is candela per square metre (cd/m2), but the unit called nit is also used. It is also used as shorthand for relative luminance.

Relative Luminance (Y) is the amount of visible light relative to some reference value (often called the reference white). It does not have a unit and generally a relative luminance of 1.0 is the same luminance as the reference white. When representing colours using almost any colour space, the luminance of that colour is a relative luminance. As an example, if we have the linear-sRGB values of (1, 1, 1), then the luminance calculated is 1.0, however we don’t know how many candela per square metre this is. In theory it could be any amount of candela per square metre, so the luminance is a relative luminance. On the other hand, the perceptual quantizer transfer function specifically defines that the luminance is in candela per square metre and therefore isn’t relative. So, the same value of (1, 1, 1) in Rec.2100 PQ results in a luminance of 10,000 cd/m2.

Brightness (Q) is the amount of visible light that we perceive. It is subjective and non-linear (as shown in the chapter Transfer functions and our eyes’ response to brightness). 

Lightness (J or L*) is the amount of visible light that we perceive relative to some reference. It is essentially brightness but relative. Perceptual colour spaces often use lightness and the colour appearance models generally use the relative luminance to calculate the lightness.

Luma (Y’) is the relative luminance encoded by some transfer function, or the relative luminance calculated from encoded values. Luminance and relative luminance are calculated from linear values and are therefore linear. Luma is either encoded itself or calculated from encoded values and is therefore non-linear (see the sketch after this list of terms).

Chromaticity is the physical colour without luminance. It is the combination of hue and saturation. Chromaticity can also be the perceptual colour without brightness/lightness. 

Colourfulness is the perceived amount of saturation.

Middle Grey is the colour that is perceived to be precisely in the middle between black and white. Since it’s based on human perception, there are different estimates of what this colour would be. In photography, cinematography, and computer graphics, middle grey is defined as 18% reflectance. In other words, the colour of an object that reflects 18% of all visible light. In terms of linear RGB values (regardless of the gamut), we’d get (0.18, 0.18, 0.18).

Tone is the brightness of a colour in the context of an image (often black-and-white images).

Value is almost always used in the sense of a number, like linear values and encoded values, but is sometimes used as another word for tone or brightness.

Midtones are the tones around middle grey, often occupying the middle 50% of the total range of tones. The range of tones that belong to the midtones isn’t specifically defined. Instead, it is dependent on the image and on the person talking.

Shadows are the tones that are darker than the midtones.

Highlights are the tones that are brighter than the midtones.
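To make the difference between relative luminance and luma concrete, here is a small Python sketch using the Rec.709/sRGB weights; the helper names are my own.

def srgb_eotf(encoded):
    # Piecewise sRGB curve: encoded -> linear.
    return encoded / 12.92 if encoded <= 0.04045 else ((encoded + 0.055) / 1.055) ** 2.4

def relative_luminance(linear_rgb):
    # Weighted sum of linear values (Rec.709 weights): linear.
    r, g, b = linear_rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def luma(encoded_rgb):
    # Same weighted sum, but of encoded values: non-linear.
    r, g, b = encoded_rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

encoded = (0.5, 0.5, 0.5)
linear = tuple(srgb_eotf(c) for c in encoded)
print(relative_luminance(linear))   # ~0.214
print(luma(encoded))                # 0.5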

Relative luminance colour spaces

If we have the sRGB colour of (0.735, 0.735, 0.735) and we show it on a display, how much light is the display emitting? We can convert the colour to linear values, which gives (0.5, 0.5, 0.5), so we know that the display will emit 50% of the light that it would do with a full-on white colour. However, we don’t and can’t know how much light that actually is. If our display is a 100 nits display, the colour will be 50 nits (1 nit is 1 cd/m2). If our display is 250 nits, the colour will be 125 nits and so on. But everyone has a different display with a different brightness, so we can’t tell how many nits or cd/m2 a colour will actually be.
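A tiny Python check of those numbers: the encoded sRGB value 0.735 decodes to roughly 0.5 linear, and how many nits that corresponds to depends entirely on the display’s peak brightness.

def srgb_eotf(encoded):
    # Piecewise sRGB curve: encoded -> linear.
    return encoded / 12.92 if encoded <= 0.04045 else ((encoded + 0.055) / 1.055) ** 2.4

linear = srgb_eotf(0.735)            # ~0.5
for peak_nits in (100, 250, 1000):
    print(peak_nits, "nit display ->", round(linear * peak_nits, 1), "nits")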

The values of the colour are relative to the brightness of the display. In other words, the luminance is relative to some reference. Many colour spaces use a relative luminance, like Rec.709, XYZ, Display P3, and Adobe RGB. But why do these colour spaces use a relative luminance?

If you are on your phone in a dark room, you turn down the brightness of your phone. If you go outside and look at your phone, you need to turn up the brightness in order to be able to see something. So, depending on your viewing conditions (how bright it is where you are), you need to change the brightness of your display to compensate. If the luminance were absolute, then in the dark room your phone would be blindingly bright and outside your phone would be too dark to see anything on it.

By using a colour space with a relative luminance, we can change the settings on our display to correct for our viewing conditions, so that we still perceive the colours the same. If the colour space used an absolute luminance, we would have to change our viewing conditions instead.

Changing the brightness isn’t the only way to correct for the viewing conditions. If you are in a brightly lit room, some of that light will also be reflected towards you by the glass panel of the display, causing glare. This washes out the shadows, making it so that you can’t see the detail in the shadows anymore (or much less of it). If you increase the brightness of the display to compensate for that, then the display would be much brighter than the lights in the room. Your eyes are adjusted to the amount of light in the room and not to your display, so the colours appear much brighter than they should (which also skews the hue and saturation). Instead, you want to have the display be about the same brightness as the lights in the room, but now you’re still stuck with the washed out shadows.

The solution to that is to increase the gamma of the display. This raises the shadows while barely raising the highlights and keeps the brightness of the display the same. It emphasises the detail in the shadows, so that you can still see it. But if the room is no longer brightly lit, then you’d have to turn down the gamma again, since there is no glare to compensate for anymore.

The colour of objects

When light hits an object, one of four things can happen to each photon: it can be reflected, scattered, transmitted, or absorbed. If an object absorbs all of the green and blue wavelengths of light, but reflects, scatters, or transmits the red wavelengths, then the light coming away from that object is red and thus the object looks red. If an object absorbs all of the red and blue wavelengths of light, it looks green. If an object absorbs all of the red wavelengths of light, it looks cyan. If an object absorbs all wavelengths of light, then it looks black. If an object doesn’t absorb any light, it looks white. And if an object absorbs half of the light at every wavelength, it looks grey. So, the colour of an object is basically the colour of the light that it didn’t absorb.

The colour of an object is dependent on the percentage of light it reflects, scatters, or transmits for each wavelength in the visible spectrum. This results in a spectral distribution of the object’s reflectance (which includes scattering) and transmittance. Visible light is a spectral power distribution, which is useful because with the spectral power distribution we know how much light there is for each wavelength and with the spectral reflectance/transmittance distribution we know how much of that light is reflected, scattered, or transmitted. Multiply them together and we get the spectral power distribution of the reflected, scattered, and transmitted light.

Just like how storing the amount of light per wavelength in the visible spectrum is impractical, storing the amount of light reflected, scattered, or transmitted per wavelength in the visible spectrum is impractical as well. Luckily, we can use the same trick as with light. We can abstract away the many wavelengths and only represent it using a few values. If we can represent physical colours by how much red, green, and blue light there is, we can also represent the reflectance and transmittance of an object by what percentage of red, green, and blue light it reflects, scatters, or transmits.

For this we need to start with a colour model, but the colour model for the reflectance and transmittance of an object needs to be the same as the colour model for the physical colour which represents light. If the physical colour is using the RGB model and the reflectance is using a RYB (red, yellow, blue) model, then we’d end up multiplying the green with the yellow value, but that doesn’t make any sense (and it shouldn’t). So, if our physical colour is using the RGB model, then our object’s colour should also be using the RGB model.

In the same vein, we also need to define a colour gamut and transfer function and these need to be the same as our physical colour. Combine this all together and we have a colour space. So, just like how we can represent physical colours using colour spaces, we can also represent an object’s colour using the exact same colour spaces. The only difference is that the values of a physical colour could be any number, while the values of an object’s colour should always be between 0.0 (0%) and 1.0 (100%).

We can give an object a diffuse colour texture and have that texture be in sRGB (which is incredibly common), but before we can use the values in the texture, we need to convert it to the rendering/working colour space (otherwise the mathematics wouldn’t work out anymore). Luckily as outlined in the chapter Colour management and handling colours the right way, software programs can do this for you, where all you have to do is to tell it what the colour space of the texture is.
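As a minimal Python sketch of that idea: the texture stores encoded sRGB values, so they are linearised first (here assuming the working space shares the sRGB gamut, so only the transfer function needs undoing) and only then multiplied with the light colour. All names are illustrative.

def srgb_eotf(encoded):
    # Piecewise sRGB curve: encoded -> linear.
    return encoded / 12.92 if encoded <= 0.04045 else ((encoded + 0.055) / 1.055) ** 2.4

def shade(albedo_texture_srgb, light_linear_rgb):
    # Convert the texture value to linear values first.
    albedo_linear = tuple(srgb_eotf(c) for c in albedo_texture_srgb)
    # Reflected light = incoming light x fraction reflected, per channel.
    return tuple(l * a for l, a in zip(light_linear_rgb, albedo_linear))

print(shade((0.5, 0.25, 0.1), (2.0, 2.0, 2.0)))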

In this chapter I’ve described the spectral reflectance/transmittance distribution as one thing, but in reality there are multiple, each one for a different phenomenon that occurs when light interacts with an object. You can see this in the materials that can be found in 3D applications. They have colours for diffuse, translucency, reflection, refraction, transmission, subsurface scattering, and transparency.

HDR

The vast majority of the displays in use today can only get so bright. They often use a display colour space like sRGB, Rec.709, Rec.2020, DCI-P3, or Display P3, which represent luminance relative to the maximum luminance of the display. The brightest colour that the display can show is therefore limited. Some displays, on the other hand, can get much brighter. Sometimes even a hundred times brighter. But showing an sRGB image on such a bright display at its maximum brightness would be blindingly bright. Instead, that extra brightness is used for the highlights of the image, allowing them to get much brighter and thus extending the display’s dynamic range.

For the displays that use sRGB, Rec.709, and such, a linear value of 1.0 means 100% the maximum luminance of the display and we’d call these displays SDR-displays (standard dynamic range displays). For the other kind of displays that can get much brighter, a linear value of 1.0 could mean 10% the maximum luminance or even 1% the maximum luminance (or any other percentage), depending on the capabilities of the display. We call these displays HDR-displays (high dynamic range displays).

For these displays, we also need display colour spaces that can encode values above 1.0. This can be done by using a different transfer function. The most popular transfer functions for HDR displays are HLG (hybrid-log-gamma) and PQ (perceptual quantizer). HLG can encode a maximum linear value of 12.0 and PQ can encode a maximum linear value of 100.0 (assuming that the reference white is 100 nits). Reference white is the physical colour associated with the linear values of (1, 1, 1).
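For reference, here is the PQ encoding curve as a small Python sketch, using the constants from SMPTE ST 2084 and expressing linear values relative to a 100 nit reference white, as in the text.

# PQ (SMPTE ST 2084) constants.
m1 = 2610.0 / 16384.0
m2 = 2523.0 / 4096.0 * 128.0
c1 = 3424.0 / 4096.0
c2 = 2413.0 / 4096.0 * 32.0
c3 = 2392.0 / 4096.0 * 32.0

def pq_encode(linear, reference_white_nits=100.0):
    # Normalise to the 10,000 nit maximum that PQ can encode.
    y = linear * reference_white_nits / 10000.0
    return ((c1 + c2 * y ** m1) / (1.0 + c3 * y ** m1)) ** m2

print(pq_encode(1.0))     # reference white (100 nits) -> ~0.51
print(pq_encode(100.0))   # 10,000 nits -> 1.0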

HDR display colour spaces can also have different colour gamuts and often they do make use of wider colour gamuts. However, there is nothing about HDR that says that a wide colour gamut should be used, and SDR displays can also use wide colour gamuts. A benefit of HDR is said to be the ability to have more vibrant colours, yet this comes from the use of a wider colour gamut and not from it being HDR.

Because HDR displays can show a wider dynamic range, a higher bit depth needs to be used in order to prevent banding from showing up. Generally 10 to 12 bits per value are used. The higher bit depth is often used as an argument for choosing HDR over SDR, but there is nothing about HDR or SDR that says that SDR can’t have higher bit depths as well. The movie files that movie theatres receive from distributors make use of the DCI-P3 colour space (which is SDR) and a bit depth of 12 bits.

Ultimately, the only actual benefit that HDR gives over SDR is the ability to show brighter colours. This absolutely does not mean that HDR is some kind of scam or not worth it. There are many cases where being able to reproduce those brighter colours can actually make a significant improvement on the viewing experience, while with SDR you’d need to make use of a rendering to be able to represent those brighter colours (without the display actually producing those bright colours).

The concept of HDR doesn’t just apply to displays, but also to footage/images. An image that can only represent a maximum linear value of 1.0, is called an SDR-image. An image that can represent linear values above 1.0, is called an HDR-image. Log camera footage is an example of HDR-images. When you linearise the log-encoded values, you can end up with linear values of above 1.0. EXR is an image format that stores floating point numbers, which allows it to represent linear values above 1.0 as well. 360 degree HDR-images (often called HDRI) are generally used within CGI for the lighting. A light source is often much brighter than a linear value of 1.0, so you would need to use an HDR-image to be able to store that.
