This is the second part of a discussion of the pixel formats supported by Windows Media Photo. I definitely recommend you read Part 1 (below) first. It will provide some important background information that will help with the understanding of Part 2.
The Mess We’re in Today
In our last installment, we discussed Windows Media Photo’s support for unsigned integer pixel formats. While a wide range of unsigned integer pixel formats is supported, they all share the same limitation: they cannot hold any data outside the visible range defined by the gamut of the particular color space. Regardless of the bit depth, the range of numerical values represents pixel values from the darkest visible color (black) to the brightest or most saturated visible color. Any value that exceeds this visible range as a result of editing or image processing is simply discarded, and can never be recovered.
This limitation has led to the widespread adoption of some fairly convoluted workflows to always retain the highest quality images. The conventional approach is to carefully convert from RAW to retain the maximum dynamic range and widest gamut, not necessarily the best image appearance. Then, this “low contrast, low saturation” master image can be loaded in Photoshop and adjusted in different ways, depending on whether the final destination is print or monitor display. We’ve also been taught to choose different working color spaces, again based on the intended use of the image.
Of course, if we want to do any image editing (and don’t we always), ideally we need to complete all the image editing steps and produce a new low contrast/saturation master, before we begin the series of steps that will optimize the photo for the particular output requirements. And if farther down the workflow path we discover the need for an additional edit or tweak, we invariably have to back up in the workflow to the appropriate point, make the change, and then re-do all the rest of the steps to get back to where we were.
To help with this convoluted process, our tools have become much more sophisticated (and complex!). Any seasoned Photoshop user has learned to make heavy use of adjustment layers. This allows us to define and view all the changes we want to make, but we don’t actually commit those changes until the image is flattened for distribution, printing, or display. That makes it easier to back up in the workflow and make a change, without the need to manually redo all the subsequent steps. However, adjustment layers can often be confusing and difficult; they work for some scenarios, but not others. Additionally, the use of adjustment layers invariably locks us into a single application. It’s rarely possible to move an image with all its adjustment layers between different tools; we have to flatten an image (and permanently commit to the changes) before moving it from one tool to the next.
Even with Photoshop’s adjustment layers, it’s still common to find that the appearance we want can’t be achieved with the version of the image we started with. Changing the white balance or recovering a lost highlight often requires going all the way back to the RAW conversion and making an adjustment to produce a different version of our low contrast/saturation master image. At that point, many of the adjustments made in Photoshop will likely have to be tweaked accordingly, or redone from scratch.
To address this problem, some of the newest tools provide “adjustment layer” style editing directly from the unprocessed RAW image. Apple Aperture, Adobe Lightroom and other popular RAW conversion applications always start from the RAW image data, first demosaicing it to an RGB image, and then applying the requested exposure, color and other adjustments as editable layers. At some point, a flattened image needs to be rendered for distribution, printing or display, or just to export it to a different program. But up until that point, most adjustments can be changed or removed without the need to manually reproduce the subsequent steps in the image processing workflow.
The problem with all these “adjustment layer” style image processing applications is the proprietary nature of the software. It’s not like the software developers have much of a choice – the adjustment information they are creating only has meaning to their application; it can’t be understood or processed by another application. While it might be possible to define a common language for these adjustments, we would be reducing all applications to some “least common denominator” subset of adjustment operations, robbing application developers of the ability to develop and deliver new, innovative image processing operations.
So, the working edit copy of your photo is locked up inside a particular proprietary software package. You can always export a flattened output image, but because we typically use unsigned integer pixel formats (you did remember that this started as a discussion of pixel formats, right?), the pixel data is cropped to the visible limits of the chosen color space. We’re forced to choose our color space carefully, based on the intended use of the image, and once we commit to flatten the image to unsigned integer values in that color space, we’ve effectively locked the appearance. Any further edits or adjustments we make continue to reduce the overall gamut and dynamic range of the image because we’re constrained by the limits of the color space. If we do want to make changes that preserve the full fidelity of the image, we have no choice but to go back to the proprietary tool we first used, or else start all over from scratch.
Along the way, we wind up saving multiple versions of each photo: The original RAW image, the “adjustment layer” file used by our editing tool of choice, and multiple flattened images for each different use or destination for the photo. Half the job is just keeping track of all those files! And we won’t even get into the long term archivability of device-specific RAW files or application-specific editing files.
In reality, the complexity of the workflow required to preserve the full tonal range of digital photos limits this process to only a subset of digital photographers. Most photographers, even those who regularly shoot in RAW, convert the RAW file to an sRGB unsigned integer format and complete the remainder of the workflow entirely in sRGB. This significantly limits the tonal range that can be delivered in a final print.
What’s All This Have to Do with Pixel Formats?
In case you forgot, this began as a discussion of pixel formats. We took a pretty long detour lamenting the challenges of today’s high quality photo workflow. In fact, our current dependence on unsigned integer pixel formats is a big contributor to the problems that demand the cumbersome workflows described above.
Windows Vista (and .NET Framework 3.0 for down-level support) introduces a new graphics architecture with comprehensive support for high dynamic range, wide gamut pixel formats, using fixed point or floating point numerical representations, rather than unsigned integers. Windows Media Photo is currently the only file format for Windows (and beyond) that supports these new pixel formats.
The fundamental goal with a high dynamic range, wide gamut pixel format is to never discard image information that falls outside the visible range. The entire tonal spectrum is always retained, regardless of the current exposure or color adjustments.
Using fixed point or floating point values, pixel information is encoded using an extended numerical range. The visible portion of the numerical range is a subset of the total numerical range that can be encoded with fixed point or floating point values.
Using this approach, if color or exposure adjustment pushes a pixel value outside the visible range, rather than losing this value (as is the case with unsigned integer representations), the numerical value is still retained. If a subsequent adjustment brings that pixel value back into the visible range, the correct numerical value is fully recovered.
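To make the difference concrete, here is a minimal sketch of the same round trip (brighten by two stops, then darken by two stops) applied to an 8-bit unsigned sample and to a floating point sample. The function names and the simple gain model are my own assumptions for illustration; they are not part of Windows Media Photo or any particular editing tool.

```python
def adjust_uint8(value, gain):
    """Apply an exposure gain to an 8-bit unsigned sample, clipping to 0..255."""
    return max(0, min(255, round(value * gain)))


def adjust_float(value, gain):
    """Apply an exposure gain to a floating point sample (1.0 = white), no clipping."""
    return value * gain


# The same bright highlight in both representations.
u8 = 200
f = 200 / 255

# Brighten by two stops (x4), then darken by two stops (x0.25).
u8_round_trip = adjust_uint8(adjust_uint8(u8, 4.0), 0.25)
f_round_trip = adjust_float(adjust_float(f, 4.0), 0.25)

print(u8_round_trip)              # clipped to 255 on the way up, so the original tone is lost
print(round(f_round_trip * 255))  # the original tone comes back
```

The unsigned sample hits the ceiling during the first adjustment and can never return to its original value, while the floating point sample simply passes through a value greater than 1.0 and comes back intact.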
This largely eliminates the issues and concerns with “flattening” a file. Most color or exposure adjustments are completely reversible, eliminating the need to restart with the original RAW file or an intermediate layered editing file. A single file can often be used for printing, display, archiving, and even additional editing. Regardless of the choices made for the particular “look” of the photo, the full dynamic range and gamut of the original data is always retained, ensuring that the full tonal value of the photo is available for printing, display or further exposure and color adjustments.
While this doesn’t completely replace the need for adjustment layer editing or multiple version file management, in many cases a high dynamic range, wide gamut file format enables the highest possible quality workflow without the need for cumbersome and confusing multi-stage workflows. And a high dynamic range, wide gamut image file is ideal for long term archiving as well.
Of course, I’d be remiss not to point out that this is still largely potential, not reality. Until we have a greater range of choice for image processing applications with robust high dynamic range, wide gamut editing features, today’s popular tools still largely restrict us to unsigned integers. However, there are still many scenarios enabled with Windows Vista that can take full advantage of the benefits provided by Windows Media Photo and high dynamic range, wide gamut pixel formats.
Windows Media Photo supports high dynamic range, wide gamut photo encoding using either fixed point or floating point numerical representations in 16 bits per channel (bpc) or 32bpc bit depths. There are advantages with each format option and supporting multiple pixel formats offers the greatest possible choice and flexibility.
Fixed Point Pixel Formats
Windows Media Photo supports the following fixed point pixel formats:
This list of six different formats includes RGB, RGB plus Alpha, and Gray scale, in both 16bpc and 32bpc bit depths.
A fixed point numerical representation is not commonly used in current image processing applications or image file formats. It is being introduced in Windows Media Photo as an optimal format to encode greater dynamic ranges while still retaining all the performance advantages of integer processing.
Fixed point values are essentially signed, scaled integer values. By applying an appropriate scaling factor, the signed integer range can represent an arbitrary numeric range. This enables the encoding of color information that goes beyond the traditional range limits of “black” and “white” or the gamut of any particular device or rendering target.
Rather than interpreting the number as a range of integer steps from black to maximum saturation for a particular color profile, this fixed point data is scaled to represent a larger floating point range in a linear color space. With this linear encoding, zero still represents the minimum visible value, or black. A value of 1.0 represents the maximum visible value, or when applied to all channels that make up a pixel, white. The specific scaling for each bit depth specifies exactly what point in the entire signed integer range is interpreted as a value of 1.0.
For 16bpc fixed point pixel formats, there are 65536 unique values. When interpreted as a signed integer, these values range from -32768 to +32767. Windows Media Photo interprets this fixed point range as representing a linear floating point color value numeric range from -4.0 to +3.999+. 0 still represents black, and the 1.0 value for white (or maximum color saturation for a single channel) is represented by the signed integer value 8,192 (0x2000). In effect, we are re-scaling the signed integer range (from -32768 to +32767) by dividing the values by 8192. Rather than only representing the visible color value range (from 0.0 to 1.0), this scaling allows us to represent a numeric range eight times as large. Most typical photo color or exposure adjustments will still retain the entire original sensor data within this expanded range. At the same time, 13 bits are always available to represent the current visible range; because this is treated as linear data, the original sensor data efficiently maps to this high dynamic range, wide gamut color space.
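The 16bpc scaling described above can be sketched in a few lines. The helper names are hypothetical, and the round-to-nearest and saturating clamp on encode are my own assumptions; the only facts taken from the text are the signed 16-bit range and the divisor of 8192.

```python
S16_ONE = 8192                    # 0x2000, the signed integer that represents 1.0
S16_MIN, S16_MAX = -32768, 32767  # full signed 16-bit range


def fixed16_to_float(raw):
    """Interpret a signed 16-bit fixed point sample as a linear float value."""
    return raw / S16_ONE


def float_to_fixed16(value):
    """Encode a linear float value; rounding and clamping are assumed behavior."""
    raw = round(value * S16_ONE)
    return max(S16_MIN, min(S16_MAX, raw))


print(fixed16_to_float(S16_MIN))  # lowest encodable value
print(fixed16_to_float(S16_ONE))  # white
print(fixed16_to_float(S16_MAX))  # highest encodable value, just under 4.0
print(float_to_fixed16(2.5))      # a value 2.5x brighter than white still fits
```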
If this still doesn’t provide sufficient range or precision, Windows Media Photo also supports 32bpc fixed point encoding. The process is the same, but the signed integer values are interpreted to represent a scaled floating point range from -128 to 127.999+, with 24 bits of linear precision always available for the visible range. 0 is still used to represent black, and the white (or maximum color saturation) value of 1.0 is represented by the value 16,777,216 (0x01000000).
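The 32bpc numbers follow from the same arithmetic. This short sketch (the constant and function names are mine, chosen for illustration) derives the encodable range and the size of one step within the visible range from the 1.0 point given above.

```python
S32_ONE = 1 << 24  # 16,777,216 (0x01000000) represents 1.0


def fixed32_to_float(raw):
    """Interpret a signed 32-bit fixed point sample as a linear float value."""
    return raw / S32_ONE


lowest = fixed32_to_float(-(1 << 31))      # most negative signed 32-bit integer
highest = fixed32_to_float((1 << 31) - 1)  # most positive signed 32-bit integer
step = fixed32_to_float(1)                 # smallest increment, 2**-24

print(lowest, highest, step)
```

One step is 1/16,777,216 of the distance from black to white, which is where the 24 bits of linear precision for the visible range come from.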
Floating Point Pixel Formats
In addition to the fixed point pixel formats above, Windows Media Photo can also encode high dynamic range, wide gamut image data using the following floating point pixel formats:
Like the fixed point formats, this list includes RGB, RGB plus Alpha and Gray scale in 16bpc and 32bpc bit depths. In addition, it includes a pre-multiplied alpha channel version at 32bpc. Finally, there’s one special case floating point representation that we’ll also cover a little later.
Windows Media Photo uses floating point values to encode a range of values beyond just the visible range. With both bit depths, a floating point value of 0.0 represents black and a floating point value of 1.0 represents white or maximum saturation per channel. 32bpc provides a dramatically larger range and more precision than 16bpc.
16bpc floating point encoding is commonly referred to as the HALF floating point format. This encoding is not natively supported by most general purpose CPUs, but many graphics cards natively support the HALF format on the GPU. The 16 bits are organized as a sign bit, 5 exponent bits and 10 normalized mantissa bits, and are otherwise interpreted using the same rules as 32bpc floating point values. This provides an efficient method to encode values with a very wide dynamic range.
32bpc floating point values are encoded in accordance with the 32-bit format of ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic, widely used on most computing platforms. The format uses one sign bit, 8 exponent bits and 23 normalized mantissa bits. While this is one of the least efficient means to encode values for compression purposes, it offers the greatest precision and dynamic range. Most image editing applications use 32bpc floating point representation internally for the highest quality image processing.
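For readers who want to see the two bit layouts, here is a small sketch using Python’s standard `struct` module, which can pack both 16-bit (`e`) and 32-bit (`f`) floating point values. The helper names are my own; the field widths are the ones given above.

```python
import struct


def split_fields(value, float_fmt, int_fmt, exp_bits, man_bits):
    """Pack a float, then split its bits into (sign, exponent, mantissa) fields."""
    (bits,) = struct.unpack(int_fmt, struct.pack(float_fmt, value))
    sign = bits >> (exp_bits + man_bits)
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = bits & ((1 << man_bits) - 1)
    return sign, exponent, mantissa


def split_half(value):
    """16bpc HALF: 1 sign bit, 5 exponent bits, 10 mantissa bits."""
    return split_fields(value, "<e", "<H", 5, 10)


def split_single(value):
    """32bpc float: 1 sign bit, 8 exponent bits, 23 mantissa bits."""
    return split_fields(value, "<f", "<I", 8, 23)


print(split_half(1.0))    # white in the 16bpc encoding
print(split_single(1.0))  # white in the 32bpc encoding
print(split_half(-2.0))   # a negative value beyond the visible range
```

The exponent fields are stored with a bias (15 for HALF, 127 for 32-bit), which is why 1.0 does not have an exponent field of zero.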
The final floating point pixel format supported by Windows Media Photo is 32bppRGBE. An entire RGB pixel is encoded as three floating point values using only four bytes. The bytes include three unnormalized, unsigned 8-bit mantissas for the red, green and blue channels, plus a shared 8-bit exponent, so each channel is effectively a 16-bit floating point value. While this offers no increase in gamut, it is a more compact uncompressed method to encode image content with a very wide exposure range.
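The shared-exponent idea can be sketched as follows. This is an assumption-heavy illustration: it uses the exponent bias of 128 and the truncating mantissas of the well-known Radiance RGBE convention, which may not match the exact rules Windows Media Photo applies to 32bppRGBE, and the function names are hypothetical.

```python
import math


def encode_rgbe(r, g, b):
    """Pack non-negative linear RGB into three 8-bit mantissas and a shared exponent.

    Assumes the Radiance-style convention (exponent bias of 128).
    """
    brightest = max(r, g, b)
    if brightest < 1e-32:
        return (0, 0, 0, 0)
    # brightest == fraction * 2**exponent, with 0.5 <= fraction < 1
    fraction, exponent = math.frexp(brightest)
    scale = fraction * 256.0 / brightest
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)


def decode_rgbe(rm, gm, bm, e):
    """Expand the four bytes back into linear RGB floats."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    factor = math.ldexp(1.0, e - (128 + 8))
    return (rm * factor, gm * factor, bm * factor)


packed = encode_rgbe(1.0, 0.5, 0.25)
print(packed)
print(decode_rgbe(*packed))
```

Because the exponent is chosen by the brightest channel, the dimmer channels in a pixel lose precision, and the unsigned mantissas cannot hold negative values. That is consistent with the note above that this format extends exposure range but not gamut.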
High, Wide and Deep
So, now we’ve covered the wide range of pixel formats supported by Windows Media Photo. This new file format offers dramatically more flexibility than any previous photo file format, allowing the choice of the appropriate pixel format depending on the data source, application and usage scenario.
Unsigned integer pixel formats provide the greatest compatibility with legacy files and applications. As such, they are bound by all the same limitations imposed by this method of color representation. However, unsigned integers still provide the best compression efficiency, and are perfectly suited for many applications.
By supporting fixed point and floating point high dynamic range, wide gamut pixel formats, Windows Media Photo helps enable an entirely new approach to high quality photo workflows. These high, wide and deep pixel formats make it possible to preserve the entire dynamic range and gamut originally captured by the source device, and to preserve that range throughout the entire image processing workflow. This can be accomplished without concerns about color profiles and working color spaces, and can significantly simplify the process of high fidelity photo printing. An advanced capability that has previously only been available to those professionals willing to invest in complex tools and cumbersome workflows can now be available to everyone. An easy-to-use and easy-to-share file format can be the same format used to capture original images, edit, archive and print, all with the maximum fidelity possible.
Coming Up …
For the next installment, we’ll detour from this review of technical details and conceptual discussion and dive into some practical tools and techniques that let developers and enthusiasts get their hands dirty and actually start encoding and viewing Windows Media Photo files in these various pixel formats. This is all possible today using Windows Vista or .NET Framework 3.0, Photoshop, and some straightforward software development. But we’ll focus on some (largely unsupported) tools and utilities that can be used right now, instead of writing your own code or waiting for the release of updated applications that support Windows Media Photo. We’ll talk about some of the current pitfalls and limitations with Windows Vista Beta 2, so you know what to expect and what to avoid (for now).
Looking farther down my blogging to-do list, we’ll dive a little deeper into the use of high dynamic range, wide gamut pixel formats and talk in detail about color spaces and the use of color profiles. This will include a discussion of scRGB, the native color space we use for fixed point and floating point pixel formats.
As always, send me your comments, criticisms, corrections and topic requests, either via the comments section below or the email link on the left.