Image Sensor Architecture for Digital Cinema
or CMOS), electronic image sensors must capture incoming light, convert it to an electric signal,
measure that signal, and output it to supporting electronics. Similarly, regardless of the technology of image acquisition, cinematographers can
generally agree on a short list of capabilities that a capture medium needs in order to provide great images for big-screen feature films: capabilities
such as Sensitivity, Exposure Latitude, Resolving Power, Color Fidelity, Frame Rate, and one we might call Personality. This paper will use such a
list to evaluate image sensor technologies available for digital cinematography now and in the near future.
Image Quality: Many Paths to Enlightenment
The comparison of image sensor technologies for motion pictures
is both difficult and complicated. The combination of an image
sensor and its supporting electronics is analogous to a film stock;
just as there is no single film stock that covers all situations or all
cinematographers' needs, there is no single sensor or camera that
is perfect for every occasion. Every decision involves tradeoffs.
The same sensor can even be more or less suitable for an
application depending on the camera electronics that drive and
support it. But no amount of processing can retrieve information
that a sensor didn't capture at the scene.
In designing the sensor and electronics for our Origin® digital
cinematography camera, DALSA drew upon its 25 years of
experience in CCD and CMOS imager design. Given the demands
and limitations of the situation, we determined that the best image
sensor design for our purposes was (and still is) a frame-transfer
CCD with large photogate pixels and a mosaic color filter array. It
is not the only design that could have succeeded, but it is the only
design that has succeeded. No other design has demonstrated a
similar level of imaging performance across the range of criteria
we identified above. This is not to say that no other design will
reach those performance levels; to bet against technology
advancement would be short-sighted. On the other hand, the
performance Origin can demonstrate today is several generations
ahead of the best we've seen from other technologies and
architectures, and Origin's design team is forging ahead to
improve it even more.
Imaging Requirements: what do cinematographers really want?
Individual tastes and rankings will vary, but most
cinematographers would agree that any imaging medium can be
judged by a short list of attributes including those described
below.
Sensitivity
Sensitivity refers to the ability to capture the desired detail at a
given scene illumination; in film terms, this is known as "film speed."
Matching imager sensitivity with scene lighting is one of the most basic
aspects of any photography.
Silicon imagers capture image information by virtue of their ability to
convert light into electrical energy through the photoelectric effect:
incident photons boost energy levels in the silicon lattice and knock
loose electrons to create electric signal
charge in the form of electron-hole pairs. Image sensor sensitivity
depends on the size of the photosensitive area (the bigger the
pixel, the more photons it can collect) and the efficiency of the
photoelectric conversion (known as quantum efficiency or QE).
QE is affected by the design of the pixel, but also by the
wavelength of light. Optically insensitive structures on the pixel
can absorb light (absorption loss); also, silicon naturally reflects
certain wavelengths (reflection loss), while very long and very
short wavelengths may pass completely through the pixel's
photosensitive layer without generating an electron (transmission
loss) (Janesick, 1).
Image Sensor Architectures for Digital Cinematography
DALSA Digital Cinema
03-70-00218-01
Sensitivity requires more than merely generating charge from
photogenerated electrons. In order to make use of that sensitivity,
the imager must be able to manage and measure the generated
signal without losing it or obscuring it with noise.
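The relationship between pixel size, quantum efficiency, and noise can be illustrated with a small calculation. The sketch below is not DALSA's model; it assumes a square pixel with 100% fill factor, photon shot noise equal to the square root of the signal, and read noise added in quadrature. The function names and all numeric values (photon flux, QE, read noise, pixel pitches) are hypothetical and chosen only for illustration.

```python
import math


def signal_electrons(photon_flux, pixel_pitch_um, qe, exposure_s):
    """Mean photoelectrons collected by one square pixel.

    photon_flux is in photons per square micrometre per second at the
    sensor surface; qe is the quantum efficiency as a fraction (0..1).
    Assumes the whole pixel area is photosensitive (100% fill factor).
    """
    area_um2 = pixel_pitch_um ** 2
    return photon_flux * area_um2 * qe * exposure_s


def snr(signal_e, read_noise_e):
    """Signal-to-noise ratio with shot noise and read noise in quadrature."""
    shot_noise_squared = signal_e          # shot noise variance equals the signal
    total_noise = math.sqrt(shot_noise_squared + read_noise_e ** 2)
    return signal_e / total_noise


# Hypothetical low-light scene: same flux, QE, and exposure for both pixels.
flux, qe, exposure = 100.0, 0.5, 1.0 / 48.0
for pitch_um in (4.0, 8.0):
    s = signal_electrons(flux, pitch_um, qe, exposure)
    print(f"{pitch_um} um pixel: {s:.1f} e-, SNR {snr(s, read_noise_e=10.0):.2f}")
```

Doubling the pixel pitch quadruples the collected charge, and in this read-noise-limited example the larger pixel gains proportionally more in signal-to-noise ratio, which is why managing noise matters as much as generating charge.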
Exposure latitude
Exposure latitude refers to the ability to preserve detail in both
shadow and highlights simultaneously. Some of the most dramatic
cinematic effects, as well as the most subtle, depend on wide
exposure latitude. For film, latitude is described in terms of usable
"stops," where each successive stop represents a halving (or
doubling) of light transmitted to the focal plane. For example, at
f/2.0 there is 50% less light transmitted than at f/1.4; f/2.8 transmits
half as much as f/2.0, and so on. Many film stocks deliver over 11
stops of useful latitude, while broadcast and early digital movie
cameras have struggled to deliver more than eight.
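The arithmetic behind stops can be sketched in a few lines. This example assumes the standard relationship that transmitted light is proportional to 1/N² for f-number N; the function names are illustrative, not from any library. Note that marked f-numbers are rounded powers of the square root of two, so f/1.4 to f/2.0 works out to very nearly, not exactly, one stop.

```python
import math


def relative_light(f_number, reference_f_number):
    """Light transmitted at f_number relative to the reference aperture.

    Transmitted light is proportional to 1 / N^2 for f-number N.
    """
    return (reference_f_number / f_number) ** 2


def stops_between(f_wide, f_narrow):
    """Stops of light lost when closing down from f_wide to f_narrow."""
    return 2.0 * math.log2(f_narrow / f_wide)


def contrast_ratio(stops):
    """Scene contrast ratio spanned by a given number of stops."""
    return 2 ** stops


print(relative_light(2.0, 1.4))    # close to one half
print(stops_between(1.4, 2.8))     # two stops
print(contrast_ratio(11), contrast_ratio(8))
```

Each additional stop doubles the contrast ratio a medium must hold, so the gap between 8 and 11 stops is a factor of eight in scene contrast.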
In the electronic domain, exposure latitude is expressed as
dynamic range, usually described as the ratio of
the device's output at saturation to its noise floor. This can be
expressed as a ratio (4096:1), in decibels (72 dB), or in bits (12 bits).
It should be noted that not all of a device's dynamic range is
linear. Above and below certain levels, device response is not
predictable and its output may not be useful. When comparing
device dynamic range specifications, note whether the value is
given as linear; the linear segment is by far the most useful part of
the dynamic range. Low noise and a large charge capacity, often
contradictory goals, are crucial to delivering great dynamic range.
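The three ways of quoting dynamic range mentioned above are simple conversions of one another. The sketch below assumes the 20·log10 decibel convention, which is the one consistent with the 4096:1 = 72 dB = 12 bits example in the text. The full-well and noise-floor values are hypothetical, picked only to reproduce that ratio.

```python
import math


def dynamic_range(full_well_e, noise_floor_e):
    """Express saturation-to-noise-floor dynamic range three ways."""
    ratio = full_well_e / noise_floor_e
    return {
        "ratio": ratio,
        "decibels": 20.0 * math.log10(ratio),
        "bits": math.log2(ratio),  # also the equivalent number of stops
    }


# Hypothetical pixel: 40,960 e- full well over a 10 e- noise floor.
dr = dynamic_range(full_well_e=40960.0, noise_floor_e=10.0)
print(f"{dr['ratio']:.0f}:1, {dr['decibels']:.1f} dB, {dr['bits']:.1f} bits")

# Halving charge capacity at the same noise floor costs one bit (about 6 dB).
print(dynamic_range(20480.0, 10.0)["bits"])
```

Because the bit count is a base-2 logarithm, it can also be read as stops, which makes it easy to compare an electronic specification against a film stock's latitude. These figures describe the total range, not necessarily the linear portion.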
While extensive research goes into designing pixels to be as
sensitive and as quiet as possible in low light, performance in
bright light is also very important. Film stocks have been refined
to respond to varied lighting with non-linear toe and shoulder
regions for shadows and highlights; this is one of film's defining
characteristics. Very few electronic imagers can offer similar
performance. In contrast, we have all seen digital images in which
extremely bright areas "bloom" or "blow out" the highlight
details. The larger a pixel's charge capacity, the wider the range of
illumination intensities it can manage. But to contain the brightest
highlights without losing detail or blowing out the rest of the
image, sensors need "antiblooming" structures to drain away
excess charge beyond saturation. By their nature, CMOS pixels
offer a high degree of antiblooming; in CMOS designs there is almost
always a drain nearby to absorb charge overflow. Some (but not
all) CCDs also offer antiblooming, although antiblooming almost
always involves a tradeoff with full-well capacity. For pixels that
are already limited in charge capacity by small active area, good
antiblooming performance can reduce exposure latitude
significantly. The smaller the pixel, the greater the impact.
Resolving power
Technically, the ability to image fine spatial frequencies through
an optical system should be defined as "resolution" (Cowan, 1),
but in the electronic domain "resolution" is too often used to
mean mere pixel count. For clarity we will use the phrase
"resolving power" here. Resolving power is measured in units
such as line pairs per degree of arc (from the point of view of a
human observer), line pairs per millimeter (on the imaging
surface itself), or line pairs per image height (in terms of a display
device, with viewing distances given).
Clearly, resolving power is quite different from pixel count. The
performance of the pixels (and the lens focusing light onto them)
has a huge impact on how much resolving power an imaging
system has. Two related terms are "sharpness" and "detail," both
used to describe the amount and type of fine information available
in the image, and both heavily influenced by the amount of
contrast available at various frequencies in an image (Cowan, 1).
Discussion of resolving power, contrast, and frequencies calls for the
inclusion of the technical term Modulation Transfer Function
(MTF), which describes the geometrical imaging performance of a
system, usually illustrated as a graph plotting modulation
(contrast ratio) against spatial frequency (line pairs per unit). As
MTF decreases, closely spaced light and dark lines will lose
contrast until they are indistinguishably gray. Increasing the
number of pixels in an imager will not improve its resolving
power if the design choices made in adding pixels reduce MTF.
This can happen if the pixels become too small, especially if they
become smaller than the resolving power of the lens.
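To make the interaction between pixel size and lens performance concrete, the following sketch multiplies two textbook MTF models: the geometric MTF of a square pixel aperture and the diffraction-limited MTF of an ideal lens with a circular aperture. It is an idealized illustration, not a description of any real camera: it assumes 100% fill factor, a single wavelength of 550 nm, and no lens aberrations, and the pixel pitches and f-number are hypothetical.

```python
import math


def modulation(i_max, i_min):
    """Contrast of a line pattern: (max - min) / (max + min)."""
    return (i_max - i_min) / (i_max + i_min)


def pixel_mtf(freq_lp_mm, aperture_mm):
    """Geometric MTF of a square pixel aperture, |sin(x) / x|."""
    x = math.pi * freq_lp_mm * aperture_mm
    return 1.0 if x == 0 else abs(math.sin(x) / x)


def lens_mtf(freq_lp_mm, f_number, wavelength_mm=550e-6):
    """Diffraction-limited MTF of an ideal lens (circular aperture)."""
    cutoff = 1.0 / (wavelength_mm * f_number)   # no contrast beyond this
    x = freq_lp_mm / cutoff
    if x >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(x) - x * math.sqrt(1.0 - x * x))


def nyquist_lp_mm(pitch_mm):
    """Highest spatial frequency a pixel grid can sample, in lp/mm."""
    return 1.0 / (2.0 * pitch_mm)


def system_mtf(freq_lp_mm, pitch_mm, f_number):
    """System MTF as the product of the lens and pixel contributions."""
    return lens_mtf(freq_lp_mm, f_number) * pixel_mtf(freq_lp_mm, pitch_mm)


# Hypothetical comparison at f/8: 8 um pixels versus 2 um pixels.
for pitch_mm in (0.008, 0.002):
    f_nyq = nyquist_lp_mm(pitch_mm)
    print(f"pitch {pitch_mm * 1000:.0f} um: Nyquist {f_nyq:.1f} lp/mm, "
          f"system MTF at Nyquist {system_mtf(f_nyq, pitch_mm, 8.0):.3f}")
```

In this idealized case the 2 um pixel grid samples frequencies beyond the lens cutoff at f/8, so the system MTF at its Nyquist frequency is zero: the extra pixels add pixel count without adding resolving power.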
Figure 1. The top image demonstrates much wider exposure
latitude or dynamic range, allowing it to preserve details in
shadows and highlights
Some film negatives have been tested to exceed 4000 lines of
horizontal resolving power. However, prints, even taken directly
from the negative, inherit only a fraction of the negative's MTF
(see ITU Document 6/149-E, published 2001). The image degrades
during each generational transfer from negative to interpositives,
internegatives, answer prints, and release prints. Clearly,
electronic sensors for digital cinematography will need to be
thousands of pixels wide, but exactly how many thousands is less
clear. Whatever the display resolution, most cinematographers
would prefer to capture as much detail as possible at the
beginning of the scene-to-screen chain to have maximum
flexibility in postproduction and archiving. The feature film
industry has no consensus o