DS7201 ADVANCED DIGITAL IMAGE PROCESSING, Lecture notes of Digital Image Processing

Sources of 3D Data sets, Slicing the Data set, Arbitrary section planes, The use of color, Volumetric display, Stereo Viewing, Ray tracing, Reflection, Surfaces, Multiply connected surfaces, Image processing in 3D, Measurements on 3D images.

Chapter 12: 3D Image Visualization

Sources of 3D data

True three-dimensional (3D) imaging is becoming more accessible with the continued development of instrumentation. Just as the pixel is the unit of brightness measurement for a two-dimensional (2D) image, the voxel (volume element, the 3D analog of the pixel or picture element) is the unit for 3D imaging; and just as processing and analysis are much simpler if the pixels are square, so the use of cubic voxels is preferred for three dimensions, although it is not as often achieved.

Several basic approaches are used for volume imaging. In Chapter 11, 3D imaging by tomographic reconstruction was described. This is perhaps the premier method for measuring the density and in some cases the composition of solid specimens. It can produce a set of cubic voxels, although that is not the only or even the most common way that tomography is presently used. Most medical and industrial applications produce one or a series of 2D section planes, which are spaced farther apart than the lateral resolution within the plane (Baba et al., 1984, 1988; Briarty and Jenkins, 1984; Johnson and Capowski, 1985; Kriete, 1992).

Tomography can be performed using a variety of different signals, including seismic waves, ultrasound, magnetic resonance, conventional x-rays, gamma rays, neutron beams, and electron microscopy, as well as other even less familiar methods. The resolution may vary from kilometers (seismic tomography), to centimeters (most conventional medical scans), millimeters (typical industrial applications), micrometers (microfocus x-ray or synchrotron sources), and even nanometers (electron microscope reconstructions of viruses and atomic lattices). The same basic presentation tools are available regardless of the imaging modality or the dimensional scale.

The most important variable in tomographic imaging, as for all of the other 3D methods discussed here, is whether the data set is planes of pixels or an array of true voxels. As discussed in Chapter 11, it is possible to set up an array of cubic voxels, collect projection data from a series of views in three dimensions, and solve (either algebraically or by backprojection) for the density of each voxel. The most common way to perform tomography, however, is to define one plane at a time as an array of square pixels, collect a series of linear views, solve for the 2D array of densities in that plane, and then proceed to the next plane. When used in this way, tomography shares many similarities (and problems) with other essentially 2D imaging methods that we will collectively define as serial imaging or serial section techniques.
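
To make the plane-by-plane procedure concrete, here is a minimal numpy sketch that simulates parallel projections of a single 2D slice and reconstructs it by simple unfiltered backprojection. It is an illustration of the idea, not the reconstruction machinery of Chapter 11; the phantom, the angle set, and the use of scipy's rotate are all assumptions, and a real reconstruction would apply a ramp filter to each view before smearing it back.

```python
import numpy as np
from scipy.ndimage import rotate

def project(slice2d, angles_deg):
    """Simulate parallel-beam views: rotate the slice and sum along columns."""
    return [rotate(slice2d, a, reshape=False, order=1).sum(axis=0)
            for a in angles_deg]

def backproject(views, angles_deg, size):
    """Unfiltered backprojection: smear each view across the plane,
    rotate it back to its acquisition angle, and accumulate."""
    recon = np.zeros((size, size))
    for view, a in zip(views, angles_deg):
        smear = np.tile(view, (size, 1))        # constant along the ray direction
        recon += rotate(smear, -a, reshape=False, order=1)
    return recon / len(views)

# Illustrative phantom standing in for one measured plane of a specimen.
phantom = np.zeros((64, 64))
phantom[20:44, 28:36] = 1.0
angles = np.linspace(0.0, 180.0, 60, endpoint=False)
recon = backproject(project(phantom, angles), angles, 64)
```

Repeating this for each plane in turn and stacking the reconstructed planes yields exactly the kind of sliced data set discussed in the rest of this chapter.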

A radiologist viewing an array of such images is expected to combine them in his or her mind to “see” the 3D structures present. (This process is aided enormously by the fact that the radiologist already knows what the structure is, and is generally looking for things that differ from the familiar, particularly in a few characteristic ways that identify disease or injury.) Only a few current-generation systems use the techniques discussed in this chapter to present 3D views directly. In industrial tomography, the greater diversity of structure (and correspondingly lesser ability to predict what is expected) and the greater amount of time available for study and interpretation have encouraged the use of computer graphics. But such displays are still the exception rather than the rule, and an array of 2D planar images is more commonly used for volume imaging. This chapter emphasizes methods that use a series of parallel, uniformly spaced 2D images, but present them in combination to show 3D structure.

These images are obtained by dissecting the sample into a series of planar sections, which are then piled up as a stack of voxels. Sometimes the sectioning is physical. Blocks of embedded biological materials, textiles, and even some metals can be sliced with a microtome, and each slice imaged (just as individual slices are normally viewed). Collecting and aligning the images produces a 3D data set in which the voxels are typically very elongated in the “Z” direction because the slices are much thicker or more widely spaced than the lateral resolution within each slice.

At the other extreme, the secondary ion mass spectrometer uses an incident ion beam to remove one layer of atoms at a time from the sample surface. These pass through a mass spectrometer to select atoms from a single element, which is then imaged on a fluorescent screen. Collecting a series of images from many elements can produce a complete 3D map of the sample. One difference from the imaging of slices is that there is no alignment problem, because the sample block is held in place as the surface layers are removed. On the other hand, the erosion rate through different structures can vary so that the surface does not remain planar, and this roughening or differential erosion is very difficult to account for. In this type of instrument, the voxel height can be very small (essentially atomic dimensions) while the lateral dimension is many times larger.

Serial sections

Most physical sectioning approaches are similar to one or the other of these examples. They are known collectively as serial section methods. The name serial section comes from the use of light microscopy imaging of biological tissue, in which blocks of tissue embedded in resin are cut using a microtome into a series of individual slices. Collecting these slices (or at least some of them) for viewing in the microscope enables researchers to assemble a set of photographs which can then be used to reconstruct the 3D structure.

This technique illustrates most of the problems that may be encountered with any 3D imaging method based on a series of individual slices. First, the individual images must be aligned. The microtomed slices are collected on slides and viewed in arbitrary orientations. So, even if the same structures can be located in the different sections (not always an easy task, given that some variation in structure with depth must be present or there would be no incentive to do this kind of work), the pictures do not line up.

Using the details of structure visible in each section provides only a coarse guide to alignment. The automatic methods generally seek to minimize the mismatch between sections either by aligning the centroids of features in the planes so that the sum of squares of distances is minimized, or by overlaying binary images from the two sections and shifting or rotating to minimize the area resulting from combining them with an Ex-OR (exclusive OR) operation, discussed in Chapter 7.
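
As a concrete illustration of the second strategy, the following sketch shifts one binary section over another and keeps the offset that minimizes the area left by the exclusive-OR. The brute-force search range is an assumption, wrap-around at the image edges is ignored for brevity, and a practical version would also search over small rotations.

```python
import numpy as np

def xor_align(fixed, moving, max_shift=10):
    """Find the (dy, dx) shift of `moving` that minimizes the XOR area
    with `fixed`; both are boolean 2D arrays of the same shape."""
    best, best_area = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            area = np.logical_xor(fixed, shifted).sum()  # mismatch area
            if area < best_area:
                best_area, best = area, (dy, dx)
    return best, best_area
```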

some distortion in the block. This 5–20% compression in one direction is usually assumed to be nearly the same for all sections (since they are cut in the same direction, and generally have only small differences in structure that would alter their mechanical properties). If the fiducial marks have known absolute coordinates, then stretching of the images to correct for the distortion is possible. It is usually assumed that the entire section is compressed uniformly, although for some samples this may not be true.

Otherwise, it may be possible to use internal information to estimate the distortion. For example, if there is no reason to expect cells or cell nuclei to be elongated in any preferred direction in the tissue, then measurement of the dimensions of many cells or nuclei may be used to determine an average amount of compression. Obviously, this approach includes some assumptions and can only be used in particular circumstances.

Another difficulty with serial sections is calibration of dimension in the depth direction. The thickness of the individual sections is only known approximately (for example, by judging the color of the light produced by interference from the top and bottom surfaces, or based on the mechanical feed rate of the microtome). It may vary from section to section, and even from place to place within the section, depending on the local hardness of the material being cut. Constructing an accurate depth scale is quite difficult, and dimensions in the depth direction will be much less accurate than those measured within one section plane.

If only some sections are used, such as every second or fifth (in order to reduce the amount of work required to image them and then align the images), then this error becomes much worse.

Figure 2. Alignment of serial sections with translation. Sections through an inclined circular cylinder may be misconstrued as a vertical elliptical cylinder.

Figure 3. Alignment of serial sections with rotation: (a) actual outlines in 3D serial section stack; (b) surface modeling applied to outlines, showing twisted structure; (c) erroneous result without twist when outlines are aligned to each other.


It also becomes difficult to follow structures from one image to the next with confidence. Before computer reconstruction methods became common, however, this kind of skipping was often necessary simply to reduce the amount of data that the human observer had to juggle and interpret.

Using only a fraction of the sections is particularly common when ultra-thin sections are cut for viewing in an electron microscope instead of the light microscope. As the sections become thinner, they increase in number and are more prone to distortion. Some may be lost (for instance due to folding) or intentionally skipped. Portions of each section are obscured by the support grid, which also prevents some from being used. At higher magnification, the fiducial marks become larger, less precisely defined, and above all more widely spaced, so that they may not be in close proximity to the structure of interest.

Figure 4 shows a portion of a series of transmission electron microscope (TEM) images of tissue in which the 3D configuration of the membranes (dark, stained lines) is of interest. The details of the edges of cells and organelles have been used to approximately align pairs of sections through the stack, but different details must be used for different pairs as there is no continuity of detail through the entire stack. The membranes can be isolated in these images by thresholding (Figure 5), but the sections are too far apart to link the lines together to reconstruct the 3D shape of the surface. This problem is common with conventional serial section images.

Metallographic imaging typically uses reflected rather than transmitted light. As discussed in the next section, serial sectioning in this context is accomplished by removing layers of materials sequentially by physical polishing. The need to locate the same sample position after polishing, and to monitor the depth of polishing, can be met by placing hardness indentations on the sample, or by laser ablation of pits. These serve as fiduciary marks for alignment, and the change in size of the mark reveals the depth. In archaeological excavation the fiduciary marks may be a network of strings and a transit, and the removal tool may be a shovel. In some mining and quarrying examples it may be a bulldozer, but the principles remain the same regardless of scale.

Figure 4. Four serial section images from a stack (courtesy Dr. C. D. Bucana, University of Texas M. D. Anderson Cancer Center, Houston, TX), which have already been rotated for alignment. The membranes at the upper left corner of the images are thresholded and displayed for the entire stack of images in Figure 5.

The confocal microscope eliminates this extraneous light, and so produces useful optical section images without the need for processing. This is possible because the sample is imaged one point at a time (thus the presence of “scanning” in the name). The principle was introduced in Chapter 4 (Image Enhancement), in conjunction with some of the ways that images of light reflected from surfaces can be processed. The principle of the confocal microscope is that light from a point source (often a laser) is focused on a single point in the specimen and collected by an identical set of optics, reaching a pinhole detector. Any portion of the specimen away from the focal point, and particularly out of the focal plane, cannot return light through the pinhole to interfere with the formation of the image. Scanning the beam with respect to the specimen (by moving the light source, moving the specimen, or using scanning elements in the optical path) builds up a complete image of the focal plane.

If the numerical aperture of the lenses is high, the depth of field of this microscope is very small, although still several times the lateral resolution within individual image planes. Much more important, the portion of the specimen that is away from the focal plane contributes very little to the image. This makes it possible to image a plane within a bulk specimen, even one that would ordinarily be considered translucent because of light scattering. This method of isolating a single plane within a bulk sample, called optical sectioning, works because the confocal light microscope has a very shallow depth of field and a high rejection of stray light. Translating the specimen in the z direction and collecting a series of images makes it possible to build up a three-dimensional data set for viewing.

Several imaging modalities are possible with the confocal light microscope. The most common are reflected light, in which the light reflected from the sample returns through the same objective lens used to focus the incident light and is then diverted by a mirror to a detector, and fluorescence, in which light is emitted from points within the specimen and is recorded using the same geometry. It is also possible, however, to use the microscope to view transmitted light images using the geometry shown in Figure 6. This permits acquiring transmitted light images for focal plane sectioning of bulk translucent or transparent materials. Figure 7 shows an example of a transmitted light focal plane section.

Both transmitted and reflected-light images of focal plane sections can be used in 3D imaging for different types of specimens. The characteristic of reflected-light confocal images is that the intensity of light reflected to the detector drops off very rapidly as points are shifted above or below the focal plane. Therefore, for structures in a transparent medium, only the surfaces will reflect light.


Figure 6. Transmission confocal scanning light microscopy (CSLM) can be performed by passing the light through the specimen twice. Light is not imaged from points away from the in-focus point, which gives good lateral and excellent depth resolution compared to a conventional light microscope. (The more common reflected light confocal microscope omits the optics beneath the specimen.)

For any single image plane, only the portion of the field of view where some structure passes through the plane will appear bright, and the rest of the image will be dark. This characteristic permits some rather straightforward reconstruction algorithms.

A widely used imaging method for the confocal microscope is emission or fluorescence, in which the wavelength of the incident light is able to cause excitation of a dye or other fluorescing probe introduced to the specimen. The lower-energy (longer wavelength) light emitted by this probe is separated from the incident light, for instance by a dichroic mirror, and used to form an image in which the location of the probe or dye appears bright. Building up a series of images in depth allows the structure labeled by the probe to be reconstructed.

The transmitted-light mode, while it is the most straightforward in terms of optical sectioning, is little used as yet. This situation is partly due to the difficulties in constructing the microscope with matched optics above and below the specimen, as compared with the reflection and emission modes, in which the optics are only above it; however, the use of a lens and mirror beneath the specimen to return the light to the same detector used in the more standard microscope design can produce most of the same imaging advantages (the only loss is that, in passing through the specimen twice, some intensity is lost).

The principal advantages of optical sectioning are avoiding physical distortion of the specimen due to cutting, and having alignment of images from the various imaging planes. The depth resolution, although inferior to the lateral resolution in each plane by about a factor of two to three, is still useful for many applications. This difference in resolution, however, does raise some difficulties for 3D image processing, even if the distance between planes is made smaller than the resolution so that the stored voxels are cubic (which is by no means common).

By measuring or modeling the 3D shape of the microscope’s point spread function, it is possible by deconvolution to improve the resolution of the confocal light microscope. The method is identical to that shown in Chapter 5 for 2D images.
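
A common way to carry out such a deconvolution is a regularized (Wiener-style) inverse filter in the Fourier domain. The sketch below assumes a measured or modeled 3D PSF array and a hand-tuned noise constant k, and is only one of several possible approaches (iterative methods are also widely used); it is not presented as the specific method of Chapter 5.

```python
import numpy as np

def wiener_deconvolve(stack, psf, k=0.01):
    """Wiener-style deconvolution of a 3D image stack with a known 3D PSF.
    `k` regularizes against noise amplification (assumed; tune per data set)."""
    # Embed the PSF in a stack-sized array and roll its center to the origin
    # so that its FFT carries no phase shift.
    psf_full = np.zeros_like(stack, dtype=float)
    psf_full[tuple(slice(0, s) for s in psf.shape)] = psf
    psf_full = np.roll(psf_full, [-(s // 2) for s in psf.shape], axis=(0, 1, 2))
    H = np.fft.fftn(psf_full)
    G = np.fft.fftn(stack)
    F = G * np.conj(H) / (np.abs(H) ** 2 + k)   # regularized inverse filter
    return np.real(np.fft.ifftn(F))
```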

Sequential removal

Many materials are opaque and therefore cannot be imaged by any transmission method, preventing any type of optical sectioning. Indeed, metals, composites, and ceramics are usually examined in the reflected light microscope, although it is still possible to collect a series of depth images for 3D reconstruction by sequential polishing of such materials, as mentioned earlier.

The means of removal of material from the surface depends strongly on the hardness of the material. For some soft metals, polymers, and textiles, the microtome can be used just as for a block of biological material, except that instead of examining the slice of material removed, the surface left behind is imaged. This approach avoids most problems of alignment and distortion, especially if the cutting can be done in situ without removing the specimen from the viewing position.

Figure 7. CSLM image showing a 1/30-second image of a paramecium swimming in a droplet of water, as it passed through the focal plane of the microscope.

The ability to image many different elements with the SIMS creates a rich data set for 3D display. A color 2D image has three channels (whether it is saved as RGB or HSI, as discussed in Chapter 1), and satellite 2D images typically have as many as seven bands, including infrared. The SIMS may have practically any number. The ability of the instrument to detect trace levels (typically ppm or better) of every element or even isotope in the periodic table, plus molecular fragments, means that even for relatively simple specimens the multiband data present a challenge to store, display, and interpret.

Another type of microscope that removes layers of atoms as it images them is the atom probe ion microscope. In this instrument, a strong electrical field between a sharply curved sample tip and a display screen causes atoms to be desorbed from the surface and accelerated toward the screen where they are imaged. The screen may include an electron channel plate to amplify the signal so that individual atoms can be seen, or may be used as a time-of-flight mass spectrometer with pulsed application of the high voltage so that the different atom species can be distinguished. With any of the instrument variations, the result is a highly magnified image of atoms from the sample, showing atom arrangements in 3D as layer after layer is removed.

Examples of images from all these types of instruments were shown in Chapter 1.

Stereo measurement

There remains another way to see 3D structures. It is the same way that humans see depth in some real-world situations — having two eyes that face forward so that their fields-of-view overlap permits us to use stereoscopic vision to judge the relative distance to objects. In humans, this is done point by point, by moving our eyes in their sockets to bring each subject to the fovea, the portion of the retina with the densest packing of cones. The muscles in turn tell the brain what motion was needed to achieve convergence, and so we know whether one object is closer or farther than another.

Further into this section, we will see stereo vision used as a means to transmit 3D data to the human viewer. It would be wrong to think that all human depth perception relies on stereoscopy. In fact, much of our judgment about the 3D world around us comes from other cues such as shading, relative size, precedence, atmospheric effects (e.g., fog or haze), and motion flow (nearer objects move more in our visual field when we move our head) that work just fine with one eye and are used in some computer-based measurement methods (Roberts, 1965; Horn, 1970, 1975; Woodham, 1978; Carlsen, 1985; Pentland, 1986). For the moment, however, let us see how stereoscopy can be used to determine depth information to put into a 3D computer database.

The light microscope has a rather shallow depth of field, which is made even less in the confocal scanning light microscope discussed previously. Consequently, looking at a specimen with deep relief is not very satisfactory except at relatively low magnifications; however, the electron microscope has lenses with very small aperture angles, and thus has very great depth of field. Stereoscopy is most commonly used with the scanning electron microscope (SEM) to produce in-focus images of rough surfaces. Tilting the specimen, or electromagnetically deflecting the scanning beam, can produce a pair of images from different points of view that form a stereo pair. Looking at one picture with each eye fools the brain into seeing the original rough surface.

Measuring the relief of surfaces from such images is the same in principle and in practice as using stereo pair images taken from aircraft or satellites to measure the elevation of topographic features on the earth or another planet. The richer detail in the satellite photos makes it easier to find matching points practically anywhere in the images, but by the same token requires more matching points to define the surface than the simpler geometry of typical specimens observed in the SEM. The mathematical relationship between the measured parallax (the apparent displacement of points in the left and right eye image) and the relative elevation of the two points on the surface was presented in Chapter 1.

Automatic matching of points from stereo pairs is a difficult task for computer-based image analysis (Marr and Poggio, 1976; Medioni and Nevatia, 1985; Kayaalp and Jain, 1987). It is usually performed by using the pattern of brightness values in one image, for instance the left one, as a template to perform a cross-correlation search for the most nearly identical pattern in the right image. The area of search is restricted by the possible displacement, which depends on the angle between the two views and the maximum roughness of the surface, to a horizontal band in the second image. Some points will not be matched by this process because they may not be visible in both images (or are lost off the edges of one or the other image). Other points will match poorly because the local pattern of brightness values in the pixels includes some noise, and several parts of the image may have similar noise levels.
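
A minimal sketch of that template search, for a single point: slide a small window from the left image along the same row of the right image and keep the displacement with the highest normalized cross-correlation. The window size and maximum disparity are assumptions, and the point is taken to lie far enough from the image borders.

```python
import numpy as np

def match_point(left, right, y, x, win=7, max_disp=40):
    """Return the horizontal disparity of point (y, x) and its match score,
    found by normalized cross-correlation along a horizontal search band."""
    h = win // 2
    tpl = left[y - h:y + h + 1, x - h:x + h + 1].astype(float)
    tpl -= tpl.mean()
    best_d, best_score = 0, -np.inf
    for d in range(max_disp + 1):            # search band: same row, shifted
        xx = x - d
        if xx - h < 0:
            break                            # ran off the edge of the image
        cand = right[y - h:y + h + 1, xx - h:xx + h + 1].astype(float)
        cand -= cand.mean()
        denom = np.sqrt((tpl ** 2).sum() * (cand ** 2).sum())
        score = (tpl * cand).sum() / denom if denom > 0 else -np.inf
        if score > best_score:
            best_score, best_d = score, d
    return best_d, best_score
```

A low best score flags exactly the unmatched and poorly matched points described above.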

Matching many points produces a new image in which each pixel can be given a value based on the parallax, and hence represents the elevation of the surface. This range image will contain many false matches, but operations such as a median filter usually do a good job of removing the outlier points to produce an overall range image of the surface. In the example of Figure 8, cross-correlation matching of every point in the left-eye view with points in the right produces a disparity map (the horizontal distance between the location of the matched points) that contains false matches, which are filled in by a median filter as shown. The resulting elevation data can be used for measurement of points or line profiles, or used to reconstruct surface images, as illustrated. This use of surface range data is discussed further in Chapter 13.
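
The outlier removal is a one-liner with a standard median filter; the kernel size and the synthetic disparity map below are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import median_filter

disparity = np.random.normal(10.0, 0.5, (128, 128))   # stand-in disparity map
disparity[np.random.rand(128, 128) < 0.02] = 60.0     # scattered false matches
cleaned = median_filter(disparity, size=5)            # outliers replaced by the
                                                      # local median
```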

Figure 8. Elevation measurement using stereo pair images: (a, b) left and right eye views of a microfossil; (c) raw cross-correlation disparity values; (d) median filter applied to c; (e) surface height values measured from a mean plane, displayed as grey-scale values; (f) rendered perspective-corrected surface model using the values in e; (g) the same surface reconstruction using elevation values from e with surface brightness values from a.

Presenting the images to a human viewer’s eyes so that two pictures acquired at different times can be fused in the mind and examined in depth is not difficult. It has been accomplished for years photographically, and is now often done, with a modest tradeoff in lateral resolution, using a computer to record and display the images. The methods discussed in the following paragraphs, which use stereo pair displays to communicate 3D information from generated images, are equally applicable here.

It is far more difficult, however, to have the computer determine the depth of features in the structure and construct a 3D database of points and their relationship to each other. Part of the problem is that so much background detail is available from the (mostly) transparent medium surrounding the features of interest that it may dominate the local pixel brightnesses and make matching impossible. Another part of the problem is that it is no longer possible to assume that points maintain their order from left to right. In a 3D structure, points may change their order as they pass in front of or in back of each other.

The consequence of these limitations has been that only in a very few, highly idealized cases has automatic fusion of stereo pair images from the TEM been attempted successfully. Simplification of the problem using very high contrast markers, such as small gold particles bound to selected surfaces using antibodies, or some other highly selective stain, helps. In this case, only the markers are considered. Only a few dozen of these exist, and similar to the interesting points mentioned previously for mapping surfaces, they are easily detected (being usually far darker than anything else in the image) and only a few could possibly match.

Even with these markers, a human may still be needed to identify the matches. Given the matching points in the two images, the computer can construct a series of lines that describe the surface which the markers define, but this surface may be only a small part of the total structure. Figure 11 shows an example of this method in which human matching was performed. Similar methods can be applied to stained networks (Huang et al., 1994), or the distribution of precipitate particles in materials, for example.

Figure 10. Interpolation of a range image: (a) isolated, randomly arranged points with measured elevation (color coded); (b) contour lines drawn through the tessellation; (c) smooth interpolation between contour lines; (d) the constructed range image (grey scale).

In most matching procedures, the points in left and right images are defined in terms of pixel addresses. The error in the vertical dimension determined by stereoscopy is typically an order of magnitude greater than the precision of measurement of the parallax, because the vertical height is proportional to the lateral parallax times the cosecant of the small angle between the views. Improving the measurement of parallax between features to subpixel accuracy is therefore of considerable interest. Such improvement is possible in some cases, particularly when information from many pixels can be combined. As described in Chapter 9, the centroids of features or the location of lines can be specified to an accuracy of one-tenth of a pixel or better.
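
In symbols, writing p for the measured parallax and theta for the small angle between the two views, the proportionality stated above is

    h ∝ p · csc(theta) = p / sin(theta)

For an assumed tilt of theta = 6 degrees (an illustrative value, not taken from the text), csc(theta) is about 9.6, so an uncertainty of 0.1 pixel in the parallax becomes roughly one pixel of uncertainty in height. That is the order-of-magnitude penalty just described, and the reason subpixel parallax measurement pays off.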

3D data sets

In the case of matching of points between two stereo pair images, the database is a list of a few hundred or perhaps thousands of coordinates that usually define either a surface or perhaps nodes in a network structure. If these points are to be used for measurement, the coordinates, and perhaps some information on which points are connected to which, are all that is required. If image reconstruction is intended, it will be necessary to interpolate additional points between them to complete a display. This is somewhat parallel to the use of boundary representation in two dimensions. It may offer a very compact record of the essential (or at least of the selected) information in the image, but it requires expansion to be visually useful to the human observer.

The most common way to store 3D data sets is as a series of 2D images. Each single image, which we have previously described as an array of pixels, is now seen to have depth. This depth is present either because the plane is truly an average over some depth of the sample (as in looking through a thin section in the light microscope) or based on the spacing between that plane and the next (as for instance a series of polished planes observed by reflected light). Because of the depth associated with the planes, we refer to the individual elements as voxels (volume elements) rather than pixels (picture elements).
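
In code, such a data set is naturally a single 3D array built by stacking the slice images. The sketch below assumes a numbered series of image files (the name pattern, the imageio reader, and the spacing values are all illustrative) and records the anisotropic voxel size rather than pretending the voxels are cubic.

```python
import glob
import numpy as np
from imageio.v3 import imread   # any 2D image reader would do

# Hypothetical file series: slice_000.png, slice_001.png, ...
files = sorted(glob.glob("slice_*.png"))
volume = np.stack([imread(f) for f in files], axis=0)   # shape (z, y, x)

# Voxels are usually elongated in z; keep the spacing with the data.
voxel_size_um = (5.0, 0.5, 0.5)   # (z, y, x) spacing -- illustrative values
```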

For viewing, processing, and measurement, the voxels will ideally be regular and uniformly spaced. This goal is often accomplished with a cubic array of voxels, which is easiest to address in computer memory and corresponds to the way that some image acquisition devices function (e.g., the confocal scanning light microscope). Other arrangements of voxels in space offer some advantages. In a simple cubic arrangement, the neighboring voxels are at different distances from the central voxel, depending on whether they share a face, edge, or corner. Deciding whether voxels touch and are part of the same feature requires a decision to include 6-, 18-, or 26-neighbor connectedness, even more complicated than the 4- or 8-connectedness of square pixels in a 2D image, discussed in Chapter 6.
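
The three connectedness rules correspond to how many coordinates a neighbor may differ in: one (shared face), up to two (face or edge), or up to three (face, edge, or corner). A small sketch that generates the neighbor offsets makes the counts explicit:

```python
from itertools import product

def neighbor_offsets(connectivity):
    """Voxel-neighbor offsets for 6-, 18-, or 26-connectivity. A neighbor
    differing in n coordinates shares a face (n=1), edge (n=2), or corner (n=3)."""
    max_n = {6: 1, 18: 2, 26: 3}[connectivity]
    return [(dz, dy, dx)
            for dz, dy, dx in product((-1, 0, 1), repeat=3)
            if 0 < abs(dz) + abs(dy) + abs(dx) <= max_n]

assert len(neighbor_offsets(6)) == 6    # 6 faces
assert len(neighbor_offsets(18)) == 18  # 6 faces + 12 edges
assert len(neighbor_offsets(26)) == 26  # 6 faces + 12 edges + 8 corners
```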

Figure 11. Example of decorating a surface with metal particles (Golgi stain) shown in transmission electron micrographs (Peachey and Heath, 1989), with elevations that are measured stereoscopically to form a network describing the surface (b).

lossy compression methods (discussed in Chapter 2) argues against their use. For 3D image stacks compressed using MPEG methods, the artefacts in the z-direction (interpreted as time) are even worse than those in each individual x-y plane.

It is instructive to compare this situation to that of computer-aided design (CAD). For manmade objects with comparatively simple geometric surfaces, only a tiny number of point coordinates are required to define the entire 3D structure. This kind of boundary representation is very compact, but it often takes some time (or specialized display hardware) to render a drawing with realistic surfaces from such a data set. For a voxel image, the storage requirements are great, but information is immediately available without computation for each location, and the various display images shown in this chapter can usually be produced very quickly (sometimes even at interactive speeds) by modest computers.

For instance, given a series of surfaces defined by boundary representation or a few coordinates, the generation of a display may proceed by first constructing all of the points for one plane, calculating the local angles of the plane with respect to the viewer and light source, using those to determine a brightness value, and plotting that value on the screen. At the same time, another image memory is used to store the actual depth (z-value) of the surface at that point. After one plane is complete, the next one is similarly drawn, except that the depth value is compared point by point to the values in the z-buffer to determine whether the plane is in front of or behind the previous values. Of course, each point is only drawn if it lies in front. This procedure permits multiple intersecting planes to be drawn on the screen correctly. (For more information on graphic presentation of 3D CAD data, see Foley and Van Dam, 1984; or Hearn and Baker, 1986.)
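
The z-buffer logic itself is only a comparison per point. The sketch below (the names and the two demonstration planes are illustrative) draws surface points only where they lie in front of whatever has already been plotted, which is what lets intersecting planes come out correctly:

```python
import numpy as np

def draw_points(frame, zbuf, ys, xs, zs, shades):
    """Plot shaded surface points with hidden-surface removal via a z-buffer
    (smaller z = closer to the viewer)."""
    for y, x, z, s in zip(ys, xs, zs, shades):
        if z < zbuf[y, x]:     # in front of the previously drawn planes?
            zbuf[y, x] = z     # remember the new depth ...
            frame[y, x] = s    # ... and draw the brightness value

h, w = 256, 256
frame = np.zeros((h, w))
zbuf = np.full((h, w), np.inf)          # everything starts infinitely far away
yy, xx = np.mgrid[0:h, 0:w]

# Two intersecting tilted planes; whichever is nearer at each pixel wins.
draw_points(frame, zbuf, yy.ravel(), xx.ravel(),
            (0.5 * xx).ravel(), np.full(h * w, 0.8))
draw_points(frame, zbuf, yy.ravel(), xx.ravel(),
            (0.5 * (w - xx)).ravel(), np.full(h * w, 0.4))
```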

Additional logic is needed to clip the edges of the planes to the stored boundaries, to change the reflectivity rules used to calculate brightness depending on the surface characteristics, and so forth. Standard texts on computer graphics describe algorithms for accomplishing these tasks and devote considerable space to the relative efficiency of various methods, because the time involved can be significant. By comparison, looking up the value in a large array, or even running through a column in the array to add densities or find the maximum value, is very fast. This is particularly true if the array can be held in memory rather than requiring disk access.

The difficulties of aligning sequential slices to produce a 3D data set were discussed above. In many cases, there may be several 3D data sets obtained by different imaging techniques (e.g., magnetic resonance imaging (MRI), x-ray, and PET images of the head) which must be aligned to each other. They also commonly have different resolutions and voxel sizes, so that interpolation is needed to adjust them to match one another. The situation is a direct analog to the 2D problems encountered in geographical information systems (GIS) in which surface maps, images in different wavelengths from different satellites, aerial photographs and other information must be aligned and combined.

The general problem is usually described as one of registering the multiple data sets. The two principal techniques, which are complementary, are to use cross-correlation methods on the entire pixel or voxel array as discussed in Chapter 5, or to isolate specific features in the multiple images and use them as fiducial marks to perform warping (Brown, 1992; Besl, 1992; van den Elsen et al., 1993, 1994, 1995; Reddy and Chatterji, 1996; Frederik et al., 1997; West et al., 1997).
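
For the first technique, a purely translational misregistration can be recovered by cross-correlating the two arrays with FFTs and locating the correlation peak. The sketch below handles translation only (shown in 2D, though the same code works on 3D arrays); rotation, scaling, and local warping need the feature-based approach.

```python
import numpy as np

def register_translation(a, b):
    """Integer-pixel shift that aligns `b` to `a`, from the peak of their
    FFT-computed cross-correlation."""
    A = np.fft.fftn(a - a.mean())
    B = np.fft.fftn(b - b.mean())
    corr = np.real(np.fft.ifftn(A * np.conj(B)))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks beyond the midpoint are negative shifts (FFT wrap-around).
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, corr.shape))

# Usage: shift = register_translation(a, b)
#        aligned = np.roll(b, shift, axis=(0, 1))
```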

Slicing the data set

Most 3D image data sets are actually stored as a series of 2D images, so it is very easy to access any of the individual image planes, usually called slices. Playing the series of slices back in order to create an animation or “movie” is perhaps the most common tool available to let the user view the data. It is often quite effective in letting the viewer perform the 3D integration, and as it recapitulates the way the images may have been acquired (but with a much compressed time base), most viewers can understand images presented in this way. A simple user interface need only allow the viewer to vary the speed of the animation, change direction, or stop at a chosen slice, for example. The same software now widely available to play back movies on the computer screen can be used for this purpose.

One problem with presenting the original images as slices of the data is that the orientation of some features in the 3D structure may not show up very well in the slices. It is useful to be able to change the orientation of the slices to look at any plane through the data, either in still or animated playback. This change in orientation is quite easy to do as long as the orientation of the slices is parallel to the x-, y-, or z-axes in the data set. If the depth direction is understood as the z-axis, then the x- and y-axes are the horizontal and vertical edges of the individual images. If the data are stored as discrete voxels, then accessing the data to form an image on planes parallel to these directions is just a matter of calculating the addresses of voxels using offsets to the start of each row and column in the array. This addressing can be done at real-time speeds if the data are held in memory, but is somewhat slower if the data are stored on a disk drive, because the voxels that are adjacent along scan lines in the original slice images are stored contiguously on disk and can be read as a group in a single pass. When a different orientation is required, however, the voxels must be located at widely separated places in the file, and it takes time to move the head and wait for the disk to rotate.
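
The address arithmetic is worth seeing once. For voxels stored plane by plane in one contiguous block, the three orthogonal orientations are just three different stride patterns, and numpy's slicing performs exactly the offset computation described above:

```python
import numpy as np

nz, ny, nx = 40, 256, 256
volume = np.arange(nz * ny * nx, dtype=np.uint32).reshape(nz, ny, nx)

xy = volume[12, :, :]   # an original slice: one contiguous run in the file
xz = volume[:, 80, :]   # each row contiguous, but the rows are far apart
yz = volume[:, :, 100]  # every voxel comes from a different scan line

# The same lookup written with explicit offsets, as in the text:
flat = volume.ravel()
z, y, x = 12, 80, 100
assert flat[z * (ny * nx) + y * nx + x] == volume[z, y, x]
```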

Displaying an image in planes parallel to the x-, y-, and z-axes was introduced in Chapter 11. Figure 12 shows another example of orthogonal slices. The images are MRIs of a human head. The views are generally described as transaxial (perpendicular to the subject’s spine), sagittal (parallel

Figure 12. A few slices from a complete set of MRI head scan data. Images a through c show transaxial sections (3 from a set of 46), images d and e are coronal sections (2 from a set of 42), and f is a sagittal section (1 from a set of 30).


on plane images obtained in the transaxial direction. The poorer resolution in the z direction is evident, but still the overall impression of 3D structure is quite good. These views can also be animated, by moving one (or several) of the planes through the data set while keeping the other orthogonal planes fixed to act as a visual reference.

Unfortunately, there is no good way to demonstrate this time-based animation in a print medium. Once upon a time, children’s cartoon books used a “flip” mode with animation printed on a series of pages that the viewer could literally flip or riffle through at a fast enough rate to cause flicker-fusion in the eye and see motion. That form of animation takes a lot of pages and is really only good for very simple images such as cartoons. It is unlikely to appeal to the publishers of books and technical journals. All that can really be done here is to show a few of the still images from such a sequence and appeal to the reader’s imagination to supply a necessarily weak impression of the effect of a live animation. Many online Web sites can show such animations as QuickTime® movies.

Figure 16 shows a series of images that can be used to show “moving pictures” of this kind. They are actually a portion of the series of images which Eadweard Muybridge recorded of a running horse by setting up a row of cameras that were tripped in order as the moving horse broke threads attached to the shutter releases. His purpose was to show that the horse’s feet were not always in contact with the ground (in order to win a wager for Leland Stanford, Jr.). The individual still pictures show that. Once it was realized that such images could be viewed in sequence to recreate the impression of smooth motion, our modern motion picture industry (and ultimately television) became possible.

It is difficult to show motion using printed images in books. There is current interest in the use of videotape or compact disk for the distribution of technical papers, which will perhaps offer a medium that can use time as a third axis to substitute for a spatial axis and show 3D structure through motion. The possibilities will be mentioned again in connection with rotation and other time-based display methods.

Motion, or a sequence of images, is used to show multidimensional data in many cases. “Flipping” through a series of planes provides a crude method of showing data sets that occupy three spatial dimensions. Another effective animation shows a view of an entire 3D data set while varying the opacity of the voxels. Even for a data set that occupies two spatial dimensions, transitions between many kinds of information may be used effectively. Figure 17 shows this multiplicity with weather data, showing temperature, wind velocity, and other parameters displayed on a map of the U.S. In general, displays that utilize a 2D map as an organizing basis for multidimensional data, such as road networks, geological formations, and so on (called GIS), have many types of data and can only display a small fraction of it at any one time.

Figure 15. Several views of the MRI head data from Figure 12 along section planes normal to the axes of the voxel array. The voxels were taken from the transaxial slices, and so the resolution is poorer in the direction normal to the planes than in the planes.

Of course, time itself is also a valid third dimension, and the acquisition of a series of images in rapid succession to study changes in structure or composition with time can employ many of the same analytical and visualization tools as images covering three space dimensions. Figure 18 shows a series of images recorded at video rate (30 frames per second) from a confocal light microscope. Such data sets can be assembled into a cube in which the z direction is time, and changes studied by sectioning this volume in planes along the z direction, or viewed volumetrically, or as a time sequence.

Figure 16. A few of the series of historic photographs taken by Eadweard Muybridge to show the motion of a running horse. Viewed rapidly in succession, these create the illusion of continuous motion.