4.5 Preparing the digital image
As discussed in chapter 2, early X-rays were captured on analog film, and some museums have large collections of such analog images. Digitizing these analog X-rays for computer analysis is essentially the same procedure as digitizing a photograph taken on analog film. A digital scanner analyzes the surface of the image or photograph and converts it into a two-dimensional array of numbers that can be visualized as a digital image. For example, figure  shows a simple image laid out as a 5 x 5 grid where each number (on the left) corresponds to the shading of a single pixel region in the image.
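As a minimal sketch of this idea, a grey-scale image can be held as a grid of numbers, following the convention that values near zero are black and values near one are white. The pixel values below and the names `image`, `SHADES`, and `render` are hypothetical illustrations, not the values in the figure.

```python
# Hypothetical 5 x 5 grey-scale image: 0.0 is black, 1.0 is white.
image = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.5, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

# Characters ordered from dark to light, used as a crude text "display".
SHADES = "#+. "


def render(img):
    """Return a text rendering with one character per pixel."""
    rows = []
    for row in img:
        chars = []
        for value in row:
            # Map a value in [0, 1] onto an index into SHADES.
            index = min(int(value * len(SHADES)), len(SHADES) - 1)
            chars.append(SHADES[index])
        rows.append("".join(chars))
    return "\n".join(rows)


print(render(image))
```

Each inner list is one row of pixels, so `image[2][2]` is the mid-grey pixel at the center of the grid.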
A key parameter is the size of each pixel, which is typically specified as a resolution in dots per inch (dpi). Many of the X-rays of Vermeer’s canvases are digitized at 600 dpi, which means that each square inch is divided into a grid of 600 by 600 pixels (about 236 by 236 pixels per square cm, since 1 inch is 2.54 cm). This parameter matters because smaller pixels allow better resolution of small features. For example, suppose a canvas was woven with a thread density of 18 threads per cm. With 236 pixels per cm to represent these 18 threads, each thread spans approximately 13 pixels. Is this enough? Figure  shows the same patch of a canvas X-ray sampled at several different dpi resolutions.
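The arithmetic above can be sketched in a few lines. The function names `pixels_per_cm` and `pixels_per_thread` are hypothetical helpers introduced for illustration. The thread density of 18 threads per cm is the example value used in the text.

```python
CM_PER_INCH = 2.54


def pixels_per_cm(dpi: float) -> float:
    """Convert a scan resolution in dots per inch to pixels per cm."""
    return dpi / CM_PER_INCH


def pixels_per_thread(dpi: float, threads_per_cm: float) -> float:
    """Average number of pixels spanning one thread at the given resolution."""
    return pixels_per_cm(dpi) / threads_per_cm


# Compare several scan resolutions for a canvas with 18 threads per cm.
for dpi in (600, 300, 150, 75, 37.5):
    print(f"{dpi:>6} dpi: {pixels_per_cm(dpi):6.1f} px/cm, "
          f"{pixels_per_thread(dpi, 18):5.2f} px per thread")
```

At the lowest resolution in this list a single pixel is wider than a thread, so the weave cannot be represented at all.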
Since digitization is needed whenever images are to be represented in computers, it has been the subject of considerable study.1 For canvases of the seventeenth to the nineteenth century, 600 dpi provides a reasonable compromise between adequate resolution and file size. Observe that both of the interfaces in the figures in § 4.3 and § 4.4 have boxes where the image resolution is specified. Besides acquiring the images at an adequate resolution, it is also necessary to record the actual resolution of the scanner in order to make sense of the measurements. For example, if an image were digitized at 600 dpi but mislabeled as 300 dpi, every feature would be reported at twice its true size, and so a thread count of 18 threads/cm would be miscalculated as 9 threads/cm. If the dpi is not recorded properly, it may still be possible to estimate it if the image contains a ruler or some other feature of known size, but this adds an extra step and introduces an extra source of error into the counting procedure. Some scanners add resolution information to the metadata associated with the scan, but it is poor practice to rely on metadata in the digital file, as this can be easily lost or corrupted when the file is loaded into and saved from editing software.
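The effect of a mislabeled resolution, and the recovery of the resolution from a ruler, can be sketched as follows. The function names and the ruler measurement (a 10 cm ruler spanning 2362 pixels) are hypothetical values chosen for illustration. This is not the procedure used by the software described in § 4.3 and § 4.4.

```python
CM_PER_INCH = 2.54


def threads_per_cm(thread_spacing_px: float, assumed_dpi: float) -> float:
    """Thread density implied by a thread spacing measured in pixels,
    given the resolution the image is assumed to have."""
    return (assumed_dpi / CM_PER_INCH) / thread_spacing_px


def estimate_dpi(length_px: float, length_cm: float) -> float:
    """Estimate scan resolution from a feature of known physical length."""
    return length_px / length_cm * CM_PER_INCH


# Spacing in pixels between threads for 18 threads/cm scanned at 600 dpi.
spacing_px = (600 / CM_PER_INCH) / 18

correct = threads_per_cm(spacing_px, assumed_dpi=600)     # label matches scan
mislabeled = threads_per_cm(spacing_px, assumed_dpi=300)  # label is wrong
print(f"labeled 600 dpi: {correct:.1f} threads/cm")
print(f"labeled 300 dpi: {mislabeled:.1f} threads/cm")

# Hypothetical ruler: 10 cm measured as 2362 pixels in the image.
print(f"estimated resolution: {estimate_dpi(2362, 10.0):.0f} dpi")
```

Any error in measuring the ruler in pixels carries directly into the estimated dpi and from there into the thread count, which is the extra source of error mentioned above.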
In grey-scale digital images (such as X-rays), numbers near zero typically represent black, while numbers near one typically represent white, and intermediate values represent varying shades of grey.
A 1x1 cm portion of canvas from The Milkmaid is sampled at 600 dpi, 300 dpi, 150 dpi, 75 dpi, and 37.5 dpi. Maintaining adequate resolution is crucial, since the features of interest (the visual impression of the threads) are missing from the under-sampled images.
1 One of the major results in this field is the sampling theorem of Nyquist, which says that a signal must be sampled at a rate that is at least twice the highest frequency it contains. For example, if the threads were exactly sinusoidal and at a density of 18 threads per cm, then at least 36 pixels per cm would be required. Since the threads are not sinusoids, and since there may well be noise or other forms of distortion in the image, a good rule of thumb is to sample at 5 to 10 times the maximum expected frequency. For real canvases, 300 dpi (about 118 pixels per cm) may be borderline, and so 600 dpi is a preferred value for the digitization of canvas.
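The numbers in this note can be checked with a short calculation. This is a sketch under the idealized assumption that threads behave like a single sinusoid at the thread density. The function names `required_dpi` and `samples_per_thread` are hypothetical.

```python
CM_PER_INCH = 2.54


def required_dpi(threads_per_cm: float, samples: float = 2.0) -> float:
    """Scan resolution (dpi) that gives `samples` pixels per thread.
    The default of 2 corresponds to the Nyquist minimum."""
    return threads_per_cm * samples * CM_PER_INCH


def samples_per_thread(dpi: float, threads_per_cm: float) -> float:
    """Number of pixels per thread at a given scan resolution."""
    return dpi / CM_PER_INCH / threads_per_cm


density = 18.0  # threads per cm, the example value from the text
nyquist_dpi = required_dpi(density)   # 2 samples per thread
rule_low = required_dpi(density, 5)   # lower end of the rule of thumb
rule_high = required_dpi(density, 10)  # upper end of the rule of thumb
print(f"Nyquist minimum: {nyquist_dpi:.1f} dpi")
print(f"rule of thumb:   {rule_low:.1f} to {rule_high:.1f} dpi")

for dpi in (600, 300, 150, 75, 37.5):
    s = samples_per_thread(dpi, density)
    print(f"{dpi:>6} dpi: {s:5.2f} samples per thread, "
          f"meets Nyquist minimum: {s >= 2.0}")
```

Under these assumptions 300 dpi falls inside the 5 to 10 times range for 18 threads per cm but would drop toward its lower end for finer weaves, which is consistent with treating it as borderline.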