What is it, and why do we need it?
By Anthony Glynn
In order to correctly infer the position of a camera, and to display the graphics properly, an AR or SLAM system needs to know about a few properties of the camera and lens configuration being used. When these values are known, the system is said to be using a calibrated camera. Camera calibration is the process by which we are able to determine the parameters of the camera.
In order to understand the function of the camera parameters, it helps to consider how the vision system models the process of projecting a 3-dimensional scene into a 2-dimensional image. This idealised model is known as the Pinhole Camera Model.
The Pinhole Camera Model
The construction of the Pinhole Camera Model closely resembles that of the camera obscura and early lensless cameras. The model consists of two surfaces, one in front of the other. The front surface contains an infinitely small hole, or aperture, and blocks almost all the light from reaching the surface behind it. Any light hitting the rear surface must have passed through the aperture, which results in a perfectly focused image on the rear surface that has been flipped both horizontally and vertically. Photographic paper or a digital image sensor would be placed on the rear surface to capture the projected image.
The size of the projected image on the rear surface, and correspondingly, how much of the scene is able to be captured by an image sensor of a fixed size, is dependent on how far the two surfaces are from each other. This distance between the surfaces is known as the focal length of the camera, and is arguably the single most important camera parameter.
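The relationship between focal length and projected image size follows from similar triangles. The sketch below is a minimal illustration, not part of any particular library: the function name `project_pinhole` and the coordinate convention (camera at the origin, looking along the positive Z axis, with the rear surface at a distance `focal_length` behind the aperture) are assumptions made for this example.

```python
def project_pinhole(point, focal_length):
    """Project a 3D point onto the rear surface of an ideal pinhole camera.

    Assumed convention (hypothetical, for illustration only): the aperture
    is at the origin, the scene lies along +Z, and the rear surface sits
    `focal_length` units behind the aperture.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the aperture (Z > 0)")
    # Similar triangles give a scale factor of focal_length / Z.
    # The negative sign models the horizontal and vertical flip.
    scale = -focal_length / Z
    return (scale * X, scale * Y)


if __name__ == "__main__":
    point = (1.0, 2.0, 4.0)
    # Doubling the focal length doubles the size of the projected image.
    print(project_pinhole(point, 2.0))
    print(project_pinhole(point, 4.0))
```

Many vision systems drop the negative sign by imagining a virtual image plane in front of the aperture, which gives the same image without the flip.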
Keeping the image sensor size fixed, but changing the focal length of our model, will affect the camera's field of view. The field of view is the maximum angle between the light rays coming through the aperture that will be captured by the image sensor.
To change the field of view, we could alternatively keep the focal length fixed and change the size of the image sensor. The field of view in fact depends only on the ratio between the sensor size and the focal length. Camera calibration is only able to infer this ratio, so it is usual to hold the sensor size fixed (either at unit length, or at its size in pixels) and compute the focal length with respect to this during the calibration process.
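The dependence of the field of view on the ratio of sensor size to focal length can be written down directly from the pinhole geometry. The following is a small sketch under that idealised model. The function names are hypothetical, and the sensor size and focal length only need to share the same units (for example, both in pixels).

```python
import math


def field_of_view(sensor_size, focal_length):
    """Field of view in radians along one axis of an ideal pinhole camera."""
    # Half the sensor subtends an angle of atan((sensor_size / 2) / focal_length).
    return 2.0 * math.atan(sensor_size / (2.0 * focal_length))


def focal_length_from_fov(sensor_size, fov):
    """Inverse of field_of_view: the focal length in the units of sensor_size."""
    return sensor_size / (2.0 * math.tan(fov / 2.0))


if __name__ == "__main__":
    # Only the ratio matters: a 2-unit sensor at focal length 1 has the same
    # field of view as a 640-pixel-wide sensor at a focal length of 320 pixels.
    print(math.degrees(field_of_view(2.0, 1.0)))
    print(math.degrees(field_of_view(640.0, 320.0)))
```

Expressing the focal length in pixels, with the sensor size fixed to the image width or height, is the convention this kind of computation usually assumes.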
The Pinhole Camera Model is useful as an idealised model of projection, but unfortunately it cannot be constructed in practice. An infinitely small aperture would require an infinite amount of exposure time to capture a sensible amount of light. Even if this were feasible, it would still make the system highly susceptible to motion blur. In practice, the aperture has to be large enough to let in enough light within a reasonable timeframe. A larger aperture on its own would result in a blurry image, so a lens has to be added in order to focus the light.
The additional parameters that are usually provided to any Computer Vision system help deal with the deviation of real world cameras from the idealised Pinhole Camera Model. Due to the addition of a lens, even the computed focal length may not reflect the physical distance between the aperture and the image sensor, and may even end up being different along the horizontal and vertical axes.
The centre of the aperture may not be perfectly aligned with the centre of the image sensor. The location of the aperture centre in the resulting image is known as the principal point. In practice this parameter is often difficult to infer through calibration, but it is usually possible to get away with setting the principal point to be the centre of the image.
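To make the role of these parameters concrete, the sketch below maps a 3D point to pixel coordinates using separate horizontal and vertical focal lengths and a principal point. The names `fx`, `fy`, `cx` and `cy` follow a common convention but are assumptions here, as is the use of an unflipped virtual image plane in front of the aperture. Lens distortion is ignored.

```python
def default_principal_point(width, height):
    """Fallback principal point: the centre of the image, in pixels."""
    return (width / 2.0, height / 2.0)


def project_to_pixels(point, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to pixel coordinates.

    fx, fy: focal lengths in pixels along the horizontal and vertical axes.
    cx, cy: principal point in pixels.
    (All names are illustrative; no lens distortion is modelled.)
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return (u, v)


if __name__ == "__main__":
    cx, cy = default_principal_point(640, 480)
    # A point on the optical axis lands exactly on the principal point.
    print(project_to_pixels((0.0, 0.0, 1.0), 500.0, 520.0, cx, cy))
    print(project_to_pixels((0.25, -0.5, 2.0), 500.0, 520.0, cx, cy))
```

Using different values for `fx` and `fy` is how a model of this kind accommodates a focal length that differs between the horizontal and vertical axes.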
The lens in a real world camera may also end up distorting the resulting image. Lines that are straight in the 3-dimensional scene could end up appearing curved in the 2-dimensional projection. This distortion has to be dealt with. Some Computer Vision systems are able to incorporate a distortion model, in which case distortion parameters have to be provided to the system. Alternatively, the images can be corrected for distortion as a preprocessing step. Camera calibration tools will usually compute distortion parameters as part of the calibration process.
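As an illustration of what a distortion model can look like, the sketch below uses a simple two-coefficient polynomial radial model. This is an assumption chosen for the example: the text above does not name a specific model, and real calibration tools may use different or richer ones. The coefficients `k1` and `k2` and both function names are hypothetical, and the coordinates are normalised image coordinates (that is, X/Z and Y/Z before the focal length and principal point are applied).

```python
def distort(x, y, k1, k2):
    """Apply radial distortion to a normalised image point (illustrative model)."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * scale, y * scale)


def undistort(xd, yd, k1, k2, iterations=20):
    """Approximately invert distort() by fixed-point iteration.

    There is no closed-form inverse in general, so we start from the
    distorted point and repeatedly divide by the scale factor evaluated
    at the current estimate. This converges for mild distortion only.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return (x, y)


if __name__ == "__main__":
    k1, k2 = -0.1, 0.01  # made-up coefficients; negative k1 pulls points inwards
    xd, yd = distort(0.3, 0.2, k1, k2)
    print((xd, yd))
    print(undistort(xd, yd, k1, k2))
```

Correcting an entire image as a preprocessing step amounts to applying a mapping of this kind to every pixel and resampling, which is why the distortion parameters need to be known beforehand.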