Geographical data linkage and mapping depend to a large extent on the quality of the source data, measured in terms of accuracy and precision. Although the two terms are often used interchangeably in everyday language, each has a distinct technical definition. Accuracy is the degree to which the data reflect the truth, whereas precision is the exactness with which the data have been collected and recorded. It is quite possible for a geographical reference to be accurate but imprecise, or vice versa. Higher precision can normally be achieved by increasing the resolution or level of detail of the data capture and recording system, whereas accuracy is governed by a range of factors including measurement tools, survey methods, human error and environmental conditions. Accuracy and precision apply to both locational and non-locational data, but we are here particularly concerned with their relevance to locational data.
With reference to the figure below, we here consider four possible scenarios for geographical referencing of a house. We shall consider these both in relation to the specification of a street address and a grid reference. In the figure, the shaded house indicates the actual address of interest and the bracket indicates the address described by the geographical reference in each case. We shall consider an imaginary true address of “39 Acacia Avenue, Silhurst, SH15 6BP” whose 1m-resolution grid reference is 456003, 121725.
Situation (a) illustrates a geographical reference which is neither precise nor accurate. An example would be “Hawthorne Avenue, Silhurst” or the 1km-resolution grid reference 457, 121. The street-level description does not describe a single address and is incorrect, referring to another nearby street. The reference to a 1km x 1km grid square is of insufficient precision to identify any single address and also refers to the incorrect square.
Situation (b) illustrates a precise but inaccurate reference, for example “41 Acacia Avenue, Silhurst” or the 1m-resolution grid reference 456019, 121727. The house number here provides a higher level of precision, and the grid reference to a 1m x 1m grid square is of sufficient resolution to identify a single address. However, in this case these references refer to an adjacent property. Similar instances can occur in postcoded data where an incorrect but valid postcode is substituted in error for the true postcode, such as “B51 3NS” for “BS1 3NS”. Both have equivalent precision, but in this case the textual inaccuracy causes a valid postcode in Birmingham to be given instead of the intended postcode in Bristol. Users must therefore take great care that apparent precision is not taken as an indicator of accuracy.
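The postcode example can be illustrated with a short sketch. It assumes a deliberately simplified pattern for the general shape of a UK postcode, which is not the full official specification, and the name `looks_like_postcode` is hypothetical. The point is only that a check on format cannot detect a substitution of this kind.

```python
import re

# Simplified pattern for the general shape of a UK postcode. This is an
# assumption made for illustration, not the full official specification.
POSTCODE_SHAPE = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}$")


def looks_like_postcode(text):
    """Return True if `text` has the general shape of a UK postcode."""
    return POSTCODE_SHAPE.match(text.strip().upper()) is not None


intended = "BS1 3NS"  # the postcode the respondent meant
recorded = "B51 3NS"  # the postcode actually written down

print(looks_like_postcode(intended))  # True
print(looks_like_postcode(recorded))  # True
print(intended == recorded)           # False
```

Both strings pass the format check, so the recorded value appears just as precise as the intended one. Detecting the error would require comparison with other information, such as the rest of the address.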
In example (c) the address reference “Acacia Avenue, Silhurst” is broadly correct but not sufficiently specific to identify a particular house. The same difficulty would occur if given the 100m-resolution grid reference 4560, 1217. This refers to the 100m x 100m grid square containing the correct house, but this is a large area that is also likely to contain several other addresses. In data quality terms, these references are accurate but imprecise. They may be sufficient to match the supplied address correctly to other datasets, provided that the required level of geographical precision is not high. For example, they would in most cases be sufficient to allocate the address to the correct local authority district, but insufficient to determine whether it stood within 100m of a high-tension electricity supply line.
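The relationship between the grid references at different resolutions can be sketched in code. The following is a minimal illustration, assuming that a coarser reference is obtained by truncating (not rounding) the 1m easting and northing, so that it identifies the grid square containing the point. The function name `coarsen` is hypothetical and introduced here for illustration only.

```python
def coarsen(easting_m, northing_m, resolution_m):
    """Truncate a 1m-resolution grid reference to a coarser grid square.

    Returns the easting and northing expressed in units of `resolution_m`,
    identifying the square that contains the point.
    """
    return easting_m // resolution_m, northing_m // resolution_m


# The imaginary true location of 39 Acacia Avenue (1m resolution).
true_ref = (456003, 121725)
# The adjacent property referenced in situation (b).
neighbour_ref = (456019, 121727)

print(coarsen(*true_ref, 100))       # (4560, 1217)
print(coarsen(*true_ref, 1000))      # (456, 121)
print(coarsen(*neighbour_ref, 100))  # (4560, 1217)
```

The true address and the adjacent property fall in the same 100m square, which is why a reference at that resolution can be accurate yet unable to distinguish between them.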
In the final instance (d), the full address “39 Acacia Avenue, Silhurst, SH15 6BP” and grid reference 456003, 121725 are supplied. These are both accurate and precise for the purposes of identifying the house in question.
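The four grid-reference scenarios can be summarized in a short sketch. It assumes two working definitions that are simplifications for illustration: a reference is treated as accurate if the grid square it names contains the true location, and as precise if its resolution is at least as fine as a required threshold (`required_m`, set here to 1m for identifying a single house). The function name `assess` is hypothetical.

```python
def assess(ref_e, ref_n, resolution_m, true_e, true_n, required_m=1):
    """Classify a grid reference against a known true 1m location.

    ref_e, ref_n : reference expressed in units of `resolution_m`
    accurate     : the referenced square contains the true location
    precise      : the resolution is at least as fine as `required_m`
    """
    true_square = (true_e // resolution_m, true_n // resolution_m)
    accurate = true_square == (ref_e, ref_n)
    precise = resolution_m <= required_m
    return accurate, precise


TRUE = (456003, 121725)  # imaginary location of 39 Acacia Avenue

# (easting, northing, resolution in metres) for each scenario in the text.
scenarios = {
    "a": (457, 121, 1000),
    "b": (456019, 121727, 1),
    "c": (4560, 1217, 100),
    "d": (456003, 121725, 1),
}

for label, (e, n, res) in scenarios.items():
    accurate, precise = assess(e, n, res, *TRUE)
    print(label,
          "accurate" if accurate else "inaccurate",
          "precise" if precise else "imprecise")
```

Raising `required_m` to 100 would make scenario (c) count as precise, reflecting the fact that the precision needed depends on the intended use of the data.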
It is important to recognize that what constitutes a sufficient level of precision will vary according to the application. If 39 Acacia Avenue is a single dwelling occupied by one household, the address provides a precise residential location for members of that household. If, however, the property were subdivided into flats, each occupied by a different household, then the address “39 Acacia Avenue” would be insufficient for some purposes and further information such as “Flat B, 39 Acacia Avenue” might be required. This type of subdivision is important when matching addresses between lists but does not usually result in addresses that have separate grid references. Address matching is a well-recognized data linkage problem in the domain of censuses and surveys, and failures in matching can contribute significantly to under-enumeration. Social science researchers should always assess the accuracy and precision of their geographical datasets in relation to their intended use.