Additional Resources
The ONS Beginner's Guide to UK Geography [http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/index.html] contains a useful section on postcode structure and georeferencing issues
Most computer applications which use postcodes will require the codes to be formatted in a particular way in order to operate correctly. Unfortunately there are multiple ways of writing postcodes in common usage, including several common errors, which are readily interpretable to the human eye but which will cause automated matching of postcodes to fail.
"SO17 1BJ" is a full, correct unit postcode. It contains eight characters altogether, one of which is a space separating the outcode ("SO17") and incode ("1BJ") parts.
"SO171BJ" is a variant of this code which is technically correct but which contains only seven characters, with no space. Both seven and eight character versions are widely used.
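As a rough illustration of how the two versions relate, the sketch below splits a postcode into its two parts and rebuilds either form. It assumes that the final three characters always form the part after the space, as in "SO17 1BJ"; the function names are illustrative only and do not come from any particular package.

```python
from typing import Tuple


def split_postcode(raw: str) -> Tuple[str, str]:
    """Return the (outcode, incode) parts, ignoring spacing and letter case."""
    # Remove every whitespace character and standardise on upper case.
    compact = "".join(raw.split()).upper()
    # Assumption: 2 to 4 characters before the space, always 3 after it.
    if not 5 <= len(compact) <= 7:
        raise ValueError(f"unexpected postcode length: {raw!r}")
    return compact[:-3], compact[-3:]


def with_space(raw: str) -> str:
    """Rebuild the version with a single separating space."""
    outcode, incode = split_postcode(raw)
    return f"{outcode} {incode}"


def without_space(raw: str) -> str:
    """Rebuild the version with no space."""
    outcode, incode = split_postcode(raw)
    return outcode + incode
```

For example, `with_space("so171bj")` gives "SO17 1BJ" and `without_space("SO17 1BJ")` gives "SO171BJ".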
The following are all variants of the same postcode which are close enough to the correct version that they would probably be correctly delivered by the postal service. However, none of them is technically correct and none would automatically match the correct version using general purpose office or statistical software. They illustrate problems of incorrect spacing, inconsistent letter case ("J" and "j") and substitution of similar characters ("I" and "1", "O" and "0"). These types of error occur most frequently when the original data were entered by hand and have not been subsequently cleaned or validated.
For those with large quantities of data to process, there is a range of commercial services on offer whereby computer-readable postcode lists can be cleaned and correctly formatted using specialist software - this is primarily intended for those preparing large mailshots. The detailed rules for determining whether a UK postcode is valid are complex and require a checklist of all the characters which are valid in each position of the code. It is unlikely that the research user who does not have a repeated need to match large postcode lists will wish to become involved in this level of checking, which will also necessitate programming skills.
If the user is reasonably confident that a postcode list is already of good quality and consistently formatted, it can be easiest to proceed with processing using software such as GeoConvert and then to attempt manual correction or removal of any postcodes which fail to be matched.
Where "A" represents an alphabetic character, "N" represents a digit and "_" represents a space, valid postcodes will always follow one of the following patterns:
Pattern | Example (code) | Example (place) |
AN_NAA | B1 1AA | Royal Mail Central Birmingham Delivery Office |
ANN_NAA | M60 2LA | Manchester City Council |
AAN_NAA | SA6 7JL | Driver and Vehicle Licensing Agency, Swansea |
AANN_NAA | SO17 1BJ | University of Southampton |
ANA_NAA | W1D 1AN | Tottenham Court Road Tube Station, London |
AANA_NAA | EC2R 8AH | Bank of England, London |
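For users who are comfortable with a little scripting, the six patterns in the table can be checked mechanically. The following is a minimal sketch that translates each pattern into a regular expression; the names `PATTERNS`, `pattern_to_regex` and `matches_known_pattern` are illustrative only. It tests the shape of a code only: it does not apply the detailed per-position character rules, and it cannot tell whether a postcode actually exists.

```python
import re

# The six patterns from the table: A = letter, N = digit, _ = space.
PATTERNS = ["AN_NAA", "ANN_NAA", "AAN_NAA", "AANN_NAA", "ANA_NAA", "AANA_NAA"]


def pattern_to_regex(pattern: str) -> str:
    """Translate one A/N/_ pattern into regular expression syntax."""
    mapping = {"A": "[A-Z]", "N": "[0-9]", "_": " "}
    return "".join(mapping[ch] for ch in pattern)


# One combined expression that accepts any of the six shapes.
POSTCODE_RE = re.compile(
    "|".join(f"(?:{pattern_to_regex(p)})" for p in PATTERNS)
)


def matches_known_pattern(postcode: str) -> bool:
    """True if the whole string has one of the six valid shapes."""
    return POSTCODE_RE.fullmatch(postcode) is not None
```

Because the check is deliberately strict, "SO17 1BJ" passes, whereas "SO171BJ" (no space), "so17 1bj" (lower case) and "S017 1BJ" (zero in place of "O") are all rejected and would need reformatting or correction first.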
It is possible for the research user to employ simple tools to overcome some of the most commonly encountered problems before attempting to process their postcodes. Standard spreadsheet functions can be used to change all letters to upper case, to search for spaces, to check for simple substitutions and to pad or condense the postcode to a desired length. Before reformatting postcodes the user should check whether a seven- or eight-character version is required by the intended analysis software.
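The same tidying steps can also be scripted. The sketch below is one possible approach rather than a definitive cleaner, and `clean_postcode` is a hypothetical name. It assumes that every valid postcode begins with a letter and that the part after the space begins with a digit, so it corrects look-alike characters in those two positions only. Substitutions elsewhere (for example "S017" for "SO17") are ambiguous and are left for manual checking. It does not pad short codes to a fixed width, since the padding convention depends on the target software.

```python
from typing import Optional


def clean_postcode(raw: str, spaced: bool = True) -> Optional[str]:
    """Return a tidied postcode, or None if the length is implausible."""
    # Upper-case the text and drop all whitespace, wherever it appears.
    compact = "".join(raw.split()).upper()
    # Assumption: 2 to 4 characters before the space, always 3 after it.
    if not 5 <= len(compact) <= 7:
        return None
    chars = list(compact)
    # First character should be a letter: treat a leading 0 or 1 as O or I.
    chars[0] = {"0": "O", "1": "I"}.get(chars[0], chars[0])
    # Third character from the end should be a digit: treat O or I as 0 or 1.
    chars[-3] = {"O": "0", "I": "1"}.get(chars[-3], chars[-3])
    outcode, incode = "".join(chars[:-3]), "".join(chars[-3:])
    # Emit either the spaced or the condensed version, as the software requires.
    return f"{outcode} {incode}" if spaced else outcode + incode
```

For example, `clean_postcode(" so17  1bj ")` returns "SO17 1BJ", and `clean_postcode("SO17 IBJ", spaced=False)` returns "SO171BJ". A return value of `None` marks an entry that needs manual inspection.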