The National Soil Inventory (NSI), which holds both attribute and measured properties, is the largest data set of its kind in England and Wales. The aim of this study is to examine fully the chemical, multivariate, spatial and co-regional relations among the variables in the NSI, using both statistical and geostatistical methods. NSI data will be assembled in a form suitable for efficient statistical analysis and scrutinised for anomalies and inconsistencies. Initially, exploratory data analysis will be undertaken, comprising descriptive statistics, correlation analysis, data transformation, the treatment of outliers, and the application of generalised linear models. Subsequently, a comprehensive multivariate analysis will be applied: principal components and principal co-ordinates analyses will be used to identify relations between variables and sites, together with numerical classification by hierarchical and non-hierarchical methods. England and Wales will be stratified using the results of the multivariate classification, as well as attributes such as land use, elevation, parent material and climate. Variogram analysis will be employed to compute and model conventional, indicator, cross- and multivariate variograms. Using ordinary kriging, co-kriging and disjunctive kriging, an intensive grid of estimates will be created. The variograms of the properties and the kriging equations will then be applied to design an optimal sampling scheme for any future soil sampling. Sequential data reduction, followed by reconstruction using geostatistical simulation, will be used to determine the minimum number of samples capable of providing adequate information for future monitoring. A soil monitoring network will be designed, taking into account the stratification and the results of the optimal-sampling and data-reconstruction investigations. This network design will be tested against subsets of the NSI data obtained during samplings in 1994 and 1997.
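
To illustrate the variogram step, the following is a minimal sketch of the conventional (method-of-moments) experimental variogram, which averages half the squared differences between pairs of observations grouped by separation distance. It is not the study's implementation: the function name `experimental_variogram`, its parameters `lag_width` and `n_lags`, and the omnidirectional binning are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def experimental_variogram(coords, values, lag_width, n_lags):
    """Omnidirectional method-of-moments variogram (hypothetical helper).

    coords    : (n, 2) array of site coordinates
    values    : (n,) array of one measured soil property
    lag_width : width of each distance bin, in coordinate units
    n_lags    : number of distance bins to report
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Every unordered pair of sites (i < j), counted once.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2

    # Assign each pair to the bin [k * lag_width, (k + 1) * lag_width).
    bins = np.floor(dist / lag_width).astype(int)

    lags = np.full(n_lags, np.nan)    # mean separation distance per bin
    gamma = np.full(n_lags, np.nan)   # semivariance per bin
    counts = np.zeros(n_lags, dtype=int)
    for k in range(n_lags):
        mask = bins == k
        counts[k] = mask.sum()
        if counts[k] > 0:
            lags[k] = dist[mask].mean()
            gamma[k] = 0.5 * sqdiff[mask].mean()
    return lags, gamma, counts
```

Bins with no pairs are left as NaN rather than zero, so that a subsequent model fit can skip them. Forming all pairs at once needs memory proportional to the square of the number of sites, so a data set of NSI size would probably need the pairs processed in chunks or restricted to a maximum lag distance.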