How to Predict What You Haven't Measured

In 1951, Danie Krige needed to estimate gold concentrations between scattered drill holes. The method that bears his name turns out to be mathematically optimal — and it knows where it's guessing.


Sample locations: 25 samples.

Scattered measurements

Imagine a region where you've drilled 25 test holes. Each gives you a measurement at one location — an ore grade, a temperature, a soil concentration. Dots are colored from low to high.

Between them, the map is empty.

What's the value here?

You need to estimate the value at every unmeasured location. This is spatial interpolation — and the obvious approaches have serious flaws.

Nearest neighbor

The simplest method: assign each location the value of the closest sample. The result is a patchwork of flat regions with hard boundaries.

Nature doesn't have edges like this.
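A minimal sketch of nearest-neighbor interpolation, assuming NumPy and 2-D coordinates. The function name and the three sample points are made up for illustration; they are not the article's data.

```python
import numpy as np

def nearest_neighbor(xy_samples, values, xy_query):
    """Give each query point the value of its closest sample."""
    # Pairwise distances, shape (n_query, n_samples)
    d = np.linalg.norm(xy_query[:, None, :] - xy_samples[None, :, :], axis=2)
    return values[np.argmin(d, axis=1)]

# Hypothetical samples and query points
xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
z = np.array([1.0, 5.0, 9.0])
query = np.array([[1.0, 1.0], [9.0, 2.0], [4.9, 0.0], [5.1, 0.0]])

print(nearest_neighbor(xy, z, query))  # values 1, 5, 1, 5
```

The last two query points sit 0.2 units apart, yet the estimate jumps from 1 to 5 between them. That jump is the hard boundary described above.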

Inverse distance weighting

Weight each sample by the inverse of its distance, often raised to a power, so that closer samples matter more. The result is smoother, but it produces bull's-eye patterns around each sample and treats all directions equally.

It ignores the spatial structure of the data.
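Inverse distance weighting can be sketched in a few lines, again assuming NumPy. The `power` and `eps` parameters are assumptions added for illustration: `power=1` is the plain inverse distance described above, and `power=2` is a common default in practice.

```python
import numpy as np

def idw(xy_samples, values, xy_query, power=2.0, eps=1e-12):
    """Weighted average of samples, with weights 1 / distance**power."""
    d = np.linalg.norm(xy_query[:, None, :] - xy_samples[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, eps) ** power   # eps avoids division by zero at a sample
    w /= w.sum(axis=1, keepdims=True)       # normalize so each row of weights sums to 1
    return w @ values

# Hypothetical samples: two points with values 1 and 5
xy = np.array([[0.0, 0.0], [10.0, 0.0]])
z = np.array([1.0, 5.0])

print(idw(xy, z, np.array([[5.0, 0.0]])))   # midpoint: equal weights, so 3.0
```

The weights depend only on distance to the target. Nothing in the formula looks at how the samples are arranged relative to one another, which is the limitation noted above.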

Spatial correlation

Real data has structure. Nearby measurements tend to be similar; distant ones tend to differ. Green lines connect similar nearby pairs; gold lines connect dissimilar ones.

The key to better prediction is quantifying this pattern.

The variogram

For every pair of samples, plot the distance between them against half the squared difference in their values, a quantity called the semivariance. At short distances, pairs are similar (low semivariance). At long distances, they're unrelated (semivariance plateaus).
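The pairing and binning can be sketched as follows, assuming NumPy and taking the semivariance of a pair as half the squared difference of its values. The random coordinates and values are synthetic placeholders, and the bin edges are an arbitrary choice.

```python
import numpy as np

def empirical_variogram(xy, z, bin_edges):
    """Return per-pair distances and semivariances, plus binned averages."""
    i, j = np.triu_indices(len(z), k=1)      # each unordered pair exactly once
    h = np.linalg.norm(xy[i] - xy[j], axis=1)
    g = 0.5 * (z[i] - z[j]) ** 2             # semivariance of each pair
    which = np.digitize(h, bin_edges) - 1    # distance bin index for each pair
    centers, gamma = [], []
    for b in range(len(bin_edges) - 1):
        mask = which == b
        if mask.any():                       # skip empty bins
            centers.append(h[mask].mean())
            gamma.append(g[mask].mean())
    return h, g, np.array(centers), np.array(gamma)

# Synthetic stand-in for 25 samples in a 100 x 100 region
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(25, 2))
z = rng.normal(size=25)

h, g, centers, gamma = empirical_variogram(xy, z, np.linspace(0, 150, 11))
print(len(h))  # 25 * 24 / 2 = 300 pairs
```

The per-pair arrays `h` and `g` correspond to the cloud of pair points, and `centers` and `gamma` to the binned averages that a model curve is fitted through.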

Three parameters

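A smooth model captures the pattern. The nugget is the jump at zero distance, reflecting measurement error and micro-scale variation. The sill is the plateau, equal to the total variance. The range is the distance at which the curve reaches the sill, beyond which points are uncorrelated.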

This is the spatial fingerprint of the data.
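One common choice of smooth model is the spherical variogram. The sketch below assumes the sill is the total plateau height, nugget included, and uses illustrative parameter values rather than fitted ones.

```python
import numpy as np

def spherical(h, nugget, sill, range_):
    """Spherical variogram: rises from the nugget and flattens at the sill."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / range_, 1.0)          # beyond the range the curve stays flat
    g = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h > 0, g, 0.0)           # semivariance is exactly 0 at h = 0

# Illustrative parameters, not fitted to any data
print(spherical([0.0, 15.0, 30.0, 60.0], nugget=0.1, sill=1.0, range_=30.0))
# about 0 at h = 0, 0.71875 at half the range, 1.0 at and beyond the range
```

Fitting means choosing the nugget, sill, and range so that this curve passes close to the binned averages of the empirical variogram.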

Optimal weights

Kriging uses the variogram to choose weights that minimize the expected squared prediction error while keeping the estimate unbiased. Unlike IDW, the weights account for redundancy: two nearby samples share information and receive less combined weight than two separated ones.
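The redundancy effect can be shown with a small sketch of ordinary kriging, assuming NumPy, a spherical variogram with made-up parameters, and hypothetical function names. Ordinary kriging is one variant among several, and the article does not say which one it uses. The weights come from a linear system built from the variogram, with one extra row and column that force the weights to sum to one.

```python
import numpy as np

def spherical(h, nugget=0.0, sill=1.0, range_=30.0):
    """Spherical variogram; the parameter values are illustrative only."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / range_, 1.0)
    g = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h > 0, g, 0.0)

def ok_weights(xy, x0, variogram):
    """Solve the ordinary kriging system for one target location x0."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))      # last row and column enforce sum(w) == 1
    A[:n, :n] = variogram(d)         # semivariance between every sample pair
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - x0, axis=1))  # sample-to-target terms
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]           # weights, Lagrange multiplier

# Two clustered samples on the left and one lone sample on the right,
# all about 10 units from the target at the origin.
xy = np.array([[-10.0, 0.5], [-10.0, -0.5], [10.0, 0.0]])
w, mu = ok_weights(xy, np.array([0.0, 0.0]), spherical)
print(w.round(3), w.sum())
```

With these inputs the clustered pair ends up sharing roughly half of the total weight, and the lone sample gets the other half. Plain inverse-distance weights would give each of the three samples about a third, so the cluster would count for two thirds.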

The prediction surface

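The kriging surface is smooth, exploits spatial correlation, and passes through the data exactly at sample locations when the nugget is zero. Under the fitted variogram model it is the Best Linear Unbiased Predictor: among all unbiased weighted averages of the samples, none has a lower expected squared error.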

Built-in uncertainty

Kriging also produces a variance at each location. Bright means uncertain; dark means confident. You get a map and a measure of how much to trust it.
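A sketch of where the variance comes from, under the same assumptions as any ordinary kriging setup (hypothetical function names, a spherical variogram with illustrative parameters, zero nugget). The linear system that yields the weights also yields the variance: it is the weighted sum of the sample-to-target semivariances plus the Lagrange multiplier.

```python
import numpy as np

def spherical(h, nugget=0.0, sill=1.0, range_=30.0):
    """Spherical variogram; the parameter values are illustrative only."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / range_, 1.0)
    g = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h > 0, g, 0.0)

def ok_predict(xy, z, x0, variogram):
    """Ordinary kriging prediction and kriging variance at one location."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - x0, axis=1))
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    prediction = float(w @ z)
    variance = float(w @ b[:n] + mu)   # uses only geometry and the variogram
    return prediction, variance

# Hypothetical data: two samples 10 units apart
xy = np.array([[0.0, 0.0], [10.0, 0.0]])
z = np.array([1.0, 3.0])
for x0 in ([0.0, 0.0], [5.0, 0.0], [100.0, 0.0]):
    pred, var = ok_predict(xy, z, np.array(x0), spherical)
    print(x0, round(pred, 3), round(var, 3))
```

The variance is essentially zero at a sample, small midway between the two samples, and largest far from both. It never uses the measured values `z`, only the sample positions and the variogram, so the uncertainty map can be drawn before any new measurement is taken.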

Named after a mining engineer

The name "kriging" was coined by the French mathematician Georges Matheron in 1963, honoring Danie Krige's pioneering thesis on estimating gold reserves from borehole data in South African mines. The mathematical framework Matheron built around Krige's practical insight — geostatistics — has since spread to hydrology, meteorology, ecology, and machine learning.

The empirical variogram and its model

Each small dot is one sample pair. The large dots are binned averages. The gold curve is the fitted spherical model.

25 samples, 300 unique pairs. Spherical variogram model with fitted nugget, sill, and range.

Kriging is everywhere

Weather maps interpolate between weather stations. Soil maps interpolate between core samples. Pollution maps interpolate between monitoring stations. In machine learning, Gaussian process regression is kriging with a different name — the same optimal linear predictor, the same built-in uncertainty quantification.

Three methods, one truth

Nearest neighbor is blocky. IDW creates bull's-eyes. Kriging is smooth and accurate.

Panels: Nearest Neighbor · IDW · Kriging, each labeled with its RMSE.

RMSE computed against the true underlying field at all 2,500 grid cells.
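For reference, RMSE is the square root of the mean squared difference between the estimate and the truth over all grid cells. The sketch below assumes NumPy and a 50 x 50 grid, which is one way to get 2,500 cells. The "true" field here is a synthetic stand-in, not the one used in the comparison.

```python
import numpy as np

def rmse(predicted, truth):
    """Root-mean-square error over all grid cells."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in field on a 50 x 50 grid (2,500 cells)
gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
truth = np.sin(3 * gx) + np.cos(2 * gy)
estimate = truth + 0.5               # an estimate that is off by 0.5 everywhere

print(truth.size, rmse(estimate, truth))  # 2500 cells, RMSE of about 0.5
```

A score like this is only possible when the true field is known, as it is for a simulated example. With real data, the kriging variance is the available guide to accuracy.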

Built-in humility

Most interpolation methods output a number and nothing more. Kriging outputs a number and a variance that says how far to trust it. In a world that treats model predictions as certainties, this built-in humility is rare and valuable. The uncertainty map tells you where to drill next.

Try it yourself

Click the surface to add sample points. Click near an existing sample to remove it. Adjust the variogram range to see how spatial correlation assumptions affect the prediction.

Panels: prediction surface (click to add or remove samples) and kriging uncertainty.