Spatial Autocorrelation

The instantiation of Tobler’s first law of geography

Everything is related to everything else, but near things are more related than distant things.

Correlation of a variable with itself through space.

The correlation between an observation’s value on a variable and the value of close-by observations on the same variable

The degree to which characteristics at one location are similar (or dissimilar) to those nearby.

Measure of the extent to which the occurrence of an event in an areal unit constrains, or makes more probable, the occurrence of a similar event in a neighboring areal unit.

Several measures available:

Join Count Statistic

Moran’s I

Geary’s C ratio

General (Getis-Ord) G

Anselin’s Local Index of Spatial Autocorrelation (LISA)

Positive spatial autocorrelation

- high values surrounded by nearby high values

- intermediate values surrounded by nearby intermediate values

- low values surrounded by nearby low values

Negative spatial autocorrelation

- high values surrounded by nearby low values

- intermediate values surrounded by nearby intermediate values

- low values surrounded by nearby high values

Why Spatial Autocorrelation Matters

•Spatial autocorrelation is of interest in its own right because it suggests the operation of a spatial process

•Additionally, most statistical analyses are based on the assumption that the values of observations in each sample are independent of one another

–Positive spatial autocorrelation violates this, because samples taken from nearby areas are related to each other and are not independent

Moran’s I

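$$I = \frac{N}{\sum_i \sum_j W_{ij}} \cdot \frac{\sum_i \sum_j W_{ij}\,(X_i - \bar{X})(X_j - \bar{X})}{\sum_i (X_i - \bar{X})^2}$$

•Where N is the number of cases

X̄ is the mean of the variable

Xi is the variable value at a particular location

Xj is the variable value at another location

Wij is a weight indexing the location of i relative to j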

•Applied to a continuous variable for polygons or points

•Similar to a correlation coefficient: usually varies between -1.0 and +1.0

–Value close to 0 (strictly, close to its expected value of -1/(N-1)): no spatial autocorrelation; values are randomly arranged

–Values close to +1 or -1: strong autocorrelation

•Positive value (positive autocorrelation): clustered data

•Negative value (negative autocorrelation): dispersed / uniform data

•Differences from correlation coefficient are:

–Involves one variable only, not two variables

–Incorporates weights (Wij) which index relative location

–Think of it as “the correlation between neighboring values on a variable”

–More precisely, the correlation between variable, X, and the “spatial lag” of X formed by averaging all the values of X for the neighboring polygons
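The calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes NumPy, a regular grid of cells, and binary rook-contiguity weights (Wij = 1 when cells i and j share an edge, 0 otherwise). The helper names `rook_weights` and `morans_i` are made up for this example.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary weights for a regular grid: 1 if two cells share an edge (assumed scheme)."""
    n = nrows * ncols
    w = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:              # neighbor in the next row
                j = (r + 1) * ncols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < ncols:              # neighbor in the next column
                j = r * ncols + (c + 1)
                w[i, j] = w[j, i] = 1.0
    return w

def morans_i(x, w):
    """Moran's I for values x (any shape, flattened row by row) and weight matrix w."""
    x = np.asarray(x, dtype=float).ravel()
    z = x - x.mean()                       # deviations from the mean
    cross = (w * np.outer(z, z)).sum()     # sum_i sum_j Wij (Xi - mean)(Xj - mean)
    return (x.size / w.sum()) * cross / (z ** 2).sum()

w = rook_weights(4, 4)

# High values in the left half, low values in the right half: clustered.
clustered = np.array([[1, 1, 0, 0]] * 4)

# Alternating high and low values: dispersed.
checkerboard = np.indices((4, 4)).sum(axis=0) % 2

print(morans_i(clustered, w))     # positive (about 0.67)
print(morans_i(checkerboard, w))  # negative (-1.0)
```

The clustered grid gives a positive value and the checkerboard gives the extreme negative value, matching the interpretation of the sign given above. Other weighting schemes (queen contiguity, inverse distance, row-standardized weights) would give different numbers for the same data.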

Interpolation

Interpolation is the process of using points with known values (sample points) to estimate values at other, unknown points. It can be used to predict unknown values for any geographic point data, such as elevation, rainfall, chemical concentrations, and noise levels.

It predicts values for cells in a raster from a limited number of sample data points.

Interpolation is based on the assumption that spatially distributed objects are spatially correlated; in other words, things that are close together tend to have similar characteristics.

Why interpolate?

Visiting every location in a study area to take measurements is usually difficult, time-consuming, and costly. Instead, values are measured at a sample of input points and then used to predict the values at all other locations. Input points can be randomly, strategically, or regularly spaced.

•Point based

Given a number of points whose locations and values are known, determine the values of other points; e.g. weather station readings, spot heights, oil well readings, porosity measurements

• Lines to points

Line data for interpolation; e.g. contours to elevation grids

• Areal interpolation

Given a set of data mapped on one set of source zones, determine the values of the data for a different set of target zones; e.g. given population counts for census tracts, estimate populations for electoral districts
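One simple way to do areal interpolation is area weighting: each source zone's count is split among the target zones in proportion to the area of overlap. The sketch below is only an illustration under strong assumptions: zones are axis-aligned rectangles given as (xmin, ymin, xmax, ymax), and the population is assumed to be spread evenly within each source zone. The function names and the tract and district figures are invented for the example.

```python
def rect_area(r):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])

def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles (0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)

def area_weighted(source_zones, source_counts, target_zones):
    """Estimate a count for each target zone from counts on the source zones."""
    estimates = []
    for target in target_zones:
        total = 0.0
        for zone, count in zip(source_zones, source_counts):
            # Share of the source zone that falls inside this target zone.
            share = overlap_area(zone, target) / rect_area(zone)
            total += count * share
        estimates.append(total)
    return estimates

# Two hypothetical census tracts and their populations.
tracts = [(0, 0, 2, 2), (2, 0, 4, 2)]
populations = [100, 300]

# Two hypothetical electoral districts covering the same area.
districts = [(0, 0, 3, 2), (3, 0, 4, 2)]

print(area_weighted(tracts, populations, districts))  # [250.0, 150.0]
```

The first district contains all of the first tract and half of the second (100 + 150), the second district contains the other half of the second tract (150), and the total population of 400 is preserved.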

Types

Spatial interpolation methods can be categorized in several ways.

First they can be grouped into global and local methods.

1. Global interpolation: applied across the whole region; uses every known point available to estimate an unknown value. It produces a smoother surface with less abrupt variation. – e.g. trend surface, regression models

2. Local interpolation: applied repeatedly to small portions of the whole region; uses a sample of known points to estimate an unknown value. It is designed to capture local or short-range variation. – e.g. IDW, Thiessen polygons, spline
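As an illustration of a local method, here is a minimal inverse distance weighted (IDW) sketch, assuming NumPy. The function name `idw`, its parameters, and the station data are made up for the example. The optional `k` argument restricts each estimate to the k nearest sample points, which is what makes the search local; with `k=None` every sample point contributes.

```python
import numpy as np

def idw(known_xy, known_values, query_xy, power=2.0, k=None):
    """Inverse distance weighted estimate at each query point.

    k limits the estimate to the k nearest sample points;
    k=None uses every sample point.
    """
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    query_xy = np.atleast_2d(np.asarray(query_xy, dtype=float))
    estimates = np.empty(len(query_xy))
    for q, point in enumerate(query_xy):
        dist = np.hypot(*(known_xy - point).T)   # distance to every sample point
        nearest = np.argsort(dist)[:k]           # [:None] keeps all points
        d, v = dist[nearest], known_values[nearest]
        if d[0] == 0.0:
            # Query coincides with a sample point: return its known value.
            estimates[q] = v[0]
            continue
        weights = 1.0 / d ** power               # closer points weigh more
        estimates[q] = np.sum(weights * v) / np.sum(weights)
    return estimates

# Hypothetical weather stations and readings.
stations = [(0, 0), (2, 0), (10, 0)]
readings = [10.0, 20.0, 50.0]

print(idw(stations, readings, [(1, 0)], k=2))  # two nearest stations only
print(idw(stations, readings, [(1, 0)]))       # all stations contribute
print(idw(stations, readings, [(2, 0)]))       # at a station: its own reading
```

With `k=2` the distant station is ignored and the estimate is the average of the two equally near readings; with all stations included, the distant high reading pulls the estimate up slightly.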

Second, spatial interpolation methods can be grouped into exact and inexact interpolation.

1. An exact interpolation predicts a value at a sample point that is the same as its known value; it honors the input data points, so the surface passes through all of them. – e.g. Kriging

2. An inexact (or approximate) interpolation predicts a value at a sample point that may differ from its known value; it is used when there is some uncertainty about the surface, on the view that many data sets contain slowly varying global trends overlain by local fluctuations.
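The inexact case can be illustrated with a first-order trend surface, a plane z = a + b·x + c·y fitted by least squares. This is a sketch assuming NumPy; the function names and the four sample heights are invented for the example. Because the plane is fitted to all points at once, it generally does not pass through any of them.

```python
import numpy as np

def fit_trend_surface(xy, z):
    """Least-squares fit of the plane z = a + b*x + c*y; returns (a, b, c)."""
    xy = np.asarray(xy, dtype=float)
    design = np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])
    coef, *_ = np.linalg.lstsq(design, np.asarray(z, dtype=float), rcond=None)
    return coef

def predict(coef, xy):
    """Evaluate the fitted plane at one or more (x, y) locations."""
    xy = np.atleast_2d(np.asarray(xy, dtype=float))
    return coef[0] + coef[1] * xy[:, 0] + coef[2] * xy[:, 1]

# Four hypothetical spot heights on the corners of a unit square.
samples = [(0, 0), (1, 0), (0, 1), (1, 1)]
heights = [1.0, 2.0, 2.0, 5.0]

coef = fit_trend_surface(samples, heights)
fitted = predict(coef, samples)
residuals = np.asarray(heights) - fitted

print(coef)       # plane coefficients a, b, c
print(fitted)     # surface values at the sample points
print(residuals)  # nonzero: the surface does not honor the data exactly
```

Here the fitted plane misses every sample point by 0.5 units. An exact method evaluated at the same four locations would return the known heights unchanged.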

Third: spatial interpolation methods may be deterministic or stochastic

1. Deterministic models use a mathematical function to predict unknown values and yield a single definite value at each location, with no measure of prediction uncertainty.

2. Stochastic (statistical) techniques produce confidence limits for a prediction but are more difficult to execute, since more parameters need to be set.

Deterministic models:

1. Trend surface analysis / polynomial

2. Minimum curvature spline

3. Inverse distance weighted (IDW)

4. Natural neighbour
