# Exploring the Link between Crime and Socio-Economic Status in Ottawa and Saskatoon: A Small-Area Geographical Analysis

## 5. Methods Of Analysis

### 5.1 Plan of Analysis

This publication is based on three separate studies and, as a result, the methods used in the analysis differ slightly from one study to the next. They represent a progressive calibration of techniques of statistical and geographic analysis. Study # 1 of Ottawa is based on data for the smallest areal units available, dissemination areas (DAs). Study # 2 of Saskatoon is based on data for neighbourhoods. While the methods used in the Ottawa study were again employed, the geographic analysis was expanded to include spatial autocorrelation. In Study # 3, Ottawa's DAs are re-aggregated to match the boundaries of the city's neighbourhoods, and the findings of this analysis are compared directly to those of Saskatoon at the neighbourhood level.

### 5.2 Statistical and Geographic Methods of Analysis

#### Descriptive statistics

In all three studies, the crime variables were calculated as a rate per 1,000 population related to the geographic unit of analysis (DAs or neighbourhoods). All of the census and socio-economic variables in the three studies were calculated at the ratio scale with the exception of variables relating to average income and value of dwelling, which were left at the interval scale. Descriptive statistics were calculated for each dataset to determine the minimum, maximum, mean, standard deviation and coefficient of variation of each variable.
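As an illustration of the calculations described above, the following is a minimal sketch in Python. The incident counts and populations are hypothetical, the function names `rate_per_1000` and `describe` are introduced here for illustration only, and the use of the sample standard deviation is an assumption, since the studies do not state which form was used.

```python
import statistics


def rate_per_1000(incidents, population):
    """Crime rate per 1,000 population for one geographic unit."""
    return incidents * 1000 / population


def describe(values):
    """Summary statistics named in the text for one variable."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,  # coefficient of variation
    }


# Hypothetical (incident count, population) pairs for four geographic units.
units = [(12, 600), (30, 500), (5, 1000), (45, 900)]
rates = [rate_per_1000(incidents, population) for incidents, population in units]
summary = describe(rates)
```

The coefficient of variation is unitless, which makes it possible to compare the spread of variables measured on different scales.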

#### Transformation of Variables

For the purposes of statistical analysis and to meet the basic assumptions and constraints of the general linear model, each of the crime and socio-economic variables employed in the three studies was transformed into a Z-score for each geographic unit of analysis (the DA or neighbourhood). The formula for this transformation is as follows:

$$Z_i = \frac{x_i - \bar{x}}{sd_x}$$

where $Z_i$ is the Z-score for geographic unit $i$, $x_i$ is the original value, $\bar{x}$ is the mean of all values of $x$, and $sd_x$ is the standard deviation of $x$.

Following the transformation, therefore, each variable has a mean of 0 and a standard deviation of 1, allowing the relative position of each case (DA or neighbourhood) to be assessed. For example, higher crime areas will have Z-scores above 0 while lower crime areas will have Z-scores below 0. This standardization brings variables from different units of measurement onto the same scale and provides the quantitative justification for further statistical analysis, particularly multivariate analysis. This type of transformation is common in crime research and was employed recently in a study of Winnipeg by Fitzgerald, Wisener and Savoie (2004).
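The transformation can be sketched in a few lines of Python. The crime rates below are hypothetical, and the choice of the population standard deviation is an assumption made so that the transformed values have a standard deviation of exactly 1.

```python
import statistics


def z_scores(values):
    """Standardize a variable: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD (n denominator)
    return [(x - mean) / sd for x in values]


# Hypothetical crime rates per 1,000 population for four geographic units.
rates = [20.0, 60.0, 5.0, 50.0]
z = z_scores(rates)
# Units with rates above the mean receive positive scores, the rest negative.
```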

#### Principal Components Analysis

Each of the three studies involved conducting a principal components analysis (PCA) on its respective dataset to examine the statistical relationship between crime and socio-economic status in Ottawa and Saskatoon. Essentially, PCA is a data reduction technique. It replaces a set of variables with a smaller number of components, each made up of inter-correlated variables, that together represent as much of the original dataset as possible. Principal components analysis is an appropriate technique in an inductive search for common patterns of crime and socio-economic status in an urban area with the use of small area statistics and has been used in crime research by Hung (2002) and Mata (2003).
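The mechanics of the technique can be sketched with NumPy as follows. This is a generic PCA on the correlation matrix, not a reproduction of the software or settings used in the studies, which are not described here. The data values and variable choices are hypothetical.

```python
import numpy as np


def pca(data):
    """PCA on standardized variables via the correlation matrix."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]  # reorder so the largest component is first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()  # share of total variance per component
    scores = z @ eigvecs  # component scores for each geographic unit
    return eigvals, eigvecs, explained, scores


# Hypothetical data: rows are geographic units; columns are a crime rate,
# an unemployment rate and a percentage of rented dwellings.
data = np.array([
    [20.0, 8.0, 35.0],
    [60.0, 15.0, 52.0],
    [5.0, 4.0, 20.0],
    [50.0, 12.0, 47.0],
    [33.0, 9.0, 41.0],
    [12.0, 6.0, 30.0],
])
eigvals, eigvecs, explained, scores = pca(data)
```

When the variables are strongly inter-correlated, as in this made-up example, the first component accounts for most of the variance, which is what allows a small number of components to stand in for the full set of variables.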

#### Multiple Regression

Multiple regression analysis is a multivariate technique that assesses the relationship between two or more independent variables and one dependent variable. It is used to describe the individual contribution of a number of independent variables toward predicting a dependent variable (McKean and Byers 2000). For this research, multiple regression analyses were performed on each of the datasets in the three studies to examine the strength of the relationship between crime (the dependent variable) and socio-economic conditions (the independent variables) and to identify significant "predictors" of crime in Ottawa and Saskatoon.

Standard multiple regression and step-wise multiple regression models were tested for each of the crime variables used in the studies (including total crime, violent, major property, minor property and drug offences).
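A standard (non-stepwise) multiple regression can be sketched with NumPy's least-squares solver. The function name `fit_ols` and the data are hypothetical. The dependent variable is constructed from known coefficients so that the fit can be checked, and step-wise variable selection is not shown.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients and R squared."""
    design = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical independent variables (two socio-economic indicators per unit).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
# Hypothetical dependent variable built as 1 + 2 * x1 - 3 * x2, with no noise.
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
coef, r_squared = fit_ols(X, y)
```

With real data the coefficients indicate the direction and size of each variable's contribution, and R squared summarizes how much of the variation in the crime variable the model accounts for.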

#### Cartographic and GIS Analysis

For each of the three studies, a series of maps was produced to illustrate the geographic distribution of crime in Ottawa and Saskatoon and to examine the spatial relationship between crime and certain socio-economic conditions in both cities. ArcGIS (ESRI, http://www.esri.com) was the software used in this research. In Study # 1, the Ottawa Police Service (OPS) provided 2001 crime data for the city's 1187 dissemination areas. This data, along with 2001 census data, was then joined to Statistics Canada's digital cartographic file for Ottawa. In Study # 2, a digital cartographic file showing Saskatoon's 55 residential neighbourhoods was obtained from the Planning Unit of the City of Saskatoon. This geographic data was then matched with 2003 crime, 2001 census and additional planning/development data for the city. In Study # 3, a digital cartographic file displaying Ottawa's 50 residential neighbourhoods was acquired from the Planning Department of the City of Ottawa. As stated above, the crime and census data from Study # 1 was re-aggregated to match these neighbourhood boundaries.

Choropleth maps were produced in each of the three studies. In this type of map, the quantity in each geographical division is represented by the colour or shade of the area symbol placed in the enumeration unit, in this case the DA or neighbourhood. As Dent (2000, p. 5) explains, several assumptions are made when choropleth maps are used. First, it is assumed that the quantity being mapped is uniform within the enumeration area. Second, it is assumed that densities, rates or ratios are more important than absolute values. Because enumeration areas vary in size, symbolizing absolute values with shaded area symbols can lead to misinterpretation. Since all of the crime and socio-economic data employed in the three studies is aggregated to match geographic boundaries, it was determined that choropleth mapping was the most appropriate. The mapping classification is based on intervals of crime intensity. In Study # 1, high crime areas in Ottawa's DAs were mapped according to three categories (elevated, high and highest) in relation to their Z-values. In Studies # 2 and # 3, crime rates per 1,000 population in Ottawa's and Saskatoon's neighbourhoods are mapped according to five categories ranging from lowest to highest.
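The class breaks used for the maps are not specified in this section. Purely as an illustration of how rates might be grouped into five ordered categories before shading, the sketch below uses an equal-count (quantile) classification. Both the method and the three middle category labels are assumptions, and the rates are hypothetical.

```python
def quantile_classes(values, labels):
    """Assign each value to one of len(labels) equal-count classes by rank."""
    k = len(labels)
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from lowest to highest
    classes = [None] * n
    for rank, i in enumerate(order):
        classes[i] = labels[rank * k // n]  # ties are split by input order
    return classes


# Assumed labels: only "lowest" and "highest" are named in the text.
labels = ["lowest", "low", "medium", "high", "highest"]
# Hypothetical crime rates per 1,000 population for ten neighbourhoods.
rates = [12.0, 85.0, 40.0, 7.5, 63.0, 150.0, 22.0, 31.0, 98.0, 54.0]
classes = quantile_classes(rates, labels)
```

Other schemes, such as equal intervals or natural breaks, would produce different maps from the same data, which is why the classification method matters when choropleth maps are compared.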

#### Spatial Autocorrelation

Spatial autocorrelation was employed for the Saskatoon study only (Study # 2). While statistical techniques such as multiple regression and principal components analysis are effective in crime research, they are non-spatial by design. And while mapping is appropriate for illustrating geographic patterns of crime and socio-economic status, visualization in itself is not an explicitly spatial approach. As a result, the technique of spatial autocorrelation was used in the Saskatoon study to directly determine the presence of spatial pattern in the mapped variables due to geographic proximity. As Johnston et al. (2000, p. 775) explain:

The most common form of spatial auto-correlation is where similar values for a variable tend to cluster together in adjacent observation units, so that on average across the map the values for neighbours are more similar than would occur if the allocation of values to observation-units were the result of a purely random mechanism.

In other words, spatial autocorrelation is used to determine clusters of strong association in the variables and is employed in this study to gauge the level of geographic concentration of crime in Saskatoon and the spatial relationship between crime and neighbourhood characteristics.

The software CrimeStat developed by Levine & Associates (2002) was used to calculate Moran's "I", one of the most commonly used spatial autocorrelation indicators. Moran's I (Moran, 1950) is also one of the oldest spatial statistics and is applied to zones or points, which have continuous variables associated with them (intensities). It is calculated as follows:

$$I = \frac{N \sum_i \sum_j W_{ij} (X_i - \bar{X})(X_j - \bar{X})}{\left(\sum_i \sum_j W_{ij}\right) \sum_i (X_i - \bar{X})^2}$$

where $N$ is the number of cases, $X_i$ is the variable value at a specified location, $i$, $X_j$ is the variable value at another location, $j$, $\bar{X}$ is the mean of the variable and $W_{ij}$ is a distance weight applied to the comparison between location $i$ and location $j$. The statistic is interpreted much like a correlation coefficient with values near +1 indicating a strong spatial pattern (high values located close to one another and low values located close to one another) and values near –1 indicating strong negative spatial autocorrelation. The significance of Moran's I is calculated as follows:

$$Z(I) = \frac{I - E(I)}{S_{E(I)}}$$

where $I$ is the empirical value calculated from a sample, $E(I)$ is the theoretical mean of a random distribution and $S_{E(I)}$ is the theoretical standard deviation of $E(I)$.
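The two calculations can be sketched together in plain Python. This is not CrimeStat's implementation. The inverse-distance weights are one common choice of distance weight, and the variance uses the textbook formula under the normality assumption. Both are assumptions, since the exact options applied in the study are not stated. The point locations and values are hypothetical.

```python
import math


def inverse_distance_weights(points):
    """W[i][j] = 1 / distance between points i and j, with a zero diagonal."""
    n = len(points)
    return [
        [0.0 if i == j else 1.0 / math.dist(points[i], points[j]) for j in range(n)]
        for i in range(n)
    ]


def morans_i(values, weights):
    """Return (I, E(I), Z(I)) for one variable and an N x N weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    i_stat = n * cross / (s0 * sum(d * d for d in dev))

    expected = -1.0 / (n - 1)  # theoretical mean when there is no spatial pattern
    # Assumption: variance of I under the normality assumption.
    s1 = 0.5 * sum(
        (weights[i][j] + weights[j][i]) ** 2 for i in range(n) for j in range(n)
    )
    s2 = sum(
        (sum(weights[i]) + sum(weights[j][i] for j in range(n))) ** 2
        for i in range(n)
    )
    second_moment = (n * n * s1 - n * s2 + 3 * s0 * s0) / ((n * n - 1) * s0 * s0)
    variance = second_moment - expected ** 2
    return i_stat, expected, (i_stat - expected) / math.sqrt(variance)


# Hypothetical centroids along a line, with one clustered and one alternating variable.
centroids = [(float(k), 0.0) for k in range(6)]
weights = inverse_distance_weights(centroids)
clustered = morans_i([1, 1, 1, 5, 5, 5], weights)
alternating = morans_i([1, 5, 1, 5, 1, 5], weights)
```

In this made-up example the clustered variable yields a positive I and a positive Z(I), while the alternating variable yields negative values for both. Note that the theoretical mean is slightly below zero for small N, so I should be judged against E(I) rather than against zero.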

CrimeStat uses point locations to calculate spatial autocorrelation statistics. The data entry for the program requires X and Y values in the form of a projected coordinate system. Therefore, ArcGIS was used to compute the X and Y coordinates (not longitude and latitude) for the centroid of each of the 55 residential neighbourhoods in Saskatoon. To calculate Moran's I, CrimeStat also requires that intensity values be associated with each point. In this case, the intensity values were the Z-scores (not to be confused with the Z value of significance) for the five crime and three selected socio-economic variables in the 55 neighbourhoods.
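For readers unfamiliar with the centroid step, the following sketch computes the area-weighted centroid of a polygon from its projected X and Y vertex coordinates using the standard shoelace formula. It is not ArcGIS's routine, and the polygon, the record layout and the intensity value are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


# Hypothetical neighbourhood boundary in projected coordinates (metres).
boundary = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
x, y = polygon_centroid(boundary)
# One input record per neighbourhood: X, Y and an intensity (here a made-up Z-score).
record = (x, y, 1.25)
```

The formula requires planar coordinates, which is consistent with the requirement that X and Y come from a projected coordinate system rather than longitude and latitude.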
