Example analysis of line transect data.
In this exercise, we use R
(R Core Team, 2019) and the Distance
package (Miller, Rexstad, Thomas, Marshall, & Laake, 2019) to fit different detection function models to line transect survey data of winter wren (Troglodytes troglodytes) density and abundance. These data were part of a study described by Buckland (2006).
ds
functionNineteen line transects were walked twice (Figure 1).
The fields of the wren_lt
data set are:
This command assumes that the Distance
package has been installed on your computer. The R workspace wren_lt
contains detections of winter wrens from the line transect surveys of Buckland (2006).
The effort, or transect length has been adjusted to recognise each transect is walked twice. Examine the first few rows of wren_lt
using the function head()
head(wren_lt)
Region.Label Area Sample.Label Effort object distance Study.Area
1 Montrave 33.2 1 0.416 5 15 Montrave 4
2 Montrave 33.2 1 0.416 6 80 Montrave 4
3 Montrave 33.2 1 0.416 7 35 Montrave 4
4 Montrave 33.2 1 0.416 8 55 Montrave 4
5 Montrave 33.2 1 0.416 12 12 Montrave 4
6 Montrave 33.2 1 0.416 13 75 Montrave 4
The object wren_lt
is a dataframe object made up of rows and columns.
The code above determines the number of detection distances that are not missing. Why might there be rows in our data where detection distance is missing? Distance would have to be recorded as missing for rows representing transects on which there were no detections. The transect and its effort would need to appear in the data, but without detections, the perpendicular distance would be recorded as missing (NA).
Gain familiarity with the perpendicular distance data using the hist()
function (Figure 2).
hist(wren_lt$distance, xlab="Distance (m)", main="Winter wren line transects")
Note that there appears to be too few detections between 0 and 20m, and too many detections between 20m and 40m. This may be evidence of evasive movement by winter wrens; see further discussion of this below.
A guaranteed way to produce incorrect results from your analysis is to misspecify the units distances are measured. The
ds
function has an argumentconvert.units
where the user provides a value to report density in proper units. Providing an incorrect value will result in estimates that are out by orders of magnitude.
We can choose the units in which winter wren density is to be reported, we choose square kilometre. How to transmit this information to the ds
function?
The answer is another function convert_units
. Arguments to this function are
Specify the correct arguments to this function for the winter wren data set. Note: units are specified as quoted strings, singular rather than plural; e.g. “meter” rather than “meters”
conversion.factor <- convert_units("meter", "kilometer", "hectare")
ds
Detection functions are fitted using the ds
function and this function requires a data frame to have a column called distance
. We have this in our nests
data, therefore, we can simply supply the name of the data frame to the function along with additional arguments.
Details about the arguments for this function:
key="hn"
adjustment=NULL
convert_units=conversion.factor
wren.hn <- ds(data=wren_lt, key="hn", adjustment=NULL,
convert_units=conversion.factor)
On calling the ds
function, information is provided to the screen reminding the user what model has been fitted and the associated AIC value. More information is supplied by applying the summary()
function to the object created by ds()
.
summary(wren.hn)
Summary for distance analysis
Number of observations : 156
Distance range : 0 - 100
Model : Half-normal key function
AIC : 1418.188
Detection function parameters
Scale coefficient(s):
estimate se
(Intercept) 4.105816 0.1327744
Estimate SE CV
Average p 0.685037 0.05678866 0.08289868
N in covered region 227.724931 21.47288594 0.09429308
Summary statistics:
Region Area CoveredArea Effort n k ER se.ER
1 Montrave 33.2 193.2 9.66 156 19 16.14907 1.226096
cv.ER
1 0.07592366
Abundance:
Label Estimate se cv lcl ucl df
1 Total 39.13286 4.399026 0.1124126 31.30227 48.92235 74.24692
Density:
Label Estimate se cv lcl ucl df
1 Total 1.1787 0.1325008 0.1124126 0.9428394 1.473565 74.24692
summary
functionExamining the output produced by summary(wren.hn)
notice
Visually inspect the fitted detection function with the plot()
function, specifying the cutpoints histogram with argument breaks
(Figure 3):
cutpoints <- c(0,5,10,15,20,30,40,50,65,80,100)
plot(wren.hn, breaks=cutpoints, main="Half normal model, wren line transects")
Continue to note the presence of evasive movement in this plot of the fit of detection function to the observed data.
Detection function forms and shapes, are specified by changing the key
and adjustment
arguments.
The options available for key
detection functions are:
key="hn"
) - defaultkey="hr"
)key="unif"
)The options available for adjustment terms are:
adjustment=NULL
)adjustment="cos"
) - defaultadjustment="herm"
)adjustment="poly"
)To fit a uniform key function with cosine adjustment terms, use the command:
wren.unif.cos <- ds(wren_lt, key="unif", adjustment="cos",
convert_units=conversion.factor)
When this line of code is executed, multiple models will be fitted, succssively adding addition adjustment terms. When the model with four adjustment terms is fit, an error message is returned; but a uniform key with 3 cosine adjustments is fitted and contained in the returned object.
AIC model selection will be used to fit adjustment terms of up to order 5.
To fit a hazard rate key function with simple polynomial adjustment terms, then use the command:
wren.hr.poly <- ds(wren_lt, key="hr", adjustment="poly",
convert_units=conversion.factor)
Each fitted detection function produces a different estimate of winter wren abundance and density. The estimate depends upon the model chosen. The model selection tool for distance sampling data is AIC.
AIC(wren.hn, wren.hr.poly, wren.unif.cos)
df AIC
wren.hn 1 1418.188
wren.hr.poly 2 1412.133
wren.unif.cos 3 1416.433
df
in the AIC table indicates the number of parameters associated with each model.
In addition to the relative ranking of models provided by AIC, it is also important to know whether selected model(s) actually fit the data. The model is the basis of inference, so it is dangerous to make inference from a model that does not fit the data. Goodness of fit is assessed using the function gof_ds
. This function by default, reports the goodness of fit assessed by the Cramer von-Mises test along with a quantile-quantile plot showing locations of deviations from good fit. Optionally, a \(\chi^2\) goodness of fit test and a bootstrap version of the Kolomogorov-Smirnov goodness of fit test can be performed. Using function defaults, we see results only of the Cramer von-Mises test along with the Q-Q plot (Figure 4).
gof_ds(wren.hr.poly)
Goodness of fit results for ddf object
Distance sampling Cramer-von Mises test (unweighted)
Test statistic = 0.249898 p-value = 0.1885
Even though there may have been evasive movement, the goodness of fit statistics are still sufficient for using detection function models for inference.
The function summarise_ds_models
combines the work of AIC
and gof_ds
to produce a table of fitted models and summary statistics.
knitr::kable(summarize_ds_models(wren.hn, wren.hr.poly, wren.unif.cos),digits=3,
caption="Model comparison table for wren line transect data, Montrave.")
Model | Key function | Formula | C-vM p-value | \(\hat{P_a}\) | se(\(\hat{P_a}\)) | \(\Delta\)AIC | |
---|---|---|---|---|---|---|---|
2 | Hazard-rate | ~1 | 0.189 | 0.844 | 0.025 | 0.000 | |
3 | Uniform with cosine adjustment terms of order 1,2,3 | NA | 0.200 | 0.757 | 0.139 | 4.300 | |
1 | Half-normal | ~1 | 0.077 | 0.685 | 0.057 | 6.055 |
The AIC model selection tools suggest the hazard rate key function is the preferred model. However, examine the shape of the hazard rate detection function in contrast to the uniform cosine fitted detection function (Figure 5).
par(mfrow=c(1,2))
plot(wren.hr.poly, breaks=cutpoints, main="Hazard rate")
plot(wren.unif.cos, breaks=cutpoints, main="Uniform cosine")
The fellow who gathered these data (Prof Buckland) maintained the shape of the fitted hazard rate detection function is not plausible. Instead, he chose the uniform key with cosine adjustments for making inference (Buckland, 2006, p. 352):
Common Chaffinch and Winter Wren showed some evidence of observer avoidance. For 2 of the 12 data sets, this resulted in a fitted hazard rate detection function with certain detection out to ∼60 m, with an implausibly rapid fall-off beyond 70 m. In these two analyses, a model with a slightly higher AIC value and a more plausible fit to the detection function was selected.
This is an example of moderating objective model selection tools with common sense and understanding of field procedures.