Point transect density estimation

Example analysis of point transect songbird data.

Eric Rexstad (http://distancesampling.org), CREEM, Univ of St Andrews (https://creem.st-andrews.ac.uk)
2022-03-14

In this exercise, we use R (R Core Team, 2019) and the Distance package (Miller, Rexstad, Thomas, Marshall, & Laake, 2019) to fit different detection function models to point transect survey data of savannah sparrows (Passerculus sandwichensis) and to estimate their density and abundance. These data were part of a study, described by Knopf et al. (1988), examining the effect of livestock grazing upon vegetation structure and consequently upon the avian community.

Steps in this analysis are similar to the steps taken in the line transect analysis of winter wren data.

Objectives

Survey design

A total of 373 point transects were placed in three pastures in the Arapaho National Wildlife Refuge in Colorado (Figure 1). Elevation of these pastures was ~2500 m. We will not deal with pasture-level analysis of these data in this vignette and will alter the data to remove the strata designations.

Figure 1: Summer grazed pastures along Illinois River Arapaho National Wildlife Refuge, Colorado. Figure from (Knopf et al., 1988).

The fields of the Savannah_sparrow_1980 data set are:

Make the data available for the R session

This command assumes that the Distance package, which is loaded below and supplies the Savannah_sparrow_1980 data set, has been installed on your computer. The data set contains detections of savannah sparrows from the point transect surveys of Knopf et al. (1988).

library(Distance)
data(Savannah_sparrow_1980)
#  remove pasture-level identifier in Region.Label
Savannah_sparrow_1980$Region.Label <- "Single_stratum"

The code above overwrites the strata designations in the original data to make it appear that all data were derived from a single stratum. This makes the analysis simpler to perform. There are examples of analysis of stratified data in another vignette.

Examine the first few rows of Savannah_sparrow_1980 using the function head():

head(Savannah_sparrow_1980)
    Region.Label Area Sample.Label Effort object distance Study.Area
1 Single_stratum    1    POINT   1      1     NA       NA  SASP 1980
2 Single_stratum    1    POINT   2      1     NA       NA  SASP 1980
3 Single_stratum    1    POINT   3      1     NA       NA  SASP 1980
4 Single_stratum    1    POINT   4      1     NA       NA  SASP 1980
5 Single_stratum    1    POINT   5      1     NA       NA  SASP 1980
6 Single_stratum    1    POINT   6      1     NA       NA  SASP 1980

The object Savannah_sparrow_1980 is a data frame made up of rows and columns. In contrast to the Montrave winter wren line transect data used in the previous vignette, savannah sparrows were not detected at every point transect. Radial distances receive the value NA for transects where there were no detections. To determine the number of detections in this data set, we total the number of values in the distance field that are not NA:

sum(!is.na(Savannah_sparrow_1980$distance))
[1] 276
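
The same idea extends to counting points with no detections. The following is a minimal sketch on a small made-up data frame (the object `toy` and its values are hypothetical, not the survey data), assuming the same layout of one row per detection and a single NA row for each point without detections.

```r
# Hypothetical toy data in the same layout as Savannah_sparrow_1980
toy <- data.frame(
  Sample.Label = c("POINT 1", "POINT 2", "POINT 2", "POINT 3", "POINT 4"),
  distance     = c(NA, 12.5, 30.1, NA, 8.0)
)

# Number of detections: rows whose distance is not NA
n_detections <- sum(!is.na(toy$distance))

# Detections per point, then the number of points with none
detections_per_point <- tapply(!is.na(toy$distance), toy$Sample.Label, sum)
n_points <- length(detections_per_point)
n_empty  <- sum(detections_per_point == 0)

c(detections = n_detections, points = n_points, empty_points = n_empty)
```

Applied to the real data, the same `tapply()` call would show how many of the 373 points produced no detections.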

Examine the distribution of detection distances

Gain familiarity with the radial distance data using the hist() function (Figure 2).

hist(Savannah_sparrow_1980$distance, xlab="Distance (m)", 
     main="Savannah sparrow point transects")
Figure 2: Histogram of radial distances of savannah sparrows across all pastures.

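Note that the shape of the radial distance histogram does not resemble the shape of perpendicular distances gathered from line transect sampling (Buckland, Rexstad, Marques, & Oedekoven, 2015, sec. 1.3). The area surveyed increases with distance from the point, so few detections occur very close to the point.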

Specify unit conversions

With point transects, there are only units of measure associated with the size of the study area and the radial distance measures, because effort is measured in number of visits, rather than distance.

conversion.factor <- convert_units("meter", NULL, "hectare")
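
To make the role of the conversion concrete, here is a minimal base R sketch of the arithmetic, assuming distances in metres and area in hectares as above. The names `m2_per_hectare`, `multiplier` and `radius_m` are illustrative and not part of the Distance package; the multiplier shown is the value we would expect the conversion to represent, not output captured from convert_units().

```r
# Assumption: distances are in metres, area in hectares, effort is unitless (visits)
m2_per_hectare <- 10000
multiplier <- 1 / m2_per_hectare      # hectares per square metre

# Area of a circle of hypothetical radius 50 m, expressed in hectares
radius_m <- 50
circle_area_ha <- pi * radius_m^2 * multiplier
circle_area_ha
```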

Fitting a simple detection function model with ds

Detection functions are fitted using the ds function, which requires a data frame with a column called distance. The Savannah_sparrow_1980 data have this column, so we can simply supply the name of the data frame to the function along with additional arguments.

Details about the arguments for this function:

As is customary, right truncation is employed to remove 5% of the observations most distant from the transects, as detections at these distances contain little information about the shape of the fitted probability density function near the point.

sasp.hn <- ds(data=Savannah_sparrow_1980, key="hn", adjustment=NULL,
              transect="point", convert_units=conversion.factor, truncation="5%")
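
The effect of percentage truncation can be illustrated on made-up distances. This is a sketch only: it assumes the truncation distance corresponds to the 95th percentile of the observed distances, which is a plausible reading of truncation="5%" but is not confirmed by anything shown in this vignette. The vector `d` is simulated, not survey data.

```r
set.seed(1)                                 # reproducible made-up distances
d <- runif(200, min = 0, max = 80)          # hypothetical radial distances (m)

# Assumed rule: truncate at the 95th percentile of observed distances
w <- unname(quantile(d, probs = 0.95))

kept <- d[d <= w]                           # detections retained for fitting
prop_kept <- length(kept) / length(d)
c(truncation_distance = w, proportion_kept = prop_kept)
```

In the analysis above, 262 of the 276 detections remain after truncation, which is consistent with roughly 5% of observations being discarded.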

On calling the ds function, information is provided to the screen reminding the user what model has been fitted and the associated AIC value. More information is supplied by applying the summary() function to the object created by ds().

summary(sasp.hn)

Summary for distance analysis 
Number of observations :  262 
Distance range         :  0  -  51.025 

Model : Half-normal key function 
AIC   : 2021.776 

Detection function parameters
Scale coefficient(s):  
            estimate         se
(Intercept) 3.044624 0.04270318

                      Estimate          SE         CV
Average p             0.321125  0.02296184 0.07150438
N in covered region 815.881752 71.61193757 0.08777245

Summary statistics:
          Region Area CoveredArea Effort   n   k        ER      se.ER
1 Single_stratum    1    305.0877    373 262 373 0.7024129 0.04726421
       cv.ER
1 0.06728836

Abundance:
  Label Estimate        se         cv      lcl      ucl       df
1 Total 2.674253 0.2625757 0.09818656 2.206264 3.241512 598.5882

Density:
  Label Estimate        se         cv      lcl      ucl       df
1 Total 2.674253 0.2625757 0.09818656 2.206264 3.241512 598.5882
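
Several quantities in this summary can be reproduced by hand, which helps in reading the output. The sketch below copies values from the summary above and applies standard point transect arithmetic; the variable names are illustrative, and the coefficient of variation is combined with the usual delta-method approximation rather than by calling any Distance function.

```r
# Values copied from the summary output above
n     <- 262          # detections retained after truncation
k     <- 373          # number of point transects (total effort)
w     <- 51.025       # truncation distance (m)
p_a   <- 0.321125     # estimated average detection probability
cv_p  <- 0.07150438   # CV of average p
cv_er <- 0.06728836   # CV of encounter rate

covered_area   <- k * pi * w^2 / 10000   # hectares within w of any point
encounter_rate <- n / k                  # detections per point
n_covered      <- n / p_a                # estimated birds in covered region
density        <- n_covered / covered_area   # birds per hectare
cv_density     <- sqrt(cv_er^2 + cv_p^2)     # delta-method approximation

round(c(covered_area = covered_area, ER = encounter_rate,
        N_covered = n_covered, density = density, cv = cv_density), 4)
```

Because Area is 1 in these data, the Abundance table simply repeats the density estimate.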

Visually inspect the fitted detection function with the plot() function, specifying the histogram cutpoints with the argument breaks. Add the argument pdf=TRUE so the plot shows the probability density function rather than the detection function. The probability density function (PDF) is preferred for assessing model fit because it incorporates information about the availability of animals to be detected. Few animals are available to be detected at small distances, so lack of fit at small distances is less consequential for points than for lines (Figure 3).

cutpoints <- c(0,5,10,15,20,30,40,max(Savannah_sparrow_1980$distance, na.rm=TRUE))
plot(sasp.hn, breaks=cutpoints, pdf=TRUE, main="Savannah sparrow point transect data.")
Figure 3: Fit of half normal detection function to savannah sparrow data.
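
The relationship between the detection function and the probability density function can be sketched in base R. This example assumes the scale coefficient reported in the summary is on the log scale, so that the half-normal scale is exp(3.044624), and it uses the truncation distance of 51.025 m from the same summary. The function names `g`, `rg` and `f` are illustrative only.

```r
sigma <- exp(3.044624)   # half-normal scale, assuming a log-scale coefficient
w     <- 51.025          # truncation distance (m)

# Half-normal detection function g(r)
g <- function(r) exp(-r^2 / (2 * sigma^2))

# For points, availability grows in proportion to r, so the density is r * g(r)
rg <- function(r) r * g(r)
area_under_rg <- integrate(rg, lower = 0, upper = w)$value

f   <- function(r) rg(r) / area_under_rg   # probability density function
p_a <- 2 * area_under_rg / w^2             # average detection probability

# Distance at which the PDF peaks
mode_r <- optimize(f, interval = c(0, w), maximum = TRUE)$maximum
c(sigma = sigma, p_a = p_a, pdf_mode = mode_r)
```

Under these assumptions, p_a agrees closely with the "Average p" reported in the summary, and the PDF peaks near the scale parameter rather than at zero distance.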

Specifying different detection functions

Detection function forms and shapes are specified by changing the key and adjustment arguments.

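The options available for the key and adjustment elements of detection functions are the half-normal ("hn"), hazard-rate ("hr") and uniform ("unif") keys, combined with cosine ("cos"), Hermite polynomial ("herm") or simple polynomial ("poly") adjustment terms, or with no adjustment terms (NULL).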

To fit a uniform key function with cosine adjustment terms, use the command:

sasp.unif.cos <- ds(Savannah_sparrow_1980, key="unif", adjustment="cos",
                    transect="point", convert_units=conversion.factor, truncation="5%")

To fit a hazard rate key function with simple polynomial adjustment terms, use the command:

sasp.hr.poly <- ds(Savannah_sparrow_1980, key="hr", adjustment="poly", 
                   transect="point", convert_units=conversion.factor, truncation="5%")
Error in adj.check.order(adj.series, adj.order, key) : 
  Polynomial adjustment terms of order < 4 selected
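
For intuition about the shapes being compared, the key functions can be written as plain R functions. This is a sketch using the standard textbook forms, each scaled so that detection is certain at zero distance; the function names and parameter values are hypothetical and are not taken from the fitted models or from the Distance package internals.

```r
# Half-normal key with scale sigma
half_normal <- function(x, sigma) exp(-x^2 / (2 * sigma^2))

# Hazard-rate key with scale sigma and shape b
hazard_rate <- function(x, sigma, b) 1 - exp(-(x / sigma)^(-b))

# Uniform key with one cosine adjustment term, scaled so that g(0) = 1
unif_cos1 <- function(x, w, a1) (1 + a1 * cos(pi * x / w)) / (1 + a1)

x <- seq(0, 50, by = 10)   # distances (m) at which to evaluate each form
round(rbind(
  hn       = half_normal(x, sigma = 20),
  hr       = hazard_rate(x, sigma = 20, b = 3),
  unif_cos = unif_cos1(x, w = 50, a1 = 0.5)
), 3)
```

The hazard-rate form keeps detection close to 1 over a wider shoulder before dropping, which is the main qualitative difference from the half-normal.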

Model comparison

Each fitted detection function produces a different estimate of Savannah sparrow abundance and density. The estimate depends upon the model chosen. The model selection tool for distance sampling data is AIC.

AIC(sasp.hn, sasp.hr.poly, sasp.unif.cos)
              df      AIC
sasp.hn        1 2021.776
sasp.hr.poly   2 2026.131
sasp.unif.cos  1 2023.178
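
Differences in AIC are easier to interpret than the raw values. The short sketch below copies the AIC values from the output above into a named vector (the vector itself is illustrative) and computes each model's difference from the best-supported model.

```r
# AIC values copied from the output above
aic <- c(sasp.hn = 2021.776, sasp.hr.poly = 2026.131, sasp.unif.cos = 2023.178)

# Difference from the smallest AIC; the preferred model has delta = 0
delta_aic <- aic - min(aic)
sort(round(delta_aic, 3))
```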

Absolute goodness of fit

In addition to the relative ranking of models provided by AIC, it is also important to know whether selected model(s) actually fit the data. The model is the basis of inference, so it is dangerous to make inference from a model that does not fit the data. Goodness of fit is assessed using the function gof_ds (Figure 4).

gof_ds(sasp.hn)
Figure 4: Q-Q plot of half normal detection function to savannah sparrow data.


Goodness of fit results for ddf object

Distance sampling Cramer-von Mises test (unweighted)
Test statistic = 0.0835959 p-value = 0.671325
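
To show what a Cramer-von Mises statistic measures, the following sketch computes the standard unweighted statistic on simulated distances. Everything here is an assumption for illustration: the distances are simulated from a half-normal point transect model with hypothetical values of `sigma`, `w` and `n`, the fitted CDF is taken to be the true one, and the p-value calculation performed by gof_ds is not reproduced.

```r
set.seed(42)
sigma <- 21; w <- 51; n <- 200   # hypothetical scale, truncation, sample size

# CDF of radial distances under a half-normal detection function, truncated at w
trunc_mass <- 1 - exp(-w^2 / (2 * sigma^2))
cdf <- function(r) (1 - exp(-r^2 / (2 * sigma^2))) / trunc_mass

# Simulate radial distances by inverting the CDF
u <- runif(n)
r <- sqrt(-2 * sigma^2 * log(1 - u * trunc_mass))

# Unweighted Cramer-von Mises statistic: squared gaps between the
# model CDF at the ordered distances and the empirical CDF midpoints
Fi <- cdf(sort(r))
i  <- seq_len(n)
cvm_stat <- 1 / (12 * n) + sum((Fi - (2 * i - 1) / (2 * n))^2)
cvm_stat
```

Small values of the statistic indicate that the model CDF tracks the empirical CDF closely, which is what the Q-Q plot shows graphically.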

Model comparison tables

The function summarize_ds_models combines the work of AIC and gof_ds to produce a table of fitted models and summary statistics.

knitr::kable(summarize_ds_models(sasp.hn, sasp.hr.poly, sasp.unif.cos),digits=3,
             caption="Model selection summary of savannah sparrow point transect data.")
Table 1: Model selection summary of savannah sparrow point transect data.

| Model | Key function | Formula | C-vM p-value | \(\hat{P_a}\) | se(\(\hat{P_a}\)) | \(\Delta\)AIC |
|---|---|---|---|---|---|---|
| sasp.hn | Half-normal | ~1 | 0.671 | 0.321 | 0.023 | 0.000 |
| sasp.unif.cos | Uniform with cosine adjustment term of order 1 | NA | 0.364 | 0.350 | 0.015 | 1.402 |
| sasp.hr.poly | Hazard-rate | ~1 | 0.674 | 0.326 | 0.038 | 4.355 |

Conclusions

Key differences between analysis of line transect data and point transect data

References

Buckland, S., Rexstad, E., Marques, T., & Oedekoven, C. (2015). Distance sampling: Methods and applications. Springer.
Knopf, F. L., Sedgwick, J. A., & Cannon, R. W. (1988). Guild structure of a riparian avifauna relative to seasonal cattle grazing. The Journal of Wildlife Management, 52(2), 280–290. https://doi.org/10.2307/3801235
Miller, D. L., Rexstad, E., Thomas, L., Marshall, L., & Laake, J. L. (2019). Distance sampling in R. Journal of Statistical Software, 89(1), 1–28. https://doi.org/10.18637/jss.v089.i01
R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.