```{r include=FALSE}
knitr::opts_chunk$set(eval=TRUE, echo=TRUE, message=FALSE, warnings=FALSE)
```
# Analysis of camera trapping data using distance sampling
A distance sampling approach to the analysis of camera trapping data offers the potential advantage that individual animal identification is not required. However, accurate animal-to-camera detection distances are required. This requires calibration prior to the survey with images of objects of known size taken at known distances from the camera. See details in @howeetal for description of the field work and data analysis. Here we present analysis of data from @howeetal using the R package `Distance` @miller.
## Estimating temporal availability for detection
Heat- and motion-sensitive camera traps detect only moving animals within the range of the sensor and the field of view of the camera. Animals are therefore unavailable for detection by camera traps when they are stationary, and when they are above (e.g., semi-arboreal species) or below (e.g., semi-fossorial species) the range of the sensor or the camera, regardless of their distance from the camera in two dimensions. This temporally limited availability for detection must be accounted for to avoid negative bias in estimated densities. When data are abundant, researchers may choose to include only data from times when 100% of the population can be assumed to be active within the vertical range of camera traps @howeetal. However, for rarely-detected species or surveys with lower effort, it might be necessary to include most or all observations of distance. In these situations, survey duration ($T_k$) might be 12- or 24-hours per day, and it becomes necessary to estimate the proportion of time included in $T_k$ when animals were available for detection. Methods for estimating this proportion directly from CT data have been described @rowcliffe_2014, and it can be included in analyses to estimate density @bessone_2020, for example as another multiplier, potentially with an associated standard errors.
## Data input
A data set for recording of detections during *peak activity* are included in the `Distance` package. Examine the `DuikerCameraTraps` described in the data repository @dryad.
```{r, readin, message=FALSE}
library(Distance)
data("DuikerCameraTraps")
```
## Exploratory Data Analysis
A quick summary of the data set including: How many camera stations and how many detections in total.
```{r smaltable}
sum(!is.na(DuikerCameraTraps$distance))
table(DuikerCameraTraps$Sample.Label)
```
Note, three sampling stations (B1, C5, E4) had no detections. The one record for each of those stations has distance recorded as `NA`, but the record is important because it contains effort information.
## Distance recording
A quick examination of the distribution of detection distances; note the bespoke cutpoints causing distance bins to be narrow out to 8m, then increasing in width to the maximum detection distance of 21m.
```{r}
breakpoints <- c(seq(0,8,1), 10, 12, 15, 21)
hist(DuikerCameraTraps$distance, breaks=breakpoints, main="Peak activity data set",
xlab="Radial distance (m)")
```
## Truncation decisions
As described by @howeetal:
> a paucity of observations between 1 and 2 m but not between 2 and 3 m, so we left-truncated at 2 m. Fitted detection functions and probability density functions were heavy-tailed when distances \>15 m were included, so we right truncated at 15 m.
# Detection function fits
The conversion factor must be included either in the call to `ds()` or the call to `bootdht()`. In this vignette, it is included as an argument to `bootdht()`.
Candidate models include the half normal key with 0 and 1 Hermite polynomial adjustment; uniform key with 1 and 2 cosine adjustments and the hazard rate key with 0, 1 and 2 cosine adjustments.
```{r fit}
conversion <- convert_units("meter", NULL, "square kilometer")
trunc.list <- list(left=2, right=15)
mybreaks <- c(seq(2,8,1), 10, 12, 15)
hn0 <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = NULL,
cutpoints = mybreaks, truncation = trunc.list)
hn1 <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = "herm",
order=2,
cutpoints = mybreaks, truncation = trunc.list)
uni1 <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
order=1,
cutpoints = mybreaks, truncation = trunc.list)
uni2 <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
order=c(1,2),
cutpoints = mybreaks, truncation = trunc.list)
hr0 <- ds(DuikerCameraTraps, transect = "point", key="hr", adjustment = NULL,
cutpoints = mybreaks, truncation = trunc.list)
hr1 <- ds(DuikerCameraTraps, transect = "point", key="hr", adjustment = "cos",
order=2,
cutpoints = mybreaks, truncation = trunc.list)
hr2 <- ds(DuikerCameraTraps, transect = "point", key="hr", adjustment = "cos",
order=c(2,3),
cutpoints = mybreaks, truncation = trunc.list)
```
# Model selection adjustments from overdispersion
Overdispersion causes AIC to select overly-complex models, so analysts should specify the number/order of adjustment terms manually when fitting distance sampling models to data from camera traps, rather than allowing automated selection using AIC. @howe_model_2019 describes two methods for performing model selection of distance sampling models in the face of overdispersion. Here we provide R functions to perform the first of these methods. The first method of @howe_model_2019 employs a two-step process. First, an overdisersion factor $(\hat{c})$ is computed for each key function family from the most complex model in each family. The $\hat{c}$ is derived from the $\chi^2$ goodness of fit test statistic divided by its degrees of freedom. This results in an adjusted AIC score for each model in the key function family:
$$QAIC = -2 \left \{ \frac{log(\mathcal{L}(\hat{\theta}))}{\hat{c}} \right \} + 2K$$
Code to perform this QAIC computation is found in the function `qaic.pass1`:
```{r showpass1}
chat <- function(modobj) {
# computes c-hat for a dsmodel object using Method 1 of Howe et al. (2018)
test <- gof_ds(modobj)
num <- test$chisquare$chi1$chisq
denom <- test$chisquare$chi1$df
chat <- num/denom
return(chat)
}
qaic <- function(modobj, chat) {
# computes QAIC for a dsmodel object given a c-hat
value <- 2* modobj$ddf$ds$value/chat + 2 * (length(modobj$ddf$ds$pars)+1)
return(value)
}
qaic.pass1 <- function(...) {
# Performs Pass 1 model selection based upon Method 1 of Howe et al. (2018)
# Arguments are dsmodel objects; assumed all based on same key function
# c-hat is computed for the most parameter-rich model in the group
# qaic is calculated for each model in group based upon this c-hat
# Result returned in the form of a data.frame with model name, npar, aic and qaic
models <- list(...)
num.models <- length(models)
npar <- unlist(lapply(models, function(x) length(x$ddf$ds$par)))
modname <- unlist(lapply(models, function(x) x$ddf$name.message))
aic <- unlist(lapply(models, function(x) x$ddf$criterion))
chat.bigmod <- chat(models[[which.max(npar)]])
qaic <- vector(mode="numeric", length = num.models)
for (i in 1:num.models) {
qaic[i] <- qaic(models[[i]], chat.bigmod)
}
nicetab <- data.frame(modname, npar, aic, qaic)
return(nicetab)
}
```
Tables of QAIC values for each key function family are shown below (code for `kable()` calls suppressed for easier readability of results).
```{r pass1a, echo=FALSE}
knitr::kable(qaic.pass1(hn0, hn1),
caption="QAIC values for half normal key models.")
```
```{r pass1b, echo=FALSE}
knitr::kable(qaic.pass1(uni1, uni2),
caption="QAIC values for uniform key models.")
```
```{r pass1c, echo=FALSE}
knitr::kable(qaic.pass1(hr0, hr1, hr2),
caption="QAIC values for hazard rate key models.")
```
From this first pass of model selection based on QAIC values, we find the preferable model with the hazard rate key function is one without adjustment terms. The model with the uniform key function preferred by QAIC has a single adjustment term; likewise for the half normal key function.
The second step of model selection ranks the models by their $\hat{c}$ values.
```{r pass2}
winners <- list(hn1, uni1, hr0)
chats <- unlist(lapply(winners, function(x) chat(x)))
modnames <- unlist(lapply(winners, function(x) x$ddf$name.message))
results <- data.frame(modnames, chats)
results.sort <- results[order(results$chats),]
knitr::kable(results.sort, digits=2, row.names = FALSE,
caption="Compare with Table S5 of Howe et al. (2018)")
```
For this data set, the model chosen by this algorithm that adjusts for overdispersion is the same model (half normal key without adjustments) as would have been chosen by conventional model selection.
# Sensibility check for detection parameter estimates
As a check of the detection function vis-a-vis @howeetal, the paper reports the effective detection radius ($\rho$) to be 9.4m for the peak activity data set.
The effective detection radius can be derived from $\hat{P_a}$ as reported by the function `ds` as
$$\hat{\rho} = \sqrt{\hat{P_a} \cdot w^2}$$
```{r}
p_a <- hr0$ddf$fitted[1]
w <- 15
rho <- sqrt(p_a * w^2)
```
$\hat{P_a}$ is estimated to be `r round(p_a,3)`, resulting in an estimate of $\hat{\rho}$ of `r round(rho,3)`.
# Selected detection function
```{r}
par(mfrow=c(1,2))
plot(hr0, main="Peak activity", xlab="Distance (m)",
showpoints=FALSE, lwd=3, xlim=c(0, 15))
plot(hr0, main="Peak activity", xlab="Distance (m)", pdf=TRUE,
showpoints=FALSE, lwd=3, xlim=c(0, 15))
```
```{r, echo=FALSE}
par(mfrow=c(1,1))
```
## Density estimates
The camera traps do not view the entire area around them, as would be the case with simple point transect sampling. The portion of the area sampled needs to be incorporated in the estimation of abundance. The data file contains a column `multiplier` that represents the proportion of the circle sampled. @howeetal notes the camera angle of view (AOV) of 42$^{\circ}$. The proportion of the circle viewed is this value over 360$^{\circ}$.
An argument to `dht2` is `sample_fraction`, an obvious place to include this quantity.
```{r, sampfrac}
viewangle <- 42 # degrees
samfrac <- viewangle / 360
conversion <- convert_units("meter", NULL, "square kilometer")
peak.hr.dens <- dht2(hr0, flatfile=DuikerCameraTraps, strat_formula = ~1,
sample_fraction = samfrac, er_est = "P2", convert_units = conversion)
print(peak.hr.dens, report="density")
```
# Bootstrap for variance estimation
To produce a more reliable estimate of the precision of the point estimate, produce bootstrap estimates using `bootdht`. Note, with an Intel i5 generation processor, this call to `bootdht` required ~3hrs of computing; you may desire more bootstrap replicates than provided here. Two issues to note when using `bootdht` with these camera trap data:
- because of the viewing angle of the camera, we must specify the `sample_fraction` argument just as was done in the call to `dht2`
- in @howeetal, estimates of density rather than estimates of abundance are presented. To produce estimates from `bootdht` consistent with @howeetal, we write our own simple summary function to extract density estimates from each bootstrap replicate. This is instead of relying upon the default function `bootdht_Nhat_summarize` provided in the `Distance` package, which produces estimates of abundance rather than density.
```{r, bootstrap, results='hide', eval=TRUE}
viewangle <- 42 # degrees
samfrac <- viewangle / 360
mysummary <- function(ests, fit){
return(data.frame(Dhat = ests$individuals$D$Estimate))
}
duiker.boot.hr <- bootdht(model=hr0, flatfile=DuikerCameraTraps, resample_transects = TRUE,
nboot=400, summary_fun=mysummary, sample_fraction = samfrac,
convert.units = conversion)
```
Confidence limits computed via the percentile method of the bootstrap.
```{r bootresult, eval=TRUE}
print(summary(duiker.boot.hr))
```
```{r, fig.width=8, fig.cap="Distribution of density estimates from bootstrap replicates.", eval=TRUE}
hist(duiker.boot.hr$Dhat, breaks = 20,
xlab="Estimated density", main="D-hat estimates bootstraps")
abline(v=quantile(duiker.boot.hr$Dhat, probs = c(0.025,0.975), na.rm=TRUE), lwd=2, lty=3)
```
Note the confidence limits computed from the bootstrap are somewhat wider than the confidence limits computed via `dht2`.