Analysis of camera trapping data

Example analysis with Ivory Coast Maxwell’s duiker.

Eric Howe, Eric Rexstad and Len Thomas (CREEM, Univ of St Andrews); http://distancesampling.org, https://creem.st-andrews.ac.uk
2023-12-20

Analysis of camera trapping data using distance sampling

A distance sampling approach to the analysis of camera trapping data offers the potential advantage that individual animal identification is not required. However, accurate animal-to-camera detection distances are required; these are obtained by calibrating the cameras before the survey with images of objects taken at known distances from the camera. See Howe, Buckland, Després-Einspenner, & Kühl (2017) for a description of the field work and data analysis. Here we present an analysis of the data from Howe et al. (2017) using the R package Distance (Miller, Rexstad, Thomas, Marshall, & Laake, 2019).

Estimating temporal availability for detection

Heat- and motion-sensitive camera traps detect only moving animals within the range of the sensor and the field of view of the camera. Animals are therefore unavailable for detection by camera traps when they are stationary, and when they are above (e.g., semi-arboreal species) or below (e.g., semi-fossorial species) the range of the sensor or the camera, regardless of their distance from the camera in two dimensions. This temporally limited availability for detection must be accounted for to avoid negative bias in estimated densities. When data are abundant, researchers may choose to include only data from times when 100% of the population can be assumed to be active within the vertical range of camera traps (Howe et al., 2017). However, for rarely-detected species or surveys with lower effort, it might be necessary to include most or all observations of distance. In these situations, survey duration (\(T_k\)) might be 12 or 24 hours per day, and it becomes necessary to estimate the proportion of time included in \(T_k\) when animals were available for detection. Methods for estimating this proportion directly from camera trap data have been described (Rowcliffe, Kays, Kranstauber, Carbone, & Jansen, 2014), and it can be included in analyses to estimate density (Bessone et al., 2020), for example as another multiplier, potentially with an associated standard error.

Data input

Times of independent camera triggering events for the period 28 June to 21 September 2014 at 23 cameras are recorded in a file described in the data repository of Howe, Buckland, Després-Einspenner, & Kühl (2018).

trigger.events <- read.table(file="https://datadryad.org/stash/downloads/file_stream/73223",
                             header=TRUE)

The format of the trigger.events data frame is adjusted to create a datetime field for use in the activity package (Rowcliffe, 2021).

trigger.events$date <- paste("2014",
                       sprintf("%02i", trigger.events$month), 
                       sprintf("%02i", trigger.events$day),
                       sep="/")
trigger.events$time <- paste(sprintf("%02i", trigger.events$hour),
                       sprintf("%02i", trigger.events$minute),
                       sep=":")
trigger.events$datetime <- paste(trigger.events$date, trigger.events$time)

Functions in the activity package

We will employ two functions from the activity package. First, gettime converts the time of day of a camera triggering event into a position in the 24hr cycle, measured in radians. In other words, an event occurring at midday is recorded as \(\pi\) and an event occurring at midnight is recorded as 2\(\pi\).

library(activity)
trigger.events$rtime <- gettime(trigger.events$datetime, 
                                tryFormats = "%Y/%m/%d %H:%M",
                                scale = "radian")

With the triggering times converted to radians, the function fitact smooths their distribution using a kernel smoother and estimates the proportion of time (in a 24hr day) animals were active. In addition, the triggering time data can be resampled to provide a measure of uncertainty in the point estimate of the activity proportion.

act_result <- fitact(trigger.events$rtime, sample="data", reps=100)

A plot of the histogram of triggering times, along with the fitted smooth, is produced by applying the plot function to the object returned by fitact.

plot(act_result)

Figure 1: Fitted smooth to histogram of camera triggering times for Maxwell’s duiker data.

The activity proportion estimated by the smooth can be extracted from the object created by fitact by reaching into the slot called act. The uncertainty around the point estimate is derived from the resampling that takes place within fitact. The slot displays the point estimate, standard error and confidence interval bounds.

print(act_result@act)
       act         se   lcl.2.5%  ucl.97.5% 
0.33463831 0.02133658 0.29386939 0.37740705 

The output above would be used to adjust density estimates for temporal activity if the cameras were in operation 24hrs per day. However, in this study, cameras were only active for 11.5 hours per day (0630-1800).

Adjustment for temporal availability

We use the temporal availability information to create a multiplier, defined as the proportion of the camera operation time during which animals were available to be detected.

This is not equivalent to the value produced by the fitact function; that value is the proportion of the 24hr day during which animals were available to be detected. The availability multiplier must therefore be adjusted for the daily camera operation period. Uncertainty in this proportion is also carried through our computations.

The point estimate and standard error are pulled from the fitact object, adjusted for the daily camera operation time and placed into a data frame named creation within a named list, in the fashion shown below.

camera.operation.per.day <- 11.5   # cameras active 0630-1800
prop.camera.time <- camera.operation.per.day / 24
avail <- list(creation=data.frame(rate = act_result@act[1]/prop.camera.time,
                                  SE   = act_result@act[2]/prop.camera.time))
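
As a quick check (using only objects created above), printing the multiplier shows the activity proportion rescaled from a 24hr basis to the 11.5hr camera operating day; given the act values printed earlier this should be roughly 0.335/0.479, i.e. about 0.70.

# adjusted availability: expect a rate of roughly 0.70 with SE of roughly 0.045
print(avail$creation)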

A more robust way of incorporating uncertainty in the temporal availability estimate will be described later.

Detection data analysis

Detection distances for the full daytime data set are also available from Howe et al. (2018) and are read in the code chunk below:

DuikerCameraTraps <- read.csv(file="https://datadryad.org/stash/downloads/file_stream/73221", 
                              header=TRUE, sep="\t")
DuikerCameraTraps$Area <- DuikerCameraTraps$Area / (1000*1000)
DuikerCameraTraps$object <- NA
DuikerCameraTraps$object[!is.na(DuikerCameraTraps$distance)] <- 1:sum(!is.na(DuikerCameraTraps$distance))

The data file records the study area size in square meters; the second line above converts this to square kilometers; the remaining lines create an object field that uniquely identifies each observation.

Exploratory Data Analysis

A quick summary of the data set: the total number of detections and the number of records at each camera station.

sum(!is.na(DuikerCameraTraps$distance))
[1] 11180
table(DuikerCameraTraps$Sample.Label)

  A1   A2   A3   A4   B1   B2   B3   B4   C1   C2   C3   C4   C5   C6 
 388   66  988  420    3 1951   73  208   52  195  767  153   41 2682 
  D3   D4   D5   E3   E4   E5   E6 
 342  193  524  518    1  375 1241 

Note, three sampling stations (B1, C5, E4) had no detections. The one record for each of those stations has distance recorded as NA, but the record is important because it contains effort information.
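
These stations can be confirmed directly from the data by listing the stations with a missing (NA) distance record (a quick check relying on the NA convention just described):

# stations having a record with a missing (NA) distance, i.e. effort but no detections
unique(DuikerCameraTraps$Sample.Label[is.na(DuikerCameraTraps$distance)])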

Distance recording

A quick examination of the distribution of detection distances; note the bespoke cutpoints causing distance bins to be narrow out to 8m, then increasing in width to the maximum detection distance of 21m.

breakpoints <- c(seq(0,8,1), 10, 12, 15, 21)
hist(DuikerCameraTraps$distance, breaks=breakpoints, main="Peak activity data set",
     xlab="Radial distance (m)")

Truncation decisions

As described by Howe et al. (2017):

“… a paucity of observations between 1 and 2 m but not between 2 and 3 m, so we left-truncated at 2 m. Fitted detection functions and probability density functions were heavy-tailed when distances >15 m were included, so we right truncated at 15 m.”

Detection function fits

The conversion factor must be included both in the call to ds() and the call to bootdht().

Candidate models considered here differ from the candidate set presented in Howe et al. (2017). This set, fitted below, comprises uniform key models with one, two or three cosine adjustments; half-normal key models with zero, one or two cosine adjustments; and hazard-rate key models with zero or one simple polynomial adjustment.

The maximum number of parameters in models within the candidate model set is no more than 3.

library(Distance)
trunc.list <- list(left=2, right=15)
mybreaks <- c(seq(2,8,1), 10, 12, 15)
conversion <- convert_units("meter", NULL, "square kilometer")
uni1 <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=1, convert_units = conversion,
           cutpoints = mybreaks, truncation = trunc.list)
uni2 <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=2, convert_units = conversion,
           cutpoints = mybreaks, truncation = trunc.list)
uni3 <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3, convert_units = conversion,
           cutpoints = mybreaks, truncation = trunc.list)

hn0 <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = NULL,
          convert_units = conversion, cutpoints = mybreaks, truncation = trunc.list)
hn1 <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = "cos",
          nadj=1, convert_units = conversion,
          cutpoints = mybreaks, truncation = trunc.list)
hn2 <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = "cos",
          nadj=2, convert_units = conversion,
          cutpoints = mybreaks, truncation = trunc.list)

hr0 <- ds(DuikerCameraTraps, transect = "point", key="hr", adjustment = NULL,
          convert_units = conversion, cutpoints = mybreaks, truncation = trunc.list)
hr1 <- ds(DuikerCameraTraps, transect = "point", key="hr", adjustment = "poly",
          nadj=1, convert_units = conversion,
          cutpoints = mybreaks, truncation = trunc.list)

We do not present the density estimates produced from the fitted detection function models because a) we have not chosen a preferred model and b) the density estimates have not been adjusted for viewing angle and temporal availability.

Model selection adjustments from overdispersion

Overdispersion causes AIC to select overly-complex models, so analysts should specify the number and order of adjustment terms manually when fitting distance sampling models to data from camera traps, rather than allowing automated selection using AIC. Howe, Buckland, Després-Einspenner, & Kühl (2019) describe two methods for performing model selection of distance sampling models in the face of overdispersion. Here we provide R functions to perform the first of these methods. The first method of Howe et al. (2019) employs a two-step process. First, an overdispersion factor \((\hat{c})\) is computed for each key function family from the most complex model in each family; \(\hat{c}\) is the \(\chi^2\) goodness of fit test statistic divided by its degrees of freedom. This \(\hat{c}\) is then used to compute an adjusted AIC score (QAIC) for each model in the key function family:

\[QAIC = -2 \left\{ \frac{\log\left(\mathcal{L}(\hat{\theta})\right)}{\hat{c}} \right\} + 2K\]

Code to perform this QAIC computation is found in the function QAIC in the Distance package, and produces the following results:

Tables of QAIC values for each key function family are shown below (code for kable() calls suppressed for easier readability of results).
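
The tables below could be produced with calls along the following lines (a sketch; QAIC in the Distance package compares models that share a key function, and the kable() formatting is omitted here):

QAIC(uni1, uni2, uni3)   # uniform key models
QAIC(hn0, hn1, hn2)      # half-normal key models
QAIC(hr0, hr1)           # hazard-rate key models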

Table 1: QAIC values for uniform key models.

       df     QAIC
uni1    1 2825.316
uni2    2 2826.822
uni3    3 2823.921

Table 2: QAIC values for half normal key models.

      df     QAIC
hn0    1 2296.204
hn1    2 2292.252
hn2    3 2294.096

Table 3: QAIC values for hazard rate key models.

      df     QAIC
hr0    2 2465.119
hr1    3 2466.395

From this first pass of model selection based on QAIC values, the preferred model with the uniform key function has three cosine adjustment terms, the preferred model from the half normal key function family has one cosine adjustment term, and the preferred model from the hazard rate key function family has no adjustment terms.

The second step of model selection ranks the QAIC-preferred models from each key function family by their \(\hat{c}\) values; the model with the smallest \(\hat{c}\) is selected.

library(kableExtra)   # provides kable_paper(), row_spec() and the %>% pipe
chats <- chi2_select(uni3, hn1, hr0)$criteria
modnames <- unlist(lapply(list(uni3, hn1, hr0), function(x) x$ddf$name.message))
results <- data.frame(modnames, chats)
results.sort <- results[order(results$chats),]
knitr::kable(results.sort, digits=2, row.names = FALSE,
             caption="Compare with Table S5 of Howe et al. (2018)") %>%
  kable_paper(full_width = FALSE) %>%
  row_spec(1, bold=TRUE,  background = "#4da6ff")

Table 4: Compare with Table S5 of Howe et al. (2018)

modnames                                               chats
uniform key function with cosine(1,2,3) adjustments    15.63
half-normal key function with cosine(2) adjustments    16.54
hazard-rate key function                               17.21

For this data set, the model chosen by this algorithm that adjusts for overdispersion is the same model (uniform key with three cosine adjustments) as would have been chosen by conventional model selection; but again, not the model selected by Howe et al. (2017) because of the differing candidate model sets.

Sensibility check for detection parameter estimates

As a check of the detection function vis-a-vis Howe et al. (2017), the paper reports the effective detection radius (\(\rho\)) to be 9.4m for the peak activity data set. Howe et al. (2017) employed a different candidate model set, resulting in the unadjusted hazard rate model as the preferred model. Here we present the estimated effective detection radius from the selected uniform key function with three cosine adjustment terms.

The effective detection radius can be derived from \(\hat{P_a}\) as reported by the function ds as

\[\hat{\rho} = \sqrt{\hat{P_a} \cdot w^2}\]

p_a <- uni3$ddf$fitted[1]
w <- range(mybreaks)[2] - range(mybreaks)[1]
rho <- sqrt(p_a * w^2)

\(\hat{P_a}\) is estimated to be 0.329, resulting in an estimate of \(\hat{\rho}\) of 7.457.

Selected detection function

par(mfrow=c(1,2))
plot(uni3, main="Daytime activity", xlab="Distance (m)",
     showpoints=FALSE, lwd=3, xlim=c(0, 15))
plot(uni3, main="Daytime activity", xlab="Distance (m)", pdf=TRUE,
     showpoints=FALSE, lwd=3, xlim=c(0, 15))

Density estimates

The camera traps do not view the entire area around them, as would be the case with simple point transect sampling. The portion of the area sampled needs to be incorporated in the estimation of abundance. The data file contains a column multiplier that represents the proportion of the circle sampled. Howe et al. (2017) notes the camera angle of view (AOV) of 42\(^{\circ}\). The proportion of the circle viewed is this value over 360\(^{\circ}\).

An argument to dht2 is sample_fraction, an obvious place to include this quantity. We also add the multiplier for temporal availability described in the section on temporal availability. The dht2 function will produce analytical measures of precision with this call.

viewangle <- 42 # degrees
samfrac <- viewangle / 360
peak.uni.dens <- dht2(uni3, flatfile=DuikerCameraTraps, strat_formula = ~1,
                     sample_fraction = samfrac, er_est = "P2", multipliers = avail,
                     convert_units = conversion)
print(peak.uni.dens, report="density")
Density estimates from distance sampling
Stratification : geographical 
Variance       : P2, n/L 
Multipliers    : creation 
Sample fraction : 0.1166667 


Summary statistics:
 .Label  Area CoveredArea   Effort     n  k ER se.ER cv.ER
  Total 40.37    2596.317 31483179 10284 21  0     0 0.268

Density estimates:
 .Label Estimate    se    cv   LCI     UCI     df
  Total  17.2357 4.805 0.279 9.791 30.3411 23.322

Component percentages of variance:
 .Label Detection   ER Multipliers
  Total      2.17 92.6        5.23

Bootstrap for variance estimation

To produce a more reliable estimate of the precision of the point estimate, we generate bootstrap estimates using bootdht. The user needs to create two things to facilitate the bootstrap: a summary function to extract information from each replicate and a multiplier list describing how temporal availability is derived.

Summary function

As constructed, mysummary will keep the density estimate produced by each bootstrap replicate and the stratum (if any) to which the estimate pertains.

mysummary <- function(ests, fit){
  return(data.frame(Label = ests$individuals$D$Label,
                    Dhat = ests$individuals$D$Estimate))
}

Multiplier function

This rather complex list makes use of make_activity_fn, which exists in the Distance package and is used to call the fitact function from the activity package. The user's responsibility is to provide three arguments to this function: the camera triggering times in radians, the sample argument passed on to fitact, and detector_daily_duration, the number of hours per day the cameras were in operation.

mult <- list(availability= make_activity_fn(trigger.events$rtime, sample="data",
                                            detector_daily_duration=camera.operation.per.day))

Speeding up the bootstrap

If you’ve tried running this example code in R you might have noticed that detection function fitting is quite slow. In general, camera traps produce a large amount of distance sampling data, and in addition these data tend to be “overdispersed” meaning (in this case) that there are lots of observations with the same distances. Together, this can cause analyses to run slowly, and this can be especially true for bootstrap analyses for variance estimation.

One way to speed up the bootstrap is to run multiple analyses in parallel, using multiple cores of your computer. You can achieve this using the cores argument to bootdht - for fastest results set this to the number of cores on your machine minus 1 (best to leave 1 free for other things). You can find the number of cores by calling parallel::detectCores() and we do this in the code below.

The second, which is only available to Microsoft Windows users, is to use the alternative optimization engine MCDS.exe to fit detection functions. To find out more about this, see our online example on the Alternative optimization engine. If you are using a Windows machine, you can download MCDS.exe into the required directory by executing the following lines (note, you only need to do this once, although if in future you update the Distance package you’ll need to re-download it):

download.file("http://distancesampling.org/R/MCDS.exe", 
              paste0(system.file(package="mrds"),"/MCDS.exe"), mode = "wb")
#Detach and reload the Distance package to make use of it
detach("package:Distance", unload = TRUE)
library(Distance)

We make use of this engine here by re-fitting the selected detection function model, uni3, setting the ds argument optimizer="MCDS". If you are not using a Windows machine or have not downloaded the MCDS.exe engine, then you should not execute this line, and your bootstrap will take longer to execute.

uni3 <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3, convert_units = conversion,
           cutpoints = mybreaks, truncation = trunc.list,
           optimizer = "MCDS")

Remaining arguments to bootdht

Just as with dht2 there are arguments for the model, flatfile, sample_fraction, convert_units and multipliers (although for bootdht, multipliers uses a function rather than a single value). The only novel arguments to bootdht are resample_transects, indicating camera stations are to be resampled with replacement, and nboot, the number of bootstrap replicates.

n.cores <- parallel::detectCores()
daytime.boot.uni <- bootdht(model=uni3, flatfile=DuikerCameraTraps,
                          resample_transects = TRUE, nboot=500, 
                          cores = n.cores - 1,
                          summary_fun=mysummary, sample_fraction = samfrac,
                          convert_units = conversion, multipliers=mult)

Confidence limits are computed via the percentile method of the bootstrap.

print(summary(daytime.boot.uni))
Bootstrap results

Boostraps          : 500 
Successes          : 500 
Failures           : 0 

     median  mean   se  lcl   ucl   cv
Dhat  18.16 19.16 7.12 8.43 34.07 0.39
hist(daytime.boot.uni$Dhat, breaks = 20, 
     xlab="Estimated density", main="D-hat estimates bootstraps")
abline(v=quantile(daytime.boot.uni$Dhat, probs = c(0.025,0.975), na.rm=TRUE), lwd=2, lty=3)

Figure 2: Distribution of density estimates from bootstrap replicates.

The confidence interval derived from the bootstrap is considerably wider than the confidence interval derived from analytical methods.
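
As a quick check, the percentile limits can be computed directly from the bootstrap replicates and compared with the analytical limits (roughly 9.8 and 30.3) in the dht2 output above:

# 2.5% and 97.5% percentiles of the bootstrap density estimates
quantile(daytime.boot.uni$Dhat, probs = c(0.025, 0.975), na.rm=TRUE)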

An esoteric note on starting values and bootstrapping

Feel free to skip this unless you’re a fairly advanced user!

In some cases, it may be necessary to set starting values for the detection function optimization, to help it converge. This can be achieved using the initial_values argument of the ds function. As an example, say we want to use the fitted values from the uniform + 2 cosine function uni2 as starting values for the first two parameters of the uniform + 3 cosine function fitting (and 0 for the third parameter). The following code does this:

uni3.with.startvals <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3,
           cutpoints = mybreaks, truncation = trunc.list, 
           initial_values = list(adjustment = c(as.numeric(uni2$ddf$par), 0)))

What about when it comes to bootstrapping for variance estimation? You can pass this model in to bootdht with no problems, so long as you don't set cores to more than 1. If you do set cores to more than 1 it won't work, returning 0 successful bootstraps. Why? Because uni2$ddf$par is not passed along to all those parallel cores. To fix this you have to hard-code the starting values. So, in this example, we see that the values are

print(uni2$ddf$par)
[1] 0.97178178 0.03541633

and so we use

uni3.with.startvals <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3,
           cutpoints = mybreaks, truncation = trunc.list, 
           initial_values = list(adjustment = c(0.97178178, 0.03541633, 0)))

and this will work fine in bootdht.

A final tip is that setting starting values can sometimes speed up the bootstrap (as optimization is faster if it starts from a good initial spot), so you might want to pass in the starting values from uni3 to your bootstrap routine, something like the following, using the MCDS optimizer. (Note: this code is set not to run in this example file; it is shown just to illustrate what you might use.)

print(uni3$ddf$par)
uni3.with.startvals <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3,
           cutpoints = mybreaks, truncation = trunc.list, 
           optimizer = "MCDS",
           initial_values = list(adjustment = c(0.93518220, -0.05345965, -0.08073799)))
daytime.boot.uni <- bootdht(model=uni3.with.startvals, flatfile=DuikerCameraTraps,
                          resample_transects = TRUE, nboot=500, 
                          cores = n.cores - 1,
                          summary_fun=mysummary, sample_fraction = samfrac,
                          convert_units = conversion, multipliers=mult)

References

Bessone, M., Kühl, H. S., Hohmann, G., Herbinger, I., N’Goran, K. P., Asanzi, P., … Fruth, B. (2020). Drawn out of the shadows: Surveying secretive forest species with camera trap distance sampling. Journal of Applied Ecology, 57(5), 963–974. https://doi.org/10.1111/1365-2664.13602
Howe, E. J., Buckland, S. T., Després-Einspenner, M.-L., & Kühl, H. S. (2017). Distance sampling with camera traps. Methods in Ecology and Evolution, 8(11), 1558–1565. https://doi.org/10.1111/2041-210X.12790
Howe, E. J., Buckland, S. T., Després-Einspenner, M.-L., & Kühl, H. S. (2018). Data from: Distance sampling with camera traps. Dryad Digital Repository. https://doi.org/10.5061/dryad.b4c70
Howe, E. J., Buckland, S. T., Després-Einspenner, M.-L., & Kühl, H. S. (2019). Model selection with overdispersed distance sampling data. Methods in Ecology and Evolution, 10(1), 38–47. https://doi.org/10.1111/2041-210X.13082
Miller, D., Rexstad, E., Thomas, L., Marshall, L., & Laake, J. (2019). Distance sampling in R. Journal of Statistical Software, 89(1), 1–28. https://doi.org/10.18637/jss.v089.i01
Rowcliffe, J. M. (2021). activity: Animal activity statistics. Retrieved from https://CRAN.R-project.org/package=activity
Rowcliffe, J. M., Kays, R., Kranstauber, B., Carbone, C., & Jansen, P. A. (2014). Quantifying levels of animal activity using camera trap data. Methods in Ecology and Evolution, 5(11), 1170–1179. https://doi.org/10.1111/2041-210X.12278