Multipliers and indirect surveys

Dung surveys including estimates of production and decay rates.

Eric Rexstad (CREEM, Univ of St Andrews)

We consider indirect methods to estimate abundance and hence include multipliers in the abundance calculations. The first problem uses data from a dung survey of deer and there are two levels of multipliers that need to be incorporated in the analysis (dung production rate and dung decay rate).


The objectives of this exercise are to

Dung survey of deer

The question is how to estimate of the density of sika deer in a number of woodlands in the Scottish Borders (Marques et al., 2001). These animals are shy and will be aware of the presence of an observer before the observer detects them, making surveys of this species challenging. As a consequence, indirect estimation methods have been applied to this problem. In this manner, an estimate of density is produced for some sign generated by deer (in this case, faecal or dung pellets) and this estimate is transformed to density of deer (\(D_{\textrm{deer}}\)) by

\[ \hat D_{\textrm{deer}} = \frac{\textrm{dung deposited daily}}{\textrm{dung production rate (per animal)}} \] where the dung deposited daily is given by

\[ \textrm{dung deposited daily} = \frac{\hat D_{\textrm{pellet groups}}}{\textrm{mean time to decay}} \] Hence, we use distance sampling to produce a pellet group density estimate, then adjust it accordingly to account for the production and decay processes operating during the time the data were being acquired. We will also take uncertainty in the dung production and decay rates into account in our final estimate of deer density.

Data from 9 woodlands (labelled A-H and J) were collected according to the survey design (Figure 1) but note that data from block D were not included in this exercise.

Location of sika deer survey in southern Scotland and the survey design (from  [@Maretal01]). Note the differing amounts of effort in different woodlands based on information derived from pilot surveys.

Figure 1: Location of sika deer survey in southern Scotland and the survey design (from (Marques et al., 2001)). Note the differing amounts of effort in different woodlands based on information derived from pilot surveys.

In addition to these data, we also require estimates of the production rate. From a literature search, we learn that sika deer produce 25 pellet groups daily but this source did not provide a measure of variability of this estimate. During the course of our surveys we also followed the fate of some marked pellet groups to estimate the decay (disappearance) rates of a pellet group. A thorough discussion of methods useful for estimating decay rates and associated measures of precision can be found in Laing et al. (2003).

There are many factors that might influence both production and decay rates, and for purposes of this exercise we will make the simplifying assumption that decay rate is homogeneous across these woodlands; with their mean time to decay of 163 days and a standard error of 13 days. (If you were to conduct a survey such as this, you would want to investigate this assumption more thoroughly.)

Getting started

These data (called sikadeer) are available in the Distance package. Detection of deer dung takes place at small spatial scales; perpendicular distances are measured in centimeters. But transects were long; measured in kilometers and deer densities are customarily reported in numbers kilometer-2.

conversion.factor <- convert_units("centimeter", "kilometer", "square kilometer")

Fit detection function to dung pellets

Fit the usual series of models (i.e. half normal, hazard rate, uniform) models to the distances to pellet groups and decide on a detection function. This detection function (Figure 2) will be used to obtain \(\hat D_{\textrm{pellet groups}}\).

deer.df <- ds(sikadeer, key="hn", truncation="10%", convert_units = conversion.factor)
plot(deer.df, main="Half normal detection function")
Simple detection function to deer pellet line transect data.

Figure 2: Simple detection function to deer pellet line transect data.

  Region Area CoveredArea Effort    n  k        ER      se.ER
1      A 13.9    0.005950   1.70 1217 13 715.88234 119.918872
2      B 10.3    0.003850   1.10  396 10 359.99999  86.859289
3      C  8.6    0.001575   0.45   17  3  37.77778   8.521202
4      E  8.0    0.002975   0.85   30  5  35.29412  16.568939
5      F 14.0    0.000700   0.20   29  1 145.00000   0.000000
6      G 15.2    0.001400   0.40   32  3  80.00000  39.686269
7      H 11.3    0.000700   0.20    3  1  15.00000   0.000000
8      J  9.6    0.000350   0.10    7  1  70.00000   0.000000
9  Total 90.9    0.017500   5.00 1731 37 346.19999  68.158305
1 0.1675120
2 0.2412758
3 0.2255612
4 0.4694533
5 0.0000000
6 0.4960784
7 0.0000000
8 0.0000000
9 0.1968755

Have a look at the Summary statistics for this model - note some woodlands have but a single transect of effort allocated.


The next step is to create an object which contains the multipliers we wish to use. We already have estimates of dung production rates but need similar information on dung decay (or persistence) rate. Analysis is based upon methods presented in Laing et al. (2003).

Data to calculate dung persistence has been collected in the file dung_persistence.csv. Following code from (Meredith, 2017).

MIKE.persistence <- function(DATA) {
#  Purpose: calculate mean persistence time (mean time to decay) for dung/nest data 
#  Input: data frame with at least two columns:
#         DAYS - calendar day on which dung status was observed
#         STATE - dung status: 1-intact, 0-decayed
#  Output: point estimate, standard error and CV of mean persistence time
#  Attribution: code from Mike Meredith website: 
#   Citing: CITES elephant protocol
  ##   Fit logistic regression model to STATE on DAYS, extract coefficients
  dung.glm <- glm(STATE ~ DAYS, data=DATA, family=binomial(link = "logit"))
  betas <- coefficients(dung.glm)
  ##   Calculate mean persistence time
  mean.decay <- -(1+exp(-betas[1])) * log(1+exp(betas[1])) / betas[2]
  ## Calculate the variance of the estimate
  vcovar <- vcov(dung.glm)
  var0 <- vcovar[1,1]  # variance of beta0
  var1 <- vcovar[2,2]  # variance of beta1
  covar <- vcovar[2,1] # covariance
  deriv0 <- -(1-exp(-betas[1]) * log(1+exp(betas[1])))/betas[2]
  deriv1 <- -mean.decay/betas[2]
  var.mean <- var0*deriv0^2 + 2*covar*deriv0*deriv1 + var1*deriv1^2
  ## Calculate the SE and CV and return
  se.mean <- sqrt(var.mean)
  cv.mean <- se.mean/mean.decay
  out <- c(mean.decay, se.mean, 100*cv.mean)
  names(out) <- c("Mean persistence time", "SE", "%CV")
  plot(decay$DAYS, jitter(decay$STATE, amount=0.10), xlab="Days since initiation",
       ylab="Dung persists (yes=1)",
       main="Eight dung piles revisited over time")
  curve(predict(dung.glm, data.frame(DAYS=x), type="resp"), add=TRUE)
  abline(v=mean.decay, lwd=2, lty=3)
decay <- read.csv("dung_persistence.csv")
persistence.time <- MIKE.persistence(decay)
Logistic curve fitted to pellet persistence survey data.  Vertical line represents day at which 50% of pellets have decayed to non-detectable.

Figure 3: Logistic curve fitted to pellet persistence survey data. Vertical line represents day at which 50% of pellets have decayed to non-detectable.

Mean persistence time                    SE                   %CV 
           163.396748             14.226998              8.707026 

Running the above command should have produced a plot of dung persistence versus days since produced and fitted a logistic regression (this is like a simple linear regression but restricts the response to taking values between 0 and 1). Note the points can in reality only take values between 0 and 1 but for the purposes of plotting have been ‘jittered’ to avoid over-plotting.

An estimate of mean persistence time and measure of variability are also provided - make a note of these as they will be required below. Dotted vertical line indicates the time at which the estimated probability of persistence is 0.5.

As stated above, we want an object which contains information on the dung production rate (and standard error) and dung decay rate (and standard error). The following command creates a list containing two data frames:

# Create list of multipliers
mult <- list(creation = data.frame(rate=25, SE=0),
             decay    = data.frame(rate=163, SE=14.2))
  rate SE
1   25  0

  rate   SE
1  163 14.2

The final step is to use these multipliers to convert \(\hat D_{\textrm{pellet groups}}\) to \(\hat D_{\textrm{deer}}\) (as in the equations above) - for this we need to employ the dht2 function. In the command below the multipliers= argument allows us to specify the rates and standard errors. There are a couple of other function arguments that need some explanation:

deer.ests <- dht2(deer.df, flatfile=sikadeer, strat_formula=~Region.Label,
                 convert_units=conversion.factor, multipliers=mult, 
print(deer.ests, report="density")
Density estimates from distance sampling
Stratification : geographical 
Variance       : R2, n/L 
Multipliers    : creation, decay 
Sample fraction : 1 

Summary statistics:
 Region.Label Area CoveredArea Effort    n  k      ER   se.ER cv.ER
            A 13.9    0.005950   1.70 1217 13 715.882 119.919 0.168
            B 10.3    0.003850   1.10  396 10 360.000  86.859 0.241
            C  8.6    0.001575   0.45   17  3  37.778   8.521 0.226
            E  8.0    0.002975   0.85   30  5  35.294  16.569 0.469
            F 14.0    0.000700   0.20   29  1 145.000   0.000 0.000
            G 15.2    0.001400   0.40   32  3  80.000  39.686 0.496
            H 11.3    0.000700   0.20    3  1  15.000   0.000 0.000
            J  9.6    0.000350   0.10    7  1  70.000   0.000 0.000
        Total 90.9    0.017500   5.00 1731 37 346.200 154.425 0.446

Density estimates:
 Region.Label Estimate     se    cv     LCI      UCI        df
            A  73.9165 14.248 0.193 49.6888 109.9573    21.037
            B  37.1708  9.643 0.259 21.3190  64.8091    12.031
            C   3.9006  0.955 0.245  1.7460   8.7141     2.779
            E   3.6442  1.746 0.479  1.0713  12.3958     4.337
            F  14.9716  1.428 0.095 12.4246  18.0407 63232.093
            G   8.2602  4.173 0.505  1.2114  56.3216     2.151
            H   1.5488  0.148 0.095  1.2853   1.8663 63232.096
            J   7.2277  0.689 0.095  5.9981   8.7093 63232.091
        Total  20.8475  3.011 0.144 15.5122  28.0179    25.610

Component percentages of variance:
 Region.Label Detection    ER Multipliers
            A      4.05 75.53       20.43
            B      2.23 86.49       11.28
            C      2.51 84.84       12.65
            E      0.66 96.04        3.31
            F     16.54  0.00       83.46
            G      0.59 96.44        2.97
            H     16.54  0.00       83.46
            J     16.54  0.00       83.46
        Total      0.75 99.25        0.00

Other stratification choices with dht2

This example of Sika deer on different hunting estates uses geographical stratification. There is also the option of using the option replicate for the stratification argument. This is useful when there are repeated surveys in a geographic area; the average abundance is computed and variance is variability between surveys. Alternatively effort_sum is used with replicate surveys, but few replicates reporting average variance. Finally, the specification of stratification="object" can be used when detections are made of different species, sexes or ages of animals. This option will produce species-specific abundance estimates as well as abundance estimate over all species, properly calculating variance of total abundance. More information is available in this diagramatic comparison as well as in the help file for ?dht2.

The function dht2 also provides information on the components of variance. Make a note of the these (contribution of detection function, encounter rate, decay rate and what happened to production rate component?) in each strata.

Notes regarding this dung survey

In woodland A, there were 13 transects on which over 1,200 pellet groups were detected: uncertainty in the estimated density (measured by CV) was 19% and the variance components were apportioned as detection probability 4%, encounter rate 76% and multipliers 20%.

In woodland E, there were 5 transects and 30 pellet groups resulting in a coefficient of variation (CV) of 48%: the variance components were apportioned as detection probability 0.7%, encounter rate 96% and multipliers 3%.

The CV of the abundance estimates for blocks F, H and J are identical (9%) because a pooled detection function was used across all blocks and the dung deposition and decay rates were not block-specific. The only element of the computation remaining that is block-specific is the encounter rate; and for these three blocks there was but a single transect per block, meaning the encounter rate variance could not be computed and was set to zero.

The estimated abundance across all blocks had a CV of 14%. But far and away, the greatest contribution to this uncertainty was encounter rate variance–differences in pellet encounters between transects. In the context of distance sampling, the uncertainty in the parameter estimates of the detection function accounts for <1% in the total estimate of deer abundance across the blocks.

Laing, S. E., Buckland, S. T., Burn, R. W., Lambie, D., & Amphlett, A. (2003). Dung and nest surveys: Estimating decay rate. Journal of Applied Ecology, 40, 1102–1111.
Marques, F. F. C., Buckland, S. T., Goffin, D., Dixon, C. E., Borchers, D. L., Mayle, B. A., & Peace, A. J. (2001). Estimating deer abundance from line transect surveys of dung: Sika deer in southern scotland. Journal of Applied Ecology, 38, 349–363.
Meredith, M. (2017). How long do animal signs remain visible? Retrieved from