Alternative optimization engine for fitting detection functions

Examples demonstrating the use of the mcds.exe alternative optimization engine for fitting single platform detection functions in the Distance and mrds packages.

Len Thomas http://distancesampling.org (CREEM, Univ of St Andrews)https://creem.st-andrews.ac.uk
2023-12-20

Here we demonstrate the use of the alternative optimization engine mcds.exe in the Distance and mrds packages.

This vignette requires the packages Distance version 1.0.8 or later and mrds version 2.2.9 or later. It is designed for use in the Microsoft Windows operating system – the mcds.exe engine currently only has experimental support for MacOS or Linux.

Objectives

Introduction

The Distance package is designed to provide a simple way to fit detection functions and estimate abundance using conventional distance sampling methodology (i.e., single observer distance sampling, possibly with covariates, as described by Buckland et al. (2015)). The main function is ds. Underlying Distance is the package mrds – when the function ds is called it does some pre-processing and then calls the function ddf in the mrds package to do the work of detection function fitting. mrds uses maximum likelihood to fit the specified detection function model to the distance data using a built-in algorithm written in R.

An alternative method for analyzing distance sampling data is using the Distance for Windows software (Thomas et al., 2010). This software also uses maximum liklihood to fit the detection function models, and relies on software written in the programming language FORTRAN to do the fitting. The filename of this software is MCDS.exe.

In a perfect world, both methods would produce identical results given the same data and model specification, since the likelihood has only one maximum. However, the likelihood surface is sometimes complex, especially when monotonicity constraints are used (which ensures the estimated detection probability is flat or decreasing with increasing distance when adjustment terms are used) or with “overdispersed” or “spiked” data (see Figure 2 in Thomas et al. (2010)), and so in some (rare) cases one or other piece of software fails to find the maximum. To counteract this, it is possible to run both the R-based optimizer and MCDS.exe from the ds function within the Distance package or the ddf function within mrds package.

Another motivation for using the MCDS.exe software from within R is that the R-based optimizer is sometimes slow to converge and so using MCDS.exe in place of the R-based optimizer can save significant time, particularly when doing a nonparametric bootstrap for large datasets.

This vignette demonstrates how to download and then use the MCDS.exe sofware from within the Distance and mrds packages. For more information, see the MCDS.exe help page within the mrds package.

Downloading and verifying MCDS.exe

The program MCDS.exe does not come automatically with the Distance or mrds packages, to avoid violating CRAN rules, so you must first download it from the distance sampling website. You can check whether MCDS.exe is installed already or not by loading the Distance library:

Loading required package: mrds
This is mrds 2.2.9
Built: R 4.2.3; ; 2023-07-20 01:36:15 UTC; windows
MCDS.exe not detected, single observer analyses will only be run using optimiser in mrds R library. See ?MCDS for details.

Attaching package: 'Distance'
The following object is masked from 'package:mrds':

    create.bins

If MCDS.exe is not installed, then you will receive the message MCDS.exe not detected, single observer analyses will only be run using optimiser in mrds R library. See ?MCDS for details.

In this case, you need to download it from the Distancesampling.org web site:

download.file("http://distancesampling.org/R/MCDS.exe", paste0(system.file(package="mrds"),"/MCDS.exe"), mode = "wb")

Now if you reload the Distance package, the MCDS.exe not detected message should not be shown:

detach("package:Distance", unload = TRUE)
library(Distance)

Attaching package: 'Distance'
The following object is masked from 'package:mrds':

    create.bins

Now that this software is available, both it and the R optimizer will be used by default for each analysis; you can also choose to use just one or the other, as shown below.

Example with Golf Tee data

Both MCDS.exe and the R-based optimizer

This example (of golf tee data, using only observer 1) is taken from the R help for the ds function: (There is a warning about cluster sizes being coded as -1 that can be ignored.)

#Load data
data(book.tee.data)
tee.data <- subset(book.tee.data$book.tee.dataframe, observer==1)
#Fit detection function - default is half-normal with cosine adjustments
ds.model <- ds(tee.data, truncation = 4)
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 311.138
Fitting half-normal key function with cosine(2) adjustments
AIC= 313.124

Half-normal key function selected.
No survey area information supplied, only estimating detection function.
summary(ds.model)

Summary for distance analysis 
Number of observations :  124 
Distance range         :  0  -  4 

Model       : Half-normal key function 
AIC         :  311.1385 
Optimisation:  mrds (nlminb) 

Detection function parameters
Scale coefficient(s):  
             estimate         se
(Intercept) 0.6632435 0.09981249

                       Estimate          SE         CV
Average p             0.5842744  0.04637627 0.07937412
N in covered region 212.2290462 20.85130344 0.09824906

Assuming you have MCDS.exe installed, the default is that both it and the R-based optimizer are run. Both give the same result in this example, and when this happens the result from the R-based optimizer is used. You can see this from the line of summary output:

Optimisation: mrds (nlminb)

where mrds is the R package that the Distance package relies on, and nlminb is the R-based optimizer.

You can see the process of both optimizers being used by setting the debug_level argument of the ds function to a value larger than the default of 0 and then examining the output:

ds.model <- ds(tee.data, truncation = 4, debug_level = 1)
Starting AIC adjustment term selection.
Fitting half-normal key function
DEBUG: initial values = -0.1031529 
Running MCDS.exe...
Command file written to C:\Users\lt5\AppData\Local\Temp\RtmpCSltvm\cmdtmp53e8649a53d1.txt
Stats file written to C:\Users\lt5\AppData\Local\Temp\RtmpCSltvm\stat53e832ca2d2c.txt
DEBUG: initial values = 0.6632378 

DEBUG: Convergence! 
       Iteration  0.0 
       Converge   = 0 
       nll        = 154.5692 
       parameters = 0.6632378 
MCDS.exe log likehood: -154.5697
MCDS.exe pars: 1.941067
mrds refitted log likehood: -154.5692276
mrds refitted pars: 0.6632378

DEBUG: Convergence! 
       Iteration  0.0 
       Converge   = 0 
       nll        = 154.5692 
       parameters = 0.6632435 
AIC= 311.138
Fitting half-normal key function with cosine(2) adjustments
DEBUG: initial values = -0.1031529 0 
Running MCDS.exe...
Command file written to C:\Users\lt5\AppData\Local\Temp\RtmpCSltvm\cmdtmp53e85790220e.txt
Stats file written to C:\Users\lt5\AppData\Local\Temp\RtmpCSltvm\stat53e85b436126.txt
DEBUG: initial values = 0.6606793 -0.0159333 

DEBUG: Convergence! 
       Iteration  0.0 
       Converge   = 0 
       nll        = 154.5619 
       parameters = 0.6606793, -0.0159333 
MCDS.exe log likehood: -154.5624
MCDS.exe pars: 1.936107, -0.0159333
mrds refitted log likehood: -154.5619307
mrds refitted pars: 0.6606793, -0.0159333

Iter: 1 fn: 154.5619     Pars:   0.66068 -0.01591
Iter: 2 fn: 154.5619     Pars:   0.66069 -0.01592
solnp--> Completed in 2 iterations

DEBUG: Convergence! 
       Iteration  0.0 
       Converge   = 0 
       nll        = 154.5619 
       parameters = 0.6606853, -0.0159233 
AIC= 313.124

Half-normal key function selected.
No survey area information supplied, only estimating detection function.

First the half-normal with no adjustments is run; for this model the MCDS.exe sofware is run first, followed by the R-based (mrds) optimizer. Both converge and both give the same nll (negative log-likelihood) or 154.5692, giving an AIC of 311.138. The model with half-normal and a cosine adjustment of order 2 is then fitted to the data, with first the MCDS.exe optimizer and then the R-based optimizer. Again both give the same result of nll 154.5619 and an AIC of 313.124. This is higher than the AIC with no adjustments so half-normal with no adjustments is chosen.

In this case, both optimizers produced the same result, so there is no benefit to run MCDS.exe.

Specifying which optimzier to run

As we said earlier, the default behaviour when MCDS.exe has been downloaded is to run both MCDS.exe and the R-based optimizer. However, the optimizer argument can be used to specify which to use – either both, R or MCDS. Here is an example with just the MCDS.exe optimizer:

ds.model <- ds(tee.data, truncation = 4, optimizer = "MCDS")
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 311.138
Fitting half-normal key function with cosine(2) adjustments
AIC= 313.124

Half-normal key function selected.
No survey area information supplied, only estimating detection function.

Demonstration using ddf in mrds package

Here we demonstrate using both optimizers in the ddf function, rather than via ds.

#Half normal detection function
ddf.model <- ddf(dsmodel = ~mcds(key = "hn", formula = ~1), data = tee.data, method = "ds",
                 meta.data = list(width = 4))
#Half normal with cos(2) adjustment
ddf.model.cos2 <- ddf(dsmodel = ~mcds(key = "hn", adj.series = "cos", adj.order = 2, formula = ~1),
                      data = tee.data, method = "ds", meta.data = list(width = 4))
#Compare with AIC
AIC(ddf.model, ddf.model.cos2)
               df      AIC
ddf.model       1 311.1385
ddf.model.cos2  2 313.1239
#Model with no adjustment term has lower AIC; show summary of this model
summary(ddf.model)

Summary for ds object
Number of observations :  124 
Distance range         :  0  -  4 
AIC                    :  311.1385 
Optimisation           :  mrds (nlminb) 

Detection function:
 Half-normal key function 

Detection function parameters 
Scale coefficient(s): 
             estimate         se
(Intercept) 0.6632435 0.09981249

                       Estimate          SE         CV
Average p             0.5842744  0.04637627 0.07937412
N in covered region 212.2290462 20.85130344 0.09824906

As an exercise, fit using just the MCDS.exe optimizer:

ddf.model <- ddf(dsmodel = ~mcds(key = "hn", adj.series = "cos", adj.order = 2, 
                               formula = ~1), data = tee.data, method = "ds",
                 meta.data = list(width = 4),
                 control = list(optimizer = "MCDS"))
summary(ddf.model)

Summary for ds object
Number of observations :  124 
Distance range         :  0  -  4 
AIC                    :  313.1239 
Optimisation           :  MCDS.exe 

Detection function:
 Half-normal key function with cosine adjustment term of order 2 

Detection function parameters 
Scale coefficient(s): 
             estimate        se
(Intercept) 0.6606782 0.1043327

Adjustment term coefficient(s):  
                estimate        se
cos, order 2 -0.01593274 0.1351281

                       Estimate          SE        CV
Average p             0.5925856  0.08165144 0.1377884
N in covered region 209.2524623 31.22790760 0.1492356

Point transect example - wren data

This is an example of point transect data for a bird (wren), from Buckland (2006). In this case one of the optimizers fails correctly to constrain the detection function so the probability of detection is more than zero at all distances, and so we use the other optimizer for inference.

We load the wren 5 minute example dataset and define cutpoints for the distances (they were collected in intervals).

data("wren_5min")
bin.cutpoints.100m <- bin.cutpoints <- c(0, 10, 20, 30, 40, 60, 80, 100)

The following call to ds gives several warnings. Some warnings are about the detection function being less than zero at some distances. There is also a warning about the Hessian (which is used for variance estimation), but this relates to the Hermite(4, 6) model (i.e., two Hermite adjustment terms of order 4 and 6) which is not chosen using AIC and so this warning can be ignored.

wren5min.hn.herm.t100 <- ds(data=wren_5min, key="hn", adjustment="herm", 
                            transect="point", cutpoints=bin.cutpoints.100m)
Warning in create_bins(data, cutpoints): Some distances were outside
bins and have been removed.
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 427.471
Fitting half-normal key function with Hermite(4) adjustments
Warning in check.mono(result, n.pts = control$mono.points): Detection
function is less than 0 at some distances
Warning in check.mono(result, n.pts = control$mono.points): Detection
function is less than 0 at some distances
AIC= 422.228
Fitting half-normal key function with Hermite(4,6) adjustments
Warning: First partial hessian is singular and second-partial hessian is NULL, no hessian

Warning: Detection function is less than 0 at some distances

Warning: Detection function is less than 0 at some distances
AIC= 423.255

Half-normal key function with Hermite(4) adjustments selected.
Warning in mrds::check.mono(model, n.pts = 20): Detection function is
less than 0 at some distances
summary(wren5min.hn.herm.t100)

Summary for distance analysis 
Number of observations :  132 
Distance range         :  0  -  100 

Model       : Half-normal key function with Hermite polynomial adjustment term of order 4 

Strict monotonicity constraints were enforced.
AIC         :  422.2284 
Optimisation:  MCDS.exe 

Detection function parameters
Scale coefficient(s):  
            estimate    se
(Intercept) 12.08697 1e+05

Adjustment term coefficient(s):  
               estimate         se
herm, order 4 0.5723854 0.07889437

                       Estimate         SE         CV
Average p             0.4399177  0.0253497 0.05762374
N in covered region 300.0561563 26.0954740 0.08696863

Summary statistics:
    Region Area CoveredArea Effort   n  k     ER     se.ER      cv.ER
1 Montrave 33.2     2010619     64 132 32 2.0625 0.1901692 0.09220324

Abundance:
  Label    Estimate         se        cv         lcl         ucl
1 Total 0.004954625 0.00053871 0.1087287 0.003988055 0.006155458
        df
1 57.84101

Density:
  Label     Estimate          se        cv          lcl          ucl
1 Total 0.0001492357 1.62262e-05 0.1087287 0.0001201222 0.0001854054
        df
1 57.84101

The MCDS.exe optimizer is the chosen one (see the `Optimisation’ line of output).

The warnings persist if only the MCDS.exe optimizer is used:

wren5min.hn.herm.t100.mcds <- ds(data=wren_5min, key="hn", adjustment="herm", 
                            transect="point", cutpoints=bin.cutpoints.100m,
                            optimizer = "MCDS")
Warning in create_bins(data, cutpoints): Some distances were outside
bins and have been removed.
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 427.471
Fitting half-normal key function with Hermite(4) adjustments
Warning in check.mono(result, n.pts = control$mono.points): Detection
function is less than 0 at some distances
Warning in check.mono(result, n.pts = control$mono.points): Detection
function is less than 0 at some distances
AIC= 422.228
Fitting half-normal key function with Hermite(4,6) adjustments
Warning: First partial hessian is singular and second-partial hessian is NULL, no hessian

Warning: Detection function is less than 0 at some distances

Warning: Detection function is less than 0 at some distances
AIC= 423.255

Half-normal key function with Hermite(4) adjustments selected.
Warning in mrds::check.mono(model, n.pts = 20): Detection function is
less than 0 at some distances

Looking at a plot of the fitted object, it seems that the evaluated pdf is less than 0 at distances close to the truncation point (approx. 95m and greater):

plot(wren5min.hn.herm.t100.mcds, pdf = TRUE)

What appears to be happening here is a failure of the optimization routine to appropriately constrain the model parameters so that the detection function is valid. This happens on occasion (the routines aren’t perfect!) and where it does we recommend trying the other optimization routine. Here we use the R-based optimizer:

wren5min.hn.herm.t100.r <- ds(data=wren_5min, key="hn", adjustment="herm", 
                            transect="point", cutpoints=bin.cutpoints.100m,
                            optimizer = "R")
Warning in create_bins(data, cutpoints): Some distances were outside
bins and have been removed.
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 427.471
Fitting half-normal key function with Hermite(4) adjustments
AIC= 422.743
Fitting half-normal key function with Hermite(4,6) adjustments
AIC= 424.52

Half-normal key function with Hermite(4) adjustments selected.

Here the fitted AIC for the chosen model (half normal with one Hermite adjustment of order 4) is 422.74, higher than that with the MCDS.exe optimizer (which was 422.23), which explains why the MCDS.exe optimizer fit was chosen when we allowed ds to choose freely. However, the detection function fit from MCDS.exe was invalid, because it went lower than 0 at about 95m, while the fit with the R-based optimizer looks valid:

plot(wren5min.hn.herm.t100.r, pdf = TRUE)

Hence in this case, we would use the R-based optimizer’s fit.

Camera trap example

For this example, it helps if you are familiar with the Analysis of camera trapping data vignette on the distanceexamples web site.

We first read in the Duiker data.

#Read in data and set up data for analysis
DuikerCameraTraps <- read.csv(file="https://datadryad.org/stash/downloads/file_stream/73221", 
                              header=TRUE, sep="\t")
DuikerCameraTraps$Area <- DuikerCameraTraps$Area / (1000*1000)
DuikerCameraTraps$object <- NA
DuikerCameraTraps$object[!is.na(DuikerCameraTraps$distance)] <- 1:sum(!is.na(DuikerCameraTraps$distance))

#Specify breakpoints and truncation
trunc.list <- list(left=2, right=15)
mybreaks <- c(seq(2,8,1), 10, 12, 15)

Then we fit the detection function selected in the camera trap vignette, uniform plus 3 cosine adjustment terms, and time how long the fitting takes:

start.time <- Sys.time()
uni3.r <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3, cutpoints = mybreaks, truncation = trunc.list, optimizer = "R")
Fitting uniform key function with cosine(1,2,3) adjustments
AIC= 44012.394
R.opt.time <- Sys.time() - start.time
summary(uni3.r)

Summary for distance analysis 
Number of observations :  10284 
Distance range         :  2  -  15 

Model       : Uniform key function with cosine adjustment terms of order 1,2,3 

Strict monotonicity constraints were enforced.
AIC         :  44012.39 
Optimisation:  mrds (nlminb) 

Detection function parameters
Scale coefficient(s):  
NULL

Adjustment term coefficient(s):  
                estimate         se
cos, order 1  0.93541299 0.01504249
cos, order 2 -0.05304180 0.02437441
cos, order 3 -0.08043313 0.01557445

                        Estimate           SE         CV
Average p           3.288267e-01 1.348162e-02 0.04099917
N in covered region 3.127483e+04 1.306897e+03 0.04178751

Summary statistics:
  Region  Area CoveredArea   Effort     n  k           ER
1    Tai 40.37 21858518573 31483179 10284 21 0.0003266506
         se.ER     cv.ER
1 8.970466e-05 0.2746196

Abundance:
  Label     Estimate           se        cv          lcl          ucl
1 Total 5.776078e-05 1.603804e-05 0.2776632 3.276637e-05 0.0001018211
        df
1 20.90147

Density:
  Label     Estimate           se        cv          lcl          ucl
1 Total 1.430785e-06 3.972762e-07 0.2776632 8.116514e-07 2.522197e-06
        df
1 20.90147

Fitting takes quite a while! - 2 mins. Here we try the MCDS.exe optimizer:

start.time <- Sys.time()
uni3.mcds <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
           nadj=3, cutpoints = mybreaks, truncation = trunc.list, optimizer = "MCDS")
Fitting uniform key function with cosine(1,2,3) adjustments
AIC= 44012.211
MCDS.opt.time <- Sys.time() - start.time
summary(uni3.mcds)

Summary for distance analysis 
Number of observations :  10284 
Distance range         :  2  -  15 

Model       : Uniform key function with cosine adjustment terms of order 1,2,3 

Strict monotonicity constraints were enforced.
AIC         :  44012.21 
Optimisation:  MCDS.exe 

Detection function parameters
Scale coefficient(s):  
NULL

Adjustment term coefficient(s):  
                estimate         se
cos, order 1  0.93518220 0.01504583
cos, order 2 -0.05345965 0.02438049
cos, order 3 -0.08073799 0.01557817

                        Estimate           SE         CV
Average p           3.290679e-01 1.349917e-02 0.04102246
N in covered region 3.125191e+04 1.306645e+03 0.04181008

Summary statistics:
  Region  Area CoveredArea   Effort     n  k           ER
1    Tai 40.37 21858518573 31483179 10284 21 0.0003266506
         se.ER     cv.ER
1 8.970466e-05 0.2746196

Abundance:
  Label     Estimate           se        cv          lcl         ucl
1 Total 5.771844e-05 1.602649e-05 0.2776666 3.274219e-05 0.000101747
       df
1 20.9025

Density:
  Label     Estimate         se        cv          lcl          ucl
1 Total 1.429736e-06 3.9699e-07 0.2776666 8.110524e-07 2.520361e-06
       df
1 20.9025

This took only 8 secs. Hence, for some datasets, it may be quicker to use the MCDS.exe optimizer. This makes a particularly big difference if using the nonparametric bootstrap to estimate variance.

Discussion

We have shown how to fit distance sampling detection functions (for single platform data) using either the R-based optimizer built into the ddf function (via calling ddf or, more likely, calling the ds function in the Distance package) or the MCDS.exe analysis engine used by Distance for Windows. In the vast majority of cases both fitting methods give the same result, and so there is no need to use both. However, the only downside is that fitting takes longer, as each is called in turn. If you have downloaded the MCDS.exe file and want to speed things up, you can use just the R-based optimizer by specifying optimizer = "R" in the call to ds or ddf, or just the MCDS.exe optimizer with optimizer = "MCDS".

Some situations where the two may produce different results are given below.

If you are interested in seeing more comparisons of the optimizers on various datasets, we maintain a test suite of both straightforward and challenging datasets together with test code to run and compare the two optimizers – this is available at the MCDS_mrds_compare repository.

If you encounter difficulties when using both optimizers, one possible troubleshooting step is to run the analysis first choosing one optimizer (e.g., specifing the argument optimizer = "MCDS") and then choosing the other (optimizer = "R"). This allows you clearly to see what the output of each optimizer is (including any error messages) and facilitates their comparison.

One other criterion to favour one optimizer over the other is speed, and we have found that for large datasets the MCDS.exe optimizer is quicker.

One thing to note is that the MCDS.exe file will get deleted each time you update the mrds package, so you’ll need to re-download the file if you want to continue using the MCDS.exe optimizer. As shown above, this only requires running one line of code.

Going forward, we plan to work on making further improvements to the R-based optimizer and it is possible that at some point in the future we are confident this optimizer is uniformly better (in terms of better fit and speed) than the MCDS.exe one. If this happens, we will update this vignette and also make announcements on the distance sampling list.

Buckland, S. T. (2006). Point transect surveys for songbirds: Robust methodologies. The Auk, 123(2), 345–345. https://doi.org/10.1642/0004-8038(2006)123[345:psfsrm]2.0.co;2
Buckland, S., Rexstad, E., Marques, T., & Oedekoven, C. (2015). Distance sampling: Methods and applications. Springer.
Thomas, L., Buckland, S. T., Rexstad, E. A., Laake, J. L., Strindberg, S., Hedley, S. L., … Marques, T. A. (2010). Distance software: Design and analysis of distance sampling surveys for estimating population size. Journal of Applied Ecology, 47, 5–14. https://doi.org/110.1111/j.1365-2664.2009.01737.x

References