Introductions

Who are we?

Ross is a Postdoctoral Research Fellow in the School of Biological Sciences at the University of Queensland. His research falls within the field of animal ecology, where he uses animal occurrence, abundance and movement information to test hypotheses and make predictions regarding individual- and population-level responses to environmental change. He’s used R for over 12 years, has published two R packages on CRAN, was lead R developer on the ZoaTrack.org project and regularly teaches R training courses at The University of Queensland and at other institutes around Australia. For more information and teaching resources, check out Ross’s website.

Vinay is an Postdoctoral Research Fellow at the Australian Institute of Marine Science. He is an ecologist that is particularly interested in using large spatio-temporal datasets to understand animal movements and distributions patterns. He has considerable experience using R to analyse and visualise large and complex spatial datasets. He has developed R code to analyse 2 and 3 dimensional movement patterns of animals using acoustic telemetry data from single study sites to continental scale arrays. Vinay’s R codes can be found on his github page

Course outline

In this course you will learn how to analyse and interpret your aquatic telemetry datasets in R. This workshop will demonstrate how R can make the processing of spatial data much quicker and easier than using standard GIS software, and can help you plot some deadly figures! At the end of this workshop you will also have the annotated R code that you can re-run at any time, share with collaborators and build on with those newly acquired data!

We designed this course not to comprehensively cover all the tools in R, but rather to teach you skills for taking your experience with R to the next level. Every new project comes with its own problems and questions and you will need to be independent, patient and creative to solve these challenges. It makes sense to invest time in becoming familiar with R, because today R is the leading platform for environmental data analysis and has some other functionalities which may surprise you!

This R workshop will be divided into 4 sessions intended to run about 1 hr 15 mins each.

Session 1: Wrangling Large Tracking Datasets and Data Visualisation

A brief introduction to R
Import large telemetry datasets into R using data.table
Filter and manipulate our data using %pipes% and the tidyverse group of R packages
Plot summary statistics using ggplot2

Session 2: Working with Spatial Objects and Interactive Mapping

Working with Spatial* objects
Generating interactive maps using leaflet
ESRI Shapefile import and export

Session 3: Visualising Animal Movements through Time

Producing track animations
Integration with Google Earth
Calculating distance metrics (direct, circuitous, least cost)
Plotting distance travelled through time

Session 4: Centres of Activity and Home Range Estimation

Calculating short-term centers of activity (COA)
Calculating and plotting home range area metrics using
- Minimum Convex Polygons
- Kernel distributions
- Brownian bridges

The main principles we hope you will learn today are…

Data wrangling in R is safe, fast, reliable and repeatable
R can easily handle large datasets
R is the ideal suite for performing your GIS operations
that in R, it is easy to produce amazing publication-ready images

Session 1: Wrangling Large Tracking Datasets and Data Visualisation

A brief Introduction to R

The process of turning raw telemetry data into publishable results is a highly involved process. Tracking data sets are becoming larger, and larger as they are being gathered over longer time periods, over larger spatial extents and at increasing temporal resolutions. While this is increasing our ability to detect subtle patterns, these data sets are becoming vast and require analytical tools that easily handle, manipulate and visualise these complex datasets.

Processing and analysing telemetry datasets can require a huge investment in time: rearranging data, removing erroneous values, purchasing, downloading and learning the new software, and running analyses. Furthermore merging together Excel spreadsheets, filtering data and preparing data for statistical analyses and plotting in different software packages can introduce all sorts of errors.

There is a better way…

R is a powerful language for data wrangling and analysis because…

It is relatively fast to run and process commands
You can create repeatable scripts
You can trace errors back to their source
You can share your scripts with other people
It is easy to identify errors in large data sets
Having your data in R opens up a huge array of cutting edge analysis tools.
R is also totally FREE!

As R is open source, the more people we can get helping out on the R mailing lists (e.g. R-sig-geo) and contributing their own packages to the wider community, the more powerful R becomes!

For this course, we assume you have a basic understanding of the R environment, and working with RStudio. If you do not have experience working in R we encourage you to go through this online course that will bring you up to speed!

Course Resources

To access data and scripts we will work through in this course, download the Animals-in-Motion-2018 folder from GitHub. This folder contains the course documents, telemetry data and R scripts we are going to work with. To download the folder click on the green Clone or download, dropdown menu and select “Download ZIP”

Once downloaded, the workshop folder contains the following sub folders:

Data
Documents
GIS

First unzip this folder and extract the folder to a location on your computer that you typically store your research files.

To link with these folders, we are going to use the Project functionality of R Studio.

First, open the R Studio program on your computer and in the TOP LEFT corner of the R Studio window, click File > New Project….

Next, create a New Project using the Create Project command (available on the Projects menu and on the global toolbar):

Select Existing Directory then save the project within the “Animals-in-Motion-2018” folder.

When a New Project is created, R Studio:

Creates a Project file (with an .Rproj extension) within the project directory. This file contains various project options (discussed below) and can also be used as a shortcut for opening the project directly from the file system.
Creates a hidden directory (named .Rproj.user) where project-specific temporary files (e.g. auto-saved source documents, window-state, etc.) are stored.
Loads the project into R Studio and display its name in the Projects toolbar (which is located on the far right side of the main toolbar (very TOP RIGHT of the R Studio window).

Why Projects are Awesome

There are a number of good reasons why you should work with Projects in R Studio:

We may want our R scripts to be saved into a place where they link seamlessly to other documents and data files for our research project.
We may also want our tables, figures and statistical results to be written to locations on our computer where they are easy to locate.
We may work on multiple workstations (desktops, laptops) that we want our code to be completely transferable using cloud infrastructure (e.g. Dropbox, OneDrive)
It allows it to share work folders, data and R code with collaborators who are working on the same projects

Installing packages

Part of the reason R has become so popular is the vast array of packages that are freely available and highly accessible. In the last few years, the number of packages has grown exponentially > 10,000 at my last check on CRAN! These can help you to do a galaxy of different things in R, including running complex analyses, drawing beautiful figures, running R as a GIS, constructing your own R packages, building web pages and even writing R course handbooks like this one!

Let’s suppose you want to load the sp package to access this package’s incredible spatial functionality. If this package is not already installed on your machine, you can download it from the web by using the following command in R.

install.packages("sp", repos='http://cran.us.r-project.org')

In this example, sp is the package to be downloaded and ‘http://cran.us.r-project.org’ is the repository where the package will be accessed from.

Multiple packages can be loaded at the same time by listing the required packages in a vector…

install.packages(c("rgdal","rgeos","adehabitatHR",
                   "raster","gdistance",
                   "tidyverse",
                   "XML","gstat","Hmisc",
                   "checkmate","devtools",
                   "vembedr",
                   "leaflet","htmlwidgets"), repos='http://cran.us.r-project.org')

We then load the required packages from our computer’s R package library using the library() function.

library(sp)
library(rgdal)
library(rgeos)
library(adehabitatHR)
library(leaflet)
library(tidyverse)
library(devtools)

It is that simple to access powerful statistical, visualisation and GIS tools in R. Best of all, it is completely FREE!!

Importing large telemetry datasets into R

In the first session we are going to work with a data set containing detection data from 3 Australian Blacktip Sharks (Carcharhinus tilstoni) shown in the image above. These animals were captured and tagged within Cleveland Bay, Townsville roughly one month prior to the landfall of Cyclone Yasi in 2011. Blacktip sharks were tracked using a network of acoustic hydrophones deployed in a grid pattern on the East and West side of Cleveland Bay.

Telemetry data from these sharks were analysed alongside 45 others from five species to examine movement patterns of coastal sharks before, during and after three extreme weather events in Australia (Cyclone Yasi and Tropical Storm Anthony, 2011) and the US (Tropical Storm Gabrielle, 2001). You can read more about that study here.

The web map of detection data we will explore by the end of Session 1

Before we can analyse these data, we first need to read this dataset into R. As with most acoustic detection datasets exported from VUE or other acoustic telemetry data management software, our data set is in the ‘comma sperated value’ (.csv) format.

A .csv file can simply be imported into R using the read.csv base function, and by telling R which file to load (Blacktip_ClevelandBay.csv) and where to find it (i.e. in the ‘Data’ folder).

# Load the blacktip shark data using base read.csv function
blacktip <- read.csv('Data/Blacktip_ClevelandBay.csv', header = TRUE)

However, loading very large datafiles can take a long time when using the read.csv function. A useful alternative is the fread function in the ‘data.table’ package, which is very effective at loading very large files such as those exported from the Vemco VUE database.

library(data.table)
# Load the blacktip shark data using data.table
blacktip <- fread('Data/Blacktip_ClevelandBay.csv', header = TRUE)

# You can also use fread to input data directly from a website URL
blacktip <- fread('https://raw.githubusercontent.com/RossDwyer/OCS-2018-Rworkshop/master/Data/Blacktip_ClevelandBay.csv')

A note about Excel files

Don’t use ‘.xlsx’ or ‘.xls’ files for saving data. The problem with ‘.xls’ and ‘.xlsx’ files are that they store extra info with the data that makes files larger than necessary and Excel formats can also unwittingly reformat or alter your data!

A stable way to save your data is as a ‘.csv’ file. These are simply values separated by ‘commas’ and rows defined by ‘returns’. If you select ‘Save as’ in Excel, you can choose ‘.csv’ as one of the options. If you open the .csv file provided in the ‘Data’ folder using a text editor, you will see it is just words, numbers and commas.

Filtering and manipulating datasets

The ability to quickly filter and manipulate datasets is important when working with large telemetry datasets. The standard input format of tabulated data in R is a data.frame. A data.frame is a special ‘class’ of an object, where there are multiple variables, stored in named columns and multiple rows for our detections. Every variable has the same number of columns. There are simple rules to filter and select data from a data.frame. Rows and columns are indexed by using a comma as a separator df[row,column]. Try the following code:

blacktip[1,] # Provides the first row of data
blacktip[1,3] # Provides the cell value from the 1st row and 3rd column
blacktip[,3]  # Provides the first column of data
blacktip[c(1,3),] # Provides the first and third date in the data frame

We can also access objects in a data.frame by their names:

blacktip$Transmitter #provides all the data for that object
blacktip$Transmitter[1:5] #provides athe first five values for that object
blacktip[,'Transmitter'] #provides all the data for that object

The data.table package enhances the functionality of the traditional data.frame format and allows you to do more than just select data based on row and column names.

Firstly, data loaded using the fread function inputs detection data as a data.table format. This format is slightly different to the data.frame format but can still be maniuplated using the same rules as above.

To understand the enhancements of the data.table format we must first understand the general form of the data.table syntax. data.tables take the form of DT[row, column, by]. The additional by variable allows for quick and easy subsetting and aggregating. Try the following code:

# Subset rows within a data.table
blacktip[Transmitter.Name == "Ana" & Receiver == "VR2W-104912",] # Select detections from 'Ana' detected on "VR2W-104912" receiver.

# Subset rows and columns within a data.table
blacktip[Transmitter.Name == "Ana", .(Receiver, Latitude, Longitude),] # Subsets Receiver, Latitude and Longitude data from `Ana`

# Aggregate numbers of detections from each of the three tagged animal
blacktip[ , .(.N), by = "Transmitter.Name"]

The .N call is a special variable that holds the number of rows in that current group. Grouping by = Transmitter.Name obtains the number of rows (i.e. detections) for each shark. We can include multiple variables in the by parameter to subset our datset further:

# Aggregate numbers of detections at each reciever for each tagged animal
blacktip[ , .(.N), by = c("Transmitter.Name", "Receiver")]

We encourage you to use the package vignettes to explore additional functionality of the data.table package to make data subsetting and manipulation quicker and easier.

browseVignettes("data.table")

Pipes `%>%`

Now that we’ve successfully loaded in our tracking dataset, lets start having a closer look at the data using pipes %>%

Originally from the magrittr package but has been imported to the tidyverse.
%>% is an infix operator. This means it takes two operands, left and right.
‘Pipes’ the output of the last expression/function (left) forward to the first input of the next funciton (right).

# For example, to see what class our data is in, we could use this code...
class(blacktip)

# Alternatively in the tidy verse we could use this code...
blacktip %>% class()

Benefits of pipes %>%

Functions flow in natural order that tells story about data.
Code effects are easy to reason about by inserting View() or head() into pipe chain.
Common style makes it easy to understand collaborator (or your own) code.

We can have a quick look at the data by typing:

# Now insert functions into the pipe chain
blacktip %>% View()
blacktip %>% head() # first 6 rows by default
blacktip %>% tail(10) # specify we want to look at the last 10 rows

This functionality is particularly useful if the data is very large!

Note the (), as opposed to the [] we used for indexing. The () signify a function.

We can look at the data more closely using the nrow(), ncol(), length(), unique(), str() and summary() functions.

blacktip %>% nrow() # number of rows in the data frame
blacktip %>% ncol() # number of columns in the data frame
blacktip %>% str() # provides internal structure of an R object
blacktip %>% summary() # provides result summary of the data frame

# pipes can be used for single column within data frames
blacktip$Transmitter.Name<-
  blacktip$Transmitter.Name %>% as.factor()

# pipes are used to conduct multiple functions on the dataset in a certain order
blacktip %>% 
  subset(Transmitter.Name == "Colin") %>% # subset dataset to include only detections by 'Colin'
  nrow() # number of rows (i.e. detections) from 'Colin'

Pipes can also be used to pre-process our data before plotting them. Lets now use pipes to plot a simple barplot of the number of Colins detections at each reciever.

blacktip %>% 
  subset(Transmitter.Name == "Colin") %>% # subset dataset to include only detections by 'Colin'
  with(table(Station.Name)) %>% # create a table with the number of rows (i.e. detections) per receiver
  barplot(las=2, xlab="Receiver station", ylab="Number of Detections") # barplot of number of Colin's detections recorded per receiver

The tidyverse

Now that you’ve fully jumped into the world of pipes %>%, it’s time to fully introduce you to the tidyverse group of R packages. These have really revolutionised the way we work with big data and the way we visualise our results.

What is the tidyverse?

The tidyverse is the collective name given to suite of R packages designed mostly by Hadley Wickham.

Members of the tidyverse

broom, dplyr, forcats, ggplot2, haven, httr, hms, jsonlite, lubridate, magrittr, modelr, purrr, readr, readxl, stringr, tibble, rvest, tidyr, xml2

`lubridate`

lubridate is an easy way to convert date and time data into a form that R can recognise
Allows for calculation of durations and intervals between dates.
Recognises multiple date time formats and parses them to a standardised ‘POSIX’ format that R uses (ymd for dates; ymd_hms for date and time parsing)
These features are very important when working with spatio-temporal datasets like telemetry data

Currently in our blacktip dataset the “Date.and/Time..UTC.” column is recognised as “characters” and recorded in the Universal Coordinated Time Zone (UTC). Let’s use lubridate to convert this column into the ‘POSIX’ format and into the local date time (i.e. UTC + 10 hours).

library(lubridate)
blacktip$local.Date.time<- 
  blacktip$`Date.and.Time..UTC.` %>%
  ymd_hms(tz="UTC") %>% # first convert the `Date and Time (UTC)` column into a 'POSIX' format 
  with_tz(tzone="Australia/Brisbane") # convert to local "Australia/Brisbane" date time (UTC + 10hrs)

`dplyr`

dplyr is the data wrangling workhorse of the tidyverse.
Provides functions, verbs, that can manipulate data into the shape you need for analysis.
Has many backends allowing dplyr code to work on data stored in SQL databases and big data clusters.
- Works via translation to SQL. Keep an eye out for the SQL flavour in dplyr

Basic vocabulary

select() columns from a tibble
filter() to rows matching a certain condition
mutate() a tibble by changing or adding rows
arrange() rows in order
group_by() a variable
summarise() data over a group using a function

Check out this useful Cheatsheet for data wrangling.

`select`

We can use the select function in dplyr to choose the columns we want to include for our analyses and plotting

# Select the rows we are interested in
blacktip <- 
  blacktip %>% 
  select(local.Date.time, Latitude, Longitude, Receiver, Station.Name, Transmitter.Name, Transmitter, Sensor.Value) %>% # columns we want to include
  select(-Sensor.Value) # the minus symbol denotes columns we want to drop

head(blacktip)

##        local.Date.time  Latitude Longitude    Receiver Station.Name
## 1: 2011-01-24 02:41:31 -19.27705  146.9030    VR2-5052          E18
## 2: 2011-01-24 02:43:35 -19.27705  146.9030    VR2-5052          E18
## 3: 2011-01-24 02:48:25 -19.27705  146.9030    VR2-5052          E18
## 4: 2011-01-24 07:07:34 -19.27527  146.8894 VR2W-104912           E2
## 5: 2011-01-24 07:12:06 -19.27527  146.8894 VR2W-104912           E2
## 6: 2011-01-24 20:57:27 -19.28930  146.8771 VR2W-104197           E1
##    Transmitter.Name    Transmitter
## 1:              Ana A69-1303-63639
## 2:              Ana A69-1303-63639
## 3:              Ana A69-1303-63639
## 4:              Ana A69-1303-63639
## 5:              Ana A69-1303-63639
## 6:              Ana A69-1303-63639

`filter` and `arrange`

We can use these functions to subset the data to rows matching logical conditions and then arrange according to particular attributes

blacktip %>%
  filter(Transmitter.Name == "Ana") %>%
  arrange(local.Date.time) # arrange Ana's detections in chronological order

blacktip %>%
  filter(Transmitter.Name == "Bruce") %>%
  arrange(desc(local.Date.time)) # arrange Bruce's detections in descending chronological order

`group_by` and `summarise`

Determine the total number of detections for each tagged shark

blacktip %>%
  group_by(Transmitter.Name) %>%
  summarise(NumDetections = n()) # summarise number of detections per tagged shark

blacktip %>%
  group_by(Transmitter.Name, Receiver) %>%
  summarise(NumDetections = n()) # summarise number of detections per shark at each receiver

`mutate`

Adding and removing data to the data frame through a pipe

blacktip <- 
  blacktip %>%
  mutate(date=date(local.Date.time)) %>% # adding a column to the blacktip data with date of each detection
  mutate(Transmitter=NULL) # removing the `Transmitter` column

head(blacktip)

##       local.Date.time  Latitude Longitude    Receiver Station.Name
## 1 2011-01-24 02:41:31 -19.27705  146.9030    VR2-5052          E18
## 2 2011-01-24 02:43:35 -19.27705  146.9030    VR2-5052          E18
## 3 2011-01-24 02:48:25 -19.27705  146.9030    VR2-5052          E18
## 4 2011-01-24 07:07:34 -19.27527  146.8894 VR2W-104912           E2
## 5 2011-01-24 07:12:06 -19.27527  146.8894 VR2W-104912           E2
## 6 2011-01-24 20:57:27 -19.28930  146.8771 VR2W-104197           E1
##   Transmitter.Name       date
## 1              Ana 2011-01-24
## 2              Ana 2011-01-24
## 3              Ana 2011-01-24
## 4              Ana 2011-01-24
## 5              Ana 2011-01-24
## 6              Ana 2011-01-24

Data visualisation using `ggplot2`

ggplot2 is a powerful data visualization package for the R programming language. The package makes it very easy to generate some very impressive figures and utilise a range of colour palettes, taking care of many of the fiddly details that can make plotting graphs in R a hassle.

The system provides mappings from your data to aesthetics which are used to construct beautiful plots.

Documentation for ggplot2 can be found here and here.

There is also this awesome Cheetsheet for ggplot2

`ggplot2` grammar

The basic idea: independently specify plot building blocks and combine them to create just about any kind of graphical display you want.

Building blocks of a graph include:

data
aesthetic mapping
geometric object
statistical transformations
scales
coordinate system
position adjustments
faceting

Aesthetic Mapping

In ggplot2, aesthetic means “something you can see”. Aesthetic mapping (i.e., with aes()) only says that a variable should be mapped to an aesthetic. It doesn’t say how that should happen. For example, when mapping a variable to shape with aes(shape = x) you don’t say what shapes should be used. Similarly, aes(color = z) doesn’t say what colors should be used. Describing what colors/shapes/sizes etc. to use is done by modifying the corresponding scale.

In ggplot2 scales include:

position (i.e., on the x and y axes)
color (“outside” color)
fill (“inside” color)
shape (of points)
linetype
size

Each type of geom accepts only a subset of all aesthetics–refer to the geom help pages to see what mappings each geom accepts. Aesthetic mappings are set with the aes() function.

Geometic Objects (geom)

Geometric objects are the actual marks we put on a plot. Examples include:

points (geom_point, for scatter plots, dot plots, etc)
lines (geom_line, for time series, trend lines, etc)
boxplot (geom_boxplot, for, well, boxplots!) A plot must have at least one geom; there is no upper limit. You can add a geom to a plot using the + operator

You can get a list of available geometric objects using the code below:

help.search("geom_", package = "ggplot2")

In the below script we call the data set we have just made (blacktip) and then pipe it into the ggplot() function. We than tell ggplot that we want to plot a box plot.

library(ggplot2)   

blacktip %>%
  group_by(Transmitter.Name, date) %>% 
  summarise(DailyDetections= n()) %>% # use summarise to calculate numbers of detections per day per animal
  ggplot(mapping = aes(x = Transmitter.Name, y = DailyDetections)) + # define the aesthetic map (what to plot)
  xlab("Tag") + ylab("Number of detections per day") +
  geom_boxplot() # define the geometric object (how to plot it).. in this case a boxplot

A common plot used in passive acoustic telemetry to assess temporal patterns in detection is the ‘abacus plot’. This plot can help quickly assess which animals are being detected consistently within your array, and identify any temporal or spatial patterns in detection frequency.

We can adapt the above script to create an abacus plot using our blacktip dataset.

blacktip %>%
  ggplot(mapping = aes(x = local.Date.time, y = as.factor(Transmitter.Name))) + 
  xlab("Date") + 
  ylab("Tag") +
  geom_point()

Additional Task: Now that you’ve plotted the raw dates, can you figure out how to plot daily detections of our tagged sharks

We can also use the facet_wrap() function to explore the detection data further and look at how animals were detected at each reciever.

blacktip %>%
  ggplot(mapping = aes(x = local.Date.time, y = as.factor(Station.Name))) + 
  xlab("Date") + ylab("Receiver station") +
  geom_point() +
  facet_wrap(~Transmitter.Name, nrow=1) # This time plot seperate boxplots for each shark

Additional Task: Can you now plot this with a different colour for each shark?

Session 2: Working with Spatial Objects in R and Interactive Mapping

R offers a variety of functions for importing, manipulating, analysing and exporting spatial data. Although one might at first consider this to be the exclusive domain of GIS software, using R can frequently provide a much more lightweight, yet equally effective solution that embeds within a larger analytic workflow.

One of the tricky aspects of pulling spatial data into your analytic workflow is that there are numerous complicated data formats. In fact, even within R itself, functions from different user-contributed packages often require the data to be structured in very different ways. The good news is that just like the tidyr package family, efforts are underway to standardize spatial data classes in R.

This movement is facilitated by sp, an important base package for spatial operations in R. It provides definitions for basic spatial classes (points, lines, polygons, pixels, and grids) in an attempt to unify the way R packages represent and manage these sorts of data. It also includes some core functions for creating and manipulating these data structures. The hope is that all spatial R packages will use (or at least provide conversions to) the ‘Spatial’ data class and its derivatives, as now defined in the sp package. All else being equal, we favor R functions and packages that conform to the sp standard, as these are likely to provide the greatest future utility and durability.

Here is a very useful style guide for coding using Spatial objects.

Coordinate Reference Systems (CRS)

Central to working with spatial data, is that these data have a coordinate reference system (CRS) associated with it. Geographical CRS are expressed in degrees and associated with an ellipse, a prime meridian and a datum. Projected CRS are expressed in a measure of length (meters) and a chosen position on the earth, as well as the underlying ellipse, prime meridian and datum.

Most countries have multiple coordinate reference systems, and where they meet there is usually a big mess — this led to the collection by the European Petroleum Survey Group (EPSG) of a geodetic parameter dataset.

The EPSG list among other sources is used in the workhorse PROJ.4 library, and handles transformation of spatial positions between different CRS. This library is interfaced with R in the rgdal package, and the CRS is defined partly in the sp package and partly in rgdal.

In the next step, we will convert our blacktip dataset (blacktip) into a spatial object and specify the CRS. We therefore need to refer to the correct proj4 string information which is contained within the rgdal package.

Our reciever coordinates, and hence detection coordinates, were recorded in the WGS 84 geographic datum in Decimal Degrees.

For simplicity, each projection can be referred to by a unique ID from the European Petroleum Survey Group (EPSG) geodetic parameter dataset. You can find the relevant EPSG code for your coordinate system from this website. There, simply enter in a key word in the search box and select from the list the correct coordinate system. There is a map image in the top right of the site to help you.

The equivalent EPSG code for WGS 84 is 4326

To set the spatial projection we use the proj4string() function

# convert our blacktip data.frame into a spatial object by assigning latitude and longitude values
coordinates(blacktip)<- c("Longitude","Latitude")

# Lets check if we have assigned a CRS to our spatial data
proj4string(blacktip)

# We can now assign the correct CRS class to our blacktip data
WGS <- CRS("+init=epsg:4326")
proj4string(blacktip) <- WGS

Now our blacktip data are associated with the correct CRS, we can use this spatial object with many of R’s GIS tools. However, any measurements of distance and area calculated using this dataset will provide information in decimal degrees, which may not be very useful. In order to calculate distances and areas correctly, we need to transform our data to the correct spatial projection.

From the above graphic, can you see which is the correct projection for the Townsville area - here’s a hint

GDA <- CRS("+init=epsg:28355") # The equivalent EPSG code for WGS 84 is 28355. 
blacktip.GDA <- spTransform(blacktip, CRSobj = GDA)

ESRI Shapefile import and export

Once you have the Spatial object the way you like it, you will want to export it to view in a GIS. Here we show you two options for exporting your Spatial object, as a shapefile or as a .kml for viewing in Google Earth. As with reading in spatial objects, there are a number of R packages out there to help you. Simply type ??kml or ??shapefile to look up a few of these options.

In this example we are goin to use the writeOGR() function in the rgdal package to write the data containing our blacktip dataset to a shapefile to view in a GIS.

# First lets make sure our data is a spatial object
class(blacktip)

We can then run the writeOGR function to write this file to disk. We specify the folder as being within the GIS folder in our R course project.

writeOGR(blacktip, # This field needs to be a SpatialPoints*, SpatialLines* or SpatialPolygons* object
         dsn="GIS/blacktip.shp", layer= "blacktip.shp", driver="ESRI Shapefile")

Additional Task: Can you output the blacktip shark data as a KML file to view on Google Earth?

Similarly, we can use the readOGR() function to upload our blacktip data onto R

blacktip_shp<-
  readOGR(dsn="GIS/blacktip.shp", layer= "blacktip")

`SpatialPoints` and receiver locations

The most basic spatial data object is a point, which may have 2 (X, Y) or 3 components (X, Y, Z). A single coordinate, or a collection of coordinates, may be used to define a SpatialPoints object.

In this exercise we are going to convert our receiver locations from a standard data frame object into a SpatialPointsDataFrame object.

# First load our VR2-W installation dataset for the Cleveland Bay study
statinfo <- read.csv('Data/Station_information.csv')
cb <- statinfo[statinfo$Installation %in% "Cleveland Bay",]

# Now convert the data.frame object into a SpatialPoints object
coordinates(cb) <- c("Longitude", "Latitude")

# Have a look at the created object
class(cb)
str(cb)

Notice the class has now become a SpatialPointsDataFrame. The str() output contains lots of @ symbols which denote a different slot in this S4 R object. Typing cb@data will extract the attribute data (similar to the attribute table in ArcGIS). The X and Y locational information can now be found in the @coords slot. In addition @bbox contains the bounding box coordinates and the @pro4string contains the projection, or coordinate reference system (CRS) information.

cb@coords
cb@proj4string
cb@bbox
head(cb@data)

# alternatively use the slot command to extract different 
# packages of data. As the data is stored in the data slot
slot(cb,'data')

Now let’s draw a simple spatial plot of the receiver locations.

# Now create a plot for the receiver stations
data.frame(cb) %>%
  ggplot(mapping = aes(x = Longitude, y = Latitude)) + 
  xlab("Longitude") + ylab("Latitude") +
  geom_point()

We can also plot nicer looking static maps using the ggmap package

# Lets replot our reciever stations with a Google base map using the `ggmap` package
library(ggmap)

# First define our region of Cleveland Bay
cb_bbox <- make_bbox(lon = cb$Longitude, 
                     lat = cb$Latitude, f = 0.3) # f controls the Google Zoom

# Lets download a Google base map for our region
cb_map <- get_map(location = cb_bbox, 
                  source = "google", maptype = "hybrid")

# Finally plot the reciever stations using the same `ggplot` grammar
ggmap(cb_map) + 
  geom_point(data = data.frame(cb), 
             mapping = aes(x = Longitude, y = Latitude)) +
  xlab("Longitude") + ylab("Latitude")

Interactive maps using `leaflet`

Now that we have allocated our data as a Spatial* object, we can also use the incredibly powerful Leaflet package in R to plot the locations of our hydrophones on an interactive map that you can share with your collaborators. The leaflet package utilises the open-access leaflet JavaScript library (a web programming language), to create web-based maps. We will cover a basic map with some points on it today. See Rstudio’s leaflet page for help and more mapping features, like polygons.

To get started with leaflet, first, make sure you have the leaflet and htmlwidgets packages installed (we did this earlier in the workshop using install.packages()), and then load it into this session:

library(leaflet)
library(htmlwidgets)

We start by specifying a base layer, then we can add features to it. We will string together our layers using the pipes (%>%) command. See this page for the numerous base map options. Finally, we add some markers, located using the longitude and latitude variables in the blacktip data frame:

myleafletplot <- leaflet() %>%
  
  # Base groups (you can add multiple basemaps)
  addProviderTiles(providers$Esri.WorldImagery, group="Satellite") %>%
  addProviderTiles(providers$OpenStreetMap, group="Map") %>%
  
  # Add reciever location data
  addCircles(lng=cb$Longitude, lat=cb$Latitude, fill=FALSE, color="gray", weight=8) 

myleafletplot # Print the map

We can also now add the detection data from our tagged sharks to show their positions within the array.

# First lets convert our blacktip data into a spatial points data frame object
blacktip <- read.csv('Data/Blacktip_ClevelandBay.csv', header = TRUE)
coordinates(blacktip) <- c("Longitude","Latitude")

# Subset detection data from "Ana" and remove duplicate positions
ana <-
  blacktip %>%
  subset(Transmitter.Name == "Ana") %>%
  remove.duplicates() # remove duplicated positions to reduce the number of points to plot on our leaflet map

Now lets add Ana’s detection data to our leaflet map

myleafletplot %>%
  
  # add the tag detection data
  addCircleMarkers(lng = ana$Longitude, lat = ana$Latitude, 
                   weight = 2, radius = 4, color = "red",
                   stroke = FALSE, fillOpacity = 1, 
                   group = "Ana") %>% # Dont forget to assign a group to the markers
  
  # Layers control
  addLayersControl(
    baseGroups = c("Satellite", "Map"),
    overlayGroups = c("Ana"), # add the groups you want to overlay on the base maps
    options = layersControlOptions(collapsed = FALSE))

Additional Task: Now try adding detection data from “Colin” and “Bruce” to the map!

Finally, you can save this leaflet document as a .html file and share it with your friends or colleagues. To save the interactive map as an html document, type the you can use the saveWidget() function and the file will be saved in your project folder.

saveWidget(myleafletplot, file="mymap.html") # uses the htmlwidgets package

Session 3: Visualising Animal Movements through Time

In this third session we are going to work with a data set containing detection data from 3 Estuarine Crocodiles (Crocodylus porosus) as shown in the image above. These animals were captured and tagged within the Wenlock River, Cape York. Crocodiles were dual tagged using both GPS transmitters and using a network of acoustic hydrophones deployed along the river system. You can read more about this study here

For this session, we will need the most recent version of the VTrack package from our GitHub repository. This contains some new functionality that currently is not on CRAN.

library(devtools) # Helps download development versions of R packages

devtools::install_github("RossDwyer/VTrack") # Install development version of VTrack directly from github

We then load the dataset containing the crocodile tracks and the receiver locations from within the VTrack package using the data() command

library(VTrack)

data(crocs) # Load crocodile demo  datset from the VTrack package
crocs %>% head()

##             Date.Time Code.Space  ID Sensor.1 Units.1 Sensor.2 Units.2
## 1 2008-09-01 00:00:48   A69-1105 139     0.20       m       NA      NA
## 2 2008-09-01 02:49:23   A69-1105 139     0.00       m       NA      NA
## 3 2008-09-01 03:02:51   A69-1105 139     0.20       m       NA      NA
## 4 2008-09-01 03:14:18   A69-1105 139     0.20       m       NA      NA
## 5 2008-09-01 03:27:18   A69-1105 139     0.20       m       NA      NA
## 6 2008-09-01 03:35:18   A69-1105 139     0.08       m       NA      NA
##   Transmitter.Name Transmitter.S.N Receiver.Name Receiver.S.N Station.Name
## 1  Jack-o-saurus D         1057130   VR2W-103551       103551           NA
## 2  Jack-o-saurus D         1057130   VR2W-103551       103551           NA
## 3  Jack-o-saurus D         1057130   VR2W-103551       103551           NA
## 4  Jack-o-saurus D         1057130   VR2W-103551       103551           NA
## 5  Jack-o-saurus D         1057130   VR2W-103551       103551           NA
## 6  Jack-o-saurus D         1057130   VR2W-103551       103551           NA
##   Station.Latitude Station.Longitude
## 1               NA                NA
## 2               NA                NA
## 3               NA                NA
## 4               NA                NA
## 5               NA                NA
## 6               NA                NA

data(PointsDirect_crocs) # Load reciever location dataset
# Load Wenlock points file
data(PointsDirect_crocs)
PointsDirect_crocs %>% head()

##   LOCATION  LATITUDE LONGITUDE RADIUS
## 1   103561 -12.25719  141.9228    100
## 2   103549 -12.25867  141.9304    100
## 3   103562 -12.25553  141.9419    100
## 4   103547 -12.25526  141.9436    100
## 5   103563 -12.27036  141.9725    100
## 6   103548 -12.27114  142.0695    100

Lets use leaflet to have a look at the receiver array

leaflet() %>%
  # Base group
  addProviderTiles(providers$Esri.WorldImagery, group="Satellite") %>%

  # Add reciever location data
  addCircles(lng = PointsDirect_crocs$LONGITUDE, 
             lat = PointsDirect_crocs$LATITUDE, 
             fill=TRUE, color="white", weight=10)

Additional Task: Add the detections of “Gecko” the croc (from the crocs dataset) to this leaflet plot. [hint: you will have to use the ‘left_join()’ function to merge the receiver location to the crocs dataset, lookup ?left_join()]

Producing track animations

Understanding movement paths using telemetry data can easily be done using animations of tracks. Here we will use the move and moveVis packages to create a simple animation of croc tracks.

# install packages and load libraries
devtools::install_github("16EAGLE/moveVis") # Install development version of moveVis directly from github
library(move)
library(moveVis)

External library requirements

To create movies (.mp4, .mov) and GIF, we will need at least one of three external libraries: ‘ffmpeg’, ‘libav’ and/or ‘ImageMagick’. They support different types of output formats (gif, mov, mp4 etc.). If you have ‘ImageMagick’ and either ‘ffmepg’ or ‘libav’ installed, you can use all output formats supported by moveVis.

First, check if you have these external libraries already installed by running this code

get_libraries()

If you do not have any of the above external libraries installed on your system, follow these instructions to get you up and running!

Now lets format our croc data.

# Merge locations of recievers with detection data for "Jack-o-saurus D"
jacko_detections <- 
  crocs %>%
  subset(Transmitter.Name == "Jack-o-saurus D") %>%
  merge(., PointsDirect_crocs, by.x="Receiver.S.N", by.y="LOCATION") %>%
  mutate(Latitude=LATITUDE, Longitude=LONGITUDE, Transmitter=ID,
         Transmitter.Serial=Transmitter.S.N, Sensor.Value=Sensor.1, Sensor.Unit=Units.1)

# Temporally standardise detection data using short-term centers of activity 
# (We will cover this in more detail in Session 4!!)
jacko <- COA(jacko_detections, "Transmitter.Name", 200)

# Format the date time field so R can read it properly
jacko$dt <- lubridate::ymd_hms(jacko$DateTime)

# Lets have a quick look at what jacko's tracks look like
ggplot() +
  geom_point(aes(x = Longitude, y = Latitude), data = jacko_detections, shape = 19) +
  geom_point(aes(x = Longitude.coa, y = Latitude.coa), data = jacko, color = 2, size = 0.5) +
  xlab("Longitude") + ylab("Latitude")

Now lets animate Jacko’s tracks using the move and moveVis packages

# We first have to create a 'move' object that has at least 3 sets of data
# 'x' (longitude), 'y' (latitude) and 'time'
jacko_move <- move(x = jacko$Longitude.coa,
                   y = jacko$Latitude.coa,
                   time = jacko$dt,
                   proj = CRS("+proj=longlat +datum=WGS84"),
                   data = jacko)

Now lets set some additional variables needed to create the animation:

# Identify the libraries you have available on your system
conv_dir <- get_libraries()

# Find out which output file formats can be created
get_formats()

# Specify output directory and create a folder to output your animation
out_dir <- ("~/Desktop/animation")
dir.create("~/Desktop/animation")

#Specify some optional appearance variables
img_title <- "Movement of Estuarine crocodiles within the Wenlock River, Cape York"
img_sub <- "Jack-o-saurus D"

Finally, we can use the animate_move() function to create our track animation.

# Create jacko's track animation
animate_move(jacko_move, out_dir, conv_dir = conv_dir,
             paths_mode = "simple", frames_nmax = 50, layer="basemap",
             img_title = img_title, img_sub = img_sub, out_format = "mp4")

# Note: for this demonstration we have limited the animation to 50 frames
# It can take multiple hours/days to produce a full track animation when you
# have so much data!!

Integration with Google Earth

Google Earth offers a simple yet powerful way of visualising your acoustic tracking data through time. However pulling detection datasets into Google Earth can be challenging given the size of many detection files. The VTrack R package has a few handy functions for visualising your tag detections as track animations in Google Earth).

For this to work, your receiver locations MUST be in the WGS84 coordinate reference system (CRS) and you will need to have Google Earth downloaded on your machine. If you have not already got it, Google Earth can be downloaded for free here

First let’s load our crocodile detection dataset into the VTrack format.

# Load crocodile datset into VTrack archive
Vcrocs <- ReadInputData(infile = crocs,
                        iHoursToAdd = 10, # Our dataset is in UTC +10 hours
                        sVemcoFormat = '1.0')
Vcrocs %>% head()

##              DATETIME TRANSMITTERID SENSOR1 UNITS1 RECEIVERID STATIONNAME
## 1 2008-09-01 10:00:48           139    0.20      m     103551     Unknown
## 2 2008-09-01 12:49:23           139    0.00      m     103551     Unknown
## 3 2008-09-01 13:02:51           139    0.20      m     103551     Unknown
## 4 2008-09-01 13:14:18           139    0.20      m     103551     Unknown
## 5 2008-09-01 13:27:18           139    0.20      m     103551     Unknown
## 6 2008-09-01 13:35:18           139    0.08      m     103551     Unknown

Once we have our dataset in the VTrack archive format and a seperate data frame containing the receiver locations, we can then run VTrack’s KML creator functions. GenerateAnimationKMLFile_Track() generates a moving arrow for a single transmitter as it moves between the detection field of adjacent receiver stations.

levels(Vcrocs$TRANSMITTERID) # Extracts the transmitter code

# Run the function to generate the KML for a single transmitter
GenerateAnimationKMLFile_Track(Vcrocs, # VTrack archive file
                               "139", # Transmitter code
                               PointsDirect_crocs, # points file
                               "GIS/Track1.kml", # file name
                               "cc69deb3", # colour of the track
                               sLocation="RECEIVERID")

The file will be written in your Project folder and can be opened in Google Earth. Users can adjust the time slider to visualise individual time periods for display or can click the spanner icon to slow down the speed of the animation.

Additional Task: try generating animations for the other crocodiles in the data frame

There is also a another handy function for visualising the movements of multiple animals at the same time.

GenerateAnimationKMLFile_Multitag(Vcrocs,
                                  PointsDirect_crocs,
                                  "GIS/Croc Multi.kml",
                                  sLocation="RECEIVERID")

Calculating distance metrics (direct, circuitous, least cost)

In this next exercise we will extract a metric to compare how far each of our tagged animals moved during the period of study. To do this, we use one of the Generate*Distance() functions within VTrack which calculates the distances between each of our hydrophone stations and assembles these distances within a matrix.

Direct distance

If our receivers are in open water or in an enclosed bay or dam, there may be no obvious barriers to animal movement so we could use the GenerateDirectDistance() function to generate our distance matrix.

VR2ArrayDM <- GenerateDirectDistance(PointsDirect_crocs) 
VR2ArrayDM

##        DM     103561     103549    103562    103547    103563    103548
## 1  103561  0.0000000  0.6471471  1.881276  2.074469  5.394396 15.819379
## 2  103549  0.6471471  0.0000000  1.090251  1.282652  4.549648 14.976537
## 3  103562  1.8812763  1.0902511  0.000000  0.000000  3.512702 13.779413
## 4  103547  2.0744690  1.2826518  0.000000  0.000000  3.356438 13.593494
## 5  103563  5.3943955  4.5496478  3.512702  3.356438  0.000000 10.344637
## 6  103548 15.8193785 14.9765370 13.779413 13.593494 10.344637  0.000000
## 7  103564 19.3263822 18.4792507 17.342354 17.163531 13.747736  3.760611
## 8  103551 20.9430834 20.1043298 19.063703 18.897261 15.366617  6.512850
## 9  103557 21.5580310 20.7371723 19.789242 19.635306 16.078979  8.376548
## 10 103556 22.7669620 21.9521408 21.028185 20.877384 17.323324  9.705648
## 11 103566 24.5782507 23.7808895 22.920078 22.777795 19.242669 12.099660
## 12 103550 26.5498008 25.7662512 24.949360 24.812961 21.299637 14.387823
## 13 103565 28.9378394 28.1573622 27.347990 27.212376 23.701708 16.615428
## 14 103552 30.9965651 30.1835023 29.260079 29.108549 25.553530 17.156852
## 15 103555 32.3819385 31.5579707 30.588612 30.430790 26.875910 17.979824
## 16 103554 32.7809210 31.9530064 30.965332 30.805043 27.253680 18.176971
## 17 103560 33.9036784 33.0773337 32.096558 31.937131 28.384290 19.335772
## 18 103559 34.9619912 34.1310983 33.128016 32.965581 29.419037 20.150172
## 19 103553 37.2913375 36.4564511 35.431472 35.266074 31.729162 22.238218
## 20 103558 37.7319469 36.8931403 35.844499 35.676017 32.153396 22.474688
##       103564    103551    103557    103556    103566    103550    103565
## 1  19.326382 20.943083 21.558031 22.766962 24.578251 26.549801 28.937839
## 2  18.479251 20.104330 20.737172 21.952141 23.780889 25.766251 28.157362
## 3  17.342354 19.063703 19.789242 21.028185 22.920078 24.949360 27.347990
## 4  17.163531 18.897261 19.635306 20.877384 22.777795 24.812961 27.212376
## 5  13.747736 15.366617 16.078979 17.323324 19.242669 21.299637 23.701708
## 6   3.760611  6.512850  8.376548  9.705648 12.099660 14.387823 16.615428
## 7   0.000000  2.931700  5.196961  6.388394  8.781520 11.020657 13.098692
## 8   2.931700  0.000000  2.172210  3.268928  5.651550  7.889866  9.996114
## 9   5.196961  2.172210  0.000000  1.152558  3.523165  5.811686  8.074579
## 10  6.388394  3.268928  1.152558  0.000000  2.210228  4.488931  6.726023
## 11  8.781520  5.651550  3.523165  2.210228  0.000000  2.088907  4.402984
## 12 11.020657  7.889866  5.811686  4.488931  2.088907  0.000000  2.203678
## 13 13.098692  9.996114  8.074579  6.726023  4.402984  2.203678  0.000000
## 14 13.287779 10.500000  9.284697  8.032929  6.417234  5.151216  3.776521
## 15 14.035756 11.498302 10.629864  9.478970  8.177470  7.164322  5.866624
## 16 14.220242 11.792460 11.060138  9.955027  8.780395  7.875045  6.644070
## 17 15.378827 12.942941 12.165327 11.034062  9.749719  8.673083  7.202649
## 18 16.190297 13.901206 13.278172 12.195775 11.028359 10.029008  8.579156
## 19 18.293816 16.170599 15.684223 14.639039 13.529542 12.518597 10.979743
## 20 18.557189 16.588880 16.258082 15.271921 14.306338 13.425388 11.995031
##       103552     103555     103554     103560    103559    103553
## 1  30.996565 32.3819385 32.7809210 33.9036784 34.961991 37.291337
## 2  30.183502 31.5579707 31.9530064 33.0773337 34.131098 36.456451
## 3  29.260079 30.5886116 30.9653319 32.0965579 33.128016 35.431472
## 4  29.108549 30.4307902 30.8050429 31.9371306 32.965581 35.266074
## 5  25.553530 26.8759101 27.2536799 28.3842901 29.419037 31.729162
## 6  17.156852 17.9798237 18.1769705 19.3357717 20.150172 22.238218
## 7  13.287779 14.0357565 14.2202423 15.3788268 16.190297 18.293816
## 8  10.500000 11.4983025 11.7924596 12.9429407 13.901206 16.170599
## 9   9.284697 10.6298640 11.0601381 12.1653266 13.278172 15.684223
## 10  8.032929  9.4789702  9.9550274 11.0340619 12.195775 14.639039
## 11  6.417234  8.1774696  8.7803950  9.7497192 11.028359 13.529542
## 12  5.151216  7.1643220  7.8750455  8.6730825 10.029008 12.518597
## 13  3.776521  5.8666244  6.6440699  7.2026486  8.579156 10.979743
## 14  0.000000  1.8965910  2.6686379  3.3314273  4.700188  7.174198
## 15  1.896591  0.0000000  0.5864414  1.3743882  2.676557  5.180581
## 16  2.668638  0.5864414  0.0000000  0.9589006  2.055783  4.549720
## 17  3.331427  1.3743882  0.9589006  0.0000000  1.177639  3.645523
## 18  4.700188  2.6765574  2.0557827  1.1776389  0.000000  2.305652
## 19  7.174198  5.1805815  4.5497200  3.6455228  2.305652  0.000000
## 20  8.116282  6.0612341  5.3654575  4.5982252  3.220943  0.998162
##       103558
## 1  37.731947
## 2  36.893140
## 3  35.844499
## 4  35.676017
## 5  32.153396
## 6  22.474688
## 7  18.557189
## 8  16.588880
## 9  16.258082
## 10 15.271921
## 11 14.306338
## 12 13.425388
## 13 11.995031
## 14  8.116282
## 15  6.061234
## 16  5.365458
## 17  4.598225
## 18  3.220943
## 19  0.998162
## 20  0.000000

Circuitous distance

In our crocodile example however, our animal’s movements are bounded by the banks of the river and so cannot move directly (as the crow flies) between hydrophone stations. As a work around, VTrack has a function GenerateCircuitousDistance() that calculates distances in series through a set of locations and waypoints to form an indirect path between receiver stations.

# Load the points file
data(PointsCircuitous_crocs)

# Generate the Circuitous Distance Matrix
CircuitousDM <- GenerateCircuitousDistance(PointsCircuitous_crocs)

# Now plot example of how circuitous distances between receivers were generated
# In this example an individual must follow the course of the river in order to
# move between receivers

PointsCircuitous_crocs %>% 
  ggplot(aes(x=LONGITUDE, y=LATITUDE)) +
  xlab("Longitude") +
  ylab("Latitude") +
  geom_path() + 
  geom_point() +
  geom_point(data = PointsDirect_crocs, 
             aes(x = LONGITUDE, y = LATITUDE), 
             color = "red", size = 3, shape = 16) + # Plots the hydrophones in a different colour)
  geom_point(data = PointsDirect_crocs[1,], 
             aes(x = LONGITUDE, y = LATITUDE), 
             color = "green", size = 3, shape = 16) # Plots 103561 (i.e. the most downstream hydrophone) in a different colour)

We would then use this within the RunResidenceExtraction() function in VTrack to extract the movements between hydrophone stations and link this to our measure of distance travelled

Res <- RunResidenceExtraction(Vcrocs,  # Our data frmae
                             "RECEIVERID",  # Whether we want to work with receiver serial numbers of station names 
                             1, # the minimun number of detections at a receiver to record the presence of a tag             
                             60*60*12, # the time period between detections before a residence efent 'times out'
                             sDistanceMatrix = CircuitousDM) # our distance matrix containing distances between receiver locations

# Our residence file
head(Res$residences)

##             STARTTIME             ENDTIME RESIDENCEEVENT TRANSMITTERID
## 1 2008-09-01 10:00:48 2008-09-02 02:23:25              1           139
## 2 2008-09-02 15:25:58 2008-09-02 18:50:24              2           139
## 3 2008-09-03 09:26:51 2008-09-03 09:26:51              3           139
## 4 2008-09-03 12:29:03 2008-09-03 12:40:24              4           139
## 5 2008-09-03 18:26:26 2008-09-03 18:37:17              5           139
## 6 2008-09-05 03:48:35 2008-09-05 04:55:22              6           139
##   RECEIVERID DURATION ENDREASON NUMRECS
## 1     103551    58957   timeout      61
## 2     103551    12266   timeout       9
## 3     103551        0  receiver       1
## 4     103557      681  receiver       2
## 5     103556      651  receiver       2
## 6     103566     4007  receiver       3

# Our log file
head(Res$residenceslog)

##              DATETIME RESIDENCEEVENT RECORD TRANSMITTERID RECEIVERID
## 1 2008-09-01 10:00:48              1      0           139     103551
## 2 2008-09-01 12:49:23              1      1           139     103551
## 3 2008-09-01 13:02:51              1      2           139     103551
## 4 2008-09-01 13:14:18              1      3           139     103551
## 5 2008-09-01 13:27:18              1      4           139     103551
## 6 2008-09-01 13:35:18              1      5           139     103551
##   ELAPSED
## 1       0
## 2   10115
## 3     808
## 4     687
## 5     780
## 6     480

# Our movements file
head(Res$nonresidences)

##             STARTTIME             ENDTIME NONRESIDENCEEVENT TRANSMITTERID
## 3 2008-09-03 09:26:51 2008-09-03 12:29:03                 1           139
## 4 2008-09-03 12:40:24 2008-09-03 18:26:26                 2           139
## 5 2008-09-03 18:37:17 2008-09-05 03:48:35                 3           139
## 6 2008-09-05 04:55:22 2008-09-05 06:45:32                 4           139
## 7 2008-09-05 06:45:32 2008-09-05 10:59:28                 5           139
## 8 2008-09-05 12:17:41 2008-09-05 15:03:10                 6           139
##   RECEIVERID1 RECEIVERID2 DURATION DISTANCE        ROM
## 3      103551      103557    10932 2994.901 0.27395726
## 4      103557      103556    20762 1597.359 0.07693667
## 5      103556      103566   119478 3068.188 0.02567994
## 6      103566      103556     6610 3068.188 0.46417366
## 7      103556      103557    15236 1597.359 0.10484111
## 8      103557      103551     9929 2994.901 0.30163166

Least cost distance

In many situations however, acoustic array designs will not fall into perfect series like our above example. Arrays will insead be made up of multiple creeks and branches like this.

In this image the green area is the river and red crosses are the hydrophone locations. In this sort of array design, it would be very challenging to calculate the distances between so many receiver stations.

This function works by using a rasterised version of our study region (1 = water, 0 = land) then calculating the shortest (i.e. least cost) route transitioning from one cell to another to connect adjacent receiver stations. The function makes use of the gdistance R package and will work using any arrangement of acoustic arrays provided they are connected by water.

First we load the raster layer of our study area using the raster package and convert this to a transition object using the gdistance package.

library(raster)
library(gdistance)

WaterRaster <- raster("GIS/wenlock raster UTM.tif") # Load the raster
tr <- transition(WaterRaster, 
                 transitionFunction = mean, 
                 directions = 8) # Create a Transition object from the raster

Next we load the tracking dataset and points file which uses a multi-branch array

PointsLeastCost <- read.csv("Data/PointsLeastCost_crocs.csv")

# convert the raster to points for plotting
map.p <- rasterToPoints(WaterRaster)
df <- data.frame(map.p) # Make the points a dataframe for ggplot
colnames(df) <- c("X", "Y", "Water") # Make appropriate column headings

# Now plot the map
ggplot(data = df, aes(y = Y, x = X)) +
  geom_raster(aes(fill = Water)) +
  geom_point(data = PointsLeastCost, 
             aes(x = X, y = Y), 
             color = "red", size = 3, shape = 20) +
  theme_bw() +
  coord_equal()

Finally we run the new GenerateLeastCostDistance() function in VTrack to generate the distance matrix containing the least cost paths.

GenerateLeastCostDistance(sPointsFile = PointsLeastCost,
                          sTransition = tr)

Plotting distance travelled through time

Generating distance matrices can also be a useful way of extracting the distance of each receiver stations to a specific location (i.e. receiver positioned at the mouth of an estuary, instream structure or other point of interest). These distances can then be merged with the original detection dataset and plotted through time to look at associations and commonality in the timing of animal movements.

In this next exercise we are going to use the left_join function within the dplyr package.

DistDownstream <- CircuitousDM[,1:2] # i.e. distance from the the most downstream hydrophone - serial number 103561 
names(DistDownstream) <- c("RECEIVERID", "Distance.DS")

# Make sure R knows that the Receiver IDs are both character vectors
Vcrocs$RECEIVERID <- as.character(Vcrocs$RECEIVERID)
DistDownstream$RECEIVERID <- as.character(DistDownstream$RECEIVERID)

# Use left join to return all rows from x, and all columns from x and y. 
Vcrocsm <- left_join(x = Vcrocs, y = DistDownstream)

# Plot the data
Vcrocsm %>% 
  ggplot(aes(x = DATETIME,
             y = Distance.DS,
             group = TRANSMITTERID,
             colour = TRANSMITTERID)) +
  geom_line(alpha = 0.5) +
  scale_y_continuous(name = "Distance from most downstream receiver (km)")+
  xlab("Date") +
  theme_bw()+
  theme(legend.title = element_blank(),
        legend.position = 'none',
        axis.line = element_line(colour = "black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank())

Session 4: Centres of Activity and Home Range Estimation

In this last session, we will be working with data from three Tiger Sharks (Galeocerdo cuvier) that were monitored around Heron and One Tree Islands, and was contributed for this workshop by Dr. Michelle Heupel. These tiger sharks were monitored over three years as part of a larger study examining the ecology and movement patterns of reef predators. In this session we will use these data to calculate three types of home ranges for tiger sharks around Heron and One Tree Island.

Estimating short-term centers of activity (COA)

A primary use of telemetry data is to understand where animals go and estimate how much space they use. Passive acoustic telemetry, compared to other forms of telemetry (e.g. satellite telemetry, active tracking), are inherently spatio-temporally biased depending on where we deploy our recievers and how often animals come close to recievers. Using raw detection data can produce heavily biased measures of home ranges.

Estimating short-term centers of activity (COA) as a first step can help account for some of the spatio-temporal biases when calculating home range metrics. COAs are estimated by calculating mean positions of a tagged animal within set temporal bins. These positions do not account for the varying listening ranges between recievers within an array, therefore detection probability needs to be considered using other techniques (e.g. kernel distribution smoothing parameter). This technique of calculating short-term centers of activity is based on this study.

In this next exercise, lets estimate and plot short-term centers of activity for the three tiger sharks monitored at Heron and One Tree Islands.

First, lets use the techniques we learnt earlier today to input, manipulate and plot the raw detection data.

library(data.table)
library(lubridate)
library(ggmap)

# First lets input the tigershark dataset
tiger <- fread("Data/TigerShark_HeronIsland.csv")

# Lets convert the UTC time recorded to local time (UTC + 10 hours)
tiger$local.Date.time <- 
  tiger$`Date.and.Time..UTC.` %>%
  ymd_hms(tz = "UTC") %>% 
  with_tz(tzone = "Australia/Brisbane") 

tiger <- 
  tiger %>%
  mutate(id = Transmitter.Name) %>%
  dplyr::select(-`Date.and.Time..UTC.`)

# Lets plot our raw detection using `ggmap`
hoi_bbox <- make_bbox(lon = tiger$Longitude, lat = tiger$Latitude, f=0.3)

hoi_map <- get_map(location = hoi_bbox, source = "google", maptype = "satellite")

tiger_plot <-
  ggmap(hoi_map) + 
  geom_point(data = tiger, mapping = aes(x = Longitude, y = Latitude), shape = 1, color = 8) +
  xlab("Longitude") + 
  ylab("Latitude") +
  facet_wrap(~`id`, nrow=1)

tiger_plot # view map of raw detections

Now that the data is correctly formatted, we can calculate centers of activity using the COA() function in the VTrack package.

library(VTrack)
?COA

There are three input variables needed to calculate COAs using this function:

data : the data frame with your tag detection data
id : the name of the column with a unique identifier for each tag/animal
timestep : the size of the temporal bin used to calculate COAs (in minutes)

Note: The selection of time bins (timesteps) can greatly influence your COA positions, therefore test multiple timestep bins before you conduct further analyses. See this paper for more details on how to choose a timestep.

# lets estimate COAs for our tiger dataset. We will use 60 min timesteps for this excersise
coa_list <- COA(tagdata = tiger, id="Transmitter.Name", timestep= 60)

# If your tagdata file has multiple ids this function outputs a list with COAs estimated for each tag id
summary(coa_list)

# to help with plotting lets combine all the COAs into one data.table using the `rbindlist()` function
tiger_coa <- 
  rbindlist(coa_list) %>% 
  mutate(id=Transmitter.Name)

# Now we can compare our COAs with the raw detections
tiger_plot +
  geom_point(data = tiger_coa, mapping = aes(x = Longitude.coa, y = Latitude.coa), color=2, size=0.2)

Calculating and plotting home range area metrics

Now that we have our COA positions estimated for the tiger shark dataset (tiger_coa) lets look at different options of calculating home ranges. In this workshop we will go through three commonly used home range metrics used in movement ecology. These range in modelling complexity, and will give you different values of home range. Which one you use will depend on the research question you want to answer and what level of modelling complexity is most appropriate for your work.

In this workshop we will be using the adehabitatHR package to calculate all three metrics for our tiger shark data.

# lets load up the adehabitatHR library and others we will need
library(adehabitatHR)
library(raster)

Minimum Convex Polygons (MCP)

Minimum convex polygons (MCP) are the simplest metric of home range we can calculate for a tagged animal. This is probably the most widely used metric for home range in both marine and terrestrial systems. All this technique does is simply calculating the smallest convex polygon that encapsulates your relocation data, in our case COA positions.

Lets use the mcp() function to calculate MCPs for all three tiger sharks and calculate the area within using the mcp.area() function.

# Before we calculate the MCP we need to make our tiger_coa data into a spatial object
coordinates(tiger_coa) <- c("Longitude.coa", "Latitude.coa")
proj4string(tiger_coa) <- CRS("+init=epsg:4326")

# Now we can use the mcp() fuction to create mcps
tiger_mcp <- mcp(tiger_coa[,"id"], percent = 100)

Lets add our COA positions and mcps on a leaflet map for all our tiger sharks

mcp_ll <-
  tiger_mcp %>%
  fortify()

tiger_plot +
  geom_point(data = data.frame(tiger_coa), mapping = aes(x = Longitude.coa, y = Latitude.coa), color=2, size=0.2) +
  geom_polygon(data= mcp_ll, aes(x = long, y = lat), fill=NA, color="green")

We can also calculate the area of MCPs using the mcp.area() function

# first lets project our data so the area calculations are not in degrees
tiger_GDA <- spTransform(tiger_coa, CRSobj = CRS("+init=epsg:28355"))

# now we can calculate MCP area in square kilometers
mcp.area(tiger_GDA[,"id"], unin = "m", unout ="km2", percent = 100, plotit=FALSE)

##     Danielle     Emma     Tony
## 100 99.12507 84.75528 125.7409

Kernel Utilisation Distributions (KUD)

Although MCPs are the most commonly used metric of space use of animals, they provide limited information on how animals use that space. The next level of complexity when examining activity space is to understand its ‘utilisation distribution’. This is in essence, a bivariate probability density function fitted to relocation data from a tagged animal. In movement ecology, kernel models are most often used to fit these probability functions.

For this next step, lets calculate kernel utilisation distributions (KUDs) for our tiger shark dataset using the kernelUD() function.

One of the main considerations for this analysis is the selection of the ‘smoothing parameter’ h. The selection of smoothing parameter is related to the level of imprecision related to each relocation, and in the context of passive acoustic telemetry can be related to reciever listening range. There are several things to consider when selecting an appropriate h value, and you can read more about that here, but for this workshop lets use the “reference bandwidth”.

# Lets use our projected data `tiger_GDA` to calculate KUDs for each shark
# We will use "href" as our smoothing parameter
tiger_KUD <- kernelUD(tiger_GDA[,"Transmitter.Name"], h="href", grid=300) # grid value indicates output resolution

Core home ranges are often defined as the 50% KUD contour and extent of home range as the 95% UD contour. Now we can use the kernel.area() function to calculate the area of 50% and 95% KUD.

kernel.area(tiger_KUD, percent=c(50, 95), unin="m", unout="km2")

##     Danielle      Emma      Tony
## 50  8.198461  7.624968  25.68275
## 95 49.538158 41.144330 101.36400

We can also plot these KUDs on our map

# define core and extent home ranges, transform them back to geographic projection and reformat for plotting
core_KUD <-
  getverticeshr(tiger_KUD, percent = 50) %>%
  spTransform(CRS("+init=epsg:4326"))
  
extent_KUD <-
  getverticeshr(tiger_KUD, percent = 95) %>%
  spTransform(CRS("+init=epsg:4326"))

# Now lets add it to our plot
tiger_plot +
  geom_point(data = data.frame(tiger_coa), 
             mapping = aes(x = Longitude.coa, y = Latitude.coa), 
             color = 2, size = 0.2) +
  geom_polygon(data= extent_KUD, 
               aes(x = long, y = lat, group = group), 
               fill = "white", color = "white", alpha = 0.1) +
  geom_polygon(data = core_KUD, 
               aes(x = long, y = lat, group=group), 
               fill = "red", alpha = 0.4, color = "red")

BONUS SECTION: For the eager folk who want to progress onto more advanced utilisation distribution models, the following section will introduce the Brownian bridge models. We will not be going through this at the workshop (for lack of time), but you can work through this on your own time.

Incorporating movement paths using Brownian bridges (BBKUD)

Finally, for the last excersise for this workshop, we will use an extended version of the KUD, taking movement paths of tagged individuals into account when estimating its utilisation distribution. This is done by incorporating the time dependence between relocations using a ‘Brownian bridge’ process.

In short, the process models a probability distribution on a set of nodes (in our case COAs) and the trajectories between the nodes. The probabilities assigned on the nodes are relatively fixed (i.e. detection probability of a reciever), with the probabilities varying along the length of trajectories based on brownian motion.

We will use the kernelbb() function to produce brownian bridge KUDs (BBKUDs) for our tiger sharks. First we need to produce trajectories for our tiger shark data using the as.ltraj() function.

# Produce movement trajectories for tigershark data
traj <- as.ltraj(xy = coordinates(tiger_GDA),
                 date = lubridate::ymd_hms(tiger_GDA$DateTime, tz="Australia/Brisbane"),
                 id = tiger_GDA$id,
                 proj4string = CRS("+init=epsg:28355"))

traj

## 
## *********** List of class ltraj ***********
## 
## Type of the traject: Type II (time recorded)
## * Time zone: Australia/Brisbane *
## Irregular traject. Variable time lag between two locs
## 
## Characteristics of the bursts:
##         id    burst nb.reloc NAs          date.begin            date.end
## 1 Danielle Danielle     3118   0 2013-08-17 00:00:00 2015-11-27 17:00:00
## 2     Emma     Emma     4011   0 2011-08-09 16:00:00 2013-11-20 05:00:00
## 3     Tony     Tony      670   0 2012-03-18 13:00:00 2012-10-23 15:00:00
## 
## 
##  infolocs provided. The following variables are available:
## [1] "pkey"

# Have a look at the trajectories
plot(traj)

The next step is parameter selection. The calculation of BBKUDs require two smoothing parameters:

sig1 : This parameter is related to the uncertainty of relocation along each step trajectory. This parameter is somewhat related to the speed of the tagged animal but not equal to the speed of the animal.
sig2 : This parameter is related to the uncertainty of relocation at the nodes (in our case COAs). This parameter is similar to the h parameter we used earlier and somewhat related to the listening range of recievers in your array.

For this exercise we will use a conservative estimate of receiver listening range of ~50 m. We encourage you to conduct range testing on your acoustic arrays to improve these models!

We can use the liker() function to define sig1 based on a maximum liklihood estimation.

sig2 <- 50 # our conservative estimate of reciever listening range on Heron and One Tree Island
sig1.df <- liker(traj, 
                 rangesig1 = c(0, 100), 
                 sig2 = sig2,
                 plotit = FALSE) # this function can plot the results of the maximum likelihood estimation

sig1 <- (unname(sapply(sig1.df, '[[', 1)))/7 # extract the sig1 values for each animal

Note: The sig1 parameters are divided by 7. This is a smoothing parameter divisor used to adjust the shape of the final BBKUD. Higher values for divisors will produce narrower BBKUD estimates. This divisor needs to be chosen carefully, by first testing multiple values and choosing the one that produces KUD estimates that look reasonable.

Now that we have estimates for smoothing parameters, we can calculate the BBKUDs for our tiger sharks

# Calculate brownian bridge KUD.. This may take some time with large datasets!!
# *we have reduced the grid resolution here so it runs quicker for the workshop
tiger_BBKUD <- kernelbb(traj, sig1 = sig1, sig2 = sig2, grid = 100, extent = 1)

From here, the process of estimating area and plotting BBKUD is the same as the fixed KUD (see above)

kernel.area(tiger_BBKUD, percent=c(50, 95),
            unin="m", unout="km2")

##     Danielle       Emma      Tony
## 50  9.068449   7.934893  16.62549
## 95 74.436853 219.154186 102.02005

Let’s add BBKUDs to our map.

# define core and extent home ranges, transform them back to geographic projection and reformat for plotting
core_BBKUD.GDA <- getverticeshr(tiger_BBKUD, percent = 50)
proj4string(core_BBKUD.GDA) <- CRS("+init=epsg:28355")
core_BBKUD <- spTransform(core_BBKUD.GDA, CRS("+init=epsg:4326"))
  
extent_BBKUD.GDA <- getverticeshr(tiger_BBKUD, percent = 95)
proj4string(extent_BBKUD.GDA) <- CRS("+init=epsg:28355")
extent_BBKUD <- spTransform(extent_BBKUD.GDA, CRS("+init=epsg:4326"))

# Now lets add it to our plot
tiger_plot +
  geom_point(data = data.frame(tiger_coa), 
             mapping = aes(x = Longitude.coa, y = Latitude.coa), 
             color=2, size=0.2) +
  geom_polygon(data= extent_BBKUD, aes(x = long, y = lat, group=group), 
               fill="white", color="white", alpha=0.1) +
  geom_polygon(data= core_BBKUD, 
               aes(x = long, y = lat, group=group), 
               fill = "red", alpha = 0.4, color = "red")

Remember, you can use the writeOGR() function here to output the MCPs, KUDs and BBKUD spatial objects as shapefiles or kml files to use in other GIS software (see ESRI Shapefile import and export section of this workshop)

Signoff!

This is where we end our R workshop! There may have been a few bits of code that you had trouble with or need more time to work through. We encourage you to discuss these with us as well as others at the workshop to help get a handle on the R code.

If you have any comments or queries reguarding this workshop feel free to contact us:

Dr. Ross Dwyer: ross.dwyer@uq.edu.au
Dr. Vinay Udyawer: v.udyawer@aims.gov.au

Happy Tracking!

Animals in Motion 2018 Workshop

North Stradbroke Island

Dr Ross Dwyer & Dr Vinay Udyawer

Feb 20, 2018