That’s weird! Anomaly detection using R

Author
Affiliation

Monash University, Australia

Preface

This book is under development. Please don’t ask me when it will be finished — I have to fit this around my other work. In the meantime, I hope the incomplete book is useful.

This book is about tools and techniques for finding and understanding anomalies. We will begin with some simple data sets containing only one variable, and build up slowly to much more complicated data. We will cover popular but inadvisable methods to identify anomalies (pointing out their shortcomings), as well as more reliable and recommended approaches.

The book is written for two audiences: (1) people finding themselves doing data analysis without having had formal training in anomaly detection; and (2) students studying data science.

I will assume a knowledge of statistics and probability at about second year undergraduate level. So if you’ve done a couple of statistics subjects, and are familiar with multiple regression, hypothesis tests and probability distributions, then you should be prepared for what follows. I will also assume some basic knowledge of matrix algebra; some of the matrix results we use are provided in Appendix A — Matrix algebra.

R

I will also assume that you already know how to use R and are familiar with the tidyverse packages such as dplyr, tidyr and ggplot2.

All R examples in the book assume that you have loaded the weird package first:

# remotes::install_github("robjhyndman/weird-package")
library(weird)

This will load the relevant data sets, and various other packages needed to reproduce the examples.

 

Rob J Hyndman
December 2023