EAA2020: Abstract

Abstract is part of session #464:

Title & Content

Title:
The seven deadly sins of modelling
Content:
“Statisticians, like artists, have the bad habit of falling in love with their models”
George Box

Many Bad Models are Better than One: With decision trees, combining many small, specialised trees into an ensemble can improve upon a single decision tree, even though each small tree is less accurate on its own. A single tree rarely matches an ensemble approach, which is currently one of the leading methods for classification. However, these multi-tree approaches do not allow the kind of interpretation that a single tree does.
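The intuition behind this can be sketched with a simplified calculation. This is an illustration only, not the method presented in the talk: it assumes every model is right with the same probability and that the models' errors are independent, which real trees trained on overlapping data do not fully satisfy. The function name `majority_vote_accuracy` and the value 0.6 are hypothetical choices for illustration.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent models is correct.

    Assumption (illustrative): each model is correct with probability
    p_correct, independently of the others.
    """
    if n_models % 2 == 0:
        raise ValueError("use an odd number of models to avoid tied votes")
    k_min = n_models // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )


if __name__ == "__main__":
    # Compare one weak model against increasingly large ensembles of them.
    for n in (1, 11, 101):
        print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the vote improves with ensemble size only when each model is better than chance; with models that are worse than chance, the majority vote gets worse as models are added.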


Measuring things can affect the outcome: Establishing metrics is dangerous when they are incentivised, because the system tends to optimise for those key metrics rather than for the outcome they were meant to capture.

Metrics and Rat Tails: A fable often told to data scientists is the tale of the rat tail bounty. The King decides to outsource pest control. A reward is offered for every rat tail, as evidence that the rat has been killed. Initially, this works amazingly well; thousands of rat tails are brought in. However, one day an officer walking the streets notices something quite odd: tailless rats. People are merely removing the tails and not killing the rats. Reports follow of people breeding rats and cutting off their tails to claim the reward, and the programme is immediately discontinued. The rat farmers, now seeing no profit, dump all their rats on the street to get rid of them. Now there are more rats than ever.

We will also cover: Assessing Model Slippage (when your models stop working, and when to stop and hand over), Model Stacking, Abstractions, Stories and Human Modelling.
Keywords:
Modelling, Measurement, Metrics

authors

Main authors:
Henry C. W. Price (1)
Co-author:
Chiara Girotto (2)
Affiliations:
(1) Imperial College London
(2) Independent