### Introduction

Winter officially arrived last week when the temperature dropped to 5°F. To help fend off the cold, I started looking for a new bottle of single malt to add to my collection and provide some much-needed warmth. Buying whisky is a tricky business: taste ratings, price, quality, and availability all vary tremendously. Instead of casually browsing reviews, this year I decided to take a more systematic approach and analyze some whisky data to make an informed buying decision.

The two best single malt data sets I know of are the Malt Maniacs Matrix and the LA Whiskey Society ratings. For my analysis, I decided to use the LA Whiskey Society ratings because the society frequently holds blind tasting sessions with a variety of tasters, so its data should carry the least bias. After cleaning up the data, I analyzed over 6000 single malt ratings from more than 100 distilleries.

To get a sense of the data, I first generated a boxplot [1] with ggplot2. The LA Whiskey Society rates whisky on a grading scale from A+ through F, which I transformed to a numerical scale to facilitate plotting. Below is the boxplot depicting the score distribution for each distillery, ordered by median score.
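The grade conversion and median ordering can be sketched in a few lines. This is a Python illustration rather than the R/ggplot2 code behind the plot, and the 13-step mapping (F = 0 up to A+ = 12), the function name `medians_by_distillery`, and the sample ratings are all assumptions made for the example; the post does not state the exact numerical scale.

```python
from collections import defaultdict
from statistics import median

# Assumed mapping: 13 letter grades onto 0..12 (F = 0, A+ = 12).
# The scale actually used for the plot is not stated in the post.
GRADES = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]
GRADE_TO_SCORE = {grade: score for score, grade in enumerate(GRADES)}


def medians_by_distillery(ratings):
    """Return (distillery, median score) pairs, highest median first."""
    scores = defaultdict(list)
    for distillery, grade in ratings:
        scores[distillery].append(GRADE_TO_SCORE[grade])
    return sorted(
        ((name, median(values)) for name, values in scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Made-up ratings, for illustration only.
sample = [
    ("Distillery X", "A"), ("Distillery X", "B+"), ("Distillery X", "A-"),
    ("Distillery Y", "B"), ("Distillery Y", "C+"),
]

for name, med in medians_by_distillery(sample):
    print(name, med)
```

The resulting order is the one a boxplot sorted by median would use on its distillery axis. A distillery with only two or three ratings still gets a median, which is why small samples can float to the top.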

I was initially surprised to see the Yamazaki and Ardmore distilleries at the top of the list. On closer inspection, the sample size for these distilleries is quite small and the reviews are limited to old vintages of high quality, which explains the high scores and small interquartile ranges. Other distilleries at the top of the plot, such as the legendary St. Magdalene and Port Ellen, were no surprise, since they are well known to have produced some of the best single malts on the planet from the early to mid-20th century.

For my buying decision, I decided to limit my purchase to bottles from distilleries with a median score above 9. This reduced my list of possible malts to 38 distilleries and, in theory, reduced the chance of purchasing a bad whisky. I next plotted malt score and age against price to see the relationships between these variables.

Interestingly, the relationships between age ~ price and score ~ price are clearly different. [2] A linear model relating price and age explained about 60% of the variance, while a third-order polynomial explained roughly the same share of the variance in the score ~ price data. In other words, price and age are approximately linearly related, while price and score follow a roughly cubic relationship.
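To make the comparison concrete, here is a minimal sketch of fitting a polynomial by least squares and computing R², the share of variance the fit explains. It assumes R² is the measure behind the "about 60%" figure, which the post does not say outright. The helper names `polyfit` and `r_squared` and the sample numbers are invented for illustration; this is not the original analysis, which was presumably done in R.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y for the Vandermonde matrix X.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def r_squared(xs, ys, coeffs):
    """Fraction of the variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up (age in years, price in dollars) pairs, for illustration only.
ages = [10, 12, 15, 18, 21, 25, 30]
prices = [45, 54, 70, 85, 110, 160, 240]

for degree in (1, 3):
    fit = polyfit(ages, prices, degree)
    print(degree, round(r_squared(ages, prices, fit), 3))
```

Solving the normal equations by hand keeps the sketch dependency-free. With NumPy installed, `numpy.polyfit` would do the same job and is better conditioned for large inputs.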

I then refined the malt list by examining only malts that cost less than $100 and scored above 9 (pink square above). This list had a large number of bottles from independent bottlers like Douglas Laing, so I filtered it further to include only official bottlings released by the distilleries themselves. Malts from independent bottlers are difficult to find in the US, and I prefer drinking whisky released as it was intended, directly from the distillery. After filtering with these criteria, I found 16 malts that fit my constraints:

| Distillery | Vintage | Age | Price |
| --- | --- | --- | --- |
| BenRiach | 1994 | 12 yo | $54 |
| Bowmore | 1989-1991 | 16 yo | $80 |
| Glendronach | N/A | 15 yo | $75 |
| Glenfarclas | N/A | 15 yo | $65 |
| Glengoyne \* | N/A | 17 yo | $70 |
| Highland Park \* | N/A | 18 yo | $80 |
| Highland Park | N/A | 21 yo | $99 |
| Lagavulin \* | 1996 | 16 yo | $80 |
| Lagavulin | 1995 | 12 yo | $90 |
| Laphroaig \* | 1995 | 10 yo | $45 |
| Longrow | 1995 | 10 yo | $80 |
| Longrow | N/A | 14 yo | $73 |
| Oban | 1992 | 14 yo | $76 |
| Old Pulteney | 1990 | 16 yo | $95 |
| Talisker | N/A | 18 yo | $80 |
| Talisker | 1992-1993 | 12-13 yo DE | $70 |

I was happy to see that four of these bottles were already in my collection (indicated by \*) and that these malts are among my all-time favorites. Of the remaining bottles, I decided to purchase the Talisker 18 because it won Best Single Malt Whisky in the World in 2007. Here are my tasting notes after the first sample:

1. Nose: Fragrant but clean; apples with subtle wafts of peat and burning leaves.
2. Palate: Thick and oily with initial notes of apples and orange rind changing to sweet toffee and crème brûlée, ending salty with traces of iodine, cough medicine, old books, and leather.
3. Finish: Rather long, peppery.

Rating: 8.5/10, a phenomenal dram!

Interested in whisky? See my What’s Inside Scotch Whisky post.

[1] If you're interested in the history of the boxplot, definitely check out Hadley Wickham's recent 40 Years of Boxplots article.

[2] I've deliberately fixed the upper bound of the y-axis at a maximum price of $500 for aesthetic reasons. Note that the data set includes some extreme values like a 50 yo 1939 Mortlach and a $6200 1966 Macallan.