How Unhealthy is your Starbucks Drink?

Introduction

In this notebook, I am analyzing Starbucks nutrition data I scraped from https://www.starbucks.com/menu on September 2, 2020. This data only includes items from the Starbucks Drinks menu.

About the Data

Starbucks provides a nutrition analysis of its menu items to help you balance your Starbucks order with other foods you eat. Their goal is to provide you with the information you need to make sensible decisions about balance, variety, and moderation in your diet.

The data file sbux_nutrition.csv contains the drink nutrition data for this analysis. It contains the following variables:

drink_name: Name of the drink
type: Type of drink, categories defined by Starbucks
size: Size of the drink
calories: Number of calories
fat: Total fat (g)
cholesterol: Cholesterol (mg)
sodium: Sodium (mg)
carb: Total carbohydrates (g)
sugar: Sugars (g)
protein: Protein (g)
caffeine: Caffeine (mg)

All drinks from the Starbucks online main menu (collected in Fall 2020) are included, with the exception of Clover® Brewed Coffees, Coffee Travelers, Iced Clover® Brewed Coffees, Bottled Teas, Milk, Sparkling Water, and Water. There are 11 columns and 525 rows.

For the purpose of this comparison, I am filtering the dataset to include only grande-size drinks, so each row is a unique drink with a unique drink name. I am also omitting drinks for which grande-size nutrition data was not provided on the Starbucks website menu.

Load and Explore the Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
import seaborn as sns

data = pd.read_csv("data/sbux_nutrition.csv")
data.head()

# How many NAs?
data.isnull().sum()

drink_name     0
type           0
size           0
calories       8
fat            8
cholesterol    8
sodium         8
carb           8
sugar          8
protein        8
caffeine       8
dtype: int64

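# Clean up: drop rows with missing nutrition values, then keep grande drinks only.
# .copy() avoids pandas' SettingWithCopyWarning on the assignments below.
data = data.dropna()
data = data[data['size'] == 'Grande'].copy()

# Shorten the Frappuccino category and drink names (literal matches, not regex)
data['type'] = data['type'].replace({'Frappuccino® Blended Beverages': 'Frappuccinos'})
data['drink_name'] = data['drink_name'].str.replace('Frappuccino® Blended Beverage', 'Frappuccino', regex=False)
data['drink_name'] = data['drink_name'].str.replace('Frappuccino®', 'Frappuccino', regex=False)
data.head()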

# New number of rows/columns
data.shape

(139, 11)

After data cleaning, we will be analyzing a total of 139 rows (unique drinks).
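
The claim that each remaining row is a distinct drink can be checked directly. Below is a minimal sketch on a small made-up frame (the `toy` rows are illustrative, not the scraped data); the same `is_unique` check could be run on the real `data['drink_name']`.

```python
import pandas as pd

# Hypothetical rows: two drinks, each listed in two sizes.
toy = pd.DataFrame({
    "drink_name": ["Caffè Americano", "Caffè Americano", "Blonde Roast", "Blonde Roast"],
    "size": ["Tall", "Grande", "Short", "Grande"],
})

# Keep one size only, then confirm no drink name repeats.
grande = toy[toy["size"] == "Grande"]
names_unique = grande["drink_name"].is_unique
print(len(grande), "rows; names unique:", names_unique)
```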

Summary table of each column (nutrition value). Some things to note from the table below:

Calories for a grande drink on the menu can reach 470, without add-ons.
Caffeine for a grande drink can reach 360 mg. It is generally recommended not to exceed 400 mg a day.

data.describe()

What are the average nutrition values for each drink category (type)?

data.groupby('type').mean(numeric_only=True)

What percentage of Starbucks drinks contain caffeine?

caf_perc = round(len(data[data.caffeine > 0]) / len(data) * 100, 2)
print('Percentage of menu drinks that contain caffeine: ', caf_perc, '%', sep='')

Percentage of menu drinks that contain caffeine: 85.61%
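
An equivalent way to get this share is to average a boolean mask, since True counts as 1 and False as 0. A minimal sketch, assuming a small made-up frame (the `toy` values are illustrative, not from the scraped data):

```python
import pandas as pd

# Hypothetical mini-dataset: caffeine in mg for five made-up drinks.
toy = pd.DataFrame({
    "drink_name": ["A", "B", "C", "D", "E"],
    "caffeine": [0, 75, 150, 0, 225],
})

# The mean of a boolean Series is the fraction of True values.
caf_share = (toy["caffeine"] > 0).mean() * 100
print(f"{caf_share:.2f}% of toy drinks contain caffeine")
```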

How many varieties of each category does Starbucks offer?

data['type'].value_counts().plot(kind='bar', figsize=(8, 6), rot=0, color='green')
plt.title('Variety in Starbucks Drink Categories', fontsize=16)
plt.xlabel("Drink Category")
plt.ylabel("Number of Drinks")
plt.show()

plt.figure(figsize=(10,8), dpi= 80)
corr = data.corr(numeric_only=True)  # numeric columns only; drink_name, type, and size are text
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, cmap='Greens', center=0, annot=True)

# Decorations
plt.title('Correlogram of Starbucks Nutrition Types', fontsize=16)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.show()

This correlogram shows the correlation between each pair of nutrition types in Starbucks drinks. Darker green encodes a higher (more positive) correlation. Every nutrition type is positively correlated with every other, with the exception of caffeine, which has a slight negative or near-zero correlation with the rest. This may suggest that different levels of caffeine can be present in both sugary and non-sugary drinks. The strongest positive correlation is between fat and cholesterol, which makes sense because cholesterol is one of many types of lipids. Similarly, carbohydrates and sugar have the second strongest correlation, as sugar is a carbohydrate.

Other very strong positive relationships include: calories~carbohydrates, calories~fat, calories~cholesterol, and calories~sugar. While these correlations don't point out any unique observations in Starbucks nutrition, it's good to verify that the correlations make sense overall.
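
Reading the strongest pairs off a heatmap by eye gets tedious, so one option is to flatten the correlation matrix into a ranked list. The sketch below assumes a made-up four-drink frame (`toy` is hypothetical, not the scraped data) and keeps only the upper triangle so each pair appears once.

```python
import numpy as np
import pandas as pd

# Hypothetical values for four made-up drinks (not the scraped data).
toy = pd.DataFrame({
    "calories": [5, 120, 250, 400],
    "sugar":    [0, 18, 35, 60],
    "fat":      [0, 3, 9, 16],
    "caffeine": [225, 150, 75, 95],
})

corr = toy.corr()

# Mask everything except the upper triangle (k=1 also skips the diagonal),
# then flatten to one row per pair and sort from strongest positive down.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values(ascending=False)
print(pairs)
```

On the real data, `toy.corr()` would be swapped for the numeric-only correlation of `data`.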

Analyzing Drink Nutrition

Focus: Calories, Sugar, & Caffeine

Why are we looking mostly at calories and sugar?

According to NPD’s Health Aspirations and Behavioral Tracking Service, the top two items consumers look for on nutrition labels are sugars and calories. I also want to discuss caffeine as it is in over 85% of items on the drink menu, and it is a hot topic in today's nutrition and health discussions.

Calories

Which drinks have the most calories?
What is the distribution of calories among Starbucks drinks?

data.nlargest(5, 'calories')[['drink_name', 'calories']]

fig = plt.figure(figsize = (6, 4))
plt.hist(data.calories, bins = 8, rwidth= 0.85, color='green')
plt.title('Distribution of Starbucks Calories')
plt.xlabel("Calories")
plt.ylabel("Number of Drinks")
plt.show()

Sugar

Which drinks have the most sugar?
What is the distribution of sugar among Starbucks drinks?

data.nlargest(5, 'sugar')[['drink_name', 'sugar']]

fig = plt.figure(figsize = (6, 4))
plt.hist(data.sugar, bins = 8, rwidth= 0.85, color='green')
plt.title('Distribution of Starbucks Sugar')
plt.xlabel("Sugar (g)")
plt.ylabel("Number of Drinks")
plt.show()

Caffeine

Which drinks have the most caffeine?
What is the distribution of caffeine among Starbucks drinks?

data.nlargest(5, 'caffeine')[['drink_name', 'caffeine']]

fig = plt.figure(figsize = (6, 4))
plt.hist(data.caffeine, bins = 8, rwidth= 0.85, color='green')
plt.title('Distribution of Starbucks Caffeine')
plt.xlabel("Caffeine (mg)")
plt.ylabel("Number of Drinks")
plt.show()

The distributions show that Starbucks drinks are slightly right-skewed for calories, sugar, and caffeine: most drinks cluster at the low end, with a tail toward higher values. This makes sense, as Starbucks offers a number of zero-calorie and zero-sugar drinks, mostly in the tea category. It also offers a good number of non-coffee and decaf drinks.
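
A rough way to check the direction of skew without a plot is to compare the mean and the median: a mean above the median points to a longer tail on the high side, with most drinks clustered at the low end. The sketch below uses the mean and 50% values from the data.describe() output in this notebook; the rule of thumb itself is only a heuristic, not a formal skewness test.

```python
# Mean and median (50%) per column, copied from the describe() summary
# in this notebook.
summary = {
    "calories": {"mean": 200.575540, "median": 190.0},
    "sugar":    {"mean": 27.129496,  "median": 26.0},
    "caffeine": {"mean": 108.323741, "median": 95.0},
}

# Heuristic: mean above median suggests the longer tail is on the high side.
for name, s in summary.items():
    gap = s["mean"] - s["median"]
    tail = "high" if gap > 0 else "low"
    print(f"{name}: mean - median = {gap:.2f} -> longer tail on the {tail} side")
```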

data.nsmallest(10, 'calories')[['drink_name', 'calories']]

fig = plt.figure(figsize = (8, 6))
plt.scatter(data.calories, data.sugar, c=data.caffeine, cmap='Greens')
cbar = plt.colorbar()
cbar.set_label('Caffeine (mg)', rotation=270)

# draw line
plt.plot(np.unique(data.calories), np.poly1d(np.polyfit(data.calories, data.sugar, 2))
         (np.unique(data.calories)), color = 'black')

plt.title('Calories vs Sugar in Starbucks Drinks', fontsize=16)
plt.xlabel("Calories")
plt.ylabel("Sugar (g)")
plt.show()

The scatterplot and fitted curve (a degree-2 polynomial) above show the relationship between calories and sugar across all Starbucks drinks. Caffeine is denoted by color, where a darker color means more caffeine. As expected, there is a clear and consistent positive relationship between calories and sugar. Caffeine appears to have no clear relationship with the other two variables, except that drinks with little to no calories and sugar contain a high amount of caffeine. This makes sense: coffee roasts, which have no milk or sugar added, are very high in caffeine relative to other drinks.
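
The trend line in the plot comes from np.polyfit with degree 2. To see how much the extra curvature buys over a straight line, the two fits can be compared by their squared error. This is a sketch on made-up (calories, sugar) pairs; the arrays below are assumptions for illustration, not values from the dataset.

```python
import numpy as np

# Hypothetical (calories, sugar) pairs for illustration only.
calories = np.array([0, 50, 120, 200, 280, 350, 470], dtype=float)
sugar = np.array([0, 9, 17, 28, 40, 48, 60], dtype=float)

# Degree 1 is a straight line; degree 2 (used in the plot above) adds curvature.
linear = np.poly1d(np.polyfit(calories, sugar, 1))
quadratic = np.poly1d(np.polyfit(calories, sugar, 2))

# Compare the fits by their sum of squared residuals.
sse_linear = float(np.sum((sugar - linear(calories)) ** 2))
sse_quadratic = float(np.sum((sugar - quadratic(calories)) ** 2))
print(f"linear SSE: {sse_linear:.2f}, quadratic SSE: {sse_quadratic:.2f}")
```

The quadratic fit can never have a larger squared error than the linear one on the same points, so a small gap between the two suggests a straight line already describes the relationship well.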

Let's look into drink categories to see if we can uncover deeper patterns in caffeine.

Analyzing Drink Categories

sns.set(palette="muted")
sns.catplot(x="type", y="caffeine", hue="type",
            kind="swarm", data=data, aspect=1.5);
plt.title("Caffeine by Drink Type", fontsize=16)
plt.show()

sns.catplot(x="type", y="calories",
            kind="swarm", data=data, aspect=1.5);
plt.title("Calories by Drink Type", fontsize=16)
plt.show()

sns.catplot(x="type", y="sugar",
            kind="swarm", data=data, aspect=1.5);
plt.title("Sugar by Drink Type", fontsize=16)
plt.show()

Caffeine is highest in Hot Coffees and Cold Coffees, with most points falling at or above 150 mg.

The Frappuccinos category is easily identified as the one with the most calories and sugar. The majority of points for this category fall above 350 calories and 40 g of sugar.

sns.lmplot(x="calories", y="sugar", hue="type",
           data=data, aspect=1.5);
plt.title("Calories & Sugar in Starbucks Drink Types", fontsize=16)
plt.show()

We can use the per-category linear regressions above to compare calories to sugar in Starbucks drinks. For every drink category, sugar increases as the number of calories increases. With color showing the drink category, we can see that data points in the Frappuccinos category tend to sit higher in both calories and sugar relative to the other categories, while data points in the Hot Teas category tend to fall in the lower ranges. Finally, the Hot Coffees and Cold Coffees categories are spread very widely in this plot. This suggests that a drink's nutrition in these categories depends less on the category itself than on the other variables that make up the drink.
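
The per-category lines drawn by sns.lmplot can also be summarized as numbers by fitting a straight line within each group. A minimal sketch, assuming a made-up frame with two hypothetical categories ("Tea" and "Frap" are placeholders, not the real category labels):

```python
import numpy as np
import pandas as pd

# Hypothetical drinks in two made-up categories (not the scraped data).
toy = pd.DataFrame({
    "type":     ["Tea", "Tea", "Tea", "Frap", "Frap", "Frap"],
    "calories": [0, 60, 120, 300, 380, 460],
    "sugar":    [0, 14, 29, 42, 51, 62],
})

# Fit sugar = slope * calories + intercept separately within each category
# and keep the slope (the first coefficient returned by np.polyfit).
slopes = {
    name: np.polyfit(group["calories"].to_numpy(), group["sugar"].to_numpy(), 1)[0]
    for name, group in toy.groupby("type")
}
for name, slope in slopes.items():
    print(f"{name}: {slope:.3f} g of sugar per calorie")
```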

Daily Values

Raw gram and milligram values are hard to compare across nutrition types because each type is measured on a different scale. To standardize this, I want to know how the nutrition values for each type compare to the suggested daily intake.

Daily Values as specified by the FDA, based on a 2,000-calorie intake for adults and children 4 or more years of age:

Calories: 2000
Fat: 78g
Cholesterol: 300mg
Sodium: 2300mg
Carbohydrates: 275g
Sugar: ?? → "There is no Daily Value for total sugars because no recommendation has been made for the total amount to eat in a day." - (FDA)
Protein: 50g

Source: https://www.fda.gov/media/135301/download (updated March 2020)

For healthy adults, it is generally recommended not to exceed 400 mg a day, an amount not associated with negative effects - (FDA)

Caffeine: 400mg

After converting each value to a percentage of its daily value, I wanted to see how the nutrition types compare to each other. What is the average daily intake percentage of each nutrition type for Starbucks drink items?

dv = data.copy(deep=True)
dv.calories = dv.calories / 2000 * 100
dv.fat = dv.fat / 78 * 100
dv.cholesterol = dv.cholesterol / 300 * 100
dv.sodium = dv.sodium / 2300 * 100
dv.carb = dv.carb / 275 * 100
dv = dv.drop(columns=['sugar']) # no daily level for total sugars
dv.protein = dv.protein / 50 * 100
dv.caffeine = dv.caffeine / 400 * 100

dv.head()
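
The same conversion can also be written with a lookup table of daily values, which keeps the reference amounts in one place. This is a sketch rather than the notebook's own method: `DAILY_VALUES` and the two-row `toy` frame are names introduced here for illustration, with the first row built from the column maxima in the summary table. Sugar is left out because it has no daily value.

```python
import pandas as pd

# Reference amounts listed above (FDA daily values plus the 400 mg caffeine guideline).
DAILY_VALUES = {
    "calories": 2000, "fat": 78, "cholesterol": 300, "sodium": 2300,
    "carb": 275, "protein": 50, "caffeine": 400,
}

# Hypothetical drinks for illustration; the first row uses the column maxima
# from the summary table, the second is a made-up plain tea.
toy = pd.DataFrame({
    "drink_name": ["Max of every column", "Plain tea"],
    "calories": [470, 0], "fat": [24, 0], "cholesterol": [65, 0],
    "sodium": [380, 0], "carb": [77, 0], "protein": [15, 0], "caffeine": [360, 40],
})

# Replace each nutrient column with its percentage of the daily value.
dv_toy = toy.assign(**{col: toy[col] / ref * 100 for col, ref in DAILY_VALUES.items()})
print(dv_toy.round(2))
```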

# Average daily intake percentage (numeric columns only)
dv.mean(numeric_only=True)

calories       10.028777
fat             8.799115
cholesterol     6.570743
sodium          5.089146
carb           10.922171
protein         9.395683
caffeine       27.080935
dtype: float64

# Max daily intake percentage
dv.max()[3:]

calories          23.5
fat            30.7692
cholesterol    21.6667
sodium         16.5217
carb                28
protein             30
caffeine            90
dtype: object

No drinks seem to go above 1/3 of recommended daily value for any nutrition type, with the exception of caffeine. Let's be happy that no grande drinks have a nutrition value intake equivalent to one of our 3 daily meals!
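
The one-third comparison can be made explicit with the maxima printed above. A short sketch, where the `max_pct` dictionary simply copies those values:

```python
# Maximum % of daily value per nutrition type, copied from the output above.
max_pct = {
    "calories": 23.5, "fat": 30.7692, "cholesterol": 21.6667,
    "sodium": 16.5217, "carb": 28.0, "protein": 30.0, "caffeine": 90.0,
}

third = 100 / 3  # one of three daily meals, as a share of the daily value
over_a_third = [name for name, pct in max_pct.items() if pct > third]
print(over_a_third)  # ['caffeine']
```

Fat (about 30.8%) and protein (30%) come closest to the one-third mark without crossing it.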

# Highest daily intake % of caffeine
dv.nlargest(10, 'caffeine')[['drink_name', 'caffeine']]

Blonde Roast takes the #1 spot, taking up 90% of the recommended caffeine daily intake! You best not be planning on another coffee later in the day. Unsurprisingly, we find many forms of Nitro Cold Brew in the top 10 most caffeinated drinks.

# Highest daily intake % of calories
dv.nlargest(10, 'calories')[['drink_name', 'calories']]

# Highest daily intake % of carbs (includes sugar)
dv.nlargest(10, 'carb')[['drink_name', 'carb']]

# Average % of daily value per category (numeric columns only)
heatmap = dv.groupby('type').mean(numeric_only=True)

plt.figure(figsize=(8,6))
plt.title('Average Nutrition Daily Intake % of Starbucks Items by Category', fontsize=16)
sns.heatmap(heatmap, cmap="Greens", annot=True)
plt.show()

High caffeine intake is apparent in this heatmap for the Cold Coffees and Hot Coffees categories. This further supports the observation that caffeine is highest in the Nitro Cold Brew variants (Cold Coffees) and the "roast" variants (Hot Coffees).

On a positive note, it is good to see little darkness in the other nutrition types. With those, we need to worry about how they add up once we factor in our daily meals as well. A lighter value is good because we can save the bulk of our calorie intake for the meals that fuel us. On the other hand, we don't have to worry about caffeine intake for the rest of the day... unless you are someone who has multiple cups of coffee a day.

And the most unhealthy drinks are...

With caffeine greater than 60% DV, and with calories and carbohydrates greater than 20% DV, here are your worst drinks!

In terms of caffeine...

dv[(dv['caffeine'] > 60 )].sort_values(by=['caffeine'], ascending=False)

In terms of calories and carbs...

dv[(dv['calories'] >= 20) & (dv['carb'] > 20)].sort_values(by=['calories', 'carb'], ascending=False)

Very unfortunate that one of my absolute favorites is the Salted Caramel Mocha.

Output of data.describe():

	calories	fat	cholesterol	sodium	carb	sugar	protein	caffeine
count	139.000000	139.000000	139.000000	139.000000	139.000000	139.000000	139.000000	139.000000
mean	200.575540	6.863309	19.712230	117.050360	30.035971	27.129496	4.697842	108.323741
std	141.833336	6.804412	21.717636	104.293366	19.873298	18.661296	4.692971	86.717784
min	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
25%	90.000000	0.000000	0.000000	15.000000	14.000000	11.500000	1.000000	32.500000
50%	190.000000	4.500000	15.000000	110.000000	30.000000	26.000000	3.000000	95.000000
75%	335.000000	14.000000	45.000000	190.000000	44.000000	41.500000	8.000000	170.000000
max	470.000000	24.000000	65.000000	380.000000	77.000000	71.000000	15.000000	360.000000

Output of data.groupby('type').mean():

	calories	fat	cholesterol	sodium	carb	sugar	protein	caffeine
type
Cold Coffees	159.500000	6.362500	18.000000	96.125000	21.025000	18.650000	4.275000	194.000000
Cold Drinks	129.411765	1.382353	0.000000	49.117647	28.352941	25.117647	0.588235	36.470588
Frappuccinos	385.714286	15.642857	45.714286	263.809524	56.428571	51.047619	5.238095	60.238095
Hot Coffees	220.200000	8.240000	22.600000	137.400000	28.440000	24.800000	8.240000	174.600000
Hot Drinks	350.000000	12.000000	40.555556	180.000000	50.222222	47.222222	10.777778	5.555556
Hot Teas	72.307692	1.500000	5.769231	38.076923	12.384615	11.846154	2.615385	37.461538
Iced Teas	114.642857	1.000000	3.928571	35.714286	24.500000	23.357143	1.785714	36.428571

Output of data.nlargest(5, 'calories'):

	drink_name	calories
98	Salted Caramel Mocha	470.0
213	Mocha Cookie Crumble Frappuccino	470.0
216	Caramel Ribbon Crunch Frappuccino	470.0
166	Salted Caramel Hot Chocolate	460.0
246	Chocolate Cookie Crumble Crème Frappuccino	450.0

Output of data.nlargest(5, 'sugar') (sugar in g):

	drink_name	sugar
181	Caramel Apple Spice	71.0
207	Pumpkin Spice Coffee Frappuccino	66.0
210	Salted Caramel Mocha Coffee Frappuccino	63.0
216	Caramel Ribbon Crunch Frappuccino	62.0
237	White Chocolate Mocha Frappuccino	62.0

Output of data.nlargest(5, 'caffeine') (caffeine in mg):

	drink_name	caffeine
6	Blonde Roast	360.0
18	Pike Place® Roast	310.0
309	Nitro Cold Brew with Dark Cocoa Almondmilk Foam	280.0
313	Starbucks Reserve® Nitro Cold Brew	280.0
317	Nitro Cold Brew	280.0

Output of the first data.head(), before filtering to grande:

	drink_name	type	size	calories	sodium	carb	protein	caffeine
0	Caffè Americano	Hot Coffees	Short	5.0	5.0	1.0	0.0	75.0
1	Caffè Americano	Hot Coffees	Tall	10.0	10.0	1.0	1.0	150.0
2	Caffè Americano	Hot Coffees	Grande	15.0	10.0	2.0	1.0	225.0
3	Caffè Americano	Hot Coffees	Venti	15.0	15.0	3.0	1.0	300.0
4	Blonde Roast	Hot Coffees	Short	5.0	5.0	0.0	0.0	180.0

Output of data.nsmallest(10, 'calories'):

	drink_name	calories
118	Chai Tea	0.0
122	Earl Grey Tea	0.0
130	Royal English Breakfast Tea	0.0
138	Rev Up Brewed Wellness Tea	0.0
142	Emperor’s Clouds & Mist®	0.0
154	Jade Citrus Mint® Brewed Tea	0.0
158	Mint Majesty®	0.0
162	Peach Tranquility®	0.0
6	Blonde Roast	5.0
14	Featured Starbucks® Dark Roast Coffee	5.0

Output of dv.nlargest(10, 'caffeine') (% of daily value):

	drink_name	caffeine
6	Blonde Roast	90.00
18	Pike Place® Roast	77.50
309	Nitro Cold Brew with Dark Cocoa Almondmilk Foam	70.00
313	Starbucks Reserve® Nitro Cold Brew	70.00
317	Nitro Cold Brew	70.00
307	Nitro Cold Brew with Cinnamon Almondmilk Foam	68.75
311	Nitro Cold Brew with Cinnamon Oatmilk Foam	68.75
315	Salted Caramel Cream Nitro Cold Brew	67.50
305	Pumpkin Cream Nitro Cold Brew	66.25
319	Nitro Cold Brew with Sweet Cream	66.25

Output of dv.nlargest(10, 'calories') (% of daily value):

	drink_name	calories
98	Salted Caramel Mocha	23.5
213	Mocha Cookie Crumble Frappuccino	23.5
216	Caramel Ribbon Crunch Frappuccino	23.5
166	Salted Caramel Hot Chocolate	23.0
246	Chocolate Cookie Crumble Crème Frappuccino	22.5
386	Iced Salted Caramel Mocha	22.5
176	White Hot Chocolate	22.0
234	Java Chip Frappuccino	22.0
110	White Chocolate Mocha	21.5
207	Pumpkin Spice Coffee Frappuccino	21.0

Output of dv.nlargest(10, 'carb') (% of daily value):

	drink_name	carb
181	Caramel Apple Spice	28.000000
98	Salted Caramel Mocha	24.727273
207	Pumpkin Spice Coffee Frappuccino	24.363636
210	Salted Caramel Mocha Coffee Frappuccino	24.363636
166	Salted Caramel Hot Chocolate	24.000000
216	Caramel Ribbon Crunch Frappuccino	24.000000
222	Caffè Vanilla Frappuccino	23.636364
234	Java Chip Frappuccino	23.636364
386	Iced Salted Caramel Mocha	23.272727
213	Mocha Cookie Crumble Frappuccino	22.909091