-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsteakArticle.Rmd
185 lines (148 loc) · 5.6 KB
/
steakArticle.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
---
title: "Steak Preferences in the United States"
subtitle: "Summary Report for the _**Mountain**_ Region"
author: "Gail Clement"
date: "`r format(Sys.Date(), '%A, %B %d, %Y')`"
output: html_document
---
```{r setup, include=FALSE}
# define knitr options
knitr::opts_chunk$set(echo = FALSE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(message = FALSE)
# set any missing NAs in tables to blank
options(knitr.kable.NA='')
# load R packages
library(fivethirtyeight)
library(tidyverse)
library(knitr)
library(kableExtra)
library(ggthemes)
# load steak data
data("steak_survey", package="fivethirtyeight")
# create a subset, no missing NAs
# pick a specific geographic region
sdat <- na.omit(steak_survey) %>%
filter(region=="Mountain")
```
## Background
In May 2014, Walt Hickey at
[538.com](http://fivethirtyeight.com/) published an
article entitled
["How Americans Like Their Steak"](https://fivethirtyeight.com/features/how-americans-like-their-steak/). This article utilized survey data collected from 550 people which asked questions related to various risky activities, such as whether they speed in traffic, if they smoke or go skydiving, and if they prefer a riskier lottery. The survey also asked if they eat steak and if so, how they prefer their steak.
## Purpose
The "steak survey" dataset that was used for the ["How Americans Like Their Steak"](https://fivethirtyeight.com/features/how-americans-like-their-steak/) article, is available through
the R package [`fivethirtyeight`](https://cran.r-
project.org/web/packages/fivethirtyeight/index.html).
Using this dataset, this article summarizes the steak
preferences for the _**Mountain**_ Region of the
United States by gender.
For our summary presented here, a subset of the original steak survey was extracted to only include responses with no missing data and only included respondents from the _**Mountain**_ region of the US. The data subset used for our summary had `r nrow(sdat)` respondents.
## Demographic Summary Tables
The age categories of the `r nrow(sdat)` survey
respondents were:
```{r}
# create table of the age categories
tb <- sdat %>%
select(age) %>%
table() %>%
prop.table()*100
# convert table to data frame
dt <- as.data.frame(tb)
# use kable from knitr package to make table
# and use kable styling from kableExtra package
knitr::kable(dt, format="html",
col.names=c("Ages","%"),
digits=2,
caption="Ages of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options ="striped",
full_width=FALSE)
```
The education levels of the `r nrow(sdat)` survey respondents were:
```{r}
# repeat for education
tb <-sdat %>%
select(educ) %>%
table() %>%
prop.table()*100
dt <-as.data.frame(tb)
knitr::kable(dt, format="html",
col.names=c("Education","%"),
digits=2,
caption="Education of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options = "striped",
full_width=FALSE)
```
The income levels of the `r nrow(sdat)` survey respondents were:
```{r}
# repeat for household income
tb <-sdat %>%
select(hhold_income) %>%
table() %>%
prop.table()*100
dt <-as.data.frame(tb)
knitr::kable(dt, format="html",
col.names=c("Income","%"),
digits=2,
caption="Income of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options = "striped",
full_width=FALSE)
```
## _OPTIONAL_ Demographics of Survey Respondents in a Merged Table
This section is provided as an example of how to merge the three demographic summary tables for the `r nrow(sdat)` survey respondents above into a single merged table.
```{r}
# create table summary for ages
tb1 <-sdat %>%
select(age) %>%
table() %>%
prop.table()*100
# create table summary for education
tb2 <-sdat %>%
select(educ) %>%
table() %>%
prop.table()*100
# create table summary for income
tb3 <-sdat %>%
select(hhold_income) %>%
table() %>%
prop.table()*100
# convert all tables to data frames
tb1df <-as.data.frame(tb1)
tb2df <-as.data.frame(tb2)
tb3df <-as.data.frame(tb3)
# merge 1st 2 data frames together
mtb <-merge(data.frame(tb1df, row.names = NULL),
data.frame(tb2df, row.names = NULL),
by=0, all=TRUE)[-1]
# merge result with 3rd data frame
mtb2 <-merge(data.frame(mtb, row.names = NULL),
data.frame(tb3df, row.names = NULL),
by=0, all=TRUE)[-1]
# use the final data frame
# make into a table with kable
# add styling with kableExtra
# add header with labels spanning 2 columns each
mtb2 %>%
knitr::kable(format="html",
col.names=c("Category","%","Category","%","Category","%"),
digits=2,
caption="Demographics of Survey Respondents") %>%
kableExtra::kable_styling(c("striped","bordered"),
full_width=FALSE) %>%
add_header_above(c("Ages"=2,"Education"=2,"Income"=2))
```
## Steak Preparation Preference by Gender
Finally, here is a breakdown of the `r nrow(sdat)` survey respondents for the _**Mountain**_ region of the US on how they prefer their steak to be prepared by gender.
```{r}
ggplot(sdat, aes(x = steak_prep, fill = female)) +
geom_bar(position="dodge",colour="black") +
ggtitle(paste0("Steak Preference by Gender: ",
"Mountain", "Region")) +
xlab("Steak Preparation Preference") +
ylab("Number of Respondents") +
scale_fill_manual(values=c("skyblue","palevioletred"),
name="Gender",
breaks=c(FALSE,TRUE),
labels=c("Male", "Female")) +
theme_fivethirtyeight()
```