Assignment #6
Load the required libraries
require(tidyverse)
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
Import the csv file
MBT_ebird<-read.csv("Data/MBT_ebird2.csv")
(1a) Using the ebird dataset, calculate the number
of species seen each month of each year in each location.
- Group the data by location, month, and year
- Then summarize the groups by species count
- The length() function, applied to unique(common_name), counts the distinct common names in each group
- .groups = "drop" is used so that the output is ungrouped
bird_stats<- MBT_ebird %>%
group_by(location, month, year) %>%
summarize(species_count = length(unique(common_name)), .groups = "drop")
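To check the counting logic, the same grouping can be run on a small made-up table where the answer is easy to see by eye. This is only a sketch: toy_birds and its values are hypothetical and do not come from MBT_ebird2.csv. It also shows that dplyr's n_distinct() should give the same count as length(unique()).

```r
library(dplyr)

# Hypothetical toy data (not from MBT_ebird2.csv)
toy_birds <- data.frame(
  location    = c("A", "A", "A", "B", "B"),
  month       = c(1, 1, 1, 1, 1),
  year        = c(2020, 2020, 2020, 2020, 2020),
  common_name = c("Robin", "Robin", "Blue Jay", "Robin", "Robin")
)

toy_stats <- toy_birds %>%
  group_by(location, month, year) %>%
  summarize(
    # distinct names per group, counted two equivalent ways
    species_count     = length(unique(common_name)),
    species_count_alt = n_distinct(common_name),
    .groups = "drop"  # return an ungrouped result
  )

toy_stats
```

Location A has two distinct names and location B has one, even though each location has repeated sightings.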
(1b) Plot the number of species seen each month
with the color of the points indicating year and facet this plot by
location
- Create a ggplot using the data above, bird_stats
- aes specifies the variables to be plotted
- as.factor is needed, or else month is treated as a continuous variable and the axis breaks can fall on fractional values instead of whole months
- facet_wrap generates multiple plots based on location
- The geom is drawn as points, and the axes and title are labeled
- The figure legend will be for year
- Lastly, I changed the color gradient to rainbow
prob1 <- ggplot(data = bird_stats)+
aes(as.factor(month), species_count, color = year) +
facet_wrap(~location) +
geom_point(size=1.5) +
theme_gray() +
xlab("Month") +
ylab("Species Count") +
ggtitle("Species Count by State")
problem1<-prob1+scale_color_gradientn(colours = rainbow(5))
problem1
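The effect of as.factor on the x axis can be seen with a minimal sketch on made-up data. The data frame toy_counts and its values are hypothetical and are not taken from bird_stats.

```r
library(ggplot2)

# Hypothetical toy data (not from bird_stats)
toy_counts <- data.frame(
  month         = c(1, 2, 3, 4),
  species_count = c(12, 15, 20, 18)
)

# month left as a number: ggplot2 uses a continuous x axis
p_numeric <- ggplot(data = toy_counts) +
  aes(month, species_count) +
  geom_point()

# month converted to a factor: one discrete tick per month
p_factor <- ggplot(data = toy_counts) +
  aes(as.factor(month), species_count) +
  geom_point() +
  xlab("Month")

p_numeric
p_factor
```

The same idea applies to year: because year is numeric, the legend is a continuous color gradient, which is why scale_color_gradientn is used above.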
Using the data set from Assignment #5:
(2) Plot a comparison of mass by treatment
including the individual observations, the mean, and standard error of
the mean. Use point color or shape to indicate the sex.
- Import the data file from Assignment 5
- Create a ggplot of the data frame mass_treat
- Group and mass were specified as the variables to be plotted
- geom_jitter adds a small amount of random variation to the position of each point
- The axes and title were labeled, and a crossbar marking the mean (purple) and error bars in black were added
- The figure legend will be for sex
- The warning messages are for the removed NA values
mass_treat<-read.csv("Results/Combined_Assn5_Data.csv")
prob2 <-ggplot(data= mass_treat,aes(Group,mass)) +
geom_jitter(size = 2, aes(Group, mass, color=Sex)) +
xlab("Group") +
ylab("Mass") +
stat_summary(fun = mean,
geom = "crossbar",
width = 0.5,
color = "purple") +
stat_summary(geom = "errorbar",
width = 0.6)+
ggtitle("Mass by Group and Sex") +
labs(color="Sex")
prob2
## Warning: Removed 4 rows containing non-finite values (stat_summary).
## Removed 4 rows containing non-finite values (stat_summary).
## No summary function supplied, defaulting to `mean_se()`
## Warning: Removed 4 rows containing missing values (geom_point).
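The message about defaulting to mean_se() means the error bars span the mean plus or minus one standard error. As a rough check, the same numbers can be computed by hand. The mass values below are hypothetical, with one NA included to mimic the removed rows.

```r
library(ggplot2)

# Hypothetical masses with one missing value (not from the Assignment 5 data)
mass <- c(10, 12, 14, NA, 16)

# Manual calculation after dropping the NA
obs <- mass[!is.na(mass)]
m   <- mean(obs)
se  <- sd(obs) / sqrt(length(obs))  # standard error of the mean
manual <- data.frame(y = m, ymin = m - se, ymax = m + se)

# What stat_summary() falls back to when no function is supplied
auto <- mean_se(mass)

manual
auto
```

Both data frames should hold the same three values: the mean, and the lower and upper ends of the error bar.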
(3) Generate a scatter plot of age and mass,
indicate treatment with point shape or color, and fit separate
regression lines (without CI) to each treatment.
- Create a ggplot of the data frame with age and mass as the variables
- Draw the geom as points and create a figure legend for group (dots)
- Label the axes and the title
- geom_smooth adds a trend line, and method = lm fits a linear model (linear regression line)
- se = FALSE removes the confidence interval around the line
- The labs() function sets the legend title to Group (control vs treatment) for both the points and the regression lines
prob3<-ggplot(data= mass_treat,aes(age,mass) )+
geom_point(size =3, aes(age, mass, color=Group))+
xlab("Age")+
ylab("Mass")+
geom_smooth(size = 2, method = lm,
aes(color = Group, group = Group),
se=FALSE)+
labs(color="Group") +
ggtitle("Mass by Age and Group")
prob3
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).
## Warning: Removed 4 rows containing missing values (geom_point).
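Grouping by Group inside geom_smooth amounts to fitting one linear model per treatment. A minimal sketch of that idea in base R is below. The data frame toy and its values are hypothetical, chosen so the slopes are easy to read.

```r
# Hypothetical toy data (not from the Assignment 5 data)
toy <- data.frame(
  age   = c(1, 2, 3, 4, 1, 2, 3, 4),
  mass  = c(2, 4, 6, 8, 5, 6, 7, 8),
  Group = rep(c("Control", "Treatment"), each = 4)
)

# Split the rows by Group and fit mass ~ age separately in each subset
fits <- lapply(split(toy, toy$Group), function(d) {
  coef(lm(mass ~ age, data = d))
})

fits
```

Each element of fits holds the intercept and slope for one group, which correspond to the separate lines drawn in the plot.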
(4) Combine the plots from questions 2 and 3 using
patchwork, tag each panel with a number or letter, and include a title
for the overall plot.
- Load the library patchwork
- The + operator combines the two plots, and plot_annotation adds a title over them both
- tag_levels = "A" labels the plots A and B
require(patchwork)
## Loading required package: patchwork
prob2+prob3+plot_annotation(title = "Mass by Group and Age",
tag_levels = "A")
## Warning: Removed 4 rows containing non-finite values (stat_summary).
## Removed 4 rows containing non-finite values (stat_summary).
## No summary function supplied, defaulting to `mean_se()`
## Warning: Removed 4 rows containing missing values (geom_point).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).
## Warning: Removed 4 rows containing missing values (geom_point).
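For reference, patchwork can also stack plots vertically with the / operator, and tag_levels = "1" tags the panels with numbers instead of letters. The sketch below uses two hypothetical toy plots rather than prob2 and prob3.

```r
library(ggplot2)
library(patchwork)

# Hypothetical toy data and plots (stand-ins for prob2 and prob3)
toy <- data.frame(x = c(1, 2, 3), y = c(2, 4, 3))
p_left  <- ggplot(toy, aes(x, y)) + geom_point()
p_right <- ggplot(toy, aes(x, y)) + geom_line()

# Same layout as above: plots combined with +, panels tagged A and B
side_by_side <- p_left + p_right +
  plot_annotation(title = "Toy layout", tag_levels = "A")

# Alternative layout: plots stacked with /, panels tagged 1 and 2
stacked <- p_left / p_right +
  plot_annotation(title = "Toy layout", tag_levels = "1")

side_by_side
stacked
```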