library(tidyverse)
trumpton <- read_delim("trumpton.txt")
trumpton %>%
  ggplot(aes(x=Age, y=Weight)) +
  geom_point()

Generally, the older people in the trumpton dataset are heavier.

Filter and select

Find the person who weighs more than 100 kg.

trumpton %>%
  filter(Weight > 100) %>%
  select(FirstName, LastName)


trumpton %>%
  ggplot(aes(x=LastName, y=Age)) +
  geom_col(fill="magenta2", colour="black")

Child Variants

child <- read_delim("Child_Variants.csv")

Select all of the rows (variants) which occur in the first 5 Mbp of Chr X.

x_filtered <- child %>%
  filter(CHR=="X") %>%
  filter(POS <= 5000000)
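
The two chained filter() calls can also be written as a single call, because filter() combines comma-separated conditions with a logical AND. A minimal sketch, assuming a small made-up tibble (toy_variants and its values are hypothetical and not taken from Child_Variants.csv):

```r
library(dplyr)

# Hypothetical stand-in for the child variants table (values are invented)
toy_variants <- tibble(
  CHR = c("X", "X", "1", "X"),
  POS = c(1200000, 7500000, 300000, 4999999)
)

# Two chained filters, as in the answer above
chained <- toy_variants %>%
  filter(CHR == "X") %>%
  filter(POS <= 5000000)

# One filter with comma-separated conditions (combined with AND)
combined <- toy_variants %>%
  filter(CHR == "X", POS <= 5000000)

# Compare the two results row by row
all.equal(chained, combined)
```

Which form to use is mostly a matter of taste: chaining keeps each condition on its own step, while the single call is more compact.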



x_filtered %>%
  ggplot(aes(x=MutantReads, y=COVERAGE, colour=QUAL)) +
  geom_point()

The low-quality calls tend to have low coverage and few mutant reads.
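
The pattern seen in the plot could also be checked numerically by splitting the calls into quality groups and comparing their averages. A rough sketch, assuming a made-up tibble and an arbitrary QUAL cutoff of 100 (toy_calls, its values, and the cutoff are all hypothetical):

```r
library(dplyr)

# Hypothetical calls; the numbers are invented for illustration only
toy_calls <- tibble(
  QUAL        = c(20, 35, 150, 200, 180),
  COVERAGE    = c(5, 8, 40, 60, 55),
  MutantReads = c(2, 3, 20, 30, 25)
)

# Label each call as low or high quality using an arbitrary cutoff,
# then compare mean coverage and mean mutant reads per group
quality_summary <- toy_calls %>%
  mutate(quality_group = if_else(QUAL < 100, "low", "high")) %>%
  group_by(quality_group) %>%
  summarise(
    mean_coverage = mean(COVERAGE),
    mean_mutant_reads = mean(MutantReads)
  )

quality_summary
```

On real data the cutoff would need to be chosen by looking at the distribution of QUAL first.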

Chr 1 line plot

child %>%
  filter(CHR == "1") %>%
  filter(dbSNP != ".") %>%
  ggplot(aes(x=POS, y=COVERAGE)) +
  geom_line(colour="grey", linewidth=1)

Remove any variants with a coverage greater than 200.

child %>%
  filter(CHR == "1") %>%
  filter(dbSNP != ".") %>%
  filter(COVERAGE <= 200) %>%
  ggplot(aes(x=POS, y=COVERAGE)) +
  geom_line(colour="grey", linewidth=1)