## Training Courses

As part of its work with the Babraham Institute the Bioinformatics group runs a regular series of training courses on many aspects of bioinformatics.

Whilst these courses were designed for internal use many of them would be suitable for outside participants. If you are interested in having us run these courses for you then we may be able to arrange this through our consultancy service.

Where possible we also aim to make the material from our courses publicly available so that anyone who wants to can download it for their own use.

Below is a list of the courses we currently run. Where they are available there is a link to the training manual and course exercises.

#### Babraham Software

#### Core Bioinformatics Skills

#### Statistics

- Statistical Analysis using GraphPad Prism
- Statistical Analysis using SPSS
- Statistical Analysis using R

#### Application focussed courses

- RNA-Seq Analysis
- Analysing bisulphite methylation sequence data
- Extracting biological information from gene lists
- Viewing 3D Structures in Deep View
- Quality control in Sequencing Experiments

### Analysing Mapped Sequence Data with SeqMonk (Half a day)

SeqMonk is a program which can analyse large data sets of mapped genomic positions. It is most commonly used to work with data coming from high-throughput sequencing pipelines.

The program allows you to view your reads against an annotated genome and to quantitate and filter your data to let you identify regions of interest. It is a friendly way to explore and analyse very large datasets.

This course provides an introduction to the main features of SeqMonk and will run through the analysis of a couple of different datasets to show what sort of analysis options it provides.

For a more in-depth look at how to apply SeqMonk's tools to your data you can also look at our Advanced SeqMonk course.

#### Course Content:

- What is SeqMonk
- Starting and configuring the program
- Creating a project and importing data
- Using the chromosome viewer
- Quantitating and Filtering Data
- Creating Reports
- Exporting text and graphics

#### Course Material:

- Course Manual (pdf)
- Course Manual (doc)
- Course Exercises (pdf)
- Course Exercises (doc)
- Course Data (zip) [1.5GB] (ChIP-Seq data taken from Mikkelsen et al (2007))

### Advanced analysis with SeqMonk (Half a day)

The advanced SeqMonk course picks up where the basic course finishes. Rather than teaching you how to operate the program this course focusses on how best to apply the tools provided to your data.

The course focusses on optimising your data quantitation and filtering. It discusses biases which are often seen in data and the options SeqMonk provides to normalise these away. It also goes through the various statistical tools available in SeqMonk and under what circumstances each of these should be used.

For more complex experimental designs the course also introduces the clustering tools to try to generate functionally related sets of probes, which might make your data easier to interpret.

#### Course Content:

- Quantitation biases and data normalisation
- Using the statistical tools in SeqMonk
- Clustering
- SeqMonk tips and tricks

#### Course Material:

- Course Manual (pdf)
- Course Manual (docx)
- Course Exercises (pdf)
- Course Exercises (docx)
- Directional RNA-Seq data set. Taken from GEO GSM800443
- Large RNA-Seq data set. Taken from ArrayExpress E-MTAB-822

### Statistical Analysis using R (One day)

Statistics are an important part of most modern studies and being able to effectively use a statistics package can help you to understand your results. This course provides an introduction to statistics illustrated through the use of the R language.

#### Course Content:

- Power analysis
- Recap of basic R functions
- Introduction to the ggplot library for graphing
- Importing data from other software packages
- Preparing your data for analysis
- Getting to know your data
- Graphical representations
- Choosing an appropriate analysis
- Interpreting analysis output

#### Course Material:

### Statistical Analysis using SPSS (One day)

Statistics are an important part of most modern studies and being able to effectively use a statistics package can help you to understand your results. This course provides an introduction to statistics illustrated through the use of the friendly SPSS package.

#### Course Content:

- Introduction to SPSS
- Importing data from other software packages
- Preparing your data for analysis
- Getting to know your data
- Graphical representations
- Choosing an appropriate analysis
- Interpreting analysis output

#### Course Material:

### Statistical Analysis using GraphPad Prism (One day)

GraphPad Prism is a powerful and friendly package which allows you to plot and analyse your data. This course acts not only as an introduction to Prism, but also goes through the basic statistical knowledge which should allow you to make the most of your data.

#### Course Content:

- Introduction to GraphPad Prism
- Preparing your data for analysis
- Getting to know your data
- Graphical representations
- Choosing an appropriate analysis
- Interpreting analysis output

#### Course Material:

- Course Manual (pdf)
- Course Manual (doc)
- Course Slides (pdf)
- Course Slides (ppt)
- Course Exercise Files (zip)

### Learning to Program with Perl (7 x 1.5 hour sessions)

For a long time Perl has been a popular language among those programming for the first time. Although it is a powerful language, many of its features make it especially suited to first-time programmers, as they reduce the complexity found in many other languages. Perl is also one of the world's most popular languages, which means there are a huge number of resources available to anyone setting out to learn it.

This course aims to introduce the basic features of the Perl language. At the end you should have everything you need to write moderately complicated programs, and enough pointers to other resources to get you started on bigger projects. The course tries to provide a grounding in the basic theory you'll need to write programs in any language as well as an appreciation for the right way to do things in Perl.

#### Course Content:

- Getting Started with Perl
- Conditions, Arrays, Hashes and Loops
- File Handling
- Regular Expressions
- Subroutines, References and Complex Data Structures
- Perl Modules
- Interacting with External Programs
- Cross Platform Issues and Compiling

#### Course Material:

- Course Manual (pdf)
- Course Manual (doc)
- Course Exercises (pdf)
- Course Exercises (doc)
- Code used in the course (zip)

### Viewing 3D Structures with Deep View (Half a day)

Many proteins have had their structures experimentally determined, and an examination of these structures can provide valuable insights into the function of these molecules. This course will show you how to use the free Deep View software to view and analyse both single and multiple protein structures. It will also show you how to make impressive molecular graphics for your reports or posters.

#### Course Content:

- Installing Deep View
- Finding your structure
- Different structural representations
- Looking at residue subsets
- Aligning Multiple Structures
- Producing Publication-quality images

#### Course Material:

### Introduction to R (Half a day)

R is a popular language and environment that allows powerful and fast manipulation of data, offering many statistical and graphical options. This course aims to introduce R as a tool for statistics and graphics, with the main aim being to become comfortable with the R environment. It will focus on entering and manipulating data in R and producing simple graphs. A few functions for basic statistics will be briefly introduced, but statistical functions will not be covered in detail.

#### Course Content:

- What is R
- Getting familiar with the R console
- Entering Data
- Manipulating data
- Importing data files
- Creating Graphs (boxplots, barplots, scatterplots, line graphs)

#### Course Material:

- Course Manual (pdf)
- Course Manual (doc)
- Course Exercises (pdf)
- Course Exercises (doc)
- Course data (zip)

### Advanced R (Half a day)

This course follows on from the introductory course. It goes into more detail on practical guides to filtering and combining complex data sets. It also looks at other core R concepts such as looping with apply statements and using packages. Finally it looks at how to document your R analyses and generate complete analysis reports.

#### Course Content:

- Filtering and selection review
- Text manipulation
- Merging large datasets
- Looping
- Using and writing functions
- R packages
- Documenting your analysis

#### Course Material:

- Course Manual (pdf)
- Course Manual (docx)
- Course Exercises (pdf)
- Course Exercises (docx)
- Course data (zip)

### Plotting complex figures with R (Half a day)

This course is a comprehensive guide to the use of the built-in R plotting functionality to construct everything from customised simple plots to complex multi-layered figures. It follows on from the material in our introductory R course and participants are expected to have a basic understanding of R - enough to load and do basic manipulation of datasets.

#### Course Content:

- The R painter's model
- Core graph types and options
- Plot area customisation
- Using colour in plots
- Adding plot overlays
- Useful extension packages
- Writing plots to files

#### Course Material:

- Course Manual (pdf)
- Course Manual (docx)
- Course Exercises (pdf)
- Course Exercises (docx)
- Presentation Slides (pdf)
- Presentation Slides (pptx)
- Course data (zip)

### An Introduction to ggplot (Half a day)

Ggplot is the most popular plotting extension to R and replicates many of the graph types found in the core plotting libraries. This course provides an introduction to the ggplot2 libraries and gives a practical guide for how to use these to create different types of graphs.

#### Course Content:

- How ggplot2 works
- Plotting different graph types
  - Scatterplots
  - Stripcharts
  - Histograms
  - Boxplots
  - Violin / Beanplots
  - Bar charts
  - Line graphs
- Writing plots to files

#### Course Material:

### An Introduction to Unix (Half a day)

An increasing amount of bioinformatics work is done in a command-line Unix environment. Most large-scale processing applications are written for Unix and most large-scale compute environments are also based on it.

This course introduces the concepts of Unix and provides a practical introduction to working in this environment. Internally we link this course to a more specific course illustrating the use of our internal cluster environment. That part of the course could be adapted for other sites with different compute infrastructure.

#### Course Content:

- Unix commands
- Files and Directories
- Viewing, Creating, Copying, Moving and Deleting
- Permissions
- Pipes

#### Course Material:

- Course Manual (pdf)
- Course Manual (doc)
- Unix cheat sheet (pdf) (External content from Tufts University)

### Analysing bisulphite methylation sequencing data (One day)

This course builds on the core skills introduced in the Introduction to R, Introduction to Unix and Introduction to SeqMonk courses to provide a more in-depth look at the analysis of bisulphite sequencing data. The course is a mix of theoretical lectures and hands-on practicals which go through the whole analysis pipeline, starting from raw sequence data and covering QC, visualisation, quantitation and differential methylation analysis.

#### Course Content:

- The theoretical basis for BS-Seq
- Processing raw sequencing data with Bismark
- Visualisation and exploration of methylation calls with SeqMonk
- The theory of differential methylation calling
- Differential methylation analysis with SeqMonk and bsseq

#### Course Material:

- BS-Seq data processing lecture (pptx)
- BS-Seq data processing lecture (pdf)
- BS-Seq data processing exercises (docx)
- BS-Seq data processing exercises (pdf)
- Visualisation and exploration lecture (pptx)
- Visualisation and exploration lecture (pdf)
- SeqMonk tools for methylation analysis (pptx)
- SeqMonk tools for methylation analysis (pdf)
- Visualisation and exploration practical (docx)
- Visualisation and exploration practical (pdf)
- Differential methylation lecture (pptx)
- Differential methylation lecture (pdf)
- Differential methylation practical (docx)
- Differential methylation practical (pdf)
- ox-BS-Seq lecture (pptx)
- ox-BS-Seq practical (pdf)
- Data for all practicals (tar.gz) [WARNING 14GB]

### Extracting biological information from gene lists (One day)

Many experimental designs end up producing lists of hits, usually based around genes or transcripts. Sometimes these lists are small enough that they can be examined individually, but often it is useful to do a more structured functional analysis to try to automatically determine any interesting biological themes which turn up in the lists.

This course looks at the various software packages, databases and statistical methods which may be of use in performing such an analysis. As well as being a practical guide to performing these types of analysis the course will also look at the types of artefacts and bias which can lead to false conclusions about functionality and will look at the appropriate ways to both run the analysis and present the results for publication.
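
As a rough illustration of the kind of statistical method involved, the sketch below computes an enrichment p-value with a one-sided hypergeometric test, which is one common choice for this problem and not necessarily the method taught in the course. The function name `enrichment_p_value` and all of the numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Probability of seeing at least `overlap` annotated genes in a hit list.

    population: total number of genes which could have been hits
    annotated:  genes in the population carrying the functional annotation
    selected:   number of genes in the hit list
    overlap:    genes in the hit list carrying the annotation
    """
    max_overlap = min(annotated, selected)
    total = comb(population, selected)
    # Sum the upper tail of the hypergeometric distribution
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 20,000 genes measured, 200 carry the annotation,
# and 5 of the 100 hits carry it.
p = enrichment_p_value(population=20000, annotated=200, selected=100, overlap=5)
print(f"p = {p:.3g}")
```

In a real analysis this p-value would need correcting for the number of categories tested, and the choice of background population has a large effect on the result.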

#### Course Content:

- Functional databases
- Statistical tests for functional enrichment
- Common artefacts in functional analysis
- Presenting functional analysis in publications
- Motif detection tools
- Network analysis methods
- Commercial functional analysis tools

#### Course Material:

- Introduction to Gene List analysis lecture (pptx)
- Introduction to Gene List analysis lecture (pdf)
- Gene List analysis practical (docx)
- Gene List analysis practical (pdf)
- Artefacts and Presenting Results lecture (pptx)
- Artefacts and Presenting Results lecture (pdf)
- Artefacts and Presenting Results Practical (docx)
- Artefacts and Presenting Results Practical (pdf)
- Motif Searching lecture (pptx)
- Motif Searching lecture (pdf)
- Motif Searching practical (docx)
- Motif Searching practical (pdf)
- Networks and Interactions lecture (pptx)
- Networks and Interactions lecture (pdf)
- Networks and Interactions practical (docx)
- Networks and Interactions practical (pdf)
- Commercial Tools lecture (pptx)
- Commercial Tools lecture (pdf)
- Data for all practicals (zip) [5MB]

### RNA-Seq Analysis (Half a day)

This course builds on the core skills introduced in the Introduction to R, Introduction to Unix and Introduction to SeqMonk courses to provide a more in-depth look at the analysis of RNA-Seq data. The course starts with a comprehensive lecture covering the theory of RNA-Seq data generation and analysis. This is followed by a long hands-on practical session which runs through the entire RNA-Seq analysis pipeline from raw fastq files to a list of differentially expressed candidate genes.

#### Course Content:

- The theory of RNA-Seq analysis
- Raw data QC
- Mapping RNA-Seq data with TopHat
- Viewing RNA-Seq data with SeqMonk
- Differential expression analysis with DESeq
- Reviewing and visualising differential expression hits

#### Course Material:

- Course Presentation (pptx)
- Course Presentation (pdf)
- Practical instructions (docx)
- Practical instructions (pdf)
- Course Data (tar.gz) (1.5GB)

### Quality Control in Sequencing Experiments (Half a day)

This course looks at the different ways in which sequencing-based studies can fail and the options for visualisation and QC which allow you to identify and diagnose these failures at an early stage. It is designed to be of use to anyone who is using sequencing as part of their research, not just those who are running sequencing facilities.

#### Course Content:

- Why QC is important
- How sequencing experiments fail
- Implementing sequencing QC
- Existing QC software

#### Course Material:

- Course Introduction (pptx)
- Course Introduction (pdf)
- How sequencing experiments fail (pptx)
- How sequencing experiments fail (pdf)
- Failures in biological interpretation (pptx)
- Failures in biological interpretation (pdf)
- Developing and Implementing QC (pptx)
- Developing and Implementing QC (pdf)
- Course Data (zip) [9.3MB]

### Scientific Figure Design (Whole day)

This course provides a practical guide to producing figures for use in reports and publications. It is a wide-ranging course which looks at how to design figures to clearly and fairly represent your data, the practical aspects of graph creation, the allowable manipulation of bitmap images, and the compositing and editing of final figures.

The course will use a number of different open source software packages and is illustrated with a number of example figures adapted from common analysis tools.

#### Course Content:

- Data Visualisation Theory Lecture
- Data Representation Practical
- Ethics of Data Representation Lecture
- Design Theory Lecture
- GIMP Tutorial
- GIMP Practical
- Inkscape Tutorial
- Inkscape Practical
- Final Practical

#### Course Material:

- Figure Design Introduction (pptx)
- Figure Design Introduction (pdf)
- Data Visualisation Theory Lecture (pptx)
- Data Visualisation Theory Lecture (pdf)
- Data Representation Practical (docx)
- Data Representation Practical (pdf)
- Ethics of Data Representation Lecture (pptx)
- Ethics of Data Representation Lecture (pdf)
- Design Theory Lecture (pptx)
- Design Theory Lecture (pdf)
- GIMP Tutorial (pptx)
- GIMP Tutorial (pdf)
- GIMP Practical (docx)
- GIMP Practical (pdf)
- Inkscape Tutorial (pptx)
- Inkscape Tutorial (pdf)
- Inkscape Practical (docx)
- Inkscape Practical (pdf)
- Exporting Files (docx)
- Exporting Files (pdf)
- Submitting to Journals (pdf)
- Submitting to Journals (pptx)
- Final Practical (docx)
- Final Practical (pdf)
- Figure Design Course Data (zip 15MB)