Publicly Available Projects
Whilst much of the work performed by the Babraham Bioinformatics group is available only to those at Babraham or the other BBSRC institutes, we aim wherever possible to make any software we develop publicly available.
Each of the following projects is derived from code developed by the Babraham Bioinformatics group for in-house use. Formal support for these packages is available via our consultancy service, but we aim as far as possible to provide free informal support to anyone making use of our software. In particular, we are always keen to receive bug reports or feature requests for any of our software packages.
If you want to stay up to date with changes to our existing projects, or with new software we have released, you can follow us on Twitter. You can also see demonstrations of our software on our YouTube channel.
Active Projects
These projects are still being maintained and developed. We are keen to hear ideas for new functionality which could be added to them.
Bismark (Pubmed link) is a program to align bisulfite treated sequencing reads (BS-Seq) to a reference genome and perform methylation calls for every cytosine in the read in a single step. The output can be easily imported into a genome viewer such as SeqMonk, enabling researchers to analyse their data straight away.
November 2015: Bismark is now also available on GitHub. Please consider leaving comments, bug reports or feature requests over there as well.
Cluster Flow is a pipeline tool for cluster environments. It manages processing pipelines and analysis modules, making routine bioinformatics analyses fast and reproducible.
Compter is an application which allows you to analyse the nucleotide composition of one or more sets of DNA or RNA sequences. It can be used to help identify compositional biases generated within experiments.
FastQC is a quality control application for high throughput sequence data. It reads in sequence data in a variety of formats and can either provide an interactive application to review the results of several different QC checks, or create an HTML based report which can be integrated into a pipeline.
FastQ Screen is an application which allows you to search a FastQ sequence file against a set of sequence databases and summarises the results. It is useful for incorporating into a sequencing pipeline to identify sources of contamination or mislabeled samples.
GOliath is a web based Gene Ontology searching system. It performs gene ontology enrichment analysis for mouse or human genes and checks the resulting categories against a set of pre-computed functional categories that may be artefactual or biased. If matches are found, the results are flagged as potentially biased.
HiCUP is a bioinformatics pipeline for processing Hi-C data. The pipeline maps FASTQ reads against a reference genome, removing frequently encountered experimental artefacts. HiCUP produces paired-read files in SAM/BAM format, each read pair corresponding to a putative Hi-C di-tag.
Labrador is a tool for managing and automating the processing of publicly available sequencing datasets. End-users can browse and search projects and view metadata before downloading data through their web browser. Bioinformaticians can store relevant metadata locally and organise projects, as well as using Labrador to automatically generate analysis scripts through the use of accession number lookups.
Re-DOT-able is an interactive application for creating and manipulating dotplots. It can be used to assess the patterns of similarity between large pairs of DNA sequences.
reStrainingOrder is a mouse sample QC tool to help identify either pure strains or hybrid crosses from Illumina sequencing data that were aligned against a genome in which all known SNP positions had been masked by Ns. reStrainingOrder works for most commonly used types of sequencing, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-Seq, and more.
SeqMonk is a program to enable the visualisation and analysis of mapped sequence data. It was written for use with mapped next generation sequence data but can in theory be used for any dataset which can be expressed as a series of genomic positions.
Sierra is a web based LIMS system designed for use by small sequencing facilities. It provides a simple way to track samples within a facility and return sequencing, mapping and QC results to users.
Sherman is a script to simulate high-throughput sequencing data. It specialises in generating libraries for bisulfite-sequencing (BS-Seq) and introducing various sources of contamination which are commonly found in various Next-Gen Sequencing applications.
SNPsplit is a tool to tag and sort reads in an allele-specific manner once they have been aligned against a genome in which all known SNP positions had been masked by Ns. This may be useful to determine allele-specific binding of proteins or histones (ChIP-Seq) or gene expression (RNA-Seq) for mice with a mixed genetic background. SNPsplit can also handle allele-specific Hi-C data processed by HiCUP.
SRA downloader is a program which makes it easier to download raw fastq sequence data from the GEO and SRA databases. It reads the config file you can download from the NCBI SRA Run Selector and then pulls down the corresponding data.
Trim Galore is a wrapper for the stand-alone tools Cutadapt and FastQC, designed to enable consistent quality control and quality/adapter trimming for Next-Gen sequencing applications. It comes with additional features that are specific to RRBS applications (Reduced Representation Bisulfite-Seq), but works just as well for any other FastQ file that might need quality or adapter trimming.
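Several of the tools above (reStrainingOrder and SNPsplit) work with a genome in which known SNP positions are masked by Ns. The short Python sketch below illustrates the general idea only and is not taken from either tool. The SNP table, the function names `mask_snps` and `assign_allele`, and the returned labels are all hypothetical, and real tools work on aligned reads in SAM/BAM format rather than plain strings.

```python
from typing import Dict, Tuple

# Hypothetical SNP table: 0-based genome position -> (strain 1 base, strain 2 base).
Snps = Dict[int, Tuple[str, str]]


def mask_snps(sequence: str, snps: Snps) -> str:
    """Return a copy of the sequence with every known SNP position replaced by N."""
    bases = list(sequence)
    for pos in snps:
        bases[pos] = "N"
    return "".join(bases)


def assign_allele(read: str, read_start: int, snps: Snps) -> str:
    """Classify an ungapped read by the bases it carries at known SNP positions.

    read_start is the 0-based genome position of the first base of the read.
    The labels returned here are illustrative names, not the output of any tool.
    """
    strain1_hits = 0
    strain2_hits = 0
    for pos, (strain1_base, strain2_base) in snps.items():
        offset = pos - read_start
        if 0 <= offset < len(read):  # the read covers this SNP position
            base = read[offset]
            if base == strain1_base:
                strain1_hits += 1
            elif base == strain2_base:
                strain2_hits += 1
    if strain1_hits and strain2_hits:
        return "conflicting"
    if strain1_hits:
        return "genome1"
    if strain2_hits:
        return "genome2"
    return "unassigned"


snps = {2: ("G", "A"), 5: ("C", "T")}
masked = mask_snps("ACGTACGT", snps)
print(masked)
print(assign_allele("GTAC", 2, snps))
print(assign_allele("ATAT", 2, snps))
```

Because both alleles mismatch the N equally at a masked position, neither strain is favoured at the alignment step, and the base a read carries at that position can then be used to sort it afterwards.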
Legacy Projects
These projects are no longer under active development. It is unlikely that we will add new functionality to them, although we will consider fixing bugs which turn up.
ASAP is a program to align a sequencing file to two genomes at the same time. This can be useful to detect allelic imbalances in samples which are of different genetic origin. Examples would be studying imprinted regions in ChIP-Seq or RNA-Seq experiments, or genomic interactions (e4C) in mouse strains with a mixed genetic background.
Bareback processing allows the user to 'move' the raw images acquired by the Illumina Genome Analyser IIx from initial cycles with a very biased sequence composition to the end of the run. The images can then be reanalysed using the GOAT pipeline (which is now part of the Illumina OLB). For samples with very biased initial sequences (such as restriction enzyme sites) and/or very high cluster densities this procedure can potentially increase the sequence yield dramatically.
28/01/2011 Bareback-processing is now published (Pubmed).
ChIPMonk is a program designed to help in the visualisation and analysis of ChIP-on-chip array data. It provides a comprehensive set of tools to import, normalise, analyse and visualise your data.
Difference tracker is a set of two ImageJ plugins which can be used to track a large number of faint moving particles in a noisy environment.
FocalPoint is an image browser which provides enhanced functionality for images with multiple frames or channels. It works with all standard image types as well as a number of specialised microscopy formats.
FRETSaw is a specialised image browser for examining pairs of images generated by a FRET experiment. It also creates a colourised image showing the differences between the source images.
MZViewer is a simple viewer for mass spectra files in the mzData format.
Realyser is a program to help with the selection of stable controls for QPCR and the normalisation of QPCR data. It is an application of the GNorm technique and is designed to work with the data files from ABI Real-time PCR machines.
SparkSpotter is a plugin for the ImageJ image analysis platform. It is designed to identify short, bright 'spark' events in movies. It was originally designed to work with microscopy images of cells stained with fluorescent dyes, but can probably be applied to many different types of input.
StackMeasure is a plugin for the ImageJ image analysis platform. It works on image stacks (usually from confocal microscopes) and is designed to identify features which are separated spatially and by colour. It then provides tools to flexibly measure the distance between them. It also provides a 3D view of the features it has identified.