
Creating a new Project

Selecting the correct genome

To create a new project, select File > New Project from the main menu. You will be shown a tree view of the genomes you currently have available, classified first by species and then by assembly.

When you first run ChIPMonk you won't have any genomes imported, so this list will be blank. If you can't see the genome you want, move on to the next section, "Importing a new genome".

It is important that you select the correct genome assembly for the array you are going to use. Selecting the wrong assembly will result in probes being positioned incorrectly on the genome, or possibly not appearing at all.

There is no universal rule for determining which assembly to use. If you are unsure, ask the producer of your array for this information.

Once you have selected the correct genome it will be loaded into ChIPMonk. This may take a minute or so, since there's a lot of information in a genome! Once the genome has loaded you are ready to import your data.

Importing a new genome

If the genome you require does not appear in the list of available genomes, you will need to download and install it into ChIPMonk. Because of the size of the genome annotation files, it is not practical to ship a large number of these with the program.

Importing a new genome is a fairly straightforward process. After selecting File > New Project, press the button labelled "Import New". ChIPMonk will fetch a list of genomes from our website and give you a selectable tree from which to choose the one you want to import. Select the genome you want and press "Download". The genome will be downloaded and installed in your genomes folder. You can then use it to create a new project.

If you want to use a genome which isn't available on the download server, please report this as a bug (www.bioinformatics.bbsrc.ac.uk/bugzilla/) and we can add more genomes to the system. Creating your own genome files is also fairly straightforward (they're just EMBL header files). Please contact the authors and we'll give you the details of how to do this.

Importing Data

Once your genome has loaded you can begin to import data. The import process differs slightly between array formats. To begin, select File > Import and then pick the correct format for your array.

Nimblegen Data

The data file you need to import your Nimblegen data into ChIPMonk is the single file which contains all of your raw data. On older Nimblegen arrays this will be called all_pair.txt and will be in a folder called Pair_Files. On newer arrays the file will be named after your experiment title with .pair on the end and will be in a folder called Raw_data_files.
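If you are unsure which of the two layouts your data uses, a short script can check for both. The following is only a sketch: the function name find_nimblegen_pair_file is hypothetical, it is not part of ChIPMonk, and it assumes the folder and file names are spelled exactly as described above. If several .pair files are present it simply returns the first one alphabetically.

```python
from pathlib import Path
from typing import Optional


def find_nimblegen_pair_file(root: Path) -> Optional[Path]:
    """Return the raw data file under root, or None if neither layout is found.

    Hypothetical helper, not part of ChIPMonk.
    """
    # Older arrays: Pair_Files/all_pair.txt
    old_style = root / "Pair_Files" / "all_pair.txt"
    if old_style.is_file():
        return old_style

    # Newer arrays: Raw_data_files/<experiment title>.pair
    new_dir = root / "Raw_data_files"
    if new_dir.is_dir():
        candidates = sorted(new_dir.glob("*.pair"))
        if candidates:
            # Assumption: take the first match if there is more than one.
            return candidates[0]

    return None
```

Call it with the top-level folder of your Nimblegen data, then select the returned file in the import dialog.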

ChIPMonk will be able to extract the sample names from the all_pair.txt file, but it won't know which pairs of data go together, or which experimental condition each represents.

After selecting the data file you wish to use you will be presented with a dialog box which you should use to pair up your data sets. Each pair consists of a total DNA control sample, and an experimental DNA subset. Select one data set in each column and press "Make new sample". You will be asked for a sample name for each pair. If you don't recognise the names you are presented with then there should be a file called SampleKey.txt at the top level of your Nimblegen DVD which tells you which of your samples equates to which array name. It may also help to know that array names ending in _532 are Cy3 and those ending in _635 are Cy5.
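The suffix rule above can be applied to a list of array names before you start pairing them up. This is a minimal sketch for illustration only: the function names and the example array names are made up, and anything without a recognised suffix is simply reported as "unknown".

```python
def dye_for_array_name(name: str) -> str:
    """Map an array name to its dye using the _532 / _635 suffix rule."""
    if name.endswith("_532"):
        return "Cy3"
    if name.endswith("_635"):
        return "Cy5"
    return "unknown"


def group_by_dye(names):
    """Group array names by dye so the two channels can be compared side by side."""
    groups = {"Cy3": [], "Cy5": [], "unknown": []}
    for name in names:
        groups[dye_for_array_name(name)].append(name)
    return groups


# Hypothetical array names, for illustration only.
example = group_by_dye(["12345_532", "12345_635", "67890_532", "notes"])
```

Which dye corresponds to the total DNA control and which to the experimental sample depends on your experiment, so SampleKey.txt is still needed to decide the pairing.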

When you've added all the pairs you want press the "Done" button to begin to import the data. You don't have to use all of the data sets in a data file. The data will then be imported and you can begin your analysis.

Generic Text Data

If your data format isn't explicitly supported by ChIPMonk, you can still load it provided that it's a delimited text format. You can start this kind of import by selecting File > Import > Text (Generic).

Text import occurs in two steps:

  1. Importing information about probes on your array
  2. Importing your experimental data

Probe Text Import

For the probe text import you need to have a text file which contains the following information:

  1. The name of your probe (each name must be unique)
  2. The chromosome your probe is on. This must be the chromosome name (optionally prefixed by 'chr'). It can be blank if you don't know the position of this probe.
  3. The position of your probe. This can be provided either as a single base position or as separate start and end values

To import your probes, select your probe information file and you will be shown the first 50 lines of data. Using the column on the right, tell the program which columns in the table supply which of the required fields. When you've provided enough information you can press the import button to bring in the probe details. If your file has header lines at the top, you can tell the program how many lines to ignore before starting to import data.
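It can save time to check a probe file against the requirements above before importing it. The sketch below is not ChIPMonk's own parser. It assumes a tab-delimited file whose columns are, in order, name, chromosome, start and an optional end. That column order is an assumption made for the example, since in the program you assign the columns yourself. The function name check_probe_rows is hypothetical.

```python
import csv
import io


def check_probe_rows(text, delimiter="\t", header_lines=0):
    """Return (probes, problems) for delimited probe text.

    probes maps each name to (chromosome, start, end), or to None when the
    chromosome is blank (position unknown). Assumed column order:
    name, chromosome, start, optional end.
    """
    probes = {}
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for line_no, row in enumerate(reader, start=1):
        # Skip header lines and completely empty lines.
        if line_no <= header_lines or not row:
            continue
        if len(row) < 3:
            problems.append(f"line {line_no}: expected at least 3 columns")
            continue
        name, chrom, start = row[0], row[1], row[2]
        if name in probes:
            problems.append(f"line {line_no}: duplicate probe name {name!r}")
            continue
        if chrom == "":
            # Blank chromosome is allowed when the position is not known.
            probes[name] = None
            continue
        # The 'chr' prefix is optional, so strip it for consistency.
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]
        # A single base position is treated as start == end.
        end = row[3] if len(row) > 3 and row[3] else start
        try:
            probes[name] = (chrom, int(start), int(end))
        except ValueError:
            problems.append(f"line {line_no}: position is not a whole number")
    return probes, problems
```

An empty problems list suggests the file meets the three requirements. Any entries point to the lines worth fixing before import.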

Text Data Import

Once you've imported your probe data you will be able to press the button which allows you to import your experimental data. The process is similar to the probe import. Each line of your text data file must contain:

  1. A probe identifier which matches exactly with those used in your probe import
  2. One or more pairs of Experiment / Control data values

Again you use the column on the right to tell the program which columns provide each type of information, and how many header lines your file has. For the data values you can keep adding experiment / control pairs until you have specified all of the data you wish to import. You don't have to import all the columns of data in your file if you don't want to.
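Because probe identifiers must match exactly, a quick check of the data file against the probe names can catch mismatches such as differing case. The following is a sketch under stated assumptions, not ChIPMonk's import code: it assumes each line holds the probe identifier first, followed by experiment / control values in alternating columns, and both function names are hypothetical.

```python
def parse_data_line(line, delimiter="\t"):
    """Split one data line into (probe_id, [(experiment, control), ...]).

    Assumed layout: id, experiment 1, control 1, experiment 2, control 2, ...
    """
    fields = line.rstrip("\n").split(delimiter)
    probe_id, values = fields[0], fields[1:]
    if len(values) < 2 or len(values) % 2 != 0:
        raise ValueError("expected one or more experiment / control pairs")
    pairs = [
        (float(values[i]), float(values[i + 1]))
        for i in range(0, len(values), 2)
    ]
    return probe_id, pairs


def unmatched_probe_ids(probe_names, data_ids):
    """Return the data identifiers with no exact (case-sensitive) probe match."""
    known = set(probe_names)
    return [pid for pid in data_ids if pid not in known]
```

For example, unmatched_probe_ids(["p1", "p2"], ["p1", "P2"]) reports "P2", because the comparison is exact and "P2" differs from "p2" in case.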
