With the completion of the human and mouse genomes, scientists are looking to make large-scale comparisons of syntenic regions between these two genomes, or between other genomes.
Usually, two sequences are compared by aligning them and then reading the alignment to judge the level of conservation. Comparisons of large stretches of genomic sequence are not amenable to this kind of analysis for several reasons.
To get around the problems previously described, the most common form of comparison for large genomic sequences is a synteny plot. This shows the two sequences, one above the other, and draws blocks linking the regions which are identified as being equivalent (syntenic) between the two.
Several programs have been developed to display synteny plots. One of the most common is the Artemis Comparison Tool (ACT). This is a development of Artemis, one of the most popular genome browsers.
ACT allows you to view a synteny plot between two sequences. It also displays features within the sequences, such as genes and repeats. The program allows you to zoom in and out of the sequences to whatever degree you wish. It also lets you set a significance cutoff for the syntenic regions it displays, so you can choose how divergent the displayed matches are allowed to be.
ACT does not perform a comparison of your sequences; it merely displays a comparison you have previously performed. A synteny comparison can take a long time to run, so the advantage of displaying a pre-computed comparison is that you only need to compare any two sequences once. ACT can then display the synteny plot very quickly, as often as you like.
Information about how to get ACT can be found here.
If you're interested in making this sort of comparison but aren't sure where to start, you could consider going on the Babraham ACT course.
This program generates the comparison file which is used by ACT. It creates this file by performing several different functions.
You must supply two sequences for the comparison. Because such sequences are likely to be large, there is no option to copy and paste them; instead you must select them from a local file.
In theory the sequences can be in pretty much any format, but since these should be the same sequences you eventually use in ACT, they really should be in either EMBL or GenBank format. Please note that ACT will not read sequences in GCG format.
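If you are unsure which format a file is in, a quick look at its first line is usually enough. The following is a minimal sketch rather than part of this program: the function name is hypothetical, and it assumes the file begins directly with the entry, so that an EMBL file starts with an "ID" line and a GenBank file with a "LOCUS" line.

```python
def guess_sequence_format(first_line):
    """Guess a sequence file's format from its first line.

    Hypothetical helper for illustration only. It assumes the file
    starts directly with the entry (no leading blank or header lines).
    """
    if first_line.startswith("ID   "):
        # EMBL entries open with an "ID" line
        return "embl"
    if first_line.startswith("LOCUS"):
        # GenBank entries open with a "LOCUS" line
        return "genbank"
    if first_line.startswith(">"):
        # FASTA records open with a ">" header
        return "fasta"
    return "unknown"


def guess_file_format(path):
    """Read the first line of the file at 'path' and guess its format."""
    with open(path) as handle:
        return guess_sequence_format(handle.readline())
```

A result of "unknown" does not necessarily mean the file is unusable, only that this simple check did not recognise it.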
Because part of the analysis is masking the sequences for repeats, it is necessary to know the origin of the sequences. Select the closest match from the options available (which are the default options from the RepeatMasker program).
Whilst the program is running you will be regularly updated on the progress of your job. You don't usually need to preserve this information. However, if your job dies, the log file should tell you why, and you can report any bugs to the bioinformatics group as necessary.
When the program has finished running you will be presented with three different files you can download. You also have some filtering options which are only applicable when you are downloading the comparison file.
The comparison file is the list of regions of synteny between your two sequences and is the only file you absolutely require to get ACT to work.
Optionally you can also apply a filter to the list of hits to restrict them on the basis of the length of match, or their percentage identity. You can interactively alter the stringency of matches displayed from within ACT, but the filter option may be useful if you know in advance that you have hard cutoff limits below which you aren't interested in any hits.
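The same kind of filter can be applied to a comparison file you have already downloaded. The sketch below is not the code this program uses. It assumes the comparison file is in BLAST tabular format (one of the formats ACT can read), where the third tab-separated column is the percentage identity and the fourth is the alignment length. The function name and the cutoff values in the example are hypothetical.

```python
def filter_hits(lines, min_length=0, min_identity=0.0):
    """Keep only hits that meet both the length and identity cutoffs.

    Assumes BLAST tabular format: tab-separated, 12 columns, with
    percentage identity in column 3 and alignment length in column 4.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete hit line
        identity = float(fields[2])
        length = int(fields[3])
        if length >= min_length and identity >= min_identity:
            kept.append(line)
    return kept


# Two made-up hits, for illustration only
example_hits = [
    "seq1\tseq2\t95.50\t1200\t54\t0\t100\t1299\t5000\t6199\t0.0\t2100",
    "seq1\tseq2\t78.00\t150\t33\t0\t2000\t2149\t9000\t9149\t1e-20\t90",
]
strong_hits = filter_hits(example_hits, min_length=500, min_identity=90.0)
```

Filtering in advance like this permanently discards the weaker hits, whereas the cutoffs inside ACT can be relaxed again at any time.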
In addition to generating the comparison file, the program also allows you to download GFF (General Feature Format) files for both of your sequences. These contain information about the position and type of the repeats found in each sequence. The files can be imported into ACT, where they annotate the sequence view with the positions of the repeats which were masked before the comparison.
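The GFF files are plain text and can also be inspected outside ACT. As a rough sketch, assuming the standard tab-separated GFF layout (sequence name, source, feature type, start, end, score, strand, frame, then optional attributes), the repeat positions could be listed as follows. The function name and the example line are hypothetical, and the exact content of each column in the files produced here may differ.

```python
def read_repeats(gff_lines):
    """Return (feature_type, start, end, strand) for each GFF feature line.

    Assumes standard tab-separated GFF with at least 8 columns.
    """
    repeats = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and "##" header lines
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete feature line
        feature_type = fields[2]
        start = int(fields[3])
        end = int(fields[4])
        strand = fields[6]
        repeats.append((feature_type, start, end, strand))
    return repeats


# A made-up header and feature line, for illustration only
example_gff = [
    "##gff-version 2",
    "seq1\tRepeatMasker\tsimilarity\t1500\t1800\t23.5\t+\t.\tTarget \"Motif:AluY\" 1 300",
]
repeat_positions = read_repeats(example_gff)
```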