A program for aligning short reads to reference phylogenies and alignments, by Simon A. Berger.
Last update: 2016-06-10.
The easiest way to get started is to use our precompiled binaries of PaPaRa 2.5:
However, we recommend building PaPaRa yourself for speed, because the compiler may be able to optimize better for your specific hardware. You also need to build it yourself if the binaries do not work on your machine. See the instructions below.
Invoke PaPaRa using
./papara -t <ref tree> -s <phylip RA> -q <fasta QS>
(or papara_static_x86_64, if you use the pre-compiled binary).
The phylip file (option -s) must contain the reference alignment (RA), consistent with the reference tree (option -t).
The FASTA file (option -q) contains the unaligned query sequences (QS). Optionally, all sequences which are in <phylip RA> but do not occur in the <ref tree> are also interpreted as QS.
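As a minimal sketch of a complete call, the following assembles the command line from three input files and prints it before running anything. The file names reference.tre, reference.phy and queries.fasta are placeholders invented for this example, not files shipped with PaPaRa.

```shell
# Hypothetical input files; replace with your own paths.
ref_tree="reference.tre"     # reference tree (-t)
ref_aln="reference.phy"      # reference alignment in phylip format (-s)
queries="queries.fasta"      # unaligned query sequences in FASTA format (-q)

# Use ./papara_static_x86_64 here if you run the pre-compiled binary.
papara_bin="./papara"

# Assemble the command line and print it for inspection.
cmd="$papara_bin -t $ref_tree -s $ref_aln -q $queries"
printf '%s\n' "$cmd"

# Run it only if the binary is actually present in this directory.
if [ -x "$papara_bin" ]; then
    "$papara_bin" -t "$ref_tree" -s "$ref_aln" -q "$queries"
fi
```

Printing the command first makes it easy to check that each file is passed to the intended option.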
The alignment parameters can be modified using the optional option -p <user_options>. <user_options> is a string and must have the following form:
<gap_open>:<gap_extend>:<mismatch>:<match_cgap>
The default parameters given in the paper thus correspond to the user option -p -3:-1:2:-3.
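The colon-separated parameter string can be put together with a small helper, sketched here. The function name papara_params is made up for this example and is not part of PaPaRa; it only joins four values in the order given above.

```shell
# Join the four scores in the order the -p option expects:
# <gap_open>:<gap_extend>:<mismatch>:<match_cgap>
# papara_params is a hypothetical helper, not part of PaPaRa.
papara_params() {
    if [ "$#" -ne 4 ]; then
        echo "usage: papara_params <gap_open> <gap_extend> <mismatch> <match_cgap>" >&2
        return 1
    fi
    printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Default values as given in the paper.
params=$(papara_params -3 -1 2 -3)
printf '%s\n' "-p $params"
```

Keeping the values as separate arguments makes it harder to swap two scores by accident when editing the string by hand.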
The output alignment will be written to papara_alignment.default. You can change the file suffix (i.e., “default”) by supplying a run-name with parameter -n.
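A script that post-processes the result needs to know the output file name in advance. The sketch below derives it from the run name, following the naming rule described above; the helper papara_output_name and the run name run1 are assumptions made for illustration.

```shell
# Hypothetical helper: output file name for a given run name.
# Without an argument it falls back to the suffix "default".
papara_output_name() {
    printf 'papara_alignment.%s\n' "${1:-default}"
}

run_name="run1"                              # example run name passed via -n
out_file=$(papara_output_name "$run_name")

printf '%s\n' "run with: -n $run_name"
printf '%s\n' "expected output file: $out_file"
```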
You can invoke the multi-threaded version by adding the option -j <num threads>.
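One way to choose a value for -j is to ask the operating system how many cores are online. This is only a sketch: using all cores is an assumption, not a recommendation from the PaPaRa authors, and the input file names are placeholders.

```shell
# Ask the OS for the number of online cores.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Fall back to a single thread if the result is not a plain positive number.
case "$threads" in
    ''|*[!0-9]*|0) threads=1 ;;
esac

# Print the resulting command line (hypothetical input file names).
printf '%s\n' "./papara -t reference.tre -s reference.phy -q queries.fasta -j $threads"
```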
The latest source code and Readme are available at the PaPaRa GitHub repository.
Get the source
If the provided binaries (see above) do not work, you need to compile PaPaRa yourself. On Unix/Linux systems, you first need the build tools. For example, on Debian-based systems, use
sudo apt-get install build-essential
Then, get the PaPaRa 2.5 source from here or download directly from the repository at
https://github.com/sim82/papara_nt
Unpack into papara_nt-master/.
If the sub-directory papara_nt-master/ivy_mike/ is empty, also download
https://github.com/sim82/ivy_mike/tree/3269b7b39dc6c129cfe72708d9086f1e8f8c2c98
and unpack its contents into papara_nt-master/ivy_mike/.
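Whether the ivy_mike sub-directory still needs to be filled can be checked before building. The following is a minimal sketch; dir_is_empty is a helper defined here for illustration, and the script assumes it is run from the directory that contains papara_nt-master/.

```shell
# Succeeds if the directory is missing or contains no entries.
# dir_is_empty is a hypothetical helper defined for this example.
dir_is_empty() {
    [ ! -d "$1" ] || [ -z "$(ls -A "$1")" ]
}

if dir_is_empty papara_nt-master/ivy_mike; then
    echo "ivy_mike is missing or empty: download it and unpack it into papara_nt-master/ivy_mike/"
else
    echo "ivy_mike is present"
fi
```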
Get Boost
If you do not have the C++ Boost libraries installed on your system, you need to install them first.
For example, on Debian-based systems, use
sudo apt-get install libboost-all-dev
For Mac systems, call
brew install boost
which uses the package manager Homebrew.
Build
After that, compile PaPaRa by calling
sh build_papara2.sh
in papara_nt-master/. If you want a static binary, use sh build_papara2_static.sh instead. The latter only works on Unix/Linux systems, as Mac does not support static linking.
(Tested on a clean install Ubuntu 14.04 LTS Virtual Machine.)
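The choice between the two build scripts can be sketched as a small helper that falls back to the regular build on a Mac. The function choose_build_script is hypothetical, and detecting a Mac by comparing the output of uname -s with "Darwin" is an assumption of this example.

```shell
# Pick the build script. The static build is skipped on macOS (Darwin).
# choose_build_script is a hypothetical helper, not part of PaPaRa.
choose_build_script() {
    want_static=$1     # "yes" to request a static binary
    os=$2              # output of `uname -s`
    if [ "$want_static" = "yes" ] && [ "$os" != "Darwin" ]; then
        echo "build_papara2_static.sh"
    else
        echo "build_papara2.sh"
    fi
}

script=$(choose_build_script yes "$(uname -s)")
printf '%s\n' "run inside papara_nt-master/: sh $script"
```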
When using the program or code, please cite:
S.A. Berger, A. Stamatakis
"Aligning short reads to reference alignments and trees"
Bioinformatics (2011) 27 (15): 2068-2075, first published online June 2, 2011. doi:10.1093/bioinformatics/btr320
which is available here.
The faster and much improved version, PaPaRa 2.4/2.5, is described in the following technical report:
S.A. Berger, A. Stamatakis:
"PaPaRa 2.0: A Vectorized Algorithm for Probabilistic Phylogeny-Aware Alignment Extension",
Heidelberg, Institute for Theoretical Studies, Exelixis-RRDR-2012-5, March 2012.
which is also available as PDF.
We recommend using the current version (see above). If you need backwards compatibility, however, see here:
Also, see the PaPaRa GitHub repository.