diff --git a/README b/README
new file mode 100644
index 0000000000000000000000000000000000000000..37694ebb6e2c466cf349cdb167595904e6ca027f
--- /dev/null
+++ b/README
@@ -0,0 +1,146 @@
+#INSTALL:
+
+Phybema requires the installation of the following programs:
+
+  - python version >= 3.5
+
+  - numpy
+
+  - Java 1.7 (for TreeCmp)
+
+  Phybema contains the source code of the software TreeCmp.
+  The original source code can be found at https://eti.pg.edu.pl/treecmp/.
+
+To compile Phybema, call:
+
+  make
+
+#TEST:
+
+To run the tests, call:
+
+  make test
+
+#CLEAN UP:
+
+  make clean
+
+#RUN:
+
+Run Phybema with:
+
+  phybema.py [-h]
+             [--tools {andi,mash,...}
+             [{andi,mash,...} ...]]
+             [-o {pdf,png,svg,none}] [-d [metric [metric ...]]] [-m]
+             datasetpaths [datasetpaths ...]
+
+  positional arguments:
+    datasetpaths          Specify a list of paths to directories which
+                          contain a datafile and optionally a reference
+                          tree. A datafile must be in
+                          FASTA format (.fasta,.faa,.fna,.fasta.gz),
+                          a reference tree must be
+                          in newick format (.nh,.tre).
+                          All genomes must be contained in one FASTA file.
+                          For datafiles with a reference tree the
+                          resulting NJ-tree (derived from the distances
+                          delivered by the corresponding tool) will be
+                          compared to the reference tree.
+                          The genome names will be cut after 10 characters.
+                          The reference tree must contain exactly
+                          the resulting truncated genome names.
+                          For datafiles without reference and if at least
+                          two tools have been specified, the resulting
+                          NJ-trees are compared against each other.
+
+  optional arguments:
+    -h, --help            show this help message and exit
+    --tools {andi,mash,...}
+            [{andi,mash,...} ...]
+                          Specify the tools to evaluate.
+                          If this is the last option, then use -- before
+                          specifying any data set.
+                          default: use all tools.
+    -o {pdf,png,svg,none}, --out_treefile_suffix {pdf,png,svg,none}
+                          Specify suffix of output files
+                          with the NJ-trees computed from the distance
+                          matrices delivered by each tool. The argument
+                          "none" means that no graphics file is
+                          generated. This is useful for remote sessions,
+                          where graphics generation does not work
+                          anyway, as the ETE toolkit, which generates the
+                          graphics output, requires Qt and an X-server.
+                          default: pdf
+    -d [metric [metric ...]], --dist [metric [metric ...]]
+                          Specify the distance metrics to use for
+                          comparing the resulting NJ-trees with the
+                          reference tree.
+                          More than one metric is possible.
+                          Possible metrics are:
+                            ms   Matching Split metric
+                            pd   Path Difference metric
+                            qt   Quartet metric
+                            nrf  normalised Robinson-Foulds distance
+                            rf   Robinson-Foulds distance
+                          ms, pd, qt and rf are computed by TreeCmp,
+                          nrf is computed using the ETE3 toolkit
+                          default: use nrf only
+    -m, --mat             Write the distance matrices originally computed
+                          by the different programs into the folder out.
+
+#ADD A TOOL:
+Technical requirements for distance estimators
+• input: sequences in multi-FASTA format
+• output: distance matrix in PHYLIP format
+• output to file or stdout
+
+Add a new tool as an instance of the class DistanceEstimator in the
+module estimators.py. Add your tool at the point marked "ADD YOUR TOOL HERE".
+Include a tool by defining:
+
+  progname = DistanceEstimator(call,call_options,fixed_out_file_name,
+                               optional_end)
+
+  with:
+
+  progname              variable to which the new instance
+                        of DistanceEstimator is assigned
+
+  call                  path from which the program is called
+
+  call_options          program-specific options
+
+  fixed_out_file_name   if the output is written into a file:
+                        the name of the output file
+
+  optional_end          parameters which have to be
+                        placed at the end of the list of parameters
+                        for calling the program.
+
+Adding progname to the list all_estimators adds the tool to Phybema.
+
+#FOLDER STRUCTURE:
+
+  - src: includes scripts for Phybema
+
+  - src_external: contains the source code of the software TreeCmp.
+    The original source code can be found at https://eti.pg.edu.pl/treecmp/.
+
+  - testdata: provides real and simulated data
+
+  - testsiud: includes test scripts
+
+  - temp: temporary files generated while calling Phybema will be written here
+
+  - out: generated files will be written here
+
+  Makefile
+
+  README
+
+
+
+
+
+
diff --git a/README.md b/README.md
deleted file mode 100644
index db4f60c49c21dc8f4fe3f13cbc9d62deb50e4304..0000000000000000000000000000000000000000
--- a/README.md
+++ /dev/null
@@ -1,3 +0,0 @@
-# Phybema
-
-Sources and Data for the Phylogenetic Benchmarking Project Phybema