# HG changeset patch
# User diodupima
# Date 1626362364 0
# Node ID d35be2be341fbdbf98e979fe3374424ed4b1d0d5
# Parent d0a7d282b1b9d07f29f83d81b0ba810b60009718
"planemo upload commit 8f9b7580dc80c99bc735ea899819ff1d109de311-dirty"

diff -r d0a7d282b1b9 -r d35be2be341f macros.xml
--- a/macros.xml Thu Jul 15 11:51:11 2021 +0000
+++ b/macros.xml Thu Jul 15 15:19:24 2021 +0000
@@ -327,40 +327,41 @@

 **AAIc - Average Amino Acid Identity coast**

-The AAIc is an attempt to have transform the AAI into a measure to compare two proteomes, as annotated.
-Low identity hits will be considered, when they are usually removed.
-On the other hand proteins that have no match at all will be also considered, as having 0 identity.
+The AAIc is an attempt modify the AAI into a measure to compare proteomes for all annotated proteins.
+Low identity hits will be considered, when they are usually removed by the traditional method.
+Proteins that have no match at all will be also considered, as having 0 identity match.
 It provides a way to compare the actual annotation and select organisms, even if more taxonomically distant, with proteins that could be relevant for the function determination in hypothetical proteins, as an example.
-For this the best hit is considered the one with the highest identity.
+For this the best hit is selected by the highest identity.

 **AAIbd - Average Amino Acid Identity blast-diamond**

 The AAIbd, is a implementation of a similar calculation to that of the original
-AAI, but calculated simply one way. It has by default a coverage and identity
-of 50 and 40 respectively, as used also by EzAAI, based in the recent study
+AAI, but calculated only one way. It has by default a coverage and identity
+of 50 and 40 respectively. This values are also used by EzAAI, based in the recent study
 done by Nicholson et. all in 2020. The best hit is then selected by the the
-highest identity The main purpose of this metric is to provide the user with an
-estimate of how close taxonomically that taxid might be. The designation **bd** is
-to distinguish it from the original AAIb, and because of the fact it might be
+highest identity.
+The main purpose of this metric is to provide the user with an
+estimate of how close taxonomically that Taxonomic node might be. The designation **bd** is used
+to distinguish it from the original AAIb. It identifies that the score might be
 produced using either BLAST results or diamond results.

-The following options might be used to calibrate this selection to the user's context:
+The following options might be used to calibrate this selection to the user's preference:

-- Minimum Identity: Minimum Amino Acid Identity, for hit selection for AAIbd calculation
-- Minimum Coverage: Minimum coverage, for hit selection for AAIbd calculation
+- Minimum Identity: Minimum Amino Acid Identity, for hit selection for the AAIbd calculation;
+- Minimum Coverage: Minimum coverage, for hit selection for the AAIbd calculation.

 **HITSPP - Hits Per Protein**

 The score is calculated by the quotient of the count of all the hits all proteins got, by the number of proteins in the query proteome.
-This will help the user understand how represented the proteome’s proteins might be in in that database.
+This will help the user understand how represented the proteome’s proteins might be in that particular database.

 .. class:: warningmark

 **WARNING** Very high values, above 100, might indicate that the taxonomic node very represented in the database.
 Intermediate steps only deal with up to 500 hits per proteins, before best-hit selection.
-As such, a small number of organisms with very high HITSPP can reduce the amount of organisms returned.
+As such, a small number of organisms with very high HITSPP scores can reduce the amount of organisms returned.

 ]]>
diff -r d0a7d282b1b9 -r d35be2be341f tool-data/blastdb.loc.sample
--- a/tool-data/blastdb.loc.sample Thu Jul 15 11:51:11 2021 +0000
+++ b/tool-data/blastdb.loc.sample Thu Jul 15 15:19:24 2021 +0000
@@ -1,5 +1,5 @@
 #This is a sample file distributed with Galaxy that enables tools
-#to use a directory of Samtools indexed sequences data files. You will need
+#to use a directory of blast_databases. You will need
 #to create these data files and then create a coast_taxonomic_filters.loc file
 #similar to this one (store it in this directory) that points to
 #the directories in which those files are stored. The coast_taxonomic_filters.loc
diff -r d0a7d282b1b9 -r d35be2be341f tool-data/coast_taxonomic_filters.loc.sample
--- a/tool-data/coast_taxonomic_filters.loc.sample Thu Jul 15 11:51:11 2021 +0000
+++ b/tool-data/coast_taxonomic_filters.loc.sample Thu Jul 15 15:19:24 2021 +0000
@@ -1,5 +1,5 @@
 #This is a sample file distributed with Galaxy that enables tools
-#to use a directory of Samtools indexed sequences data files. You will need
+#to use a directory of coast_taxonomic_filters. You will need
 #to create these data files and then create a coast_taxonomic_filters.loc file
 #similar to this one (store it in this directory) that points to
 #the directories in which those files are stored. The coast_taxonomic_filters.loc
diff -r d0a7d282b1b9 -r d35be2be341f tool-data/diamond_database.loc.sample
--- a/tool-data/diamond_database.loc.sample Thu Jul 15 11:51:11 2021 +0000
+++ b/tool-data/diamond_database.loc.sample Thu Jul 15 15:19:24 2021 +0000
@@ -1,5 +1,5 @@
 #This is a sample file distributed with Galaxy that enables tools
-#to use a directory of Samtools indexed sequences data files. You will need
+#to use a directory of diamond Databases. You will need
 #to create these data files and then create a coast_taxonomic_filters.loc file
 #similar to this one (store it in this directory) that points to
 #the directories in which those files are stored. The coast_taxonomic_filters.loc
diff -r d0a7d282b1b9 -r d35be2be341f tool_data_table_conf.xml.sample
--- a/tool_data_table_conf.xml.sample Thu Jul 15 11:51:11 2021 +0000
+++ b/tool_data_table_conf.xml.sample Thu Jul 15 15:19:24 2021 +0000
@@ -8,10 +8,6 @@
 value, name, path
-
- value, name, path
-
-
 value, name, db_path