| Previous changeset 6:5ddad52360c5 (2021-07-09) Next changeset 8:3fa8cdba1d82 (2021-07-15) |
|
Commit message:
"planemo upload commit 8f9b7580dc80c99bc735ea899819ff1d109de311-dirty" |
|
modified:
coast_report.xml macros.xml |
|
added:
README.rst test-data/aln/10239_2021.06.11_12.34.25.txids test-data/aln/MN908947.3.gb test-data/aln/accession2taxid test-data/aln/my_blast_db.pdb test-data/aln/my_blast_db.phr test-data/aln/my_blast_db.pin test-data/aln/my_blast_db.pog test-data/aln/my_blast_db.pos test-data/aln/my_blast_db.pot test-data/aln/my_blast_db.psq test-data/aln/my_blast_db.ptf test-data/aln/my_blast_db.pto test-data/aln/my_diamond_db.dmnd test-data/aln/protein.faa test-data/aln/test_map.txt |
| diff -r 5ddad52360c5 -r e8de9a44ce72 README.rst --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/README.rst Thu Jul 15 11:45:05 2021 +0000 |
| @@ -0,0 +1,11 @@
+COAST's suite of Galaxy Tools
+________________________________________
+
+COAST is tool designed to identify close proteomes for a user provided query, particulary for virus, using conventional alignment tools.
+The close proteomes are provided at NCBI's taxonomy node level.
+
+This suite includes:
+
+- Tools for COAST Search (BLAST and diamond)
+- Tools for COAST Report
+- PhageCOAST tool for the Phage Toolkit
\ No newline at end of file |
| diff -r 5ddad52360c5 -r e8de9a44ce72 coast_report.xml --- a/coast_report.xml Fri Jul 09 17:49:13 2021 +0000 +++ b/coast_report.xml Thu Jul 15 11:45:05 2021 +0000 |
| @@ -52,9 +52,21 @@
             <output name="coast_results" file="report_h/coast_results.tab" />
         </test>
     </tests>
-    <help><![CDATA[
-        COAST Report
-        Generate COAST reports and outputs based in different parameters.
-    ]]></help>
-    <expand macro="citations"/>
+    <help>
+
+@HYPO_FILTER_WARNING@
+
+COAST Report
+============
+
+Generate COAST reports and outputs based in different parameters, using the tabular alignment output produced by COAST Search jobs.
+
+@GENERAL_DESC@
+@AAI_DESC@
+@OUT_DESC@
+
+    </help>
+    <citations>
+        <expand macro="citations_coast"/>
+    </citations>
 </tool>
\ No newline at end of file |
| diff -r 5ddad52360c5 -r e8de9a44ce72 macros.xml --- a/macros.xml Fri Jul 09 17:49:13 2021 +0000 +++ b/macros.xml Thu Jul 15 11:45:05 2021 +0000 |
| b'@@ -3,17 +3,83 @@\n <xml name="requirements">\n <requirement type="package" version="0.1.2">coast</requirement>\n </xml>\n- <xml name="citations">\n- <citations>\n- <citation type="bibtex">@misc{noauthor_coast_nodate,\n- title = {{COAST} - {Compartive} {Ominc} {Alignment} {Search} {Tool}},\n- url = {https://gitlab.com/coast_tool/COAST},\n- abstract = {Alignment search tool that identifies similar proteomes},\n- language = {en},\n- urldate = {2021-06-22},\n- }\n- </citation>\n- </citations>\n+ <xml name="citations_coast">\n+ <citation type="bibtex">@misc{noauthor_coast_nodate,\n+ title = {{COAST} - {Compartive} {Ominc} {Alignment} {Search} {Tool}},\n+ url = {https://gitlab.com/coast_tool/COAST},\n+ abstract = {Alignment search tool that identifies similar proteomes},\n+ language = {en},\n+ urldate = {2021-06-22},\n+ }\n+ </citation>\n+ </xml>\n+ <xml name="citations_taxonkit">\n+ <citation type="bibtex">@article{shen_taxonkit_2021,\n+ abstract = {The National Center for Biotechnology Information (NCBI) Taxonomy is widely applied in biomedical and ecological studies. Typical demands include querying taxonomy identifier (TaxIds) by taxonomy names, querying complete taxonomic lineages by TaxIds, listing descendants of given TaxIds, and others. However, existed tools are either limited in functionalities or inefficient in terms of runtime. In this work, we present TaxonKit, a command-line toolkit for comprehensive and efficient manipulation of NCBI Taxonomy data. TaxonKit comprises seven core subcommands providing functions, including TaxIds querying, listing, filtering, lineage retrieving and reformatting, lowest common ancestor computation, and TaxIds change tracking. The practical functions, competitive processing performance, scalability with different scales of datasets and good accessibility could facilitate taxonomy data manipulations. TaxonKit provides free access under the permissive MIT license on GitHub, Brewsci, and Bioconda. 
The documents are also available at https://bioinf.shenwei.me/taxonkit/.},\n+ author = {Shen, Wei and Ren, Hong},\n+ doi = {10.1016/j.jgg.2021.03.006},\n+ file = {ScienceDirect Snapshot:/home/dm/Zotero/storage/Q3KYT6QS/S1673852721000837.html:text/html},\n+ issn = {1673-8527},\n+ journal = {Journal of Genetics and Genomics},\n+ keywords = {Lineage; NCBI Taxonomy; TaxId; TaxId changelog; TaxonKit},\n+ language = {en},\n+ month = apr,\n+ shorttitle = {{TaxonKit}},\n+ title = {{TaxonKit}: {A} practical and efficient {NCBI} taxonomy toolkit},\n+ url = {https://www.sciencedirect.com/science/article/pii/S1673852721000837},\n+ urldate = {2021-06-21},\n+ year = {2021}\n+ }\n+ </citation>\n+ </xml>\n+ <xml name="citations_diamond">\n+ <citation type="bibtex">@article{buchfink_sensitive_2021,\n+ title = {Sensitive protein alignments at tree-of-life scale using {DIAMOND}},\n+ volume = {18},\n+ issn = {1548-7091, 1548-7105},\n+ url = {http://www.nature.com/articles/s41592-021-01101-x},\n+ doi = {10.1038/s41592-021-01101-x},\n+ abstract = {Abstract\n+ We are at the beginning of a genomic revolution in which all known species are planned to be sequenced. Accessing such data for comparative analyses is crucial in this new age of data-driven biology. 
Here, we introduce an improved version of DIAMOND that greatly exceeds previous search performances and harnesses supercomputing to perform tree-of-life scale protein alignments in hours, while match'..b'___________\n+\n+**AAIc - Average Amino Acid Identity coast**\n+\n+The AAIc is an attempt to have transform the AAI into a measure to compare two proteomes, as annotated.\n+Low identity hits will be considered, when they are usually removed.\n+On the other hand proteins that have no match at all will be also considered, as having 0 identity.\n+It provides a way to compare the actual annotation and select organisms, even if more taxonomically distant, with proteins that could be\n+relevant for the function determination in hypothetical proteins, as an example.\n+For this the best hit is considered the one with the highest identity.\n+\n+**AAIbd - Average Amino Acid Identity blast-diamond**\n+\n+The AAIbd, is a implementation of a similar calculation to that of the original\n+AAI, but calculated simply one way. It has by default a coverage and identity\n+of 50 and 40 respectively, as used also by EzAAI, based in the recent study\n+done by Nicholson et. all in 2020. The best hit is then selected by the the\n+highest identity The main purpose of this metric is to provide the user with an\n+estimate of how close taxonomically that taxid might be. 
The designation **bd** is\n+to distinguish it from the original AAIb, and because of the fact it might be\n+produced using either BLAST results or diamond results.\n+\n+The following options might be used to calibrate this selection to the user\'s context:\n+\n+- Minimum Identity: Minimum Amino Acid Identity, for hit selection for AAIbd calculation\n+- Minimum Coverage: Minimum coverage, for hit selection for AAIbd calculation\n+\n+**HITSPP - Hits Per Protein**\n+\n+The score is calculated by the quotient of the count of all the hits all proteins got, by the number of proteins in the query\n+proteome.\n+This will help the user understand how represented the proteome\xe2\x80\x99s proteins might be in in that database.\n+\n+.. class:: warningmark\n+\n+**WARNING** Very high values, above 100, might indicate that the taxonomic node very represented in the database.\n+Intermediate steps only deal with up to 500 hits per proteins, before best-hit selection.\n+As such, a small number of organisms with very high HITSPP can reduce the amount of organisms returned.\n+\n+ ]]></token>\n+ <token name="@OUT_DESC@"><![CDATA[\n+\n+Outputs\n+_______\n+\n+**Batch alignment results** This is a non-optional output. It contains the total alignment search results for all proteins in the proteome. This can also be used to generated new outputs from the COAST Report tool, using different parameters.\n+\n+**Summarized report** Is an HTML document that contains a list of filtered results ordered by AAIc. This report includes an heatmap visualization for protein identities.\n+It also contains metadata for the COAST job.\n+\n+**Best-hits table** Tabular file with all the individual selected best-hits for each protein in the proteome. These are hits selected for the AAIc calculation.\n+\n+**Results table** Tabular file with aggregated metrics for each proteome match. 
Aggregated for TAXID.\n+\n+ ]]></token>\n+ <token name="@TAX_FILTER_WARNING@"><![CDATA[\n+\n+Taxonomic Filtering\n+___________________\n+\n+Taxonomic based filtering is present in both BLAST and diamond. It is **THE** key for short COAST run times in large databases.\n+\n+Most organisms in a database, like nr or Trembl ,are not useful in the close proteomes identification process.\n+When users try to identify similar viruses, the bacteria and eukaryotes in the same database will only slow the search down.\n+You should determine how wide you desire the search to be and identify the corresponding TAXID node.\n+Some of these filters are provided along with this tool.\n+\n+ ]]></token>\n+ <token name="@HYPO_FILTER_WARNING@"><![CDATA[\n+.. class:: warningmark\n+\n+**WARNING** Hypothetical protein filtering might lead to worse results. Should only be used when few of the proteins have corresponding best-hits and the database might lack poorly studied proteins.\n+\n+ ]]></token>\n+\n </macros>\n\\ No newline at end of file\n' |
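The three metrics described in the `@AAI_DESC@` help text above can be illustrated with a minimal Python sketch. This is not COAST's actual code: the `Hit` record, the function names, and the choice to average AAIbd over only those proteins that keep at least one qualifying hit are assumptions made for illustration. The hits are assumed to belong to a single TAXID.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical alignment record for one query protein against one TAXID."""
    query: str       # query protein ID
    identity: float  # percent identity, 0-100
    coverage: float  # percent coverage, 0-100


def best_identity_per_protein(hits):
    """Return the highest identity seen for each query protein."""
    best = defaultdict(float)
    for h in hits:
        best[h.query] = max(best[h.query], h.identity)
    return best


def aaic(hits, proteome):
    """AAIc: mean best-hit identity; proteins with no hit count as 0."""
    best = best_identity_per_protein(hits)
    return sum(best.get(p, 0.0) for p in proteome) / len(proteome)


def aaibd(hits, min_identity=40.0, min_coverage=50.0):
    """AAIbd: one-way mean of best hits that pass both thresholds.

    Assumption: the mean is taken over proteins with a qualifying hit.
    """
    kept = [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]
    best = best_identity_per_protein(kept)
    return sum(best.values()) / len(best) if best else 0.0


def hitspp(hits, proteome):
    """HITSPP: total number of hits divided by the number of query proteins."""
    return len(hits) / len(proteome)


# Toy data: four query proteins, four hits against one TAXID.
proteome = ["p1", "p2", "p3", "p4"]
hits = [
    Hit("p1", 90.0, 80.0),
    Hit("p1", 60.0, 90.0),
    Hit("p2", 30.0, 70.0),  # below the identity threshold for AAIbd
    Hit("p3", 45.0, 20.0),  # below the coverage threshold for AAIbd
]

print(aaic(hits, proteome))    # (90 + 30 + 45 + 0) / 4
print(aaibd(hits))             # only p1 keeps a qualifying hit
print(hitspp(hits, proteome))  # 4 hits / 4 proteins
```

In this toy case AAIc is pulled down by the low-identity and missing hits, while AAIbd reflects only the well-aligned protein, which matches the distinction the help text draws between the two metrics.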
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/10239_2021.06.11_12.34.25.txids --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/aln/10239_2021.06.11_12.34.25.txids Thu Jul 15 11:45:05 2021 +0000 |
| b'@@ -0,0 +1,212396 @@\n+46014\n+174676\n+244589\n+244590\n+459290\n+459291\n+693628\n+2032563\n+256994\n+2032573\n+248496\n+1959008\n+1959009\n+2491899\n+2508184\n+1298531\n+2655037\n+693627\n+693629\n+10484\n+654920\n+265522\n+74492\n+96779\n+160841\n+219164\n+993492\n+1329648\n+1428464\n+1623311\n+419435\n+36344\n+39640\n+47223\n+53988\n+654919\n+116759\n+199362\n+257816\n+218283\n+256130\n+359323\n+360638\n+449720\n+452647\n+452648\n+452649\n+523797\n+1823751\n+1823752\n+1823753\n+1823754\n+1836595\n+2487141\n+2664133\n+451706\n+523798\n+523799\n+574230\n+2083300\n+2169782\n+191766\n+12340\n+12347\n+12366\n+12371\n+12374\n+12375\n+12386\n+12388\n+12392\n+12403\n+12404\n+12405\n+12406\n+12408\n+12409\n+12412\n+12420\n+12424\n+12425\n+12427\n+12428\n+28368\n+31760\n+33768\n+33769\n+38018\n+39425\n+39943\n+41669\n+42171\n+42172\n+42173\n+45331\n+45332\n+48224\n+53480\n+54392\n+57476\n+60457\n+63117\n+65388\n+73492\n+76262\n+77920\n+86065\n+89551\n+100637\n+100638\n+100639\n+100640\n+105686\n+108916\n+108917\n+108918\n+126970\n+128975\n+129861\n+129862\n+132905\n+137422\n+147128\n+148339\n+156615\n+156616\n+156617\n+156618\n+156619\n+156620\n+156621\n+156622\n+156623\n+156624\n+156625\n+156626\n+156627\n+156653\n+156654\n+156655\n+156656\n+156657\n+156658\n+156659\n+156660\n+156661\n+156662\n+156663\n+156664\n+156665\n+156666\n+156667\n+156668\n+156669\n+156670\n+156671\n+156672\n+156673\n+156674\n+156675\n+156676\n+156677\n+156678\n+156679\n+156680\n+156681\n+156711\n+156712\n+156713\n+156714\n+156715\n+156716\n+156717\n+156718\n+156719\n+156720\n+156721\n+156722\n+156723\n+156724\n+156725\n+156726\n+156727\n+156728\n+156729\n+156730\n+156731\n+156732\n+156733\n+156740\n+156741\n+156742\n+156743\n+156745\n+156746\n+156747\n+156748\n+156749\n+156750\n+156767\n+156768\n+156769\n+156770\n+156771\n+156772\n+156773\n+156774\n+156775\n+156776\n+156777\n+156778\n+156779\n+156780\n+156781\n+156782\n+156783\n+156784\n+156785\n+156786\n+156787\n+156788\n+156789\n+156790\n+
156791\n+156792\n+156797\n+156798\n+156799\n+156800\n+156801\n+156802\n+156803\n+156804\n+156805\n+156806\n+156807\n+156808\n+156809\n+156810\n+156811\n+156812\n+156813\n+156814\n+156815\n+156816\n+156817\n+156818\n+156819\n+156820\n+156821\n+156822\n+156823\n+156824\n+156825\n+156826\n+156827\n+156828\n+156829\n+156830\n+156831\n+156832\n+156833\n+156834\n+156835\n+215796\n+278008\n+306552\n+432371\n+436674\n+445563\n+445564\n+445565\n+707152\n+1168813\n+1168816\n+1168818\n+1168819\n+1168820\n+1168825\n+1168828\n+1168829\n+1168831\n+1168832\n+1168833\n+1168834\n+1168835\n+1168837\n+1168838\n+1168848\n+164125\n+167320\n+167533\n+172666\n+173830\n+173831\n+176100\n+181120\n+181485\n+185368\n+185369\n+187176\n+196194\n+198932\n+210927\n+219292\n+229343\n+229344\n+229345\n+229346\n+239740\n+241652\n+242708\n+262790\n+264484\n+268585\n+268586\n+268587\n+268588\n+268590\n+272473\n+272757\n+279276\n+280702\n+282372\n+282690\n+282691\n+282692\n+282693\n+282698\n+282699\n+282700\n+282701\n+282702\n+282703\n+282704\n+282705\n+282709\n+282710\n+284052\n+292511\n+293378\n+293711\n+293712\n+293713\n+294363\n+319711\n+319712\n+319713\n+334523\n+338345\n+347331\n+350104\n+356350\n+357204\n+360048\n+360049\n+360050\n+362866\n+364251\n+364252\n+364253\n+364254\n+370564\n+370565\n+374421\n+375032\n+375033\n+375034\n+375035\n+375036\n+375037\n+375038\n+375039\n+375040\n+375041\n+375042\n+375043\n+375044\n+375045\n+375046\n+375047\n+375048\n+375049\n+375050\n+375051\n+375052\n+375053\n+375054\n+375055\n+375056\n+375057\n+375058\n+375059\n+382277\n+382278\n+382279\n+382280\n+382281\n+382282\n+376611\n+382263\n+382264\n+382265\n+382266\n+382267\n+382268\n+382269\n+382271\n+382272\n+382274\n+387086\n+370566\n+370567\n+370568\n+370569\n+387087\n+370559\n+370560\n+370561\n+370562\n+370563\n+387088\n+370556\n+370557\n+370558\n+405001\n+409026\n+417289\n+417290\n+430512\n+430513\n+430514\n+430515\n+432197\n+432199\n+432202\n+432203\n+435305\n+435637\n+440576\n+445701\n+447796\n+447800\n+4493
99\n+458420\n+458421\n+458422\n+458423\n+458424\n+458425\n+458426\n+458427\n+458428\n+458429\n+458430\n+458431\n+458432\n+458433\n+458434\n+458435\n+458436\n+458437\n+458438\n+458439\n+458440\n+458441\n+458442\n+458443\n+458444\n+458445\n+458446\n+458447\n+458448\n+458449\n+458450\n+458451\n+458452\n+458453\n+458454\n+458455\n+458456\n+458457\n+458458\n+458459\n+458460\n+458461\n+458462\n+458463\n+458464\n+458465\n+458466\n+458467\n+458468\n+458469\n+458470\n+458471\n+458472\n+458473\n+458474\n+458'..b'\n+1245909\n+1732175\n+1895333\n+1983545\n+1983546\n+1983547\n+1983548\n+1983549\n+1983550\n+1983551\n+1983552\n+2493121\n+2730617\n+2730618\n+2730619\n+10479\n+10480\n+10481\n+1805492\n+2730621\n+342409\n+46615\n+92652\n+654913\n+2778543\n+1675544\n+2488332\n+2133792\n+2133793\n+2133794\n+2133795\n+2133796\n+10469\n+134999\n+259389\n+384357\n+446772\n+1260293\n+1260294\n+1920699\n+1920700\n+1927815\n+2202138\n+2282487\n+2282488\n+2282489\n+2282490\n+2282491\n+2282492\n+2282493\n+2282494\n+2282495\n+2282496\n+2282497\n+2282498\n+2282499\n+2282500\n+2282501\n+2282502\n+2282503\n+2282504\n+2282505\n+2282506\n+2282507\n+2282508\n+2282509\n+2282510\n+2282511\n+2282512\n+2282513\n+2282514\n+2282515\n+2282516\n+2282517\n+2282519\n+2282520\n+2282521\n+2282522\n+2681587\n+2748748\n+10449\n+10454\n+31506\n+10455\n+10456\n+10468\n+28288\n+46242\n+51313\n+148363\n+566972\n+991878\n+1077220\n+1569367\n+58094\n+59376\n+65124\n+70600\n+74320\n+74660\n+78219\n+101850\n+161494\n+1234615\n+2126611\n+166921\n+204440\n+559170\n+207830\n+208013\n+208973\n+1367204\n+224399\n+262177\n+268591\n+271108\n+172039\n+640862\n+1208064\n+307456\n+46015\n+1136025\n+80366\n+654904\n+307460\n+307468\n+379891\n+709481\n+307467\n+320432\n+1262541\n+332054\n+447897\n+490711\n+10450\n+10453\n+10471\n+28290\n+31507\n+36357\n+38012\n+38765\n+44564\n+49081\n+65801\n+78220\n+91234\n+191492\n+212102\n+260683\n+262167\n+262169\n+262170\n+262171\n+262173\n+262176\n+262179\n+265022\n+271106\n+271109\n+271110\n+2
71111\n+271112\n+271113\n+271114\n+271115\n+271116\n+276758\n+289183\n+307455\n+307457\n+307458\n+307459\n+307461\n+307462\n+307463\n+307464\n+307465\n+307466\n+307469\n+307470\n+307471\n+307472\n+307473\n+307475\n+307476\n+309368\n+309833\n+328433\n+351418\n+351419\n+388634\n+436110\n+480409\n+512493\n+513657\n+521523\n+533260\n+542343\n+554817\n+566270\n+568578\n+571205\n+586265\n+654964\n+673455\n+925754\n+946143\n+980897\n+1070316\n+1117131\n+1117132\n+1136026\n+1136027\n+1219875\n+1346819\n+1346821\n+1346824\n+1346829\n+1347909\n+1347910\n+1347911\n+1592576\n+1638618\n+1708404\n+1776679\n+1810940\n+1881632\n+2306063\n+2321387\n+2520509\n+2571060\n+2571265\n+2591332\n+2594175\n+2605774\n+2605775\n+2689371\n+2737031\n+1046267\n+1070315\n+1207438\n+1592335\n+1242863\n+1307954\n+1307956\n+1307957\n+1563660\n+1580580\n+1642929\n+1675866\n+1684825\n+1850906\n+1906244\n+1962501\n+1987479\n+2083176\n+2304025\n+2315721\n+2560521\n+1367203\n+2560562\n+134394\n+2560642\n+1675865\n+2824179\n+10462\n+1916701\n+10464\n+28289\n+654905\n+35254\n+51677\n+52412\n+56947\n+364745\n+98383\n+115813\n+166056\n+170617\n+192584\n+262175\n+283675\n+307444\n+307454\n+10463\n+45440\n+157825\n+252584\n+262168\n+262172\n+262174\n+270494\n+271101\n+271102\n+271103\n+271104\n+271105\n+307442\n+307443\n+307445\n+307448\n+307450\n+307451\n+307452\n+307453\n+359943\n+359944\n+915443\n+1346825\n+1347908\n+1352289\n+1675863\n+2078988\n+2760664\n+359919\n+362830\n+10465\n+489830\n+1675862\n+1750712\n+1986289\n+1986291\n+1986290\n+2072024\n+2169745\n+36355\n+2169746\n+1136024\n+111874\n+249151\n+654906\n+204507\n+130556\n+645993\n+1285594\n+2747309\n+379529\n+1285595\n+2747498\n+523909\n+1110704\n+1217568\n+1487700\n+1578827\n+1654582\n+2053981\n+2072209\n+2486603\n+2707358\n+2742594\n+2743613\n+2778529\n+2778530\n+2815508\n+92521\n+432587\n+2057187\n+29250\n+1128424\n+1546257\n+1529056\n+2509616\n+12475\n+10421\n+10422\n+10423\n+10424\n+10425\n+10426\n+10427\n+10428\n+31762\n+31763\n+31764\n+261991
\n+261992\n+261993\n+261994\n+261995\n+261996\n+510659\n+510660\n+510661\n+510662\n+510663\n+510664\n+510665\n+510666\n+510667\n+510668\n+510669\n+510670\n+510671\n+510672\n+510673\n+510674\n+510675\n+510676\n+510677\n+510678\n+510679\n+510680\n+510681\n+510682\n+510834\n+510835\n+510836\n+510837\n+510838\n+510839\n+510840\n+510841\n+510842\n+510843\n+510844\n+510845\n+510846\n+510847\n+510848\n+510849\n+510850\n+510851\n+510852\n+510853\n+510854\n+510855\n+510856\n+510857\n+510858\n+510859\n+510860\n+510861\n+510862\n+510863\n+510864\n+510865\n+510866\n+510867\n+510868\n+510869\n+510870\n+510871\n+510872\n+510873\n+510874\n+510875\n+510876\n+510877\n+510878\n+510879\n+510880\n+510881\n+510882\n+510883\n+510884\n+510885\n+510886\n+510887\n+510888\n+510889\n+510890\n+510891\n+510892\n+510893\n+510894\n+510895\n+510896\n+510897\n+510898\n+510899\n+510900\n+510901\n+510902\n+510903\n+2691025\n+2364132\n+2419666\n+2596899\n+2596900\n+2596901\n+2596902\n+2751480\n+2766951\n+2767006\n' |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/MN908947.3.gb --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/aln/MN908947.3.gb Thu Jul 15 11:45:05 2021 +0000 |
| b'@@ -0,0 +1,798 @@\n+LOCUS MN908947 29903 bp ss-RNA linear VRL 18-MAR-2020\n+DEFINITION Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1,\n+ complete genome.\n+ACCESSION MN908947\n+VERSION MN908947.3\n+KEYWORDS .\n+SOURCE Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)\n+ ORGANISM Severe acute respiratory syndrome coronavirus 2\n+ Viruses; Riboviria; Orthornavirae; Pisuviricota; Pisoniviricetes;\n+ Nidovirales; Cornidovirineae; Coronaviridae; Orthocoronavirinae;\n+ Betacoronavirus; Sarbecovirus.\n+REFERENCE 1 (bases 1 to 29903)\n+ AUTHORS Wu,F., Zhao,S., Yu,B., Chen,Y.M., Wang,W., Song,Z.G., Hu,Y.,\n+ Tao,Z.W., Tian,J.H., Pei,Y.Y., Yuan,M.L., Zhang,Y.L., Dai,F.H.,\n+ Liu,Y., Wang,Q.M., Zheng,J.J., Xu,L., Holmes,E.C. and Zhang,Y.Z.\n+ TITLE A new coronavirus associated with human respiratory disease in\n+ China\n+ JOURNAL Nature 579 (7798), 265-269 (2020)\n+ PUBMED 32015508\n+REFERENCE 2 (bases 1 to 29903)\n+ AUTHORS Wu,F., Zhao,S., Yu,B., Chen,Y.-M., Wang,W., Hu,Y., Song,Z.-G.,\n+ Tao,Z.-W., Tian,J.-H., Pei,Y.-Y., Yuan,M.L., Zhang,Y.-L.,\n+ Dai,F.-H., Liu,Y., Wang,Q.-M., Zheng,J.-J., Xu,L., Holmes,E.C. and\n+ Zhang,Y.-Z.\n+ TITLE Direct Submission\n+ JOURNAL Submitted (05-JAN-2020) Shanghai Public Health Clinical Center &\n+ School of Public Health, Fudan University, Shanghai, China\n+COMMENT On Jan 17, 2020 this sequence version replaced MN908947.2.\n+ \n+ ##Assembly-Data-START##\n+ Assembly Method :: Megahit v. 
V1.1.3\n+ Sequencing Technology :: Illumina\n+ ##Assembly-Data-END##\n+FEATURES Location/Qualifiers\n+ source 1..29903\n+ /organism="Severe acute respiratory syndrome coronavirus\n+ 2"\n+ /mol_type="genomic RNA"\n+ /isolate="Wuhan-Hu-1"\n+ /host="Homo sapiens"\n+ /db_xref="taxon:2697049"\n+ /country="China"\n+ /collection_date="Dec-2019"\n+ 5\'UTR 1..265\n+ gene 266..21555\n+ /gene="orf1ab"\n+ CDS join(266..13468,13468..21555)\n+ /gene="orf1ab"\n+ /ribosomal_slippage\n+ /note="pp1ab; translated by -1 ribosomal frameshift"\n+ /codon_start=1\n+ /product="orf1ab polyprotein"\n+ /protein_id="QHD43415.1"\n+ /translation="MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQ\n+ HLKDGTCGLVEVEKGVLPQLEQPYVFIKRSDARTAPHGHVMVELVAELEGIQYGRSGE\n+ TLGVLVPHVGEIPVAYRKVLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQEN\n+ WNTKHSSGVTRELMRELNGGAYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQ\n+ LDFIDTKRGVYCCREHEHEIAWYTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFP\n+ LNSIIKTIQPRVEKKKLDGFMGRIRSVYPVASPNECNQMCLSTLMKCDHCGETSWQTG\n+ DFVKATCEFCGTENLTKEGATTCGYLPQNAVVKIYCPACHNSEVGPEHSLAEYHNESG\n+ LKTILRKGGRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEGLNDNL\n+ LEILQKEKVNINIVGDFKLNEEIAIILASFSASTSAFVETVKGLDYKAFKQIVESCGN\n+ FKVTKGKAKKGAWNIGEQKSILSPLYAFASEAARVVRSIFSRTLETAQNSVRVLQKAA\n+ ITILDGISQYSLRLIDAMMFTSDLATNNLVVMAYITGGVVQLTSQWLTNIFGTVYEKL\n+ KPVLDWLEEKFKEGVEFLRDGWEIVKFISTCACEIVGGQIVTCAKEIKESVQTFFKLV\n+ NKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKCVKSREETGLLMPLKAPKEII\n+ FLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVGTPVCINGLMLLEIKDTEK\n+ YCALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVNITFELDERIDKVLNEK\n+ CSAYTVELGTEVNEFACVVADAVIKTLQPVSELLTPLGIDLDEWSMATYYLFDESGEF\n+ KLASHMYCSFYPPDEDEEEGDCEEEEFEPS'..b'gtggct cagctacttc attgcttctt\n+ 26821 tcagactgtt tgcgcgtacg cgttccatgt ggtcattcaa tccagaaact aacattcttc\n+ 26881 tcaacgtgcc actccatggc actattctga ccagaccgct tctagaaagt gaactcgtaa\n+ 26941 tcggagctgt gatccttcgt ggacatcttc gtattgctgg acaccatcta ggacgctgtg\n+ 27001 acatcaagga cctgcctaaa gaaatcactg ttgctacatc acgaacgctt tcttattaca\n+ 27061 aattgggagc ttcgcagcgt 
gtagcaggtg actcaggttt tgctgcatac agtcgctaca\n+ 27121 ggattggcaa ctataaatta aacacagacc attccagtag cagtgacaat attgctttgc\n+ 27181 ttgtacagta agtgacaaca gatgtttcat ctcgttgact ttcaggttac tatagcagag\n+ 27241 atattactaa ttattatgag gacttttaaa gtttccattt ggaatcttga ttacatcata\n+ 27301 aacctcataa ttaaaaattt atctaagtca ctaactgaga ataaatattc tcaattagat\n+ 27361 gaagagcaac caatggagat tgattaaacg aacatgaaaa ttattctttt cttggcactg\n+ 27421 ataacactcg ctacttgtga gctttatcac taccaagagt gtgttagagg tacaacagta\n+ 27481 cttttaaaag aaccttgctc ttctggaaca tacgagggca attcaccatt tcatcctcta\n+ 27541 gctgataaca aatttgcact gacttgcttt agcactcaat ttgcttttgc ttgtcctgac\n+ 27601 ggcgtaaaac acgtctatca gttacgtgcc agatcagttt cacctaaact gttcatcaga\n+ 27661 caagaggaag ttcaagaact ttactctcca atttttctta ttgttgcggc aatagtgttt\n+ 27721 ataacacttt gcttcacact caaaagaaag acagaatgat tgaactttca ttaattgact\n+ 27781 tctatttgtg ctttttagcc tttctgctat tccttgtttt aattatgctt attatctttt\n+ 27841 ggttctcact tgaactgcaa gatcataatg aaacttgtca cgcctaaacg aacatgaaat\n+ 27901 ttcttgtttt cttaggaatc atcacaactg tagctgcatt tcaccaagaa tgtagtttac\n+ 27961 agtcatgtac tcaacatcaa ccatatgtag ttgatgaccc gtgtcctatt cacttctatt\n+ 28021 ctaaatggta tattagagta ggagctagaa aatcagcacc tttaattgaa ttgtgcgtgg\n+ 28081 atgaggctgg ttctaaatca cccattcagt acatcgatat cggtaattat acagtttcct\n+ 28141 gtttaccttt tacaattaat tgccaggaac ctaaattggg tagtcttgta gtgcgttgtt\n+ 28201 cgttctatga agacttttta gagtatcatg acgttcgtgt tgttttagat ttcatctaaa\n+ 28261 cgaacaaact aaaatgtctg ataatggacc ccaaaatcag cgaaatgcac cccgcattac\n+ 28321 gtttggtgga ccctcagatt caactggcag taaccagaat ggagaacgca gtggggcgcg\n+ 28381 atcaaaacaa cgtcggcccc aaggtttacc caataatact gcgtcttggt tcaccgctct\n+ 28441 cactcaacat ggcaaggaag accttaaatt ccctcgagga caaggcgttc caattaacac\n+ 28501 caatagcagt ccagatgacc aaattggcta ctaccgaaga gctaccagac gaattcgtgg\n+ 28561 tggtgacggt aaaatgaaag atctcagtcc aagatggtat ttctactacc taggaactgg\n+ 28621 gccagaagct ggacttccct atggtgctaa caaagacggc atcatatggg ttgcaactga\n+ 
28681 gggagccttg aatacaccaa aagatcacat tggcacccgc aatcctgcta acaatgctgc\n+ 28741 aatcgtgcta caacttcctc aaggaacaac attgccaaaa ggcttctacg cagaagggag\n+ 28801 cagaggcggc agtcaagcct cttctcgttc ctcatcacgt agtcgcaaca gttcaagaaa\n+ 28861 ttcaactcca ggcagcagta ggggaacttc tcctgctaga atggctggca atggcggtga\n+ 28921 tgctgctctt gctttgctgc tgcttgacag attgaaccag cttgagagca aaatgtctgg\n+ 28981 taaaggccaa caacaacaag gccaaactgt cactaagaaa tctgctgctg aggcttctaa\n+ 29041 gaagcctcgg caaaaacgta ctgccactaa agcatacaat gtaacacaag ctttcggcag\n+ 29101 acgtggtcca gaacaaaccc aaggaaattt tggggaccag gaactaatca gacaaggaac\n+ 29161 tgattacaaa cattggccgc aaattgcaca atttgccccc agcgcttcag cgttcttcgg\n+ 29221 aatgtcgcgc attggcatgg aagtcacacc ttcgggaacg tggttgacct acacaggtgc\n+ 29281 catcaaattg gatgacaaag atccaaattt caaagatcaa gtcattttgc tgaataagca\n+ 29341 tattgacgca tacaaaacat tcccaccaac agagcctaaa aaggacaaaa agaagaaggc\n+ 29401 tgatgaaact caagccttac cgcagagaca gaagaaacag caaactgtga ctcttcttcc\n+ 29461 tgctgcagat ttggatgatt tctccaaaca attgcaacaa tccatgagca gtgctgactc\n+ 29521 aactcaggcc taaactcatg cagaccacac aaggcagatg ggctatataa acgttttcgc\n+ 29581 ttttccgttt acgatatata gtctactctt gtgcagaatg aattctcgta actacatagc\n+ 29641 acaagtagat gtagttaact ttaatctcac atagcaatct ttaatcagtg tgtaacatta\n+ 29701 gggaggactt gaaagagcca ccacattttc accgaggcca cgcggagtac gatcgagtgt\n+ 29761 acagtgaaca atgctaggga gagctgccta tatggaagag ccctaatgtg taaaattaat\n+ 29821 tttagtagtg ctatccccat gtgattttaa tagcttctta ggagaatgac aaaaaaaaaa\n+ 29881 aaaaaaaaaa aaaaaaaaaa aaa\n+//\n+\n' |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/accession2taxid --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/aln/accession2taxid Thu Jul 15 11:45:05 2021 +0000 |
| @@ -0,0 +1,11 @@
+accession accession.version taxid gi
+QHD43415 QHD43415.1 2697049
+QHD43416 QHD43416.1 2697049
+QHD43417 QHD43417.1 2697049
+QHD43418 QHD43418.1 2697049
+QHD43419 QHD43419.1 2697049
+QHD43420 QHD43420.1 2697049
+QHD43421 QHD43421.1 2697049
+QHD43422 QHD43422.1 2697049
+QHD43423 QHD43423.2 2697049
+QHI42199 QHI42199.1 2697049
\ No newline at end of file |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.pdb |
| Binary file test-data/aln/my_blast_db.pdb has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.phr |
| Binary file test-data/aln/my_blast_db.phr has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.pin |
| Binary file test-data/aln/my_blast_db.pin has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.pog |
| Binary file test-data/aln/my_blast_db.pog has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.pos |
| Binary file test-data/aln/my_blast_db.pos has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.pot |
| Binary file test-data/aln/my_blast_db.pot has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.psq |
| Binary file test-data/aln/my_blast_db.psq has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.ptf |
| Binary file test-data/aln/my_blast_db.ptf has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_blast_db.pto |
| Binary file test-data/aln/my_blast_db.pto has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/my_diamond_db.dmnd |
| Binary file test-data/aln/my_diamond_db.dmnd has changed |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/protein.faa --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/aln/protein.faa Thu Jul 15 11:45:05 2021 +0000 |
| b'@@ -0,0 +1,178 @@\n+>QHD43415.1 orf1ab polyprotein\n+MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGV\n+LPQLEQPYVFIKRSDARTAPHGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRK\n+VLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQENWNTKHSSGVTRELMRELNGG\n+AYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQLDFIDTKRGVYCCREHEHEIAW\n+YTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFPLNSIIKTIQPRVEKKKLDGFMGRI\n+RSVYPVASPNECNQMCLSTLMKCDHCGETSWQTGDFVKATCEFCGTENLTKEGATTCGYL\n+PQNAVVKIYCPACHNSEVGPEHSLAEYHNESGLKTILRKGGRTIAFGGCVFSYVGCHNKC\n+AYWVPRASANIGCNHTGVVGEGSEGLNDNLLEILQKEKVNINIVGDFKLNEEIAIILASF\n+SASTSAFVETVKGLDYKAFKQIVESCGNFKVTKGKAKKGAWNIGEQKSILSPLYAFASEA\n+ARVVRSIFSRTLETAQNSVRVLQKAAITILDGISQYSLRLIDAMMFTSDLATNNLVVMAY\n+ITGGVVQLTSQWLTNIFGTVYEKLKPVLDWLEEKFKEGVEFLRDGWEIVKFISTCACEIV\n+GGQIVTCAKEIKESVQTFFKLVNKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKC\n+VKSREETGLLMPLKAPKEIIFLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVG\n+TPVCINGLMLLEIKDTEKYCALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVN\n+ITFELDERIDKVLNEKCSAYTVELGTEVNEFACVVADAVIKTLQPVSELLTPLGIDLDEW\n+SMATYYLFDESGEFKLASHMYCSFYPPDEDEEEGDCEEEEFEPSTQYEYGTEDDYQGKPL\n+EFGATSAALQPEEEQEEDWLDDDSQQTVGQQDGSEDNQTTTIQTIVEVQPQLEMELTPVV\n+QTIEVNSFSGYLKLTDNVYIKNADIVEEAKKVKPTVVVNAANVYLKHGGGVAGALNKATN\n+NAMQVESDDYIATNGPLKVGGSCVLSGHNLAKHCLHVVGPNVNKGEDIQLLKSAYENFNQ\n+HEVLLAPLLSAGIFGADPIHSLRVCVDTVRTNVYLAVFDKNLYDKLVSSFLEMKSEKQVE\n+QKIAEIPKEEVKPFITESKPSVEQRKQDDKKIKACVEEVTTTLEETKFLTENLLLYIDIN\n+GNLHPDSATLVSDIDITFLKKDAPYIVGDVVQEGVLTAVVIPTKKAGGTTEMLAKALRKV\n+PTDNYITTYPGQGLNGYTVEEAKTVLKKCKSAFYILPSIISNEKQEILGTVSWNLREMLA\n+HAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDYGARFYFYTSKTTVASLINTLND\n+LNETLVTMPLGYVTHGLNLEEAARYMRSLKVPATVSVSSPDAVTAYNGYLTSSSKTPEEH\n+FIETISLAGSYKDWSYSGQSTQLGIEFLKRGDKSVYYTSNPTTFHLDGEVITFDNLKTLL\n+SLREVRTIKVFTTVDNINLHTQVVDMSMTYGQQFGPTYLDGADVTKIKPHNSHEGKTFYV\n+LPNDDTLRVEAFEYYHTTDPSFLGRYMSALNHTKKWKYPQVNGLTSIKWADNNCYLATAL\n+LTLQQIELKFNPPALQDAYYRARAGEAANFCALILAYCNKTVGELGDVRETMSYLFQHAN\n+LDSCKRVLNVVCKTCGQQQTTLKGVEAVMYMGTLSYEQFKKGVQIPCTCGKQATKYLVQQ\n+ESPFVMMSAPPAQYELKHGTFTCASEYTGNYQCGHYKHITSKETLYCIDGALL
TKSSEYK\n+GPITDVFYKENSYTTTIKPVTYKLDGVVCTEIDPKLDNYYKKDNSYFTEQPIDLVPNQPY\n+PNASFDNFKFVCDNIKFADDLNQLTGYKKPASRELKVTFFPDLNGDVVAIDYKHYTPSFK\n+KGAKLLHKPIVWHVNNATNKATYKPNTWCIRCLWSTKPVETSNSFDVLKSEDAQGMDNLA\n+CEDLKPVSEEVVENPTIQKDVLECNVKTTEVVGDIILKPANNSLKITEEVGHTDLMAAYV\n+DNSSLTIKKPNELSRVLGLKTLATHGLAAVNSVPWDTIANYAKPFLNKVVSTTTNIVTRC\n+LNRVCTNYMPYFFTLLLQLCTFTRSTNSRIKASMPTTIAKNTVKSVGKFCLEASFNYLKS\n+PNFSKLINIIIWFLLLSVCLGSLIYSTAALGVLMSNLGMPSYCTGYREGYLNSTNVTIAT\n+YCTGSIPCSVCLSGLDSLDTYPSLETIQITISSFKWDLTAFGLVAEWFLAYILFTRFFYV\n+LGLAAIMQLFFSYFAVHFISNSWLMWLIINLVQMAPISAMVRMYIFFASFYYVWKSYVHV\n+VDGCNSSTCMMCYKRNRATRVECTTIVNGVRRSFYVYANGGKGFCKLHNWNCVNCDTFCA\n+GSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVTVKNGSIHLYFDKAGQKTYERHSLSH\n+FVNLDNLRANNTKGSLPINVIVFDGKSKCEESSAKSASVYYSQLMCQPILLLDQALVSDV\n+GDSAEVAVKMFDAYVNTFSSTFNVPMEKLKTLVATAEAELAKNVSLDNVLSTFISAARQG\n+FVDSDVETKDVVECLKLSHQSDIEVTGDSCNNYMLTYNKVENMTPRDLGACIDCSARHIN\n+AQVAKSHNIALIWNVKDFMSLSEQLRKQIRSAAKKNNLPFKLTCATTRQVVNVVTTKIAL\n+KGGKIVNNWLKQLIKVTLVFLFVAAIFYLITPVHVMSKHTDFSSEIIGYKAIDGGVTRDI\n+ASTDTCFANKHADFDTWFSQRGGSYTNDKACPLIAAVITREVGFVVPGLPGTILRTTNGD\n+FLHFLPRVFSAVGNICYTPSKLIEYTDFATSACVLAAECTIFKDASGKPVPYCYDTNVLE\n+GSVAYESLRPDTRYVLMDGSIIQFPNTYLEGSVRVVTTFDSEYCRHGTCERSEAGVCVST\n+SGRWVLNNDYYRSLPGVFCGVDAVNLLTNMFTPLIQPIGALDISASIVAGGIVAIVVTCL\n+AYYFMRFRRAFGEYSHVVAFNTLLFLMSFTVLCLTPVYSFLPGVYSVIYLYLTFYLTNDV\n+SFLAHIQWMVMFTPLVPFWITIAYIICISTKHFYWFFSNYLKRRVVFNGVSFSTFEEAAL\n+CTFLLNKEMYLKLRSDVLLPLTQYNRYLALYNKYKYFSGAMDTTSYREAACCHLAKALND\n+FSNSGSDVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVY\n+CPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPK\n+TPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCV\n+SFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDR\n+WFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMN\n+GRTILGSALLEDEFTPFDVVRQCSGVTFQSAVKRTIKGTHHWLLLTILTSLLVLVQSTQW\n+SLFFFLYENAFLPFAMGIIAMSAFAMMFVKHKHAFLCLFLLPSLATVAYFNMVYMPASWV\n+MRIMTWLDMVDTSLSGFKLKDCVMYASAVVLLILMTARTVYDDGARRVWTLMNVLTLVYK\n+VYYGNALDQAISMWALIISVTSNYSGVVTTVMFLARG
IVFMCVEYCPIFFITGNTLQCIM\n+LVYCFLGYFCTCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRY'..b'VLWAHGFELTSM\n+KYFVKIGPERTCCLCDRRATCFSTASDTYACWHHSIGFDYVYNPFMIDVQQWGFTGNLQS\n+NHDLYCQVHGNAHVASCDAIMTRCLAVHECFVKRVDWTIEYPIIGDELKINAACRKVQHM\n+VVKAALLADKFPVLHDIGNPKAIKCVPQADVEWKFYDAQPCSDKAYKIEELFYSYATHSD\n+KFTDGVCLFWNCNVDRYPANSIVCRFDTRVLSNLNLPGCDGGSLYVNKHAFHTPAFDKSA\n+FVNLKQLPFFYYSDSPCESHGKQVVSDIDYVPLKSATCITRCNLGGAVCRHHANEYRLYL\n+DAYNMMISAGFSLWVYKQFDTYNLWNTFTRLQSLENVAFNVVNKGHFDGQQGEVPVSIIN\n+NTVYTKVDGVDVELFENKTTLPVNVAFELWAKRNIKPVPEVKILNNLGVDIAANTVIWDY\n+KRDAPAHISTIGVCSMTDIAKKPTETICAPLTVFFDGRVDGQVDLFRNARNGVLITEGSV\n+KGLQPSVGPKQASLNGVTLIGEAVKTQFNYYKKVDGVVQQLPETYFTQSRNLQEFKPRSQ\n+MEIDFLELAMDEFIERYKLEGYAFEHIVYGDFSHSQLGGLHLLIGLAKRFKESPFELEDF\n+IPMDSTVKNYFITDAQTGSSKCVCSVIDLLLDDFVEIIKSQDLSVVSKVVKVTIDYTEIS\n+FMLWCKDGHVETFYPKLQSSQAWQPGVAMPNLYKMQRMLLEKCDLQNYGDSATLPKGIMM\n+NVAKYTQLCQYLNTLTLAVPYNMRVIHFGAGSDKGVAPGTAVLRQWLPTGTLLVDSDLND\n+FVSDADSTLIGDCATVHTANKWDLIISDMYDPKTKNVTKENDSKEGFFTYICGFIQQKLA\n+LGGSVAIKITEHSWNADLYKLMGHFAWWTAFVTNVNASSSEAFLIGCNYLGKPREQIDGY\n+VMHANYIFWRNTNPIQLSSYSLFDMSKFPLKLRGTAVMSLKEGQINDMILSLLSKGRLII\n+RENNRVVISSDVLVNN\n+>QHD43416.1 surface 
glycoprotein\n+MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFS\n+NVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV\n+NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLE\n+GKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT\n+LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK\n+CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN\n+CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD\n+YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC\n+NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVN\n+FNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP\n+GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSY\n+ECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTI\n+SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQE\n+VFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC\n+LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAM\n+QMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN\n+TLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRA\n+SANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPA\n+ICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDP\n+LQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL\n+QELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD\n+SEPVLKGVKLHYT\n+>QHD43417.1 ORF3a protein\n+MDLFMRIFTIGTVTLKQGEIKDATPSDFVRATATIPIQASLPFGWLIVGVALLAVFQSAS\n+KIITLKKRWQLALSKGVHFVCNLLLLFVTVYSHLLLVAAGLEAPFLYLYALVYFLQSINF\n+VRIIMRLWLCWKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPIS\n+EHDYQIGGYTEKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEP\n+EEHVQIHTIDGSSGVVNPVMEPIYDEPTTTTSVPL\n+>QHD43418.1 envelope protein\n+MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVSLVKPSFYVYS\n+RVKNLNSSRVPDLLV\n+>QHD43419.1 membrane 
glycoprotein\n+MADSNGTITVEELKKLLEQWNLVIGFLFLTWICLLQFAYANRNRFLYIIKLIFLWLLWPV\n+TLACFVLAAVYRINWITGGIAIAMACLVGLMWLSYFIASFRLFARTRSMWSFNPETNILL\n+NVPLHGTILTRPLLESELVIGAVILRGHLRIAGHHLGRCDIKDLPKEITVATSRTLSYYK\n+LGASQRVAGDSGFAAYSRYRIGNYKLNTDHSSSSDNIALLVQ\n+>QHD43420.1 ORF6 protein\n+MFHLVDFQVTIAEILLIIMRTFKVSIWNLDYIINLIIKNLSKSLTENKYSQLDEEQPMEI\n+D\n+>QHD43421.1 ORF7a protein\n+MKIILFLALITLATCELYHYQECVRGTTVLLKEPCSSGTYEGNSPFHPLADNKFALTCFS\n+TQFAFACPDGVKHVYQLRARSVSPKLFIRQEEVQELYSPIFLIVAAIVFITLCFTLKRKT\n+E\n+>QHD43422.1 ORF8 protein\n+MKFLVFLGIITTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKWYIRVGARKSAPLIEL\n+CVDEAGSKSPIQYIDIGNYTVSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDF\n+I\n+>QHD43423.2 nucleocapsid phosphoprotein\n+MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHG\n+KEDLKFPRGQGVPINTNSSPDDQIGYYRRATRRIRGGDGKMKDLSPRWYFYYLGTGPEAG\n+LPYGANKDGIIWVATEGALNTPKDHIGTRNPANNAAIVLQLPQGTTLPKGFYAEGSRGGS\n+QASSRSSSRSRNSSRNSTPGSSRGTSPARMAGNGGDAALALLLLDRLNQLESKMSGKGQQ\n+QQGQTVTKKSAAEASKKPRQKRTATKAYNVTQAFGRRGPEQTQGNFGDQELIRQGTDYKH\n+WPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYTGAIKLDDKDPNFKDQVILLNKHIDAY\n+KTFPPTEPKKDKKKKADETQALPQRQKKQQTVTLLPAADLDDFSKQLQQSMSSADSTQA\n+>QHI42199.1 ORF10 protein\n+MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT\n' |
| diff -r 5ddad52360c5 -r e8de9a44ce72 test-data/aln/test_map.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/aln/test_map.txt Thu Jul 15 11:45:05 2021 +0000 |
| @@ -0,0 +1,9 @@ +QHD43415.1 2697049 +QHD43416.1 2697049 +QHD43417.1 2697049 +QHD43418.1 2697049 +QHD43419.1 2697049 +QHD43420.1 2697049 +QHD43421.1 2697049 +QHD43422.1 2697049 +QHD43423.2 2697049 \ No newline at end of file |