# HG changeset patch
# User diane
# Date 1536348154 14400
# Node ID 71bc4fd01351f3d4c7f6c3b5482b396e7e95f959
# Parent 8af7fd9b74764c972772133fe8a7aa68dc082203
planemo upload for repository https://github.com/DiDigsDNA/amplicon_sequevars_to_dict commit ca00fa6b41cfb8f6a27901792a4e4d3fc3e1878b
diff -r 8af7fd9b7476 -r 71bc4fd01351 amplicon_sequevars_to_dict.py
--- a/amplicon_sequevars_to_dict.py Fri Sep 07 15:17:08 2018 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,113 +0,0 @@
-#!/usr/bin/env python
-'''Reads in nucleotide sequences from a fasta file as SeqRecords and sorts into a dictionary,
-with unique sequences as keys and lists of SeqRecords sharing these sequevars as values. If the user
-specified a fasta file containing sequences which have been empirically determined to amplify in the
-PCR (i.e. "safelist"), the sequevars are compared to this safelist prior to being output to fasta. Sequence
-identifiers (usually Genbank accession numbers) of sequences in a sequevar are printed to a text file.
-
-Usage with safelist: python sequevarsToDict.py -s safelist.fasta unmatching_sequences.fasta Mar20_sequevars
-Usage without safelist: python sequevars_to_dict.py unmatching_sequences.fasta Mar20_sequevars
-'''
-
-'''Author: Diane Eisler, Molecular Microbiology & Genomics, BCCDC Public Health Laboratory, March 2018'''
-
-import sys,string,os, time, Bio, re, argparse
-from Bio import Seq, SeqIO, SeqUtils, Alphabet, SeqRecord
-from Bio.Alphabet import IUPAC
-from Bio.Seq import Seq
-from Bio.SeqRecord import SeqRecord
-import Bio.Data.IUPACData
-
-#parse command line arguments
-parser = argparse.ArgumentParser()
-parser.add_argument("-s", dest = "safelist", help = "specify filename of previously vetted sequences",
-    action = "store")
-parser.add_argument("fastaToParse")
-parser.add_argument("outFileHandle")
-results = parser.parse_args()
-print("ARGUMENTS:")
-print(results)
-print('safelist = ', results.safelist)
-
-#construct output filenames
-gb_accessionsHandle = results.outFileHandle + ".txt" #user-specified output file name
-gb_accessions= open(gb_accessionsHandle,'w') #output file of GB access #'s and details re: which oligos NOT found
-outputFastaHandle = results.outFileHandle + ".fasta" #create fasta file from user specified name
-
-def outputInformationFile(sequevars):
-    '''Outputs text file with unique amplicon sequences and id's of sequences in each sequevar.'''
-    localtime = time.asctime(time.localtime(time.time())) #date and time of analysis
-    gb_accessions.write("---------------------------------------------------------------------------\n")
-    gb_accessions.write("RESULTS OF SEARCH FOR EXACT MATCHES TO OLIGONUCLEOTIDES IN QUERY SEQUENCES:\n")
-    gb_accessions.write("---------------------------------------------------------------------------\n\n")
-    gb_accessions.write("Analysis as of: " + localtime + "\n\n")
-    gb_accessions.write("\n---------------------------------------------------------------------------\n")
-    #if there are sequevars for further analysis, output them to the results .txt and fasta files
-    if len(sequevars) > 0: #if there are sequevars to output, write to list with sequence id's in sequevar
-        outputSequevarsToFasta(sequevars) #output fasta file of sequevars i.e.representative record
-        gb_accessions.write("%i Different Amplicon Sequevars Found: %s \n\n" % (len(sequevars),results.fastaToParse))
-        for sequevar in sequevars:
-            print("\nSequevar: %s" % sequevar) #print sequevar to console
-            recordList = sequevars[sequevar]
-            gb_accessions.write("\n\n%i Records in Sequevar: %s" % (len(recordList),sequevar)) #write sequevar to file
-            print("%i Record(s) have this sequence:" % (len(recordList)))
-            for record in recordList:
-                print("\t" + record.id) #print record id to console
-                gb_accessions.write("\n\t" + record.id) #write records with this sequence to file
-    else: #if not sequevars to output, print message to outfile to clarify this
-        gb_accessions.write("No sequevars requiring further investigation.")
-    gb_accessions.write("\n---------------------------------------------------------------------------")
-    gb_accessions.write("\n\nEND OF RESULTS\n")
-    #print output filepaths to console and to search results file
-    gb_accessions.write("\nSearch Results (this file): " + gb_accessionsHandle)
-    print("\nSearch Results: " + gb_accessionsHandle)
-    if len(sequevars) > 0: #if there are sequevars to print to fasta, direct user to fasta file
-        gb_accessions.write("\nFasta sequences for analysis: " + outputFastaHandle)
-        print("Fasta sequences for analysis: " + outputFastaHandle)
-    else:
-        gb_accessions.write("\nFasta sequences for analysis: N/A")
-        print("Fasta sequences for analysis: N/A\n")
-    return
-
-def outputSequevarsToFasta(sequevars):
-    '''Output a fasta file with one representative sequence for each sequevar.'''
-    output_fasta = open(outputFastaHandle, 'w') #create a writeable fasta file
-    sequevars_list = []
-    for sequevar in sequevars:
-        first_record = sequevars[sequevar][0] #grab the first SeqRecord in the list sharing the sequevar
-        sequevars_list.append(first_record) #add it to the list of SeqRecords
-    SeqIO.write(sequevars_list, output_fasta, "fasta") #write the sequevars to fasta
-    return
-
-with open(results.fastaToParse,'r') as inFile:
-    #read nucleotide sequences from fasta file into SeqRecords, uppercases and adds to a list
-    seqList = [rec.upper() for rec in list(SeqIO.parse(inFile, "fasta", alphabet=IUPAC.ambiguous_dna))]
-    sequevars = {} # empty dictionary of sequevar: list
-    safe_seqs = [] # empty list of nucleotide sequences
-
-    if results.safelist: #if safelist specified by user, read these sequences into a list
-        safe_seqs = [record.seq for record in SeqIO.parse(results.safelist, "fasta", alphabet=IUPAC.ambiguous_dna)]
-        print("Safeseqs:")
-        for seq in safe_seqs:
-            print(seq) #print all the vetted unique sequences
-
-    for record in seqList: #parse SeqRecord for exact match to amplicon or reverse complement
-        sequence = str(record.seq)
-        #check if the sequence has already been tested and vetted (i.e. in safelist)
-        if (sequence in safe_seqs): #only add to dictionary if not previously tested
-            print("Sequence from %s in safe list!" 
% (record.id) )#print mssg to console - else: - #if the sequence is a key in the dict, add SeqRecord to list - if sequence in sequevars: - sequevars[sequence].append(record) - else: - #add sequence as new key to dict, accessing a list with SeqRecord as its first item - sequevars[sequence] = [record] - #get a sorted list of unique sequence keys - sorted_unique_seq_keys = sorted(sequevars.keys()) - #process list of SeqRecords in each sequevar and print to results summary file for end-user - outputInformationFile(sequevars) #output information .txt file - #outputSequevarsToFasta(sequevars) #output fasta file of sequevars i.e.representative record - -inFile.close() -gb_accessions.close() diff -r 8af7fd9b7476 -r 71bc4fd01351 amplicon_sequevars_to_dict.xml --- a/amplicon_sequevars_to_dict.xml Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,35 +0,0 @@ - - - biopython - - - - - - - - - - - - - - - - - - - - - - - -#NEEDS MODIFICATIONS!!! \ No newline at end of file diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_to_dict/amplicon_sequevars_with_safelist.fasta --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/amplicon_sequevars_to_dict/amplicon_sequevars_with_safelist.fasta Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,36 @@ +>KY925981 A/Bage/LACENRS-294/2011 2011/06/16 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926040 A/Palmitinho/LACENRS-2595/2009 2009/08/31 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926095 A/Porto Alegre/LACENRS-2380/2013 2013/07/15 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926106 A/Santa Cruz do Sul/LACENRS-913/2011 2011/07/11 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY950187 
A/Linkou/0181/2016 2016/12/21 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF039637 A/Zhejiang/DTID-ZJU01/2013 2013/04/07 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC +>MF039638 A/WSN/1933 1933// 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC +>MF441160 A/Korea/KUMC-GR190/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC +>MF441174 A/Korea/KUMC-GR689/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC +>MF593488 A/Mexico/4104/2016 2016/01/14 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF599436 A/Kenya/011/2017 2017/03/08 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF629795 A/Trivandrum/MCVR261/2009 2009/08/21 7 (MP) +GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_to_dict/amplicon_sequevars_with_safelist.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/amplicon_sequevars_to_dict/amplicon_sequevars_with_safelist.txt Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,64 @@ +--------------------------------------------------------------------------- +RESULTS OF SEARCH FOR EXACT MATCHES TO OLIGONUCLEOTIDES IN QUERY SEQUENCES: +--------------------------------------------------------------------------- + +Analysis as of: Thu Sep 6 18:27:06 2018 + + +--------------------------------------------------------------------------- +12 Different Amplicon Sequevars Found: test-data/amplicons.fasta + + + +1 Records in Sequevar: 
GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY925981 + +2 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY926040 + KY926049 + +1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC + KY926095 + +4 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY926106 + MF441141 + MF441149 + MF441161 + +5 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY950187 + KY950217 + MF599433 + MF599434 + MF599438 + +4 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC + MF039637 + MF370249 + MF370253 + MF370257 + +1 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC + MF039638 + +1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC + MF441160 + +1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC + MF441174 + +1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC + MF593488 + +1 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + MF599436 + +1 Records in Sequevar: GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + MF629795 +--------------------------------------------------------------------------- + +END OF RESULTS + +Search Results (this file): test-data/output.txt +Fasta sequences for analysis: 
test-data/output.fasta \ No newline at end of file diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_to_dict/amplicon_sequevars_without_safelist.fasta --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/amplicon_sequevars_to_dict/amplicon_sequevars_without_safelist.fasta Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,48 @@ +>KY925925 A/Santo Antonio da Patrulha/LACENRS-2621/2012 2012/08/10 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925966 A/Porto Alegre/LACENRS-1075/2015 2015/05/29 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY925981 A/Bage/LACENRS-294/2011 2011/06/16 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926040 A/Palmitinho/LACENRS-2595/2009 2009/08/31 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926095 A/Porto Alegre/LACENRS-2380/2013 2013/07/15 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926106 A/Santa Cruz do Sul/LACENRS-913/2011 2011/07/11 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY950187 A/Linkou/0181/2016 2016/12/21 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF039637 A/Zhejiang/DTID-ZJU01/2013 2013/04/07 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC +>MF039638 A/WSN/1933 1933// 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC +>MF441145 A/Korea/KUMC-GR66/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC 
+CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441160 A/Korea/KUMC-GR190/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC +>MF441174 A/Korea/KUMC-GR689/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC +>MF593488 A/Mexico/4104/2016 2016/01/14 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF599435 A/Kenya/009/2017 2017/03/06 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599436 A/Kenya/011/2017 2017/03/08 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF629795 A/Trivandrum/MCVR261/2009 2009/08/21 7 (MP) +GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_to_dict/amplicon_sequevars_without_safelist.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/amplicon_sequevars_to_dict/amplicon_sequevars_without_safelist.txt Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,148 @@ +--------------------------------------------------------------------------- +RESULTS OF SEARCH FOR EXACT MATCHES TO OLIGONUCLEOTIDES IN QUERY SEQUENCES: +--------------------------------------------------------------------------- + +Analysis as of: Thu Sep 6 18:19:51 2018 + + +--------------------------------------------------------------------------- +16 Different Amplicon Sequevars Found: test-data/amplicons.fasta + + + +36 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC + KY925925 + KY925930 + KY925952 + KY925960 + KY925968 + KY926011 + KY926047 + KY926053 + KY926055 + KY926082 + KY926097 + KY926123 + KY926125 + 
KY926132 + KY950197 + KY950209 + KY950219 + MF593496 + MF593504 + MF593512 + MF593520 + MF593528 + MF593533 + MF593538 + MF593544 + MF593549 + MF593557 + MF593562 + MF593570 + MF593578 + MF593586 + MF593594 + MF593602 + MF593610 + MF593615 + MF593623 + +23 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY925966 + KY925996 + KY926140 + MF441140 + MF441142 + MF441143 + MF441144 + MF441146 + MF441150 + MF441151 + MF441153 + MF441156 + MF441157 + MF441163 + MF441164 + MF441165 + MF441166 + MF441167 + MF441168 + MF441169 + MF441170 + MF441172 + MF441173 + +1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY925981 + +2 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY926040 + KY926049 + +1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC + KY926095 + +4 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY926106 + MF441141 + MF441149 + MF441161 + +5 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + KY950187 + KY950217 + MF599433 + MF599434 + MF599438 + +4 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC + MF039637 + MF370249 + MF370253 + MF370257 + +1 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC + MF039638 + +10 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + MF441145 + MF441147 + MF441148 + MF441152 + MF441154 + MF441155 + MF441158 + MF441159 + MF441162 + 
MF441171 + +1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC + MF441160 + +1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC + MF441174 + +1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC + MF593488 + +7 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + MF599435 + MF599437 + MF599439 + MF599440 + MF599441 + MF599442 + MF599443 + +1 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + MF599436 + +1 Records in Sequevar: GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC + MF629795 +--------------------------------------------------------------------------- + +END OF RESULTS + +Search Results (this file): amplicon_sequevars_without_safelist.txt +Fasta sequences for analysis: amplicon_sequevars_without_safelist.fasta \ No newline at end of file diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_to_dict/amplicons.fasta --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/amplicon_sequevars_to_dict/amplicons.fasta Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,297 @@ +>KY925925 A/Santo Antonio da Patrulha/LACENRS-2621/2012 2012/08/10 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925930 A/Novo Hamburgo/LACENRS-385/2016 2016/04/04 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925952 A/Viamao/LACENRS-1400/2011 2011/08/22 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925960 
A/Gramado/LACENRS-1287/2016 2016/04/29 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925966 A/Porto Alegre/LACENRS-1075/2015 2015/05/29 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY925968 A/Santa Cruz do Sul/LACENRS-2672/2013 2013/07/22 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925981 A/Bage/LACENRS-294/2011 2011/06/16 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY925996 A/Canoas/LACENRS-1793/2015 2015/07/17 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926011 A/Tres Coroas/LACENRS-1810/2012 2012/07/23 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926040 A/Palmitinho/LACENRS-2595/2009 2009/08/31 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926047 A/Sao Gabriel/LACENRS-1809/2012 2012/07/23 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926049 A/Sao Gabriel/LACENRS-1626/2009 2009/08/14 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926053 A/Uruguaiana/LACENRS-296/2016 2016/03/27 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926055 A/Caibate/LACENRS-903/2012 2012/07/08 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926082 A/Camaqua/LACENRS-623/2011 2011/07/01 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC 
+CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926095 A/Porto Alegre/LACENRS-2380/2013 2013/07/15 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926097 A/Caxias do Sul/LACENRS-1186/2011 2011/08/31 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926106 A/Santa Cruz do Sul/LACENRS-913/2011 2011/07/11 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY926123 A/Canoas/LACENRS-1320/2016 2016/05/02 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926125 A/Cruz Alta/LACENRS-499/2012 2012/06/27 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926132 A/Nova Petropolis/LACENRS-1205/2013 2013/06/07 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY926140 A/Porto Alegre/LACENRS-1887/2015 2015/07/25 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY950187 A/Linkou/0181/2016 2016/12/21 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY950197 A/Baltimore/0026/2016 2016/01/16 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY950209 A/Baltimore/0077/2016 2016/02/23 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY950217 A/Baltimore/0247/2017 2017/01/25 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>KY950219 A/Linkou/0004/2015 2015/11/06 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC 
+CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF039637 A/Zhejiang/DTID-ZJU01/2013 2013/04/07 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC +>MF039638 A/WSN/1933 1933// 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC +>MF370249 A/Changsha/26/2017 2017/02/04 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC +>MF370253 A/Changsha/41/2017 2017/03/12 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC +>MF370257 A/Changsha/44/2017 2017/03/14 7 (MP) +GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC +>MF441140 A/Korea/KUMC-GR14/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441141 A/Korea/KUMC-GR17/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441142 A/Korea/KUMC-GR18/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441143 A/Korea/KUMC-GR47/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441144 A/Korea/KUMC-GR55/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441145 A/Korea/KUMC-GR66/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441146 A/Korea/KUMC-GR83/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441147 A/Korea/KUMC-GR90/2011 2011/12/01 
7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441148 A/Korea/KUMC-GR92/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441149 A/Korea/KUMC-GR96/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441150 A/Korea/KUMC-GR99/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441151 A/Korea/KUMC-GR115/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441152 A/Korea/KUMC-GR116/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441153 A/Korea/KUMC-GR121/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441154 A/Korea/KUMC-GR124/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441155 A/Korea/KUMC-GR137/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441156 A/Korea/KUMC-GR157/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441157 A/Korea/KUMC-GR159/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441158 A/Korea/KUMC-GR172/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441159 A/Korea/KUMC-GR175/2011 2011/12/01 7 (MP) 
+GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441160 A/Korea/KUMC-GR190/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC +>MF441161 A/Korea/KUMC-GR195/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441162 A/Korea/KUMC-GR202/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441163 A/Korea/KUMC-GR218/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441164 A/Korea/KUMC-GR226/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441165 A/Korea/KUMC-GR227/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441166 A/Korea/KUMC-GR252/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441167 A/Korea/KUMC-GR261/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441168 A/Korea/KUMC-GR281/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441169 A/Korea/KUMC-GR285/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441170 A/Korea/KUMC-GR437/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441171 A/Korea/KUMC-GR567/2011 2011/12/01 7 (MP) 
+GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441172 A/Korea/KUMC-GR570/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441173 A/Korea/KUMC-GR590/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441174 A/Korea/KUMC-GR689/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC +>MF593488 A/Mexico/4104/2016 2016/01/14 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593496 A/Mexico/4431/2016 2016/03/01 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593504 A/Mexico/4433/2016 2016/02/28 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593512 A/Mexico/4435/2016 2016/03/02 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593520 A/Mexico/4436/2016 2016/02/28 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593528 A/Mexico/4440/2016 2016/03/16 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593533 A/Mexico/4604/2017 2017/02/10 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593538 A/Mexico/4621/2017 2017/02/18 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593544 A/Mexico/4627/2017 2017/02/24 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC 
+CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593549 A/Mexico/4628/2017 2017/03/02 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593557 A/Mexico/4687/2017 2017/03/12 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593562 A/Mexico/4703/2017 2017/03/21 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593570 A/Mexico/7317/2017 2017/01/16 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593578 A/Mexico/8017/2017 2017/01/17 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593586 A/Mexico/8517/2017 2017/01/18 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593594 A/Mexico/11417/2017 2017/01/23 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593602 A/Mexico/12317/2017 2017/01/24 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593610 A/Mexico/15017/2017 2017/01/26 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593615 A/Mexico/15517/2017 2017/01/27 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF593623 A/Mexico/17517/2017 2017/01/30 7 (MP) +GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>MF599433 A/Kenya/001/2017 2017/01/01 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599434 A/Kenya/002/2017 2017/01/02 7 (MP) 
+GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599435 A/Kenya/009/2017 2017/03/06 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599436 A/Kenya/011/2017 2017/03/08 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599437 A/Kenya/014/2017 2017/03/13 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599438 A/Kenya/015/2017 2017/04/03 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599439 A/Kenya/008/2017 2017/03/05 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599440 A/Kenya/010/2017 2017/03/07 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599441 A/Kenya/012/2017 2017/03/10 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599442 A/Kenya/013/2017 2017/03/12 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599443 A/Kenya/003/2017 2017/02/01 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF629795 A/Trivandrum/MCVR261/2009 2009/08/21 7 (MP) +GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_to_dict/safelist.fasta --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/amplicon_sequevars_to_dict/safelist.fasta Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,12 @@ +>KY925925 A/Santo Antonio da Patrulha/LACENRS-2621/2012 2012/08/10 7 (MP) 
+GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC +>KY925966 A/Porto Alegre/LACENRS-1075/2015 2015/05/29 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF441145 A/Korea/KUMC-GR66/2011 2011/12/01 7 (MP) +GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC +CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC +>MF599435 A/Kenya/009/2017 2017/03/06 7 (MP) +GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC +CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_with_safelist.fasta --- a/test-data/amplicon_sequevars_with_safelist.fasta Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,36 +0,0 @@ ->KY925981 A/Bage/LACENRS-294/2011 2011/06/16 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926040 A/Palmitinho/LACENRS-2595/2009 2009/08/31 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926095 A/Porto Alegre/LACENRS-2380/2013 2013/07/15 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926106 A/Santa Cruz do Sul/LACENRS-913/2011 2011/07/11 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY950187 A/Linkou/0181/2016 2016/12/21 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF039637 A/Zhejiang/DTID-ZJU01/2013 2013/04/07 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC ->MF039638 A/WSN/1933 1933// 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC ->MF441160 
A/Korea/KUMC-GR190/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC ->MF441174 A/Korea/KUMC-GR689/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC ->MF593488 A/Mexico/4104/2016 2016/01/14 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF599436 A/Kenya/011/2017 2017/03/08 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF629795 A/Trivandrum/MCVR261/2009 2009/08/21 7 (MP) -GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_with_safelist.txt --- a/test-data/amplicon_sequevars_with_safelist.txt Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,64 +0,0 @@ ---------------------------------------------------------------------------- -RESULTS OF SEARCH FOR EXACT MATCHES TO OLIGONUCLEOTIDES IN QUERY SEQUENCES: ---------------------------------------------------------------------------- - -Analysis as of: Thu Sep 6 18:27:06 2018 - - ---------------------------------------------------------------------------- -12 Different Amplicon Sequevars Found: test-data/amplicons.fasta - - - -1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY925981 - -2 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY926040 - KY926049 - -1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC - KY926095 - -4 Records in Sequevar: 
GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY926106 - MF441141 - MF441149 - MF441161 - -5 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY950187 - KY950217 - MF599433 - MF599434 - MF599438 - -4 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC - MF039637 - MF370249 - MF370253 - MF370257 - -1 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC - MF039638 - -1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC - MF441160 - -1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC - MF441174 - -1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC - MF593488 - -1 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - MF599436 - -1 Records in Sequevar: GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - MF629795 ---------------------------------------------------------------------------- - -END OF RESULTS - -Search Results (this file): test-data/output.txt -Fasta sequences for analysis: test-data/output.fasta \ No newline at end of file diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_without_safelist.fasta --- a/test-data/amplicon_sequevars_without_safelist.fasta Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,48 +0,0 @@ ->KY925925 A/Santo Antonio da Patrulha/LACENRS-2621/2012 2012/08/10 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC 
-CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925966 A/Porto Alegre/LACENRS-1075/2015 2015/05/29 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY925981 A/Bage/LACENRS-294/2011 2011/06/16 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926040 A/Palmitinho/LACENRS-2595/2009 2009/08/31 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926095 A/Porto Alegre/LACENRS-2380/2013 2013/07/15 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926106 A/Santa Cruz do Sul/LACENRS-913/2011 2011/07/11 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY950187 A/Linkou/0181/2016 2016/12/21 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF039637 A/Zhejiang/DTID-ZJU01/2013 2013/04/07 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC ->MF039638 A/WSN/1933 1933// 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC ->MF441145 A/Korea/KUMC-GR66/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441160 A/Korea/KUMC-GR190/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC ->MF441174 A/Korea/KUMC-GR689/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC ->MF593488 A/Mexico/4104/2016 2016/01/14 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC 
-CCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF599435 A/Kenya/009/2017 2017/03/06 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599436 A/Kenya/011/2017 2017/03/08 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF629795 A/Trivandrum/MCVR261/2009 2009/08/21 7 (MP) -GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicon_sequevars_without_safelist.txt --- a/test-data/amplicon_sequevars_without_safelist.txt Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,148 +0,0 @@ ---------------------------------------------------------------------------- -RESULTS OF SEARCH FOR EXACT MATCHES TO OLIGONUCLEOTIDES IN QUERY SEQUENCES: ---------------------------------------------------------------------------- - -Analysis as of: Thu Sep 6 18:19:51 2018 - - ---------------------------------------------------------------------------- -16 Different Amplicon Sequevars Found: test-data/amplicons.fasta - - - -36 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC - KY925925 - KY925930 - KY925952 - KY925960 - KY925968 - KY926011 - KY926047 - KY926053 - KY926055 - KY926082 - KY926097 - KY926123 - KY926125 - KY926132 - KY950197 - KY950209 - KY950219 - MF593496 - MF593504 - MF593512 - MF593520 - MF593528 - MF593533 - MF593538 - MF593544 - MF593549 - MF593557 - MF593562 - MF593570 - MF593578 - MF593586 - MF593594 - MF593602 - MF593610 - MF593615 - MF593623 - -23 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY925966 - KY925996 - KY926140 - MF441140 - MF441142 - MF441143 - MF441144 - MF441146 - MF441150 - MF441151 - MF441153 - MF441156 - MF441157 - MF441163 
- MF441164 - MF441165 - MF441166 - MF441167 - MF441168 - MF441169 - MF441170 - MF441172 - MF441173 - -1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY925981 - -2 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY926040 - KY926049 - -1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC - KY926095 - -4 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY926106 - MF441141 - MF441149 - MF441161 - -5 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - KY950187 - KY950217 - MF599433 - MF599434 - MF599438 - -4 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC - MF039637 - MF370249 - MF370253 - MF370257 - -1 Records in Sequevar: GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC - MF039638 - -10 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - MF441145 - MF441147 - MF441148 - MF441152 - MF441154 - MF441155 - MF441158 - MF441159 - MF441162 - MF441171 - -1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC - MF441160 - -1 Records in Sequevar: GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC - MF441174 - -1 Records in Sequevar: GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC - MF593488 - -7 Records in Sequevar: 
GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - MF599435 - MF599437 - MF599439 - MF599440 - MF599441 - MF599442 - MF599443 - -1 Records in Sequevar: GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGCCAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - MF599436 - -1 Records in Sequevar: GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC - MF629795 ---------------------------------------------------------------------------- - -END OF RESULTS - -Search Results (this file): amplicon_sequevars_without_safelist.txt -Fasta sequences for analysis: amplicon_sequevars_without_safelist.fasta \ No newline at end of file diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/amplicons.fasta --- a/test-data/amplicons.fasta Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,297 +0,0 @@ ->KY925925 A/Santo Antonio da Patrulha/LACENRS-2621/2012 2012/08/10 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925930 A/Novo Hamburgo/LACENRS-385/2016 2016/04/04 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925952 A/Viamao/LACENRS-1400/2011 2011/08/22 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925960 A/Gramado/LACENRS-1287/2016 2016/04/29 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925966 A/Porto Alegre/LACENRS-1075/2015 2015/05/29 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY925968 A/Santa Cruz do Sul/LACENRS-2672/2013 2013/07/22 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925981 A/Bage/LACENRS-294/2011 
2011/06/16 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY925996 A/Canoas/LACENRS-1793/2015 2015/07/17 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926011 A/Tres Coroas/LACENRS-1810/2012 2012/07/23 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926040 A/Palmitinho/LACENRS-2595/2009 2009/08/31 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926047 A/Sao Gabriel/LACENRS-1809/2012 2012/07/23 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926049 A/Sao Gabriel/LACENRS-1626/2009 2009/08/14 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926053 A/Uruguaiana/LACENRS-296/2016 2016/03/27 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926055 A/Caibate/LACENRS-903/2012 2012/07/08 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926082 A/Camaqua/LACENRS-623/2011 2011/07/01 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926095 A/Porto Alegre/LACENRS-2380/2013 2013/07/15 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACAGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926097 A/Caxias do Sul/LACENRS-1186/2011 2011/08/31 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926106 A/Santa Cruz do Sul/LACENRS-913/2011 2011/07/11 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY926123 
A/Canoas/LACENRS-1320/2016 2016/05/02 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926125 A/Cruz Alta/LACENRS-499/2012 2012/06/27 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926132 A/Nova Petropolis/LACENRS-1205/2013 2013/06/07 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY926140 A/Porto Alegre/LACENRS-1887/2015 2015/07/25 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY950187 A/Linkou/0181/2016 2016/12/21 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY950197 A/Baltimore/0026/2016 2016/01/16 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY950209 A/Baltimore/0077/2016 2016/02/23 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY950217 A/Baltimore/0247/2017 2017/01/25 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->KY950219 A/Linkou/0004/2015 2015/11/06 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF039637 A/Zhejiang/DTID-ZJU01/2013 2013/04/07 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC ->MF039638 A/WSN/1933 1933// 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGGGGACTGCAGCGTAGACGCTTTGTCCAAAATGCTC ->MF370249 A/Changsha/26/2017 2017/02/04 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC ->MF370253 A/Changsha/41/2017 2017/03/12 7 (MP) 
-GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC ->MF370257 A/Changsha/44/2017 2017/03/14 7 (MP) -GACCAATCCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGGTTTGTCCAAAACGCCC ->MF441140 A/Korea/KUMC-GR14/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441141 A/Korea/KUMC-GR17/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441142 A/Korea/KUMC-GR18/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441143 A/Korea/KUMC-GR47/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441144 A/Korea/KUMC-GR55/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441145 A/Korea/KUMC-GR66/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441146 A/Korea/KUMC-GR83/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441147 A/Korea/KUMC-GR90/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441148 A/Korea/KUMC-GR92/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441149 A/Korea/KUMC-GR96/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441150 A/Korea/KUMC-GR99/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC 
-CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441151 A/Korea/KUMC-GR115/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441152 A/Korea/KUMC-GR116/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441153 A/Korea/KUMC-GR121/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441154 A/Korea/KUMC-GR124/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441155 A/Korea/KUMC-GR137/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441156 A/Korea/KUMC-GR157/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441157 A/Korea/KUMC-GR159/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441158 A/Korea/KUMC-GR172/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441159 A/Korea/KUMC-GR175/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441160 A/Korea/KUMC-GR190/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGTTTTGTCCAAAATGCCC ->MF441161 A/Korea/KUMC-GR195/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441162 A/Korea/KUMC-GR202/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC 
->MF441163 A/Korea/KUMC-GR218/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441164 A/Korea/KUMC-GR226/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441165 A/Korea/KUMC-GR227/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441166 A/Korea/KUMC-GR252/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441167 A/Korea/KUMC-GR261/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441168 A/Korea/KUMC-GR281/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441169 A/Korea/KUMC-GR285/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441170 A/Korea/KUMC-GR437/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441171 A/Korea/KUMC-GR567/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441172 A/Korea/KUMC-GR570/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441173 A/Korea/KUMC-GR590/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441174 A/Korea/KUMC-GR689/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGGTTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTTCAAAATGCCC ->MF593488 A/Mexico/4104/2016 2016/01/14 7 (MP) 
-GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAACGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593496 A/Mexico/4431/2016 2016/03/01 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593504 A/Mexico/4433/2016 2016/02/28 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593512 A/Mexico/4435/2016 2016/03/02 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593520 A/Mexico/4436/2016 2016/02/28 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593528 A/Mexico/4440/2016 2016/03/16 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593533 A/Mexico/4604/2017 2017/02/10 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593538 A/Mexico/4621/2017 2017/02/18 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593544 A/Mexico/4627/2017 2017/02/24 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593549 A/Mexico/4628/2017 2017/03/02 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593557 A/Mexico/4687/2017 2017/03/12 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593562 A/Mexico/4703/2017 2017/03/21 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593570 A/Mexico/7317/2017 2017/01/16 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC 
->MF593578 A/Mexico/8017/2017 2017/01/17 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593586 A/Mexico/8517/2017 2017/01/18 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593594 A/Mexico/11417/2017 2017/01/23 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593602 A/Mexico/12317/2017 2017/01/24 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593610 A/Mexico/15017/2017 2017/01/26 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593615 A/Mexico/15517/2017 2017/01/27 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF593623 A/Mexico/17517/2017 2017/01/30 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->MF599433 A/Kenya/001/2017 2017/01/01 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599434 A/Kenya/002/2017 2017/01/02 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599435 A/Kenya/009/2017 2017/03/06 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599436 A/Kenya/011/2017 2017/03/08 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTCTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599437 A/Kenya/014/2017 2017/03/13 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599438 A/Kenya/015/2017 2017/04/03 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC 
-CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599439 A/Kenya/008/2017 2017/03/05 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599440 A/Kenya/010/2017 2017/03/07 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599441 A/Kenya/012/2017 2017/03/10 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599442 A/Kenya/013/2017 2017/03/12 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599443 A/Kenya/003/2017 2017/02/01 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF629795 A/Trivandrum/MCVR261/2009 2009/08/21 7 (MP) -GACCAATCTTGTCACCCCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 test-data/safelist.fasta --- a/test-data/safelist.fasta Fri Sep 07 15:17:08 2018 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,12 +0,0 @@ ->KY925925 A/Santo Antonio da Patrulha/LACENRS-2621/2012 2012/08/10 7 (MP) -GACCAATCTTGTCACCTCTGACTAAGGGAATTTTAGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTATCCAAAATGCCC ->KY925966 A/Porto Alegre/LACENRS-1075/2015 2015/05/29 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF441145 A/Korea/KUMC-GR66/2011 2011/12/01 7 (MP) -GACCAATTCTGTCACCTCTGACTAAGGGGATTTTGGGATTTGTGTTCACGCTCACCGTGC -CCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC ->MF599435 A/Kenya/009/2017 2017/03/06 7 (MP) -GACCAATTCTGTCACCTTTGACTAAGGGGATTTTAGGGTTTGTTTTCACGCTCACCGTGC -CAAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC diff -r 8af7fd9b7476 -r 71bc4fd01351 tools/amplicon_sequevars_to_dict/amplicon_sequevars_to_dict.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ 
b/tools/amplicon_sequevars_to_dict/amplicon_sequevars_to_dict.py Fri Sep 07 15:22:34 2018 -0400
@@ -0,0 +1,113 @@
+#!/usr/bin/env python
+'''Reads in nucleotide sequences from a fasta file as SeqRecords and sorts into a dictionary,
+with unique sequences as keys and lists of SeqRecords sharing these sequevars as values. If the user
+specified a fasta file containing sequences which have been empirically determined to amplify in the
+PCR (i.e. "safelist"), the sequevars are compared to this safelist prior to being output to fasta. Sequence
+identifiers (usually Genbank accession numbers) of sequences in a sequevar are printed to a text file.
+
+Usage with safelist: python sequevarsToDict.py -s safelist.fasta unmatching_sequences.fasta Mar20_sequevars
+Usage without safelist: python sequevars_to_dict.py unmatching_sequences.fasta Mar20_sequevars
+'''
+
+'''Author: Diane Eisler, Molecular Microbiology & Genomics, BCCDC Public Health Laboratory, March 2018'''
+
+import sys,string,os, time, Bio, re, argparse
+from Bio import Seq, SeqIO, SeqUtils, Alphabet, SeqRecord
+from Bio.Alphabet import IUPAC
+from Bio.Seq import Seq
+from Bio.SeqRecord import SeqRecord
+import Bio.Data.IUPACData
+
+#parse command line arguments
+parser = argparse.ArgumentParser()
+parser.add_argument("-s", dest = "safelist", help = "specify filename of previously vetted sequences",
+    action = "store")
+parser.add_argument("fastaToParse")
+parser.add_argument("outFileHandle")
+results = parser.parse_args()
+print("ARGUMENTS:")
+print(results)
+print('safelist = ', results.safelist)
+
+#construct output filenames
+gb_accessionsHandle = results.outFileHandle + ".txt" #user-specified output file name
+gb_accessions= open(gb_accessionsHandle,'w') #output file of GB access #'s and details re: which oligos NOT found
+outputFastaHandle = results.outFileHandle + ".fasta" #create fasta file from user specified name
+
+def outputInformationFile(sequevars):
+    '''Outputs text file with unique amplicon sequences and id's of sequences in each sequevar.'''
+    localtime = time.asctime(time.localtime(time.time())) #date and time of analysis
+    gb_accessions.write("---------------------------------------------------------------------------\n")
+    gb_accessions.write("RESULTS OF SEARCH FOR EXACT MATCHES TO OLIGONUCLEOTIDES IN QUERY SEQUENCES:\n")
+    gb_accessions.write("---------------------------------------------------------------------------\n\n")
+    gb_accessions.write("Analysis as of: " + localtime + "\n\n")
+    gb_accessions.write("\n---------------------------------------------------------------------------\n")
+    #if there are sequevars for further analysis, output them to the results .txt and fasta files
+    if len(sequevars) > 0: #if there are sequevars to output, write to list with sequence id's in sequevar
+        outputSequevarsToFasta(sequevars) #output fasta file of sequevars i.e.representative record
+        gb_accessions.write("%i Different Amplicon Sequevars Found: %s \n\n" % (len(sequevars),results.fastaToParse))
+        for sequevar in sequevars:
+            print("\nSequevar: %s" % sequevar) #print sequevar to console
+            recordList = sequevars[sequevar]
+            gb_accessions.write("\n\n%i Records in Sequevar: %s" % (len(recordList),sequevar)) #write sequevar to file
+            print("%i Record(s) have this sequence:" % (len(recordList)))
+            for record in recordList:
+                print("\t" + record.id) #print record id to console
+                gb_accessions.write("\n\t" + record.id) #write records with this sequence to file
+    else: #if not sequevars to output, print message to outfile to clarify this
+        gb_accessions.write("No sequevars requiring further investigation.")
+    gb_accessions.write("\n---------------------------------------------------------------------------")
+    gb_accessions.write("\n\nEND OF RESULTS\n")
+    #print output filepaths to console and to search results file
+    gb_accessions.write("\nSearch Results (this file): " + gb_accessionsHandle)
+    print("\nSearch Results: " + gb_accessionsHandle)
+    if len(sequevars) > 0: #if there are sequevars to print to fasta, direct user to fasta file
+        gb_accessions.write("\nFasta sequences for analysis: " + outputFastaHandle)
+        print("Fasta sequences for analysis: " + outputFastaHandle)
+    else:
+        gb_accessions.write("\nFasta sequences for analysis: N/A")
+        print("Fasta sequences for analysis: N/A\n")
+    return
+
+def outputSequevarsToFasta(sequevars):
+    '''Output a fasta file with one representative sequence for each sequevar.'''
+    output_fasta = open(outputFastaHandle, 'w') #create a writeable fasta file
+    sequevars_list = []
+    for sequevar in sequevars:
+        first_record = sequevars[sequevar][0] #grab the first SeqRecord in the list sharing the sequevar
+        sequevars_list.append(first_record) #add it to the list of SeqRecords
+    SeqIO.write(sequevars_list, output_fasta, "fasta") #write the sequevars to fasta
+    return
+
+with open(results.fastaToParse,'r') as inFile:
+    #read nucleotide sequences from fasta file into SeqRecords, uppercases and adds to a list
+    seqList = [rec.upper() for rec in list(SeqIO.parse(inFile, "fasta", alphabet=IUPAC.ambiguous_dna))]
+    sequevars = {} # empty dictionary of sequevar: list
+    safe_seqs = [] # empty list of nucleotide sequences
+
+    if results.safelist: #if safelist specified by user, read these sequences into a list
+        safe_seqs = [record.seq for record in SeqIO.parse(results.safelist, "fasta", alphabet=IUPAC.ambiguous_dna)]
+        print("Safeseqs:")
+        for seq in safe_seqs:
+            print(seq) #print all the vetted unique sequences
+
+    for record in seqList: #parse SeqRecord for exact match to amplicon or reverse complement
+        sequence = str(record.seq)
+        #check if the sequence has already been tested and vetted (i.e. in safelist)
+        if (sequence in safe_seqs): #only add to dictionary if not previously tested
+            print("Sequence from %s in safe list!"
% (record.id) )#print mssg to console + else: + #if the sequence is a key in the dict, add SeqRecord to list + if sequence in sequevars: + sequevars[sequence].append(record) + else: + #add sequence as new key to dict, accessing a list with SeqRecord as its first item + sequevars[sequence] = [record] + #get a sorted list of unique sequence keys + sorted_unique_seq_keys = sorted(sequevars.keys()) + #process list of SeqRecords in each sequevar and print to results summary file for end-user + outputInformationFile(sequevars) #output information .txt file + #outputSequevarsToFasta(sequevars) #output fasta file of sequevars i.e.representative record + +inFile.close() +gb_accessions.close() diff -r 8af7fd9b7476 -r 71bc4fd01351 tools/amplicon_sequevars_to_dict/amplicon_sequevars_to_dict.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/amplicon_sequevars_to_dict/amplicon_sequevars_to_dict.xml Fri Sep 07 15:22:34 2018 -0400 @@ -0,0 +1,35 @@ + + + biopython + + + + + + + + + + + + + + + + + + + + + + + +#NEEDS MODIFICATIONS!!! \ No newline at end of file