comparison docs/scripts/txt/InfoSequenceFiles.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
| author | deepakjadmin |
|---|---|
| date | Wed, 20 Jan 2016 09:23:18 -0500 |
| parents | |
| children |
| -1:000000000000 | 0:4816e4a8ae95 |
|---|---|
| 1 NAME | |
| 2 InfoSequenceFiles.pl - List information about sequence and alignment | |
| 3 files | |
| 4 | |
| 5 SYNOPSIS | |
| 6 InfoSequenceFiles.pl SequenceFile(s) AlignmentFile(s)... | |
| 7 | |
| 8 InfoSequenceFiles.pl [-a, --all] [-c, --count] [-d, --detail infolevel] | |
| 9 [-f, --frequency] [--FrequencyBins number | "number, number, | |
| 10 [number,...]"] [-h, --help] [-i, --IgnoreGaps yes | no] [-l, --longest] | |
| 11 [-s, --shortest] [--SequenceLengths] [-w, --workingdir dirname] | |
| 12 SequenceFile(s)... | |
| 13 | |
| 14 DESCRIPTION | |
| 15 List information about contents of *SequenceFile(s) and | |
| 16 AlignmentFile(s)*: number of sequences, shortest and longest sequences, | |
| 17 distribution of sequence lengths and so on. The file names are separated | |
| 18 by spaces. All the sequence files in the current directory can be | |
| 19 specified by **.aln*, **.msf*, **.fasta*, **.fta*, **.pir* or any other | |
| 20 supported formats; additionally, *DirName* corresponds to all the | |
| 21 sequence files in the current directory with any of the supported file | |
| 22 extensions: *.aln, .msf, .fasta, .fta, and .pir*. | |
| 23 | |
| 24 Supported sequence formats are: *ALN/ClustalW*, *GCG/MSF*, *PILEUP/MSF*, | |
| 25 *Pearson/FASTA*, and *NBRF/PIR*. Instead of using file extensions, file | |
| 26 formats are detected by parsing the contents of *SequenceFile(s) and | |
| 27 AlignmentFile(s)*. | |
| 28 | |
| 29 OPTIONS | |
| 30 -a, --all | |
| 31 List all the available information. | |
| 32 | |
| 33 -c, --count | |
| 34 List the number of sequences. This is the default behavior. | |
| 35 | |
| 36 -d, --detail *InfoLevel* | |
| 37 Level of information to print about sequences during various | |
| 38 options. Default: *1*. Possible values: *1, 2 or 3*. | |
| 39 | |
| 40 -f, --frequency | |
| 41 List distribution of sequence lengths using the specified number of | |
| 42 bins or bin range specified using the --FrequencyBins option. | |
| 43 | |
| 44 This option is ignored for input files containing only a single | |
| 45 sequence. | |
| 46 | |
| 47 --FrequencyBins *number | "number,number,[number,...]"* | |
| 48 This value is used with the -f, --frequency option to list distribution | |
| 49 of sequence lengths using the specified number of bins or bin range. | |
| 50 Default value: *10*. | |
| 51 | |
| 52 The bin range list is used to group sequence lengths into different | |
| 53 groups; it must contain values in ascending order. Examples: | |
| 54 | |
| 55 100,200,300,400,500,600 | |
| 56 200,400,600,800,1000 | |
| 57 | |
| 58 The frequency value calculated for a specific bin corresponds to all | |
| 59 the sequence lengths which are greater than the previous bin value | |
| 60 and less than or equal to the current bin value. | |
| 61 | |
| 62 -h, --help | |
| 63 Print this help message. | |
| 64 | |
| 65 -i, --IgnoreGaps *yes | no* | |
| 66 Ignore gaps during calculation of sequence lengths. Possible values: | |
| 67 *yes or no*. Default value: *no*. | |
| 68 | |
| 69 -l, --longest | |
| 70 List information about the longest sequence: ID, sequence and sequence | |
| 71 length. This option is ignored for input files containing only a | |
| 72 single sequence. | |
| 73 | |
| 74 -s, --shortest | |
| 75 List information about the shortest sequence: ID, sequence and sequence | |
| 76 length. This option is ignored for input files containing only a | |
| 77 single sequence. | |
| 78 | |
| 79 --SequenceLengths | |
| 80 List information about sequence lengths. | |
| 81 | |
| 82 -w, --WorkingDir *dirname* | |
| 83 Location of working directory. Default: current directory. | |
| 84 | |
| 85 EXAMPLES | |
| 86 To count the number of sequences in sequence files, type: | |
| 87 | |
| 88 % InfoSequenceFiles.pl Sample1.fasta | |
| 89 % InfoSequenceFiles.pl Sample1.msf Sample1.aln Sample1.pir | |
| 90 % InfoSequenceFiles.pl *.fasta *.fta *.msf *.pir *.aln | |
| 91 | |
| 92 To list all available information with the maximum level of detail | |
| 93 for a sequence alignment file Sample1.msf, type: | |
| 94 | |
| 95 % InfoSequenceFiles.pl -a -d 3 Sample1.msf | |
| 96 | |
| 97 To list sequence length information after ignoring sequence gaps in | |
| 98 Sample1.aln file, type: | |
| 99 | |
| 100 % InfoSequenceFiles.pl --SequenceLengths --IgnoreGaps Yes | |
| 101 Sample1.aln | |
| 102 | |
| 103 To list shortest and longest sequence length information after ignoring | |
| 104 sequence gaps in Sample1.aln file, type: | |
| 105 | |
| 106 % InfoSequenceFiles.pl --longest --shortest --IgnoreGaps Yes | |
| 107 Sample1.aln | |
| 108 | |
| 109 To list distribution of sequence lengths after ignoring sequence gaps in | |
| 110 Sample1.aln file and report the frequency distribution into 10 bins, | |
| 111 type: | |
| 112 | |
| 113 % InfoSequenceFiles.pl --frequency --FrequencyBins 10 | |
| 114 --IgnoreGaps Yes Sample1.aln | |
| 115 | |
| 116 To list distribution of sequence lengths after ignoring sequence gaps in | |
| 117 Sample1.aln file and report the frequency distribution into specified | |
| 118 bin range, type: | |
| 119 | |
| 120 % InfoSequenceFiles.pl --frequency --FrequencyBins | |
| 121 "150,200,250,300,350" --IgnoreGaps Yes Sample1.aln | |
| 122 | |
| 123 AUTHOR | |
| 124 Manish Sud <msud@san.rr.com> | |
| 125 | |
| 126 SEE ALSO | |
| 127 AnalyzeSequenceFilesData.pl, ExtractFromSequenceFiles.pl, | |
| 128 InfoAminoAcids.pl, InfoNucleicAcids.pl | |
| 129 | |
| 130 COPYRIGHT | |
| 131 Copyright (C) 2015 Manish Sud. All rights reserved. | |
| 132 | |
| 133 This file is part of MayaChemTools. | |
| 134 | |
| 135 MayaChemTools is free software; you can redistribute it and/or modify it | |
| 136 under the terms of the GNU Lesser General Public License as published by | |
| 137 the Free Software Foundation; either version 3 of the License, or (at | |
| 138 your option) any later version. | |
| 139 |
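
The binning rule described under --FrequencyBins (a length falls in a bin when it is greater than the previous bin value and less than or equal to the current one) and the effect of --IgnoreGaps can be illustrated with a short Python sketch. This is not the InfoSequenceFiles.pl implementation: the function names `sequence_length` and `bin_lengths`, the choice of `-` and `.` as gap characters, and the separate overflow count for lengths above the last bin value are all assumptions made for illustration.

```python
from bisect import bisect_left


def sequence_length(sequence, ignore_gaps=False, gap_chars="-."):
    """Return the sequence length, optionally skipping gap characters.

    The gap characters "-" and "." are an assumption; the real tool may
    recognize a different set.
    """
    if ignore_gaps:
        return sum(1 for ch in sequence if ch not in gap_chars)
    return len(sequence)


def bin_lengths(lengths, bin_edges):
    """Count lengths per bin, where a length belongs to a bin when
    previous edge < length <= current edge.

    Returns (counts, overflow). Reporting lengths above the last edge as
    a separate overflow count is a choice made for this sketch.
    """
    edges = list(bin_edges)
    if edges != sorted(edges):
        raise ValueError("bin edges must be in ascending order")

    counts = [0] * len(edges)
    overflow = 0
    for length in lengths:
        # bisect_left finds the first edge that is >= length.
        idx = bisect_left(edges, length)
        if idx == len(edges):
            overflow += 1
        else:
            counts[idx] += 1
    return counts, overflow
```

For instance, with bin values 100 and 200, a length of 100 is counted in the first bin while a length of 101 is counted in the second, which matches the "less than or equal to the current bin value" rule in the option description.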
