view docs/scripts/man1/InfoSequenceFiles.1 @ 0:4816e4a8ae95 draft default tip
Uploaded
author | deepakjadmin |
---|---|
date | Wed, 20 Jan 2016 09:23:18 -0500 |
.\" Automatically generated by Pod::Man 2.25 (Pod::Simple 3.22)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.el \{\
.    de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "INFOSEQUENCEFILES 1"
.TH INFOSEQUENCEFILES 1 "2015-03-29" "perl v5.14.2" "MayaChemTools"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
InfoSequenceFiles.pl \- List information about sequence and alignment files
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
InfoSequenceFiles.pl SequenceFile(s) AlignmentFile(s)...
.PP
InfoSequenceFiles.pl [\fB\-a, \-\-all\fR] [\fB\-c, \-\-count\fR] [\fB\-d, \-\-detail\fR infolevel]
[\fB\-f, \-\-frequency\fR] [\fB\-\-FrequencyBins\fR number | \*(L"number, number, [number,...]\*(R"]
[\fB\-h, \-\-help\fR] [\fB\-i, \-\-IgnoreGaps\fR yes | no] [\fB\-l, \-\-longest\fR] [\fB\-s, \-\-shortest\fR]
[\fB\-\-SequenceLengths\fR] [\fB\-w, \-\-workingdir\fR dirname] SequenceFile(s)...
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
List information about the contents of \fISequenceFile(s) and AlignmentFile(s)\fR: number of sequences,
shortest and longest sequences, distribution of sequence lengths and so on. The file names are
separated by spaces. All the sequence files in the current directory can be specified by \fI*.aln\fR,
\&\fI*.msf\fR, \fI*.fasta\fR, \fI*.fta\fR, \fI*.pir\fR or any other supported format; additionally, \fIDirName\fR
corresponds to all the sequence files in the current directory with any of the supported file
extensions: \fI.aln, .msf, .fasta, .fta, and .pir\fR.
.PP
Supported sequence formats are: \fIALN/ClustalW\fR, \fI\s-1GCG/MSF\s0\fR, \fI\s-1PILEUP/MSF\s0\fR, \fIPearson/FASTA\fR,
and \fI\s-1NBRF/PIR\s0\fR. Instead of using file extensions, file formats are detected by parsing the
contents of \fISequenceFile(s) and AlignmentFile(s)\fR.
.SH "OPTIONS"
.IX Header "OPTIONS"
.IP "\fB\-a, \-\-all\fR" 4
.IX Item "-a, --all"
List all the available information.
.IP "\fB\-c, \-\-count\fR" 4
.IX Item "-c, --count"
List the number of sequences. This is the \fBdefault behavior\fR.
.IP "\fB\-d, \-\-detail\fR \fIInfoLevel\fR" 4
.IX Item "-d, --detail InfoLevel"
Level of information to print about sequences during various options. Default: \fI1\fR.
Possible values: \fI1, 2 or 3\fR.
.IP "\fB\-f, \-\-frequency\fR" 4
.IX Item "-f, --frequency"
List the distribution of sequence lengths using the number of bins or the bin range
specified using the \fBFrequencyBins\fR option.
.Sp
This option is ignored for input files containing only a single sequence.
.ie n .IP "\fB\-\-FrequencyBins\fR \fInumber | ""number,number,[number,...]""\fR" 4
.el .IP "\fB\-\-FrequencyBins\fR \fInumber | ``number,number,[number,...]''\fR" 4
.IX Item "--FrequencyBins number | number,number,[number,...]"
This value is used with the \fB\-f, \-\-frequency\fR option to list the distribution of sequence
lengths using the specified number of bins or bin range. Default value: \fI10\fR.
.Sp
The bin range list is used to group sequence lengths into different groups; it must contain
values in ascending order. Examples:
.Sp
.Vb 2
\&    100,200,300,400,500,600
\&    200,400,600,800,1000
.Ve
.Sp
The frequency value calculated for a specific bin corresponds to all the sequence lengths
which are greater than the previous bin value and less than or equal to the current bin value.
.IP "\fB\-h, \-\-help\fR" 4
.IX Item "-h, --help"
Print this help message.
.IP "\fB\-i, \-\-IgnoreGaps\fR \fIyes | no\fR" 4
.IX Item "-i, --IgnoreGaps yes | no"
Ignore gaps during calculation of sequence lengths. Possible values: \fIyes or
no\fR. Default value: \fIno\fR.
.IP "\fB\-l, \-\-longest\fR" 4
.IX Item "-l, --longest"
List information about the longest sequence: \s-1ID\s0, sequence and sequence length. This option
is ignored for input files containing only a single sequence.
.IP "\fB\-s, \-\-shortest\fR" 4
.IX Item "-s, --shortest"
List information about the shortest sequence: \s-1ID\s0, sequence and sequence length. This option
is ignored for input files containing only a single sequence.
.IP "\fB\-\-SequenceLengths\fR" 4
.IX Item "--SequenceLengths"
List information about sequence lengths.
.IP "\fB\-w, \-\-WorkingDir\fR \fIdirname\fR" 4
.IX Item "-w, --WorkingDir dirname"
Location of working directory. Default: current directory.
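.PP
As an illustration of the bin range rule stated under \fB\-\-FrequencyBins\fR, a bin range of
\*(L"100,200,300\*(R" would be expected to group the following sequence lengths as shown. The
lengths are hypothetical values chosen only for this example, and the layout is not the
program's actual output:
.PP
.Vb 4
\&    Length 100    bin 100    (100 <= 100)
\&    Length 101    bin 200    (100 < 101 <= 200)
\&    Length 200    bin 200    (100 < 200 <= 200)
\&    Length 201    bin 300    (200 < 201 <= 300)
.Ve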
.SH "EXAMPLES"
.IX Header "EXAMPLES"
To count the number of sequences in sequence files, type:
.PP
.Vb 3
\&    % InfoSequenceFiles.pl Sample1.fasta
\&    % InfoSequenceFiles.pl Sample1.msf Sample1.aln Sample1.pir
\&    % InfoSequenceFiles.pl *.fasta *.fta *.msf *.pir *.aln
.Ve
.PP
To list all available information with the maximum level of available detail for a sequence
alignment file Sample1.msf, type:
.PP
.Vb 1
\&    % InfoSequenceFiles.pl \-a \-d 3 Sample1.msf
.Ve
.PP
To list sequence length information after ignoring sequence gaps in the Sample1.aln file, type:
.PP
.Vb 2
\&    % InfoSequenceFiles.pl \-\-SequenceLengths \-\-IgnoreGaps Yes
\&      Sample1.aln
.Ve
.PP
To list shortest and longest sequence length information after ignoring sequence
gaps in the Sample1.aln file, type:
.PP
.Vb 2
\&    % InfoSequenceFiles.pl \-\-longest \-\-shortest \-\-IgnoreGaps Yes
\&      Sample1.aln
.Ve
.PP
To list the distribution of sequence lengths after ignoring sequence gaps in the Sample1.aln file
and report the frequency distribution into 10 bins, type:
.PP
.Vb 2
\&    % InfoSequenceFiles.pl \-\-frequency \-\-FrequencyBins 10
\&      \-\-IgnoreGaps Yes Sample1.aln
.Ve
.PP
To list the distribution of sequence lengths after ignoring sequence gaps in the Sample1.aln file
and report the frequency distribution into a specified bin range, type:
.PP
.Vb 2
\&    % InfoSequenceFiles.pl \-\-frequency \-\-FrequencyBins
\&      "150,200,250,300,350" \-\-IgnoreGaps Yes Sample1.aln
.Ve
.SH "AUTHOR"
.IX Header "AUTHOR"
Manish Sud <msud@san.rr.com>
.SH "SEE ALSO"
.IX Header "SEE ALSO"
AnalyzeSequenceFilesData.pl, ExtractFromSequenceFiles.pl, InfoAminoAcids.pl, InfoNucleicAcids.pl
.SH "COPYRIGHT"
.IX Header "COPYRIGHT"
Copyright (C) 2015 Manish Sud. All rights reserved.
.PP
This file is part of MayaChemTools.
.PP
MayaChemTools is free software; you can redistribute it and/or modify it under
the terms of the \s-1GNU\s0 Lesser General Public License as published by the Free
Software Foundation; either version 3 of the License, or (at your option)
any later version.