comparison docs/modules/txt/SequenceFileUtil.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
| author | deepakjadmin |
|---|---|
| date | Wed, 20 Jan 2016 09:23:18 -0500 |
| parents | |
| children |
comparison
| -1:000000000000 | 0:4816e4a8ae95 |
|---|---|
| 1 NAME | |
| 2 SequenceFileUtil | |
| 3 | |
| 4 SYNOPSIS | |
| 5 use SequenceFileUtil ; | |
| 6 | |
| 7 use SequenceFileUtil qw(:all); | |
| 8 | |
| 9 DESCRIPTION | |
| 10 SequenceFileUtil module provides the following functions: | |
| 11 | |
| 12 AreSequenceLengthsIdentical, CalcuatePercentSequenceIdentity, | |
| 13 CalculatePercentSequenceIdentityMatrix, GetLongestSequence, | |
| 14 GetSequenceLength, GetShortestSequence, IsClustalWSequenceFile, | |
| 15 IsGapResidue, IsMSFSequenceFile, IsPIRFastaSequenceFile, | |
| 16 IsPearsonFastaSequenceFile, IsSupportedSequenceFile, | |
| 17 ReadClustalWSequenceFile, ReadMSFSequenceFile, ReadPIRFastaSequenceFile, | |
| 18 ReadPearsonFastaSequenceFile, ReadSequenceFile, | |
| 19 RemoveSequenceAlignmentGapColumns, RemoveSequenceGaps, | |
| 20 WritePearsonFastaSequenceFile. The SequenceFileUtil module provides various | |
| 21 methods to process sequence files and retrieve appropriate information. | |
| 22 | |
| 23 FUNCTIONS | |
| 24 AreSequenceLengthsIdentical | |
| 25 $Status = AreSequenceLengthsIdentical($SequencesDataRef); | |
| 26 | |
| 27 Checks the lengths of all the sequences available in | |
| 28 *SequencesDataRef* and returns 1 or 0 based on whether all the | |
| 29 sequences have the same length. | |
| 30 | |
| 31 CalcuatePercentSequenceIdentity | |
| 32 $PercentIdentity = | |
| 33 CalcuatePercentSequenceIdentity( | |
| 34 $Sequence1, $Sequence2, [$IgnoreGaps, $Precision]); | |
| 35 | |
| 36 Returns percent identity between *Sequence1* and *Sequence2*. | |
| 37 Optional arguments *IgnoreGaps* and *Precision* control handling of | |
| 38 gaps in sequences and precision of the returned value. By default, | |
| 39 gaps are ignored and precision is set to 1 decimal place. | |
| 40 | |
| 41 CalculatePercentSequenceIdentityMatrix | |
| 42 $IdentityMatrixDataRef = CalculatePercentSequenceIdentityMatrix( | |
| 43 $SequencesDataRef, [$IgnoreGaps, | |
| 44 $Precision]); | |
| 45 | |
| 46 Calculates pairwise percent identity between all the sequences | |
| 47 available in *SequencesDataRef* and returns a reference to an identity | |
| 48 matrix hash. Optional arguments *IgnoreGaps* and *Precision* control | |
| 49 handling of gaps in sequences and precision of the returned value. | |
| 50 By default, gaps are ignored and precision is set to 1 decimal place. | |
| 51 | |
| 52 GetSequenceLength | |
| 53 $SequenceLength = GetSequenceLength($Sequence, [$IgnoreGaps]); | |
| 54 | |
| 55 Returns length of the specified sequence. Optional argument | |
| 56 *IgnoreGaps* controls handling of gaps. By default, gaps are | |
| 57 ignored. | |
| 58 | |
| 59 GetShortestSequence | |
| 60 ($ID, $Sequence, $SeqLen, $Description) = GetShortestSequence( | |
| 61 $SequencesDataRef, [$IgnoreGaps]); | |
| 62 | |
| 63 Checks the lengths of all the sequences available in | |
| 64 $SequencesDataRef and returns $ID, $Sequence, $SeqLen, and | |
| 65 $Description values for the shortest sequence. Optional argument | |
| 66 $IgnoreGaps controls handling of gaps in sequences. By default, gaps | |
| 67 are ignored. | |
| 68 | |
| 69 GetLongestSequence | |
| 70 ($ID, $Sequence, $SeqLen, $Description) = GetLongestSequence( | |
| 71 $SequencesDataRef, [$IgnoreGaps]); | |
| 72 | |
| 73 Checks the lengths of all the sequences available in | |
| 74 *SequencesDataRef* and returns ID, Sequence, SeqLen, and Description | |
| 75 values for the longest sequence. Optional argument *IgnoreGaps* | |
| 76 controls handling of gaps in sequences. By default, gaps are | |
| 77 ignored. | |
| 78 | |
| 79 IsGapResidue | |
| 80 $Status = IsGapResidue($Residue); | |
| 81 | |
| 82 Returns 1 or 0 based on whether *Residue* corresponds to a gap. Any | |
| 83 character other than A to Z is considered a gap residue. | |
| 84 | |
| 85 IsSupportedSequenceFile | |
| 86 $Status = IsSupportedSequenceFile($SequenceFile); | |
| 87 | |
| 88 Returns 1 or 0 based on whether *SequenceFile* corresponds to a | |
| 89 supported sequence format. | |
| 90 | |
| 91 IsClustalWSequenceFile | |
| 92 $Status = IsClustalWSequenceFile($SequenceFile); | |
| 93 | |
| 94 Returns 1 or 0 based on whether *SequenceFile* corresponds to | |
| 95 Clustal sequence alignment format. | |
| 96 | |
| 97 IsPearsonFastaSequenceFile | |
| 98 $Status = IsPearsonFastaSequenceFile($SequenceFile); | |
| 99 | |
| 100 Returns 1 or 0 based on whether *SequenceFile* corresponds to | |
| 101 Pearson FASTA sequence format. | |
| 102 | |
| 103 IsPIRFastaSequenceFile | |
| 104 $Status = IsPIRFastaSequenceFile($SequenceFile); | |
| 105 | |
| 106 Returns 1 or 0 based on whether *SequenceFile* corresponds to PIR | |
| 107 FASTA sequence format. | |
| 108 | |
| 109 IsMSFSequenceFile | |
| 110 $Status = IsMSFSequenceFile($SequenceFile); | |
| 111 | |
| 112 Returns 1 or 0 based on whether *SequenceFile* corresponds to MSF | |
| 113 sequence alignment format. | |
| 114 | |
| 115 ReadSequenceFile | |
| 116 $SequenceDataMapRef = ReadSequenceFile($SequenceFile); | |
| 117 | |
| 118 Reads *SequenceFile* and returns a reference to a hash containing | |
| 119 the following key/value pairs: | |
| 120 | |
| 121 $SequenceDataMapRef->{IDs} - Array of sequence IDs | |
| 122 $SequenceDataMapRef->{Count} - Number of sequences | |
| 123 $SequenceDataMapRef->{Description}{$ID} - Sequence description | |
| 124 $SequenceDataMapRef->{Sequence}{$ID} - Sequence for a specific ID | |
| 125 $SequenceDataMapRef->{Sequence}{InputFileType} - File format | |
| 126 | |
| 127 ReadClustalWSequenceFile | |
| 128 $SequenceDataMapRef = ReadClustalWSequenceFile($SequenceFile); | |
| 129 | |
| 130 Reads ClustalW *SequenceFile* and returns a reference to a hash | |
| 131 containing the key/value pairs described for the | |
| 132 ReadSequenceFile function. | |
| 133 | |
| 134 ReadMSFSequenceFile | |
| 135 $SequenceDataMapRef = ReadMSFSequenceFile($SequenceFile); | |
| 136 | |
| 137 Reads MSF *SequenceFile* and returns a reference to a hash containing | |
| 138 the key/value pairs described for the ReadSequenceFile function. | |
| 139 | |
| 140 ReadPIRFastaSequenceFile | |
| 141 $SequenceDataMapRef = ReadPIRFastaSequenceFile($SequenceFile); | |
| 142 | |
| 143 Reads PIR FASTA *SequenceFile* and returns a reference to a hash | |
| 144 containing the key/value pairs described for the | |
| 145 ReadSequenceFile function. | |
| 146 | |
| 147 ReadPearsonFastaSequenceFile | |
| 148 $SequenceDataMapRef = ReadPearsonFastaSequenceFile($SequenceFile); | |
| 149 | |
| 150 Reads Pearson FASTA *SequenceFile* and returns a reference to a hash | |
| 151 containing the key/value pairs described for the | |
| 152 ReadSequenceFile function. | |
| 153 | |
| 154 RemoveSequenceGaps | |
| 155 $SeqWithoutGaps = RemoveSequenceGaps($Sequence); | |
| 156 | |
| 157 Removes gaps from *Sequence* and returns a sequence without any gaps. | |
| 158 | |
| 159 RemoveSequenceAlignmentGapColumns | |
| 160 $NewAlignmentDataMapRef = RemoveSequenceAlignmentGapColumns( | |
| 161 $AlignmentDataMapRef); | |
| 162 | |
| 163 Using an input alignment data map reference containing the following keys, | |
| 164 generates a new hash with the same set of keys after residue columns | |
| 165 containing only gaps have been removed: | |
| 166 | |
| 167 {IDs} : Array of IDs in order as they appear in file | |
| 168 {Count}: ID count | |
| 169 {Description}{$ID} : Description data | |
| 170 {Sequence}{$ID} : Sequence data | |
| 171 | |
| 172 WritePearsonFastaSequenceFile | |
| 173 WritePearsonFastaSequenceFile($SequenceFileName, $SequenceDataRef, | |
| 174 [$MaxLength]); | |
| 175 | |
| 176 Using sequence data specified via *SequenceDataRef*, writes out a | |
| 177 Pearson FASTA sequence file. Optional argument *MaxLength* controls | |
| 178 the maximum sequence length on each line; default is 80. | |
| 179 | |
| 180 AUTHOR | |
| 181 Manish Sud <msud@san.rr.com> | |
| 182 | |
| 183 SEE ALSO | |
| 184 PDBFileUtil.pm | |
| 185 | |
| 186 COPYRIGHT | |
| 187 Copyright (C) 2015 Manish Sud. All rights reserved. | |
| 188 | |
| 189 This file is part of MayaChemTools. | |
| 190 | |
| 191 MayaChemTools is free software; you can redistribute it and/or modify it | |
| 192 under the terms of the GNU Lesser General Public License as published by | |
| 193 the Free Software Foundation; either version 3 of the License, or (at | |
| 194 your option) any later version. | |
| 195 |
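
The percent identity and gap handling described under CalcuatePercentSequenceIdentity and IsGapResidue can be illustrated with a small standalone Perl sketch. It does not call SequenceFileUtil and is not the module's implementation: `is_gap` and `percent_identity` are hypothetical helper names. The sketch assumes that letters of either case count as residues, that a column is skipped when either residue is a gap, and that skipped columns are left out of the total. Only the defaults (gaps ignored, 1 decimal place) and the rule that anything other than A to Z is a gap come from the documentation above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SequenceFileUtil. Follows the documented
# rule that any character other than A to Z is a gap; accepting lowercase
# letters as residues is an assumption of this sketch.
sub is_gap {
    my ($residue) = @_;
    return $residue =~ /^[A-Za-z]$/ ? 0 : 1;
}

# Hypothetical sketch of percent identity for two aligned sequences of equal
# length. Assumption: with $ignore_gaps set, a column is skipped when either
# residue is a gap, and skipped columns do not count toward the total.
sub percent_identity {
    my ($seq1, $seq2, $ignore_gaps, $precision) = @_;
    $ignore_gaps = 1 unless defined $ignore_gaps;    # documented default
    $precision   = 1 unless defined $precision;      # documented default

    die "Sequences must have the same length\n"
        unless length($seq1) == length($seq2);

    my ($identical, $compared) = (0, 0);
    for my $i (0 .. length($seq1) - 1) {
        my $r1 = substr($seq1, $i, 1);
        my $r2 = substr($seq2, $i, 1);
        next if $ignore_gaps && (is_gap($r1) || is_gap($r2));
        $compared++;
        $identical++ if $r1 eq $r2;
    }

    # Avoid dividing by zero when no columns were compared.
    my $percent = $compared ? 100 * $identical / $compared : 0;
    return sprintf("%.${precision}f", $percent);
}

print percent_identity("ACD-EF", "ACDGEY"), "\n";       # prints 80.0
print percent_identity("ACD-EF", "ACDGEY", 0), "\n";    # prints 66.7
```

With gaps ignored, the gapped column is dropped and 4 of the 5 remaining columns match. With gaps compared, the same 4 matches are counted against all 6 columns.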
