| 0 | 1 <html> | 
|  | 2 <head> | 
|  | 3 <title>MayaChemTools:Documentation:SequenceFileUtil.pm</title> | 
|  | 4 <meta http-equiv="content-type" content="text/html;charset=utf-8"> | 
|  | 5 <link rel="stylesheet" type="text/css" href="../../css/MayaChemTools.css"> | 
|  | 6 </head> | 
|  | 7 <body leftmargin="20" rightmargin="20" topmargin="10" bottommargin="10"> | 
|  | 8 <br/> | 
|  | 9 <center> | 
|  | 10 <a href="http://www.mayachemtools.org" title="MayaChemTools Home"><img src="../../images/MayaChemToolsLogo.gif" border="0" alt="MayaChemTools"></a> | 
|  | 11 </center> | 
|  | 12 <br/> | 
|  | 13 <div class="DocNav"> | 
|  | 14 <table width="100%" border=0 cellpadding=0 cellspacing=2> | 
|  | 15 <tr align="left" valign="top"><td width="33%" align="left"><a href="./SDFileUtil.html" title="SDFileUtil.html">Previous</a>  <a href="./index.html" title="Table of Contents">TOC</a>  <a href="./StatisticsUtil.html" title="StatisticsUtil.html">Next</a></td><td width="34%" align="middle"><strong>SequenceFileUtil.pm</strong></td><td width="33%" align="right"><a href="././code/SequenceFileUtil.html" title="View source code">Code</a> | <a href="./../pdf/SequenceFileUtil.pdf" title="PDF US Letter Size">PDF</a> | <a href="./../pdfgreen/SequenceFileUtil.pdf" title="PDF US Letter Size with narrow margins: www.changethemargins.com">PDFGreen</a> | <a href="./../pdfa4/SequenceFileUtil.pdf" title="PDF A4 Size">PDFA4</a> | <a href="./../pdfa4green/SequenceFileUtil.pdf" title="PDF A4 Size with narrow margins: www.changethemargins.com">PDFA4Green</a></td></tr> | 
|  | 16 </table> | 
|  | 17 </div> | 
|  | 18 <p> | 
|  | 19 </p> | 
|  | 20 <h2>NAME</h2> | 
|  | 21 <p>SequenceFileUtil</p> | 
|  | 22 <p> | 
|  | 23 </p> | 
|  | 24 <h2>SYNOPSIS</h2> | 
|  | 25 <p>use SequenceFileUtil ;</p> | 
|  | 26 <p>use SequenceFileUtil qw(:all);</p> | 
|  | 27 <p> | 
|  | 28 </p> | 
|  | 29 <h2>DESCRIPTION</h2> | 
|  | 30 <p>The <strong>SequenceFileUtil</strong> module provides the following functions:</p> | 
|  | 31 <p> <a href="#aresequencelengthsidentical">AreSequenceLengthsIdentical</a>, <a href="#calcuatepercentsequenceidentity">CalcuatePercentSequenceIdentity</a> | 
|  | 32 , <a href="#calculatepercentsequenceidentitymatrix">CalculatePercentSequenceIdentityMatrix</a>, <a href="#getlongestsequence">GetLongestSequence</a>, <a href="#getsequencelength">GetSequenceLength</a> | 
|  | 33 , <a href="#getshortestsequence">GetShortestSequence</a>, <a href="#isclustalwsequencefile">IsClustalWSequenceFile</a>, <a href="#isgapresidue">IsGapResidue</a>, <a href="#ismsfsequencefile">IsMSFSequenceFile</a> | 
|  | 34 , <a href="#ispirfastasequencefile">IsPIRFastaSequenceFile</a>, <a href="#ispearsonfastasequencefile">IsPearsonFastaSequenceFile</a>, <a href="#issupportedsequencefile">IsSupportedSequenceFile</a> | 
|  | 35 , <a href="#readclustalwsequencefile">ReadClustalWSequenceFile</a>, <a href="#readmsfsequencefile">ReadMSFSequenceFile</a>, <a href="#readpirfastasequencefile">ReadPIRFastaSequenceFile</a> | 
|  | 36 , <a href="#readpearsonfastasequencefile">ReadPearsonFastaSequenceFile</a>, <a href="#readsequencefile">ReadSequenceFile</a>, <a href="#removesequencealignmentgapcolumns">RemoveSequenceAlignmentGapColumns</a> | 
|  | 37 , <a href="#removesequencegaps">RemoveSequenceGaps</a>, <a href="#writepearsonfastasequencefile">WritePearsonFastaSequenceFile</a> | 
|  | 38 </p> | 
|  | 39 <p>The SequenceFileUtil module provides various methods to process sequence files and retrieve appropriate information.</p> | 
|  | 40 <p> | 
|  | 41 </p> | 
|  | 42 <h2>FUNCTIONS</h2> | 
|  | 43 <dl> | 
|  | 44 <dt><strong><a name="aresequencelengthsidentical" class="item"><strong>AreSequenceLengthsIdentical</strong></a></strong></dt> | 
|  | 45 <dd> | 
|  | 46 <div class="OptionsBox"> | 
|  | 47     $Status = AreSequenceLengthsIdentical($SequencesDataRef);</div> | 
|  | 48 <p>Checks the lengths of all the sequences available in <em>SequencesDataRef</em> and returns 1 | 
|  | 49 or 0 based on whether all the sequences have the same length.</p> | 
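<p>As an illustration of the check described above, here is a minimal sketch. The helper name <em>all_lengths_identical</em> is hypothetical and is not part of the module. The sketch assumes the data map uses the <em>IDs</em> and <em>Sequence</em> keys described under <strong>ReadSequenceFile</strong> and that gap characters count toward the length.</p>

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SequenceFileUtil: returns 1 when every
# sequence in the data map has the same length, 0 otherwise.
# Assumption: gap characters count toward the length.
sub all_lengths_identical {
    my ($data_ref) = @_;
    my %seen;
    for my $id (@{ $data_ref->{IDs} }) {
        $seen{ length($data_ref->{Sequence}{$id}) } = 1;
    }
    my $distinct = scalar(keys %seen);
    return $distinct <= 1 ? 1 : 0;
}

# Made-up example data: both sequences are 7 characters long.
my %data = (
    IDs      => [ 'seq1', 'seq2' ],
    Count    => 2,
    Sequence => { seq1 => 'ACDE-FG', seq2 => 'ACDEHFG' },
);
print all_lengths_identical(\%data), "\n";
```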
|  | 50 </dd> | 
|  | 51 <dt><strong><a name="calcuatepercentsequenceidentity" class="item"><strong>CalcuatePercentSequenceIdentity</strong></a></strong></dt> | 
|  | 52 <dd> | 
|  | 53 <div class="OptionsBox"> | 
|  | 54     $PercentIdentity = | 
|  | 55        CalcuatePercentSequenceIdentity( | 
|  | 56           $Sequence1, $Sequence2, [$IgnoreGaps, $Precision]);</div> | 
|  | 57 <p>Returns the percent identity between <em>Sequence1</em> and <em>Sequence2</em>. Optional arguments | 
|  | 58 <em>IgnoreGaps</em> and <em>Precision</em> control the handling of gaps in sequences and the precision of the | 
|  | 59 returned value. By default, gaps are ignored and precision is set to 1 decimal place.</p> | 
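<p>The following sketch shows one plausible way to compute such a value. It is not the module's implementation: the helper name <em>percent_identity</em>, the choice to skip a position when either residue is a gap, and the use of compared positions as the denominator are all assumptions. Both sequences are assumed to be aligned and of equal length.</p>

```perl
use strict;
use warnings;

# Hypothetical sketch, not the module's implementation: percent identity over
# aligned positions of two equal-length sequences. When $ignore_gaps is true,
# positions where either residue is not a letter are skipped (an assumption).
sub percent_identity {
    my ($seq1, $seq2, $ignore_gaps, $precision) = @_;
    $ignore_gaps = 1 unless defined $ignore_gaps;
    $precision   = 1 unless defined $precision;

    my ($matches, $compared) = (0, 0);
    for my $i (0 .. length($seq1) - 1) {
        my $r1 = uc(substr($seq1, $i, 1));
        my $r2 = uc(substr($seq2, $i, 1));
        next if $ignore_gaps && ($r1 !~ /^[A-Z]$/ || $r2 !~ /^[A-Z]$/);
        $compared++;
        $matches++ if $r1 eq $r2;
    }
    return 0 unless $compared;
    return sprintf("%.${precision}f", 100 * $matches / $compared);
}

# Six positions are compared (the gap column is skipped); five of them match.
print percent_identity('ACDE-FG', 'ACDEHFA'), "\n";
```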
|  | 60 </dd> | 
|  | 61 <dt><strong><a name="calculatepercentsequenceidentitymatrix" class="item"><strong>CalculatePercentSequenceIdentityMatrix</strong></a></strong></dt> | 
|  | 62 <dd> | 
|  | 63 <div class="OptionsBox"> | 
|  | 64     $IdentityMatrixDataRef = CalculatePercentSequenceIdentityMatrix( | 
|  | 65                              $SequencesDataRef, [$IgnoreGaps, | 
|  | 66                              $Precision]);</div> | 
|  | 67 <p>Calculates the pairwise percent identity between all the sequences available in <em>SequencesDataRef</em> | 
|  | 68 and returns a reference to an identity matrix hash. Optional arguments <em>IgnoreGaps</em> and | 
|  | 69 <em>Precision</em> control the handling of gaps in sequences and the precision of the returned values. By default, gaps | 
|  | 70 are ignored and precision is set to 1 decimal place.</p> | 
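<p>A pairwise matrix of this kind could be assembled as sketched here. The helper names and the nested layout <em>$matrix-&gt;{$ID1}{$ID2}</em> are assumptions made for illustration; the layout of the hash actually returned by the module is not specified in this document. Gap handling is omitted to keep the sketch short.</p>

```perl
use strict;
use warnings;

# Hypothetical pairwise helper: percent of aligned positions that are identical.
# Assumes equal-length sequences and does no special gap handling.
sub simple_identity {
    my ($s1, $s2) = @_;
    my $len = length($s1);
    return 0 unless $len;
    my $matches = grep { substr($s1, $_, 1) eq substr($s2, $_, 1) } 0 .. $len - 1;
    return sprintf('%.1f', 100 * $matches / $len);
}

# Builds a nested hash keyed as $matrix->{$id1}{$id2}; this layout is an
# assumption of the sketch, not the module's documented structure.
sub identity_matrix {
    my ($data_ref) = @_;
    my %matrix;
    for my $id1 (@{ $data_ref->{IDs} }) {
        for my $id2 (@{ $data_ref->{IDs} }) {
            $matrix{$id1}{$id2} = simple_identity(
                $data_ref->{Sequence}{$id1},
                $data_ref->{Sequence}{$id2},
            );
        }
    }
    return \%matrix;
}

# Made-up example data.
my $matrix = identity_matrix({
    IDs      => [ 'A', 'B', 'C' ],
    Sequence => { A => 'ACDE', B => 'ACDF', C => 'TCDF' },
});
print "A vs B: $matrix->{A}{B}\n";
```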
|  | 71 </dd> | 
|  | 72 <dt><strong><a name="getsequencelength" class="item"><strong>GetSequenceLength</strong></a></strong></dt> | 
|  | 73 <dd> | 
|  | 74 <div class="OptionsBox"> | 
|  | 75     $SequenceLength = GetSequenceLength($Sequence, [$IgnoreGaps]);</div> | 
|  | 76 <p>Returns length of the specified sequence. Optional argument <em>IgnoreGaps</em> controls handling | 
|  | 77 of gaps. By default, gaps are ignored.</p> | 
|  | 78 </dd> | 
|  | 79 <dt><strong><a name="getshortestsequence" class="item"><strong>GetShortestSequence</strong></a></strong></dt> | 
|  | 80 <dd> | 
|  | 81 <div class="OptionsBox"> | 
|  | 82    ($ID, $Sequence, $SeqLen, $Description) = GetShortestSequence( | 
|  | 83           $SequencesDataRef, [$IgnoreGaps]);</div> | 
|  | 84 <p>Checks the lengths of all the sequences available in <em>SequencesDataRef</em> and returns <strong>ID</strong>, | 
|  | 85 <strong>Sequence</strong>, <strong>SeqLen</strong>, and <strong>Description</strong> values for the shortest sequence. Optional argument | 
|  | 86 <em>IgnoreGaps</em> controls handling of gaps in sequences. By default, gaps are ignored.</p> | 
|  | 87 </dd> | 
|  | 88 <dt><strong><a name="getlongestsequence" class="item"><strong>GetLongestSequence</strong></a></strong></dt> | 
|  | 89 <dd> | 
|  | 90 <div class="OptionsBox"> | 
|  | 91    ($ID, $Sequence, $SeqLen, $Description) = GetLongestSequence( | 
|  | 92           $SequencesDataRef, [$IgnoreGaps]);</div> | 
|  | 93 <p>Checks the lengths of all the sequences available in <em>SequencesDataRef</em> and returns <strong>ID</strong>, | 
|  | 94 <strong>Sequence</strong>, <strong>SeqLen</strong>, and <strong>Description</strong> values for the longest sequence. Optional argument | 
|  | 95 <em>IgnoreGaps</em> controls handling of gaps in sequences. By default, gaps are ignored.</p> | 
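<p>The selection logic can be sketched as follows. The helper <em>longest_sequence</em> is hypothetical, and two details are assumptions: letters are the only characters counted when gaps are ignored, and the sequence is returned as stored, with any gap characters left in place.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: picks the longest sequence from a data map laid out with
# IDs, Sequence, and Description keys. When $ignore_gaps is true (the default
# here, mirroring the documented default), only letters are counted.
sub longest_sequence {
    my ($data_ref, $ignore_gaps) = @_;
    $ignore_gaps = 1 unless defined $ignore_gaps;

    my ($best_id, $best_len) = (undef, -1);
    for my $id (@{ $data_ref->{IDs} }) {
        my $seq = $data_ref->{Sequence}{$id};
        # tr/// with an empty replacement list counts matching characters.
        my $len = $ignore_gaps ? ($seq =~ tr/A-Za-z//) : length($seq);
        ($best_id, $best_len) = ($id, $len) if $len > $best_len;
    }
    return ($best_id, $data_ref->{Sequence}{$best_id}, $best_len,
            $data_ref->{Description}{$best_id});
}

# Made-up example data: s1 has 4 residues and 2 gaps, s2 has 5 residues.
my %data = (
    IDs         => [ 's1', 's2' ],
    Count       => 2,
    Description => { s1 => 'gapped', s2 => 'ungapped' },
    Sequence    => { s1 => 'AC--DE', s2 => 'ACDEF' },
);
my ($ID, $Sequence, $SeqLen, $Description) = longest_sequence(\%data);
print "$ID $SeqLen\n";
```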
|  | 96 </dd> | 
|  | 97 <dt><strong><a name="isgapresidue" class="item"><strong>IsGapResidue</strong></a></strong></dt> | 
|  | 98 <dd> | 
|  | 99 <div class="OptionsBox"> | 
|  | 100     $Status = IsGapResidue($Residue);</div> | 
|  | 101 <p>Returns 1 or 0 based on whether <em>Residue</em> corresponds to a gap. Any character other than A to Z is | 
|  | 102 considered a gap residue.</p> | 
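<p>The rule above can be restated in a few lines. The helper <em>is_gap</em> is hypothetical, and treating lowercase letters as residues is an assumption of this sketch.</p>

```perl
use strict;
use warnings;

# Hypothetical restatement of the documented rule: anything other than a
# single letter is a gap. Accepting lowercase letters is an assumption.
sub is_gap {
    my ($residue) = @_;
    return $residue =~ /^[A-Za-z]$/ ? 0 : 1;
}

# Prints "0 1 1 0": 'A' and 'w' are residues, '-' and '.' are gaps.
print join(' ', map { is_gap($_) } ('A', '-', '.', 'w')), "\n";
```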
|  | 103 </dd> | 
|  | 104 <dt><strong><a name="issupportedsequencefile" class="item"><strong>IsSupportedSequenceFile</strong></a></strong></dt> | 
|  | 105 <dd> | 
|  | 106 <div class="OptionsBox"> | 
|  | 107     $Status = IsSupportedSequenceFile($SequenceFile);</div> | 
|  | 108 <p>Returns 1 or 0 based on whether <em>SequenceFile</em> corresponds to a supported sequence | 
|  | 109 format.</p> | 
|  | 110 </dd> | 
|  | 111 <dt><strong><a name="isclustalwsequencefile" class="item"><strong>IsClustalWSequenceFile</strong></a></strong></dt> | 
|  | 112 <dd> | 
|  | 113 <div class="OptionsBox"> | 
|  | 114     $Status = IsClustalWSequenceFile($SequenceFile);</div> | 
|  | 115 <p>Returns 1 or 0 based on whether <em>SequenceFile</em> corresponds to Clustal sequence alignment | 
|  | 116 format.</p> | 
|  | 117 </dd> | 
|  | 118 <dt><strong><a name="ispearsonfastasequencefile" class="item"><strong>IsPearsonFastaSequenceFile</strong></a></strong></dt> | 
|  | 119 <dd> | 
|  | 120 <div class="OptionsBox"> | 
|  | 121     $Status = IsPearsonFastaSequenceFile($SequenceFile);</div> | 
|  | 122 <p>Returns 1 or 0 based on whether <em>SequenceFile</em> corresponds to Pearson FASTA sequence | 
|  | 123 format.</p> | 
|  | 124 </dd> | 
|  | 125 <dt><strong><a name="ispirfastasequencefile" class="item"><strong>IsPIRFastaSequenceFile</strong></a></strong></dt> | 
|  | 126 <dd> | 
|  | 127 <div class="OptionsBox"> | 
|  | 128     $Status = IsPIRFastaSequenceFile($SequenceFile);</div> | 
|  | 129 <p>Returns 1 or 0 based on whether <em>SequenceFile</em> corresponds to PIR FASTA sequence | 
|  | 130 format.</p> | 
|  | 131 </dd> | 
|  | 132 <dt><strong><a name="ismsfsequencefile" class="item"><strong>IsMSFSequenceFile</strong></a></strong></dt> | 
|  | 133 <dd> | 
|  | 134 <div class="OptionsBox"> | 
|  | 135     $Status = IsMSFSequenceFile($SequenceFile);</div> | 
|  | 136 <p>Returns 1 or 0 based on whether <em>SequenceFile</em> corresponds to MSF sequence alignment | 
|  | 137 format.</p> | 
|  | 138 </dd> | 
|  | 139 <dt><strong><a name="readsequencefile" class="item"><strong>ReadSequenceFile</strong></a></strong></dt> | 
|  | 140 <dd> | 
|  | 141 <div class="OptionsBox"> | 
|  | 142     $SequenceDataMapRef = ReadSequenceFile($SequenceFile);</div> | 
|  | 143 <p>Reads <em>SequenceFile</em> and returns a reference to a hash containing the following key/value | 
|  | 144 pairs:</p> | 
|  | 145 <div class="OptionsBox"> | 
|  | 146     $SequenceDataMapRef->{IDs} - Array of sequence IDs | 
|  | 147 <br/>    $SequenceDataMapRef->{Count} - Number of sequences | 
|  | 148 <br/>    $SequenceDataMapRef->{Description}{$ID} - Sequence description | 
|  | 149 <br/>    $SequenceDataMapRef->{Sequence}{$ID} - Sequence for a specific ID | 
|  | 150 <br/>    $SequenceDataMapRef->{Sequence}{InputFileType} - File format</div> | 
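<p>To show how these keys fit together, the sketch below walks a hand-built hash with the same layout. No file is read, and the IDs, descriptions, and residues are made up for illustration.</p>

```perl
use strict;
use warnings;

# Hand-built stand-in for the hash that ReadSequenceFile is documented to
# return; every value here is made up for illustration.
my $SequenceDataMapRef = {
    IDs         => [ 'seq1', 'seq2' ],
    Count       => 2,
    Description => { seq1 => 'first example', seq2 => 'second example' },
    Sequence    => { seq1 => 'ACDEFG', seq2 => 'AC-EFG' },
};

# Walk the sequences in order using the IDs array, then look up the
# description and sequence for each ID.
my @lines;
for my $ID (@{ $SequenceDataMapRef->{IDs} }) {
    push @lines, sprintf('%s (%s): %s',
        $ID,
        $SequenceDataMapRef->{Description}{$ID},
        $SequenceDataMapRef->{Sequence}{$ID});
}
print "$_\n" for @lines;
```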
|  | 151 </dd> | 
|  | 152 <dt><strong><a name="readclustalwsequencefile" class="item"><strong>ReadClustalWSequenceFile</strong></a></strong></dt> | 
|  | 153 <dd> | 
|  | 154 <div class="OptionsBox"> | 
|  | 155     $SequenceDataMapRef = ReadClustalWSequenceFile($SequenceFile);</div> | 
|  | 156 <p>Reads a ClustalW <em>SequenceFile</em> and returns a reference to a hash containing the key/value | 
|  | 157 pairs described for the <strong>ReadSequenceFile</strong> function.</p> | 
|  | 158 </dd> | 
|  | 159 <dt><strong><a name="readmsfsequencefile" class="item"><strong>ReadMSFSequenceFile</strong></a></strong></dt> | 
|  | 160 <dd> | 
|  | 161 <div class="OptionsBox"> | 
|  | 162     $SequenceDataMapRef = ReadMSFSequenceFile($SequenceFile);</div> | 
|  | 163 <p>Reads an MSF <em>SequenceFile</em> and returns a reference to a hash containing the key/value | 
|  | 164 pairs described for the <strong>ReadSequenceFile</strong> function.</p> | 
|  | 165 </dd> | 
|  | 166 <dt><strong><a name="readpirfastasequencefile" class="item"><strong>ReadPIRFastaSequenceFile</strong></a></strong></dt> | 
|  | 167 <dd> | 
|  | 168 <div class="OptionsBox"> | 
|  | 169     $SequenceDataMapRef = ReadPIRFastaSequenceFile($SequenceFile);</div> | 
|  | 170 <p>Reads a PIR FASTA <em>SequenceFile</em> and returns a reference to a hash containing the key/value | 
|  | 171 pairs described for the <strong>ReadSequenceFile</strong> function.</p> | 
|  | 172 </dd> | 
|  | 173 <dt><strong><a name="readpearsonfastasequencefile" class="item"><strong>ReadPearsonFastaSequenceFile</strong></a></strong></dt> | 
|  | 174 <dd> | 
|  | 175 <div class="OptionsBox"> | 
|  | 176     $SequenceDataMapRef = ReadPearsonFastaSequenceFile($SequenceFile);</div> | 
|  | 177 <p>Reads a Pearson FASTA <em>SequenceFile</em> and returns a reference to a hash containing the key/value | 
|  | 178 pairs described for the <strong>ReadSequenceFile</strong> function.</p> | 
|  | 179 </dd> | 
|  | 180 <dt><strong><a name="removesequencegaps" class="item"><strong>RemoveSequenceGaps</strong></a></strong></dt> | 
|  | 181 <dd> | 
|  | 182 <div class="OptionsBox"> | 
|  | 183     $SeqWithoutGaps = RemoveSequenceGaps($Sequence);</div> | 
|  | 184 <p>Removes gaps from <em>Sequence</em> and returns the sequence without any gaps.</p> | 
|  | 185 </dd> | 
|  | 186 <dt><strong><a name="removesequencealignmentgapcolumns" class="item"><strong>RemoveSequenceAlignmentGapColumns</strong></a></strong></dt> | 
|  | 187 <dd> | 
|  | 188 <div class="OptionsBox"> | 
|  | 189     $NewAlignmentDataMapRef = RemoveSequenceAlignmentGapColumns( | 
|  | 190                               $AlignmentDataMapRef);</div> | 
|  | 191 <p>Takes a reference to an alignment data map containing the following keys and generates a new hash with | 
|  | 192 the same set of keys after residue columns containing only gaps have been removed:</p> | 
|  | 193 <div class="OptionsBox"> | 
|  | 194     {IDs} : Array of IDs in order as they appear in file | 
|  | 195 <br/>    {Count}: ID count | 
|  | 196 <br/>    {Description}{$ID} : Description data | 
|  | 197 <br/>    {Sequence}{$ID} : Sequence data</div> | 
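<p>The column removal can be sketched as follows. The helper <em>remove_gap_columns</em> is hypothetical and is not the module's implementation. It assumes that all sequences in the alignment have the same length and that any non-letter character is a gap.</p>

```perl
use strict;
use warnings;

# Hypothetical sketch: drop every alignment column in which all sequences carry
# a non-letter (gap) character. Assumes all sequences have the same length.
sub remove_gap_columns {
    my ($aln_ref) = @_;
    my @ids = @{ $aln_ref->{IDs} };
    my $len = @ids ? length($aln_ref->{Sequence}{ $ids[0] }) : 0;

    # Keep a column if at least one sequence has a letter in it.
    my @keep = grep {
        my $col = $_;
        grep { substr($aln_ref->{Sequence}{$_}, $col, 1) =~ /[A-Za-z]/ } @ids;
    } 0 .. $len - 1;

    # Build a new map with the same keys; the input is left untouched.
    my %new = (
        IDs         => [@ids],
        Count       => scalar(@ids),
        Description => { %{ $aln_ref->{Description} || {} } },
        Sequence    => {},
    );
    for my $id (@ids) {
        $new{Sequence}{$id} =
            join('', map { substr($aln_ref->{Sequence}{$id}, $_, 1) } @keep);
    }
    return \%new;
}

# Made-up alignment: columns 3 and 4 contain only gap characters.
my $new_ref = remove_gap_columns({
    IDs         => [ 'seq1', 'seq2' ],
    Count       => 2,
    Description => { seq1 => 'one', seq2 => 'two' },
    Sequence    => { seq1 => 'AC--DE', seq2 => 'A-.-DF' },
});
print "$new_ref->{Sequence}{seq1}\n$new_ref->{Sequence}{seq2}\n";
```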
|  | 198 </dd> | 
|  | 199 <dt><strong><a name="writepearsonfastasequencefile" class="item"><strong>WritePearsonFastaSequenceFile</strong></a></strong></dt> | 
|  | 200 <dd> | 
|  | 201 <div class="OptionsBox"> | 
|  | 202     WritePearsonFastaSequenceFile($SequenceFileName, $SequenceDataRef, | 
|  | 203                                   [$MaxLength]);</div> | 
|  | 204 <p>Writes out a Pearson FASTA sequence file using the sequence data specified via <em>SequenceDataRef</em>. | 
|  | 205 Optional argument <em>MaxLength</em> controls the maximum number of sequence characters on each line; the default is | 
|  | 206 80.</p> | 
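<p>The line wrapping controlled by <em>MaxLength</em> can be illustrated with a small sketch. The helper <em>format_fasta_record</em> is hypothetical. It formats a single record as a string rather than writing a file, and the header layout (ID followed by a space and the description) is an assumption.</p>

```perl
use strict;
use warnings;

# Hypothetical sketch: format one FASTA record as a string, wrapping the
# sequence at $max_length characters per line (80 when omitted).
sub format_fasta_record {
    my ($id, $description, $sequence, $max_length) = @_;
    $max_length = 80 unless defined $max_length;

    # Assumed header layout: ">ID description".
    my $header = ">$id";
    $header .= " $description" if defined $description && length($description);

    # Split the sequence into chunks of at most $max_length characters.
    my @chunks = $sequence =~ /(.{1,$max_length})/g;
    return join("\n", $header, @chunks) . "\n";
}

# With a maximum of 4 characters per line, a 10-residue sequence spans 3 lines.
print format_fasta_record('seq1', 'example protein', 'ACDEFGHIKL', 4);
```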
|  | 207 </dd> | 
|  | 208 </dl> | 
|  | 209 <p> | 
|  | 210 </p> | 
|  | 211 <h2>AUTHOR</h2> | 
|  | 212 <p><a href="mailto:msud@san.rr.com">Manish Sud</a></p> | 
|  | 213 <p> | 
|  | 214 </p> | 
|  | 215 <h2>SEE ALSO</h2> | 
|  | 216 <p><a href="./PDBFileUtil.html">PDBFileUtil.pm</a> | 
|  | 217 </p> | 
|  | 218 <p> | 
|  | 219 </p> | 
|  | 220 <h2>COPYRIGHT</h2> | 
|  | 221 <p>Copyright (C) 2015 Manish Sud. All rights reserved.</p> | 
|  | 222 <p>This file is part of MayaChemTools.</p> | 
|  | 223 <p>MayaChemTools is free software; you can redistribute it and/or modify it under | 
|  | 224 the terms of the GNU Lesser General Public License as published by the Free | 
|  | 225 Software Foundation; either version 3 of the License, or (at your option) | 
|  | 226 any later version.</p> | 
|  | 227 <p> </p><p> </p><div class="DocNav"> | 
|  | 228 <table width="100%" border=0 cellpadding=0 cellspacing=2> | 
|  | 229 <tr align="left" valign="top"><td width="33%" align="left"><a href="./SDFileUtil.html" title="SDFileUtil.html">Previous</a>  <a href="./index.html" title="Table of Contents">TOC</a>  <a href="./StatisticsUtil.html" title="StatisticsUtil.html">Next</a></td><td width="34%" align="middle"><strong>March 29, 2015</strong></td><td width="33%" align="right"><strong>SequenceFileUtil.pm</strong></td></tr> | 
|  | 230 </table> | 
|  | 231 </div> | 
|  | 232 <br /> | 
|  | 233 <center> | 
|  | 234 <img src="../../images/h2o2.png"> | 
|  | 235 </center> | 
|  | 236 </body> | 
|  | 237 </html> |