comparison docs/modules/txt/SequenceFileUtil.txt @ 0:4816e4a8ae95 draft default tip
Uploaded
author | deepakjadmin |
---|---|
date | Wed, 20 Jan 2016 09:23:18 -0500 |
parents | |
children |
-1:000000000000 | 0:4816e4a8ae95 |
---|---|
1 NAME | |
2 SequenceFileUtil | |
3 | |
4 SYNOPSIS | |
5 use SequenceFileUtil ; | |
6 | |
7 use SequenceFileUtil qw(:all); | |
8 | |
9 DESCRIPTION | |
10 SequenceFileUtil module provides the following functions: | |
11 | |
12 AreSequenceLengthsIdentical, CalcuatePercentSequenceIdentity, | |
13 CalculatePercentSequenceIdentityMatrix, GetLongestSequence, | |
14 GetSequenceLength, GetShortestSequence, IsClustalWSequenceFile, | |
15 IsGapResidue, IsMSFSequenceFile, IsPIRFastaSequenceFile, | |
16 IsPearsonFastaSequenceFile, IsSupportedSequenceFile, | |
17 ReadClustalWSequenceFile, ReadMSFSequenceFile, ReadPIRFastaSequenceFile, | |
18 ReadPearsonFastaSequenceFile, ReadSequenceFile, | |
19 RemoveSequenceAlignmentGapColumns, RemoveSequenceGaps, | |
20 WritePearsonFastaSequenceFile. These functions process sequence files | |
21 and retrieve the relevant information from them. | |
22 | |
23 FUNCTIONS | |
24 AreSequenceLengthsIdentical | |
25 $Status = AreSequenceLengthsIdentical($SequencesDataRef); | |
26 | |
27 Checks the lengths of all the sequences available in | |
28 *SequencesDataRef* and returns 1 or 0 based on whether all the | |
29 sequences have the same length. | |
30 | |
31 CalcuatePercentSequenceIdentity | |
32 $PercentIdentity = | |
33 CalcuatePercentSequenceIdentity( | |
34 $Sequence1, $Sequence2, [$IgnoreGaps, $Precision]); | |
35 | |
36 Returns percent identity between *Sequence1* and *Sequence2*. | |
37 Optional arguments *IgnoreGaps* and *Precision* control handling of | |
38 gaps in sequences and precision of the returned value. By default, | |
39 gaps are ignored and precision is set to 1 decimal place. | |
40 | |
41 CalculatePercentSequenceIdentityMatrix | |
42 $IdentityMatrixDataRef = CalculatePercentSequenceIdentityMatrix( | |
43 $SequencesDataRef, [$IgnoreGaps, | |
44 $Precision]); | |
45 | |
46 Calculates pairwise percent identity between all the sequences | |
47 available in *SequencesDataRef* and returns a reference to an identity | |
48 matrix hash. Optional arguments *IgnoreGaps* and *Precision* control | |
49 handling of gaps in sequences and precision of the returned values. | |
50 By default, gaps are ignored and precision is set to 1 decimal place. | |
51 | |
52 GetSequenceLength | |
53 $SequenceLength = GetSequenceLength($Sequence, [$IgnoreGaps]); | |
54 | |
55 Returns length of the specified sequence. Optional argument | |
56 *IgnoreGaps* controls handling of gaps. By default, gaps are | |
57 ignored. | |
58 | |
59 GetShortestSequence | |
60 ($ID, $Sequence, $SeqLen, $Description) = GetShortestSequence( | |
61 $SequencesDataRef, [$IgnoreGaps]); | |
62 | |
63 Checks the lengths of all the sequences available in | |
64 $SequencesDataRef and returns $ID, $Sequence, $SeqLen, and | |
65 $Description values for the shortest sequence. Optional argument | |
66 $IgnoreGaps controls handling of gaps in sequences. By default, gaps | |
67 are ignored. | |
68 | |
69 GetLongestSequence | |
70 ($ID, $Sequence, $SeqLen, $Description) = GetLongestSequence( | |
71 $SequencesDataRef, [$IgnoreGaps]); | |
72 | |
73 Checks the lengths of all the sequences available in | |
74 *SequencesDataRef* and returns ID, Sequence, SeqLen, and Description | |
75 values for the longest sequence. Optional argument *IgnoreGaps* | |
76 controls handling of gaps in sequences. By default, gaps are | |
77 ignored. | |
78 | |
79 IsGapResidue | |
80 $Status = IsGapResidue($Residue); | |
81 | |
82 Returns 1 or 0 based on whether *Residue* corresponds to a gap. Any | |
83 character other than A to Z is considered a gap residue. | |
84 | |
85 IsSupportedSequenceFile | |
86 $Status = IsSupportedSequenceFile($SequenceFile); | |
87 | |
88 Returns 1 or 0 based on whether *SequenceFile* corresponds to a | |
89 supported sequence format. | |
90 | |
91 IsClustalWSequenceFile | |
92 $Status = IsClustalWSequenceFile($SequenceFile); | |
93 | |
94 Returns 1 or 0 based on whether *SequenceFile* corresponds to | |
95 Clustal sequence alignment format. | |
96 | |
97 IsPearsonFastaSequenceFile | |
98 $Status = IsPearsonFastaSequenceFile($SequenceFile); | |
99 | |
100 Returns 1 or 0 based on whether *SequenceFile* corresponds to | |
101 Pearson FASTA sequence format. | |
102 | |
103 IsPIRFastaSequenceFile | |
104 $Status = IsPIRFastaSequenceFile($SequenceFile); | |
105 | |
106 Returns 1 or 0 based on whether *SequenceFile* corresponds to PIR | |
107 FASTA sequence format. | |
108 | |
109 IsMSFSequenceFile | |
110 $Status = IsMSFSequenceFile($SequenceFile); | |
111 | |
112 Returns 1 or 0 based on whether *SequenceFile* corresponds to MSF | |
113 sequence alignment format. | |
114 | |
115 ReadSequenceFile | |
116 $SequenceDataMapRef = ReadSequenceFile($SequenceFile); | |
117 | |
118 Reads *SequenceFile* and returns a reference to a hash containing the | |
119 following key/value pairs: | |
120 | |
121 $SequenceDataMapRef->{IDs} - Array of sequence IDs | |
122 $SequenceDataMapRef->{Count} - Number of sequences | |
123 $SequenceDataMapRef->{Description}{$ID} - Sequence description | |
124 $SequenceDataMapRef->{Sequence}{$ID} - Sequence for a specific ID | |
125 $SequenceDataMapRef->{Sequence}{InputFileType} - File format | |
126 | |
127 ReadClustalWSequenceFile | |
128 $SequenceDataMapRef = ReadClustalWSequenceFile($SequenceFile); | |
129 | |
130 Reads ClustalW *SequenceFile* and returns a reference to a hash | |
131 containing the key/value pairs described under the | |
132 ReadSequenceFile method. | |
133 | |
134 ReadMSFSequenceFile | |
135 $SequenceDataMapRef = ReadMSFSequenceFile($SequenceFile); | |
136 | |
137 Reads MSF *SequenceFile* and returns a reference to a hash containing | |
138 the key/value pairs described under the ReadSequenceFile method. | |
139 | |
140 ReadPIRFastaSequenceFile | |
141 $SequenceDataMapRef = ReadPIRFastaSequenceFile($SequenceFile); | |
142 | |
143 Reads PIR FASTA *SequenceFile* and returns a reference to a hash | |
144 containing the key/value pairs described under the | |
145 ReadSequenceFile method. | |
146 | |
147 ReadPearsonFastaSequenceFile | |
148 $SequenceDataMapRef = ReadPearsonFastaSequenceFile($SequenceFile); | |
149 | |
150 Reads Pearson FASTA *SequenceFile* and returns a reference to a hash | |
151 containing the key/value pairs described under the | |
152 ReadSequenceFile method. | |
153 | |
154 RemoveSequenceGaps | |
155 $SeqWithoutGaps = RemoveSequenceGaps($Sequence); | |
156 | |
157 Removes gaps from *Sequence* and returns a sequence without any gaps. | |
158 | |
159 RemoveSequenceAlignmentGapColumns | |
160 $NewAlignmentDataMapRef = RemoveSequenceAlignmentGapColumns( | |
161 $AlignmentDataMapRef); | |
162 | |
163 Using an input alignment data map reference containing the following | |
164 keys, generates a new hash with the same set of keys after residue | |
165 columns containing only gaps have been removed: | |
166 | |
167 {IDs} : Array of IDs in order as they appear in file | |
168 {Count}: ID count | |
169 {Description}{$ID} : Description data | |
170 {Sequence}{$ID} : Sequence data | |
171 | |
172 WritePearsonFastaSequenceFile | |
173 WritePearsonFastaSequenceFile($SequenceFileName, $SequenceDataRef, | |
174 [$MaxLength]); | |
175 | |
176 Using sequence data specified via *SequenceDataRef*, writes out a | |
177 Pearson FASTA sequence file. Optional argument *MaxLength* controls | |
178 the maximum sequence length on each line; default is 80. | |
179 | |
180 AUTHOR | |
181 Manish Sud <msud@san.rr.com> | |
182 | |
183 SEE ALSO | |
184 PDBFileUtil.pm | |
185 | |
186 COPYRIGHT | |
187 Copyright (C) 2015 Manish Sud. All rights reserved. | |
188 | |
189 This file is part of MayaChemTools. | |
190 | |
191 MayaChemTools is free software; you can redistribute it and/or modify it | |
192 under the terms of the GNU Lesser General Public License as published by | |
193 the Free Software Foundation; either version 3 of the License, or (at | |
194 your option) any later version. | |
195 |
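
The hash layout documented for ReadSequenceFile and the percent identity calculation documented for CalcuatePercentSequenceIdentity can be illustrated in plain Perl without loading the module. What follows is a minimal sketch, not the module's implementation. The helper names SketchIsGap and SketchPercentIdentity are hypothetical, and the sample IDs, descriptions, and sequences are made up. Three behaviours are assumptions: only uppercase A to Z counts as a residue, "ignoring gaps" means skipping any aligned position where either residue is a gap, and the comparison stops at the end of the shorter sequence.

```perl
use strict;
use warnings;

# Hypothetical sample data using the keys documented for ReadSequenceFile:
# IDs, Count, Description and Sequence.
my %SequencesData = (
  IDs         => ['Seq1', 'Seq2'],
  Count       => 2,
  Description => { Seq1 => 'First sample', Seq2 => 'Second sample' },
  Sequence    => { Seq1 => 'ACD-EFG', Seq2 => 'ACEQE-G' },
);

# Hypothetical helper: any character other than A to Z is treated as a gap,
# following the rule stated for IsGapResidue. Matching uppercase only is an
# assumption of this sketch.
sub SketchIsGap {
  my ($Residue) = @_;
  return ($Residue =~ /^[A-Z]$/) ? 0 : 1;
}

# Hypothetical helper: percent identity over aligned positions. When
# $IgnoreGaps is true, positions where either residue is a gap are skipped
# (an assumption about what ignoring gaps means). Defaults mirror the
# documented ones: gaps ignored, 1 decimal place.
sub SketchPercentIdentity {
  my ($Seq1, $Seq2, $IgnoreGaps, $Precision) = @_;
  $IgnoreGaps = 1 unless defined $IgnoreGaps;
  $Precision  = 1 unless defined $Precision;

  # Compare only up to the end of the shorter sequence.
  my $Length = length($Seq1) < length($Seq2) ? length($Seq1) : length($Seq2);
  my ($Compared, $Identical) = (0, 0);
  for my $Index (0 .. $Length - 1) {
    my $Res1 = substr($Seq1, $Index, 1);
    my $Res2 = substr($Seq2, $Index, 1);
    next if $IgnoreGaps && (SketchIsGap($Res1) || SketchIsGap($Res2));
    $Compared++;
    $Identical++ if $Res1 eq $Res2;
  }

  # Avoid division by zero when no positions were compared.
  my $Percent = $Compared ? 100 * $Identical / $Compared : 0;
  return sprintf("%.${Precision}f", $Percent);
}

my ($ID1, $ID2) = @{ $SequencesData{IDs} };
my $PercentIdentity = SketchPercentIdentity(
  $SequencesData{Sequence}{$ID1}, $SequencesData{Sequence}{$ID2});
print "$ID1 vs $ID2: $PercentIdentity\n";
```

Passing 0 as the third argument makes the sketch count gap positions as ordinary mismatches, which lowers the reported identity for the same pair of sequences.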