annotate mayachemtool/mayachemtools/docs/modules/txt/TextUtil.txt @ 0:a4a2ad5a214e draft default tip

Uploaded
author deepakjadmin
date Thu, 05 Nov 2015 02:37:56 -0500
NAME
    TextUtil

SYNOPSIS
    use TextUtil;

    use TextUtil qw(:all);

DESCRIPTION
    TextUtil module provides the following functions:

    AddNumberSuffix, ContainsWhiteSpaces, GetTextFileDataByNonUniqueKey,
    GetTextFileDataByUniqueKey, GetTextLine, HashCode, IsEmpty, IsFloat,
    IsInteger, IsNotEmpty, IsNumberPowerOfNumber, IsNumerical,
    IsPositiveInteger, JoinWords, QuoteAWord,
    RemoveLeadingAndTrailingWhiteSpaces, RemoveLeadingWhiteSpaces,
    RemoveTrailingWhiteSpaces, SplitWords, WrapText

FUNCTIONS
    AddNumberSuffix
        $NumberWithSuffix = AddNumberSuffix($IntegerValue);

        Returns the number with the appropriate ordinal suffix: 0, 1st,
        2nd, 3rd, 4th, and so on.

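        The suffix rule can be sketched as follows. This is a hypothetical
        stand-in (AddNumberSuffixSketch is not part of TextUtil), and it
        assumes standard English ordinals, including the 11th, 12th, and
        13th exceptions, which the description above does not spell out.

```perl
use strict;
use warnings;

# Hypothetical stand-in, not the MayaChemTools implementation: appends an
# English ordinal suffix to a positive integer and leaves other values as-is.
sub AddNumberSuffixSketch {
    my ($Value) = @_;

    # Zero, negatives, and non-integers are returned unchanged.
    return $Value unless $Value =~ /\A[0-9]+\z/ && $Value > 0;

    my $Suffix  = 'th';
    my $LastTwo = $Value % 100;
    if ($LastTwo < 11 || $LastTwo > 13) {
        my %SuffixByLastDigit = (1 => 'st', 2 => 'nd', 3 => 'rd');
        $Suffix = $SuffixByLastDigit{ $Value % 10 } // 'th';
    }
    return $Value . $Suffix;
}

my @Labeled = map { AddNumberSuffixSketch($_) } (0, 1, 2, 3, 4, 11, 22);
print join(', ', @Labeled), "\n";    # prints: 0, 1st, 2nd, 3rd, 4th, 11th, 22nd
```
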
    ContainsWhiteSpaces
        $Status = ContainsWhiteSpaces($TheString);

        Returns 1 or 0 based on whether the string contains any white
        spaces.

    GetTextLine
        $Line = GetTextLine(\*TEXTFILE);

        Reads the next line from an already opened text file, removes any
        carriage return, and returns it as a string. NULL is returned at
        EOF.

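        A minimal sketch of the read loop this function supports. The
        stand-in GetTextLineSketch is hypothetical, and it assumes that
        "NULL" corresponds to Perl's undef; an in-memory filehandle
        replaces a real file so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the behavior described above: read one line,
# drop a trailing carriage return and/or newline, return undef at EOF.
sub GetTextLineSketch {
    my ($FileRef) = @_;
    my $Line = <$FileRef>;
    return undef unless defined $Line;
    $Line =~ s/[\r\n]+\z//;
    return $Line;
}

# An in-memory filehandle stands in for an already opened text file.
my $Data = "first line\r\nsecond line\n";
open(my $TextFile, '<', \$Data) or die "Cannot open in-memory file: $!";

my @LinesRead;
while (defined(my $Line = GetTextLineSketch($TextFile))) {
    push @LinesRead, $Line;
}
close $TextFile;

print scalar(@LinesRead), " lines read\n";
```

        Testing the result with defined() rather than for truth keeps
        lines such as "0" or an empty line from ending the loop early.
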
    GetTextFileDataByNonUniqueKey
        GetTextFileDataByNonUniqueKey($TextDataFile, $TextDataMapRef,
                                      $DataKeyColNum, $InDelim);

        Loads data from a text file into the specified hash reference,
        using a specific column for non-unique data key values.

        Lines starting with # are treated as comments and ignored. The
        first line not starting with # must contain column labels, and the
        number of columns in all other data rows must match the number of
        column labels.

        The first column is assumed to contain the data key value by
        default; all other columns contain data as indicated by their
        column labels.

        To avoid making data access depend on the specified column labels,
        the column data is loaded into the hash under DataCol<Num> hash
        keys, where column numbers start from 1. The data key column is
        not available as a DataCol<Num> hash key.

        The format of the data structure loaded into the specified hash
        reference is:

            @{$TextDataMapRef->{DataKeys}}     - Array of unique data keys
            @{$TextDataMapRef->{ColLabels}}    - Array of column labels
            @{$TextDataMapRef->{DataColIDs}}   - Array of data column IDs
            $TextDataMapRef->{NumOfCols}       - Number of columns
            %{$TextDataMapRef->{DataKey}}      - Hash keys pair: <DataKey, DataKey>
            @{$TextDataMapRef->{DataCol<Num>}} - Hash keys pair with data as an array:
                                                 <DataCol<Num>, DataKey>

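        The loading rules above can be sketched without the module. The
        file contents, compound names, and the exact nesting of the
        per-key arrays are assumptions made for illustration; the sketch
        also assumes that DataColIDs lists only the non-key columns.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a comma delimited text file in which
# the key column (column 1) repeats.
my @FileLines = (
    '# Comment lines are skipped',
    'ID,Name,Value',
    'Cmpd1,Aspirin,10',
    'Cmpd2,Caffeine,20',
    'Cmpd1,Aspirin,30',
);
my ($DataKeyColNum, $InDelim) = (1, ',');

my $TextDataMapRef = {
    DataKeys => [], ColLabels => [], DataColIDs => [], NumOfCols => 0, DataKey => {},
};

for my $Line (grep { !/^#/ } @FileLines) {
    my @Words = split /\Q$InDelim\E/, $Line, -1;

    if (!$TextDataMapRef->{NumOfCols}) {
        # The first non-comment line holds the column labels.
        $TextDataMapRef->{ColLabels}  = [@Words];
        $TextDataMapRef->{NumOfCols}  = scalar @Words;
        $TextDataMapRef->{DataColIDs} =
            [ map { "DataCol$_" } grep { $_ != $DataKeyColNum } 1 .. $TextDataMapRef->{NumOfCols} ];
        next;
    }
    die "Column count mismatch: $Line" if @Words != $TextDataMapRef->{NumOfCols};

    my $DataKey = $Words[$DataKeyColNum - 1];
    if (!exists $TextDataMapRef->{DataKey}{$DataKey}) {
        push @{ $TextDataMapRef->{DataKeys} }, $DataKey;
        $TextDataMapRef->{DataKey}{$DataKey} = $DataKey;
    }
    for my $ColNum (grep { $_ != $DataKeyColNum } 1 .. $TextDataMapRef->{NumOfCols}) {
        # Values for a repeated key accumulate in an array.
        push @{ $TextDataMapRef->{"DataCol$ColNum"}{$DataKey} }, $Words[$ColNum - 1];
    }
}

print "Cmpd1 values: @{ $TextDataMapRef->{DataCol3}{Cmpd1} }\n";    # prints: Cmpd1 values: 10 30
```
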
    GetTextFileDataByUniqueKey
        GetTextFileDataByUniqueKey($TextDataFile, $TextDataMapRef,
                                   $DataKeyColNum, $InDelim);

        Loads data from a text file into the specified hash reference,
        using a specific column for unique data key values.

        Lines starting with # are treated as comments and ignored. The
        first line not starting with # must contain column labels, and the
        number of columns in all other data rows must match the number of
        column labels.

        The first column is assumed to contain the data key value by
        default; all other columns contain data as indicated by their
        column labels.

        To avoid making data access depend on the specified column labels,
        the column data is loaded into the hash under DataCol<Num> hash
        keys, where column numbers start from 1. The data key column is
        not available as a DataCol<Num> hash key.

        The format of the data structure loaded into the specified hash
        reference is:

            @{$TextDataMapRef->{DataKeys}}     - Array of unique data keys
            @{$TextDataMapRef->{ColLabels}}    - Array of column labels
            @{$TextDataMapRef->{DataColIDs}}   - Array of data column IDs
            $TextDataMapRef->{NumOfCols}       - Number of columns
            %{$TextDataMapRef->{DataKey}}      - Hash keys pair: <DataKey, DataKey>
            %{$TextDataMapRef->{DataCol<Num>}} - Hash keys pair: <DataCol<Num>, DataKey>

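        To illustrate how the loaded structure might be read back, the
        sketch below builds the documented layout by hand for a
        hypothetical three-column file instead of calling the module. It
        assumes that DataColIDs holds the DataCol<Num> names of the
        non-key columns.

```perl
use strict;
use warnings;

# Hand-built illustration of the documented layout for a hypothetical file:
#   ID,Name,Value
#   Cmpd1,Aspirin,10
#   Cmpd2,Caffeine,20
my $TextDataMapRef = {
    DataKeys   => ['Cmpd1', 'Cmpd2'],
    ColLabels  => ['ID', 'Name', 'Value'],
    DataColIDs => ['DataCol2', 'DataCol3'],
    NumOfCols  => 3,
    DataKey    => { Cmpd1 => 'Cmpd1', Cmpd2 => 'Cmpd2' },
    DataCol2   => { Cmpd1 => 'Aspirin', Cmpd2 => 'Caffeine' },
    DataCol3   => { Cmpd1 => 10, Cmpd2 => 20 },
};

# Walk the rows in key order and the data columns by column ID, so the
# access code does not depend on the column labels.
my @Rows;
for my $DataKey (@{ $TextDataMapRef->{DataKeys} }) {
    my @Values = map { $TextDataMapRef->{$_}{$DataKey} } @{ $TextDataMapRef->{DataColIDs} };
    push @Rows, join("\t", $DataKey, @Values);
}
print "$_\n" for @Rows;
```
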
    HashCode
        $HashCode = HashCode($TheString);

        Returns a 32-bit integer hash code computed with the One-at-a-time
        algorithm by Bob Jenkins [Ref 38]. The same algorithm is also
        implemented in Perl, for internal hash keys, in the hv.h include
        file.

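        For reference, a sketch of the one-at-a-time hash itself.
        HashCodeSketch is a hypothetical name; the sketch assumes a Perl
        built with 64-bit integers and byte-oriented input, and it is not
        guaranteed to match the values returned by HashCode.

```perl
use strict;
use warnings;

# Sketch of the one-at-a-time hash. Assumes 64-bit Perl integers, so the
# intermediate sums fit before they are masked back to 32 bits.
sub HashCodeSketch {
    my ($String) = @_;
    my $Mask = 0xFFFFFFFF;
    my $Hash = 0;

    # Mix in one byte at a time.
    for my $Byte (unpack 'C*', $String) {
        $Hash = ($Hash + $Byte) & $Mask;
        $Hash = ($Hash + ($Hash << 10)) & $Mask;
        $Hash ^= $Hash >> 6;
    }

    # Final avalanche steps.
    $Hash = ($Hash + ($Hash << 3)) & $Mask;
    $Hash ^= $Hash >> 11;
    $Hash = ($Hash + ($Hash << 15)) & $Mask;
    return $Hash;
}

printf "0x%08x\n", HashCodeSketch('a');    # prints: 0xca2e9442
```
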
    IsEmpty
        $Status = IsEmpty($TheString);

        Returns 1 or 0 based on whether the string is empty.

    IsInteger
        $Status = IsInteger($TheString);

        Returns 1 or 0 based on whether the string is an integer.

    IsPositiveInteger
        $Status = IsPositiveInteger($TheString);

        Returns 1 or 0 based on whether the string is a positive integer.

    IsFloat
        $Status = IsFloat($TheString);

        Returns 1 or 0 based on whether the string is a float.

    IsNotEmpty
        $Status = IsNotEmpty($TheString);

        Returns 1 or 0 based on whether the string is not empty.

    IsNumerical
        $Status = IsNumerical($TheString);

        Returns 1 or 0 based on whether the string is a number.

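        The checks above might be approximated with regular expressions
        as sketched below. All sub names are hypothetical, and the
        patterns are assumptions about what counts as empty, an integer,
        or a float; the module's own definitions may differ (for example,
        in how exponents or white space are treated).

```perl
use strict;
use warnings;

# Hypothetical regex-based stand-ins; the patterns are illustrative
# assumptions, not the module's own definitions.
sub IsEmptySketch {
    my ($TheString) = @_;
    return (!defined $TheString || $TheString =~ /\A\s*\z/) ? 1 : 0;
}

sub IsIntegerSketch {
    my ($TheString) = @_;
    return (defined $TheString && $TheString =~ /\A[-+]?[0-9]+\z/) ? 1 : 0;
}

sub IsPositiveIntegerSketch {
    my ($TheString) = @_;
    return (IsIntegerSketch($TheString) && $TheString > 0) ? 1 : 0;
}

sub IsFloatSketch {
    my ($TheString) = @_;
    return (defined $TheString && $TheString =~ /\A[-+]?(?:[0-9]+\.[0-9]*|\.[0-9]+)\z/) ? 1 : 0;
}

sub IsNumericalSketch {
    my ($TheString) = @_;
    return (IsIntegerSketch($TheString) || IsFloatSketch($TheString)) ? 1 : 0;
}

for my $Value ('42', '-7', '3.14', 'abc', '') {
    printf "%-6s integer=%d positive=%d float=%d numerical=%d empty=%d\n",
        "'$Value'", IsIntegerSketch($Value), IsPositiveIntegerSketch($Value),
        IsFloatSketch($Value), IsNumericalSketch($Value), IsEmptySketch($Value);
}
```
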
    IsNumberPowerOfNumber
        $Status = IsNumberPowerOfNumber($FirstNum, $SecondNum);

        Returns 1 or 0 based on whether the first number is a power of
        the second number.

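        One way to perform this check for positive integers is repeated
        multiplication, sketched below. The sub name is hypothetical, the
        sketch is restricted to integer inputs, and it assumes that 1
        counts as a power (exponent zero) of any base of 2 or more; the
        module may handle these cases differently.

```perl
use strict;
use warnings;

# Hypothetical integer-only stand-in: repeated multiplication avoids the
# rounding issues of comparing floating point logarithms.
sub IsNumberPowerOfNumberSketch {
    my ($FirstNum, $SecondNum) = @_;
    return 0 if $FirstNum < 1 || $SecondNum < 2;

    my $Power = 1;
    $Power *= $SecondNum while $Power < $FirstNum;
    return $Power == $FirstNum ? 1 : 0;
}

print IsNumberPowerOfNumberSketch(1024, 2), "\n";    # prints: 1
print IsNumberPowerOfNumberSketch(1000, 2), "\n";    # prints: 0
```
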
    JoinWords
        $JoinedWords = JoinWords($Words, $Delim, $Quote);

        Joins the words using the delimiter and quote parameters, and
        returns the result as a string.

    QuoteAWord
        $QuotedWord = QuoteAWord($Word, $Quote);

        Returns a quoted string based on *Quote* value.

    RemoveLeadingWhiteSpaces
        $OutString = RemoveLeadingWhiteSpaces($InString);

        Returns a string without any leading white spaces.

    RemoveTrailingWhiteSpaces
        $OutString = RemoveTrailingWhiteSpaces($InString);

        Returns a string without any trailing white spaces.

    RemoveLeadingAndTrailingWhiteSpaces
        $OutString = RemoveLeadingAndTrailingWhiteSpaces($InString);

        Returns a string without any leading and trailing white spaces.

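        The difference between the three trimming functions can be seen
        in a small sketch. The subs below are hypothetical regex-based
        stand-ins and assume that "white spaces" means anything matched
        by Perl's \s class.

```perl
use strict;
use warnings;

# Hypothetical regex-based stand-ins for the three trimming functions.
sub RemoveLeadingWhiteSpacesSketch {
    my ($InString) = @_;
    $InString =~ s/\A\s+//;
    return $InString;
}

sub RemoveTrailingWhiteSpacesSketch {
    my ($InString) = @_;
    $InString =~ s/\s+\z//;
    return $InString;
}

sub RemoveLeadingAndTrailingWhiteSpacesSketch {
    my ($InString) = @_;
    return RemoveTrailingWhiteSpacesSketch(RemoveLeadingWhiteSpacesSketch($InString));
}

my $InString = '  two words  ';
printf "[%s]\n", RemoveLeadingWhiteSpacesSketch($InString);               # prints: [two words  ]
printf "[%s]\n", RemoveTrailingWhiteSpacesSketch($InString);              # prints: [  two words]
printf "[%s]\n", RemoveLeadingAndTrailingWhiteSpacesSketch($InString);    # prints: [two words]
```
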
    SplitWords
        @Words = SplitWords($Line, $Delimiter);

        Returns an array *Words* containing unquoted words generated after
        splitting string value *Line* containing quoted or unquoted words.

        This function is used to split strings generated by JoinWords as a
        replacement for Perl's core module function
        Text::ParseWords::quotewords(), which dumps core on very long
        strings.

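        The round trip between JoinWords and SplitWords can be sketched
        as follows. Both subs are hypothetical stand-ins. The sketch
        assumes that a true *Quote* value wraps each word in double
        quotes and that words never contain a double quote themselves;
        the module's actual quoting rules are not described here.

```perl
use strict;
use warnings;

# Hypothetical stand-ins. Assumptions: a true $Quote wraps each word in
# double quotes, and words never contain a double quote themselves.
sub JoinWordsSketch {
    my ($Words, $Delim, $Quote) = @_;
    return join $Delim, map { $Quote ? qq{"$_"} : $_ } @$Words;
}

sub SplitWordsSketch {
    my ($Line, $Delim) = @_;
    my $D = quotemeta $Delim;
    my @Words;
    while (length $Line) {
        if ($Line =~ s/\A"([^"]*)"(?:$D|\z)//) {
            push @Words, $1;    # quoted word: delimiters inside it are kept
        }
        else {
            $Line =~ s/\A(.*?)(?:$D|\z)//s;
            push @Words, $1;    # unquoted word: runs up to the next delimiter
        }
    }
    return @Words;
}

my @Original = ('Ethanol', 'C2H6O', 'clear, colorless liquid');
my $Joined   = JoinWordsSketch(\@Original, ',', 1);
my @Words    = SplitWordsSketch($Joined, ',');

print "$Joined\n";    # prints: "Ethanol","C2H6O","clear, colorless liquid"
print scalar(@Words), " words recovered\n";
```

        Quoting is what lets the third word keep its embedded comma; a
        plain split on the delimiter would break it into two words.
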
    WrapText
        $OutString = WrapText($InString, [$WrapLength, $WrapDelimiter]);

        Returns a wrapped string. By default, *WrapLength* is *40* and
        *WrapDelimiter* is the Unix newline character.

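        A comparable effect can be sketched with the core Text::Wrap
        module. WrapTextSketch is a hypothetical stand-in; whether
        WrapText itself is built on Text::Wrap, and how it treats words
        longer than the wrap length, are not stated above.

```perl
use strict;
use warnings;
use Text::Wrap ();

# Hypothetical stand-in built on the core Text::Wrap module. Defaults follow
# the description above: wrap length 40 and a Unix newline delimiter.
sub WrapTextSketch {
    my ($InString, $WrapLength, $WrapDelimiter) = @_;
    $WrapLength    //= 40;
    $WrapDelimiter //= "\n";

    # Text::Wrap emits lines of at most $columns - 1 characters.
    local $Text::Wrap::columns   = $WrapLength + 1;
    local $Text::Wrap::separator = $WrapDelimiter;
    return Text::Wrap::wrap('', '', $InString);
}

my $InString  = join ' ', ('MayaChemTools') x 8;
my $OutString = WrapTextSketch($InString, 30);
print "$OutString\n";
```
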
AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    FileUtil.pm

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.