annotate mayachemtools/docs/modules/txt/TextUtil.txt @ 9:ab29fa5c8c1f draft default tip

Uploaded
author deepakjadmin
date Thu, 15 Dec 2016 14:18:03 -0500
parents 73ae111cf86f
children
NAME
    TextUtil

SYNOPSIS
    use TextUtil;

    use TextUtil qw(:all);

DESCRIPTION
    TextUtil module provides the following functions:

    AddNumberSuffix, ContainsWhiteSpaces, GetTextFileDataByNonUniqueKey,
    GetTextFileDataByUniqueKey, GetTextLine, HashCode, IsEmpty, IsFloat,
    IsInteger, IsNotEmpty, IsNumberPowerOfNumber, IsNumerical,
    IsPositiveInteger, JoinWords, QuoteAWord,
    RemoveLeadingAndTrailingWhiteSpaces, RemoveLeadingWhiteSpaces,
    RemoveTrailingWhiteSpaces, SplitWords, WrapText

FUNCTIONS
    AddNumberSuffix
            $NumberWithSuffix = AddNumberSuffix($IntegerValue);

        Returns the number with the appropriate ordinal suffix: 0, 1st,
        2nd, 3rd, 4th, and so on.
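
As a rough illustration of the suffix rule described above, the following stand-alone sketch mimics the behavior in plain Perl. The helper name OrdinalSuffixSketch, the treatment of 11, 12 and 13 as "th", and the pass-through of anything that is not a positive integer are assumptions of this example and are not taken from TextUtil itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for AddNumberSuffix; not the TextUtil implementation.
sub OrdinalSuffixSketch {
    my ($Value) = @_;

    # Assumption: anything other than a positive integer is returned
    # unchanged, which matches the documented "0" (no suffix) case.
    return $Value unless defined $Value && $Value =~ /^[1-9][0-9]*\z/;

    my $LastTwo = $Value % 100;
    my $Last    = $Value % 10;

    # Assumption: 11, 12 and 13 take "th", as in ordinary English usage.
    my $Suffix = 'th';
    if ($LastTwo < 11 || $LastTwo > 13) {
        $Suffix = 'st' if $Last == 1;
        $Suffix = 'nd' if $Last == 2;
        $Suffix = 'rd' if $Last == 3;
    }

    return $Value . $Suffix;
}

# Prints: 0, 1st, 2nd, 3rd, 4th
print join(', ', map { OrdinalSuffixSketch($_) } 0 .. 4), "\n";
```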

    ContainsWhiteSpaces
            $Status = ContainsWhiteSpaces($TheString);

        Returns 1 or 0 based on whether the string contains any white
        spaces.

    GetTextLine
            $Line = GetTextLine(\*TEXTFILE);

        Reads the next line from an already opened text file, removes any
        carriage return, and returns it as a string. NULL is returned for
        EOF.
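
The read-and-strip behavior described above can be sketched without TextUtil as follows. The helper name ReadTextLineSketch is hypothetical, and the use of undef to signal end of file is an assumption about how "NULL" maps onto Perl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for GetTextLine; not the TextUtil implementation.
sub ReadTextLineSketch {
    my ($FileHandle) = @_;

    my $Line = <$FileHandle>;
    return undef unless defined $Line;   # assumption: EOF is signalled as undef

    $Line =~ s/[\r\n]+\z//;              # drop a trailing CR, LF, or CRLF
    return $Line;
}

# An in-memory file handle keeps the example self-contained.
my $Text = "first line\r\nsecond line\n";
open(my $TextFile, '<', \$Text) or die "Cannot open in-memory file: $!";

while (defined(my $Line = ReadTextLineSketch($TextFile))) {
    print "[$Line]\n";
}
close $TextFile;
```

Testing the result with defined, rather than for truth, keeps lines such as "0" or an empty line from ending the loop early.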

    GetTextFileDataByNonUniqueKey
            GetTextFileDataByNonUniqueKey($TextDataFile, $TextDataMapRef,
                                          $DataKeyColNum, $InDelim);

        Load data from a text file into the specified hash reference using a
        specific column for non-unique data key values.

        The lines starting with # are treated as comments and ignored. The
        first line not starting with # must contain column labels, and the
        number of columns in all other data rows must match the number of
        column labels.

        The first column is assumed to contain the data key value by
        default; all other columns contain data as indicated in their column
        labels.

        In order to avoid dependence of data access on the specified column
        labels, the column data is loaded into the hash with DataCol<Num>
        hash keys, where column numbers start from 1. The data key column is
        not available as a DataCol<Num> hash key.

        The format of the data structure loaded into a specified hash
        reference is:

            @{$TextDataMapRef->{DataKeys}} - Array of unique data keys
            @{$TextDataMapRef->{ColLabels}} - Array of column labels
            @{$TextDataMapRef->{DataColIDs}} - Array of data column IDs
            $TextDataMapRef->{NumOfCols} - Number of columns
            %{$TextDataMapRef->{DataKey}} - Hash keys pair: <DataKey, DataKey>
            @{$TextDataMapRef->{DataCol<Num>}} - Hash keys pair with data as an array:
                                                 <DataCol<Num>, DataKey>
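
To make the layout above concrete, here is a minimal sketch that builds the same kind of structure from an in-memory string instead of a file. LoadByNonUniqueKeySketch is a hypothetical helper, not the TextUtil function; the comma default for the delimiter, the skipping of blank lines, and the exact spelling of the DataCol<Num> keys are assumptions of this example.

```perl
use strict;
use warnings;

# Hypothetical loader mimicking the documented layout for non-unique keys.
sub LoadByNonUniqueKeySketch {
    my ($Text, $MapRef, $KeyColNum, $Delim) = @_;
    $KeyColNum ||= 1;                      # first column holds the key by default
    $Delim = ',' unless defined $Delim;    # assumption: comma as the default delimiter

    %{$MapRef} = (DataKeys => [], ColLabels => [], DataColIDs => [], DataKey => {});

    for my $Line (split /\r?\n/, $Text) {
        next if $Line =~ /^#/;             # comment lines are ignored
        next if $Line =~ /^\s*$/;          # assumption: blank lines are skipped too
        my @Values = split /\Q$Delim\E/, $Line, -1;

        if (!@{$MapRef->{ColLabels}}) {    # first non-comment line: column labels
            @{$MapRef->{ColLabels}} = @Values;
            $MapRef->{NumOfCols} = scalar @Values;
            @{$MapRef->{DataColIDs}} = map { "DataCol$_" }
                grep { $_ != $KeyColNum } 1 .. $MapRef->{NumOfCols};
            next;
        }
        die "Column count mismatch in: $Line\n"
            if scalar(@Values) != $MapRef->{NumOfCols};

        my $Key = $Values[$KeyColNum - 1];
        if (!exists $MapRef->{DataKey}{$Key}) {
            push @{$MapRef->{DataKeys}}, $Key;
            $MapRef->{DataKey}{$Key} = $Key;
        }
        for my $ColNum (grep { $_ != $KeyColNum } 1 .. $MapRef->{NumOfCols}) {
            # Repeated keys accumulate their values in an array per column.
            push @{$MapRef->{"DataCol$ColNum"}{$Key}}, $Values[$ColNum - 1];
        }
    }
    return;
}

my %Map;
LoadByNonUniqueKeySketch("# comment\nID,Name\nA,x\nB,y\nA,z\n", \%Map, 1, ',');
print "Keys: @{$Map{DataKeys}}\n";      # Keys: A B
print "A: @{$Map{DataCol2}{A}}\n";      # A: x z
```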

    GetTextFileDataByUniqueKey
            GetTextFileDataByUniqueKey($TextDataFile, $TextDataMapRef, $DataKeyColNum,
                                        $InDelim);

        Load data from a text file into the specified hash reference using a
        specific column for unique data key values.

        The lines starting with # are treated as comments and ignored. The
        first line not starting with # must contain column labels, and the
        number of columns in all other data rows must match the number of
        column labels.

        The first column is assumed to contain the data key value by
        default; all other columns contain data as indicated in their column
        labels.

        In order to avoid dependence of data access on the specified column
        labels, the column data is loaded into the hash with DataCol<Num>
        hash keys, where column numbers start from 1. The data key column is
        not available as a DataCol<Num> hash key.

        The format of the data structure loaded into a specified hash
        reference is:

            @{$TextDataMapRef->{DataKeys}} - Array of unique data keys
            @{$TextDataMapRef->{ColLabels}} - Array of column labels
            @{$TextDataMapRef->{DataColIDs}} - Array of data column IDs
            $TextDataMapRef->{NumOfCols} - Number of columns
            %{$TextDataMapRef->{DataKey}} - Hash keys pair: <DataKey, DataKey>
            %{$TextDataMapRef->{DataCol<Num>}} - Hash keys pair: <DataCol<Num>, DataKey>
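
Once loaded, a structure of this shape might be read back as sketched below. The hash is built by hand with hypothetical data, so the example runs without TextUtil or an input file; the assumption that each entry of DataColIDs is a string of the form DataCol<Num> comes from the listing above, and FormatRowsSketch is an illustrative helper only.

```perl
use strict;
use warnings;

# Hand-built example of the documented layout (hypothetical data, unique keys).
my %TextDataMap = (
    DataKeys   => ['C', 'N'],
    ColLabels  => ['Symbol', 'Name', 'AtomicNumber'],
    DataColIDs => ['DataCol2', 'DataCol3'],
    NumOfCols  => 3,
    DataKey    => { C => 'C', N => 'N' },
    DataCol2   => { C => 'Carbon', N => 'Nitrogen' },
    DataCol3   => { C => 6, N => 7 },
);

# Walk the rows in stored key order and report each data column by its label.
sub FormatRowsSketch {
    my ($MapRef) = @_;
    my @Rows;
    for my $Key (@{$MapRef->{DataKeys}}) {
        my @Fields;
        for my $ColID (@{$MapRef->{DataColIDs}}) {
            my ($ColNum) = $ColID =~ /([0-9]+)\z/;          # DataCol3 gives 3
            my $Label = $MapRef->{ColLabels}[$ColNum - 1];  # array is 0-indexed
            push @Fields, $Label . '=' . $MapRef->{$ColID}{$Key};
        }
        push @Rows, $Key . ': ' . join(', ', @Fields);
    }
    return @Rows;
}

# Prints:
#   C: Name=Carbon, AtomicNumber=6
#   N: Name=Nitrogen, AtomicNumber=7
print "$_\n" for FormatRowsSketch(\%TextDataMap);
```

Looking values up through DataColIDs rather than through the label text is what keeps the access code independent of how the columns happen to be named in the file.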

    HashCode
            $HashCode = HashCode($TheString);

        Returns a 32-bit integer hash code using the One-at-a-time algorithm
        by Bob Jenkins [Ref 38]. It's also implemented in Perl for internal
        hash keys in the hv.h include file.
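
For reference, the One-at-a-time algorithm itself can be sketched in a few lines of Perl. OneAtATimeSketch is a hypothetical name, and the sketch assumes a Perl built with 64-bit integers and a string of bytes; whether HashCode produces identical values for every input is not confirmed here.

```perl
use strict;
use warnings;

# Sketch of Bob Jenkins' One-at-a-time hash, masked to 32 bits after each
# addition. Assumes 64-bit integer support so the left shifts cannot overflow.
sub OneAtATimeSketch {
    my ($String) = @_;
    my $Hash = 0;

    for my $Byte (unpack 'C*', $String) {
        $Hash = ($Hash + $Byte) & 0xFFFFFFFF;
        $Hash = ($Hash + ($Hash << 10)) & 0xFFFFFFFF;
        $Hash ^= ($Hash >> 6);
    }

    # Final avalanche steps.
    $Hash = ($Hash + ($Hash << 3)) & 0xFFFFFFFF;
    $Hash ^= ($Hash >> 11);
    $Hash = ($Hash + ($Hash << 15)) & 0xFFFFFFFF;

    return $Hash;
}

printf "%08x\n", OneAtATimeSketch("a");   # prints ca2e9442
```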

    IsEmpty
            $Status = IsEmpty($TheString);

        Returns 1 or 0 based on whether the string is empty.

    IsInteger
            $Status = IsInteger($TheString);

        Returns 1 or 0 based on whether the string is an integer.

    IsPositiveInteger
            $Status = IsPositiveInteger($TheString);

        Returns 1 or 0 based on whether the string is a positive integer.

    IsFloat
            $Status = IsFloat($TheString);

        Returns 1 or 0 based on whether the string is a float.

    IsNotEmpty
            $Status = IsNotEmpty($TheString);

        Returns 1 or 0 based on whether the string is not empty.

    IsNumerical
            $Status = IsNumerical($TheString);

        Returns 1 or 0 based on whether the string is a number.
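
The distinctions between these numeric checks can be illustrated with simple patterns. All four helpers below are hypothetical sketches; in particular, treating zero as not positive and requiring a decimal point or exponent for a float are assumptions, and the patterns TextUtil actually uses may differ.

```perl
use strict;
use warnings;

# Hypothetical checks; not the TextUtil implementations.
sub IsIntegerSketch {
    my ($Value) = @_;
    return 0 unless defined $Value;
    return $Value =~ /^[+-]?[0-9]+\z/ ? 1 : 0;
}

sub IsPositiveIntegerSketch {
    my ($Value) = @_;
    # Assumption: zero does not count as positive.
    return 0 unless IsIntegerSketch($Value);
    return $Value > 0 ? 1 : 0;
}

sub IsFloatSketch {
    my ($Value) = @_;
    return 0 unless defined $Value;
    # Assumption: a float needs a decimal point or an exponent.
    return 1 if $Value =~ /^[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?\z/;
    return 1 if $Value =~ /^[+-]?[0-9]+[eE][+-]?[0-9]+\z/;
    return 0;
}

sub IsNumericalSketch {
    my ($Value) = @_;
    return 1 if IsIntegerSketch($Value);
    return IsFloatSketch($Value);
}

for my $Value ('42', '-7', '3.14', '1e-3', 'abc') {
    printf "%-5s integer=%d positive=%d float=%d numerical=%d\n", $Value,
        IsIntegerSketch($Value), IsPositiveIntegerSketch($Value),
        IsFloatSketch($Value), IsNumericalSketch($Value);
}
```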

    IsNumberPowerOfNumber
            $Status = IsNumberPowerOfNumber($FirstNum, $SecondNum);

        Returns 1 or 0 based on whether the first number is a power of the
        second number.
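
One way to picture this test, restricted to integers, is by repeated multiplication. IsPowerOfSketch is a hypothetical helper; limiting the inputs to a first number of at least 1 and a second number of at least 2, and counting 1 as the zeroth power, are assumptions of this sketch rather than documented TextUtil behavior.

```perl
use strict;
use warnings;

# Hypothetical integer-only check; not the TextUtil implementation.
sub IsPowerOfSketch {
    my ($FirstNum, $SecondNum) = @_;

    # Assumption: only $FirstNum >= 1 and $SecondNum >= 2 are meaningful here.
    return 0 if $FirstNum < 1 || $SecondNum < 2;

    # Multiply up from SecondNum**0 until the target is reached or passed.
    my $Power = 1;
    $Power *= $SecondNum while $Power < $FirstNum;

    return $Power == $FirstNum ? 1 : 0;
}

print IsPowerOfSketch(8, 2), "\n";    # prints 1, since 8 is 2**3
print IsPowerOfSketch(10, 2), "\n";   # prints 0
```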

    JoinWords
            $JoinedWords = JoinWords($Words, $Delim, $Quote);

        Joins different words using delimiter and quote parameters, and
        returns the result as a string.

    QuoteAWord
            $QuotedWord = QuoteAWord($Word, $Quote);

        Returns a quoted string based on *Quote* value.

    RemoveLeadingWhiteSpaces
            $OutString = RemoveLeadingWhiteSpaces($InString);

        Returns a string without any leading white spaces.

    RemoveTrailingWhiteSpaces
            $OutString = RemoveTrailingWhiteSpaces($InString);

        Returns a string without any trailing white spaces.

    RemoveLeadingAndTrailingWhiteSpaces
            $OutString = RemoveLeadingAndTrailingWhiteSpaces($InString);

        Returns a string without any leading and trailing white spaces.
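
The three whitespace functions differ only in which end of the string they touch, as the following sketch with plain substitutions shows. The helper names are hypothetical and the code is not taken from TextUtil.

```perl
use strict;
use warnings;

# Hypothetical equivalents built on simple regular expression substitutions.
sub TrimLeadingSketch {
    my ($String) = @_;
    $String =~ s/^\s+//;       # remove white space at the start only
    return $String;
}

sub TrimTrailingSketch {
    my ($String) = @_;
    $String =~ s/\s+\z//;      # remove white space at the end only
    return $String;
}

sub TrimBothSketch {
    my ($String) = @_;
    return TrimTrailingSketch(TrimLeadingSketch($String));
}

my $Input = "  a b \t";
print '[', TrimLeadingSketch($Input),  "]\n";   # leading spaces removed
print '[', TrimTrailingSketch($Input), "]\n";   # trailing space and tab removed
print '[', TrimBothSketch($Input),     "]\n";   # prints [a b]
```

Interior white space is left alone in all three cases.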

    SplitWords
            @Words = SplitWords($Line, $Delimiter);

        Returns an array *Words* containing unquoted words generated after
        splitting string value *Line* containing quoted or unquoted words.

        This function is used to split strings generated by JoinWords as a
        replacement for Perl's core module function
        Text::ParseWords::quotewords() which dumps core on very long
        strings.
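
The round trip between joining and splitting can be sketched as follows. Both helpers are hypothetical and make several assumptions not stated in this document: a true quote flag wraps every word in double quotes, the delimiter is a literal string, words contain no embedded double quotes, and the input line is well formed.

```perl
use strict;
use warnings;

# Hypothetical helpers illustrating a join/split round trip;
# they are not the TextUtil implementations.
sub JoinWordsSketch {
    my ($WordsRef, $Delim, $Quote) = @_;
    my @Words = @{$WordsRef};
    if ($Quote) {
        $_ = '"' . $_ . '"' for @Words;    # assumption: quote flag means double quotes
    }
    return join $Delim, @Words;
}

sub SplitWordsSketch {
    my ($Line, $Delim) = @_;
    my $DelimRegex = quotemeta $Delim;
    my @Words;

    pos($Line) = 0;
    while (1) {
        if ($Line =~ /\G"([^"]*)"/gc) {
            push @Words, $1;               # quoted word; may contain the delimiter
        }
        elsif ($Line =~ /\G((?:(?!$DelimRegex).)*)/gcs) {
            push @Words, $1;               # unquoted word, up to the next delimiter
        }
        last unless $Line =~ /\G$DelimRegex/gc;   # consume the delimiter or stop
    }
    return @Words;
}

my $Joined = JoinWordsSketch(['alpha', 'beta, gamma', 'delta'], ',', 1);
print "$Joined\n";                                   # "alpha","beta, gamma","delta"
print join('|', SplitWordsSketch($Joined, ',')), "\n";   # alpha|beta, gamma|delta
```

Scanning with \G and the /gc modifiers walks the line once from left to right, which is why a delimiter inside a quoted word does not start a new word.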

    WrapText
            $OutString = WrapText($InString, [$WrapLength, $WrapDelimiter]);

        Returns a wrapped string. By default, *WrapLength* is *40* and
        *WrapDelimiter* is the Unix new line character.
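
A greedy word wrap with the same defaults can be sketched as below. WrapTextSketch is a hypothetical helper; breaking only at white space and leaving an over-long word on a line of its own are assumptions of this example, and the actual WrapText may break lines differently.

```perl
use strict;
use warnings;

# Hypothetical greedy word wrap; not the TextUtil implementation.
sub WrapTextSketch {
    my ($InString, $WrapLength, $WrapDelimiter) = @_;
    $WrapLength    = 40   unless defined $WrapLength;      # documented default
    $WrapDelimiter = "\n" unless defined $WrapDelimiter;   # documented default

    my @Lines;
    my $Current = '';
    for my $Word (split ' ', $InString) {
        if (length($Current) && length($Current) + 1 + length($Word) > $WrapLength) {
            push @Lines, $Current;         # current line is full; start a new one
            $Current = $Word;
        }
        else {
            $Current = length($Current) ? "$Current $Word" : $Word;
        }
    }
    push @Lines, $Current if length $Current;

    return join $WrapDelimiter, @Lines;
}

print WrapTextSketch('aaa bbb ccc', 7, '<br>'), "\n";   # prints aaa bbb<br>ccc
```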

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    FileUtil.pm

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.