annotate docs/modules/txt/TextUtil.txt @ 3:90ea638ce878 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:11:59 -0500
parents 2abf0d43254d
children
NAME
    TextUtil

SYNOPSIS
    use TextUtil;

    use TextUtil qw(:all);

DESCRIPTION
    TextUtil module provides the following functions:

    AddNumberSuffix, ContainsWhiteSpaces, GetTextFileDataByNonUniqueKey,
    GetTextFileDataByUniqueKey, GetTextLine, HashCode, IsEmpty, IsFloat,
    IsInteger, IsNotEmpty, IsNumberPowerOfNumber, IsNumerical,
    IsPositiveInteger, JoinWords, QuoteAWord,
    RemoveLeadingAndTrailingWhiteSpaces, RemoveLeadingWhiteSpaces,
    RemoveTrailingWhiteSpaces, SplitWords, WrapText

FUNCTIONS
    AddNumberSuffix
        $NumberWithSuffix = AddNumberSuffix($IntegerValue);

        Returns the number with its ordinal suffix: 0, 1st, 2nd, 3rd, 4th,
        and so on.
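
The suffix rule can be sketched in a few lines of Perl. This is an illustration only, not the TextUtil implementation: the helper name OrdinalSuffixSketch is hypothetical, and the handling of 11, 12 and 13 follows ordinary English usage, which the module may or may not share.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of TextUtil: append an English ordinal
# suffix to a non-negative integer. Zero is returned unchanged, matching
# the "0, 1st, 2nd, ..." series in the description.
sub OrdinalSuffixSketch {
  my ($Value) = @_;

  return $Value if $Value == 0;

  # Assumption: 11, 12 and 13 (and 111, 112, ...) take "th".
  my $LastTwoDigits = $Value % 100;
  return "${Value}th" if $LastTwoDigits >= 11 && $LastTwoDigits <= 13;

  my %SuffixByLastDigit = (1 => 'st', 2 => 'nd', 3 => 'rd');
  my $Suffix = $SuffixByLastDigit{$Value % 10} // 'th';

  return "${Value}${Suffix}";
}

# Prints: 0, 1st, 2nd, 3rd, 4th
print join(', ', map { OrdinalSuffixSketch($_) } 0 .. 4), "\n";
```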

    ContainsWhiteSpaces
        $Status = ContainsWhiteSpaces($TheString);

        Returns 1 or 0 based on whether the string contains any white
        space characters.

    GetTextLine
        $Line = GetTextLine(\*TEXTFILE);

        Reads the next line from an already opened text file, removes any
        carriage return, and returns it as a string. NULL is returned for
        EOF.

    GetTextFileDataByNonUniqueKey
        GetTextFileDataByNonUniqueKey($TextDataFile, $TextDataMapRef,
                                      $DataKeyColNum, $InDelim);

        Loads data from a text file into the specified hash reference using
        a specific column for non-unique data key values.

        Lines starting with # are treated as comments and ignored. The first
        line not starting with # must contain column labels, and the number
        of columns in all other data rows must match the number of column
        labels.

        The first column is assumed to contain the data key value by
        default; all other columns contain data as indicated by their
        column labels.

        To keep data access independent of the specified column labels, the
        column data is loaded into the hash under DataCol<Num> hash keys,
        where column numbers start from 1. The data key column is not
        available as a DataCol<Num> hash key.

        The format of the data structure loaded into the specified hash
        reference is:

        @{$TextDataMapRef->{DataKeys}}      - Array of unique data keys
        @{$TextDataMapRef->{ColLabels}}     - Array of column labels
        @{$TextDataMapRef->{DataColIDs}}    - Array of data column IDs
        $TextDataMapRef->{NumOfCols}        - Number of columns
        %{$TextDataMapRef->{DataKey}}       - Hash keys pair: <DataKey, DataKey>
        @{$TextDataMapRef->{DataCol<Num>}}  - Hash keys pair with data as an array:
                                              <DataCol<Num>, DataKey>

    GetTextFileDataByUniqueKey
        GetTextFileDataByUniqueKey($TextDataFile, $TextDataMapRef, $DataKeyColNum,
                                   $InDelim);

        Loads data from a text file into the specified hash reference using
        a specific column for unique data key values.

        Lines starting with # are treated as comments and ignored. The first
        line not starting with # must contain column labels, and the number
        of columns in all other data rows must match the number of column
        labels.

        The first column is assumed to contain the data key value by
        default; all other columns contain data as indicated by their
        column labels.

        To keep data access independent of the specified column labels, the
        column data is loaded into the hash under DataCol<Num> hash keys,
        where column numbers start from 1. The data key column is not
        available as a DataCol<Num> hash key.

        The format of the data structure loaded into the specified hash
        reference is:

        @{$TextDataMapRef->{DataKeys}}      - Array of unique data keys
        @{$TextDataMapRef->{ColLabels}}     - Array of column labels
        @{$TextDataMapRef->{DataColIDs}}    - Array of data column IDs
        $TextDataMapRef->{NumOfCols}        - Number of columns
        %{$TextDataMapRef->{DataKey}}       - Hash keys pair: <DataKey, DataKey>
        %{$TextDataMapRef->{DataCol<Num>}}  - Hash keys pair: <DataCol<Num>, DataKey>
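
To make the unique-key layout concrete, the following sketch rebuilds it by hand from a few in-memory lines instead of a file. It is not the TextUtil code: the function name LoadByUniqueKeySketch, the comma default for the delimiter, the sample data, and the exact strings stored in DataColIDs are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical re-creation of the documented layout from in-memory lines;
# the real function reads $TextDataFile and may differ in details.
sub LoadByUniqueKeySketch {
  my ($LinesRef, $MapRef, $KeyColNum, $Delim) = @_;
  $KeyColNum //= 1;      # the first column holds the key by default
  $Delim     //= ',';    # assumed default delimiter for this sketch

  # Lines starting with '#' are comments and are skipped.
  my @Rows = grep { !/^#/ } @{$LinesRef};

  # The first remaining line carries the column labels.
  my @Labels = split /\Q$Delim\E/, shift @Rows;
  my $NumOfCols = scalar @Labels;

  %{$MapRef} = (
    DataKeys => [], ColLabels => [@Labels], DataColIDs => [],
    NumOfCols => $NumOfCols, DataKey => {},
  );

  # One DataCol<Num> hash per data column; the key column gets none.
  for my $ColNum (1 .. $NumOfCols) {
    next if $ColNum == $KeyColNum;
    push @{$MapRef->{DataColIDs}}, "DataCol$ColNum";
    $MapRef->{"DataCol$ColNum"} = {};
  }

  for my $Row (@Rows) {
    my @Values = split /\Q$Delim\E/, $Row, -1;
    die "Column count mismatch: $Row\n" if @Values != $NumOfCols;

    my $Key = $Values[$KeyColNum - 1];
    push @{$MapRef->{DataKeys}}, $Key;
    $MapRef->{DataKey}{$Key} = $Key;

    for my $ColNum (1 .. $NumOfCols) {
      next if $ColNum == $KeyColNum;
      $MapRef->{"DataCol$ColNum"}{$Key} = $Values[$ColNum - 1];
    }
  }
  return;
}

my @Lines = ('# Sample data', 'Symbol,Name,Mass',
             'H,Hydrogen,1.008', 'C,Carbon,12.011');
my %Map;
LoadByUniqueKeySketch(\@Lines, \%Map);

# Access values by key and column number rather than by column label.
for my $Key (@{$Map{DataKeys}}) {
  print "${Key}: $Map{DataCol2}{$Key}, $Map{DataCol3}{$Key}\n";
}
```

Looking values up through DataCol2 and DataCol3 rather than "Name" and "Mass" is the point of the design: the calling code keeps working if the labels in the file are renamed.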

    HashCode
        $HashCode = HashCode($TheString);

        Returns a 32-bit integer hash code using the One-at-a-time
        algorithm by Bob Jenkins [Ref 38]. The same algorithm is also
        implemented in Perl for internal hash keys in the hv.h include file.
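
For readers unfamiliar with the algorithm, here is a minimal pure-Perl sketch of the Jenkins one-at-a-time hash. The name OneAtATimeHashSketch is hypothetical, the code assumes a Perl built with 64-bit integers, and it is not guaranteed to return the same values as HashCode, whose handling of characters and masking may differ.

```perl
use strict;
use warnings;

# Hypothetical sketch of the Jenkins one-at-a-time hash, not the TextUtil
# code. Assumes 64-bit Perl integers; each addition is masked back to
# 32 bits to mimic unsigned 32-bit overflow.
sub OneAtATimeHashSketch {
  my ($String) = @_;
  my $Mask = 0xFFFFFFFF;
  my $Hash = 0;

  # Mix in one byte at a time.
  for my $Byte (unpack('C*', $String)) {
    $Hash = ($Hash + $Byte) & $Mask;
    $Hash = ($Hash + ($Hash << 10)) & $Mask;
    $Hash ^= ($Hash >> 6);
  }

  # Final avalanche steps.
  $Hash = ($Hash + ($Hash << 3)) & $Mask;
  $Hash ^= ($Hash >> 11);
  $Hash = ($Hash + ($Hash << 15)) & $Mask;

  return $Hash;
}

# Prints: 0xca2e9442
printf "0x%08x\n", OneAtATimeHashSketch('a');
```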

    IsEmpty
        $Status = IsEmpty($TheString);

        Returns 1 or 0 based on whether the string is empty.

    IsInteger
        $Status = IsInteger($TheString);

        Returns 1 or 0 based on whether the string is an integer.

    IsPositiveInteger
        $Status = IsPositiveInteger($TheString);

        Returns 1 or 0 based on whether the string is a positive integer.

    IsFloat
        $Status = IsFloat($TheString);

        Returns 1 or 0 based on whether the string is a float.

    IsNotEmpty
        $Status = IsNotEmpty($TheString);

        Returns 1 if the string is not empty and 0 otherwise.

    IsNumerical
        $Status = IsNumerical($TheString);

        Returns 1 or 0 based on whether the string is a number.

    IsNumberPowerOfNumber
        $Status = IsNumberPowerOfNumber($FirstNum, $SecondNum);

        Returns 1 or 0 based on whether the first number is a power of the
        second number.
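
One way to perform such a check for integers is repeated division, sketched below. IsPowerOfSketch is a hypothetical name, the restriction to a first number of at least 1 and a base of at least 2 is an assumption of this sketch, and the module itself may use a different method (for example logarithms) and accept other inputs.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the TextUtil code: test whether $FirstNum equals
# $SecondNum raised to a non-negative integer power. Only integer inputs
# with $FirstNum >= 1 and $SecondNum >= 2 are handled here.
sub IsPowerOfSketch {
  my ($FirstNum, $SecondNum) = @_;

  return 0 if $FirstNum < 1 || $SecondNum < 2;

  # Divide out the base while it divides evenly; a true power ends at 1.
  my $Value = $FirstNum;
  $Value /= $SecondNum while $Value % $SecondNum == 0;

  return $Value == 1 ? 1 : 0;
}

for my $Pair ([8, 2], [10, 2], [81, 3], [1, 5]) {
  my ($First, $Second) = @{$Pair};
  print "$First is a power of $Second: ", IsPowerOfSketch($First, $Second), "\n";
}
```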

    JoinWords
        $JoinedWords = JoinWords($Words, $Delim, $Quote);

        Joins the words using the delimiter and quote parameters, and
        returns the result as a string.

    QuoteAWord
        $QuotedWord = QuoteAWord($Word, $Quote);

        Returns a quoted string based on *Quote* value.

    RemoveLeadingWhiteSpaces
        $OutString = RemoveLeadingWhiteSpaces($InString);

        Returns a string without any leading white spaces.

    RemoveTrailingWhiteSpaces
        $OutString = RemoveTrailingWhiteSpaces($InString);

        Returns a string without any trailing white spaces.

    RemoveLeadingAndTrailingWhiteSpaces
        $OutString = RemoveLeadingAndTrailingWhiteSpaces($InString);

        Returns a string without any leading and trailing white spaces.
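
The three whitespace functions differ only in which end of the string they trim. The sketch below shows the distinction with plain regular expressions; the helper names are hypothetical stand-ins, not the TextUtil functions, whose internals may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the three trimming functions. Each works on
# a copy, so the caller's string is left untouched.
sub TrimLeadingSketch {
  my ($Text) = @_;
  $Text =~ s/^\s+//;     # drop whitespace at the start only
  return $Text;
}

sub TrimTrailingSketch {
  my ($Text) = @_;
  $Text =~ s/\s+\z//;    # drop whitespace at the end only
  return $Text;
}

sub TrimBothSketch {
  my ($Text) = @_;
  return TrimTrailingSketch(TrimLeadingSketch($Text));
}

my $InString = "  C6H6 benzene\t\n";

# Prints: [C6H6 benzene]
printf "[%s]\n", TrimBothSketch($InString);
```

Whitespace inside the string, such as the space between the two words, is preserved in all three cases.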

    SplitWords
        @Words = SplitWords($Line, $Delimiter);

        Returns an array *Words* containing unquoted words generated after
        splitting string value *Line* containing quoted or unquoted words.

        This function is used to split strings generated by JoinWords as a
        replacement for Perl's core module function
        Text::ParseWords::quotewords(), which dumps core on very long
        strings.

    WrapText
        $OutString = WrapText($InString, [$WrapLength, $WrapDelimiter]);

        Returns a wrapped string. By default, *WrapLength* is *40* and
        *WrapDelimiter* is the Unix new line character.

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    FileUtil.pm

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.