annotate docs/scripts/txt/SortTextFiles.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500

NAME
    SortTextFiles.pl - Sort TextFile(s) using values for a column

SYNOPSIS
    SortTextFiles.pl TextFile(s)...

    SortTextFiles.pl [-d, --detail infolevel] [-h, --help] [--indelim comma
    | semicolon] [-k, --key colnum | collabel] [--keydata numeric |
    alphanumeric] [-m, --mode colnum | collabel] [-o, --overwrite]
    [--outdelim comma | tab | semicolon] [-q, --quote yes | no] [-r, --root
    rootname] [-s, --sort ascending | descending] [-w, --workingdir dirname]
    TextFile(s)...

DESCRIPTION
    Sort *TextFile(s)* using values for a key column specified by a column
    number or label. Only one key column can be specified for sorting. When
    two rows have the same value for the key column, they are written to the
    output file in the order in which they appear in the input file. Rows
    with empty or inappropriate values for the key column are placed at the
    end. The file names are separated by spaces. The valid file extensions
    are *.csv* and *.tsv* for comma/semicolon and tab delimited text files
    respectively. All other file names are ignored. All the text files in
    the current directory can be specified by **.csv*, **.tsv*, or the
    current directory name. The --indelim option determines the format of
    *TextFile(s)*. Any file which doesn't correspond to the format indicated
    by the --indelim option is ignored.
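
    The ordering rules described here can be illustrated with a minimal
    Python sketch. It is not the script's Perl implementation; the function
    name sort_rows and its parameters are hypothetical, and treating any
    value that fails to parse as a number as "inappropriate" is an
    assumption. Keeping input order for ties in descending mode is also
    assumed.

```python
def sort_rows(rows, key_index=0, descending=False):
    """Sort rows on a numeric key column (hypothetical helper).

    Rows with equal keys keep their input order. Rows whose key is
    empty or not a number are appended at the end, in input order.
    """
    valid, invalid = [], []
    for row in rows:
        try:
            valid.append((float(row[key_index]), row))
        except (ValueError, IndexError):
            # Empty, non-numeric, or missing key value: goes to the end.
            invalid.append(row)
    # list.sort() is stable, also with reverse=True, so ties keep input order.
    valid.sort(key=lambda pair: pair[0], reverse=descending)
    return [row for _, row in valid] + invalid


rows = [["3", "a"], ["", "b"], ["1", "c"], ["x", "d"], ["1", "e"]]
print(sort_rows(rows))
```

    In this sketch the two rows with key 1 stay in input order and the rows
    with an empty or non-numeric key come last.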

OPTIONS
    -d, --detail *infolevel*
        Level of information to print about lines being ignored. Default:
        *1*. Possible values: *1, 2 or 3*.

    -h, --help
        Print this help message.

    --indelim *comma | semicolon*
        Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
        semicolon*. Default value: *comma*. For TSV files, this option is
        ignored and *tab* is used as a delimiter.

    -k, --key *col number | col label*
        This value is mode specific. It specifies which column to use for
        sorting *TextFile(s)*. Possible values: *col number or col label*.
        Default value: *first column*.

    --keydata *numeric | alphanumeric*
        Data type for column key. Possible values: *numeric or
        alphanumeric*. Default value: *numeric*. For *alphanumeric* data
        values, comparison is case insensitive.
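
        The difference between the two data types can be shown with a short
        Python sketch. The helper make_sort_key is hypothetical and only
        illustrates the comparison rules stated above; it is not taken from
        the script.

```python
def make_sort_key(keydata="numeric"):
    """Return a key function for the given --keydata value (hypothetical)."""
    if keydata == "numeric":
        return float          # compare values as numbers
    if keydata == "alphanumeric":
        return str.lower      # compare values as text, ignoring case
    raise ValueError(f"unsupported keydata: {keydata!r}")


# Numeric comparison orders by value, not character by character.
print(sorted(["10", "9", "2.5"], key=make_sort_key("numeric")))
# Alphanumeric comparison ignores case.
print(sorted(["banana", "Apple", "cherry"], key=make_sort_key("alphanumeric")))
```

        A plain string sort of the first list would instead give 10, 2.5, 9,
        which is why the data type matters for numeric columns.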

    -m, --mode *colnum | collabel*
        Specify how to sort text files: using column number or column label.
        Possible values: *colnum or collabel*. Default value: *colnum*.
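
        How --mode and --key might combine to pick a column can be sketched
        in Python as follows. The function resolve_key_column is
        hypothetical, and counting column numbers from 1 is an assumption
        based on the "first column" default and the "-k 1" example.

```python
def resolve_key_column(header, mode="colnum", key=None):
    """Return the zero-based index of the key column (hypothetical helper)."""
    if key is None:
        return 0  # default: first column
    if mode == "colnum":
        index = int(key) - 1  # assumption: column numbers start at 1
        if not 0 <= index < len(header):
            raise ValueError(f"column number out of range: {key}")
        return index
    if mode == "collabel":
        if key not in header:
            raise ValueError(f"column label not found: {key}")
        return header.index(key)
    raise ValueError(f"unsupported mode: {mode!r}")


header = ["ID", "MolWeight", "LogP"]
print(resolve_key_column(header, "colnum", "2"))
print(resolve_key_column(header, "collabel", "MolWeight"))
```

        Both calls select the MolWeight column in this sample header.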

    -o, --overwrite
        Overwrite existing files.

    --outdelim *comma | tab | semicolon*
        Output text file delimiter. Possible values: *comma, tab, or
        semicolon*. Default value: *comma*.

    -q, --quote *yes | no*
        Put quotes around column values in output text file. Possible
        values: *yes or no*. Default value: *yes*.
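
        The effect of --outdelim and --quote on the output can be
        approximated with Python's csv module. This is only a sketch: the
        function format_rows is hypothetical, and mapping --quote no to
        csv.QUOTE_MINIMAL (quote only values that contain the delimiter or
        a quote character) is an assumption, since the exact behavior of
        the script is not described here.

```python
import csv
import io

DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";"}


def format_rows(rows, outdelim="comma", quote="yes"):
    """Render rows as delimited text (hypothetical helper)."""
    buffer = io.StringIO()
    # Assumption: "yes" quotes every value, "no" quotes only when required.
    quoting = csv.QUOTE_ALL if quote == "yes" else csv.QUOTE_MINIMAL
    writer = csv.writer(
        buffer,
        delimiter=DELIMITERS[outdelim],
        quoting=quoting,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["ID", "MolWeight"], ["Cmpd1", "180.16"]]
print(format_rows(rows))                 # defaults: comma, quoted
print(format_rows(rows, "tab", "no"))    # tab delimited, unquoted
```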

    -r, --root *rootname*
        New text file name is generated using the root: <Root>.<Ext>.
        Default new file name: <InitialTextFileName>SortedByColumn.<Ext>.
        The csv and tsv <Ext> values are used for comma/semicolon and tab
        delimited text files respectively. This option is ignored for
        multiple input files.
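
        The naming rule can be written out as a small Python sketch. The
        function output_file_name and its parameters are hypothetical, and
        choosing <Ext> from the output delimiter is an assumption drawn
        from the description above.

```python
from pathlib import Path


def output_file_name(input_name, outdelim="comma", root=None,
                     multiple_inputs=False):
    """Build the output file name (hypothetical helper)."""
    # Assumption: tab output gets .tsv, comma/semicolon output gets .csv.
    ext = "tsv" if outdelim == "tab" else "csv"
    if root and not multiple_inputs:
        stem = root  # --root applies only to a single input file
    else:
        stem = Path(input_name).stem + "SortedByColumn"
    return f"{stem}.{ext}"


print(output_file_name("Sample1.csv"))
print(output_file_name("Sample1.csv", outdelim="tab", root="NewSample1"))
```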

    -s, --sort *ascending | descending*
        Sorting order for column values. Possible values: *ascending or
        descending*. Default value: *ascending*.

    -w, --workingdir *dirname*
        Location of working directory. Default: current directory.

EXAMPLES
    To perform numerical sort in ascending order using first column values
    and generate a new CSV text file NewSample1.csv, type:

        % SortTextFiles.pl -o -r NewSample1 Sample1.csv

    To perform numerical sort in descending order using MolWeight column and
    generate a new CSV text file NewSample1.csv, type:

        % SortTextFiles.pl -m collabel -k MolWeight --keydata numeric
          -s descending -r NewSample1 -o Sample1.csv

    To perform numerical sort in ascending order using column number 1 and
    generate a new TSV text file NewSample1.tsv, type:

        % SortTextFiles.pl -m colnum -k 1 --keydata numeric -s ascending
          -r NewSample1 --outdelim tab -o Sample1.csv

AUTHOR
    Manish Sud <msud@san.rr.com>

SEE ALSO
    JoinTextFiles.pl, MergeTextFilesWithSD.pl, ModifyTextFilesFormat.pl,
    SplitTextFiles.pl, TextFilesToHTML.pl

COPYRIGHT
    Copyright (C) 2015 Manish Sud. All rights reserved.

    This file is part of MayaChemTools.

    MayaChemTools is free software; you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation; either version 3 of the License, or (at
    your option) any later version.