diff docs/scripts/txt/MergeTextFilesWithSD.txt @ 0:4816e4a8ae95 draft default tip

Uploaded
author deepakjadmin
date Wed, 20 Jan 2016 09:23:18 -0500
parents
children
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/scripts/txt/MergeTextFilesWithSD.txt	Wed Jan 20 09:23:18 2016 -0500
@@ -0,0 +1,137 @@
+NAME
+    MergeTextFilesWithSD.pl - Merge CSV or TSV TextFile(s) into SDFile
+
+SYNOPSIS
+    MergeTextFilesWithSD.pl SDFile TextFile(s)...
+
+    MergeTextFilesWithSD.pl [-h, --help] [--indelim comma | semicolon] [-c,
+    --columns colnum,...;... | collabel,...;...] [-k, --keys colkeynum;... |
+    colkeylabel;...] [-m, --mode colnum | collabel] [-o, --overwrite] [-r,
+    --root rootname] [-s, --sdkey sdfieldname] [-w, --workingdir dirname]
+    SDFile TextFile(s)...
+
+DESCRIPTION
+    Merge multiple CSV or TSV *TextFile(s)* into *SDFile*. Unless -k --keys
+    option is used, data rows from all *TextFile(s)* are added to *SDFile*
+    in a sequential order, and the number of compounds in *SDFile* is used
+    to determine how many rows of data are added from *TextFile(s)*.
+
+    Multiple *TextFile(s)* names are separated by spaces. The valid file
+    extensions are *.csv* and *.tsv* for comma/semicolon and tab delimited
+    text files respectively. All other file names are ignored. All the text
+    files in a current directory can be specified by **.csv*, **.tsv*, or
+    the current directory name. The --indelim option determines the format
+    of *TextFile(s)*. Any file which doesn't correspond to the format
+    indicated by --indelim option is ignored.
+
+OPTIONS
+    -h, --help
+        Print this help message.
+
+    --indelim *comma | semicolon*
+        Input delimiter for CSV *TextFile(s)*. Possible values: *comma or
+        semicolon*. Default value: *comma*. For TSV files, this option is
+        ignored and *tab* is used as a delimiter.
+
+    -c, --columns *colnum,...;... | collabel,...;...*
+        This value is mode specific. It is a list of columns to merge into
+        *SDFile* specified by column numbers or labels for each text file
+        delimited by ";". All *TextFile(s)* are merged into *SDFile*.
+
+        Default value: *all;all;...*. By default, all columns from
+        TextFile(s) are merged into *SDFile*.
+
+        For *colnum* mode, input value format is:
+        *colnum,...;colnum,...;...*. Example:
+
+            "1,2;1,3,4;7,8,9"
+
+        For *collabel* mode, input value format is:
+        *collabel,...;collabel,...;...*. Example:
+
+            "MW,SumNO;SumNHOH,ClogP,PSA;MolName,Mol_Id,Extreg"
+
+    -k, --keys *colkeynum;... | colkeylabel;...*
+        This value is mode specific. It specifies column keys to use for
+        merging *TextFile(s)* into *SDFile*. The column keys, delimited by
+        ";", are specified by column numbers or labels for *TextFile(s)*.
+
+        By default, data rows from *TextFile(s)* are merged into *SDFile* in
+        the order they appear.
+
+        For *colnum* mode, input value format is:*colkeynum, colkeynum;...*.
+        Example:
+
+            "1;3;7"
+
+        For *collabel* mode, input value format is:*colkeylabel,
+        colkeylabel;...*. Example:
+
+            "Mol_Id;Mol_Id;Cmpd_Id"
+
+    -m, --mode *colnum | collabel*
+        Specify how to merge *TextFile(s)* into *SDFile*: using column
+        numbers or column labels. Possible values: *colnum or collabel*.
+        Default value: *colnum*.
+
+    -o, --overwrite
+        Overwrite existing files.
+
+    -r, --root *rootname*
+        New SD file name is generated using the root: <Root>.sdf. Default
+        file name:
+        <InitialSDFileName>MergedWith<FirstTextFileName>1To<Count>.sdf.
+
+    -s, --sdkey *sdfieldname*
+        *SDFile* data field name used as a key to merge data from
+        TextFile(s). By default, data rows from *TextFile(s)* are merged
+        into *SDFile* in the order they appear.
+
+    -w, --workingdir *dirname*
+        Location of working directory. Default: current directory.
+
+EXAMPLES
+    To merge Sample1.csv and Sample2.csv into Sample.sdf and generate
+    NewSample.sdf, type:
+
+        % MergeTextFileswithSD.pl -r NewSample -o Sample.sdf
+          Sample1.csv Sample2.csv
+
+    To merge all Sample*.tsv into Sample.sdf and generate NewSample.sdf
+    file, type:
+
+        % MergeTextFilesWithSD.pl -r NewSample -o Sample.sdf
+          Sample*.tsv
+
+    To merge column numbers "1,2" and "3,4,5" from Sample2.csv and
+    Sample3.csv into Sample.sdf and to generate NewSample.sdf, type:
+
+        % MergeTextFilesWithSD.pl -r NewSample -m colnum -c "1,2;3,4,5"
+          -o Sample.sdf Sample1.csv Sample2.csv
+
+    To merge column "Mol_ID,Formula,MolWeight" and "Mol_ID,ChemBankID,NAME"
+    from Sample1.csv and Sample2.csv into Sample.sdf using "Mol_ID" as SD
+    and column keys to generate NewSample.sdf, type:
+
+        % MergeTextFilesWithSD.pl -r NewSample -s Mol_ID -k "Mol_ID;Mol_ID"
+          -m collabel -c "Mol_ID,Formula,MolWeight;Mol_ID,ChemBankID,NAME"
+          -o Sample1.sdf Sample1.csv Sample2.csv
+
+AUTHOR
+    Manish Sud <msud@san.rr.com>
+
+SEE ALSO
+    ExtractFromSDFiles.pl, FilterSDFiles.pl, InfoSDFiles.pl, JoinSDFiles.pl,
+    JoinTextFiles.pl, MergeTextFiles.pl, ModifyTextFilesFormat.pl,
+    SplitSDFiles.pl, SplitTextFiles.pl
+
+COPYRIGHT
+    Copyright (C) 2015 Manish Sud. All rights reserved.
+
+    This file is part of MayaChemTools.
+
+    MayaChemTools is free software; you can redistribute it and/or modify it
+    under the terms of the GNU Lesser General Public License as published by
+    the Free Software Foundation; either version 3 of the License, or (at
+    your option) any later version.
+