Mercurial > repos > bgruening > text_processing
comparison replace_text_in_line.xml @ 6:8928e6d1e7ba draft
Uploaded
| author | bgruening |
|---|---|
| date | Thu, 08 Jan 2015 09:07:31 -0500 |
| parents | 56e80527c482 |
| children | d64eace4f9f3 |
comparison
| 5:3f0e0d4c15a9 | 6:8928e6d1e7ba |
|---|---|
| 5 </macros> | 5 </macros> |
| 6 <expand macro="requirements"> | 6 <expand macro="requirements"> |
| 7 <requirement type="package" version="4.2.2-sandbox">gnu_sed</requirement> | 7 <requirement type="package" version="4.2.2-sandbox">gnu_sed</requirement> |
| 8 </expand> | 8 </expand> |
| 9 <version_command>sed --version | head -n 1</version_command> | 9 <version_command>sed --version | head -n 1</version_command> |
| 10 <command interpreter="sh"> | 10 <command> |
| 11 <![CDATA[ | 11 <![CDATA[ |
| 12 sed | 12 sed |
| 13 -r | 13 -r |
| 14 --sandbox | 14 --sandbox |
| 15 "s/$find_pattern/$replace_pattern/g" | 15 "s/$find_pattern/$replace_pattern/g" |
| 16 "$input" | 16 "$infile" |
| 17 > "$output" | 17 > "$outfile" |
| 18 ]]> | 18 ]]> |
| 19 </command> | 19 </command> |
| 20 <inputs> | 20 <inputs> |
| 21 <param format="txt" name="input" type="data" label="File to process" /> | 21 <param format="txt" name="infile" type="data" label="File to process" /> |
| 22 <param name="find_pattern" type="text" size="20" label="Find pattern" help="Use simple text, or a valid regular expression (without backslashes // ) " > | 22 <param name="find_pattern" type="text" size="20" label="Find pattern" help="Use simple text, or a valid regular expression (without backslashes // ) " > |
| 23 <sanitizer> | 23 <sanitizer> |
| 24 <valid initial="string.printable"> | 24 <valid initial="string.printable"> |
| 25 <remove value="'"/> | 25 <remove value="'"/> |
| 26 </valid> | 26 </valid> |
| 33 </valid> | 33 </valid> |
| 34 </sanitizer> | 34 </sanitizer> |
| 35 </param> | 35 </param> |
| 36 </inputs> | 36 </inputs> |
| 37 <outputs> | 37 <outputs> |
| 38 <data format="input" name="output" metadata_source="input"/> | 38 <data name="outfile" format_source="infile" metadata_source="infile"/> |
| 39 </outputs> | 39 </outputs> |
| 40 <tests> | 40 <tests> |
| 41 <test> | 41 <test> |
| 42 <param name="input" value="replace_text_in_line_in1.txt" /> | 42 <param name="infile" value="replace_text_in_line1.txt" /> |
| 43 <param name="find_pattern" value="CTC." /> | 43 <param name="find_pattern" value="CTC." /> |
| 44 <param name="replace_pattern" value="FOOBAR" /> | 44 <param name="replace_pattern" value="FOOBAR" /> |
| 45 <output name="output" file="replace_text_in_line_output1.txt" /> | 45 <output name="outfile" file="replace_text_in_line_results1.txt" /> |
| 46 </test> | 46 </test> |
| 47 </tests> | 47 </tests> |
| 48 <help> | 48 <help> |
| 49 <![CDATA[ | 49 <![CDATA[ |
| 50 **What it does** | 50 **What it does** |
| 51 | 51 |
| 52 This tool performs find & replace operation on a specified file. | 52 This tool performs find & replace operation on a specified file. |
| 53 | 53 |
| 54 .. class:: infomark | 54 .. class:: infomark |
| 55 | 55 |
| 56 The **pattern to find** uses the **extended regular** expression syntax (same as running 'sed -r'). | 56 The **pattern to find** uses the **extended regular** expression syntax (same as running 'sed -r'). |
| 57 | 57 |
| 59 | 59 |
| 60 **TIP:** If you need more complex patterns, use the *sed* tool. | 60 **TIP:** If you need more complex patterns, use the *sed* tool. |
| 61 | 61 |
| 62 ----- | 62 ----- |
| 63 | 63 |
| 64 | |
| 65 **Examples of Find Patterns** | 64 **Examples of Find Patterns** |
| 66 | 65 |
| 67 - **HELLO** The word 'HELLO' (case sensitive). | 66 - **HELLO** The word 'HELLO' (case sensitive). |
| 68 - **AG.T** The letters A,G followed by any single character, followed by the letter T. | 67 - **AG.T** The letters A,G followed by any single character, followed by the letter T. |
| 69 - **A{4,}** Four or more consecutive A's. | 68 - **A{4,}** Four or more consecutive A's. |
| 70 - **chr2[012]\\t** The words 'chr20' or 'chr21' or 'chr22' followed by a tab character. | 69 - **chr2[012]\\t** The words 'chr20' or 'chr21' or 'chr22' followed by a tab character. |
| 71 - **hsa-mir-([^ ]+)** The text 'hsa-mir-' followed by one-or-more non-space characters. When using parenthesis, the matched content of the parenthesis can be accessed with **\1** in the **replace** pattern. | 70 - **hsa-mir-([^ ]+)** The text 'hsa-mir-' followed by one-or-more non-space characters. When using parenthesis, the matched content of the parenthesis can be accessed with **\1** in the **replace** pattern. |
| 72 | 71 |
| 73 | 72 |
| 74 | |
| 75 **Examples of Replace Patterns** | 73 **Examples of Replace Patterns** |
| 76 | 74 |
| 77 - **WORLD** The word 'WORLD' will be placed wherever the find pattern was found. | 75 - **WORLD** The word 'WORLD' will be placed wherever the find pattern was found. |
| 78 - **FOO-&-BAR** Each time the find pattern is found, it will be surrounded with 'FOO-' at the beginning and '-BAR' at the end. **&** (ampersand) represents the matched find pattern. | 76 - **FOO-&-BAR** Each time the find pattern is found, it will be surrounded with 'FOO-' at the beginning and '-BAR' at the end. **$** (ampersand) represents the matched find pattern. |
| 79 - **\\1** The text which matched the first parenthesis in the Find Pattern. | 77 - **\\1** The text which matched the first parenthesis in the Find Pattern. |
| 80 | |
| 81 | |
| 82 | 78 |
| 83 | 79 |
| 84 ----- | 80 ----- |
| 85 | 81 |
| 86 **Example 1** | 82 **Example 1** |
| 94 ----- | 90 ----- |
| 95 | 91 |
| 96 **Example 2** | 92 **Example 2** |
| 97 | 93 |
| 98 **Find Pattern:** ^(.{4}) | 94 **Find Pattern:** ^(.{4}) |
| 99 **Replace Pattern:** &\\t | 95 **Replace Pattern:** &\\t |
| 100 | 96 |
| 101 Find the first four characters in each line, and replace them with the same text, followed by a tab character. In practice - this will split the first line into two columns. | 97 Find the first four characters in each line, and replace them with the same text, followed by a tab character. In practice - this will split the first line into two columns. |
| 102 | 98 |
| 103 | 99 |
| 104 ----- | 100 ----- |
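
The find and replace patterns described in the help text can be tried outside Galaxy with a plain `sed` call. The sketch below is not part of the changeset. It assumes GNU sed (for `-r` and for `\t` in the replacement), and `replace_text` is a hypothetical helper name that mirrors the `s/$find_pattern/$replace_pattern/g` expression from the `<command>` block. The `--sandbox` flag is left out because it needs the patched `4.2.2-sandbox` build or a newer GNU sed.

```shell
#!/bin/sh
# Hypothetical helper (not in the tool): apply the wrapper's substitution to stdin.
#   $1 = find pattern (extended regular expression, as with 'sed -r')
#   $2 = replace pattern ('&' = whole match, '\1' = first parenthesised group)
replace_text() {
  sed -r "s/$1/$2/g"
}

# Example 2 from the help text: keep the first four characters of each line
# and append a tab after them.
printf 'chr1AAAA\n' | replace_text '^(.{4})' '&\t'

# Backreference example: reuse the text captured after 'hsa-mir-'.
printf 'hsa-mir-21 x\n' | replace_text 'hsa-mir-([^ ]+)' 'ID=\1'
```

Because both patterns are spliced into a `/`-delimited `s` command, a literal `/` in either pattern would end the expression early. This is presumably why the parameter help asks for patterns without `/` characters.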
