view chemfp_clustering/butina_clustering.xml @ 6:438bc12d591b

Uploaded
author bgruening
date Fri, 26 Apr 2013 08:02:45 -0400
parents a8ac5250d59c
children 21d29a7f13d8

<tool id="chemfp_butina_clustering" name="Taylor-Butina Clustering" version="0.1">
    <description>of molecular fingerprints</description>
    <requirements>
        <requirement type="package" version="1.1p1">chemfp</requirement>
    </requirements>
    <command interpreter='python'>
        butina_clustering.py 
            -i $infile 
            -t $threshold 
            -o $outfile 
            -p 4
    </command>
    <inputs>
        <param name="infile" type="data" format="fps" label="Fingerprint dataset" help="Dataset missing? See the note below"/>
        <param name='threshold' type='float' value='0.8' label='Tanimoto threshold' help='Similarity cutoff between 0 and 1'/>
    </inputs>
    <outputs>
        <data format="tabular" name="outfile" label="${tool.name} on ${on_string}"/>
    </outputs>
    <tests>
        <test>
            <param name="infile" ftype="fps" value="q.fps"/>
            <param name='threshold' value='0.8'/>
            <output name="outfile" ftype="tabular" file='Taylor-Butina_Clustering_on_data_q.txt'/>
        </test>
    </tests>
<help>

**Note**. The input must be molecular fingerprints in FPS format. Open Babel FastSearch indexes are not supported.

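**What it does**

Clusters a molecule library with the Taylor-Butina algorithm. Molecules whose Tanimoto similarity is at or above the threshold are treated as neighbours. The molecule with the most neighbours becomes a cluster centre, its still-unassigned neighbours become the cluster members, and the process repeats on the remaining molecules. A true singleton has no neighbours at all; a false singleton has neighbours, but all of them were claimed by other clusters.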
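
The clustering idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the chemfp-based code this tool runs: fingerprints are modelled as sets of on-bit positions, and the names ``tanimoto``, ``butina`` and ``example`` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    return len(a.intersection(b)) / union if union else 0.0


def butina(fps, threshold):
    """Cluster {id: set_of_on_bits}; returns (true_singletons, false_singletons, clusters)."""
    ids = list(fps)
    # Neighbour lists: every other molecule at or above the threshold.
    neighbours = {
        i: {j for j in ids if j != i and tanimoto(fps[i], fps[j]) >= threshold}
        for i in ids
    }
    true_singletons = [i for i in ids if not neighbours[i]]
    unassigned = {i for i in ids if neighbours[i]}
    false_singletons, clusters = [], []
    # Visit candidates by decreasing neighbour count (ties broken by id).
    for centre in sorted(unassigned, key=lambda i: (-len(neighbours[i]), i)):
        if centre not in unassigned:
            continue  # already a member of an earlier cluster
        members = neighbours[centre].intersection(unassigned)
        unassigned.discard(centre)
        if members:
            clusters.append((centre, sorted(members)))
            unassigned -= members
        else:
            # Had neighbours, but all of them were claimed elsewhere.
            false_singletons.append(centre)
    return true_singletons, false_singletons, clusters


# Hypothetical toy data, not taken from the FPS example in this help text.
example = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 4, 5},
    "C": {1, 2, 3, 5},
    "D": {10, 11},
    "X": {1, 2, 3, 4, 9},
}
true_singletons, false_singletons, clusters = butina(example, 0.8)
print(true_singletons, false_singletons, clusters)
```

In this toy data, ``D`` shares no bits with anything and ends up a true singleton, while ``C`` has one neighbour (``B``) that is claimed by the cluster around ``A`` and so becomes a false singleton. The quadratic neighbour search is fine for a sketch but would be too slow for a real library.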

-----

**Example**

* input::

	-  fingerprints in FPS format

		#FPS1
		#num_bits=881
		#type=CACTVS-E_SCREEN/1.0 extended=2
		#software=CACTVS/unknown
		#source=/home/mohammed/galaxy-central/database/files/000/dataset_423.dat
		#date=2012-02-09T13:20:37
		07ce04000000000000000000000000000080060000000c000000000000001a800f0000780008100000701487e960cc0bed3248000580644626004101b4844805901b041c2e
		19511e45039b8b2926101609401b13e40800000000000100200000040080000010000002000000000000	55169009
		07ce04000000000000000000000000000080060000000c000000000000001a800f0000780008100000701087e960cc0bed3248000580644626004101b4844805901b041c2e
		19111e45039b8b2926105609401313e40800000000000100200000040080000010000002000000000000	55079807
		........

	- Tanimoto threshold: 0.8 (must be between 0 and 1)

* output::

	0 true singletons
	=> 

	0 false singletons
	=> 

	1 clusters
	55091849 has 12 other members
	=> 6499094 6485578 55079807 3153534 55102353 55091466 55091416 6485577 55169009 55091752 55091467 55168823
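
For downstream processing, a report like the one above can be turned into a mapping from cluster centre to members. The following is a minimal sketch that assumes the layout is exactly as in this example (a ``... has N other members`` line followed by a ``=>`` line); ``parse_clusters`` and ``report`` are hypothetical names, and other layouts are not handled.

```python
def parse_clusters(text):
    """Return {centre_id: [member_ids]} from a report in the layout shown above."""
    clusters = {}
    centre = None
    for line in text.splitlines():
        line = line.strip()
        if " has " in line and line.endswith("other members"):
            # Header line of a cluster: the first token is the centre id.
            centre = line.split()[0]
            clusters[centre] = []
        elif line.startswith("=>") and centre is not None:
            # Member line belonging to the most recent cluster header.
            clusters[centre] = line[2:].split()
            centre = None
    return clusters


# The example output from this help text.
report = """0 true singletons
=>

0 false singletons
=>

1 clusters
55091849 has 12 other members
=> 6499094 6485578 55079807 3153534 55102353 55091466 55091416 6485577 55169009 55091752 55091467 55168823
"""
parsed = parse_clusters(report)
print(len(parsed["55091849"]))
```

The ``=>`` lines under the singleton counts are skipped because no cluster header precedes them, so singleton ids are not collected by this sketch.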

</help>

</tool>