view chemfp_clustering/nxn_clustering.xml @ 22:6c496b524b41

ChemicalToolBoX update.
author Bjoern Gruening <bjoern.gruening@gmail.com>
date Sun, 02 Jun 2013 19:53:56 +0200
parents 7c84cfa515e0
children 1868005213a1

<tool id="ctb_chemfp_nxn_clustering" name="NxN Clustering" version="0.1">
    <description>of molecular fingerprints</description>
    <requirements>
        <requirement type="package" version="1.7.0">numpy</requirement>
        <requirement type="package" version="1.1p1">chemfp</requirement>
        <requirement type="package" version="1.2.1">matplotlib</requirement>
        <requirement type="package" version="0.12.0">scipy</requirement>
        <requirement type="package" version="2.3.2">openbabel</requirement>
    </requirements>
    <command interpreter='python'>
        nxn_clustering.py
            -i $infile
            -t $threshold
            -o $outfile
            --oformat $oformat
    </command>
    <inputs>
        <param name="infile" type="data" format="fps" label="Fingerprint dataset" help="Dataset missing? See TIP below"/>
        <param name='threshold' type='float' value='0.0' label="Tanimoto threshold" help="Value between 0 and 1" />

        <param name='oformat' type='select' format='text' label="Format of the resulting picture">
            <option value='png'>PNG</option>
            <option value='svg'>SVG</option>
        </param>
    </inputs>
    <outputs>
        <data type="data" format="svg" name="outfile" label="${tool.name} on ${on_string}">
            <change_format>
                <when input="oformat" value="png" format="png"/>
            </change_format>
        </data>
    </outputs>
    <tests>
        <test>
            <param name="infile" ftype="fps" value="q.fps" />
            <param name="threshold" value='0.75' />
            <param name="oformat" value="svg" />
            <output ftype="svg" name="outfile" file='NxN_Clustering_on_q.svg' />
        </test>
    </tests>
    <help>

**Note**. You need molecular fingerprints in FPS format. Open Babel Fastsearch index is not supported.

**Note**. Currently, this tool can only be used with small datasets.


**What it does**

Generates hierarchical clusters from the fingerprints and visualizes them as a dendrogram.
The chemfp_ project is used for the clustering and the fingerprint handling.

.. _chemfp: http://chemfp.com/
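
As a rough illustration of the idea, and not the code this tool actually runs, the sketch below builds an NxN Tanimoto distance matrix from hex-encoded fingerprints using only the Python standard library. The function names (``popcount``, ``tanimoto``, ``distance_matrix``) and the short hex strings are hypothetical, and treating every pair that scores below the threshold as completely dissimilar is an assumption about how the threshold is applied.

```python
from itertools import combinations


def popcount(value):
    """Count the bits set in a non-negative integer."""
    return bin(value).count("1")


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two hex-encoded fingerprints."""
    a, b = int(fp_a, 16), int(fp_b, 16)
    union = popcount(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    # Bits shared by both = bits in a + bits in b - bits in the union.
    shared = popcount(a) + popcount(b) - union
    return shared / union


def distance_matrix(fingerprints, threshold=0.0):
    """Return an NxN list of lists holding 1 - similarity.

    Assumption: pairs scoring below the threshold get distance 1.0.
    """
    n = len(fingerprints)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        score = tanimoto(fingerprints[i], fingerprints[j])
        if threshold > score:
            score = 0.0
        dist[i][j] = dist[j][i] = 1.0 - score
    return dist


if __name__ == "__main__":
    # Made-up 8-bit fingerprints, far shorter than real FPS records.
    for row in distance_matrix(["0f", "07", "f0"], threshold=0.5):
        print(row)
```

A matrix like this could then be condensed with ``scipy.spatial.distance.squareform`` and passed to ``scipy.cluster.hierarchy.linkage`` and ``dendrogram`` to draw the plot. Whether the tool takes exactly this route is not stated here.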

-----

**Example**

* input::

	-  fingerprints in FPS format

		#FPS1
		#num_bits=881
		#type=CACTVS-E_SCREEN/1.0 extended=2
		#software=CACTVS/unknown
		#source=/home/mohammed/galaxy-central/database/files/000/dataset_423.dat
		#date=2012-02-09T13:20:37
		07ce04000000000000000000000000000080060000000c000000000000001a800f0000780008100000701487e960cc0bed3248000580644626004101b4844805901b041c2e
		19511e45039b8b2926101609401b13e40800000000000100200000040080000010000002000000000000	55169009
		07ce04000000000000000000000000000080060000000c000000000000001a800f0000780008100000701087e960cc0bed3248000580644626004101b4844805901b041c2e
		19111e45039b8b2926105609401313e40800000000000100200000040080000010000002000000000000	55079807
		........

	- Tanimoto threshold: 0.8 (between 0 and 1)

* output::

	clustering plot

.. image:: $PATH_TO_IMAGES/NxN_clustering.png

    </help>

</tool>