view NBmodel_stan2_galaxy.xml @ 0:8b2027117ce5 draft default tip

"planemo upload for repository https://github.com/McIntyre-Lab/BayesASE/tree/main/galaxy commit b0be4c13808f3b2973aad4df5bfe87c6f41a3359"
author malex
date Thu, 14 Jan 2021 21:31:44 +0000

<tool id="NBmodel_stan2_galaxy" name="Run the Bayesian model" version="21.1.13">
    <description>nbmodel_stan2_flex_prior.R</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <stdio>
      <exit_code range="1:" level="fatal" />
      <regex match="recommend"
             source="stderr"
             level="warning"
             description="recommendation message was written in stderr" />
      <regex match="Loading library"
             source="stderr"
             level="log" />
      <regex match="Execution halted"
         source="both"
         level="fatal"
         description="Execution halted." />
      <regex match="error"
         source="both"
         level="fatal"
         description="An undefined error occurred, please check your input carefully and contact your administrator." />
    </stdio>
    <command><![CDATA[
    nbmodel_stan2.py
    --design='$design'
    --infile='$infile'
    --outfile='$output'
    --cond='$cond'
    --iterations='$iterations'
    --warmup='$warmup'

]]></command>
    <inputs>
        <param name="design" type="data" format="tabular,tsv" label="Design File" help="Select your Comparate Design File."/>
        <param name="infile" type="data" format="tabular,tsv" label="Input File" help="Select the dataset with the merged comparates and new headers."/>
        <param name="cond" type="text" label="conditions" value="2" help="Enter the number of conditions you are comparing (e.g., M vs. F would be 2)"/>
        <param name="iterations" type="text" label="iterations" value="100000" help="Enter the number of iterations [default = 100000]"/>
        <param name="warmup" type="text" label="warmup" value="10000" help="Enter the warmup number [default = 10000]"/>
    </inputs>
    <outputs>
        <data name="output" format="tsv" label="${tool.name} on ${on_string}: Bayesian Model Output" />
    </outputs>
    <tests>
        <test>
            <param name="design" ftype="data" value="bayesian_input/comparate_design_file.tsv"/>
            <param name="infile" ftype="data" value="bayesian_input/bayesian_input_W55_M_V.tabular"/>
            <param name="cond" ftype="text" value="2" />
            <param name="iterations" ftype="text" value="100000" />
            <param name="warmup" ftype="text" value="10000" />
            <output name="output" file="bayesian_output_W55_M_V.tabular" />
        </test>
    </tests>
    <help><![CDATA[

**Tool Description**

The "Run the Bayesian model" tool sets the parameters for the NBModel, generalizes the input and output name requirements, and initiates the Stan implementation of the NBModel.
The statistical model is packaged in the script environmentalmodel2.stan (supported on the Stan platform) and wrapped in the R script nbmodel_stan2_flex_prior.R.
The original R script hardcoded the names of the conditions used in the original implementation.

The model is used to test three null hypotheses of interest (for i = 1,2 and k = 1,2,…,Ki)::

    (1) H01: Allelic balance in Comparate 1 (e.g., Male) or, equivalently, α1 = 1.
    (2) H02: Allelic balance in Comparate 2 (e.g., Female) or, equivalently, α2 = 1.
    (3) H03: The level of AI is not different between comparates.

If H01 or H02 is rejected (at a posterior p-value threshold of 0.05), the corresponding “AI_{cn}_decision” column, where n refers to the comparate number, is set to 1.
If a hypothesis cannot be rejected, its column value remains 0.

Testing the third null hypothesis determines whether the true proportion of reads coming from G1 is the same within Comparate 1 as it is within Comparate 2 (Novelo et al. 2018).
This is a separate process that effectively tests whether α1 = α2. If H03 is rejected, the Bayesian model output column “AI_diffinc2andc1” is set to 1.
All model output is compiled into one wide-format TSV file per Bayesian input file.
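
The decision rule described above can be sketched in Python. This is a minimal illustration only, not the tool's implementation: the function name ``ai_decision`` is hypothetical, and the strict "less than 0.05" comparison is an assumption taken from the flag description in the Headers table of this help.

```python
def ai_decision(bayes_pval, threshold=0.05):
    """Return 1 if the null hypothesis is rejected, else 0.

    Assumption: rejection means the posterior p-value is strictly
    below the threshold (0.05 in the description above).
    """
    return 1 if bayes_pval < threshold else 0


# Posterior p-values copied from the first row of the example output in this help.
print(ai_decision(0.9284))  # H01, comparate 1: prints 0
print(ai_decision(0.8839))  # H02, comparate 2: prints 0
print(ai_decision(0.866))   # H03, alpha1 = alpha2: prints 0
```

All three example p-values exceed 0.05, so none of the hypotheses is rejected and every flag stays at 0.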


**Inputs**


**Comparate Design File [REQUIRED]**

**NOTE**: The Comparate Design File is created and supplied by the user. It explicitly lists the comparates that the user wants to compare.

The Comparate Design File must contain the following columns, in order::

    (1) Comparate_1: comparate 1 identifier (e.g., W55_M)
    (2) Comparate_2: comparate 2 identifier (e.g., W55_V)
    (3) compID: A unique identifier that specifies the comparison (e.g., W55_M_V)

Example Comparate Design File:

    +---------------+---------------+----------+
    |  Comparate_1  |  Comparate_2  |  compID  |
    +---------------+---------------+----------+
    |  R105_F       |  R105_M       | R105_F_M |
    +---------------+---------------+----------+
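
As a rough illustration of the layout above, the following Python sketch parses a tab-separated design file and checks the column order. It is not part of the tool: the function name ``read_design`` is hypothetical, and the exact header spelling (``Comparate_1``, ``Comparate_2``, ``compID``) is assumed from the example table.

```python
import csv
import io

# Column names as shown in the example table (assumed to be exact).
EXPECTED_HEADER = ["Comparate_1", "Comparate_2", "compID"]


def read_design(handle):
    """Parse a tab-separated design file and check its column order."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {header}")
    rows = [row for row in reader if row]  # skip empty lines
    for row in rows:
        if len(row) != len(EXPECTED_HEADER):
            raise ValueError(f"expected 3 columns, got {len(row)}: {row}")
    return rows


# The single comparison from the example table, as tab-separated text.
example = "Comparate_1\tComparate_2\tcompID\nR105_F\tR105_M\tR105_F_M\n"
print(read_design(io.StringIO(example)))
```

Each returned row corresponds to one comparison in the design file.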


**Dataset or Collection of Datasets containing Comparates for comparisons [REQUIRED]**

These files can be generated by the "Merge Comparate Datasets and Generate Headers" tool.

Example Input File:

    +-------------+-----------------+-------------------+--------------------+-----------------+-------------+------------------+-------------------+--------------------+------------------+-------------------------+---------------------+------------------+-------------------+--------------------+-------------------+--------------------------+------------------+--------------------+-------------------+--------------------+------------------+--------------+------------------+------------------+--------------------+-------------------+-------------------+------------------+-----------------+----------------------+---------------------+-------------------+-------------------------+-------------------+
    | Feature_ID  | prior_c1_both   | prior_c1_g1       | prior_c1_g2        | c1_flag_analyze | c1_num_reps | c1_g1_total_rep1 | c1_g2_total_rep1  | c1_both_total_rep1 | c1_flag_apn_rep1 | c1_APN_total_reads_rep1 | c1_APN_both_rep1    | c1_g1_total_rep2 | c1_g2_total_rep2  | c1_both_total_rep2 | c1_flag_apn_rep2  | c1_APN_total_reads_rep2  | c1_APN_both_rep2 |prior_c2_both       | prior_c2_g1       |  prior_c2_g2       | c2_flag_analyze  | c2_num_reps  | c2_g1_total_rep1 | c2_g2_total_rep1 | c2_both_total_rep1 | c2_flag_apn_rep1  | c2_flag_apn_rep1  | c2_APN_both_rep1 | c2_g1_total_rep2|  c2_g2_total_rep2    | c2_both_total_rep2  | c2_flag_apn_rep2  | c2_APN_total_reads_rep2 | c2_APN_both_rep2  |
    +-------------+-----------------+-------------------+--------------------+-----------------+-------------+------------------+-------------------+--------------------+------------------+-------------------------+---------------------+------------------+-------------------+--------------------+-------------------+--------------------------+------------------+--------------------+-------------------+--------------------+------------------+--------------+------------------+------------------+--------------------+-------------------+-------------------+------------------+-----------------+----------------------+---------------------+-------------------+-------------------------+-------------------+
    |l(1)G1096    |0.799907266902715| 0.118361153262519 | 0.0817315798347665 |        1        |       2     |        295       |         234       |        2197        |         1        |     12.7234208727912    | 10.2551010446158    |        1885      |        1165       |        12201       |         1         | 71.2019427901982         | 56.9617787757493 | 0.802196053469128  | 0.114417568427753 | 0.0833863781031191 |         1        |       2      |        691       |        519       |         5020       |         1         | 29.0734648052328  | 23.4243873865079 |      1075       |           812        |         7481        |          1        | 43.7266913990042        | 34.9168212437762  |
    +-------------+-----------------+-------------------+--------------------+-----------------+-------------+------------------+-------------------+--------------------+------------------+-------------------------+---------------------+------------------+-------------------+--------------------+-------------------+--------------------------+------------------+--------------------+-------------------+--------------------+------------------+--------------+------------------+------------------+--------------------+-------------------+-------------------+------------------+-----------------+----------------------+---------------------+-------------------+-------------------------+-------------------+
    | CG10932     |0.853881278538813| 0.0597412480974125| 0.0863774733637747 |        1        |       2     |         13       |          39       |         308        |         1        |     5.06815839835124    | 4.33534520830266    |         100      |         134       |         1394       |         1         | 22.9213896658325         | 19.6266745178861 | 0.866028708133971  | 0.0344497607655502| 0.0995215311004785 |         1        |       2      |         29       |         62       |          674       |         1         | 10.3878993081113  | 9.10716914470779 |        38       |           125        |          920        |          1        |  15.2470189901369       | 12.9534815250994  |
    +-------------+-----------------+-------------------+--------------------+-----------------+-------------+------------------+-------------------+--------------------+------------------+-------------------------+---------------------+------------------+-------------------+--------------------+-------------------+--------------------------+------------------+--------------------+-------------------+--------------------+------------------+--------------+------------------+------------------+--------------------+-------------------+-------------------+------------------+-----------------+----------------------+---------------------+-------------------+-------------------------+-------------------+
    | CG8920      |0.808955223880597| 0.123383084577114 | 0.0676616915422886 |        1        |       2     |         93       |          20       |         500        |         1        |     39.4720538720539    | 32.1912457912458    |         347      |         257       |         2633       |         1         | 208.422222222222         | 169.53063973064  | 0.821591948764867  | 0.108417200365965 | 0.0699908508691674 |         1        |       2      |        163       |        122       |         1112       |         1         | 89.9299663299663  | 71.5858585858586 |       237       |           134        |         1881        |          1        |  144.974410774411       | 121.086195286195  |
    +-------------+-----------------+-------------------+--------------------+-----------------+-------------+------------------+-------------------+--------------------+------------------+-------------------------+---------------------+------------------+-------------------+--------------------+-------------------+--------------------------+------------------+--------------------+-------------------+--------------------+------------------+--------------+------------------+------------------+--------------------+-------------------+-------------------+------------------+-----------------+----------------------+---------------------+-------------------+-------------------------+-------------------+

------------------------------------------------------------------------------------------------------

**Output**


The tool generates a single TSV file for every comparison (row) in the Comparate Design File.


Example Bayesian Output file:

    +-------------+------------+----------------+-----------------+------------------+------------------+----------------------+------------------+-----------------+-------------------+----------------------+--------------------+---------------------+------------------+----------------------------------+--------------------+-----------------+---------------+--------------+---------------------+--------------------+--------------------+---------------+--------------+----------------+---------------------+---------------------+----------------+-----------------+--------------+ 
    | comparison  |FEATURE_ID  | W55_M_num_reps | W55_V_num_reps  | counts_W55_M_g1  | counts_W55_M_g2  |  counts_W55_M_both   | counts_W55_V_g1  | counts_W55_V_g2 | counts_W55_V_both | prior_W55_M_g1       |prior_W55_M_g2      |prior_W55_V_g1       | prior_W55_V_g2   | H3_independence_Bayesian_pvalue  |g1_W55_M_sampleprop | g1_W55_M_theta  |g1_W55_M_q025  |g1_W55_M_q975 | g1_W55_M_Bayes_pval |g1_W55_M_AI_decision|g1_W55_V_sampleprop |g1_W55_V_theta |g1_W55_V_q025 | g1_W55_V_q975  | g1_W55_V_Bayes_pval |g1_W55_V_AI_decision |alpha1_postmean | alpha2_postmean | flaganalyze  |
    +=============+============+================+=================+==================+==================+======================+==================+=================+===================+======================+====================+=====================+==================+==================================+====================+=================+===============+==============+=====================+====================+====================+===============+==============+================+=====================+=====================+================+=================+==============+
    | W55_M_V     | l(1)G0196  |        3       |        3        |       362        |       520        |          3990        |         413      |         605     |         4475      | 0.0743021346469622   |0.106732348111658   |0.0751866011287093   |0.110140178408884 |0.866                             |0.4104              |0.4948           |0.3778         |0.612         |0.9284               | 0                  | 0.4057             |0.5082         | 0.3912       | 0.6231         | 0.8839              | 0                   | 1.0182         | 0.9905          |     1        |
    +-------------+------------+----------------+-----------------+------------------+------------------+----------------------+------------------+-----------------+-------------------+----------------------+--------------------+---------------------+------------------+----------------------------------+--------------------+-----------------+---------------+--------------+---------------------+--------------------+--------------------+---------------+--------------+----------------+---------------------+---------------------+----------------+-----------------+--------------+  
    | W55_M_V     | CG10932    |        3       |        3        |        45        |        79        |           661        |          91      |         101     |          723      | 0.0573248407643312   |0.100636942675159   |0.0994535519125683   |0.110382513661202 |0.5916                            |0.3629              |0.5006           |0.3518         |0.6487        |0.993                | 0                  |0.474               |0.4446         | 0.3019       | 0.5949         | 0.4525              |0                    | 1.0108         | 1.1341          |     1        |
    +-------------+------------+----------------+-----------------+------------------+------------------+----------------------+------------------+-----------------+-------------------+----------------------+--------------------+---------------------+------------------+----------------------------------+--------------------+-----------------+---------------+--------------+---------------------+--------------------+--------------------+---------------+--------------+----------------+---------------------+---------------------+----------------+-----------------+--------------+ 
    | W55_M_V     | CG8920     |        3       |        3        |        49        |        18        |           336        |          41      |          25     |          337      | 0.121588089330025    |0.0446650124069479  |0.101736972704715    |0.0620347394540943|0.8316                            |0.7313              |0.5274           |0.3786         |0.6721        |0.7099               | 0                  |0.6212              |0.5057         | 0.3621       | 0.6482         | 0.9345              |0                    | 0.9566         | 0.9996          |      1       |
    +-------------+------------+----------------+-----------------+------------------+------------------+----------------------+------------------+-----------------+-------------------+----------------------+--------------------+---------------------+------------------+----------------------------------+--------------------+-----------------+---------------+--------------+---------------------+--------------------+--------------------+---------------+--------------+----------------+---------------------+---------------------+----------------+-----------------+--------------+
    | W55_M_V     | Mapmodulin |        3       |        3        |        23        |       136        |          1553        |          15      |         188     |         1912      | 0.0134345794392523   |0.0794392523364486  |0.00709219858156028  |0.0888888888888889|0.8649                            |0.1447              |0.4709           |0.3396         |0.6084        |0.6644               |0                   |0.0739              |0.4552         |0.3175        | 0.602          | 0.5348              |0                    | 1.0717         | 1.1083          |      1       |
    +-------------+------------+----------------+-----------------+------------------+------------------+----------------------+------------------+-----------------+-------------------+----------------------+--------------------+---------------------+------------------+----------------------------------+--------------------+-----------------+---------------+--------------+---------------------+--------------------+--------------------+---------------+--------------+----------------+---------------------+---------------------+----------------+-----------------+--------------+


Headers:

    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    | Name                                | Description                                                                                      |
    +=====================================+==================================================================================================+
    | Comparison                          | • comparison being tested                                                                        |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    | FEATURE_ID                          | • Unique genic feature ID                                                                        |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    | c1_num_reps                         | • Number of replicates for comparate_1                                                           |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    | c2_num_reps                         | • Number of replicates for comparate 2                                                           |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |counts_{comparate}_{g1/g2}           | • Number of reads that aligned preferentially to G1 (or G2) for indicated comparate              |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |counts_{comparate}_both              | • Number of reads that aligned equally well to both updated genomes for indicated comparate      |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    | prior_{comparate}_{g1/g2}           | • The prior probability that a given read will map to G1 (or G2) for {comparate}                 |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |H3_independence_Bayesian_evidence    | • Bayesian evidence for testing the null that the alleles are independent variables.             |
    |                                     |   Minimum value of ev such that the 1−ev central credible interval for α1−α2 does not contain 0. |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |{comparate}_sampleprop               | • The sample proportion of reads among the reads mapped to G1 or G2 (but not both) that have     |
    |                                     |   mapped preferentially to G1 within {comparate}.                                                |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |{comparate}_theta                    | • The point estimate of the proportion generated by G1 after adjusting for systematic bias in    |
    |                                     |   {comparate}. A proportion different from 1/2 implies AI in the comparate. Since                |
    |                                     |   {comparate}_theta is an estimate of θ(n1), credible intervals for θ(n1) are used to flag the   |
    |                                     |   comparate as in AI or not.                                                                     |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |{comparate}_q025                     | • Lower bound for the 95% central credible interval, or equivalently, the 2.5% quantile of the   |
    |                                     |   posterior distribution of θ(n1)                                                                |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |{comparate}_q975                     | • Upper bound for the 95% central credible interval, or equivalently, the 97.5% quantile of      |
    |                                     |   the posterior distribution of θ(n1)                                                            |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |{comparate}_Bayes_evidence           | • The Bayesian evidence. Smaller values can lead to rejection of the null. "ev" is the smallest  |
    |                                     |   value such that the 1−ev central credible interval for θ(n1) does not contain the value        |
    |                                     |   θ(n1) = 1/2, that implies allelic balance in comparate n.                                      |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |{comparate}_AI_decision              | • A 0/1 flag where a "1" indicates that the Bayesian evidence was less than 0.05                 |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |alpha1_postmean                      | • Posterior mean of α for comparate 1. α1 ≠ 1 (i.e., θ1 ≠ 0.5) indicates allelic imbalance.      |
    |                                     |   α1 = sqrt((1/θ1) − 1)                                                                          |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |alpha2_postmean                      | • Posterior mean of α for comparate 2. α2 ≠ 1 (i.e., θ2 ≠ 0.5) indicates allelic imbalance.      |
    |                                     |   α2 = sqrt((1/θ2) − 1)                                                                          |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
    |flagAnalyze                          | • 0/1 flag where a "1" indicates that the {comparate}_flag_analyze variables going into the      |
    |                                     |   model were BOTH equal to 1.                                                                    |
    +-------------------------------------+--------------------------------------------------------------------------------------------------+
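
The relationship between α and θ given in the Headers table can be checked with a short Python sketch. The function name ``alpha_from_theta`` is hypothetical and this is not the tool's code; it only evaluates the formula α = sqrt((1/θ) − 1) as written above. The reported posterior means are presumably summaries over posterior draws, so applying the formula to the point estimate in {comparate}_theta need not reproduce the postmean columns exactly.

```python
import math


def alpha_from_theta(theta):
    """Evaluate alpha = sqrt((1 / theta) - 1) for 0 < theta <= 1."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must be in (0, 1]")
    return math.sqrt(1.0 / theta - 1.0)


# Allelic balance: theta = 1/2 corresponds to alpha = 1.
print(alpha_from_theta(0.5))  # prints 1.0
```

Values of θ below 1/2 give α above 1, and values above 1/2 give α below 1.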
    ]]></help>
    <citations>
            <citation type="bibtex">@ARTICLE{Miller20BASE,
            author = {Brecca Miller and Alison M. Morse and Elyse Borgert and Zihao Liu and Kelsey Sinclair and Gavin Gamble and Fei Zou and Jeremy Newman and Luis Leon Novello and Fabio Marroni and Lauren M. McIntyre},
            title = {Testcrosses are an efficient strategy for identifying cis regulatory variation: Bayesian analysis of allele imbalance among conditions (BASE)},
            journal = {????},
            year = {submitted for publication}
            }</citation>
        </citations>
</tool>