# HG changeset patch
# User jbrayet
# Date 1455196495 18000
# Node ID 4b56cc39e4de05d577f9817d70bdb3a39528ecf5
# Parent 2e121ae0a2eb38f898cf65e45ed3b774372b6b7e
Uploaded

diff -r 2e121ae0a2eb -r 4b56cc39e4de ahopro_wrapper.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/ahopro_wrapper.xml Thu Feb 11 08:14:55 2016 -0500
@@ -0,0 +1,275 @@
+
+
+ Motif search and P-value calculation
+
+ institutcuriengsintegration/ahopro:1.3
+
+
+#set $actualN =len($function['motifs'])
+##silent sys.stderr.write("\n\n $actual_motifs=%s\n\n" % str ($actualN))
+ahopro_wrapper.sh ${ahopro_config} ${motif_file} ${function.function_selector} $outfile $function['seq_name'] $function['nbr_motif'] $actualN
+#if str ( $function.function_selector) == "p-value"
+${letter_freq_file}
+#end if
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+    #if str( $function.function_selector ) == "only" or str( $function.function_selector ) == "occ_pvalue"
+sequence $function['seq']
+search p-value
+    #else if str( $function.function_selector ) == "p-value"
+        #if str ( $function.getTextModel.get_selector ) == "0" and str($function.getTextModel.getLetterFreq.get_selector) == "upload"
+sequence $function['seq']
+        #end if
+        #if str ( $function.getTextModel.get_selector ) == "0" and str($function.getTextModel.getLetterFreq.get_selector) == "paste"
+LetterFreqFile
+        #end if
+        #if ( $function.getTextModel.get_selector ) == "1" ##markov
+sequence $function['seq']
+        #end if
+TextLen $function['textLen']
+    #end if
+TextModel $function.getTextModel.get_selector
+MotifsNumber $function['nbr_motif']
+    #for $m in $function.motifs
+motifMode $m.mode
+MotifFilename
+name $m.name
+        #if str ( $m.mode ) == "PWM / PSSM"
+threshold $m.threshold
+        #end if
+        #if str ( $function.function_selector) == "p-value"
+occurenceNumber $m.occ
+ 
#end if
+complement $m.complement
+    #end for
+
+
+
+#if (str( $function.function_selector ) == "p-value") and ( str( $function.getTextModel.get_selector ) == "0" ) and ( str($function.getTextModel.getLetterFreq.get_selector) == "paste" )
+$function.getTextModel.getLetterFreq['letterFreq']
+#end if
+
+
+
+#for $n, $m in enumerate($function['motifs'])
+$m.file
+#if int($function['nbr_motif']) != 1 and int($n) != int($function['nbr_motif'])-1
+*
+#end if
+#end for
+
+
+
+
+
+
+##AhoPro was created to search for overrepresentation of given motifs in DNA sequences and to search for motif co-occurrence. This can reveal synergy between transcription factors (TFs), which usually takes place in regulatory modules of genes.
+
+**What it does**
+
+AhoPro is an exact p-value calculator for multiple occurrences of multiple motifs.
+
+Here you can:
+
+1. Search for motifs in your DNA sequence.
+2. Calculate the p-value for a given number of TFBS motifs.
+3. Find occurrences of your TFBS motifs in your texts and calculate the p-value.
+
+Use the following references to search for given motifs in your DNA sequence, then calculate the p-value for the resulting motif occurrences.
+
+-----
+
+**Find a specific TFBS**
+
+If you wish to search for occurrences of a particular transcription factor binding site (TFBS) in a sequence, you can find information about your specific TFBS by referring to the Homo sapiens Comprehensive Model Collection (HOCOMOCO) of transcription factor (TF) binding models. HOCOMOCO was obtained by careful integration of data from different sources and contains 426 non-redundant curated binding models for 401 human TFs.
+
+HOCOMOCO homepage: http://autosome.ru/HOCOMOCO/index.php
+
+Thresholds for PWMs: http://autosome.ru/HOCOMOCO/download_helper.php?path=download/HOCOMOCOv9_AD_thresholds_PWM_hg19.zip&name=HOCOMOCOv9_AD_thresholds_PWM_hg19.zip
+
+-----
+
+
+**Motif representation**
+
+You can choose:
+
+1. 
**List** for a motif given as an enumeration of possible binding sites.
+
+* Example Bicoid motif::
+
+    Bicoid motif:
+    GCCCCTAATCCCTT
+    CCATCTAATCCCTT
+    TTGGCTAATCCCAG
+    GCCACTAATCCCGA
+    CAACGTAATCCCCA
+    AATTATAATCCCTT
+    ...
+    ...
+
+
+2. **PWM** for a motif given by its position weight matrix (PWM) and threshold. A PWM is a rectangular grid of numbers that shows how likely each nucleotide is to occur at each position of the motif. Two orientations are possible:
+Row and Column.
+
+AhoPro accepts both orientations.
+
+* Example Bicoid PWM::
+
+    Column PWM
+
+    -0.544  0.423  0.356 -0.388
+    -0.398  0.422 -0.329  0.128
+    -0.398 -2.054 -2.054  0.992
+     1.135 -2.054 -1.400 -2.054
+     1.164 -2.054 -2.054 -2.054
+    -2.054 -1.018 -0.728  1.025
+    -2.054  1.408 -2.054 -2.054
+    -1.520  1.185 -1.008 -0.702
+    -0.713  0.422  0.356 -0.260
+
+
+    Row PWM
+
+
+    -0.544 -0.398 -0.398  1.135  1.164 -2.054 -2.054 -1.520 -0.713
+     0.423  0.422 -2.054 -2.054 -2.054 -1.018  1.408  1.185  0.422
+     0.356 -0.329 -2.054 -1.400 -2.054 -0.728 -2.054 -1.008  0.356
+    -0.388  0.128  0.992 -2.054 -2.054  1.025 -2.054 -0.702 -0.260
+
+
+3. **Consensus** for a motif given by its IUPAC consensus.
+
+* Example AP-1::
+
+    RSTGACTNMNW
+
+
+-----
+
+**Cite AhoPro**
+
+If you use this tool, please cite: Boeva V, Clement J, Regnier M, Roytberg M, Makeev V: Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms for Molecular Biology 2007, 2:13.
+
+
+
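
As an illustration of the motif representations described in the help text of this wrapper, the following Python sketch shows how an IUPAC consensus such as RSTGACTNMNW can be expanded into a regular expression, and how a column-oriented PWM relates to its row-oriented form. It is not taken from AhoPro and makes no claim about how AhoPro parses its inputs; the function names `consensus_to_regex`, `find_consensus` and `transpose_pwm` are hypothetical and exist only for this example.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC[code]  # raises KeyError on a non-IUPAC character
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_consensus(consensus, sequence):
    """Return the 0-based start of every match, overlapping ones included."""
    # A lookahead group lets finditer report overlapping occurrences.
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def transpose_pwm(matrix):
    """Swap rows and columns, e.g. turn a column PWM into a row PWM."""
    return [list(column) for column in zip(*matrix)]


if __name__ == "__main__":
    print(consensus_to_regex("RSTGACTNMNW"))
    print(find_consensus("RSTGACTNMNW", "TTAGTGACTCAGATT"))
    # First two lines of the Column PWM example from the help text.
    print(transpose_pwm([[-0.544, 0.423, 0.356, -0.388],
                         [-0.398, 0.422, -0.329, 0.128]]))
```

Transposing the first two lines of the Column PWM example yields the first two columns of the Row PWM example, which is why a tool can accept either orientation as long as it knows which one it was given.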