annotate scripts/S01a_codons_counting.py @ 10:f62c76aab669 draft default tip

planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 3c7982d775b6f3b472f6514d791edcb43cd258a1
author lecorguille
date Mon, 24 Sep 2018 04:34:39 -0400
parents 04a9ada73cc4
children
rev   line source
5
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
#!/usr/bin/env python
# coding: utf-8
# Author : Victor Mataigne

9
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
import string, os, sys, re, random, itertools, argparse, copy, math
5
import pandas as pd
import numpy as np

def buildDicts(list_codons, content, dict_genetic_code, dict_aa_classif):
    """ Build dictionaries with all values set to 0 (or to an empty list). These dictionaries are used as the starting point for each sequence counting

    Args :
        list_codons (list of str) : all codons except stop codons
        content (int or list) : an integer (for countings and transitions), or an empty list (for resampling)
        dict_genetic_code (dict) : the genetic code : {'aa1': [codon1, codon2,...], ...}
        dict_aa_classif (dict) : the types of the amino-acids : {type: [aa1, aa2, ...], ...}

    Returns :
        dico_codons, dico_aa, dico_aatypes (dicts) : keys : codons/amino-acids/amino-acid types, values : 0 or []
        dico_codons_transitions, dico_aa_transitions, dico_aatypes_transitions (dicts of dicts) :
            the first three dictionaries, nested as values of the keys codons/amino-acids/amino-acid types
    """

    # Sub-routines could have been used here.
    # The deep copies are mandatory, otherwise all dictionaries would reference the same object(s)

    dico_codons = {}
    for codon in list_codons:
        dico_codons[codon] = copy.deepcopy(content)

    dico_aa = {}
    for aa in dict_genetic_code.keys():
        dico_aa[aa] = copy.deepcopy(content)

    dico_aatypes = {}
    for aatype in dict_aa_classif.keys():
        dico_aatypes[aatype] = copy.deepcopy(content)

    dico_codons_transitions = copy.deepcopy(dico_codons)
    for key in dico_codons_transitions.keys():
        dico_codons_transitions[key] = copy.deepcopy(dico_codons)

    dico_aa_transitions = copy.deepcopy(dico_aa)
    for key in dico_aa_transitions.keys():
        dico_aa_transitions[key] = copy.deepcopy(dico_aa)

    dico_aatypes_transitions = copy.deepcopy(dico_aatypes)
    for key in dico_aatypes_transitions.keys():
        dico_aatypes_transitions[key] = copy.deepcopy(dico_aatypes)

    return dico_codons, dico_aa, dico_aatypes, dico_codons_transitions, dico_aa_transitions, dico_aatypes_transitions

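The comment in buildDicts about the copies being mandatory can be illustrated with a small standalone sketch. The inputs below are hypothetical miniature values, not data from the script; the point is only to show what happens when a mutable `content` (the empty-list case used for resampling) is assigned without `copy.deepcopy`.

```python
import copy

# Hypothetical miniature inputs (assumptions, not taken from the script).
content = []                 # the mutable "resampling" case
codons = ["AAA", "AAC"]

# With copy.deepcopy, each key owns an independent list.
independent = {codon: copy.deepcopy(content) for codon in codons}
independent["AAA"].append(1)

# Without a copy, every key references the very same list object.
shared = {codon: content for codon in codons}
shared["AAA"].append(1)

print(independent)  # only the "AAA" entry was modified
print(shared)       # both entries appear modified, since they are one object
```

When `content` is an integer the aliasing is harmless, because integers are immutable; the deep copy matters for the list case and for the nested transition dictionaries.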
def viable(seqs, pos, method):
    """ Check whether the codon at the specified position is usable across a set of sequences : either at least one
    sequence has no "-" in this codon (method "all"), or no sequence has a "-" in this codon (method "any")

    Args :
        seqs : a list (the sequences, which must all have the same size)
        pos : an integer (the position of the first base of the codon, <= len(seqs[0]) - 3)
        method : a string, "all" or "any"

    Returns:
        bool
    """
    codons = [seq[pos:pos+3] for seq in seqs]
    if method == "all":
        return not all("-" in codon for codon in codons)
    elif method == "any":
        return not any("-" in codon for codon in codons)

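As a hedged illustration of the two modes of viable(), the sketch below re-implements the same test under the hypothetical name `is_viable` and applies it to a made-up three-sequence alignment. The alignment and the helper name are assumptions introduced only for this example.

```python
def is_viable(seqs, pos, method):
    # Same logic as viable(): inspect the codon starting at `pos` in every sequence.
    codons = [seq[pos:pos + 3] for seq in seqs]
    if method == "all":
        # True unless every sequence has a gap in this codon
        return not all("-" in codon for codon in codons)
    elif method == "any":
        # True only if no sequence has a gap in this codon
        return not any("-" in codon for codon in codons)

# Hypothetical toy alignment: four codon columns, same length for all sequences.
alignment = [
    "ATG---AAA---",
    "ATGCCC------",
    "ATG---TTT---",
]

# For each codon start, store (result with "all", result with "any").
results = {
    pos: (is_viable(alignment, pos, "all"), is_viable(alignment, pos, "any"))
    for pos in range(0, 12, 3)
}
print(results)
```

In this toy case the first column is accepted by both modes, the two middle columns only by "all", and the last column (gaps everywhere) by neither.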
# # # Function for codons, aa, aatypes Countings -------------------------------------------------------------------------------

def computeAllCountingsAndFreqs(seq, list_codons, init_dict_codons, init_dict_aa, init_dict_classif, dict_genetic_code, dict_aa_classif):
    """ Call all functions dedicated to the computation of occurrences and frequencies of codons, amino-acids, amino-acid types

    Args : see sub-routines

    Returns : 6 dictionaries (occurrences and frequencies for codons, amino-acids, amino-acid types). See sub-routines for details.
    """

    # ------ Sub-routines ------ #

    def codonsCountings(seq, list_codons, init_dict_codons):
        """ Count occurrences of each codon in a sequence
        First reading frame only : input sequence is supposed to be an ORF

        Args :
            seq (str) : the sequence
            list_codons (list of str) : all codons except stop codons
            init_dict_codons (dict) : {codon1 : 0, codon2: 0, ...}

        Return :
            codons (dict) : codons (keys) and their occurrences (values) in the sequence
        """

        codons = copy.deepcopy(init_dict_codons)

        l = len(seq)

        # max_indice is the index of the first base of the last complete codon
        if l%3 == 0: max_indice = l-3
        if l%3 == 1: max_indice = l-4
        if l%3 == 2: max_indice = l-5

        # Codons containing a gap ("-") are skipped
        for codon in range(0,max_indice+1,3):
            if "-" not in seq[codon:codon+3] :
                codons[seq[codon:codon+3]] += 1

        return codons

    def codonsFreqs(dict_codons_counts):
        """ Compute frequencies of each codon in a sequence

        Args :
            dict_codons_counts (dict) : the output of codonsCountings()

        Return :
            codons_freqs (dict) : codons (keys) and their frequencies (values)
        """

        codons_freqs = {}
        # The total is the same for every codon, so it is computed only once
        total = sum(dict_codons_counts.values())

        for key in dict_codons_counts.keys():
            freq = float(dict_codons_counts[key])/total
            codons_freqs[key] = freq

        return codons_freqs

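The counting and frequency steps described in the two docstrings above can be sketched on a toy sequence. This is a minimal standalone version under stated assumptions: the names `count_codons`, `to_freqs`, `STOP_CODONS` and `all_codons` are hypothetical, the stop codons are those of the standard genetic code, and the input sequence is invented for the example.

```python
import itertools

# Assumption: standard genetic code stop codons, excluded from the codon list.
STOP_CODONS = {"TAA", "TAG", "TGA"}
all_codons = [
    "".join(bases)
    for bases in itertools.product("ACGT", repeat=3)
    if "".join(bases) not in STOP_CODONS
]

def count_codons(seq, codons):
    """Count codons in the first reading frame, skipping codons with a gap."""
    counts = dict.fromkeys(codons, 0)
    usable = len(seq) - len(seq) % 3      # drop the trailing incomplete codon
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if "-" not in codon:
            counts[codon] += 1
    return counts

def to_freqs(counts):
    """Turn counts into frequencies (raises ZeroDivisionError if all counts are 0)."""
    total = sum(counts.values())
    return {key: float(value) / total for key, value in counts.items()}

# Hypothetical sequence: ATG, AAA, a gapped codon, AAA, then two leftover bases.
counts = count_codons("ATGAAA---AAACC", all_codons)
freqs = to_freqs(counts)
print(counts["ATG"], counts["AAA"], freqs["AAA"])
```

Like the original, the sketch raises a KeyError if the sequence contains a codon that is absent from the codon list, such as a stop codon.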
    def aaCountings(dict_codons, dict_genetic_code, init_dict_aa):
        """ Count occurrences of each amino-acid in a sequence, based on the countings of codons (1st ORF)

        Args :
            dict_codons (dict) : the output of codonsCountings()
            dict_genetic_code (dict) : the genetic code : {'aa1': [codon1, codon2,...], ...}
            init_dict_aa (dict) : {aa1 : 0, aa2: 0, ...}

        Return :
            dict_aa (dict) : amino-acids (keys) and their occurrences (values)
        """
        dict_aa = copy.deepcopy(init_dict_aa)

        # Add the count of each codon to the amino-acid it encodes.
        # Iterating over items() avoids indexing keys()/values(), which only works in Python 2.
        for key in dict_codons.keys():
            for aa, codons_of_aa in dict_genetic_code.items():
                if key in codons_of_aa:
                    dict_aa[aa] += dict_codons[key]

        return dict_aa

0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
147 def aaFreqs(dict_aa_counts):
        """ Computes frequencies of each amino-acid in a sequence, based on the counts of codons (1st ORF)

        Args :
            dict_aa_counts (dict) : the output of aaCountings()

        Return :
            dict_aa_freqs (dict) : amino-acids (keys) and their frequencies (values)
        """

        dict_aa_freqs = {}
        # Total number of amino-acids, computed once rather than at every iteration
        total = sum(dict_aa_counts.values())

        for key in dict_aa_counts.keys():
            freq = float(dict_aa_counts[key])/total
            dict_aa_freqs[key] = freq

        return dict_aa_freqs

    def aatypesCountings(dict_aa, dict_aa_classif, init_dict_classif):
        """ Computes occurrences of each amino-acid type in a sequence, based on the counts of amino-acids (1st ORF)

        Args :
            - dict_aa (dict) : the output of aaCountings()
            - dict_aa_classif (dict) : the types of the amino-acids ( {type: [aa1, aa2, ...], ...} )
            - init_dict_classif (dict) : {'polar': 0, 'apolar': 0, ...}

        Return :
            dict_aatypes (dict) : amino-acids types (keys) and their occurrences (values)
        """
        dict_aatypes = copy.deepcopy(init_dict_classif)

        # Add the count of each amino-acid to the type(s) it belongs to
        for key_classif in dict_aa_classif.keys():
            for key_aa in dict_aa.keys():
                if key_aa in dict_aa_classif[key_classif]:
                    dict_aatypes[key_classif] += dict_aa[key_aa]

        return dict_aatypes

    def aatypesFreqs(dict_aatypes):
        """ Computes frequencies of each amino-acid type in a sequence, based on the counts of amino-acids (1st ORF)

        Args :
            dict_aatypes (dict) : the output of aatypesCountings()

        Return :
            dict_aatypes_freqs (dict) : amino-acids types (keys) and their frequencies (values)
        """
        dict_aatypes_freqs = {}
        # Total over all types, computed once rather than at every iteration
        total = sum(dict_aatypes.values())

        for key in dict_aatypes.keys():
            freq = float(dict_aatypes[key])/total
            dict_aatypes_freqs[key] = freq

        return dict_aatypes_freqs

    # ------ The function ------ #

    codons_c = codonsCountings(seq, list_codons, init_dict_codons)
    codons_f = codonsFreqs(codons_c)
    aa_c = aaCountings(codons_c, dict_genetic_code, init_dict_aa)
    aa_f = aaFreqs(aa_c)
    aat_c = aatypesCountings(aa_c, dict_aa_classif, init_dict_classif)
    aat_f = aatypesFreqs(aat_c)

    return codons_c, codons_f, aa_c, aa_f, aat_c, aat_f
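
# Illustrative sketch, not part of the original script : the *Freqs() sub-routines
# above all turn a {key: count} dict into a {key: frequency} dict.
# normalise_counts() is a hypothetical helper showing that pattern in one place.
# Assumption : an input whose counts are all zero should give 0.0 frequencies
# instead of raising ZeroDivisionError. That convention is a guess, not the
# documented behaviour of the script.
def normalise_counts(dict_counts):
    total = sum(dict_counts.values())
    if total == 0:
        # Nothing was counted : every frequency is reported as 0.0
        return {key: 0.0 for key in dict_counts}
    return {key: float(count) / total for key, count in dict_counts.items()}

# Possible usage : normalise_counts({"polar": 1, "apolar": 3})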

# # # Functions for various measures (ivywrel, ekqh...) -------------------------------------------------------------------------

def computeVarious(seq, dict_aa_counts, dict_aa_types):
    """ Call all the functions for nucleic and amino-acids sequences description

    Args : See sub-routines for details.

    Returns : 6 integers or floats. See sub-routines for details.
    """

    # ------ Sub-routines ------ #

    def nbCodons(seq):
        """ Compute the number of full codons in a sequence
        Arg : seq (str): the sequence
        Return: nb_codons (int)
        """
        # Floor division ignores an incomplete trailing codon and always returns an int
        nb_codons = len(seq) // 3
        return nb_codons

    def maxIndice(seq):
        """ Compute the highest index for parsing a sequence (start of the last full codon)
        Arg : seq (str): the sequence
        Return : max_indice (int)
        """
        l = len(seq)
        # Drop the 0, 1 or 2 trailing nucleotides, then step back one codon
        max_indice = l - (l % 3) - 3
        return max_indice

    def gc12Andgc3Count(seq, nb_codons, max_indice):
        """ Compute the frequencies of gc3 and gc12 in a sequence
        Args :
            seq (str) : the sequence
            nb_codons (int) : the output of nbCodons()
            max_indice (int) : the output of maxIndice()
        Return : gc3 frequency (float), gc12 frequency (float)
        """

        # TO IMPROVE ? : make this computation in the codonsCountings() function to avoid parsing the sequence twice ?
        # But : would make the code less readable

        gc12 = 0
        gc3 = 0

        # Only lowercase "c" and "g" are counted
        for i in range(0, max_indice+1, 3):
            if seq[i] in ["c","g"]: gc12 += 1
            if seq[i+1] in ["c","g"]: gc12 += 1
            if seq[i+2] in ["c","g"]: gc3 += 1

        return float(gc3)/nb_codons, float(gc12)/(2*nb_codons)

    def ivywrelCount(nb_codons, dict_aa_counts):
        """ Compute the sum of occurrences of amino-acids IVYWREL divided by the number of codons

        Args :
            nb_codons (int) : the number of codons in the sequence
            dict_aa_counts (dict) : the output of aaCountings()

        Return : (float)
        """

        IVYWREL = 0

        for aa in ["I","V","Y","W","R","E","L"]: # Impossible to make a simple sum, in case one of the aa is not in the dict keys
            if aa in dict_aa_counts:
                IVYWREL += dict_aa_counts[aa]

        return float(IVYWREL)/nb_codons

    def ekqhCount(dict_aa_counts):
        """ Compute the ratio of amino-acids EK/QH
        Arg : dict_aa_counts (dict) : the output of aaCountings()
        Return : (float), or the EK count (int) when there is no Q or H
        """
        ek = dict_aa_counts["E"] + dict_aa_counts["K"]
        qh = dict_aa_counts["Q"] + dict_aa_counts["H"]

        if qh != 0:
            return float(ek)/qh
        # No Q or H in the sequence : the ratio is undefined, return the EK count instead
        else : return ek

    def payresdgmCount(dict_aa_counts):
        """ Compute the ratio of amino-acids PAYRE/SDGM
        Arg : dict_aa_counts (dict) : the output of aaCountings()
        Return : (float), or the PAYRE count (int) when there is no S, D, G or M
        """
        payre = 0
        sdgm = 0

        for aa in ["P","A","Y","R","E"]:
            payre += dict_aa_counts[aa]
        for aa in ["S","D","G","M"]:
            sdgm += dict_aa_counts[aa]

        if sdgm != 0:
            return float(payre)/sdgm
        # No S, D, G or M in the sequence : the ratio is undefined, return the PAYRE count instead
        else : return payre
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
315
    def purineLoad(seq, nb_codons):
        """ Compute the purine load index of a sequence

        Args :
            seq (str) : the sequence
            nb_codons (int) : the number of codons in the sequence

        Return (float)
        """

        # TO IMPROVE ? : do this computation in the codonCountigns() function to avoid parsing the sequence twice ?
        # But : this would make the code less readable

        A, T = seq.count("a"), seq.count("t")

        # g3 and c3 : g and c in 3rd position of a codon
        third = seq[2::3]
        g3, c3 = third.count("g"), third.count("c")

        # g12 and c12 : g and c in 1st and 2nd positions of a codon
        first, second = seq[0::3], seq[1::3]
        g12 = first.count("g") + second.count("g")
        c12 = first.count("c") + second.count("c")

        return float(1000*(g12+g3+A-c12-c3-T))/(3*nb_codons)

    def cvp(dict_aatypes):
        """ Compute the difference nb_charged_amino_acids - nb_polar_amino_acids
            Return: (int)
        """
        return dict_aatypes["charged"] - dict_aatypes["polar"]

    # ------ The function ------ #

    nb_codons = nbCodons(seq)
    max_indice = maxIndice(seq)
    GC3, GC12 = gc12Andgc3Count(seq, nb_codons, max_indice)
    IVYWREL, EKQH, PAYRESDGM = ivywrelCount(nb_codons, dict_aa_counts), ekqhCount(dict_aa_counts), payresdgmCount(dict_aa_counts)
    purineload, CvP = purineLoad(seq, nb_codons), cvp(dict_aa_types)
    return GC3, GC12, IVYWREL, EKQH, PAYRESDGM, purineload, CvP

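# Illustrative sketch, not part of the original script. The slices at offsets 0, 1 and 2
# together cover every position of the sequence, so g12 + g3 is simply the total number
# of "g" and c12 + c3 the total number of "c". Under that reading, the purine load is
# 1000 * (G + A - C - T) / (3 * nb_codons). The name purine_load_sketch is hypothetical,
# and a lowercase sequence is assumed, as in the count("a") / count("t") calls of purineLoad().

```python
def purine_load_sketch(seq, nb_codons):
    """Hypothetical helper: purine excess per 1000 bases of a lowercase sequence."""
    purines = seq.count("g") + seq.count("a")
    pyrimidines = seq.count("c") + seq.count("t")
    # Same normalisation as purineLoad(): 3 * nb_codons bases, scaled by 1000
    return 1000.0 * (purines - pyrimidines) / (3 * nb_codons)


# Toy sequence of 3 codons: 7 purines (5 a, 2 g) and 2 pyrimidines (1 c, 1 t)
print(purine_load_sketch("atggcaaaa", 3))
```

# Like purineLoad(), this sketch raises ZeroDivisionError when nb_codons is 0.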
# # # Functions for codons, aa, aatypes transitions -----------------------------------------------------------------------------

def computeAllBiases(seq1, seq2, dico_codons_transi, dico_aa_transi, dico_aatypes_transi, reversecode, reverseclassif):
    """ Compute all biases (transitions codon->codon, aa->aa, aa_type->aa_type) between two sequences

    Args : See sub-routines for details

    Returns 3 dictionaries of dictionaries. See sub-routines for details
    """

    # ------ Sub-routines ------ #

    def codons_transitions(seq1, seq2, dico_codons_transi):
        """ Compute the number of transitions from a codon of a sequence to the codon of a second sequence

        Args :
            seq1 (str) : the first sequence
            seq2 (str) : the second sequence
            dico_codons_transi (dict of dicts) : { codon1 : {codon1: 0, codon2 : 0, ...}, ..}

        Return :
            codons_transi (dict of dicts) : the occurrences of each codon to codon transition
        """

        codons_transi = copy.deepcopy(dico_codons_transi)

        for i in range(0, len(seq1), 3):
            # check that there is no indel and that seq[i:i+3] really has length 3 (check for the last codon)
            if viable([seq1, seq2], i, "any") and len(seq1[i:i+3]) == 3 and len(seq2[i:i+3]) == 3 :
                codons_transi[seq1[i:i+3]][seq2[i:i+3]] += 1

        return codons_transi

    def codons_transitions_freqs(codons_transitions_counts):
        """ Compute frequencies of codon transitions between two sequences

        Arg :
            codons_transitions_counts (dict) : the output of codons_transitions()

        Return :
            codons_transi_freqs (dict of dicts) : the frequencies of each codon to codon transition
        """

        codons_transi_freqs = {}

        for key in codons_transitions_counts.keys():
            codons_transi_freqs[key] = {}
            # total number of transitions starting from this codon (computed once per row)
            total = sum(codons_transitions_counts[key].values())
            for key2 in codons_transitions_counts[key].keys():
                if total != 0:
                    codons_transi_freqs[key][key2] = float(codons_transitions_counts[key][key2])/total
                else :
                    codons_transi_freqs[key][key2] = 0
        return codons_transi_freqs

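# Illustrative sketch, not part of the original script: counting codon -> codon transitions
# between two aligned sequences, then turning each row of counts into frequencies.
# Assumptions: viable() is not shown here, so a plain test for the gap character "-" stands
# in for it; a defaultdict replaces the pre-filled dico_codons_transi template; the names
# count_codon_transitions and row_frequencies are hypothetical.

```python
from collections import defaultdict


def count_codon_transitions(seq1, seq2, gap="-"):
    """Count transitions between codons found at the same position in seq1 and seq2."""
    counts = defaultdict(lambda: defaultdict(int))
    # Stop before any incomplete trailing codon in either sequence
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i+3], seq2[i:i+3]
        if gap in c1 or gap in c2:  # assumed stand-in for viable()
            continue
        counts[c1][c2] += 1
    return counts


def row_frequencies(counts):
    """Divide each count by its row total; rows with no transition keep 0."""
    freqs = {}
    for src, row in counts.items():
        total = sum(row.values())
        freqs[src] = {dst: (float(n) / total if total else 0) for dst, n in row.items()}
    return freqs


counts = count_codon_transitions("atgaaaaaa", "atgaagaaa")
freqs = row_frequencies(counts)
print(freqs["aaa"])  # "aaa" is followed once by "aag" and once by "aaa"
```

# Frequencies are relative to the source codon, so every non-empty row sums to 1.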
    def aa_transitions(dico_codons_transi, dico_aa_transi, reversecode):
        """ Compute the number of transitions from an amino-acid of a sequence to the amino-acid of a second sequence

        Args :
            dico_codons_transi (dict of dicts) : the codons transitions computed by codons_transitions()
            dico_aa_transi (dict of dicts) : { aa1 : {aa1: 0, aa2 : 0, ...}, ..}
            reversecode (dict) : the reversed genetic code {aa1 : [codons],...} -> {codon1: aa1, codon2: aa2, ...}

        Return :
            aa_transi (dict of dicts) : the occurrences of each aa to aa transition
        """

        aa_transi = copy.deepcopy(dico_aa_transi)

        for k in dico_codons_transi.keys():
            newk = reversecode[k]
            for k2 in dico_codons_transi[k].keys():
                newk2 = reversecode[k2]
                aa_transi[newk][newk2] += dico_codons_transi[k][k2]

        return aa_transi

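# Illustrative sketch, not part of the original script: aa_transitions() folds the codon-level
# table into an amino-acid-level table by mapping both keys through reversecode and summing.
# Assumptions: the three-codon toy_code below is a deliberately tiny excerpt of the genetic
# code, and collapse_counts is a hypothetical name; setdefault() replaces the pre-filled
# dico_aa_transi template.

```python
def collapse_counts(counts, mapping):
    """Sum a dict-of-dicts of counts after mapping row and column keys through `mapping`."""
    collapsed = {}
    for src, row in counts.items():
        for dst, n in row.items():
            new_src, new_dst = mapping[src], mapping[dst]
            collapsed.setdefault(new_src, {}).setdefault(new_dst, 0)
            collapsed[new_src][new_dst] += n
    return collapsed


toy_code = {"aaa": "K", "aag": "K", "atg": "M"}  # codon -> amino acid (toy subset)
codon_counts = {
    "aaa": {"aaa": 1, "aag": 2, "atg": 0},
    "aag": {"aaa": 0, "aag": 0, "atg": 1},
    "atg": {"aaa": 0, "aag": 0, "atg": 1},
}
aa_counts = collapse_counts(codon_counts, toy_code)
print(aa_counts["K"]["K"], aa_counts["K"]["M"])
```

# The same folding, applied with an amino-acid -> type mapping, corresponds to what
# aatypes_transitions() does with reverseclassif.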
    def aa_transitions_freqs(aa_transitions_counts):
        """ Compute frequencies of amino-acid transitions between two sequences
            Arg : aa_transitions_counts (dict of dicts): the output of aa_transitions()
            Return : aa_transi_freqs (dict of dicts) : the frequencies of each aa to aa transition
        """

        aa_transi_freqs = {}

        for key in aa_transitions_counts.keys():
            aa_transi_freqs[key] = {}
            # total number of transitions starting from this amino-acid (computed once per row)
            total = sum(aa_transitions_counts[key].values())
            for key2 in aa_transitions_counts[key].keys():
                if total != 0:
                    aa_transi_freqs[key][key2] = float(aa_transitions_counts[key][key2])/total
                else :
                    aa_transi_freqs[key][key2] = 0
        return aa_transi_freqs

0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
461 def aatypes_transitions(dico_aa_transi, dico_aatypes_transi, reverseclassif):
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
462 """ Compute the number of transitions from an amino-acid type of a sequence to the amino-acid type of a second sequence
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
463
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
464 Args :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
465 dico_aa_transi (dict of dicts) : the output of aa_transitions()
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
466 dico_aatypes_transi (dict of dicts) : { type1 : {type1: 0, type2 : 0, ...}, ..}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
467 reverseclassif (dict) : the reversed amino-acid clasification {aa1: type, aa2: type, ...}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
468
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
469 Return :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
470 aatypes_transi (dict of dicts) : the occurences of each aatype to aatype transition
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
471 """
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
472
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
473 aatypes_transi = copy.deepcopy(dico_aatypes_transi)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
474 for k in dico_aa_transi.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
475 newk = reverseclassif[k]
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
476 for k2 in dico_aa_transi[k].keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
477 newk2 = reverseclassif[k2]
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
478 aatypes_transi[newk][newk2] += dico_aa_transi[k][k2]
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
479
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
480 return aatypes_transi
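The aggregation done by aatypes_transitions() can be illustrated on a toy input. This is a minimal sketch, not the script's own code: the helper name `aggregate_by_type` and the three-residue classification are assumptions invented for the example.

```python
import copy


def aggregate_by_type(aa_counts, empty_types, aa_to_type):
    """Sum amino-acid transition counts into amino-acid-type transition counts."""
    out = copy.deepcopy(empty_types)  # leave the zero-filled template untouched
    for src, row in aa_counts.items():
        for dst, n in row.items():
            out[aa_to_type[src]][aa_to_type[dst]] += n
    return out


# Hypothetical toy classification (not the one used by the script)
aa_to_type = {"K": "charged", "R": "charged", "L": "aliphatic"}
types = ["charged", "aliphatic"]
empty_types = {a: {b: 0 for b in types} for a in types}
aa_counts = {
    "K": {"K": 5, "R": 2, "L": 1},
    "R": {"K": 1, "R": 4, "L": 0},
    "L": {"K": 0, "R": 3, "L": 7},
}
type_counts = aggregate_by_type(aa_counts, empty_types, aa_to_type)
```

The deep copy matters because the zero-filled template is reused for every pair of sequences; accumulating into it directly would carry counts over from one call to the next.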
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
481
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
482 def aatypes_transitions_freqs(aatypes_transitions_counts):
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
483 """ Computes frequencies of amino-acid type transitions between two sequences
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
484 Args : aatypes_transitions_counts (dict of dicts) : the output of aatypes_transitions()
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
485 Return : aatypes_transi_freqs (dict of dicts) : the frequencies of each aatype to aatype transition
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
486 """
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
487
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
488 aatypes_transi_freqs = {}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
489
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
490 for key in aatypes_transitions_counts.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
491 aatypes_transi_freqs[key] = {}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
492 for key2 in aatypes_transitions_counts[key].keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
493 if sum(aatypes_transitions_counts[key].values()) != 0:
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
494 freq = float(aatypes_transitions_counts[key][key2])/sum(aatypes_transitions_counts[key].values())
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
495 aatypes_transi_freqs[key][key2] = freq
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
496 else :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
497 aatypes_transi_freqs[key][key2] = 0
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
498 return aatypes_transi_freqs
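The row-wise normalisation used by aatypes_transitions_freqs() can be sketched as follows. This is an illustrative rewrite under the assumption that each frequency is the count divided by the total of its source row; the name `row_frequencies` and the sample counts are hypothetical.

```python
def row_frequencies(counts):
    """Divide each count by its row total; rows without observations stay at 0."""
    freqs = {}
    for src, row in counts.items():
        total = sum(row.values())  # computed once per row, not once per cell
        freqs[src] = {dst: (float(n) / total if total else 0)
                      for dst, n in row.items()}
    return freqs


# Hypothetical counts: the 'charged' row has no observed transition
counts = {
    "polar": {"polar": 3, "charged": 1},
    "charged": {"polar": 0, "charged": 0},
}
freqs = row_frequencies(counts)
```

Computing the row total once avoids re-summing the same row for every destination key, which the nested loop above does.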
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
499
9
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
500
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
501
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
502
5
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
503 # ------ The function ------ #
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
504
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
505 codons_transitions = codons_transitions(seq1, seq2, dico_codons_transi)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
506 codons_transitions_freqs = codons_transitions_freqs(codons_transitions)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
507 aa_transitions = aa_transitions(codons_transitions, dico_aa_transi, reversecode)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
508 aa_transitions_freqs = aa_transitions_freqs(aa_transitions)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
509 aatypes_transitions = aatypes_transitions(aa_transitions, dico_aatypes_transi, reverseclassif)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
510 aatypes_transitions_freqs = aatypes_transitions_freqs(aatypes_transitions)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
511
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
512 return codons_transitions, codons_transitions_freqs, aa_transitions, aa_transitions_freqs, aatypes_transitions, aatypes_transitions_freqs
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
513
9
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
514 def all_sed(codons_c, aa_c, aat_c, codons_transitions, aa_transitions, aatypes_transitions, dico_codons_transi, dico_aa_transi, dico_aatypes_transi):
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
515
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
516 def compute_sed(transi, counts, dico):
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
517 """ Compute the substitution exchangeability disequilibrium (SED) from species A to species B for each pair of codons/aa/aatypes
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
518
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
519 Args:
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
520 transi : dict - nested dictionary of the counts of transitions from codon/aa/aatype X to Y, from sp A to sp B
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
521 counts : dict - dictionaries of codons/aa/aatypes counts in species A
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
522 dico : dict - a nested dictionary with all values set to 0 (used as a template)
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
523
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
524 """
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
525 dict_sed = copy.deepcopy(dico)
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
526
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
527 for key in transi.keys():
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
528 for key2 in transi.keys():
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
529 if counts[key] != 0 and float(transi[key2][key])/counts[key] != 0.0:
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
530 x = (float(transi[key][key2])/counts[key]) / (float(transi[key2][key])/counts[key])
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
531 dict_sed[key][key2] = - pow(2,1-x)+1
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
532 else :
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
533 dict_sed[key][key2] = 'NA'
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
534
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
535 return dict_sed
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
536
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
537 codons_sed = compute_sed(codons_transitions, codons_c, dico_codons_transi)
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
538 aa_sed = compute_sed(aa_transitions, aa_c, dico_aa_transi)
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
539 aatypes_sed = compute_sed(aatypes_transitions, aat_c, dico_aatypes_transi)
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
540
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
541 return codons_sed, aa_sed, aatypes_sed
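The value stored by compute_sed() for one pair can be worked through by hand. The sketch below restates the formula read from the code above, SED = 1 - 2^(1 - x), where x is the ratio of the forward to the reverse transition frequency; the function name `sed_value` and the sample numbers are assumptions made for illustration.

```python
def sed_value(forward, reverse, count):
    """Return 1 - 2**(1 - x), x being the forward/reverse frequency ratio, or 'NA'."""
    if count == 0 or reverse == 0:
        return 'NA'  # ratio undefined, mirroring the 'NA' branch of compute_sed()
    x = (float(forward) / count) / (float(reverse) / count)
    return 1 - pow(2, 1 - x)


balanced = sed_value(3, 3, 10)      # x = 1: as many X->Y as Y->X transitions
forward_bias = sed_value(4, 2, 10)  # x = 2: twice as many X->Y as Y->X
no_forward = sed_value(0, 5, 10)    # x = 0: only Y->X transitions observed
undefined = sed_value(2, 0, 10)     # no Y->X transition at all
```

With this formula a balanced exchange gives 0, the lower bound is -1 (no forward transition), and the value approaches 1 as the forward transitions dominate.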
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
542
5
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
543 # # # Function for random resampling --------------------------------------------------------------------------------------------
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
544
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
545 def sampling (dict_seq, nb_iter, len_sample, list_codons, genetic_code, aa_classif, reversecode, reverseclassif):
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
546 """ Randomly resample codons from the sequences (a kind of bootstrap)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
547
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
548 Args :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
549 dict_seq (dict) : contains the species name and sequences used ( { 'sp1': seq1, 'sp2': seq2, ...}) without '-' removal
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
550 nb_iter (int) : number of resampling iterations (better >= 1000)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
551 len_sample (int) : length (in codons) of a resampled sequence (better >= 1000)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
552 list_codons (list of str): all codons except stop codons
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
553 genetic_code (dict) : the genetic code : {'aa1': [codon1, codon2,...], ...}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
554 aa_classif (dict) : the types of the amino-acids : {type: [aa1, aa2, ...], ...}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
555 reversecode (dict) : the reversed genetic code : {codon1: aa1, codon2: aa2, ...}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
556 reverseclassif (dict) : the reversed amino-acid classification : {aa1: type, aa2: type, ...}
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
557
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
558 Returns :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
559 codons_lst, aa_lst, classif_lst (dicts) : keys : codons/aa/aatypes, values : []
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
560 codons_transitions_lst, aa_transitions_lst, classif_transitions_lst (dict of dicts) : keys : codons/aa/aatypes, values : the 3 previous dicts
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
561 """
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
562
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
563 # Initialize empty dictionaries for counts and transitions. They could also be instantiated in main(), but that would give this function ~14 parameters
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
564 codons_0, aa_0, classif_0, codons_transitions_0, aa_transitions_0, classif_transitions_0 = buildDicts(list_codons, 0, genetic_code, aa_classif)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
565 codons_lst, aa_lst, classif_lst, codons_transitions_lst, aa_transitions_lst, classif_transitions_lst = buildDicts(list_codons, [], genetic_code, aa_classif)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
566
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
567 # determine the max position where sampling is possible
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
568 l = len(dict_seq.values()[1])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
569 if l%3 == 0: max_indice = l-3
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
570 if l%3 == 1: max_indice = l-4
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
571 if l%3 == 2: max_indice = l-5
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
572
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
573 # List of positions to resample
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
574 viable_positions = [pos for pos in range(0,max_indice,3) if viable(dict_seq.values(), pos, "all")]
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
575 sample_positions = np.random.choice(viable_positions, len_sample)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
576
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
577 # nb_iter resampled sequences
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
578 for i in range(nb_iter):
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
579 if (i+1) % max(1, nb_iter//10) == 0: # max() avoids a modulo by zero when nb_iter < 10
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
580 print " "+str( (i+1)*100/nb_iter)+"%"
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
581
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
582 seqa, seqb = "", ""
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
583 for pos in sample_positions:
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
584 codona, codonb = "---", "---"
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
585 # The sequence to be resampled in this position is randomly chosen ; no "-" resampled
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
586 while "-" in codona :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
587 codona = dict_seq.values()[random.randrange(len(dict_seq))][pos:pos+3] # randrange excludes its upper bound
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
588 while "-" in codonb :
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
589 codonb = dict_seq.values()[random.randrange(len(dict_seq))][pos:pos+3] # randrange excludes its upper bound
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
590 seqa += codona
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
591 seqb += codonb
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
592
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
593 # dictionaries : frequencies of codons, aa, aatypes (seqa)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
594 codons_occ_tmp, codons_freq_tmp, aa_occ_tmp, aa_freq_tmp, aatypes_occ_tmp, aatypes_freq_tmp = computeAllCountingsAndFreqs(seqa, list_codons, codons_0, aa_0, classif_0, genetic_code, aa_classif)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
595 # dictionaries : frequencies of transitions (seqa->seqb)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
596 codons_transitions_tmp, codons_transitions_freq_tmp, aa_transition_tmp, aa_transitions_freq_tmp, aatypes_transitions_tmp, aatypes_transitions_freq_tmp = computeAllBiases(seqa, seqb, codons_transitions_0, aa_transitions_0, classif_transitions_0, reversecode, reverseclassif)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
597
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
598 # Appending the resampled frequencies to the final dicts
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
599 for key in codons_freq_tmp.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
600 codons_lst[key].append(codons_freq_tmp[key])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
601 for key in aa_freq_tmp.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
602 aa_lst[key].append(aa_freq_tmp[key])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
603 for key in aatypes_freq_tmp.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
604 classif_lst[key].append(aatypes_freq_tmp[key])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
605
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
606 # Appending the resampled frequencies to the final dicts (transitions)
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
607 for key in codons_transitions_freq_tmp.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
608 for key2 in codons_transitions_freq_tmp[key].keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
609 codons_transitions_lst[key][key2].append(codons_transitions_freq_tmp[key][key2])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
610 for key in aa_transitions_freq_tmp.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
611 for key2 in aa_transitions_freq_tmp[key].keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
612 aa_transitions_lst[key][key2].append(aa_transitions_freq_tmp[key][key2])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
613 for key in aatypes_transitions_freq_tmp.keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
614 for key2 in aatypes_transitions_freq_tmp[key].keys():
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
615 classif_transitions_lst[key][key2].append(aatypes_transitions_freq_tmp[key][key2])
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
616
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
617 return codons_lst, aa_lst, classif_lst, codons_transitions_lst, aa_transitions_lst, classif_transitions_lst
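The resampling step of sampling() can be sketched in isolation. This is a simplified illustration, not the script's implementation: it assumes that a "viable" position is a codon column with no gap in any sequence, and the helper name `resample_pair`, the toy alignment and the fixed seed are all hypothetical.

```python
import random


def resample_pair(seqs, len_sample, rng):
    """Build two pseudo-sequences of len_sample codons from gap-free columns."""
    n_codons = min(len(s) for s in seqs) // 3
    # Assumed meaning of "viable": no '-' in this codon column for any sequence
    viable = [3 * i for i in range(n_codons)
              if all("-" not in s[3 * i:3 * i + 3] for s in seqs)]
    # Positions are drawn with replacement, as in a bootstrap
    positions = [rng.choice(viable) for _ in range(len_sample)]
    # At each position the donor sequence is chosen at random, independently
    seqa = "".join(rng.choice(seqs)[p:p + 3] for p in positions)
    seqb = "".join(rng.choice(seqs)[p:p + 3] for p in positions)
    return seqa, seqb


rng = random.Random(0)  # fixed seed so the example is repeatable
seqs = ["ATGAAA---CTG", "ATGAGACCCCTT", "ATGAAACCGCTG"]
seqa, seqb = resample_pair(seqs, 5, rng)
```

Because both pseudo-sequences draw their codons from the pooled species, the transitions counted between them approximate what would be expected if the species were interchangeable, which is the reference distribution the observed values are later compared against.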
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
618
def testPvalues(dict_counts, dict_resampling, nb_iter, method):
    """ Computes where the observed value is located in the expected counting distribution

        Args :
            dict_counts (dict) : observed frequencies obtained from the functions computeAllCountingsAndFreqs() or computeAllBiases()
            dict_resampling (dict) : expected frequencies obtained from the functions computeAllCountingsAndFreqs() or computeAllBiases() within the sampling() function
            nb_iter (int) : the number of resampling iterations
            method (str) : 'origin', 'pnorm' or 'p_resampling'

        Return :
            pvalue (dict, dict of dicts) : the pvalues of all observed countings (dict) and transitions (dict of dicts)

        The 'pnorm' method computes the probability of obtaining a value lower than the observed value under a normal distribution.
        One-sided, left tail :
            p < 0.05 indicates significantly lower counts
            p > 0.95 indicates significantly higher counts
    """

    def p_resampling(obs, values, nb_iter):
        """ The pvalue is the proportion of bootstrapped values smaller than the observed value
            If p = 0.025 : 2.5% of the bootstrapped values are smaller than the observed value
            p < 0.025 : the obs value is most likely significantly lower.
            If p = 0.975 : 97.5% of the bootstrapped values are smaller than the observed value
            p > 0.975 : the obs value is most likely significantly higher.

            Args :
                obs : int or float - the observed value
                values : list - values of resampling (int or floats)
                nb_iter : int - the number of resampled values (=len(values))

            Return :
                pvalue (float)
        """

        num = len([x for x in values if x < obs])
        return float(num + 1) / (nb_iter+1)

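# Illustration (not part of the original script): a minimal sketch of the left-tail
# empirical p-value described in p_resampling(). The helper name and the toy values
# are hypothetical; only the (num + 1) / (nb_iter + 1) formula comes from the code above.

```python
def empirical_left_tail_p(obs, values):
    """Hypothetical helper: share of resampled values strictly below obs,
    with the same +1 correction as p_resampling()."""
    below = sum(1 for x in values if x < obs)
    return float(below + 1) / (len(values) + 1)


# Toy resampled frequencies (made up for the example).
resampled = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30, 0.31]

p_low = empirical_left_tail_p(0.05, resampled)   # no value below obs
p_mid = empirical_left_tail_p(0.19, resampled)   # 4 values below obs
p_high = empirical_left_tail_p(0.40, resampled)  # all 9 values below obs
```

# With the +1 correction the p-value can never be exactly 0, which is why the
# smallest reachable value depends on the number of iterations.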
    def testPvalue(obs, exp, nb_iter):
        """ Compute a pvalue

        Args :
            exp (list of floats) : a list of length nb_iter, containing expected frequencies of a codon/aa/aatype at each iteration
                                   (the binary search below assumes it is sorted in ascending order)
            obs (float) : the observed value
            nb_iter (int) : the number of iterations for resampling

        Returns :
            pvalue (float)
        """

        max_val = nb_iter-1
        min_val = 0
        test_val = (max_val+min_val)//2  # integer division : test_val is used as a list index

        # binary search of the position of obs within exp
        while max_val-min_val > 1:
            if obs > exp[test_val]:
                min_val = test_val
                test_val = (max_val+min_val)//2
            elif obs < exp[test_val]:
                max_val = test_val
                test_val = (max_val+min_val)//2
            else:
                break

        pvalue = float(test_val+1)/(nb_iter+1)

        return pvalue

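# Illustration (not part of the original script): the hand-written binary search in
# testPvalue() locates obs inside a sorted list. A comparable rank can be sketched with
# the standard bisect module. The helper name and data are hypothetical, and the result
# is not guaranteed to match testPvalue() index for index.

```python
import bisect


def rank_p(obs, sorted_values):
    """Hypothetical helper: left-tail p-value from the rank of obs
    in a list sorted in ascending order."""
    rank = bisect.bisect_left(sorted_values, obs)  # number of values strictly < obs
    return float(rank + 1) / (len(sorted_values) + 1)


# bisect only works on sorted input, so sort the toy values first.
expected = sorted([0.30, 0.10, 0.25, 0.15, 0.20])

p = rank_p(0.22, expected)        # 3 values are below 0.22
p_lowest = rank_p(0.0, expected)  # no value is below 0.0
```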
    # ------ The function ------ #

    pvalues = {}

    for key in dict_resampling.keys():
        # flat dict (countings) : one list of resampled values per key
        if not isinstance(dict_resampling[key], dict):
            if method == 'origin':
                pvalues[key] = testPvalue(dict_counts[key], dict_resampling[key], nb_iter)
            elif method == 'pnorm':
                pvalues[key] = scipy.stats.norm.cdf(dict_counts[key], np.mean(dict_resampling[key]), np.std(dict_resampling[key]))
            elif method == 'p_resampling':
                pvalues[key] = p_resampling(dict_counts[key], dict_resampling[key], nb_iter)
        # dict of dicts (transitions) : one list of resampled values per (key, key2)
        else :
            pvalues[key] = {}
            for key2 in dict_resampling[key].keys():
                if method == 'origin':
                    pvalues[key][key2] = testPvalue(dict_counts[key][key2], dict_resampling[key][key2], nb_iter)
                elif method == 'pnorm':
                    pvalues[key][key2] = scipy.stats.norm.cdf(dict_counts[key][key2], np.mean(dict_resampling[key][key2]), np.std(dict_resampling[key][key2]))
                elif method == 'p_resampling':
                    pvalues[key][key2] = p_resampling(dict_counts[key][key2], dict_resampling[key][key2], nb_iter)

    return pvalues

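# Illustration (not part of the original script): what the 'pnorm' method computes,
# sketched with the standard library only. The script itself calls scipy.stats.norm.cdf
# with np.mean and np.std; the helper below and the toy values are assumptions for the example.

```python
import math


def normal_cdf(x, mean, sd):
    """Hypothetical helper: P(X < x) for X ~ Normal(mean, sd)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# Toy resampled countings (made up).
values = [8.0, 9.0, 10.0, 11.0, 12.0]
mean = sum(values) / len(values)
# Population standard deviation (divide by n), which is what np.std returns by default.
sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

p_at_mean = normal_cdf(mean, mean, sd)  # an observation equal to the mean
p_low = normal_cdf(5.0, mean, sd)       # far in the left tail
p_high = normal_cdf(15.0, mean, sd)     # far in the right tail
```

# Read as a one-sided left-tail test : a small value flags a lower-than-expected
# counting, a value close to 1 a higher-than-expected counting.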
def main():

    # arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("sequences_file", help="File containing sequences (the output of the tool 'ConcatPhyl')")
    parser.add_argument("considered_species", help="The species names, separated by commas (must be the same as in the sequences_file). It is possible to consider only a subset of species.")
    parser.add_argument("species_for_bootstrap", help="The species which will be used for bootstrapping, separated by commas. It is possible to consider only a subset of species.")
    parser.add_argument("iteration", help="The number of iterations for bootstrapping (better if >= 1000)", type=int)
    parser.add_argument("sample_length", help="The length of a bootstrapped sequence (better if >= 1000)", type=int)
    args = parser.parse_args()

    print "\n ------ Occurrences and frequencies of codons, amino-acids, amino-acids types -------\n"

    print "The script counts the number of codons, amino acids, and types of amino acids in sequences,"
    print "as well as the mutation bias from one item to another between 2 sequences.\n"

    print "Countings are then tested against an empirical distribution obtained from bootstrapped sequences drawn from a subset of sequences."
    print "In the output files, the pvalues indicate the position of the observed data in a distribution of empirical countings obtained from"
    print "a resample of the data. Values above 0.95 indicate a significantly higher counting, values under 0.05 a significantly lower counting."

    print " Sequences file : {}".format(args.sequences_file)
    print " Species retained for countings : {}\n".format(args.considered_species)

    print "Processing : reading input file, opening output files, building dictionaries."

    # make pairs
    list_species = str.split(args.considered_species, ",")
    list_species_boot = str.split(args.species_for_bootstrap, ",")
    pairs_list=list(itertools.combinations(list_species,2))

    # read sequences
    sequences_for_counts = {}
    sequences_for_resampling = {}
    with open(args.sequences_file, "r") as file:
        # the file is read two lines at a time : a header line, then its sequence line
        for line1,line2 in itertools.izip_longest(*[file]*2):
            species = line1.strip(">\r\n")
            sequence = line2.strip("\r\n")
            if species in list_species:
                sequences_for_counts[species] = sequence
            if species in list_species_boot:
                sequences_for_resampling[species] = sequence

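    # Illustration (not part of the original script): the header/sequence pairing used
    # above, sketched on an in-memory file. The species names and sequences are made up.
    # Unlike izip_longest, zip silently drops a trailing header that has no sequence line.

```python
import io

# Two-line FASTA-like records: one header line, one sequence line.
fasta = io.StringIO(u">sp1\natgttt---\n>sp2\natgttc\n")

records = {}
lines = iter(fasta)
# Pulling twice from the same iterator yields (header, sequence) pairs.
for header, seq in zip(lines, lines):
    records[header.strip(">\r\n")] = seq.strip("\r\n")

# Percentage of indel characters ("-") in one sequence.
indel_percent_sp1 = float(records["sp1"].count("-")) / len(records["sp1"]) * 100
```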
    print " Warning : countings might be biased and differ strongly between species because the proportion of indels varies widely among sequences."
    print " Frequencies are more representative."

    print "\n Indels percent :"

    for k,v in sequences_for_counts.items():
        print " {} : {} %".format(k, float(v.count("-"))/len(v)*100)

0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
760 # useful dictionaries
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
761 dict_genetic_code={"F":["ttt","ttc"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
762 "L":["tta","ttg","ctt","ctc","cta","ctg"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
763 "I":["att","atc","ata"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
764 "M":["atg"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
765 "V":["gtt","gtc","gta","gtg"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
766 "S":["tct","tcc","tca","tcg","agt","agc"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
767 "P":["cct","cca","ccg","ccc"],
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
768 "T":["act","acc","aca","acg"],
                   "A":["gct","gcc","gca","gcg"],
                   "Y":["tat","tac"],
                   "H":["cat","cac"],
                   "Q":["caa","cag"],
                   "N":["aat","aac"],
                   "K":["aaa","aag"],
                   "D":["gat","gac"],
                   "E":["gaa","gag"],
                   "C":["tgt","tgc"],
                   "W":["tgg"],
                   "R":["cgt","cgc","cga","cgg","aga","agg"],
                   "G":["ggt","ggc","gga","ggg"]}

dict_aa_classif={"unpolar":["G","A","V","L","M","I"],
                 "polar":["S","T","C","P","N","Q"],
                 "charged":["K","R","H","D","E"],
                 "aromatics":["F","Y","W"]}

# Reverse lookups: codon -> amino acid, and amino acid -> class
reversecode={v:k for k in dict_genetic_code for v in dict_genetic_code[k]}
reverseclassif={v:k for k in dict_aa_classif for v in dict_aa_classif[k]}
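
# Illustration only: a minimal sketch of how the two reverse lookups chain
# together (codon -> amino acid -> class). The toy_* tables and the
# classify_codon helper are hypothetical names, not part of the script; the
# tables are two-entry subsets of the dictionaries above.

```python
# Hypothetical two-entry subsets of the tables above, for illustration only.
toy_genetic_code = {"Y": ["tat", "tac"], "K": ["aaa", "aag"]}
toy_aa_classif = {"aromatics": ["Y"], "charged": ["K"]}

# Same inversion pattern as reversecode / reverseclassif:
# every value in each list becomes a key pointing back to its original key.
toy_reversecode = {v: k for k in toy_genetic_code for v in toy_genetic_code[k]}
toy_reverseclassif = {v: k for k in toy_aa_classif for v in toy_aa_classif[k]}


def classify_codon(codon):
    """Return (amino acid, class) for a codon via the two reverse lookups."""
    aa = toy_reversecode[codon]
    return aa, toy_reverseclassif[aa]
```

# A call such as classify_codon("tac") walks both lookups in turn; a codon
# missing from the table raises KeyError, as it would with reversecode.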

# codons list (without stop codons)
nucleotides = ['a', 'c', 'g', 't']
list_codons = [''.join(comb) for comb in itertools.product(nucleotides, repeat=3)]
list_codons.remove("taa")
list_codons.remove("tag")
list_codons.remove("tga")
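
# Illustration only: a minimal sketch of the same construction done in one
# pass, with a check that 64 - 3 = 61 sense codons remain. The names
# toy_nucleotides, stop_codons and sense_codons are hypothetical.

```python
import itertools

toy_nucleotides = ['a', 'c', 'g', 't']
stop_codons = {"taa", "tag", "tga"}

# Enumerate all 4**3 = 64 triplets and filter out the three stop codons.
all_triplets = [''.join(comb) for comb in itertools.product(toy_nucleotides, repeat=3)]
sense_codons = [codon for codon in all_triplets if codon not in stop_codons]
```

# itertools.product yields the triplets in lexicographic order of the input
# list, so the result starts at "aaa" and ends at "ttt".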

# Store already computed species + row.names in output files
index = []
index_transi = []

# Final dictionaries written to CSV files
all_codons = {}
all_aa = {}
all_aatypes = {}
all_various = {}
all_codons_transitions = {} # Not used because too large: 61*61 columns
all_aa_transitions = {}
all_aatypes_transitions = {}

# RUN

print "\nProcessing : resampling ..."
print " Parameters : {niter} iterations, {lensample} codon per resampled sequence, species used : {species}\n".format(niter=args.iteration, lensample=args.sample_length, species=args.species_for_bootstrap)

codons_boot, aa_boot, aatypes_boot, codons_transi_boot, aa_transi_boot, aatypes_transi_boot = sampling(sequences_for_resampling, args.iteration, args.sample_length, list_codons, dict_genetic_code, dict_aa_classif, reversecode, reverseclassif)
print " Done.\n"

print "Processing : countings....\n"

# Initialize empty dictionaries for counts and transitions
init_dict_codons, init_dict_aa, init_dict_classif, dico_codons_transitions, dico_aa_transitions, dico_aatypes_transitions = buildDicts(list_codons, 0, dict_genetic_code, dict_aa_classif)

for pair in pairs_list:
    p1, p2 = pair[0], pair[1]
    if p1 not in index:
        print "Countings on {}".format(p1)

        p1_codons_counts, p1_codons_freqs, p1_aa_counts, p1_aa_freqs, p1_aatypes_counts, p1_aatypes_freqs = computeAllCountingsAndFreqs(sequences_for_counts[p1], list_codons, init_dict_codons, init_dict_aa, init_dict_classif, dict_genetic_code, dict_aa_classif)
        p1_GC3, p1_GC12, p1_IVYWREL, p1_EKQH, p1_PAYRESDGM, p1_purineload, p1_CvP = computeVarious(sequences_for_counts[p1], p1_aa_counts, p1_aatypes_freqs)

        p1_codons_pvalues = testPvalues(p1_codons_freqs, codons_boot, args.iteration, 'p_resampling')
        p1_aa_pvalues = testPvalues(p1_aa_freqs, aa_boot, args.iteration, 'p_resampling')
        p1_aatypes_pvalues = testPvalues(p1_aatypes_freqs, aatypes_boot, args.iteration, 'p_resampling')

        all_codons[p1+"_obs_counts"] = p1_codons_counts
        all_codons[p1+"_obs_freqs"] = p1_codons_freqs
        all_codons[p1+"_pvalues"] = p1_codons_pvalues
        all_aa[p1+"_obs_counts"] = p1_aa_counts
        all_aa[p1+"_obs_freqs"] = p1_aa_freqs
        all_aa[p1+"_pvalues"] = p1_aa_pvalues
        all_aatypes[p1+"_obs_counts"] = p1_aatypes_counts
        all_aatypes[p1+"_obs_freqs"] = p1_aatypes_freqs
        all_aatypes[p1+"_pvalues"] = p1_aatypes_pvalues
        all_various[p1] = [p1_GC3, p1_GC12, p1_IVYWREL, p1_EKQH, p1_PAYRESDGM, p1_purineload, p1_CvP]

        index.append(p1)

    if p2 not in index:
        print "Countings on {}".format(p2)

        p2_codons_counts, p2_codons_freqs, p2_aa_counts, p2_aa_freqs, p2_aatypes_counts, p2_aatypes_freqs = computeAllCountingsAndFreqs(sequences_for_counts[p2], list_codons, init_dict_codons, init_dict_aa, init_dict_classif, dict_genetic_code, dict_aa_classif)
        p2_GC3, p2_GC12, p2_IVYWREL, p2_EKQH, p2_PAYRESDGM, p2_purineload, p2_CvP = computeVarious(sequences_for_counts[p2], p2_aa_counts, p2_aatypes_freqs)

        p2_codons_pvalues = testPvalues(p2_codons_freqs, codons_boot, args.iteration, 'p_resampling')
        p2_aa_pvalues = testPvalues(p2_aa_freqs, aa_boot, args.iteration, 'p_resampling')
        p2_aatypes_pvalues = testPvalues(p2_aatypes_freqs, aatypes_boot, args.iteration, 'p_resampling')

        all_codons[p2+"_obs_counts"] = p2_codons_counts
        all_codons[p2+"_obs_freqs"] = p2_codons_freqs
        all_codons[p2+"_pvalues"] = p2_codons_pvalues
        all_aa[p2+"_obs_counts"] = p2_aa_counts
        all_aa[p2+"_obs_freqs"] = p2_aa_freqs
        all_aa[p2+"_pvalues"] = p2_aa_pvalues
        all_aatypes[p2+"_obs_counts"] = p2_aatypes_counts
        all_aatypes[p2+"_obs_freqs"] = p2_aatypes_freqs
        all_aatypes[p2+"_pvalues"] = p2_aatypes_pvalues
        # Stored as a list, like the p1 entry, so both species have the same type
        all_various[p2] = [p2_GC3, p2_GC12, p2_IVYWREL, p2_EKQH, p2_PAYRESDGM, p2_purineload, p2_CvP]

        index.append(p2)

    if (p1, p2) not in index_transi and p1 in sequences_for_counts and p2 in sequences_for_counts:
        print "Countings transitions between {} and {}".format(p1, p2)
        codons_transitions, codons_transitions_freqs, aa_transitions, aa_transitions_freqs, aatypes_transitions, aatypes_transitions_freqs = computeAllBiases(sequences_for_counts[p1], sequences_for_counts[p2], dico_codons_transitions, dico_aa_transitions, dico_aatypes_transitions, reversecode, reverseclassif)

        # Added: sed values for codons, amino acids and amino-acid types.
        # The counts are read back from the stored results for p1: the p1_*
        # variables are only refreshed the first time p1 is seen, so they may
        # still hold the values of a species from an earlier pair.
        codons_sed, aa_sed, aatypes_sed = all_sed(all_codons[p1+"_obs_counts"], all_aa[p1+"_obs_counts"], all_aatypes[p1+"_obs_counts"], codons_transitions, aa_transitions, aatypes_transitions, dico_codons_transitions, dico_aa_transitions, dico_aatypes_transitions)

        p1p2_codons_pvalues = testPvalues(codons_transitions_freqs, codons_transi_boot, args.iteration, 'p_resampling')
        p1p2_aa_pvalues = testPvalues(aa_transitions_freqs, aa_transi_boot, args.iteration, 'p_resampling')
        p1p2_aatypes_pvalues = testPvalues(aatypes_transitions_freqs, aatypes_transi_boot, args.iteration, 'p_resampling')

        all_codons_transitions[p1+">"+p2+"_obs_counts"] = codons_transitions
        all_codons_transitions[p1+">"+p2+"_obs_freqs"] = codons_transitions_freqs
        all_codons_transitions[p1+">"+p2+"_pvalues"] = p1p2_codons_pvalues
        all_aa_transitions[p1+">"+p2+"_obs_counts"] = aa_transitions
        all_aa_transitions[p1+">"+p2+"_obs_freqs"] = aa_transitions_freqs
        all_aa_transitions[p1+">"+p2+"_pvalues"] = p1p2_aa_pvalues
        all_aatypes_transitions[p1+">"+p2+"_obs_counts"] = aatypes_transitions
        all_aatypes_transitions[p1+">"+p2+"_obs_freqs"] = aatypes_transitions_freqs
        all_aatypes_transitions[p1+">"+p2+"_pvalues"] = p1p2_aatypes_pvalues

        all_codons_transitions[p1+">"+p2+"_sed"] = codons_sed
        all_aa_transitions[p1+">"+p2+"_sed"] = aa_sed
        all_aatypes_transitions[p1+">"+p2+"_sed"] = aatypes_sed

        # Mark the pair as processed (recorded once, after all results are stored)
        index_transi.append((p1, p2))

print "\n Done.\n"

print "Processing : creating dataframes ..."

frame_codons = pd.DataFrame(all_codons).T.astype('object')
frame_aa = pd.DataFrame(all_aa).T.astype('object')
frame_aatypes = pd.DataFrame(all_aatypes).T.astype('object')

frame_codons_transitions = pd.concat({k: pd.DataFrame(v) for k, v in all_codons_transitions.items()}).unstack()
frame_codons_transitions.columns = frame_codons_transitions.columns.map('>'.join)

frame_aa_transitions = pd.concat({k: pd.DataFrame(v) for k, v in all_aa_transitions.items()}).unstack()
frame_aa_transitions.columns = frame_aa_transitions.columns.map('>'.join)

frame_aatypes_transitions = pd.concat({k: pd.DataFrame(v) for k, v in all_aatypes_transitions.items()}).unstack()
frame_aatypes_transitions.columns = frame_aatypes_transitions.columns.map('>'.join)

frame_various = pd.DataFrame(all_various).T
frame_various.columns = ["GC3","GC12","IVYWREL","EKQH","PAYRESDGM","purineload", "CvP"]

frame_codons.index.name, frame_aa.index.name, frame_aatypes.index.name = "Species", "Species","Species"
frame_aa_transitions.index.name, frame_aatypes_transitions.index.name, frame_various.index.name = "Species","Species","Species"
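
# Illustration only: a minimal sketch of what the concat / unstack / join
# idiom above produces. It ASSUMES each transition table is a dict of dicts
# (outer key -> inner key -> value); the real structure returned by
# computeAllBiases is not shown here. toy_transitions and toy_frame are
# hypothetical names.

```python
import pandas as pd

# Hypothetical toy input: one species pair, two amino acids.
toy_transitions = {
    "sp1>sp2_obs_counts": {"A": {"A": 5, "G": 1},
                           "G": {"A": 2, "G": 7}},
}

# concat on a dict prepends each key as an outer row-index level; unstack()
# then pivots the inner row level into the columns, leaving one row per key
# and a two-level column index (outer key, inner key).
toy_frame = pd.concat({k: pd.DataFrame(v) for k, v in toy_transitions.items()}).unstack()

# Flatten the two-level column index into "outer>inner" labels.
toy_frame.columns = toy_frame.columns.map('>'.join)
```

# Under this assumption toy_frame has a single row labelled
# "sp1>sp2_obs_counts" and four columns: "A>A", "A>G", "G>A", "G>G".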
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
923
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
924 print "Writing dataframes to output files ...\n"
0ba551449008 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit 273a9af69b672b2580cd5dec4c0e67a4a96fb0fe
abims-sbr
parents:
diff changeset
925
9
04a9ada73cc4 planemo upload for repository htpps://github.com/abims-sbr/adaptearch commit f1ba8d136e0129f3e8435b25a95f70f697d51464-dirty
abims-sbr
parents: 5
diff changeset
926 frame_codons.round(8).to_csv("codons_freqs.csv", sep=",", encoding="utf-8")
    frame_aa.round(8).to_csv("aa_freqs.csv", sep=",", encoding="utf-8")
    frame_aatypes.astype('object').round(8).to_csv("aatypes_freqs.csv", sep=",", encoding="utf-8")
    frame_codons_transitions.round(8).to_csv("codons_transitions_freqs.csv", sep=",", encoding="utf-8")
    frame_aa_transitions.round(8).to_csv("aa_transitions_freqs.csv", sep=",", encoding="utf-8")
    frame_aatypes_transitions.round(8).to_csv("aatypes_transitions_freqs.csv", sep=",", encoding="utf-8")
    frame_various.round(8).to_csv("gc_and_others_freqs.csv", sep=",", encoding="utf-8")

    print("Done.")

if __name__ == "__main__":
    main()