annotate variant_effect_predictor/Bio/DB/NCBIHelper.pm @ 2:a5976b2dce6f

changing default values for ensembl database
author mahtabm
date Thu, 11 Apr 2013 17:15:42 +1000
parents 1f6dce3d34e0
# $Id: NCBIHelper.pm,v 1.24.2.2 2003/06/12 09:29:38 heikki Exp $
#
# BioPerl module for Bio::DB::NCBIHelper
#
# Cared for by Jason Stajich
#
# Copyright Jason Stajich
#
# You may distribute this module under the same terms as perl itself
#
# POD documentation - main docs before the code
#
# Interfaces with new WebDBSeqI interface

=head1 NAME

Bio::DB::NCBIHelper - A collection of routines useful for queries to
NCBI databases.

=head1 SYNOPSIS

 # Do not use this module directly.

 # get a Bio::DB::NCBIHelper object somehow
 my $seqio = $db->get_Stream_by_acc(['MUSIGHBA1']);
 while( my $seq = $seqio->next_seq ) {
     # process seq
 }

=head1 DESCRIPTION

Provides a single place to set up some common methods for querying NCBI
web databases.  This module just centralizes the methods for
constructing a URL for querying NCBI GenBank and NCBI GenPept and the
common HTML stripping done in L</postprocess_data>.

The base NCBI query URL used is
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi.

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the
evolution of this and other Bioperl modules. Send
your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation
is much appreciated.

  bioperl-l@bioperl.org              - General discussion
  http://bioperl.org/MailList.shtml  - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to
help us keep track of the bugs and their resolution.
Bug reports can be submitted via email or the
web:

  bioperl-bugs@bio.perl.org
  http://bugzilla.bioperl.org/

=head1 AUTHOR - Jason Stajich

Email jason@bioperl.org

=head1 APPENDIX

The rest of the documentation details each of the
object methods. Internal methods are usually
preceded with a _

=cut

# Let the code begin...

package Bio::DB::NCBIHelper;
use strict;
use vars qw(@ISA $HOSTBASE %CGILOCATION %FORMATMAP
            $DEFAULTFORMAT $MAX_ENTRIES $VERSION);

use Bio::DB::WebDBSeqI;
use Bio::DB::Query::GenBank;
use HTTP::Request::Common;
use URI;
use Bio::Root::IO;
use Bio::DB::RefSeq;
use Bio::Root::Root;

@ISA = qw(Bio::DB::WebDBSeqI Bio::Root::Root);
$VERSION = '0.8';

BEGIN {
    $MAX_ENTRIES = 19000;
    $HOSTBASE = 'http://eutils.ncbi.nlm.nih.gov';
    %CGILOCATION = (
        'batch'  => ['post' => '/entrez/eutils/efetch.fcgi'],
        'query'  => ['get'  => '/entrez/eutils/efetch.fcgi'],
        'single' => ['get'  => '/entrez/eutils/efetch.fcgi'],
        'version'=> ['get'  => '/entrez/eutils/efetch.fcgi'],
        'gi'     => ['get'  => '/entrez/eutils/efetch.fcgi'],
    );

    %FORMATMAP = ( 'gb'    => 'genbank',
                   'gp'    => 'genbank',
                   'fasta' => 'fasta',
                 );

    $DEFAULTFORMAT = 'gb';
}

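=pod

The BEGIN block above only declares data: a host name and, for each
retrieval mode, an HTTP method paired with a CGI path. The module itself
combines them through the URI module in get_request(). As a rough,
standalone sketch of the same idea using core Perl only (the
C<build_url> helper and the C<db> value are illustrative assumptions,
not part of this module, and values are not URL-escaped here):

 use strict;
 use warnings;

 my $host = 'http://eutils.ncbi.nlm.nih.gov';
 my %cgi_location = (
     'batch'  => [ 'post' => '/entrez/eutils/efetch.fcgi' ],
     'single' => [ 'get'  => '/entrez/eutils/efetch.fcgi' ],
 );

 # Hypothetical helper: returns (HTTP method, URL) for a retrieval mode.
 # Parameters are passed as an ordered key,value list.
 sub build_url {
     my ($mode, @params) = @_;
     my $entry = $cgi_location{$mode}
         or die "unknown retrieval mode '$mode'\n";
     my ($method, $path) = @$entry;
     my @pairs;
     while ( my ($key, $value) = splice(@params, 0, 2) ) {
         push @pairs, "$key=$value";
     }
     return ( $method, $host . $path . '?' . join('&', @pairs) );
 }

 my ($method, $url) = build_url('single',
                                db      => 'nucleotide',
                                id      => 'MUSIGHBA1',
                                rettype => 'gb');
 print "$method $url\n";

Keeping the method and path in one table means a new retrieval mode is
a one-line change rather than a new code path.

=cut
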
# the new way to make modules a little more lightweight

sub new {
    my ($class, @args ) = @_;
    my $self = $class->SUPER::new(@args);
    return $self;
}


=head2 get_params

 Title   : get_params
 Usage   : my %params = $self->get_params($mode)
 Function: Returns key,value pairs to be passed to NCBI database
           for either 'batch' or 'single' sequence retrieval method
 Returns : a key,value pair hash
 Args    : 'single' or 'batch' mode for retrieval

=cut

sub get_params {
    my ($self, $mode) = @_;
    $self->throw("subclass did not implement get_params");
}

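=pod

Because get_params() in this class only throws, every concrete database
class has to override it. A minimal, standalone sketch of that pattern
follows. The package names C<My::HelperBase> and C<My::NucleotideDB>
and the parameter values are hypothetical, chosen only for illustration:

 use strict;
 use warnings;

 package My::HelperBase;
 sub new { my ($class) = @_; return bless {}, $class; }
 # Mirrors the base behaviour: refuse to run unless overridden.
 sub get_params { die "subclass did not implement get_params\n"; }

 package My::NucleotideDB;
 our @ISA = ('My::HelperBase');
 my %PARAMSTRING = (
     'batch'  => { 'db' => 'nucleotide', 'usehistory' => 'n' },
     'single' => { 'db' => 'nucleotide', 'usehistory' => 'n' },
 );
 # Returns a flat key,value list; empty for an unknown mode.
 sub get_params {
     my ($self, $mode) = @_;
     return defined $PARAMSTRING{$mode} ? %{ $PARAMSTRING{$mode} } : ();
 }

 package main;
 my %params = My::NucleotideDB->new->get_params('single');
 print join(',', map { "$_=$params{$_}" } sort keys %params), "\n";
 my $ok = eval { My::HelperBase->new->get_params('single'); 1 };
 print "base class died: $@" unless $ok;

Returning an empty list for an unknown mode is what lets a caller such
as get_request() detect an invalid mode by testing the resulting hash.

=cut
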
=head2 default_format

 Title   : default_format
 Usage   : my $format = $self->default_format
 Function: Returns default sequence format for this module
 Returns : string
 Args    : none

=cut

sub default_format {
    return $DEFAULTFORMAT;
}

=head2 get_request

 Title   : get_request
 Usage   : my $request = $self->get_request(%qualifiers)
 Function: Builds the HTTP request for the given qualifiers
 Returns : HTTP::Request
 Args    : %qualifiers = a hash of qualifiers (ids, format, etc)

=cut

sub get_request {
    my ($self, @qualifiers) = @_;
    my ($mode, $uids, $format, $query) = $self->_rearrange([qw(MODE UIDS
                                                               FORMAT QUERY)],
                                                           @qualifiers);

    # only lowercase a mode that was actually supplied
    $mode = lc $mode if defined $mode;
    ($format) = $self->request_format() unless ( defined $format);
    if( !defined $mode || $mode eq '' ) { $mode = 'single'; }
    my %params = $self->get_params($mode);
    if( ! %params ) {
        $self->throw("must specify a valid retrieval mode 'single' or 'batch' not '$mode'");
    }
    my $url = URI->new($HOSTBASE . $CGILOCATION{$mode}[1]);

    unless( defined $uids or defined $query) {
        $self->throw("Must specify a query or list of uids to fetch");
    }

    if ($uids) {
        if( ref($uids) =~ /array/i ) {
            $uids = join(",", @$uids);
        }
        $params{'id'} = $uids;
    }

    elsif ($query && $query->can('cookie')) {
        @params{'WebEnv','query_key'} = $query->cookie;
        $params{'db'} = $query->db;
    }

    elsif ($query) {
        $params{'id'} = join ',',$query->ids;
    }

    $params{'rettype'} = $format;
    if ($CGILOCATION{$mode}[0] eq 'post') {
        return POST $url,[%params];
    } else {
        $url->query_form(%params);
        $self->debug("url is $url \n");
        return GET $url;
    }
}

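=pod

get_request() accepts either a single identifier or a reference to an
array of identifiers and stores them as one comma-separated C<id>
parameter. The following standalone sketch isolates that step; the
C<uids_to_param> helper is a hypothetical name introduced here and does
not exist in this module:

 use strict;
 use warnings;

 # Hypothetical helper: accept a single id or an array reference of ids
 # and return the comma-separated string used for the 'id' parameter.
 sub uids_to_param {
     my ($uids) = @_;
     return ref($uids) eq 'ARRAY' ? join(',', @$uids) : $uids;
 }

 my %params = ( 'rettype' => 'gb' );
 $params{'id'} = uids_to_param([ 'J00522', 'AF303112' ]);
 print "$params{'id'}\n";
 print uids_to_param('MUSIGHBA1'), "\n";

=cut
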
=head2 get_Stream_by_batch

  Title   : get_Stream_by_batch
  Usage   : $seq = $db->get_Stream_by_batch($ref);
  Function: Retrieves Seq objects from Entrez 'en masse', rather than one
            at a time.  For large numbers of sequences, this is far superior
            to get_Stream_by_[id/acc]().
  Example :
  Returns : a Bio::SeqIO stream object
  Args    : $ref : either an array reference, a filename, or a filehandle
            from which to get the list of unique ids/accession numbers.

NOTE: deprecated API.  Use get_Stream_by_id() instead.

=cut

*get_Stream_by_batch = sub {
    my $self = shift;
    $self->deprecated('get_Stream_by_batch() is deprecated; use get_Stream_by_id() instead');
    $self->get_Stream_by_id(@_)
};

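=pod

The typeglob assignment above installs the old method name as a thin
wrapper that warns and then forwards to the replacement. A standalone
sketch of the same technique, using a hypothetical C<My::Fetcher> class
and plain C<warn> in place of the BioPerl deprecated() method:

 use strict;
 use warnings;

 package My::Fetcher;
 sub new { return bless {}, shift }
 sub get_Stream_by_id {
     my ($self, $ids) = @_;
     return "stream for @$ids";
 }

 # Install the old name as a wrapper that warns, then forwards.
 {
     no warnings 'once';
     *get_Stream_by_batch = sub {
         my $self = shift;
         warn "get_Stream_by_batch() is deprecated; "
            . "use get_Stream_by_id() instead\n";
         return $self->get_Stream_by_id(@_);
     };
 }

 package main;
 my @warnings;
 local $SIG{__WARN__} = sub { push @warnings, @_ };
 my $result = My::Fetcher->new->get_Stream_by_batch([ 'J00522', 'AF303112' ]);
 print "$result\n";
 print scalar(@warnings), " warning(s) issued\n";

Existing callers keep working while the warning points them at the new
name.

=cut
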
=head2 get_Stream_by_query

  Title   : get_Stream_by_query
  Usage   : $seq = $db->get_Stream_by_query($query);
  Function: Retrieves Seq objects from Entrez 'en masse', rather than one
            at a time.  For large numbers of sequences, this is far superior
            to get_Stream_by_[id/acc]().
  Example :
  Returns : a Bio::SeqIO stream object
  Args    : $query :   An Entrez query string or a
            Bio::DB::Query::GenBank object.  It is suggested that you
            create a Bio::DB::Query::GenBank object and get the entry
            count before you fetch a potentially large stream.

=cut

sub get_Stream_by_query {
    my ($self, $query) = @_;
    unless (ref $query && $query->can('query')) {
        $query = Bio::DB::Query::GenBank->new($query);
    }
    return $self->get_seq_stream('-query' => $query, '-mode'=>'query');
}

=head2 postprocess_data

 Title   : postprocess_data
 Usage   : $self->postprocess_data ( 'type' => 'string',
                                     'location' => \$datastr);
 Function: process downloaded data before loading into a Bio::SeqIO
 Returns : void
 Args    : hash with two keys - 'type' can be 'string' or 'file'
                              - 'location' either file location or string
                                           reference containing data

=cut

# the default method, works for genbank/genpept, other classes should
# override it with their own method.

sub postprocess_data {
    my ($self, %args) = @_;
    my $data;
    my $type = uc $args{'type'};
    my $location = $args{'location'};
    if( !defined $type || $type eq '' || !defined $location) {
        return;
    } elsif( $type eq 'STRING' ) {
        $data = $$location;
    } elsif ( $type eq 'FILE' ) {
        open(TMP, $location) or $self->throw("could not open file $location");
        my @in = <TMP>;
        close TMP;
        $data = join("", @in);
    }

    # transform links to appropriate descriptions
    if ($data =~ /\nCONTIG\s+/) {
        $self->warn("CONTIG found. GenBank get_Stream_by_acc about to run.");
        my(@batch,@accession,%accessions,@location,$id,
           $contig,$stream,$aCount,$cCount,$gCount,$tCount);

        # process GenBank CONTIG join(...) into two arrays
        $data =~ /(?:CONTIG\s+join\()((?:.+\n)+)(?:\/\/)/;
        $contig = $1;
        $contig =~ s/\n|\)//g;
        foreach (split /\s*,\s*/,$contig){
            if (/>(.+)<.+>:(.+)/) {
                ($id) = split /\./, $1;
                push @accession, $id;
                push @location, $2;
                $accessions{$id}->{'count'}++;
            } elsif( /([\w\.]+):(.+)/ ) {
                ($id) = split /\./, $1;
                $accessions{$id}->{'count'}++;
                push @accession, $id;
                push @location, $2;
            }
        }

        # grab multiple sequences by batch and join based on location variable
        my @unique_accessions = keys %accessions;
        $stream = $self->get_Stream_by_acc(\@unique_accessions);
        $contig = "";
        my $ct = 0;
        while( my $seq = $stream->next_seq() ) {
            if( $seq->accession_number !~ /$unique_accessions[$ct]/ ) {
                printf STDERR "warning, %s does not match %s\n",
                $seq->accession_number, $unique_accessions[$ct];
            }
            $accessions{$unique_accessions[$ct]}->{'seq'} = $seq;
            $ct++;
        }
        for (my $i = 0; $i < @accession; $i++) {
            my $seq = $accessions{$accession[$i]}->{'seq'};
            unless( defined $seq ) {
                # seq not cached, give up
                $self->warn("unable to find sequence $accession[$i]\n");
                return undef;
            }
            my($start,$end) = split(/\.\./, $location[$i]);
            # subseq() takes 1-based, inclusive start and end coordinates
            $contig .= $seq->subseq($start,$end);
        }

        # count number of each letter in sequence
        $aCount = () = $contig =~ /a/ig;
        $cCount = () = $contig =~ /c/ig;
        $gCount = () = $contig =~ /g/ig;
        $tCount = () = $contig =~ /t/ig;

        # remove everything after and including CONTIG
        $data =~ s/(CONTIG[\s\S]+)$//i;

        # build ORIGIN part of data file using sequence and counts
        $data .= "BASE COUNT     $aCount a   $cCount c   $gCount g   $tCount t\n";
        $data .= "ORIGIN      \n";
        $data .= "$contig\n//";
    }
    else {
        $data =~ s/<a\s+href\s*=.+>\s*(\S+)\s*<\s*\/a\s*\>/$1/ig;
    }

    # fix gt and lt
    $data =~ s/&gt;/>/ig;
    $data =~ s/&lt;/</ig;
    if( $type eq 'FILE' ) {
        open(TMP, ">$location") or $self->throw("couldn't overwrite file $location");
        print TMP $data;
        close TMP;
    } elsif ( $type eq 'STRING' ) {
        ${$args{'location'}} = $data;
    }
    $self->debug("format is ". join(',',$self->request_format()).
                 " data is\n$data\n");
}


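=pod

The CONTIG branch of postprocess_data() first splits a GenBank
C<CONTIG join(...)> line into parallel lists of accessions and
C<start..end> ranges, and later counts bases with the
C<my $n = () = $str =~ /x/g> idiom. The standalone sketch below shows
both steps on a made-up record fragment; the accession numbers, the
ranges and the test sequence are invented for illustration, and the
HTML-link variant of the pattern is left out:

 use strict;
 use warnings;

 # Made-up record fragment with a CONTIG line wrapped over two lines.
 my $data = "LOCUS       NT_000001\n"
          . "CONTIG      join(AE000001.1:1..10,AE000002.2:5..8,\n"
          . "            AE000001.1:21..30)\n"
          . "//\n";

 my (@accession, @location, %count);
 if ( $data =~ /CONTIG\s+join\(((?:.+\n)+)\/\// ) {
     my $contig = $1;
     $contig =~ s/\n|\)//g;             # drop newlines and closing parens
     foreach my $part ( split /\s*,\s*/, $contig ) {
         my ($versioned, $range) = $part =~ /([\w\.]+):(.+)/ or next;
         my ($id) = split /\./, $versioned;   # strip the version suffix
         push @accession, $id;
         push @location,  $range;
         $count{$id}++;
     }
 }
 print "$accession[$_] $location[$_]\n" for 0 .. $#accession;

 # Counting matches: assigning to the empty list forces list context,
 # and the scalar assignment then yields the number of matches.
 my $sequence = 'acgtACGTaa';
 my $a_count  = () = $sequence =~ /a/ig;
 print "a count: $a_count\n";

An accession that appears twice in the join is recorded twice in the
ordered lists but only once in the hash, which is why the method can
fetch each sequence a single time and reuse it.

=cut
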
=head2 request_format

 Title   : request_format
 Usage   : my ($req_format, $ioformat) = $self->request_format;
           $self->request_format("genbank");
           $self->request_format("fasta");
 Function: Get/Set sequence format retrieval. The get-form will normally not
           be used outside of this and derived modules.
 Returns : Array of two strings, the first representing the format for
           retrieval, and the second specifying the corresponding SeqIO format.
 Args    : $format = sequence format

=cut

379 sub request_format {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
380 my ($self, $value) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
381 if( defined $value ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
382 $value = lc $value;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
383 if( defined $FORMATMAP{$value} ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
384 $self->{'_format'} = [ $value, $FORMATMAP{$value}];
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
385 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
386 # Try to fall back to a default. Alternatively, we could throw
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
387 # an exception
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
388 $self->{'_format'} = [ $value, $value ];
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
389 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
390 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
391 return @{$self->{'_format'}};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
392 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
393
=head2 Bio::DB::WebDBSeqI methods

Overrides a WebDBSeqI method to help newcomers retrieve sequences.

=head2 get_Stream_by_acc

 Title   : get_Stream_by_acc
 Usage   : $stream = $db->get_Stream_by_acc([$acc1, $acc2]);
 Function: Gets a series of Seq objects by accession numbers
 Returns : a Bio::SeqIO stream object
 Args    : $ref : a reference to an array of accession numbers for
                  the desired sequence entries
 Note    : For GenBank, this just calls the same code as get_Stream_by_id()

=cut

sub get_Stream_by_acc {
    my ($self, $ids ) = @_;
    my $newdb = $self->_check_id($ids);
    if (defined $newdb && ref($newdb) && $newdb->isa('Bio::DB::RefSeq')) {
        return $newdb->get_seq_stream('-uids' => $ids, '-mode' => 'single');
    } else {
        return $self->get_seq_stream('-uids' => $ids, '-mode' => 'single');
    }
}

=head2 _check_id

 Title   : _check_id
 Usage   : my $newdb = $self->_check_id($ids);
 Function: Checks whether the requested id looks like an NT_ contig or a
           RefSeq entry before the request is sent.
 Returns : A Bio::DB::RefSeq object for RefSeq ids; throws for NT_ contigs
 Args    : $id(s), $string

=cut

sub _check_id {
    my ($self, $ids) = @_;

    # Note: when $ids is an array reference (as passed by get_Stream_by_acc),
    # the pattern matches below run against its stringified form, not
    # against the individual ids.

    # NT contigs can not be retrieved
    $self->throw("NT_ contigs are whole chromosome files which are not part of regular ".
                 "database distributions. Go to ftp://ftp.ncbi.nih.gov/genomes/.")
        if $ids =~ /NT_/;

    # Asking for a RefSeq from EMBL/GenBank

    if ($ids =~ /N._/) {
        $self->warn("[$ids] is not a normal sequence database but a RefSeq entry.".
                    " Redirecting the request.\n")
            if $self->verbose >= 0;
        return Bio::DB::RefSeq->new;
    }
}

=head2 delay_policy

 Title   : delay_policy
 Usage   : $secs = $self->delay_policy
 Function: return number of seconds to delay between calls to remote db
 Returns : number of seconds to delay
 Args    : none

 NOTE: NCBI requests a delay of 3s between requests. This method
 implements that policy.

=cut

sub delay_policy {
    my $self = shift;
    return 3;
}

1;
__END__