diff variant_effect_predictor/Bio/Tools/WWW.pm @ 0:2bc9b66ada89 draft default tip
Uploaded
author | mahtabm |
---|---|
date | Thu, 11 Apr 2013 06:29:17 -0400 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/variant_effect_predictor/Bio/Tools/WWW.pm Thu Apr 11 06:29:17 2013 -0400
@@ -0,0 +1,1023 @@
+#-----------------------------------------------------------------------------
+# PACKAGE : Bio::Tools::WWW
+# PURPOSE : To encapsulate commonly used URLs for key websites in bioinformatics.
+# AUTHOR  : Steve Chervitz
+# CREATED : 27 Aug 1996
+# REVISION: $Id: WWW.pm,v 1.12 2002/10/22 07:38:46 lapp Exp $
+#
+# For documentation, run this module through pod2html
+# (preferably from Perl v5.004 or better).
+#
+# MODIFIED:
+#  0.014, sac --- Mon Aug 31 19:41:44 1998
+#     * Updated and added a few URLs.
+#     * Added method strip_html().
+#     * Documentation changes.
+#
+#-----------------------------------------------------------------------------
+
+package Bio::Tools::WWW;
+use strict;
+use Bio::Root::Root;
+use Exporter ();
+use vars qw(@ISA @EXPORT_OK %EXPORT_TAGS $ID $VERSION $BioWWW $Revision
+            $AUTHORITY);
+$AUTHORITY = 'nobody@localhost';
+@ISA = qw( Bio::Root::Root Exporter);
+@EXPORT_OK = qw($BioWWW);
+%EXPORT_TAGS = ( obj => [qw($BioWWW)],
+                 std => [qw($BioWWW)]);
+
+$ID = 'Bio::Tools::WWW';
+$VERSION = 0.014;
+$Revision = '$Id: WWW.pm,v 1.12 2002/10/22 07:38:46 lapp Exp $'; #'
+
+## Static object.
+$BioWWW = {};
+bless $BioWWW, $ID;
+$BioWWW->{'_name'} = "Static $ID object";
+
+
+## POD Documentation:
+
+=head1 NAME
+
+Bio::Tools::WWW - Bioperl manager for web resources related to biology.
+
+=head1 SYNOPSIS
+
+=head2 Object Creation
+
+    use Bio::Tools::WWW qw(:obj);
+
+    $pdb = $BioWWW->home_url('pdb');
+
+There is no need to create a new Bio::Tools::WWW.pm object when the
+C<:obj> tag is used. This tag will import the static $BioWWW object
+created by Bio::Tools::WWW.pm into your name space. This saves you
+from having to call C<new Bio::Tools::WWW>.
+
+You are free to not use the :obj tag and create the object as you
+like, but a Bio::Tools::WWW object is not configurable; any given
+script only needs a single copy.
+
+=head1 INSTALLATION
+
+This module is included with the central Bioperl distribution:
+
+   http://bio.perl.org/Core/Latest
+   ftp://bio.perl.org/pub/DIST
+
+You also need to define URLs for the following variables in this package:
+
+  $Not_found_url : Generic page to show in place of a 404 error.
+  $Tmp_url       : Web-accessible site that is used for scripts that
+                   need to generate temporary, web-accessible files.
+                   The files need not necessarily be HTML files, but
+                   being on the same disk as the server will permit
+                   faster IO from server scripts.
+
+=head1 DESCRIPTION
+
+Bio::Tools::WWW is primarily a URL broker for a select set
+of sites related to bioinformatics/genome analysis. It
+definitely represents a biased, non-exhaustive set.
+It might be more accurate to call this module
+"Bio::Tools::URL.pm". But this module does handle some non-URL
+things and it may do more of this in the future. Having one
+module to cover all biologically relevant web utilities
+makes it more convenient, especially at this early stage
+of development.
+
+Maintaining accurate URLs over time can be challenging as
+new web sites spring up and old sites are re-organized. Because
+of this fact, the URLs in this module are not guaranteed to be
+correct or exhaustive and will require periodic updating.
+
+=head2 URL Management
+
+By keeping URL management within Bio::Tools::WWW.pm, other generic
+modules can easily access a variety of different web sites without
+having to know about a potential multitude of specific modules
+specialized for one database or another. An alternative approach would
+be to have addresses defined within modules specialized for different
+web sites. This, however, may create maintenance headaches when updating
+these addresses.
+
+=head2 Complex Websites
+
+Websites with complex datasets may require special treatment
+within this module. As an example,
+URLs for the Saccharomyces Genome Database are clustered
+separately in this module, due to (1) the different ways to
+access information at this database and (2) the familiarity
+of the developer with this database. The Bio::SGD::WWW.pm module inherits from
+Bio::Tools::WWW.pm to permit access to the URLs provided by Bio::Tools::WWW.pm
+and to SGD-specific HTML and images.
+
+The organization of Bio::Tools::WWW.pm is expected to evolve as
+websites get born, die, and mutate their APIs.
+
+=head1 SEE ALSO
+
+  http://bio.perl.org/Projects/modules.html  - Online module documentation
+  http://bio.perl.org/                       - Bioperl Project Homepage
+
+=head1 FEEDBACK
+
+=head2 Mailing Lists
+
+User feedback is an integral part of the evolution of this and other Bioperl modules.
+Send your comments and suggestions preferably to one of the Bioperl mailing lists.
+Your participation is much appreciated.
+
+    bioperl-l@bioperl.org                  - General discussion
+    http://www.bioperl.org/MailList.shtml  - About the mailing lists
+
+=head2 Reporting Bugs
+
+Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and
+their resolution. Bug reports can be submitted via email or the web:
+
+    bioperl-bugs@bio.perl.org
+    http://bugzilla.bioperl.org/
+
+=head1 AUTHOR
+
+Steve Chervitz, sac@bioperl.org
+
+=head1 VERSION
+
+Bio::Tools::WWW.pm, 0.014
+
+=head1 COPYRIGHT
+
+Copyright (c) 1996-98 Steve Chervitz. All Rights Reserved.
+This module is free software; you can redistribute it and/or
+modify it under the same terms as Perl itself.
+
+
+=cut
+
+
+#
+##
+###
+#### END of main POD documentation.
+###
+##
+#
+
+
+############################ DATA ##################################
+
+### Database homepage links.
+my %Home_url = + ( + 'bioperl' =>'http://bio.perl.org/', + 'bioperl-stanford'=>'http://genome-www.stanford.edu/perlOOP/bioperl/', + 'bioperl-schema' =>'http://bio.perl.org/Projects/Schema/', + 'biomoo' =>'http://bioinformatics.weizmann.ac.il/BioMOO/', + 'blast_ncbi' =>'http://www.ncbi.nlm.nih.gov/BLAST/', + 'blast_wu' =>'http://blast.wustl.edu/', + 'bsm' =>'http://www.biochem.ucl.ac.uk/bsm/', + 'clustal' =>'http://www.csc.fi/molbio/progs/clustalw/clustalw.html', + 'ebi' =>'http://www.ebi.ac.uk/', + 'emotif' =>'http://motif.Stanford.EDU/emotif', + 'entrez' =>'http://www3.ncbi.nlm.nih.gov/Entrez/', + 'expasy' =>'http://www.expasy.ch/', + 'gdb' =>'http://www.gdb.org/', # R.I.P. (Jan 1998); site still functional + 'mips' =>'http://speedy.mips.biochem.mpg.de/', + 'mmdb' =>'http://www.ncbi.nlm.nih.gov/Structure/', + 'modbase' =>'http://guitar.rockefeller.edu/', + 'ncbi' =>'http://www.ncbi.nlm.nih.gov/', + 'pedant' =>'http://pedant.mips.biochem.mpg.de', + 'phylip' =>'http://evolution.genetics.washington.edu/phylip.html', + 'pir' =>'http://www-nbrf.georgetown.edu/pir/', + 'pfam' =>'http://pfam.wustl.edu/', + 'pfam_uk' =>'http://www.sanger.ac.uk/Software/Pfam/', + 'pfam_us' =>'http://pfam.wustl.edu/', + 'pdb' =>'http://www.pdb.bnl.gov/', + 'presage' =>'http://presage.stanford.edu/', + 'geneQuiz' =>'http://www.sander.ebi.ac.uk/genequiz/genomes/sc/', + 'molMov' =>'http://bioinfo.mbb.yale.edu/MolMovDB/', +# 'protMot' =>'http://bioinfo.mbb.yale.edu/ProtMotDB/', # old, use molMov instead + 'pubmed' =>'http://www.ncbi.nlm.nih.gov/PubMed/', + 'sacch3d' =>'http://genome-www.stanford.edu/Sacch3D/', + 'sgd' =>'http://genome-www.stanford.edu/Saccharomyces/', +# 'scop' =>'http://www.pdb.bnl.gov/scop/', + 'scop' =>'http://scop.stanford.edu/scop/', + 'swissProt' =>'http://www.expasy.ch/sprot/sprot-top.html', + 'webmol' =>'http://genome-www.stanford.edu/structure/webmol/', + 'ypd' =>'http://quest7.proteome.com/YPDhome.html', + ); + +### Database access CGI stems. 
(For some DBs the home URL can be used as the CGI stem) +my %Stem_url = + ( + 'emotif' =>'http://dna.Stanford.EDU/cgi-bin/emotif/', + 'entrez' =>'http://www3.ncbi.nlm.nih.gov/htbin-post/Entrez/query?', + 'pdb' =>'http://www.pdb.bnl.gov/pdb-bin/', + 'pfam_uk' =>'http://www.sanger.ac.uk/cgi-bin/Pfam/', + 'pfam_us' =>'http://pfam.wustl.edu/cgi-bin/', + 'pir' =>'http://www-nbrf.georgetown.edu/cgi-bin/nbrfget?', + ); + + +### Database access stems/links. +my %Search_url = + ( #'3db' =>'http://pdb.pdb.bnl.gov/cgi-bin/pdbids?3DB_ID=', # Former stem + '3db' =>$Stem_url{'pdb'}.'opdbshort?oPDBid=', # New stem (aug 1997) + 'embl' =>$Home_url{'ebi'}.'htbin/emblfetch?', + 'expasy' =>$Home_url{'expasy'}.'cgi-bin/', # program name and query string must be supplied. + 'cath' =>$Home_url{'bsm'}.'cath/CATHSrch.pl?type=PDB&query=', + 'cog_seq' =>$Home_url{'ncbi'}.'cgi-bin/COG/nph-cognitor?seq=', # add sequence + # To cog_orf, append ORF name ('YAL005c'). Case-sensitive! YAL005C won't work! + 'cog_orf' =>$Home_url{'ncbi'}.'cgi-bin/COG/cogeseq?', + 'ec1' =>$Home_url{'gdb'}.'bin/bio/wais_q-bio?object_class_key=30&jhu_id=', + 'ec2' =>$Home_url{'bsm'}.'enzymes/', + 'ec3' =>$Home_url{'expasy'}.'cgi-bin/get-enzyme-entry?', + 'emotif_id' =>$Stem_url{'emotif'}.'nph-identify?sequence=', + 'entrez' =>$Stem_url{'entrez'}."db=p_r?db=1&choseninfo=ORF_NAME%20[Gene%20Name]\@1\@1&form=4&field=Gene%20Name&mode=0&retrievestring=ORF_NAME%20[Gene%20Name]", + 'gb_n' =>$Stem_url{'entrez'}."db=n&form=6&dopt=g&uid=", + 'gb_p' =>$Stem_url{'entrez'}."db=p&form=6&dopt=g&uid=", + 'gb_struct' =>$Stem_url{'entrez'}."db=t&form=6&dopt=s&uid=", + 'pdb' =>$Stem_url{'pdb'}.'send-text?filename=', + 'medline' =>$Stem_url{'entrez'}.'form=6&db=m&Dopt=r&uid=', + 'mmdb' =>$Stem_url{'entrez'}.'db=t&form=6&Dopt=s&uid=', + 'modbase_orf' =>$Home_url{'modbase'}.'gm-cgi-bin/orf_page.cgi?pg1=0.5&pg2=1.0&orf=', + # To the modbase_model, append yeast ORF name &pdb=<4-LETTER_CODE>&chain=<UPCASE LETTER, IF ANY> + 'modbase_model' 
=>$Home_url{'modbase'}.'gm-cgi-bin/model_page.cgi?pg1=0.5&pg2=1.0&orf=', + 'molMov' =>$Home_url{'molMov'}.'search.cgi?pdb=', + 'pdb' =>$Stem_url{'pdb'}.'opdbshort?oPDBid=', # same as 3db + 'pdb_coord' =>$Stem_url{'pdb'}.'send-pdb?filename=', # retrieves full coordinate file + 'pfam' =>$Home_url{'pfam'}.'cgi-bin/nph-hmm_search?evalue=1.0&protseq=', # default: seq search, US + 'pfam_sp_uk' =>$Stem_url{'pfam_uk'}.'swisspfamget.pl?name=', + 'pfam_seq_uk' =>$Stem_url{'pfam_uk'}.'nph-search.cgi?evalue=1.0&type=normal&protseq=', + 'pfam_sp_us' =>$Stem_url{'pfam_us'}.'getswisspfam?key=', + 'pfam_seq_us' =>$Stem_url{'pfam_us'}.'nph-hmm_search?evalue=1.0&protseq=', + 'pfam_form' =>$Home_url{'pfam'}.'cgi-bin/hmm_page.cgi', # interactive search form + 'pir_id' =>$Stem_url{'pir'}.'fmt=c&xref=0&id=', + 'pir_acc' =>$Stem_url{'pir'}.'fmt=c&xref=1&id=', + 'pir_uid' =>$Stem_url{'pir'}.'uid=', + 'pdbSum' =>$Home_url{'bsm'}.'cath/GetPDBSUMCODE.pl?code=', +# 'protMot' =>$Home_url{'protMot'}.'search.cgi?pdb=', # old, use molMov instead + 'presage_sp' =>$Home_url{'presage'}.'search.cgi?spac=', + 'swpr' =>$Home_url{'expasy'}.'cgi-bin/get-sprot-entry?', + 'swModel' =>$Home_url{'expasy'}.'cgi-bin/sprot-swmodel-sub?', + 'swprSearch' =>$Home_url{'expasy'}.'cgi-bin/sprot-search-ful?', + + ### SCOP tlev options can be appended to the stem after adding a PDB ID. + ### tlev options are: 'dm'(domain), 'sf'(superfamily), 'fa'(family), 'cf'(common fold), 'cl'(class) + ### E.g., search.cgi?pdb=1ARD;tlev=dm + + 'scop' =>$Home_url{'scop'}.'search.cgi?pdb=', ### better to use scop_pdb. + 'scop_pdb' =>$Home_url{'scop'}.'search.cgi?pdb=', + 'scop_data' =>$Home_url{'scop'}.'data/scop.', ### Deprecated: frequent changes. + + ## Search URLs for SGD/Sacch3D are contained %SGD_url and %S3d_url (below). 
+ + # For wormpep, the query string MUST end with "&keyword=" (after appending a sequence ID) + 'wormpep' =>'http://www.sanger.ac.uk/cgi-bin/wormpep_fetch.pl?entry=', + 'wormace' =>'http://webace.sanger.ac.uk/cgi-bin/webace?db=wormace&class=Sequence&text=yes&object=', + + ### YPD: You must use a valid gene name or ORF name (IFF there is no gene name). + ### For this reason it is most convenient to use SGD's Protein_Info link + ### which can accept either and will provide a proper link to YPD. + 'ypd' =>'http://quest7.proteome.com/YPD/', + ); + + + +### CGI stems for SGD and Sacch3D. +my %SGD_stem_url = + ('stanford' =>'http://genome-www.stanford.edu/', + 'sgd' =>'http://genome-www.stanford.edu/cgi-bin/SGD/', + 'sgd2' =>'http://genome-www2.stanford.edu/cgi-bin/SGD/', + 's3d' =>'http://genome-www.stanford.edu/cgi-bin/SGD/Sacch3D/', + 's3d2' =>'http://genome-www2.stanford.edu/cgi-bin/SGD/Sacch3D/', + 's3d3' =>'http://genome-www3.stanford.edu/cgi-bin/SGD/Sacch3D/', + 'sacchdb' =>'http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?', + ); + +### SGD stems and links. 
+my %SGD_url = + ('home' =>$Home_url{'sgd'}, + 'help' =>$Home_url{'sgd'}.'help/', + 'mammal' =>$Home_url{'sgd'}.'mammal/', + 'worm' =>$Home_url{'sgd'}.'worm/', + 'gene' =>$SGD_stem_url{'sacchdb'}.'find+Locus+', + 'locus' =>$SGD_stem_url{'sacchdb'}.'find+Locus+', + 'orf' =>$SGD_stem_url{'sacchdb'}.'find+Locus+', + 'mipsorf' =>$SGD_stem_url{'sgd'}."mips-orfs?", + 'gene_info' =>$SGD_stem_url{'sacchdb'}.'find+Gene_Info+', + 'prot_info' =>$SGD_stem_url{'sacchdb'}.'find+Protein_Info+', + 'seq' =>$SGD_stem_url{'sgd'}.'seqDisplay?seq=', + 'gi' =>$SGD_stem_url{'sacchdb'}.'find+Sequence+Database+=+GenPept+AND+NEXT+=+', + 'chr' =>$SGD_stem_url{'sgd2'}.'seqTools?chr=', + 'chr_old' =>$SGD_stem_url{'sgd'}.'dnaredir?chr=', + 'seq_an' =>$SGD_stem_url{'sgd2'}.'seqTools?seqname=', + 'seq_an_old' =>$SGD_stem_url{'sgd'}.'dnaredir?seqname=', + 'map_chr' =>$SGD_stem_url{'sgd'}.'ORFMAP/ORFmap?chr=', + 'map_orf' =>$SGD_stem_url{'sgd'}.'ORFMAP/ORFmap?seq=', +# 'chr' =>$SGD_stem_url{'sgd2'}.'seqform?chr=', +# 'seg' =>$SGD_stem_url{'sgd2'}.'seqform?seg=', +# 'fea' =>$SGD_stem_url{'sgd2'}.'featureform?seg=', + 'feature' =>$SGD_stem_url{'sgd2'}.'featureform?chr=', # complete with "5&beg=100&end=400" + 'search' =>$SGD_stem_url{'sgd'}.'search?', + 'images' =>$SGD_stem_url{'stanford'}.'images/', + 'suggest' =>$SGD_stem_url{'stanford'}.'forms/sgd-suggestion.html', + 'tmp' =>$SGD_stem_url{'stanford'}.'tmp/', + ); + + +### Sacch3D stems and links. 
+my %S3d_url = + ('home' =>$Home_url{'sacch3d'}, + 'search' =>$Home_url{'sacch3d'}.'search.html', + 'help' =>$Home_url{'sacch3d'}.'help/', + 'new' =>$Home_url{'sacch3d'}.'new/', + 'chrm' =>$Home_url{'sacch3d'}.'data/chr', + 'domains' =>$Home_url{'sacch3d'}.'domains/', + 'genequiz' =>$Home_url{'sacch3d'}.'genequiz/', + 'analysis' =>$Home_url{'sacch3d'}.'analysis/', + 'scop' =>$SGD_stem_url{'s3d3'}.'getscop?data=', + 'scop_fold' =>$SGD_stem_url{'s3d3'}.'getscop?type=fold&data=', + 'scop_class' =>$SGD_stem_url{'s3d3'}.'getscop?type=class&data=', + 'scop_gene' =>$SGD_stem_url{'s3d3'}.'getscop?type=gene&data=', + 'gene' =>$SGD_stem_url{'s3d'}.'get?class=gene&item=', + 'orf' =>$SGD_stem_url{'s3d'}.'get?class=orf&item=', + 'text' =>$SGD_stem_url{'s3d'}.'get?class=text&item=', + 'pdb' =>$SGD_stem_url{'s3d'}.'get?class=pdb&item=', + 'pdb_coord' =>$SGD_stem_url{'s3d'}.'pdbcoord.pl?id=', + 'dsc' =>$SGD_stem_url{'s3d'}.'dsc.pl?gene=', + 'emotif' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=emotif&gene=', + 'pfam' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=pfam&gene=', + 'pfam_uk' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=pfam&loc=uk&gene=', + 'pfam_us' =>$SGD_stem_url{'s3d'}.'seq_search.pl?db=pfam&loc=us&gene=', + 'blast_pdb' =>$SGD_stem_url{'s3d'}.'getblast?db=pdb&name=', + 'blast_nr' =>$SGD_stem_url{'s3d'}.'getblast?db=nr&name=', + 'blast_est' =>$SGD_stem_url{'s3d'}.'getblast?db=est&name=', + 'blast_mammal' =>$SGD_stem_url{'s3d'}.'getblast?db=mammal&name=', + 'blast_human' =>$SGD_stem_url{'s3d'}.'getblast?db=human&name=', + 'blast_worm' =>$SGD_stem_url{'s3d'}.'getblast?db=worm&name=', + 'blast_yeast' =>$SGD_stem_url{'s3d'}.'getblast?db=yeast&name=', + 'blast_worm_yeast'=>$SGD_stem_url{'s3d'}.'getblast?db=worm&query=worm&name=', + 'patmatch' =>$SGD_stem_url{'s3d2'}.'grepmatch?', ## deprecated + 'grepmatch' =>$SGD_stem_url{'s3d2'}.'grepmatch?', + 'pdb_neighbors' =>$SGD_stem_url{'s3d'}.'pdb_neighbors?id=CHAIN&gene=ORF_NAME', + ); + + +### 3D viewer stems. 
+my %Viewer_url = +# ('java' =>$SGD_stem_url{'sgd'}.'Sacch3D/pdbViewer.pl?pdbCode=PDB&orf=', + ( + 'java' =>$SGD_stem_url{'sgd'}.'Sacch3D/pdbViewer.pl?pdbCode=', # Default java viewer + 'webmol' =>$SGD_stem_url{'sgd'}.'Sacch3D/pdbViewer.pl?pdbCode=', + 'codebase' =>$SGD_stem_url{'stanford'}.'structure/webmol/lib', + 'rasmol' =>$Stem_url{'pdb'}.'send-ras?filename=', + 'chime' =>$Stem_url{'pdb'}.'ccpeek?id=', + 'cn3d' =>$Stem_url{'entrez'}.'db=t&form=6&Dopt=i&Complexity=Cn3D+Subset&uid=', + 'kinemage' =>'http://prosci.org/Kinemage', + ); + + +### Stock HTML +# The error reporting HTML strings represent some experiments in human psychology: +# how do you induce users to report errors that you should know about yet not +# get flooded with trivial problems caused by novices? +my %Html = + ('authority' =>qq|<A HREF="mailto:$AUTHORITY"><b>$AUTHORITY</b></A>|, + 'trouble' => <<"QQ_TROUBLE_QQ", +<p>If this problem persists, <A HREF="mailto:$AUTHORITY"><b>please notify us.</b></A> +Include a copy of this error page with your message. Thanks.<p> +QQ_TROUBLE_QQ + 'notify' => <<"QQ_NOTIFY_QQ", +<A HREF="mailto:$AUTHORITY"><b>Please notify us.</b></A> +Include a copy of this error page with your message. Thanks.<p> +QQ_NOTIFY_QQ + 'ourFault' => <<"QQ_FAULT_QQ", +<p><b>This is our fault!</b> There is apparently a problem with our software +that we may not know about. <A HREF="mailto:$AUTHORITY"><b>Please notify us!</b></A> +Include a copy of this error page with your message. Thanks.<p> +QQ_FAULT_QQ + 'techDiff' => <<"QQ_TECH_QQ", +<p><big>We are experiencing technical difficulties now.<br> +We will have the problem fixed soon. Sorry for any inconvenience.</big><p> +QQ_TECH_QQ + + ); + + +### Miscellaneous URLs. Configure as desired for your site. 
+my $Not_found_url = 'http://genome-www.stanford.edu/Sacch3D/notfound.html'; +my $Tmp_url = 'http://genome-www.stanford.edu/tmp/'; + + + +=head1 APPENDIX + +Methods beginning with a leading underscore are considered private +and are intended for internal use by this module. They are +B<not> considered part of the public interface and are described here +for documentation purposes only. + +=cut + +######################################################################### +## ACCESSOR METHODS +######################################################################### + + +=head2 home_url + + Usage : $BioWWW->home_url(<string>) + Purpose : To obtain the homepage URL for a biological database or resource. + Returns : String containing the URL (including "http://") + Argument : String + : Currently acceptable arguments are: + : bioperl bioperl-schema biomoo bsm ebi emotif entrez + : expasy mips mmdb ncbi pir pfam pdb geneQuiz + : molMov pubmed sacch3d sgd scop swissProt webmol ypd + Throws : Warns if argument cannot be resolved to a URL. + Comments : The URLs listed here do not represent a complete list. + : Expect this to evolve and grow with time. + +See Also : L<search_url>() + +=cut + +#------------- +sub home_url { +#------------- + my($self,$arg) = @_; + $arg eq 'all' and return %Home_url; + (exists $Home_url{$arg}) ? $Home_url{$arg} + : ($self->warn("Can't resolve argument to URL: $arg"), + $Not_found_url); +} + + + +=head2 search_url + + Usage : $BioWWW->search_url(<string>) + Purpose : To provide a URL stem for a search engine at a biological database + : or resource. + Returns : String containing the URL (including "http://") + Argument : String + : Currently acceptable arguments are: + : 3db embl cath ec1 ec2 ec3 emotif_id entrez gb1 gb2 + : gb3 gb4 gb5 pdb medline mmdb pdb pdb_coord pfam pir_acc + : pdbSum molMov swpr swModel swprSearch scop scop_pdb scop_data + : ypd + Throws : Warns if argument cannot be resolved to a URL. 
+ Comments  : Unlike the homepage URLs, this method does not return a complete
+           : URL but a stem which must be further modified, typically by
+           : appending data to it, before it can be used. The data appended
+           : depends on the specific URL; typically, it is a database ID or
+           : other unique identifier.
+           : The requirements for each URL will be described here eventually.
+           :
+           : The URLs listed here do not represent a complete list.
+           : Expect this to evolve and grow with time.
+           :
+           : Given this complexity, it may be useful to provide special methods
+           : for these different URLs. This would however result in an
+           : explosion of methods that might make this module less
+           : maintainable and harder to use.
+
+See Also   : L<home_url>()
+
+=cut
+
+#--------------
+sub search_url {
+#--------------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %Search_url;
+    (exists $Search_url{$arg}) ? $Search_url{$arg}
+        : ($self->warn("Can't resolve argument to URL: $arg"),
+           $Not_found_url);
+}
+
+
+
+=head2 stem_url
+
+ Usage     : $BioWWW->stem_url(<string>)
+ Purpose   : To obtain the minimal stem URL for searching a biological database or resource.
+ Returns   : String containing the URL (including "http://")
+ Argument  : String
+           : Currently acceptable arguments are:
+           :   emotif  entrez  pdb
+ Throws    : Warns if argument cannot be resolved to a URL.
+ Comments  : The URL stems returned by this method are much more minimal than
+           : those provided by search_url(). Use of these stems requires knowledge
+           : of the CGI scripts which they invoke.
+
+See Also   : L<search_url>()
+
+=cut
+
+#--------------
+sub stem_url {
+#--------------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %Stem_url;
+    (exists $Stem_url{$arg}) ? $Stem_url{$arg}
+        : ($self->warn("Can't resolve argument to URL: $arg"),
+           $Not_found_url);
+}
+
+
+
+=head2 viewer_url
+
+ Usage     : $BioWWW->viewer_url(<string>)
+ Purpose   : To obtain the stem URL for a 3D viewer (RasMol, WebMol, Cn3D)
+ Returns   : String containing the URL (including "http://")
+ Argument  : String
+           : Currently acceptable arguments are:
+           :   rasmol  webmol  cn3d  java  (java is an alias for webmol)
+ Throws    : Warns if argument cannot be resolved to a URL.
+ Comments  : The 4-letter Brookhaven PDB identifier must be appended to the
+           : URL provided by this method.
+           : The URLs listed here do not represent a complete list.
+           : Expect this to evolve and grow with time.
+
+=cut
+
+#---------------
+sub viewer_url {
+#---------------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %Viewer_url;
+    (exists $Viewer_url{$arg}) ? $Viewer_url{$arg}
+        : ($self->warn("Can't resolve argument to URL: $arg"),
+           $Not_found_url);
+}
+
+
+
+=head2 not_found_url
+
+ Usage     : $BioWWW->not_found_url()
+ Purpose   : To obtain the URL for a web page to be shown in place of a 404 error.
+ Returns   : String containing the URL (including "http://")
+ Argument  : n/a
+ Throws    : n/a
+ Comments  : This URL should be customized as desired.
+
+=cut
+
+#-----------------
+sub not_found_url { my $self = shift; $Not_found_url; }
+#-----------------
+
+
+=head2 tmp_url
+
+ Usage     : $BioWWW->tmp_url()
+ Purpose   : To obtain the URL for a temporary, web-accessible directory.
+ Returns   : String containing the URL (including "http://")
+ Argument  : n/a
+ Throws    : n/a
+ Comments  : This URL should be customized as desired.
+
+=cut
+
+#-----------
+sub tmp_url { my $self = shift; $Tmp_url; }
+#-----------
+
+
+
+=head2 search_link
+
+ Usage     : $BioWWW->search_link(<site>, <value>, <text>)
+ Purpose   : Wrapper for search_url() that returns the URL within an HTML anchor.
+ Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
+ Argument  : <site>  = string to be used as argument for search_url()
+           : <value> = string to be appended to the search URL stem.
+           : <text>  = string to be shown as the link text (default = <value>).
+ Throws    : n/a
+ Status    : Experimental
+
+See Also   : L<search_url>()
+
+=cut
+
+#---------------
+sub search_link {
+#---------------
+    my($self,$arg,$value,$text) = @_;
+    my $url = $self->search_url($arg);
+    $text ||= $value;
+    qq|<A HREF="$url$value">$text</A>|;
+}
+
+
+
+=head2 viewer_link
+
+ Usage     : $BioWWW->viewer_link(<site>, <value>, <text>)
+ Purpose   : Wrapper for viewer_url() that returns the complete URL within an HTML anchor.
+ Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
+ Argument  : <site>  = string to be used as argument for viewer_url()
+           : <value> = string to be appended to the viewer URL stem.
+           : <text>  = string to be shown as the link text (default = <value>).
+ Throws    : n/a
+ Status    : Experimental
+
+See Also   : L<viewer_url>()
+
+=cut
+
+#----------------
+sub viewer_link {
+#----------------
+    my($self,$arg,$value,$text) = @_;
+    my $url = $self->viewer_url($arg);
+    $text ||= $value;
+    qq|<A HREF="$url$value">$text</A>|;
+}
+
+
+
+=head2 html
+
+ Usage     : $BioWWW->html(<string>)
+ Purpose   : To obtain HTML-formatted text for frequently needed web-page messages.
+ Returns   : String containing the HTML-formatted message.
+ Argument  : String.
+           : Currently acceptable arguments are:
+           :   authority (mailto: link for webmaster; shows e-mail address as link)
+           :   notify    (wraps mailto:authority link with text for link "please notify us")
+           :   ourFault  ("this problem is our fault. If it persists <notify-link>")
+           :   trouble   (same as ourFault but doesn't blame us for the problem)
+           :   techDiff  ("we are experiencing technical difficulties. Please stand by.")
+ Throws    : n/a
+ Comments  : The authority (webmaster) address is held in $AUTHORITY, which is
+           : defined at the top of this module. Set it there to the address
+           : that should receive error reports for your site.
+
+=cut
+
+#----------
+sub html {
+#----------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %Html;
+    (exists $Html{$arg}) ? $Html{$arg} : "<pre>(missing HTML for \"$arg\")</pre>";
+}
+
+
+###
+### Below are accessors specialized for the Saccharomyces Genome Database
+### It is possible that they will be moved to Bio::SGD::WWW.pm in the future.
+###
+
+
+=head2 sgd_url
+
+ Usage     : $BioWWW->sgd_url(<string>)
+ Purpose   : To obtain the webpage URL or search stem for SGD.
+ Returns   : String containing the URL (including "http://")
+ Argument  : String
+           : Currently acceptable arguments (TODO).
+ Throws    : Warns if argument cannot be resolved to a URL.
+ Comments  : This accessor is specialized for the Saccharomyces Genome Database.
+           : It is possible that it will be moved to SGD::WWW.pm in the future.
+
+See Also   : L<search_url>()
+
+=cut
+
+#------------
+sub sgd_url {
+#------------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %SGD_url;
+    (exists $SGD_url{$arg}) ? $SGD_url{$arg}
+        : ($self->warn("Can't resolve argument to URL: $arg"),
+           $Not_found_url);
+}
+
+
+
+=head2 s3d_url
+
+ Usage     : $BioWWW->s3d_url(<string>)
+ Purpose   : To obtain the webpage URL or search stem for Sacch3D.
+ Returns   : String containing the URL (including "http://")
+ Argument  : String
+           : Currently acceptable arguments (TODO).
+ Throws    : Warns if argument cannot be resolved to a URL.
+ Comments  : This accessor is specialized for the Saccharomyces Genome Database.
+           : It is possible that it will be moved to SGD::WWW.pm in the future.
+
+See Also   : L<search_url>()
+
+=cut
+
+#-----------
+sub s3d_url {
+#-----------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %S3d_url;
+    (exists $S3d_url{$arg}) ? $S3d_url{$arg}
+        : ($self->warn("Can't resolve argument to URL: $arg"),
+           $Not_found_url);
+}
+
+
+
+=head2 sgd_stem_url
+
+ Usage     : $BioWWW->sgd_stem_url(<string>)
+ Purpose   : To obtain the minimal stem URL for a SGD/Sacch3D CGI script.
+ Returns   : String containing the URL (including "http://")
+ Argument  : String
+           : Currently acceptable arguments (TODO).
+ Throws    : Warns if argument cannot be resolved to a URL.
+ Comments  : This accessor is specialized for the Saccharomyces Genome Database.
+           : It is possible that it will be moved to SGD::WWW.pm in the future.
+
+See Also   : L<search_url>()
+
+=cut
+
+#-----------------
+sub sgd_stem_url {
+#-----------------
+    my($self,$arg) = @_;
+    $arg eq 'all' and return %SGD_stem_url;
+    (exists $SGD_stem_url{$arg}) ? $SGD_stem_url{$arg}
+        : ($self->warn("Can't resolve argument to URL: $arg"),
+           $Not_found_url);
+}
+
+
+
+=head2 s3d_link
+
+ Usage     : $BioWWW->s3d_link(<site>, <value>, <text>)
+ Purpose   : Wrapper for s3d_url() that returns the complete URL within an HTML anchor.
+ Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
+ Argument  : <site>  = string to be used as argument for s3d_url()
+           : <value> = string to be appended to the s3d URL stem.
+           : <text>  = string to be shown as the link text (default = <value>).
+ Throws    : n/a
+ Status    : Experimental
+ Comments  : This accessor is specialized for the Saccharomyces Genome Database.
+           : It is possible that it will be moved to SGD::WWW.pm in the future.
+
+See Also   : L<s3d_url>(), L<sgd_link>()
+
+=cut
+
+#--------------
+sub s3d_link {
+#--------------
+    my($self,$arg,$value,$text) = @_;
+    my $url = $self->s3d_url($arg);
+    $text ||= $value;
+    qq|<A HREF="$url$value">$text</A>|;
+}
+
+
+
+=head2 sgd_link
+
+ Usage     : $BioWWW->sgd_link(<site>, <value>, <text>)
+ Purpose   : Wrapper for sgd_url() that returns the complete URL within an HTML anchor.
+ Returns   : String containing the HTML anchor ( qq|<A HREF="http://...">...</A>| )
+ Argument  : <site>  = string to be used as argument for sgd_url()
+           : <value> = string to be appended to the sgd URL stem.
+           : <text>  = string to be shown as the link text (default = <value>).
+ Throws    : n/a
+ Status    : Experimental
+ Comments  : This accessor is specialized for the Saccharomyces Genome Database.
+           : It is possible that it will be moved to SGD::WWW.pm in the future.
+
+See Also   : L<sgd_url>(), L<s3d_link>()
+
+=cut
+
+#--------------
+sub sgd_link {
+#--------------
+    my($self,$arg,$value,$text) = @_;
+    my $url = $self->sgd_url($arg);
+    $text ||= $value;
+    qq|<A HREF="$url$value">$text</A>|;
+}
+
+
+#########################################################################
+##  INSTANCE METHODS
+#########################################################################
+
+## Note that similar functions to those presented below are also available
+## via L. Stein's CGI.pm. These are more experimental versions.
+
+=head2 start_html
+
+ Usage     : $BioWWW->start_html()
+ Purpose   : Prints the "Content-type: text/html\n\n<HTML>\n" header.
+ Returns   : n/a; This method prints the Content-type string shown above.
+ Argument  : n/a
+ Throws    : n/a
+ Status    : Experimental
+ Comments  : This method prevents redundant invocations, thus avoiding the
+           : accidental printing of the "content-type..." on the page.
+           : If using L. Stein's CGI.pm, this is similar to $query->header()
+           : (Does CGI.pm prevent redundant invocation?)
+
+=cut
+
+#---------------'
+sub start_html {
+#---------------
+    my $self=shift;
+    if(!$self->{'_started_html'}) {
+        print "Content-type: text/html\n\n<HTML>\n";
+        $self->{'_started_html'} = 1;
+    }
+}
+
+
+=head2 redirect
+
+ Usage     : $BioWWW->redirect(<string>)
+ Purpose   : Prints the header needed to redirect a web browser to a supplied URL.
+ Returns   : n/a; Prints the redirection header.
+ Argument  : String containing the URL to be redirected to.
+ Throws    : n/a
+ Status    : Experimental
+
+=cut
+
+#-------------
+sub redirect {
+#-------------
+    my($self,$url) = @_;
+
+    print "Location: $url\n";
+    print "Content-type: text/html\n\n";
+}
+
+
+
+=head2 pre
+
+ Usage     : $BioWWW->pre("text to be pre-formatted");
+ Purpose   : To produce HTML for text that is not to be formatted by the browser.
+ Returns   : String containing the "<pre>" formatted html.
+ Argument  : String containing the text to be pre-formatted.
+ Throws    : n/a
+ Status    : Experimental
+
+=cut
+
+#--------
+sub pre {
+#--------
+    my $self = shift;
+    "<PRE>\n".shift()."\n</PRE>";
+}
+
+
+#----------------
+sub html_footer {
+#----------------
+    my( $self, @param ) = @_;
+
+    my( $linkTo, $linkText, $modified, $mail, $mailText, $top) =
+        $self->_rearrange([qw(LINKTO LINKTEXT MODIFIED MAIL MAILTEXT TOP)], @param);
+
+    $modified = (scalar $modified)
+        ? qq|<center><small><b>Last modified: $modified </b></small></center>|
+        : '';
+
+    $linkTo ||= '';
+
+#    $top = (defined $top) ? qq|<a href="top">Top</a><br>| : '';
+    $top = qq|<a href="#top">Top</a>|;  ## Utilizing the HTML bug/feature wherein
+                                        ## a bogus name anchor defaults to the
+                                        ## top of the page.
+
+    return <<"HTML";
+<p>
+<hr size=3 noshade width=95%>
+$top | <a href="$linkTo"> $linkText</a><br>
+$modified
+<small><i><a href="mailto:$mail">$mailText</a></i></small>
+</body></html>
+
+HTML
+}
+
+
+=head2 strip_html
+
+ Usage     : $boolean = $BioWWW->strip_html( string_ref, [fast] );
+ Purpose   : Removes HTML formatting from a supplied string.
+ Returns   : Boolean: true if string was stripped, false if not.
+ Argument  : string_ref = reference to a string containing the whole
+           :              web page to be stripped.
+           : fast = a non-zero value. Optional. If set, a faster
+           :        but perhaps less thorough procedure is used for
+           :        stripping. Default = not fast.
+ Throws    : Exception if the argument is not a scalar reference.
+ Comments  : Based on code originally written by Alex Dong Li
+           : (ali@genet.sickkids.on.ca).
+           : This is a more generic version of the function that appears
+           : in Bio::Tools::Blast::HTML.pm
+           : This version does not perform any Blast-specific stripping.
+           :
+           : This employs a simple method for removing tags that
+           : will fail under the following conditions:
+           :  1) if quoted > appears in a tag (does this ever happen?)
+           :  2) if a tag is split over multiple lines and this method is
+           :     used to process one line at a time.
+           :
+           : Without fast mode, large HTML files can take exceedingly long times to
+           : strip (e.g., 1Meg file with many tags can take 10 minutes versus 5 seconds
+           : in fast mode. Try the swissprot yeast table). If you know the HTML to be
+           : well-behaved (i.e., tags are not split across multiple lines), use fast
+           : mode for large, dense files.
+
+=cut
+
+#---------------
+sub strip_html {
+#---------------
+    my ($self, $string_ref, $fast) = @_;
+
+    ref $string_ref eq 'SCALAR' or
+        $self->throw("Can't strip HTML: ".
+                     "Argument should be a SCALAR reference, not a ${\ref $string_ref}");
+
+    my $str = $$string_ref;
+    my $stripped = 0;
+
+    if($fast) {
+        # MULTI-STRING-MODE: Much faster than single-string mode
+        # but will miss tags that span multiple lines.
+        # This is fine if you know the HTML to be "well-behaved".
+
+        my @lines = split("\n", $str);
+        foreach (@lines) {
+            s/<[^>]+>|&nbsp;?//gi and $stripped = 1;
+        }
+
+        # This regexp likely won't work properly in this mode.
+        foreach (@lines) {
+            s/(\A|\n)>\s+/\n\n>/gi and $stripped = 1;
+        }
+        $$string_ref = join ("\n", @lines);
+
+    } else {
+
+        # SINGLE-STRING-MODE: Can be very slow for long strings with many substitutions.
+
+        # Removing all "<>" tags.
+        $str =~ s/<[^>]+>|&nbsp;?//sgi and $stripped = 1;
+
+        # Re-uniting any lone '>' characters. Not really necessary for functional HTML
+        $str =~ s/(\A|\n)>\s+/\n\n>/sgi and $stripped = 1;
+
+        $$string_ref = $str;
+    }
+    $stripped;
+}
+
+
+1;
+__END__
+
+########################################################################
+## END OF CLASS
+########################################################################
+
+=head1 FOR DEVELOPERS ONLY
+
+=head2 Data Members
+
+An instance of Bio::Tools::WWW.pm is a blessed reference to a hash containing
+all or some of the following fields:
+
+ FIELD           VALUE
+ --------------------------------------------------------------
+ _started_html   Defined on the initial invocation of start_html()
+                 to avoid printing the "Content-type..." header more than once.
+
+
+=cut
+
+1;
+
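
The accessor and link methods in this file all follow one pattern: look a key up in a private hash, warn and fall back to the not-found page when the key is unknown, and optionally wrap the stem plus an appended value in an anchor. The sketch below reproduces that pattern as a standalone script so it can be run without Bioperl installed. It is an illustration, not part of the changeset: the two table entries are copied from the expanded values of %Search_url above, plain warn stands in for the inherited Bio::Root::Root warn() method, and the accession P12345 is only a sample value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the module's private tables. Both stems are the expanded
# values of the 'swpr' and 'scop_pdb' entries of %Search_url in the diff.
my %Search_url = (
    'swpr'     => 'http://www.expasy.ch/cgi-bin/get-sprot-entry?',
    'scop_pdb' => 'http://scop.stanford.edu/scop/search.cgi?pdb=',
);
my $Not_found_url = 'http://genome-www.stanford.edu/Sacch3D/notfound.html';

# Same shape as search_url(): a known key returns its stem; an unknown key
# warns and falls back to the not-found page.
sub search_url {
    my ($key) = @_;
    return $Search_url{$key} if exists $Search_url{$key};
    warn "Can't resolve argument to URL: $key\n";
    return $Not_found_url;
}

# Same shape as search_link(): stem + value inside an anchor, with the
# value doubling as the link text when no text is supplied.
sub search_link {
    my ($key, $value, $text) = @_;
    my $url = search_url($key);
    $text ||= $value;
    return qq|<A HREF="$url$value">$text</A>|;
}

print search_link('scop_pdb', '1ARD'), "\n";
print search_link('swpr', 'P12345', 'SwissProt entry'), "\n";
```

With the real module the equivalent calls would be made on the imported static object, for example $BioWWW->search_link('scop_pdb', '1ARD'). Note that neither the sketch nor the module escapes the appended value, so callers are presumably expected to pass identifiers that are already safe to place in a URL and in HTML.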