--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/COPYING	Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,121 @@
+Creative Commons Legal Code
+
+CC0 1.0 Universal
+
+    CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
+    LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
+    ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
+    INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
+    REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
+    PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
+    THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
+    HEREUNDER.
+
+Statement of Purpose
+
+The laws of most jurisdictions throughout the world automatically confer
+exclusive Copyright and Related Rights (defined below) upon the creator
+and subsequent owner(s) (each and all, an "owner") of an original work of
+authorship and/or a database (each, a "Work").
+
+Certain owners wish to permanently relinquish those rights to a Work for
+the purpose of contributing to a commons of creative, cultural and
+scientific works ("Commons") that the public can reliably and without fear
+of later claims of infringement build upon, modify, incorporate in other
+works, reuse and redistribute as freely as possible in any form whatsoever
+and for any purposes, including without limitation commercial purposes.
+These owners may contribute to the Commons to promote the ideal of a free
+culture and the further production of creative, cultural and scientific
+works, or to gain reputation or greater distribution for their Work in
+part through the use and efforts of others.
+
+For these and/or other purposes and motivations, and without any
+expectation of additional consideration or compensation, the person
+associating CC0 with a Work (the "Affirmer"), to the extent that he or she
+is an owner of Copyright and Related Rights in the Work, voluntarily
+elects to apply CC0 to the Work and publicly distribute the Work under its
+terms, with knowledge of his or her Copyright and Related Rights in the
+Work and the meaning and intended legal effect of CC0 on those rights.
+
+1. Copyright and Related Rights. A Work made available under CC0 may be
+protected by copyright and related or neighboring rights ("Copyright and
+Related Rights"). Copyright and Related Rights include, but are not
+limited to, the following:
+
+  i. the right to reproduce, adapt, distribute, perform, display,
+     communicate, and translate a Work;
+ ii. moral rights retained by the original author(s) and/or performer(s);
+iii. publicity and privacy rights pertaining to a person's image or
+     likeness depicted in a Work;
+ iv. rights protecting against unfair competition in regards to a Work,
+     subject to the limitations in paragraph 4(a), below;
+  v. rights protecting the extraction, dissemination, use and reuse of data
+     in a Work;
+ vi. database rights (such as those arising under Directive 96/9/EC of the
+     European Parliament and of the Council of 11 March 1996 on the legal
+     protection of databases, and under any national implementation
+     thereof, including any amended or successor version of such
+     directive); and
+vii. other similar, equivalent or corresponding rights throughout the
+     world based on applicable law or treaty, and any national
+     implementations thereof.
+
+2. Waiver. To the greatest extent permitted by, but not in contravention
+of, applicable law, Affirmer hereby overtly, fully, permanently,
+irrevocably and unconditionally waives, abandons, and surrenders all of
+Affirmer's Copyright and Related Rights and associated claims and causes
+of action, whether now known or unknown (including existing as well as
+future claims and causes of action), in the Work (i) in all territories
+worldwide, (ii) for the maximum duration provided by applicable law or
+treaty (including future time extensions), (iii) in any current or future
+medium and for any number of copies, and (iv) for any purpose whatsoever,
+including without limitation commercial, advertising or promotional
+purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
+member of the public at large and to the detriment of Affirmer's heirs and
+successors, fully intending that such Waiver shall not be subject to
+revocation, rescission, cancellation, termination, or any other legal or
+equitable action to disrupt the quiet enjoyment of the Work by the public
+as contemplated by Affirmer's express Statement of Purpose.
+
+3. Public License Fallback. Should any part of the Waiver for any reason
+be judged legally invalid or ineffective under applicable law, then the
+Waiver shall be preserved to the maximum extent permitted taking into
+account Affirmer's express Statement of Purpose. In addition, to the
+extent the Waiver is so judged Affirmer hereby grants to each affected
+person a royalty-free, non transferable, non sublicensable, non exclusive,
+irrevocable and unconditional license to exercise Affirmer's Copyright and
+Related Rights in the Work (i) in all territories worldwide, (ii) for the
+maximum duration provided by applicable law or treaty (including future
+time extensions), (iii) in any current or future medium and for any number
+of copies, and (iv) for any purpose whatsoever, including without
+limitation commercial, advertising or promotional purposes (the
+"License"). The License shall be deemed effective as of the date CC0 was
+applied by Affirmer to the Work. Should any part of the License for any
+reason be judged legally invalid or ineffective under applicable law, such
+partial invalidity or ineffectiveness shall not invalidate the remainder
+of the License, and in such case Affirmer hereby affirms that he or she
+will not (i) exercise any of his or her remaining Copyright and Related
+Rights in the Work or (ii) assert any associated claims and causes of
+action with respect to the Work, in either case contrary to Affirmer's
+express Statement of Purpose.
+
+4. Limitations and Disclaimers.
+
+ a. No trademark or patent rights held by Affirmer are waived, abandoned,
+    surrendered, licensed or otherwise affected by this document.
+ b. Affirmer offers the Work as-is and makes no representations or
+    warranties of any kind concerning the Work, express, implied,
+    statutory or otherwise, including without limitation warranties of
+    title, merchantability, fitness for a particular purpose, non
+    infringement, or the absence of latent or other defects, accuracy, or
+    the present or absence of errors, whether or not discoverable, all to
+    the greatest extent permissible under applicable law.
+ c. Affirmer disclaims responsibility for clearing rights of other persons
+    that may apply to the Work or any use thereof, including without
+    limitation any person's Copyright and Related Rights in the Work.
+    Further, Affirmer disclaims responsibility for obtaining any necessary
+    consents, permissions or other rights required for any use of the
+    Work.
+ d. Affirmer understands and acknowledges that Creative Commons is not a
+    party to this document and has no duty or obligation with respect to
+    this CC0 or use of the Work.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/PSM2SAM.R	Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,400 @@
+#!/usr/bin/env Rscript
+
+## begin warning handler
+withCallingHandlers({
+
+library(methods) # Because Rscript does not always do this
+
+options('useFancyQuotes' = FALSE)
+
+suppressPackageStartupMessages(library("optparse"))
+suppressPackageStartupMessages(library("RGalaxy"))
+
+
+option_list <- list()
+
+option_list$passedPSM <- make_option('--passedPSM', type='character')
+option_list$XScolumn <- make_option('--XScolumn', type='character')
+option_list$exon_anno <- make_option('--exon_anno', type='character')
+option_list$proteinseq <- make_option('--proteinseq', type='character')
+option_list$procodingseq <- make_option('--procodingseq', type='character')
+option_list$header <- make_option('--header', type='character')
+option_list$OutputFile <- make_option('--OutputFile', type='character')
+
+
+opt <- parse_args(OptionParser(option_list=option_list))
+
+
+
+
+PSMtab2SAM <- function(
+	passedPSM_file = GalaxyInputFile(required=TRUE),
+	exon_anno_file = GalaxyInputFile(),
+	proteinseq_file = GalaxyInputFile(),
+	procodingseq_file = GalaxyInputFile(),
+	header_file = GalaxyInputFile(),
+	XScolumn = GalaxyCharacterParam(required=TRUE),
+	OutputFile = GalaxyOutput("proSAM","sam"))
+{
+    if (length(exon_anno_file) == 0) { exon_anno_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/exon_anno.RData" }
+    if (length(proteinseq_file) == 0) { proteinseq_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/proseq.RData" }
+    if (length(procodingseq_file) == 0) { procodingseq_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/procodingseq.RData" }
+    if (length(header_file) == 0) { header_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/header_refseq_hg19.txt" }
+
+    if (!file.exists(header_file)) { gstop("failed to read header file") }
+    if (file.exists(OutputFile))
+    {
+        if (file.info(OutputFile)$size > 0) { gstop("output file already exists") }
+        else
+        {
+            tryCatch(
+            {
+                file.remove(OutputFile)
+            }, error=function(err)
+            {
+                gstop("failed to remove empty existing file")
+            })
+        }
+    }
+
+    suppressPackageStartupMessages(library(GenomicRanges))
+    suppressPackageStartupMessages(library(Biostrings))
+
+    options(stringsAsFactors=FALSE)
+
+    scoreName = XScolumn
+    columnName = gsub(":", "_", scoreName)
+
+    passedPSM <- tryCatch({
+        #read.delim(passedPSM_file, row.names=1)
+        suppressPackageStartupMessages(library(RSQLite))
+        drv <- dbDriver("SQLite")
+        con <- dbConnect(drv, passedPSM_file)
+
+        # do case-insensitive search for the score name
+        res <- dbSendQuery(con, paste("SELECT Id, Name FROM PeptideSpectrumMatchScoreName WHERE lower(Name)=lower('", scoreName, "')", sep=""))
+        scoreInfo = fetch(res, n=1)
+        scoreId = scoreInfo["Id"]
+        realScoreName = scoreInfo["Name"] # original case
+        dbClearResult(res)
+
+        sql <- paste("SELECT ss.Name as SourceName, s.NativeID",
+                     ", psm.ObservedNeutralMass AS precursor_neutral_mass",
+                     ", psm.Charge AS assumed_charge",
+                     ", s.ScanTimeInSeconds AS retention_time_sec",
+                     ", psm.Rank AS hit_rank",
+                     ", IFNULL(SUBSTR(pd.Sequence, pi.Offset+1, pi.Length), pep.DecoySequence) AS peptide",
+                     ", pro.Accession AS protein",
+                     ", COUNT(DISTINCT pro.Id) AS num_tot_proteins",
+                     ", psm.ObservedNeutralMass - (psm.MonoisotopicMassError - ROUND(psm.MonoisotopicMassError) * 1.0026) AS calc_neutral_pep_mass",
+                     ", psm.MonoisotopicMassError - ROUND(psm.MonoisotopicMassError) * 1.0026 AS massdiff",
+                     ", CTerminusIsSpecific+NTerminusIsSpecific AS num_tol_term",
+                     ", MissedCleavages AS num_missed_cleavages",
+                     ", psm.QValue AS qvalue",
+                     ", psmScore.Value AS ", columnName,
+                     ", GROUP_CONCAT(DISTINCT pm.Offset || ';' || mod.MonoMassDelta) AS modification",
+                     "FROM PeptideSpectrumMatch psm",
+                     "JOIN Spectrum s ON psm.Spectrum=s.Id",
+                     "JOIN SpectrumSource ss ON s.Source=ss.Id",
+                     "JOIN PeptideInstance pi ON psm.Peptide=pi.Peptide",
+                     "JOIN Protein pro ON pi.Protein=pro.Id",
+                     "JOIN Peptide pep ON pi.Peptide=pep.Id",
+                     "JOIN PeptideSpectrumMatchScore psmScore ON psmScore.PsmId=psm.Id AND ScoreNameId=", scoreId,
+                     "LEFT JOIN ProteinData pd ON pro.Id=pd.Id",
+                     "LEFT JOIN PeptideModification pm ON psm.Id=pm.PeptideSpectrumMatch",
+                     "LEFT JOIN Modification mod ON pm.Modification=mod.Id",
+                     "GROUP BY psm.Id"
+                    )
+
+        res <- dbSendQuery(con, sql)
+        passedPSM <- fetch(res, n=-1)
+        dbClearResult(res)
+        dbDisconnect(con)
+        passedPSM
+    }, error=function(err) {
+        gstop("failed to read passedPSM: ", conditionMessage(err))
+    })
+
+    tryCatch({
+        load(exon_anno_file)
+        exon_anno <- exon
+    }, error=function(err) {
+        gstop("failed to read exon_anno: ", conditionMessage(err))
+    })
+
+    tryCatch({
+        load(proteinseq_file)
+    }, error=function(err) {
+        gstop("failed to read proteinseq: ", conditionMessage(err))
+    })
+
+    tryCatch({
+        load(procodingseq_file)
+    }, error=function(err) {
+        gstop("failed to read procodingseq: ", conditionMessage(err))
+    })
+
+    PEP <- passedPSM[, 'peptide']
+    #Spectrumid <- paste(passedPSM[, 'SourceName'], gsub(" ", "_", passedPSM[, 'NativeID']), sep="/")
+    Spectrumid <- paste(passedPSM[, 'SourceName'], passedPSM[, 'NativeID'], sep="/")
+    #PEP_SEQ <- formatPep(spectra[, 'Sequence'])
+
+    SAM <- c()
+
+    spectrumcount <- table(Spectrumid)
+
+    for(i in seq_len(nrow(passedPSM))){
+        #print(i)
+        peptide <- PEP[i]
+        QNAME <- Spectrumid[i]
+        idx <- grep(peptide, proteinseq[, 'peptide'], fixed=TRUE)
+        if(length(idx) == 0){
+            RNAME <- '*'
+            MAPQ <- 255
+            RNEXT <- '*'
+            PNEXT <- 0
+            TLEN <- 0
+            QUAL <- '*'
+            POS <- 0
+            SEQ <- '*'
+            CIGAR <- '*'
+            FLAG <- 0x4
+            annoted <- '?'
+            XA <- paste('XA:Z:', annoted, sep='')
+            res <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN,
+                    as.character(SEQ), QUAL, XA)
+            res <-  unique(data.frame(t(res)))
+        }else{
+            pro <- proteinseq[idx, ]
+            sta_pos <- unlist(lapply(pro[, 'peptide'], function(x)
+                        regexpr(peptide, x, fixed=TRUE)))
+            pep_len <- nchar(peptide)
+            end_pos <- sta_pos + pep_len - 1
+
+            coding <- procodingseq[match(pro[, 'pro_name'],
+                                        procodingseq[, 'pro_name']), ]
+            code_s <- (sta_pos-1) * 3 + 1
+            code_e <- end_pos * 3
+            codingseq <- substring(coding[, 'coding'], code_s, code_e)
+
+
+
+            exonp <- lapply(pro[, 'tx_name'], function(x)
+                        exon_anno[exon_anno[, 'tx_name']==x, ])
+
+            exonp <- lapply(exonp, function(x){
+                            if(length(unique(x[, 'tx_id'])) > 1){
+                                x[grep(x[1, 'tx_id'], x[, 'tx_id'],
+                                fixed=TRUE), ]
+                            }else x
+                        })
+
+            if(passedPSM[i, 'hit_rank'] == 1) pri <- TRUE else pri <- FALSE
+
+            res <- mapply(function(x, y, z, m)
+                    if(dim(z)[1] == 0){
+                        .proteinUnannotated(x, y, z, m, primary=pri)
+                    }else{
+                        if((nchar(m) != 3*pep_len) || (y > max(z[, 'cds_end'],
+                            na.rm = TRUE))){
+                        #if(toString(translate(DNAString(m))) != peptide){
+                            .peptideUnannotated(x, y, z, m, primary=pri)
+                        }else{
+                            .MapCoding2genome(x, y, z, m, primary=pri)
+                        }
+                    },
+                    code_s, code_e, exonp, codingseq)
+
+            res <-  unique(data.frame(t(res)))
+
+        }
+        XL <- paste('XL:i:', as.numeric(spectrumcount[QNAME]), sep='')
+        NH <- paste('NH:i:', dim(res)[1], sep='')
+        XP <- paste('XP:Z:', peptide, sep='')
+        #XF <- paste('XF:f:', round(passedPSM[i, XFcolumn], digits=4), sep='')
+        XC <- paste('XC:i:', passedPSM[i, 'assumed_charge'], sep='')
+        XS <- paste('XS:f:', round(as.numeric(passedPSM[i, columnName]),
+                        digits=4), sep='')
+        #XA <- paste('XA:Z:', annoted, sep='')
+        XN <- paste('XN:i:', passedPSM[i, 'num_missed_cleavages'], sep='')
+        XT <- paste('XT:i:', passedPSM[i, 'num_tol_term'], sep='')
+
+        XM <-  ifelse(is.na(passedPSM[i, 'modification']), paste('XM:Z:-'),
+                    paste('XM:Z:', passedPSM[i, 'modification'], sep=''))
+
+        res <- cbind(QNAME, res, NH, XL, XP, XC, XS, XM, XN, XT)
+        SAM <- rbind(SAM, res)
+    }
+
+    file.copy(header_file, OutputFile)
+    write.table(SAM, file=OutputFile, sep='\t', quote=FALSE, row.names=FALSE, col.names=FALSE, append=TRUE)
+}
+
+
+
+.proteinUnannotated <-function(c_sta, c_end, exon_anno, cseq, primary=TRUE, ...)
+{
+    RNAME <- '*'
+    MAPQ <- 255
+    RNEXT <- '*'
+    PNEXT <- 0
+    TLEN <- 0
+    QUAL <- '*'
+    POS <- 0
+    SEQ <- '*'
+    CIGAR <- '*'
+    annoted <- 2
+    XA <- paste('XA:Z:', annoted, sep='')
+    FLAG <- 0x4
+
+    tmp <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN,
+                as.character(SEQ), QUAL, XA)
+    tmp
+}
+
+
+
+.peptideUnannotated <- function(c_sta, c_end, exon_anno, cseq, primary=TRUE, ...)
+{
+    #RNAME <- as.character(exon_anno[1, 'chromosome_name'])
+    RNAME <- '*'
+    MAPQ <- 255
+    RNEXT <- '*'
+    PNEXT <- 0
+    TLEN <- 0
+    QUAL <- '*'
+    POS <- 0
+    SEQ <- '*'
+    CIGAR <- '*'
+    annoted <- 1
+    XA <- paste('XA:Z:', annoted, sep='')
+    FLAG <- 0x4
+    #if(exon_anno[1, 'strand'] == '+'){
+    #    FLAG <- ifelse(primary==TRUE, 0x00, 0x00+0x100)
+    #}else{
+    #    FLAG <- ifelse(primary==TRUE, 0x10, 0x10+0x100)
+    #}
+
+    tmp <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN,
+                as.character(SEQ), QUAL, XA)
+    tmp
+}
+
+
+
+.MapCoding2genome <- function(c_sta, c_end, exon_anno, cseq, primary=TRUE, ...)
+{
+    idxs <- intersect(which(exon_anno[, 'cds_start'] <= c_sta),
+                        which(exon_anno[, 'cds_end'] >= c_sta))
+    idxe <- intersect(which(exon_anno[, 'cds_start'] <= c_end),
+                        which(exon_anno[, 'cds_end'] >= c_end))
+    len <- c_end - c_sta + 1
+    RNAME <- as.character(exon_anno[1, 'chromosome_name'])
+    MAPQ <- 255
+    RNEXT <- '*'
+    PNEXT <- 0
+    TLEN <- 0
+    QUAL <- '*'
+    annoted <- 0
+    XA <- paste('XA:Z:', annoted, sep='')
+
+    if(exon_anno[1, 'strand'] == '+'){
+            POS <- exon_anno[idxs, 'cds_chr_start'] + c_sta -
+                    exon_anno[idxs, 'cds_start']
+            SEQ <- DNAString(cseq)
+            FLAG <- ifelse(primary==TRUE, 0x00, 0x00+0x100)
+
+     }else{
+            POS <- exon_anno[idxe, 'cds_chr_start'] +
+                    exon_anno[idxe, 'cds_end'] - c_end
+            SEQ <- reverseComplement(DNAString(cseq))
+            FLAG <- ifelse(primary==TRUE, 0x10, 0x10 + 0x100)
+     }
+
+    if(idxe == idxs){
+        CIGAR <- paste(len, 'M', sep='')
+    }else{
+         if(exon_anno[1, 'strand'] == '+'){
+            #insert <- exon_anno[idxe, 'cds_chr_start'] - exon_anno[idxs,
+            #                        'cds_chr_end']- 1
+            part1 <- exon_anno[idxs, 'cds_end'] - c_sta + 1
+            part2 <- c_end - exon_anno[idxe, 'cds_start'] + 1
+
+            insert <- unlist(lapply(1:(idxe - idxs), function(x)
+                paste(exon_anno[idxs + x, 'cds_chr_start'] -
+                    exon_anno[idxs+x-1, 'cds_chr_end']- 1, 'N', sep='')))
+            if(idxe-idxs >1){
+                innerexon <- unlist(lapply(1:(idxe-idxs-1), function(x)
+                        paste(exon_anno[idxs + x, 'cds_chr_end'] -
+                        exon_anno[idxs + x, 'cds_chr_start'] + 1, 'M',
+                        sep='')))
+            }else{ innerexon <- ''}
+
+            #ifelse(idxe-idxs >1, unlist(lapply(1:(idxe-idxs-1), function(x)
+            #paste(exon_anno[idxs+x, 'cds_chr_end'] -
+            #             exon_anno[idxs+x, 'cds_chr_start']+1,
+            #             'M', sep=''))), '')
+            midpattern <- rep(NA, length(insert) + length(innerexon))
+            midpattern[seq(1, length(insert) + length(innerexon),
+                                                        by=2)] <- insert
+            midpattern[seq(2, length(insert) + length(innerexon),
+                                                        by=2)] <- innerexon
+            midpattern <- paste(midpattern, collapse='')
+
+        }else{
+            #insert <- exon_anno[idxs, 'cds_chr_start'] -
+            #            exon_anno[idxe, 'cds_chr_end']- 1
+            part1 <- c_end- exon_anno[idxe, 'cds_start'] + 1
+            part2 <- exon_anno[idxs, 'cds_end'] - c_sta + 1
+
+            insert <- unlist(lapply(1:(idxe-idxs), function(x)
+                paste(exon_anno[idxe - x, 'cds_chr_start'] -
+                        exon_anno[idxe-x + 1, 'cds_chr_end']- 1,
+                        'N', sep='')))
+            if(idxe-idxs >1){
+                innerexon <- unlist(lapply(1:(idxe-idxs-1), function(x)
+                        paste(exon_anno[idxe-x, 'cds_chr_end'] -
+                        exon_anno[idxe-x, 'cds_chr_start']+1, 'M', sep='')))
+            }else{ innerexon <-''}
+
+            midpattern <- rep(NA, length(insert)+length(innerexon))
+            midpattern[seq(1, length(insert) + length(innerexon),
+                                                        by=2)] <- insert
+            midpattern[seq(2, length(insert) + length(innerexon),
+                                                        by=2)] <- innerexon
+            midpattern <- paste(midpattern, collapse='')
+
+        }
+
+        CIGAR <- paste(part1, 'M', midpattern, part2, 'M', sep='')
+    }
+
+    tmp <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN,
+                as.character(SEQ), QUAL, XA)
+    tmp
+}
+
+
+params <- list()
+for(param in names(opt))
+{
+    if (param != "help")
+        params[param] <- opt[param]
+}
+
+setClass("GalaxyRemoteError", contains="character")
+wrappedFunction <- function(f)
+{
+    tryCatch(do.call(f, params),
+        error=function(e) new("GalaxyRemoteError", conditionMessage(e)))
+}
+
+
+suppressPackageStartupMessages(library(RGalaxy))
+do.call(PSMtab2SAM, params)
+
+## end warning handler
+}, warning = function(w) {
+    cat(paste("Warning:", conditionMessage(w), "\n"))
+    invokeRestart("muffleWarning")
+})
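Note on `.MapCoding2genome`: the plus-strand arithmetic can be checked in isolation. The sketch below is not part of the changeset. It assumes a made-up two-exon annotation (all coordinates are hypothetical), reuses the column names from the script, and covers only the plus strand with a peptide that spans at most two adjacent exons.

```r
# Hypothetical two-exon transcript on the plus strand. Coordinates are
# invented for illustration; column names follow exon_anno in PSM2SAM.R.
exon_anno <- data.frame(
    cds_start     = c(1, 31),       # 1-based position within the coding sequence
    cds_end       = c(30, 90),
    cds_chr_start = c(1001, 2001),  # genomic start of each coding segment
    cds_chr_end   = c(1030, 2060)
)

# Peptide occupying residues 8..13 of the protein (6 amino acids).
sta_pos <- 8
pep_len <- 6
end_pos <- sta_pos + pep_len - 1

# Same conversion as in PSMtab2SAM: residue range -> nucleotide range.
c_sta <- (sta_pos - 1) * 3 + 1   # 22
c_end <- end_pos * 3             # 39

# Exons containing the first and the last coding base.
idxs <- which(exon_anno$cds_start <= c_sta & exon_anno$cds_end >= c_sta)
idxe <- which(exon_anno$cds_start <= c_end & exon_anno$cds_end >= c_end)

# Leftmost genomic position of the alignment.
POS <- exon_anno$cds_chr_start[idxs] + c_sta - exon_anno$cds_start[idxs]

if (idxs == idxe) {
    # Peptide lies within one exon: a single match block.
    CIGAR <- paste0(c_end - c_sta + 1, "M")
} else {
    # Assumes idxe == idxs + 1 (no inner exons), unlike the full script.
    part1  <- exon_anno$cds_end[idxs] - c_sta + 1     # bases in the first exon
    part2  <- c_end - exon_anno$cds_start[idxe] + 1   # bases in the last exon
    intron <- exon_anno$cds_chr_start[idxe] - exon_anno$cds_chr_end[idxs] - 1
    CIGAR  <- paste0(part1, "M", intron, "N", part2, "M")
}

cat(POS, CIGAR, "\n")
```

The `M` lengths should add up to three times the peptide length. That is the same consistency condition the script tests with `nchar(m) != 3*pep_len` before it attempts the mapping.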
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/PSM2SAM.xml	Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,61 @@
+<tool id="PSMtoSAM" name="PSM to SAM" version="1.3.2">
+  <description>Generate SAM files from PSMs.</description>
+  <command interpreter="Rscript --vanilla">PSM2SAM.R
+       #if str($input).strip() != "":
+          --passedPSM="$input"
+       #end if
+       #if str($scoreColumn).strip() != "":
+          --XScolumn="$scoreColumn"
+       #end if
+       #if str($optionalUserInput.exonAnno).strip() != "None":
+          --exon_anno="$optionalUserInput.exonAnno"
+       #end if
+       #if str($optionalUserInput.proteinSeq).strip() != "None":
+          --proteinseq="$optionalUserInput.proteinSeq"
+       #end if
+       #if str($optionalUserInput.proCodingSeq).strip() != "None":
+          --procodingseq="$optionalUserInput.proCodingSeq"
+       #end if
+       #if str($optionalUserInput.header).strip() != "None":
+          --header="$optionalUserInput.header"
+       #end if
+       #if str($output).strip() != "":
+          --OutputFile="$output"
+       #end if
+
+2&gt;&amp;1</command>
+  <inputs>
+    <param name="input" type="data" format="idpdb" help="An IDPicker idpDB file to convert to SAM" label="Input PSMs">
+      <validator type="empty_field" message="This field is required."/>
+    </param>
+    <param name="scoreColumn" type="text" help="The name of a PSM score to include in the SAM output (e.g. &quot;MyriMatch:mvh&quot;)" size="60" label="Score Name">
+      <validator type="empty_field" message="This field is required."/>
+    </param>
+    <section name="optionalUserInput" label="Override Default Exon Annotation and Coding Sequences">
+      <param name="exonAnno" type="data" format="RData" help="A dataframe of exon annotations in an RData file" label="Exon Annotations" optional="true" />
+      <param name="proteinSeq" type="data" format="RData" help="A dataframe containing protein ids and protein sequences in an RData file" label="Protein Sequences" optional="true" />
+      <param name="proCodingSeq" type="data" format="RData" help="A dataframe containing coding sequences for each protein in an RData file" label="Protein Coding Sequences" optional="true" />
+      <param name="header" type="data" format="txt" help="A text file of SAM headers to include in the output file, usually corresponding to the exon and coding sequences used." label="SAM Headers" optional="true" />
+    </section>
+  </inputs>
+  <outputs>
+    <data format="sam" name="output" label="${input.name.rsplit('.',1)[0]}.sam"/>
+  </outputs>
+  <!--<tests>
+    <test>
+      <param name="input" value="dbo_z_20101126_JK_Res_Sens_Set2_H1819_A_07-subset.idpDB.gz" />
+      <param name="scoreColumn" value="Myrimatch:MVH" />
+      <output name="output" file="dbo_z_20101126_JK_Res_Sens_Set2_H1819_A_07-subset.sam" />
+    </test>
+    <test>
+      <param name="input" value="Ellis_033_2700_261_07-unrefined-subset.idpDB.gz" />
+      <param name="scoreColumn" value="Myrimatch:MVH" />
+      <output name="output" file="Ellis_033_2700_261_07-unrefined-subset.sam" />
+    </test>
+  </tests>-->
+  <help>
+**Description**
+
+Generate SAM files from confident peptide-spectrum-matches (PSMs).
+</help>
+</tool>
\ No newline at end of file
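Note on the `<command>` template: a flag is only appended when the corresponding input is set, so the R script has to cope with missing options. The sketch below is not part of the changeset. It assumes the `optparse` package is installed, declares only a subset of the script's options, and uses placeholder file names to show what the parsed option list might look like for such a call.

```r
library(optparse)

# Same style of option declarations as PSM2SAM.R (subset shown).
option_list <- list(
    make_option("--passedPSM",  type = "character"),
    make_option("--XScolumn",   type = "character"),
    make_option("--exon_anno",  type = "character"),
    make_option("--OutputFile", type = "character")
)

# Hypothetical command line as Galaxy might render it when the optional
# exon annotation is left unset (file names are placeholders).
args <- c("--passedPSM=input.idpDB",
          "--XScolumn=MyriMatch:MVH",
          "--OutputFile=output.sam")

opt <- parse_args(OptionParser(option_list = option_list), args = args)

# optparse adds its own `help` entry, which is not a tool parameter.
# Options that have no default and were not supplied are expected to be absent.
params <- opt[names(opt) != "help"]
print(names(params))
```

The names left in `params` (for example `passedPSM`) are shorter than the formals of `PSMtab2SAM` (for example `passedPSM_file`). The `do.call` in the script still works because R's argument matching accepts unambiguous partial names.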
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md	Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,54 @@
+GalaxyP - PSM to SAM
+===================
+
+* Home: <https://github.com/galaxyproteomics/tools-galaxyp/>
+* Galaxy Tool Shed: <http://toolshed.g2.bx.psu.edu/view/galaxyp/PSMtoSAM>
+* Tool ID: `PSMtoSAM`
+
+
+Description
+-----------
+
+Generate proBAMr SAM files from IDPicker peptide-spectrum-matches.
+
+See:
+
+* <https://www.bioconductor.org/packages/release/bioc/html/proBAMr.html>
+
+
+GalaxyP Community
+-----------------
+
+Current governing community policies for [GalaxyP](https://github.com/galaxyproteomics/) and other information can be found at:
+
+<https://github.com/galaxyproteomics>
+
+
+License
+-------
+
+Copyright (c) 2015 Vanderbilt University and Authors listed below.
+
+To the extent possible under law, the author(s) have dedicated all copyright and related and neighboring rights to this software to the public domain worldwide. This software is distributed without any warranty.
+
+You should have received a copy of the CC0 Public Domain Dedication along with this software. If not, see <https://creativecommons.org/publicdomain/zero/1.0/>.
+
+You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
+
+
+Contributing
+------------
+
+Contributions to this repository are reviewed through pull requests. If you would like your work acknowledged, please also add yourself to the Authors section. If your pull request is accepted, you will also be acknowledged in <https://github.com/galaxyproteomics/tools-galaxyp/>.
+
+
+Authors
+-------
+
+Authors and contributors:
+
+* Matt Chambers <matt.chambers42@gmail.com>
+  Vanderbilt University Medical Center
+
+* Xiaojing Wang
+  Vanderbilt University Medical Center
Binary file tool-data/exon_anno.RData has changed
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/header_refseq_hg19.txt	Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,95 @@
+@HD	VN:1.0	SO:coordinate
+@SQ	SN:chr1	LN:249250621
+@SQ	SN:chr10	LN:135534747
+@SQ	SN:chr11	LN:135006516
+@SQ	SN:chr11_gl000202_random	LN:40103
+@SQ	SN:chr12	LN:133851895
+@SQ	SN:chr13	LN:115169878
+@SQ	SN:chr14	LN:107349540
+@SQ	SN:chr15	LN:102531392
+@SQ	SN:chr16	LN:90354753
+@SQ	SN:chr17	LN:81195210
+@SQ	SN:chr17_ctg5_hap1	LN:1680828
+@SQ	SN:chr17_gl000203_random	LN:37498
+@SQ	SN:chr17_gl000204_random	LN:81310
+@SQ	SN:chr17_gl000205_random	LN:174588
+@SQ	SN:chr17_gl000206_random	LN:41001
+@SQ	SN:chr18	LN:78077248
+@SQ	SN:chr18_gl000207_random	LN:4262
+@SQ	SN:chr19	LN:59128983
+@SQ	SN:chr19_gl000208_random	LN:92689
+@SQ	SN:chr19_gl000209_random	LN:159169
+@SQ	SN:chr1_gl000191_random	LN:106433
+@SQ	SN:chr1_gl000192_random	LN:547496
+@SQ	SN:chr2	LN:243199373
+@SQ	SN:chr20	LN:63025520
+@SQ	SN:chr21	LN:48129895
+@SQ	SN:chr21_gl000210_random	LN:27682
+@SQ	SN:chr22	LN:51304566
+@SQ	SN:chr3	LN:198022430
+@SQ	SN:chr4	LN:191154276
+@SQ	SN:chr4_ctg9_hap1	LN:590426
+@SQ	SN:chr4_gl000193_random	LN:189789
+@SQ	SN:chr4_gl000194_random	LN:191469
+@SQ	SN:chr5	LN:180915260
+@SQ	SN:chr6	LN:171115067
+@SQ	SN:chr6_apd_hap1	LN:4622290
+@SQ	SN:chr6_cox_hap2	LN:4795371
+@SQ	SN:chr6_dbb_hap3	LN:4610396
+@SQ	SN:chr6_mann_hap4	LN:4683263
+@SQ	SN:chr6_mcf_hap5	LN:4833398
+@SQ	SN:chr6_qbl_hap6	LN:4611984
+@SQ	SN:chr6_ssto_hap7	LN:4928567
+@SQ	SN:chr7	LN:159138663
+@SQ	SN:chr7_gl000195_random	LN:182896
+@SQ	SN:chr8	LN:146364022
+@SQ	SN:chr8_gl000196_random	LN:38914
+@SQ	SN:chr8_gl000197_random	LN:37175
+@SQ	SN:chr9	LN:141213431
+@SQ	SN:chr9_gl000198_random	LN:90085
+@SQ	SN:chr9_gl000199_random	LN:169874
+@SQ	SN:chr9_gl000200_random	LN:187035
+@SQ	SN:chr9_gl000201_random	LN:36148
+@SQ	SN:chrM	LN:16571
+@SQ	SN:chrUn_gl000211	LN:166566
+@SQ	SN:chrUn_gl000212	LN:186858
+@SQ	SN:chrUn_gl000213	LN:164239
+@SQ	SN:chrUn_gl000214	LN:137718
+@SQ	SN:chrUn_gl000215	LN:172545
+@SQ	SN:chrUn_gl000216	LN:172294
+@SQ	SN:chrUn_gl000217	LN:172149
+@SQ	SN:chrUn_gl000218	LN:161147
+@SQ	SN:chrUn_gl000219	LN:179198
+@SQ	SN:chrUn_gl000220	LN:161802
+@SQ	SN:chrUn_gl000221	LN:155397
+@SQ	SN:chrUn_gl000222	LN:186861
+@SQ	SN:chrUn_gl000223	LN:180455
+@SQ	SN:chrUn_gl000224	LN:179693
+@SQ	SN:chrUn_gl000225	LN:211173
+@SQ	SN:chrUn_gl000226	LN:15008
+@SQ	SN:chrUn_gl000227	LN:128374
+@SQ	SN:chrUn_gl000228	LN:129120
+@SQ	SN:chrUn_gl000229	LN:19913
+@SQ	SN:chrUn_gl000230	LN:43691
+@SQ	SN:chrUn_gl000231	LN:27386
+@SQ	SN:chrUn_gl000232	LN:40652
+@SQ	SN:chrUn_gl000233	LN:45941
+@SQ	SN:chrUn_gl000234	LN:40531
+@SQ	SN:chrUn_gl000235	LN:34474
+@SQ	SN:chrUn_gl000236	LN:41934
+@SQ	SN:chrUn_gl000237	LN:45867
+@SQ	SN:chrUn_gl000238	LN:39939
+@SQ	SN:chrUn_gl000239	LN:33824
+@SQ	SN:chrUn_gl000240	LN:41933
+@SQ	SN:chrUn_gl000241	LN:42152
+@SQ	SN:chrUn_gl000242	LN:43523
+@SQ	SN:chrUn_gl000243	LN:43341
+@SQ	SN:chrUn_gl000244	LN:39929
+@SQ	SN:chrUn_gl000245	LN:36651
+@SQ	SN:chrUn_gl000246	LN:38154
+@SQ	SN:chrUn_gl000247	LN:36422
+@SQ	SN:chrUn_gl000248	LN:39786
+@SQ	SN:chrUn_gl000249	LN:38502
+@SQ	SN:chrX	LN:155270560
+@SQ	SN:chrY	LN:59373566
+@PG	ID:proBAM
Binary file tool-data/procodingseq.RData has changed
Binary file tool-data/proseq.RData has changed
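Note on `header_refseq_hg19.txt`: every mapped record's `RNAME` should match an `SN:` entry in the `@SQ` lines of the header. The sketch below is not part of the changeset. It parses a few lines copied from the header into a named vector of reference lengths and checks some chromosome names against it. The `rnames` values are hypothetical.

```r
# A few lines copied from header_refseq_hg19.txt (fields are tab-separated).
header <- c("@HD\tVN:1.0\tSO:coordinate",
            "@SQ\tSN:chr1\tLN:249250621",
            "@SQ\tSN:chrM\tLN:16571",
            "@PG\tID:proBAM")

# Keep only the @SQ lines and split each one into its fields.
sq <- strsplit(header[startsWith(header, "@SQ")], "\t", fixed = TRUE)

# Build a vector of reference lengths named by reference sequence.
ref_len <- vapply(sq, function(f) as.numeric(sub("^LN:", "", f[3])), numeric(1))
names(ref_len) <- vapply(sq, function(f) sub("^SN:", "", f[2]), character(1))

# Hypothetical RNAME values taken from alignment records.
rnames <- c("chr1", "chrM", "chr1_unknown")
print(rnames %in% names(ref_len))
```

A check like this could be run on user-supplied exon annotations, whose chromosome names may not match the default hg19 header.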