changeset 0:c506e5dac2bb draft
planemo upload for repository https://github.com/galaxyproteomics/tools-galaxyp/tools/bumbershoot/psm_to_sam commit b37186806a83fb3a59a1b4ccb1d44667d5224277-dirty
author | galaxyp |
---|---|
date | Fri, 02 Oct 2015 14:14:15 -0400 |
parents | |
children | 34f9e847dd4e |
files | COPYING PSM2SAM.R PSM2SAM.xml README.md tool-data/exon_anno.RData tool-data/header_refseq_hg19.txt tool-data/procodingseq.RData tool-data/proseq.RData |
diffstat | 8 files changed, 731 insertions(+), 0 deletions(-) |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/COPYING Fri Oct 02 14:14:15 2015 -0400 @@ -0,0 +1,121 @@ +Creative Commons Legal Code + +CC0 1.0 Universal + + CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE + LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN + ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS + INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES + REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS + PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM + THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED + HEREUNDER. + +Statement of Purpose + +The laws of most jurisdictions throughout the world automatically confer +exclusive Copyright and Related Rights (defined below) upon the creator +and subsequent owner(s) (each and all, an "owner") of an original work of +authorship and/or a database (each, a "Work"). + +Certain owners wish to permanently relinquish those rights to a Work for +the purpose of contributing to a commons of creative, cultural and +scientific works ("Commons") that the public can reliably and without fear +of later claims of infringement build upon, modify, incorporate in other +works, reuse and redistribute as freely as possible in any form whatsoever +and for any purposes, including without limitation commercial purposes. +These owners may contribute to the Commons to promote the ideal of a free +culture and the further production of creative, cultural and scientific +works, or to gain reputation or greater distribution for their Work in +part through the use and efforts of others. 
+ +For these and/or other purposes and motivations, and without any +expectation of additional consideration or compensation, the person +associating CC0 with a Work (the "Affirmer"), to the extent that he or she +is an owner of Copyright and Related Rights in the Work, voluntarily +elects to apply CC0 to the Work and publicly distribute the Work under its +terms, with knowledge of his or her Copyright and Related Rights in the +Work and the meaning and intended legal effect of CC0 on those rights. + +1. Copyright and Related Rights. A Work made available under CC0 may be +protected by copyright and related or neighboring rights ("Copyright and +Related Rights"). Copyright and Related Rights include, but are not +limited to, the following: + + i. the right to reproduce, adapt, distribute, perform, display, + communicate, and translate a Work; + ii. moral rights retained by the original author(s) and/or performer(s); +iii. publicity and privacy rights pertaining to a person's image or + likeness depicted in a Work; + iv. rights protecting against unfair competition in regards to a Work, + subject to the limitations in paragraph 4(a), below; + v. rights protecting the extraction, dissemination, use and reuse of data + in a Work; + vi. database rights (such as those arising under Directive 96/9/EC of the + European Parliament and of the Council of 11 March 1996 on the legal + protection of databases, and under any national implementation + thereof, including any amended or successor version of such + directive); and +vii. other similar, equivalent or corresponding rights throughout the + world based on applicable law or treaty, and any national + implementations thereof. + +2. Waiver. 
To the greatest extent permitted by, but not in contravention +of, applicable law, Affirmer hereby overtly, fully, permanently, +irrevocably and unconditionally waives, abandons, and surrenders all of +Affirmer's Copyright and Related Rights and associated claims and causes +of action, whether now known or unknown (including existing as well as +future claims and causes of action), in the Work (i) in all territories +worldwide, (ii) for the maximum duration provided by applicable law or +treaty (including future time extensions), (iii) in any current or future +medium and for any number of copies, and (iv) for any purpose whatsoever, +including without limitation commercial, advertising or promotional +purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each +member of the public at large and to the detriment of Affirmer's heirs and +successors, fully intending that such Waiver shall not be subject to +revocation, rescission, cancellation, termination, or any other legal or +equitable action to disrupt the quiet enjoyment of the Work by the public +as contemplated by Affirmer's express Statement of Purpose. + +3. Public License Fallback. Should any part of the Waiver for any reason +be judged legally invalid or ineffective under applicable law, then the +Waiver shall be preserved to the maximum extent permitted taking into +account Affirmer's express Statement of Purpose. 
In addition, to the +extent the Waiver is so judged Affirmer hereby grants to each affected +person a royalty-free, non transferable, non sublicensable, non exclusive, +irrevocable and unconditional license to exercise Affirmer's Copyright and +Related Rights in the Work (i) in all territories worldwide, (ii) for the +maximum duration provided by applicable law or treaty (including future +time extensions), (iii) in any current or future medium and for any number +of copies, and (iv) for any purpose whatsoever, including without +limitation commercial, advertising or promotional purposes (the +"License"). The License shall be deemed effective as of the date CC0 was +applied by Affirmer to the Work. Should any part of the License for any +reason be judged legally invalid or ineffective under applicable law, such +partial invalidity or ineffectiveness shall not invalidate the remainder +of the License, and in such case Affirmer hereby affirms that he or she +will not (i) exercise any of his or her remaining Copyright and Related +Rights in the Work or (ii) assert any associated claims and causes of +action with respect to the Work, in either case contrary to Affirmer's +express Statement of Purpose. + +4. Limitations and Disclaimers. + + a. No trademark or patent rights held by Affirmer are waived, abandoned, + surrendered, licensed or otherwise affected by this document. + b. Affirmer offers the Work as-is and makes no representations or + warranties of any kind concerning the Work, express, implied, + statutory or otherwise, including without limitation warranties of + title, merchantability, fitness for a particular purpose, non + infringement, or the absence of latent or other defects, accuracy, or + the present or absence of errors, whether or not discoverable, all to + the greatest extent permissible under applicable law. + c. 
Affirmer disclaims responsibility for clearing rights of other persons + that may apply to the Work or any use thereof, including without + limitation any person's Copyright and Related Rights in the Work. + Further, Affirmer disclaims responsibility for obtaining any necessary + consents, permissions or other rights required for any use of the + Work. + d. Affirmer understands and acknowledges that Creative Commons is not a + party to this document and has no duty or obligation with respect to + this CC0 or use of the Work.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/PSM2SAM.R Fri Oct 02 14:14:15 2015 -0400 @@ -0,0 +1,400 @@ +#!/usr/bin/env Rscript + +## begin warning handler +withCallingHandlers({ + +library(methods) # Because Rscript does not always do this + +options('useFancyQuotes' = FALSE) + +suppressPackageStartupMessages(library("optparse")) +suppressPackageStartupMessages(library("RGalaxy")) + + +option_list <- list() + +option_list$passedPSM <- make_option('--passedPSM', type='character') +option_list$XScolumn <- make_option('--XScolumn', type='character') +option_list$exon_anno <- make_option('--exon_anno', type='character') +option_list$proteinseq <- make_option('--proteinseq', type='character') +option_list$procodingseq <- make_option('--procodingseq', type='character') +option_list$header <- make_option('--header', type='character') +option_list$OutputFile <- make_option('--OutputFile', type='character') + + +opt <- parse_args(OptionParser(option_list=option_list)) + + + + +PSMtab2SAM <- function( + passedPSM_file = GalaxyInputFile(required=TRUE), + exon_anno_file = GalaxyInputFile(), + proteinseq_file = GalaxyInputFile(), + procodingseq_file = GalaxyInputFile(), + header_file = GalaxyInputFile(), + XScolumn = GalaxyCharacterParam(required=TRUE), + OutputFile = GalaxyOutput("proSAM","sam")) +{ + if (length(exon_anno_file) == 0) { exon_anno_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/exon_anno.RData" } + if (length(proteinseq_file) == 0) { proteinseq_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/proseq.RData" } + if (length(procodingseq_file) == 0) { procodingseq_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/procodingseq.RData" } + if (length(header_file) == 0) { header_file = "/export/src/tools-galaxyp-chambm/tools/bumbershoot/psm_to_sam/tool-data/header_refseq_hg19.txt" } + + if (!file.exists(header_file)) { gstop("failed to read header file") } + if 
(file.exists(OutputFile)) + { + if (file.info(OutputFile)$size > 0) { gstop("output file already exists") } + else + { + tryCatch( + { + file.remove(OutputFile) + }, error=function(err) + { + gstop("failed to read empty existing file") + }) + } + } + + suppressPackageStartupMessages(library(GenomicRanges)) + suppressPackageStartupMessages(library(Biostrings)) + + options(stringsAsFactors=FALSE) + + scoreName = XScolumn + columnName = gsub(":", "_", scoreName) + + passedPSM <- tryCatch({ + #read.delim(passedPSM_file, row.names=1) + suppressPackageStartupMessages(library(RSQLite)) + drv <- dbDriver("SQLite") + con <- dbConnect(drv, passedPSM_file) + + # do case-insensitive search for the score name + res <- dbSendQuery(con, paste("SELECT Id, Name FROM PeptideSpectrumMatchScoreName WHERE lower(Name)=lower('", scoreName, "')", sep="")) + scoreInfo = fetch(res, n=1) + scoreId = scoreInfo["Id"] + realScoreName = scoreInfo["Name"] # original case + dbClearResult(res) + + sql <- paste("SELECT ss.Name as SourceName, s.NativeID", + ", psm.ObservedNeutralMass AS precursor_neutral_mass", + ", psm.Charge AS assumed_charge", + ", s.ScanTimeInSeconds AS retention_time_sec", + ", psm.Rank AS hit_rank", + ", IFNULL(SUBSTR(pd.Sequence, pi.Offset+1, pi.Length), pep.DecoySequence) AS peptide", + ", pro.Accession AS protein", + ", COUNT(DISTINCT pro.Id) AS num_tot_proteins", + ", psm.ObservedNeutralMass - (psm.MonoisotopicMassError - ROUND(psm.MonoisotopicMassError) * 1.0026) AS calc_neutral_pep_mass", + ", psm.MonoisotopicMassError - ROUND(psm.MonoisotopicMassError) * 1.0026 AS massdiff", + ", CTerminusIsSpecific+NTerminusIsSpecific AS num_tol_term", + ", MissedCleavages AS num_missed_cleavages", + ", psm.QValue AS qvalue", + ", psmScore.Value AS ", columnName, + ", GROUP_CONCAT(DISTINCT pm.Offset || ';' || mod.MonoMassDelta) AS modification", + "FROM PeptideSpectrumMatch psm", + "JOIN Spectrum s ON psm.Spectrum=s.Id", + "JOIN SpectrumSource ss ON s.Source=ss.Id", + "JOIN 
PeptideInstance pi ON psm.Peptide=pi.Peptide", + "JOIN Protein pro ON pi.Protein=pro.Id", + "JOIN Peptide pep ON pi.Peptide=pep.Id", + "JOIN PeptideSpectrumMatchScore psmScore ON psmScore.PsmId=psm.Id AND ScoreNameId=", scoreId, + "LEFT JOIN ProteinData pd ON pro.Id=pd.Id", + "LEFT JOIN PeptideModification pm ON psm.Id=pm.PeptideSpectrumMatch", + "LEFT JOIN Modification mod ON pm.Modification=mod.Id", + "GROUP BY psm.Id" + ) + + res <- dbSendQuery(con, sql) + passedPSM <- fetch(res, n=-1) + dbClearResult(res) + dbDisconnect(con) + passedPSM + }, error=function(err) { + gstop("failed to read passedPSM: ", err) + }) + + tryCatch({ + load(exon_anno_file) + exon_anno <- exon + }, error=function(err) { + gstop("failed to read exon_anno: ", conditionMessage(err)) + }) + + tryCatch({ + load(proteinseq_file) + }, error=function(err) { + gstop("failed to read proteinseq: ", conditionMessage(err)) + }) + + tryCatch({ + load(procodingseq_file) + }, error=function(err) { + gstop("failed to read procodingseq: ", conditionMessage(err)) + }) + + PEP <- passedPSM[, 'peptide'] + #Spectrumid <- paste(passedPSM[, 'SourceName'], gsub(" ", "_", passedPSM[, 'NativeID']), sep="/") + Spectrumid <- paste(passedPSM[, 'SourceName'], passedPSM[, 'NativeID'], sep="/") + #PEP_SEQ <- formatPep(spectra[, 'Sequence']) + + SAM <- c() + + spectrumcount <- table(Spectrumid) + + for(i in 1:dim(passedPSM)[1]){ + #print(i) + peptide <- PEP[i] + QNAME <- Spectrumid[i] + idx <- grep(peptide, proteinseq[, 'peptide'], fixed=TRUE) + if(length(idx) == 0){ + RNAME <- '*' + MAPQ <- 255 + RNEXT <- '*' + PNEXT <- 0 + TLEN <- 0 + QUAL <- '*' + POS <- 0 + SEQ <- '*' + CIGAR <- '*' + FLAG <- 0x4 + annoted <- '?' 
+ XA <- paste('XA:Z:', annoted, sep='') + res <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, + as.character(SEQ), QUAL, XA) + res <- unique(data.frame(t(res))) + }else{ + pro <- proteinseq[idx, ] + sta_pos <- unlist(lapply(pro[, 'peptide'], function(x) + regexpr(peptide, x, fixed=TRUE))) + pep_len <- nchar(peptide) + end_pos <- sta_pos + pep_len - 1 + + coding <- procodingseq[match(pro[, 'pro_name'], + procodingseq[, 'pro_name']), ] + code_s <- (sta_pos-1) * 3 + 1 + code_e <- end_pos * 3 + codingseq <- substring(coding[, 'coding'], code_s, code_e) + + + + exonp <- lapply(pro[, 'tx_name'], function(x) + exon_anno[exon_anno[, 'tx_name']==x, ]) + + exonp <- lapply(exonp, function(x){ + if(length(unique(x[, 'tx_id'])) > 1){ + x[grep(x[1, 'tx_id'], x[, 'tx_id'], + fixed=TRUE), ] + }else x + }) + + if(passedPSM[i, 'hit_rank'] == 1) pri <- TRUE else pri <- FALSE + + res <- mapply(function(x, y, z, m) + if(dim(z)[1] == 0){ + .proteinUnannotated(x, y, z, m, primary=pri) + }else{ + if((nchar(m) != 3*pep_len) | (y > max(z[, 'cds_end'], + na.rm = TRUE))){ + #if(toString(translate(DNAString(m))) != peptide){ + .peptideUnannotated(x, y, z, m, primary=pri) + }else{ + .MapCoding2genome(x, y, z, m, primary=pri) + } + }, + code_s, code_e, exonp, codingseq) + + res <- unique(data.frame(t(res))) + + } + XL <- paste('XL:i:', as.numeric(spectrumcount[QNAME]), sep='') + NH <- paste('NH:i:', dim(res)[1], sep='') + XP <- paste('XP:Z:', peptide, sep='') + #XF <- paste('XF:f:', round(passedPSM[i, XFcolumn], digits=4), sep='') + XC <- paste('XC:i:', passedPSM[i, 'assumed_charge'], sep='') + XS <- paste('XS:f:', round(as.numeric(passedPSM[i, columnName]), + digits=4), sep='') + #XA <- paste('XA:Z:', annoted, sep='') + XN <- paste('XN:i:', passedPSM[i, 'num_missed_cleavages'], sep='') + XT <- paste('XT:i:', passedPSM[i, 'num_tol_term'], sep='') + + XM <- ifelse(is.na(passedPSM[i, 'modification']), paste('XM:Z:-'), + paste('XM:Z:', passedPSM[i, 'modification'], sep='')) + + res <- 
cbind(QNAME, res, NH, XL, XP, XC, XS, XM, XN, XT) + SAM <- rbind(SAM, res) + } + + file.copy(header_file, OutputFile) + write.table(SAM, file=OutputFile, sep='\t', quote=FALSE, row.names=FALSE, col.names=FALSE, append=TRUE) +} + + + +.proteinUnannotated <-function(c_sta, c_end, exon_anno, cseq, primary=TRUE, ...) +{ + RNAME <- '*' + MAPQ <- 255 + RNEXT <- '*' + PNEXT <- 0 + TLEN <- 0 + QUAL <- '*' + POS <- 0 + SEQ <- '*' + CIGAR <- '*' + annoted <- 2 + XA <- paste('XA:Z:', annoted, sep='') + FLAG <- 0x4 + + tmp <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, + as.character(SEQ), QUAL, XA) + tmp +} + + + +.peptideUnannotated <- function(c_sta, c_end, exon_anno, cseq, primary=TRUE, ...) +{ + #RNAME <- as.character(exon_anno[1, 'chromosome_name']) + RNAME <- '*' + MAPQ <- 255 + RNEXT <- '*' + PNEXT <- 0 + TLEN <- 0 + QUAL <- '*' + POS <- 0 + SEQ <- '*' + CIGAR <- '*' + annoted <- 1 + XA <- paste('XA:Z:', annoted, sep='') + FLAG <- 0x4 + #if(exon_anno[1, 'strand'] == '+'){ + # FLAG <- ifelse(primary==TRUE, 0x00, 0x00+0x100) + #}else{ + # FLAG <- ifelse(primary==TRUE, 0x10, 0x10+0x100) + #} + + tmp <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, + as.character(SEQ), QUAL, XA) + tmp +} + + + +.MapCoding2genome <- function(c_sta, c_end, exon_anno, cseq, primary=TRUE, ...) 
+{ + idxs <- intersect(which(exon_anno[, 'cds_start'] <= c_sta), + which(exon_anno[, 'cds_end'] >= c_sta)) + idxe <- intersect(which(exon_anno[, 'cds_start'] <= c_end), + which(exon_anno[, 'cds_end'] >= c_end)) + len <- c_end - c_sta + 1 + RNAME <- as.character(exon_anno[1, 'chromosome_name']) + MAPQ <- 255 + RNEXT <- '*' + PNEXT <- 0 + TLEN <- 0 + QUAL <- '*' + annoted <- 0 + XA <- paste('XA:Z:', annoted, sep='') + + if(exon_anno[1, 'strand'] == '+'){ + POS <- exon_anno[idxs, 'cds_chr_start'] + c_sta - + exon_anno[idxs, 'cds_start'] + SEQ <- DNAString(cseq) + FLAG <- ifelse(primary==TRUE, 0x00, 0x00+0x100) + + }else{ + POS <- exon_anno[idxe, 'cds_chr_start'] + + exon_anno[idxe, 'cds_end'] - c_end + SEQ <- reverseComplement(DNAString(cseq)) + FLAG <- ifelse(primary==TRUE, 0x10, 0x10 + 0x100) + } + + if(idxe == idxs){ + CIGAR <- paste(len, 'M', sep='') + }else{ + if(exon_anno[1, 'strand'] == '+'){ + #insert <- exon_anno[idxe, 'cds_chr_start'] - exon_anno[idxs, + # 'cds_chr_end']- 1 + part1 <- exon_anno[idxs, 'cds_end'] - c_sta + 1 + part2 <- c_end - exon_anno[idxe, 'cds_start'] + 1 + + insert <- unlist(lapply(1:(idxe - idxs), function(x) + paste(exon_anno[idxs + x, 'cds_chr_start'] - + exon_anno[idxs+x-1, 'cds_chr_end']- 1, 'N', sep=''))) + if(idxe-idxs >1){ + innerexon <- unlist(lapply(1:(idxe-idxs-1), function(x) + paste(exon_anno[idxs + x, 'cds_chr_end'] - + exon_anno[idxs + x, 'cds_chr_start'] + 1, 'M', + sep=''))) + }else{ innerexon <- ''} + + #ifelse(idxe-idxs >1, unlist(lapply(1:(idxe-idxs-1), function(x) + #paste(exon_anno[idxs+x, 'cds_chr_end'] - + # exon_anno[idxs+x, 'cds_chr_start']+1, + # 'M', sep=''))), '') + midpattern <- rep(NA, length(insert) + length(innerexon)) + midpattern[seq(1, length(insert) + length(innerexon), + by=2)] <- insert + midpattern[seq(2, length(insert) + length(innerexon), + by=2)] <- innerexon + midpattern <- paste(midpattern, collapse='') + + }else{ + #insert <- exon_anno[idxs, 'cds_chr_start'] - + # exon_anno[idxe, 
'cds_chr_end']- 1 + part1 <- c_end- exon_anno[idxe, 'cds_start'] + 1 + part2 <- exon_anno[idxs, 'cds_end'] - c_sta + 1 + + insert <- unlist(lapply(1:(idxe-idxs), function(x) + paste(exon_anno[idxe - x, 'cds_chr_start'] - + exon_anno[idxe-x + 1, 'cds_chr_end']- 1, + 'N', sep=''))) + if(idxe-idxs >1){ + innerexon <- unlist(lapply(1:(idxe-idxs-1), function(x) + paste(exon_anno[idxe-x, 'cds_chr_end'] - + exon_anno[idxe-x, 'cds_chr_start']+1, 'M', sep=''))) + }else{ innerexon <-''} + + midpattern <- rep(NA, length(insert)+length(innerexon)) + midpattern[seq(1, length(insert) + length(innerexon), + by=2)] <- insert + midpattern[seq(2, length(insert) + length(innerexon), + by=2)] <- innerexon + midpattern <- paste(midpattern, collapse='') + + } + + CIGAR <- paste(part1, 'M', midpattern, part2, 'M', sep='') + } + + tmp <- c(FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, + as.character(SEQ), QUAL, XA) + tmp +} + + +params <- list() +for(param in names(opt)) +{ + if (!param == "help") + params[param] <- opt[param] +} + +setClass("GalaxyRemoteError", contains="character") +wrappedFunction <- function(f) +{ + tryCatch(do.call(f, params), + error=function(e) new("GalaxyRemoteError", conditionMessage(e))) +} + + +suppressPackageStartupMessages(library(RGalaxy)) +do.call(PSMtab2SAM, params) + +## end warning handler +}, warning = function(w) { + cat(paste("Warning:", conditionMessage(w), "\n")) + invokeRestart("muffleWarning") +})
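The coordinate arithmetic in `PSMtab2SAM` and `.MapCoding2genome` above can be illustrated with a much smaller sketch. It assumes a plus-strand transcript only, and the exon table, its coordinates, and the helper name `peptide_to_cigar` are hypothetical values made up for illustration; only the column names and the formulas for `code_s`, `code_e`, `POS`, and the `M`/`N` CIGAR segments follow the script. It is not a replacement for the script's own function.

```r
# Hypothetical two-exon, plus-strand transcript (coordinates are made up).
exons <- data.frame(
  cds_start     = c(1, 31),       # exon start within the coding sequence
  cds_end       = c(30, 90),      # exon end within the coding sequence
  cds_chr_start = c(1000, 2000),  # genomic start of the exon's coding part
  cds_chr_end   = c(1029, 2059)   # genomic end of the exon's coding part
)

# Simplified plus-strand version of the mapping: peptide position in the
# protein -> coding-sequence range -> genomic POS and CIGAR.
peptide_to_cigar <- function(sta_pos, pep_len, exons) {
  end_pos <- sta_pos + pep_len - 1
  code_s <- (sta_pos - 1) * 3 + 1   # first base of the first codon
  code_e <- end_pos * 3             # last base of the last codon

  # Exons that contain the first and last coding base of the peptide.
  idxs <- which(exons$cds_start <= code_s & exons$cds_end >= code_s)
  idxe <- which(exons$cds_start <= code_e & exons$cds_end >= code_e)

  pos <- exons$cds_chr_start[idxs] + code_s - exons$cds_start[idxs]

  if (idxs == idxe) {
    cigar <- paste0(code_e - code_s + 1, "M")
  } else {
    # Matched bases in the first exon, then alternating intron (N) and
    # exon (M) segments up to the exon that holds the peptide's last base.
    parts <- paste0(exons$cds_end[idxs] - code_s + 1, "M")
    for (k in seq(idxs + 1, idxe)) {
      gap <- exons$cds_chr_start[k] - exons$cds_chr_end[k - 1] - 1
      if (k == idxe) {
        len <- code_e - exons$cds_start[k] + 1
      } else {
        len <- exons$cds_chr_end[k] - exons$cds_chr_start[k] + 1
      }
      parts <- c(parts, paste0(gap, "N"), paste0(len, "M"))
    }
    cigar <- paste(parts, collapse = "")
  }
  list(code_s = code_s, code_e = code_e, pos = pos, cigar = cigar)
}

# A 4-residue peptide starting at residue 9 covers coding bases 25..36,
# which straddle the exon boundary at coding base 30/31.
res <- peptide_to_cigar(sta_pos = 9, pep_len = 4, exons = exons)
print(res$pos)    # 1024
print(res$cigar)  # "6M970N6M"
```

The `M` lengths in the CIGAR sum to three times the peptide length (here 6 + 6 = 12), which mirrors the `nchar(m) != 3*pep_len` sanity check the script applies before mapping.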
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/PSM2SAM.xml Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,61 @@
+<tool id="PSMtoSAM" name="PSM to SAM" version="1.3.2">
+    <description>Generate SAM files from PSMs.</description>
+    <command interpreter="Rscript --vanilla">PSM2SAM.R
+        #if str($input).strip() != "":
+            --passedPSM="$input"
+        #end if
+        #if str($scoreColumn).strip() != "":
+            --XScolumn="$scoreColumn"
+        #end if
+        #if str($optionalUserInput.exonAnno).strip() != "None":
+            --exon_anno="$optionalUserInput.exonAnno"
+        #end if
+        #if str($optionalUserInput.proteinSeq).strip() != "None":
+            --proteinseq="$optionalUserInput.proteinSeq"
+        #end if
+        #if str($optionalUserInput.proCodingSeq).strip() != "None":
+            --procodingseq="$optionalUserInput.proCodingSeq"
+        #end if
+        #if str($optionalUserInput.header).strip() != "None":
+            --header="$optionalUserInput.header"
+        #end if
+        #if str($output).strip() != "":
+            --OutputFile="$output"
+        #end if
+
+2>&1</command>
+    <inputs>
+        <param name="input" type="data" format="idpdb" help="An IDPicker idpDB file to convert to SAM" label="Input PSMs">
+            <validator type="empty_field" message="This field is required."/>
+        </param>
+        <param name="scoreColumn" type="text" help="The name of a PSM score to include in the SAM output (e.g. "MyriMatch:mvh")" size="60" label="Score Name">
+            <validator type="empty_field" message="This field is required."/>
+        </param>
+        <section name="optionalUserInput" label="Override Default Exon Annotation and Coding Sequences">
+            <param name="exonAnno" type="data" format="RData" help="A dataframe of exon annotations in an RData file" label="Exon Annotations" optional="true" />
+            <param name="proteinSeq" type="data" format="RData" help="A dataframe containing protein ids and protein sequences in an RData file" label="Protein Sequences" optional="true" />
+            <param name="proCodingSeq" type="data" format="RData" help="A dataframe cotaining coding sequences for each protein in an RData file" label="Protein Coding Sequences" optional="true" />
+            <param name="header" type="data" format="txt" help="A text file of SAM headers to include in the output file, usually corresponding to the exon and coding sequences used." label="SAM Headers" optional="true" />
+        </section>
+    </inputs>
+    <outputs>
+        <data format="sam" name="output" label="${input.name.rsplit('.',1)[0]}.sam"/>
+    </outputs>
+    <!--<tests>
+        <test>
+            <param name="input" value="dbo_z_20101126_JK_Res_Sens_Set2_H1819_A_07-subset.idpDB.gz" />
+            <param name="scoreColumn" value="Myrimatch:MVH" />
+            <output name="output" file="dbo_z_20101126_JK_Res_Sens_Set2_H1819_A_07-subset.sam" />
+        </test>
+        <test>
+            <param name="input" value="Ellis_033_2700_261_07-unrefined-subset.idpDB.gz" />
+            <param name="scoreColumn" value="Myrimatch:MVH" />
+            <output name="output" file="Ellis_033_2700_261_07-unrefined-subset.sam" />
+        </test>
+    </tests>-->
+    <help>
+**Description**
+
+Generate SAM files from confident peptide-spectrum-matches (PSMs).
+</help>
+</tool>
\ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Fri Oct 02 14:14:15 2015 -0400
@@ -0,0 +1,54 @@
+GalaxyP - PSM to SAM
+===================
+
+* Home: <https://github.com/galaxyproteomics/tools-galaxyp/>
+* Galaxy Tool Shed: <http://toolshed.g2.bx.psu.edu/view/galaxyp/PSMtoSAM>
+* Tool ID: `PSMtoSAM`
+
+
+Description
+-----------
+
+Generate proBAMr SAM files from IDPicker peptide-spectrum-matches.
+
+See:
+
+* <https://www.bioconductor.org/packages/release/bioc/html/proBAMr.html>
+
+
+GalaxyP Community
+-----------------
+
+Current governing community policies for [GalaxyP](https://github.com/galaxyproteomics/) and other information can be found at:
+
+<https://github.com/galaxyproteomics>
+
+
+License
+-------
+
+Copyright (c) 2015 Vanderbilt University and Authors listed below.
+
+To the extent possible under law, the author(s) have dedicated all copyright and related and neighboring rights to this software to the public domain worldwide. This software is distributed without any warranty.
+
+You should have received a copy of the CC0 Public Domain Dedication along with this software. If not, see <https://creativecommons.org/publicdomain/zero/1.0/>.
+
+You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
+
+
+Contributing
+------------
+
+Contributions to this repository are reviewed through pull requests. If you would like your work acknowledged, please also add yourself to the Authors section. If your pull request is accepted, you will also be acknowledged in <https://github.com/galaxyproteomics/tools-galaxyp/>
+
+
+Authors
+-------
+
+Authors and contributors:
+
+* Matt Chambers <matt.chambers42@gmail.com>
+    Vanderbilt University Medical Center
+
+* Xiaojing Wang
+    Vanderbilt University Medical Center
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool-data/header_refseq_hg19.txt Fri Oct 02 14:14:15 2015 -0400 @@ -0,0 +1,95 @@ +@HD VN:1.0 SO:coordinate +@SQ SN:chr1 LN:249250621 +@SQ SN:chr10 LN:135534747 +@SQ SN:chr11 LN:135006516 +@SQ SN:chr11_gl000202_random LN:40103 +@SQ SN:chr12 LN:133851895 +@SQ SN:chr13 LN:115169878 +@SQ SN:chr14 LN:107349540 +@SQ SN:chr15 LN:102531392 +@SQ SN:chr16 LN:90354753 +@SQ SN:chr17 LN:81195210 +@SQ SN:chr17_ctg5_hap1 LN:1680828 +@SQ SN:chr17_gl000203_random LN:37498 +@SQ SN:chr17_gl000204_random LN:81310 +@SQ SN:chr17_gl000205_random LN:174588 +@SQ SN:chr17_gl000206_random LN:41001 +@SQ SN:chr18 LN:78077248 +@SQ SN:chr18_gl000207_random LN:4262 +@SQ SN:chr19 LN:59128983 +@SQ SN:chr19_gl000208_random LN:92689 +@SQ SN:chr19_gl000209_random LN:159169 +@SQ SN:chr1_gl000191_random LN:106433 +@SQ SN:chr1_gl000192_random LN:547496 +@SQ SN:chr2 LN:243199373 +@SQ SN:chr20 LN:63025520 +@SQ SN:chr21 LN:48129895 +@SQ SN:chr21_gl000210_random LN:27682 +@SQ SN:chr22 LN:51304566 +@SQ SN:chr3 LN:198022430 +@SQ SN:chr4 LN:191154276 +@SQ SN:chr4_ctg9_hap1 LN:590426 +@SQ SN:chr4_gl000193_random LN:189789 +@SQ SN:chr4_gl000194_random LN:191469 +@SQ SN:chr5 LN:180915260 +@SQ SN:chr6 LN:171115067 +@SQ SN:chr6_apd_hap1 LN:4622290 +@SQ SN:chr6_cox_hap2 LN:4795371 +@SQ SN:chr6_dbb_hap3 LN:4610396 +@SQ SN:chr6_mann_hap4 LN:4683263 +@SQ SN:chr6_mcf_hap5 LN:4833398 +@SQ SN:chr6_qbl_hap6 LN:4611984 +@SQ SN:chr6_ssto_hap7 LN:4928567 +@SQ SN:chr7 LN:159138663 +@SQ SN:chr7_gl000195_random LN:182896 +@SQ SN:chr8 LN:146364022 +@SQ SN:chr8_gl000196_random LN:38914 +@SQ SN:chr8_gl000197_random LN:37175 +@SQ SN:chr9 LN:141213431 +@SQ SN:chr9_gl000198_random LN:90085 +@SQ SN:chr9_gl000199_random LN:169874 +@SQ SN:chr9_gl000200_random LN:187035 +@SQ SN:chr9_gl000201_random LN:36148 +@SQ SN:chrM LN:16571 +@SQ SN:chrUn_gl000211 LN:166566 +@SQ SN:chrUn_gl000212 LN:186858 +@SQ SN:chrUn_gl000213 LN:164239 +@SQ SN:chrUn_gl000214 LN:137718 +@SQ SN:chrUn_gl000215 
LN:172545 +@SQ SN:chrUn_gl000216 LN:172294 +@SQ SN:chrUn_gl000217 LN:172149 +@SQ SN:chrUn_gl000218 LN:161147 +@SQ SN:chrUn_gl000219 LN:179198 +@SQ SN:chrUn_gl000220 LN:161802 +@SQ SN:chrUn_gl000221 LN:155397 +@SQ SN:chrUn_gl000222 LN:186861 +@SQ SN:chrUn_gl000223 LN:180455 +@SQ SN:chrUn_gl000224 LN:179693 +@SQ SN:chrUn_gl000225 LN:211173 +@SQ SN:chrUn_gl000226 LN:15008 +@SQ SN:chrUn_gl000227 LN:128374 +@SQ SN:chrUn_gl000228 LN:129120 +@SQ SN:chrUn_gl000229 LN:19913 +@SQ SN:chrUn_gl000230 LN:43691 +@SQ SN:chrUn_gl000231 LN:27386 +@SQ SN:chrUn_gl000232 LN:40652 +@SQ SN:chrUn_gl000233 LN:45941 +@SQ SN:chrUn_gl000234 LN:40531 +@SQ SN:chrUn_gl000235 LN:34474 +@SQ SN:chrUn_gl000236 LN:41934 +@SQ SN:chrUn_gl000237 LN:45867 +@SQ SN:chrUn_gl000238 LN:39939 +@SQ SN:chrUn_gl000239 LN:33824 +@SQ SN:chrUn_gl000240 LN:41933 +@SQ SN:chrUn_gl000241 LN:42152 +@SQ SN:chrUn_gl000242 LN:43523 +@SQ SN:chrUn_gl000243 LN:43341 +@SQ SN:chrUn_gl000244 LN:39929 +@SQ SN:chrUn_gl000245 LN:36651 +@SQ SN:chrUn_gl000246 LN:38154 +@SQ SN:chrUn_gl000247 LN:36422 +@SQ SN:chrUn_gl000248 LN:39786 +@SQ SN:chrUn_gl000249 LN:38502 +@SQ SN:chrX LN:155270560 +@SQ SN:chrY LN:59373566 +@PG ID:proBAM
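The header file above supplies the `@SQ` reference names and lengths that `PSMtab2SAM` copies to the top of its output before appending records. In SAM, when `@SQ` lines are present, every mapped record's `RNAME` is expected to match one of the `SN` values, so a quick consistency check can be useful when substituting a custom header. The following is a minimal base-R sketch, assuming the header fields are tab-separated as the SAM format requires (the tabs appear as spaces in this listing); the sample lines are a small excerpt, and `get_tag` and `ref_len` are hypothetical names.

```r
# A few lines excerpted from header_refseq_hg19.txt, with tabs restored
# (an assumption: the listing above shows them as spaces).
header_lines <- c(
  "@HD\tVN:1.0\tSO:coordinate",
  "@SQ\tSN:chr1\tLN:249250621",
  "@SQ\tSN:chrM\tLN:16571",
  "@PG\tID:proBAM"
)

# Keep only the reference-sequence lines and split them into fields.
sq <- grep("^@SQ", header_lines, value = TRUE)
fields <- strsplit(sq, "\t", fixed = TRUE)

# Extract the value of a TAG:value field from one split header line.
get_tag <- function(f, tag) {
  prefix <- paste0(tag, ":")
  sub(paste0("^", prefix), "", f[startsWith(f, prefix)])
}

# Named vector: reference name -> reference length.
ref_len <- setNames(
  as.numeric(vapply(fields, get_tag, character(1), tag = "LN")),
  vapply(fields, get_tag, character(1), tag = "SN")
)

# Check whether a record's RNAME is declared in the header.
print("chr1" %in% names(ref_len))   # TRUE
print("chr23" %in% names(ref_len))  # FALSE
```

In practice `header_lines` would come from `readLines()` on whichever header file is passed to the tool.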