# HG changeset patch
# User peterjc
# Date 1478261254 14400
# Node ID bc263e94ea981ee5448795fe24722e04d320860c
# Parent  6b728365569a661d292bf0a1319505d8199dfc62
Uploaded v0.2.5, ignores blank names in tabular files, based on contribution from Gildas Le Corguillé

diff -r 6b728365569a -r bc263e94ea98 tools/seq_filter_by_id/README.rst
--- a/tools/seq_filter_by_id/README.rst	Tue May 17 05:53:52 2016 -0400
+++ b/tools/seq_filter_by_id/README.rst	Fri Nov 04 08:07:34 2016 -0400
@@ -89,6 +89,8 @@
 v0.2.3  - Ignore blank lines in ID file (contributed by Gildas Le Corguillé).
         - Defensive quoting of filenames etc in the command definition
           (internal change only).
+v0.2.4  - Corrected error message wording.
+v0.2.5  - Ignore empty names, common in R output (Gildas Le Corguillé).
 ======= ======================================================================
diff -r 6b728365569a -r bc263e94ea98 tools/seq_filter_by_id/seq_filter_by_id.py
--- a/tools/seq_filter_by_id/seq_filter_by_id.py	Tue May 17 05:53:52 2016 -0400
+++ b/tools/seq_filter_by_id/seq_filter_by_id.py	Fri Nov 04 08:07:34 2016 -0400
@@ -74,7 +74,7 @@
 options, args = parser.parse_args()

 if options.version:
-    print "v0.2.3"
+    print "v0.2.5"
     sys.exit(0)

 in_file = options.input
@@ -93,7 +93,7 @@
 if logic not in ["UNION", "INTERSECTION"]:
     sys.exit("Logic agrument should be 'UNION' or 'INTERSECTION', not %r" % logic)
 if options.id_list and args:
-    sys.exit("Cannot accepted IDs via both -t and as tabular files")
+    sys.exit("Cannot accept IDs via both -t in the command line, and as tabular files")
 elif not options.id_list and not args:
     sys.exit("Expected matched pairs of tabular files and columns (or -t given)")
 if len(args) % 2:
@@ -181,7 +181,7 @@
     '\r': '__cr__',
     '\t': '__tc__',
     '#': '__pd__',
-    }
+}

 # Read tabular file(s) and record all specified identifiers
 ids = None # Will be a set
@@ -206,15 +206,19 @@
                 continue
             parts = line.rstrip("\n").split("\t")
             for col in columns:
-                file_ids.add(clean_name(parts[col]))
+                name = clean_name(parts[col])
+                if name:
+                    file_ids.add(name)
     else:
         # Single column, special case speed up
         col = columns[0]
         for line in handle:
-            if not line.strip(): #skip empty lines
+            if not line.strip(): # skip empty lines
                 continue
             if not line.startswith("#"):
-                file_ids.add(clean_name(line.rstrip("\n").split("\t")[col]))
+                name = clean_name(line.rstrip("\n").split("\t")[col])
+                if name:
+                    file_ids.add(name)
     print "Using %i IDs from column %s in tabular file" % (len(file_ids), ", ".join(str(col + 1) for col in columns))
     if ids is None:
         ids = file_ids
diff -r 6b728365569a -r bc263e94ea98 tools/seq_filter_by_id/seq_filter_by_id.xml
--- a/tools/seq_filter_by_id/seq_filter_by_id.xml	Tue May 17 05:53:52 2016 -0400
+++ b/tools/seq_filter_by_id/seq_filter_by_id.xml	Fri Nov 04 08:07:34 2016 -0400
@@ -1,8 +1,7 @@
-
+
 from a tabular file
 biopython
- Bio
diff -r 6b728365569a -r bc263e94ea98 tools/seq_filter_by_id/tool_dependencies.xml
--- a/tools/seq_filter_by_id/tool_dependencies.xml	Tue May 17 05:53:52 2016 -0400
+++ b/tools/seq_filter_by_id/tool_dependencies.xml	Fri Nov 04 08:07:34 2016 -0400
@@ -1,6 +1,6 @@
-
+
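
The behavioural change in v0.2.5 (skipping identifiers that are empty after cleaning) can be illustrated with a small standalone sketch. This is not the code from `seq_filter_by_id.py`: the function name `read_ids` is hypothetical, it is written for Python 3 rather than the Python 2 of the patched script, and a plain `strip()` stands in for the script's `clean_name()`, whose body is not shown in this changeset.

```python
import io


def read_ids(handle, columns):
    """Collect non-empty identifiers from the given 0-based columns.

    Hypothetical helper for illustration only; strip() is an assumed
    stand-in for the real clean_name() function.
    """
    ids = set()
    for line in handle:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.rstrip("\n").split("\t")
        for col in columns:
            name = parts[col].strip()
            if name:  # ignore empty names, e.g. a blank first column
                ids.add(name)
    return ids


# The third line has an empty first column and contributes no identifier.
example = io.StringIO("#id\tscore\nseq1\t10\n\t20\n\nseq2\t30\n")
print(sorted(read_ids(example, [0])))
```

Without the `if name:` check, the empty string from the third line would be recorded as an identifier, which is the situation the changeset message describes for tabular files with blank names.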