UMI-tools
Tools for dealing with Unique Molecular Identifiers (UMIs), also known as Random Molecular Tags (RMTs), and scRNA-Seq cell barcodes.
Currently, only the dedup and extract commands are supported.
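Sample names are extracted from the log files where possible. The logs record the input and output file paths. However, the input can instead be read from stdin, or the output written to stdout, in which case the log shows a placeholder rather than a file name: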
$ umi_tools extract -I input.fastq > result.fastq
stdin : <_io.TextIOWrapper name='input.fastq' mode='r' encoding='UTF-8'>
stdout : <_io.TextIOWrapper name='<stdout>' encoding='ascii'>
$ cat input.fastq | umi_tools extract -S result.fastq
stdin : <_io.TextIOWrapper name='<stdin>' mode='r' encoding='UTF-8'>
stdout : <_io.TextIOWrapper name='result.fastq' encoding='ascii'>
umi_tools requires at least one of the -I or -S options to be specified, so we can expect at
least one file name to be present in the log file. The sample name is guessed from it, with the
output file name taking priority over the input file name. If this assumption fails, the sample
name is extracted from the log file name instead.
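The fallback order described above can be sketched roughly as follows. This is an illustrative sketch only, not the module's actual code: the names `guess_sample_name` and `_STREAM_RE` are hypothetical, the regular expression assumes the `stdin`/`stdout` line layout shown in the excerpts (optionally prefixed with `#`), and stripping the file extension is a simplification of whatever sample-name cleaning is really applied.

```python
import os
import re

# Assumed line layout, taken from the log excerpts above, e.g.
#   stdout : <_io.TextIOWrapper name='result.fastq' encoding='ascii'>
# An optional leading "#" is tolerated (assumption).
_STREAM_RE = re.compile(
    r"^\s*#?\s*(stdin|stdout)\s*:\s*<\S+\s+name='([^']*)'",
    re.MULTILINE,
)

# Names the log shows when a stream is not a real file.
_PLACEHOLDERS = {"<stdin>", "<stdout>"}


def guess_sample_name(log_text, log_filename):
    """Guess a sample name: output file first, then input file, then log file."""
    streams = dict(_STREAM_RE.findall(log_text))
    for key in ("stdout", "stdin"):  # the output file name takes priority
        name = streams.get(key)
        if name and name not in _PLACEHOLDERS:
            return os.path.splitext(os.path.basename(name))[0]
    # Neither stream names a real file: fall back to the log file name.
    return os.path.splitext(os.path.basename(log_filename))[0]
```

Given the second excerpt above, where stdin is a pipe and stdout is `result.fastq`, this sketch would pick the name from the output file; with no usable stream lines at all, it falls back to the name of the log file itself.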
File search patterns
umitools/dedup:
  contents: "# output generated by dedup"
  num_lines: 3
umitools/extract:
  contents: "# output generated by extract"
  num_lines: 3
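
As a rough illustration of what these patterns express, a file counts as a match when the `contents` string occurs somewhere in its first `num_lines` lines. The helper below is a hypothetical sketch of that check (the name `matches_pattern` is made up here) and is not how the file search is actually implemented.

```python
from itertools import islice


def matches_pattern(lines, contents, num_lines):
    """Return True if `contents` occurs in any of the first `num_lines` lines."""
    return any(contents in line for line in islice(lines, num_lines))


# Hypothetical usage with an open log file:
# with open("dedup.log") as handle:
#     is_dedup = matches_pattern(handle, "# output generated by dedup", 3)
```

Limiting the search to a few lines keeps the check cheap, since only the header of each candidate file needs to be read.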