bcbio-nextgen v1.1.1 Release Notes
Release Date: 2018-11-06
- single-cell RNA-seq: add built-in support for 10x_v2.
- Fix UMI support for small RNA. Compatible with Qiagen UMI small RNA protocol.
- Ignore .Renviron when running Rscript to head-off PATH conflicts.
- Support SRR ids to download samples with bcbio_prepare_samples script.
- tumor-only prioritization: do not apply LowPriority filter by default, instead annotate with external databases. Use tumoronly_germline_filter to re-enable previous behavior.
- UMIs: apply default filtering based on de-duplicated read depth. Uses --min-reads 2 with raw de-duplicated coverage of 800 or more, or --min-reads 1 otherwise. Allows error correction with UMIs for higher depth samples.
- gemini: databases no longer created by default. Use tools_on: [gemini] or tools_on: [gemini_orig] to create a database. We now use a reduced database for build 37 to match build 38 and make this forward compatible with CWL.
- vcfanno: run gemini and somatic annotations by default, producing annotated VCFs with external information.
- alignment preparation: support a list of split files from multiple sequencing lanes, merging into a single fastq
- variant: support octopus variant caller for germline and somatic samples.
- peddy: fix bug where not all files uploaded on first pipeline run
- peddy: For somatic analyses use separate germline calls for tumor/normal, if available, or extracted germline calls from supported callers, instead of somatic variants.
- GATK: support ploidy specification during joint calling.
- GATK BQSR: bin qualities into static groups (10, 20, 30) to match GATK4 recommendations. Thanks to Severine Catreux.
- GATK: support 4.0.10.0 which does not use UCSC 2bit references for Spark tools
- variant calling: support bcftools 1.9 which is more strict about duplicated key names in INFO and FORMAT.
- seq2c: Upload global calls, coverage and read_mapping files to project directory.
- RNA-seq variant calling: Apply annotations after joint calling for GATK to avoid import errors with GenomicsDB. Thanks to Komal Rathi.
- CWL: add --cwl target to bcbio_nextgen.py upgrade to add and maintain bcbio-vm.
- CWL: use standard null instead of string "null" for representing None values.
- CWL: support for heterogeneity and structural variant callers that make use of variant inputs.
- CWL: support ensemble calling for combining multiple variant callers.
- ensemble: remove no-ALT ref calls that contribute to incorrect ensemble outputs
- RNA-seq: output a matrix of un-deduped UMI counts when doing single-cell/DGE for quality control purposes. This is called tagcounts-dupes.mtx in the final directory.
- single-cell RNA-seq: allow pre-transformed FASTQ files as input to DGE/single-cell pipeline.
- single-cell RNA-seq: only create one index per specified genome instead of per sample
- fgbio: back compatibility for older quality setting --min-consensus-base-quality
- RNA-seq: fix for fusion_caller getting interpreted as a path, leading to memoization/upload issues.
- RNA-seq: memoize rRNA quality calculations, speeding up reruns.
- RNA-seq: prefix description with an X if it starts with a number, for R compatibility. Thanks to Avinash Reddy and Dan Stetson at AstraZeneca.
- single-cell RNA-seq: respect --positional flag with the new tag counting. Thanks to Babak Alaei at AstraZeneca.
- RNA-seq: turn on --seqBias flag by default for Salmon as early-version overfitting issues have been fixed.
- RNA-seq: report insert size from Salmon fragment distribution, not samtools stats.
- RNA-seq: when processing explant samples, produce a combined tx2gene.csv file from all organisms processed.
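
Two of the rules described in the notes above are simple enough to sketch in Python: the depth-based choice of --min-reads for UMI filtering, and the X prefix added to a description that starts with a number. This is an illustrative sketch only, not bcbio's implementation. The function names umi_min_reads and r_safe_description and the constant HIGH_DEPTH_COVERAGE are hypothetical; only the 800 threshold, the --min-reads values and the X prefix come from the notes. The sketch also assumes "starts with a number" means the first character is a digit.

```python
import re

# Threshold taken from the release notes: raw de-duplicated coverage of 800 or more.
HIGH_DEPTH_COVERAGE = 800


def umi_min_reads(raw_dedup_coverage: float) -> int:
    """Return the --min-reads value the notes describe for a given depth.

    Hypothetical helper: 2 at or above the threshold, 1 otherwise.
    """
    return 2 if raw_dedup_coverage >= HIGH_DEPTH_COVERAGE else 1


def r_safe_description(description: str) -> str:
    """Prefix a description with X when its first character is a digit.

    Hypothetical helper mirroring the R-compatibility rule in the notes.
    """
    return "X" + description if re.match(r"\d", description) else description


if __name__ == "__main__":
    for coverage in (250, 800, 1500):
        print(coverage, umi_min_reads(coverage))
    for name in ("1042_tumor", "tumor_1042"):
        print(name, r_safe_description(name))
```

Names that already start with a letter pass through unchanged, so applying the prefix rule more than once has no further effect.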