Changelog History

  • v1.2.4

    September 21, 2020

    • Remove deprecated --genomicsdb-use-vcf-codec option as this is now the default.
    • Add bismark output to MultiQC.
    • Fix PS genotype field from octopus to have the correct type.
    • Edit VarDict headers to report VCFv4.2, since htsjdk does not fully support VCFv4.3 yet.
    • Attempt to speed up bismark by implementing the parallelization strategy suggested here: FelixKrueger/Bismark#96
    • Add --enumerate option to OptiType to report the top 10 calls and scores, to make it easier to decide how confident we are in
      an HLA call.
    • Performance improvements when HLA calling during panel sequencing. This skips running bwa-kit during the initial
      mapping for consensus UMI detection, greatly speeding up panel sequencing runs.
    • Allow custom options to be passed to featureCounts.
    • Fix race condition when running tests.
    • Add TOPMed as a datatarget.
    • Add predicted transcript and peptide output to arriba.
    • Add mm10 as a supported genome for arriba.
    • Skip bcbioRNASeq for more than 100 samples.
    • Add rRNA_pseudogene as an rRNA biotype.
    • Add --genomicsdb-use-vcf-codec when running GenotypeGVCFs. See GenotypeGVCFs#--genomicsdb-use-vcf-codec for
      a discussion. Thanks to @amizeranschi for finding the issue and posting the solution.
    • Update VEP to v100.
    • Add consensus peak calling using
      to collapse overlapping peaks.
    • Pre-filter consensus peaks by removing peaks with FDR > 0.05 before performing consensus peak calling.
    • Add support for Qiagen's Qiaseq UPX 3' transcriptome kit for DGE. The 96 and 384 well configurations are supported
      by specifying umi_type: qiagen-upx-96 or umi_type: qiagen-upx-384.
    • Add consensus peak counting using featureCounts.
    • Skip using autosomal-reference when calling ataqv for mouse/human, as this has a problem with ataqv
      (see ParkerLab/ataqv#10 for discussion and followup).
    • Add pre-generated ataqv HTML report to upload directory.
    • Support single-end reads for ATAC-seq.
    • Move featureCounts output files to featureCounts directory in project directory.
    • Remove RNA and reads in peak stats from MultiQC table when they are not calculated for a pipeline.
    • Only show somatic variant counts in the general stats table if germline variants are calculated.
    • Add kit parameter for setting options for pipelines via just listing the kit. Currently only implemented for WGBS.
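The FDR pre-filter applied before consensus peak calling can be illustrated with a minimal sketch. It assumes each peak is a simple record that already carries an FDR (q-value). The `Peak` type and the `prefilter_peaks` function are hypothetical names used only for illustration and are not bcbio's implementation.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    """Hypothetical peak record; not bcbio's internal representation."""
    chrom: str
    start: int
    end: int
    fdr: float


def prefilter_peaks(peaks: List[Peak], max_fdr: float = 0.05) -> List[Peak]:
    """Keep peaks whose FDR does not exceed max_fdr (drops FDR > 0.05 by default)."""
    return [p for p in peaks if p.fdr <= max_fdr]


# Illustrative input: the second peak has FDR > 0.05 and is removed.
peaks = [
    Peak("chr1", 100, 200, 0.01),
    Peak("chr1", 150, 260, 0.20),
    Peak("chr2", 500, 640, 0.05),
]
kept = prefilter_peaks(peaks)
```

Filtering first means that only peaks passing the threshold take part in the later step that collapses overlapping peaks.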
  • v1.2.3

    April 07, 2020
    • Hotfix for not being able to upgrade from stable distribution.
  • v1.2.2

    April 05, 2020
    • Fix for not properly looking up R environment variables in the base environment.
    • Remove --use-new-qual-calculator which was eliminated in GATK
    • Ensure header is not written for a Series. In pandas 0.24.0 the default for header was changed from
      False to True so we have to set it explicitly now.
    • Remove unused Dockerfile. Thanks to @matthdsm.
    • ATAC-seq: Skip peak-calling on fractions with < 1000 reads.
  • v1.2.1

    March 25, 2020
    • Update ChIP and ATAC bowtie2 runs to use --very-sensitive.
    • Properly pad TSS BED file for ataqv TSS enrichment metrics.
    • Skip bcbioRNASeq if there are fewer than three samples.
    • Run joint-calling with single cores to save resources.
    • Re-support PureCN.
    • Skip segments with no informative SNPs when creating the LOH VCF file from PureCN output.
    • Fix for duplicated output for mosdepth in quality control report.
    • Fix for missing rRNA statistics.
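The sample-count rules for bcbioRNASeq listed in this changelog (skipped for fewer than three samples, and from v1.2.4 also for more than 100) can be summarized in a short sketch. The function name and its parameters are hypothetical and only illustrate the thresholds; bcbio's actual check may be structured differently.

```python
def should_run_bcbiornaseq(n_samples: int,
                           min_samples: int = 3,
                           max_samples: int = 100) -> bool:
    """Return True when the sample count is within the supported range.

    Hypothetical helper: the defaults mirror the thresholds named in the
    changelog entries, not a documented bcbio API.
    """
    return min_samples <= n_samples <= max_samples


# Example decisions for a few project sizes.
decisions = {n: should_run_bcbiornaseq(n) for n in (1, 2, 3, 100, 101)}
```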
  • v1.2.0

    February 07, 2020
    • Fix for bismark not being a supported aligner.
    • Run ataqv ( to calculate additional ATAC-seq quality control
    • Workaround for some bcbioRNASeq plots failing with many samples when interesting_groups is not set.
    • Add known_fusions parameter for passing in known fusions to arriba.
    • Fix for tx2gene not working properly on some GTF files.
    • Sort MACS2 output with UNIX sort to avoid memory issues.
    • Run RiP on full peak file for ATAC-seq.
    • Run ataqv on unfiltered BAM file with the full peak file.
    • Run peddy on the population variant file, not the individual sample level file if joint calling was done.
    • Add STAR to MultiQC metrics.
    • Throw an error if STAR is run on a genome with alts.
    • Don't run bcbioRNASeq if there is only one sample. Thanks to @kmendler for the suggestion.
    • Improve arriba sensitivity by setting --peOverlapNbasesMin 10 and --alignSplicedMateMapLminOverLmate 0.5 when
      running STAR (see suhrig/arriba#41).
    • Make TPM and counts files from tximport automatically.
    • Use --keepDuplicates when making the Salmon index. This keeps transcripts that are identical in the index instead of
      randomly choosing one. This helps when comparing to other ways of quantifying the transcripts, ensuring all of
      the transcripts are represented.
    • Remove unnecessary "quant" subdirectory for Salmon runs. This allows MultiQC to properly name the samples.
    • Ensure STAR log file is propagated to the upload directory.
    • Fix issue with memory not being specified properly when running
    • Run tximport automatically and store TPM in project/date/tpm and counts in project/date/counts.
    • Calculate ENCODE quality flags for ATAC-seq. See for a
      description of what the metrics mean.
    • Fix for command line being too long while joint genotyping thousands of samples.
    • Fix for command line being too long when running the CWL workflow with cromwell.
  • v1.1.9

    December 06, 2019
    • Fix for getting the VEP cache.
    • Support Picard's new syntax for ReorderSam (REFERENCE -> SEQUENCE_DICTIONARY).
    • Remove mitochondrial reads from ChIP/ATAC-seq calling.
    • Add documentation describing ATAC-seq outputs.
    • Add ENCODE library complexity metrics for ATAC/ChIP-seq to MultiQC report
      (see for a description of the metrics)
    • Add STAR sample-specific 2-pass. This helps assign a moderate number of reads per gene. Thanks
      to @naumenko-sa for the initial implementation and push to get this going.
    • Index transcriptomes only once for pseudo/quasi aligner tools. This fixes race conditions that
      can happen.
    • Add --buildversion option, for tracking which version of a gene build was used. This is used
      during Suggested formats are source_version, so Ensembl_94,
      EnsemblMetazoa_25, FlyBase_26, etc.
    • Sort MACS2 bedgraph files before compressing. Thanks to @LMannarino for the suggestion.
    • Check for the reserved field sample in RNA-seq metadata and quit with a useful error message.
      Thanks to @marypiper for suggesting this.
    • Split ATAC-seq BAM files into nucleosome-free and mono/di/tri nucleosome files, so we can call
      peaks on them separately.
    • Call peaks on NF/MN/DN/TN regions separately for each caller during ATAC-seq.
    • Allow viral contamination to be assayed on non tumor/normal samples.
    • Ensure EBV coverage is calculated when run on genomes with it included as a contig.
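The suggested source_version format for --buildversion (for example Ensembl_94) can be split into its two parts with a few lines of Python. This is a minimal sketch. The `parse_buildversion` helper is hypothetical, and because the changelog only describes the format as suggested, the strict check here is an assumption rather than bcbio's behavior.

```python
from typing import Tuple


def parse_buildversion(value: str) -> Tuple[str, str]:
    """Split a 'source_version' string such as 'Ensembl_94' at its last underscore.

    Hypothetical helper for illustration; raises ValueError when either
    part is missing.
    """
    source, sep, version = value.rpartition("_")
    if not sep or not source or not version:
        raise ValueError(f"expected source_version, got {value!r}")
    return source, version


# The examples given in the changelog entry.
parsed = [parse_buildversion(v) for v in ("Ensembl_94", "EnsemblMetazoa_25", "FlyBase_26")]
```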
  • v1.1.8

    October 29, 2019
    • Add antibody configuration option. Setting a specific antibody for ChIP-seq will use appropriate
      settings for that antibody. See the documentation for supported antibodies.
    • Add use_lowfreq_filter for forcing vardict to report variants with low allelic frequency,
      useful for calling somatic variants in panels with high coverage.
    • Fix for checking for pre-existing inputs with python3.
    • Add keep_duplicates option for ChIP/ATAC-seq which does not remove duplicates before peak calling.
      Defaults to False.
    • Add keep_multimappers for ChIP/ATAC-seq which does not remove multimappers before peak calling.
      Defaults to False.
    • Remove ethnicity as a required column in PED files.
  • v1.1.7

    October 11, 2019


    • Hotfix for dataclasses not being supported in Python 3.6. Use namedtuple instead.
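The `dataclasses` module was added to the standard library in Python 3.7, so code that must run on 3.6 can use `collections.namedtuple` for simple records. The sketch below shows the general pattern. The `Region` type and its fields are hypothetical and do not reflect the record changed in bcbio.

```python
from collections import namedtuple

# Hypothetical record type; works on Python 3.6, unlike @dataclass.
Region = namedtuple("Region", ["chrom", "start", "end"])

region = Region("chr1", 100, 200)

# namedtuples are immutable; _replace returns a modified copy.
extended = region._replace(end=300)
```

One trade-off is that namedtuple instances cannot be mutated in place, so updates go through `_replace` rather than attribute assignment.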
  • v1.1.6

    October 10, 2019
    • GATK ApplyBQSRSpark: avoid StreamClosed issue with GATK 4.1+
    • RNA-seq: fixes for cufflinks preparation due to python3 transition.
    • RNA-seq: output count tables from tximport for genes and transcripts. These
      are in bcbioRNASeq/results/date/genes/counts and
    • qualimap (RNA-seq): disable stranded mode for qualimap, as it gives incorrect
      results with the hisat2 aligner and for RNA-seq just setting it to unstranded
    • Add quantify_genome_alignments option to use genome alignments to quantify
      with Salmon.
    • Add --validateMappings flag to Salmon read quantification mode.
    • VEP cache is no longer installed from a bcbio run.
    • Add support for Salmon SA method when STAR alignments are not available
      (for hg38).
    • Add support for the new read model for filtering in Mutect2. This is
      experimental, and a little flaky, so it can optionally be turned on via:
      tools_on: mutect2_readmodel. Thanks to @lbeltrame for implementing this
      feature and doing a ton of work debugging.
    • Swap pandas from_csv call to read_csv.
    • Make STAR respect the transcriptome_gtf option.
    • Prefix regular expression with r. Thanks to @smoe for finding all of these.
    • Add informative logging messages at beginning of bcbio run. Includes the version
      and the configuration files being used.
    • Swap samtools mpileup to use bcftools mpileup as samtools mpileup is being
      deprecated (
    • Ensure locale is set to one supporting UTF-8 bcbio-wide. This may need to get
      reverted if it introduces issues.
    • Added hg38 support for STAR. We did this by taking hg38 and removing the alts,
      decoys and HLA sequences.
    • Added support for the arriba fusion caller.
    • Added back missing programs from the version provenance file. Fixed formatting
      problems introduced by switch to python3.
    • Added initial support for whole genome bisulfite sequencing using bismark. Thanks to
      @hackdna for implementing this and @jnhutchinson for drafting the initial
      pipeline. This is a work in progress in collaboration with @gcampanella, who
      has a similar implementation with some extra features that we will be merging
      in soon.
    • qualimap for RNA-seq runs on the downsampled BAM files by default. Set
      tools_on: [qualimap_full] to run on the full BAM files.
    • Add STAR junction files to the files captured at the end of a run.
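The entry about prefixing regular expressions with r refers to Python raw string literals. In a normal string, a sequence such as `\d` is an invalid escape that Python 3.6 and later report with a warning, while a raw string passes the backslash through to the `re` module unchanged. The pattern and file name below are hypothetical and only illustrate the idiom.

```python
import re

# Raw string: the backslash in \d reaches the regex engine untouched.
# Hypothetical pattern for a lane token in a FASTQ file name.
LANE_PATTERN = re.compile(r"_L(\d{3})_")

match = LANE_PATTERN.search("sample1_L001_R1.fastq.gz")
lane = match.group(1) if match else None
```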
  • v1.1.5

    April 12, 2019
    • Fixes for Python3 incompatibilities on distributed IPython runs.
    • Numerous smaller Python3 incompatibilities with strings/unicode and types. Thanks to the community for reporting these.
    • GATK HaplotypeCaller: correctly apply skipping of marked duplicates only for amplicon runs. Thanks to Ben Liesfeld.
    • Fix format detection for bzip2 fastq inputs.
    • Support latest GATK4 MuTect2 ( with changes to ploidy and reference parameters.
    • Support changes to GATK4 for VQSR --resource specification in Thanks to Timothee Cezard.
    • Support latest bedtools (2.28.0) which expects SAM headers for bgzipped BED inputs.