Changelog History

  • v1.2.9 Changes

    December 14, 2021
    • Fix VCF header bug: T/N SAMPLE lines are back - needed for import to SolveBio
    • add strandedness: auto for -l A option in salmon
    • report 10x more peaks in ChIP/ATAC-seq - use 0.05 qvalue
    • fix misleading RNA-seq duplicated reads statistics: thanks @sib-bcf
    • reorganize conda environments
    • snpEff 5.0
    • strandedness: auto
    • document WGBS pipeline steps
    • make --local an option, not the default, in bismark alignment - too slow
    • bcbioRNASeq update to 0.3.44
    • pureCN update to 2.0.1
    • octopus update to 0.7.4
  • v1.2.8 Changes

    April 14, 2021
    • Set ENCODE library complexity flags properly for ChIP-seq. Thanks to @mistrm82.
    • Fix greylisted peaks not being propagated to the output directory. Thanks to @mistrm82.
    • Better error message when no sample barcodes are found for single-cell RNA-seq.
    • Better trimming for two WGBS kits
    • enable setting parameters for deduplicate_bismark
    • custom threading for bismark via yaml
    • reproducible WGBS user story with the data from Encode
    • During consensus peak calling, keep the highest scoring peak instead of calling the summit for the highest scoring peak and expanding the peak to 250 bases.
    • Enable consensus peak calling for broad peaks. Thanks to @mistrm82 and @yoonsquared for pointing out this was missing.
    • Re-enable ATAC-seq tests, they work now.
    • svprioritize for mm10
    • purecn_Dx.R - mutational signatures - still requires a manual update of deconstructsigs or release of it
    • make sure purecn uses sv_regions bed to call variants
    • fix misleading disambiguation fastqc read statistics (total, hg38, mm10)
    • wgbs: nebemseq kit: add --maxins 1000 and --local to bismark align
    • WGBS: sorted indexed deduplicated bam for ready.bam
    • print error message when aligner: false and hla typing is on
    • make sure that mark_duplicates is false with collapsed UMI input
  • v1.2.7 Changes

    February 22, 2021
    • RNASeq: Add gene body coverage plots to multiqc report.
    • Restore ability to opt out of contamination checking via tools_off.
    • Properly invoke threading for verifybamid2.
    • Fix circular import issue when using bcbio functions outside of the main bcbio script.
    • Enable setting custom PureCN options via YAML file.
  • v1.2.6 Changes

    February 04, 2021
    • RNASeq: Fail more gracefully if SummarizedExperiment object cannot be created.
    • Fixes to handle DRAGEN BAM files from the first stage of UMI processing.
    • Fix issue with double-annotating with dbSNP by separating out somatic variant annotation into its own vcfanno configuration.
  • v1.2.5 Changes

    January 01, 2021
    • Joint calling for RNA-seq variant calling requires setting jointcaller to bring it in line with the configuration options for variant calling.
    • Allow pre-aligned BAMs and gVCFs for RNA-seq joint variant calling. Thanks to @WimSpree for the feature.
    • Allow CollectSequencingArtifacts to be turned off via tools_off: [collectsequencingartifacts].
    • Fix getiterator -> iter deprecation in ElementTree. Thanks to @smoe.
    • Add SummarizedExperiment object from RNA-seq runs, a simplified version of the bcbioRNASeq object.
    • Add umi_type: dragen. This enables bcbio to run with first-pass, pre-consensus called UMI BAM files from DRAGEN.
    • Turn off inferential replicate loading when creating the gene x sample RNA-seq count matrix. This allows loading of thousands of RNA-seq samples.
    • Only make isoform to gene file from express if we have run express.
    • Allow "no consensus peaks found" as a valid endpoint of a ChIP-seq analysis.
    • Allow BCBIO_TEST_DIR environment variable to control where tests end up.
    • Collect OxoG and other sequencing artifacts due to damage.
    • Round tximport estimated counts.
    • Turn off consensus peak calling for broad peaks. Thanks to @lbeltrame and @LMannarino for diagnosing the broad-peaks-run-forever bug.
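
For readers hitting the same ElementTree deprecation noted in this release, the snippet below is a minimal sketch of the getiterator -> iter change. It is not taken from the bcbio source; the XML string and variable names are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML snippet, used only for illustration.
xml_text = "<run><sample name='a'/><sample name='b'/></run>"
root = ET.fromstring(xml_text)

# Element.getiterator() was deprecated and then removed in Python 3.9.
# Element.iter() is the replacement and takes the same optional tag filter.
sample_names = [elem.get("name") for elem in root.iter("sample")]
print(sample_names)
```

Code that still calls getiterator() raises AttributeError on Python 3.9 and later, so switching to iter() is usually a one-line change per call site.
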
  • v1.2.4 Changes

    September 21, 2020

    • Remove deprecated --genomicsdb-use-vcf-codec option as this is now the default.
    • Add bismark output to MultiQC.
    • Fix PS genotype field from octopus to have the correct type.
    • Edit VarDict headers to report VCFv4.2, since htsjdk does not fully support VCFv4.3 yet.
    • Attempt to speed up bismark by implementing the parallelization strategy suggested here: FelixKrueger/Bismark#96
    • Add --enumerate option to OptiType to report the top 10 calls and scores, to make it easier to decide how confident we are in
      an HLA call.
    • Performance improvements when HLA calling during panel sequencing. This skips running bwa-kit during the initial
      mapping for consensus UMI detection, greatly speeding up panel sequencing runs.
    • Allow custom options to be passed to featureCounts.
    • Fix race condition when running tests.
    • Add TOPMed as a datatarget.
    • Add predicted transcript and peptide output to arriba.
    • Add mm10 as a supported genome for arriba.
    • Skip bcbioRNASeq for more than 100 samples.
    • Add rRNA_pseudogene as a rRNA biotype.
    • Add --genomicsdb-use-vcf-codec when running GenotypeGVCFs. See https://gatk.broadinstitute.org/hc/en-us/articles/360040509751-GenotypeGVCFs#--genomicsdb-use-vcf-codec for
      a discussion. Thanks to @amizeranschi for finding the issue and posting the solution.
    • Update VEP to v100
    • Add consensus peak calling using https://bedops.readthedocs.io/en/latest/content/usage-examples/master-list.html
      to collapse overlapping peaks.
    • Pre-filter consensus peaks by removing peaks with FDR > 0.05 before performing consensus peak calling.
    • Add support for Qiagen's Qiaseq UPX 3' transcriptome kit for DGE. Support for 96 and 384 well configurations
      by specifying umi_type: qiagen-upx-96 or umi_type: qiagen-upx-384.
    • Add consensus peak counting using featureCounts.
    • Skip using autosomal-reference when calling ataqv for mouse/human, as this has a problem with ataqv
      (see ParkerLab/ataqv#10 for discussion and followup).
    • Add pre-generated ataqv HTML report to upload directory.
    • Support single-end reads for ATAC-seq.
    • Move featureCounts output files to featureCounts directory in project directory.
    • Remove RNA and reads in peak stats from MultiQC table when they are not calculated for a pipeline.
    • Only show somatic variant counts in the general stats table, if germline variants are calculated.
    • Add kit parameter for setting options for pipelines via just listing the kit. Currently only implemented for WGBS.
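
The FDR pre-filter listed in this release can be pictured with a short sketch. This is not bcbio's implementation. It assumes peaks are in MACS2 narrowPeak format, where column 9 holds the q-value as -log10(q); the function name passes_fdr and the two example peak lines are hypothetical.

```python
import math

def passes_fdr(narrowpeak_line, max_fdr=0.05):
    """Return True if the peak's q-value is at or below max_fdr.

    Assumes narrowPeak layout: column 9 (index 8) is -log10(q-value).
    """
    fields = narrowpeak_line.rstrip("\n").split("\t")
    neg_log10_q = float(fields[8])
    # q <= max_fdr is equivalent to -log10(q) >= -log10(max_fdr)
    return neg_log10_q >= -math.log10(max_fdr)

# Hypothetical peaks: the first has q = 0.001, the second q of about 0.32.
lines = [
    "chr1\t100\t350\tpeak_1\t250\t.\t8.2\t6.1\t3.0\t120",
    "chr1\t900\t1100\tpeak_2\t40\t.\t2.1\t1.5\t0.5\t80",
]
kept = [line for line in lines if passes_fdr(line)]
print(len(kept))
```

Filtering before merging keeps low-confidence peaks from widening or seeding consensus regions, which is the apparent motivation for doing it ahead of the consensus step.
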
  • v1.2.3 Changes

    April 07, 2020
    • Hotfix for not being able to upgrade from stable distribution.
  • v1.2.2 Changes

    April 05, 2020
    • Fix for not properly looking up R environment variables in the base environment.
    • Remove --use-new-qual-calculator which was eliminated in GATK 4.1.5.0.
    • Ensure header is not written for a Series. In pandas 0.24.0 the default for header was changed from
      False to True so we have to set it explicitly now.
    • Remove unused Dockerfile. Thanks to @matthdsm.
    • ATAC-seq: Skip peak-calling on fractions with < 1000 reads.
  • v1.2.1 Changes

    March 25, 2020
    • Update ChIP and ATAC bowtie2 runs to use --very-sensitive.
    • Properly pad TSS BED file for ataqv TSS enrichment metrics.
    • Skip bcbioRNASeq if there are fewer than three samples.
    • Run joint-calling with single cores to save resources.
    • Re-support PureCN.
    • Skip segments with no informative SNPs when creating the LOH VCF file from PureCN output.
    • Fix for duplicated output for mosdepth in quality control report.
    • Fix for missing rRNA statistics.
  • v1.2.0 Changes

    February 07, 2020
    • Fix for bismark not being a supported aligner.
    • Run ataqv (https://github.com/ParkerLab/ataqv) to calculate additional ATAC-seq quality control
      metrics.
    • Workaround for some bcbioRNASeq plots failing with many samples when interesting_groups is not set.
    • Add known_fusions parameter for passing in known fusions to arriba.
    • Fix for tx2gene not working properly on some GTF files.
    • Sort MACS2 output with UNIX sort to avoid memory issues.
    • Run RiP on full peak file for ATAC-seq.
    • Run ataqv on unfiltered BAM file with the full peak file.
    • Run peddy on the population variant file, not the individual sample level file if joint calling was done.
    • Add STAR to MultiQC metrics.
    • Throw an error if STAR is run on a genome with alts.
    • Don't run bcbioRNASeq if there is only one sample. Thanks to @kmendler for the suggestion.
    • Improve arriba sensitivity by setting --peOverlapNbasesMin 10 and --alignSplicedMateMapLminOverLmate 0.5 when
      running STAR (see suhrig/arriba#41).
    • Make TPM and counts files from tximport automatically.
    • Use --keepDuplicates when making the Salmon index. This keeps transcripts that are identical in the index instead of
      randomly choosing one. This helps when comparing to other ways of quantifying the transcripts, ensuring all of
      the transcripts are represented.
    • Remove unnecessary "quant" subdirectory for Salmon runs. This allows MultiQC to properly name the samples.
    • Ensure STAR log file is propagated to the upload directory.
    • Fix issue with memory not being specified properly when running bcbio_prepare_samples.py.
    • Run tximport automatically and store TPM in project/date/tpm and counts in project/date/counts.
    • Calculate ENCODE quality flags for ATAC-seq. See https://www.encodeproject.org/data-standards/terms/#library for a
      description of what the metrics mean.
    • Fix for command line being too long while joint genotyping thousands of samples.
    • Fix for command line being too long when running the CWL workflow with cromwell.