bcbio-nextgen v0.7.9 Release Notes

Release Date: 2014-05-19
    • Redo Illumina sequencer integration to be up to date with current code base. Uses external bcl2fastq demultiplexing and new bcbio integrated analysis server. Provide documentation on setting up automated infrastructure.
    • Perform de-duplication of BAM files as part of streaming alignment process using samblaster or biobambam's bammarkduplicates. Removes need for secondary split of files and BAM preparation unless recalibration and realignment needed. Enables pre-processing of input files for structural variant detection.
    • Rework batched regional analysis in variant calling to remove custom cases and simplify structure. Filtering now happens explicitly on the combined batch file. This is functionally equivalent to previous filters but now the workflow is clearer. Avoids special cases for tumor/normal inputs.
    • Perform regional splitting of samples grouped by batch instead of globally, enabling multiple organisms and experiments within a single input sample YAML.
    • Add temporary directory usage to enable use of local high speed scratch disk on setups with large enough global temporary storage.
    • Update FreeBayes to latest version and provide improved filtering for high depth artifacts.
    • Update VQSR support for GATK to be up to date with latest best practices. Re-organize GATK and filtering to be more modular to help with transition to GATK 3.x gVCF approaches.
    • Support CRAM files as input to pipeline, including retrieval of reads from defined sequence regions.
    • Support export of alignment data as CRAM instead of BAM to save storage space and for long term archiving.
    • Provide configuration option, remove_lcr, to filter out variants in low complexity regions.
    • Improve Galaxy upload for LIMS support: enable upload of FastQC reports as PDF when wkhtmltopdf is installed. Provide tabular summaries of mapped reads.
    • Improve checks for pre-aligned BAMs: ensure correct sample names and provide more context on errors around mismatching reference genomes.
    • GATK HaplotypeCaller: ensure genotype depth annotation with DepthPerSampleHC annotation. Enable GATK 3.1 hardware specific optimizations.
    • Use bgzipped VCFs for dbSNP, Cosmic and other resources to save disk space. Upgrade to Cosmic v68.
    • Avoid VCF concatenation errors when first input file is empty. Thanks to Jiantao Shi.
    • Added preliminary support for oncofuse for calling gene fusion events. Thanks to @tanglingfung.
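
The VCF concatenation fix noted above concerns an empty first input file. The following is a minimal sketch of one way to handle that case, not bcbio's actual implementation: the function name concat_vcfs and its arguments are hypothetical, and it assumes uncompressed, plain-text VCFs small enough to read into memory.

```python
def concat_vcfs(in_files, out_file):
    """Concatenate plain-text VCFs, writing the header only once.

    Hypothetical helper for illustration. Inputs without a #CHROM header
    line (including zero-byte files) are skipped, so the header is taken
    from the first usable file rather than the first file listed.
    """
    header_written = False
    with open(out_file, "w") as out_handle:
        for fname in in_files:
            with open(fname) as in_handle:
                lines = in_handle.readlines()
            # Skip empty or header-less inputs entirely.
            if not any(line.startswith("#CHROM") for line in lines):
                continue
            for line in lines:
                if line.startswith("#"):
                    # Header lines are kept only from the first usable file.
                    if not header_written:
                        out_handle.write(line)
                else:
                    out_handle.write(line)
            header_written = True
    return out_file
```

Taking the header from the first usable file means the order of inputs does not matter when some regions produce no output. If every input is empty, the result is an empty file.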