bcbio-nextgen v0.7.8 Release Notes

Release Date: 2014-03-21
    • Add a check for mis-specified FASTQ format in the sample YAML file. Thanks to Alla Bushoy.
    • Updated RNA-seq integration tests to have more specific tags (singleend, Tophat, STAR, explant).
    • Fix contig ordering after Tophat alignment which was preventing GATK-based tools from running.
    • Allow calculation of RPKM on more deeply sampled genes by setting --max-bundle-frags to 2,000,000. Thanks to Miika Ahdesmaki.
    • Provide a cleaner installation process for non-distributable tools like GATK. The --toolplus argument now handles jars from the GATK site or Appistry and correctly updates manifest version information.
    • Use bgzipped/tabix indexed variant files throughout pipeline instead of raw uncompressed VCFs. Reduces space requirements and enables parallelization on non-shared filesystems or temporary space by avoiding transferring uncompressed outputs.
    • Reduce memory usage during post-alignment BAM preparation steps (PrintReads downsampling, deduplication and realignment prep) to avoid reaching the memory cap on limited systems like SLURM. The reduction is not applied to IndelRealigner, which needs memory in high depth regions.
    • Provide explicit targets for coverage depth (coverage_depth_max and coverage_depth_min) instead of coverage_depth enumeration. Provide downsampling of reads to max depth during post-alignment preparation to avoid repetitive centromere regions with high depth.
    • Ensure read group information is correctly supplied with bwa aln. Thanks to Miika Ahdesmaki.
    • Fix bug in retrieval of snpEff databases on install. Thanks to Matan Hofree.
    • Fix bug in normal BAM preparation for tumor/normal variant calling. Thanks to Miika Ahdesmaki.
    • General removal of GATK for variant manipulation functionality to help focus on support for upcoming GATK 3.0. Use bcftools for splitting of variants into SNPs and indels instead of GATK. Use vcflib's vcfintersection to combine SNPs and indels instead of GATK. Use bcftools for sample selection from multi-sample VCFs. Use pysam for calculation of sample coverage.
    • Use GATK 3.0 MIT licensed framework for remaining BAM and variant manipulation code (PrintReads, CombineVariants) to provide one consistent, up-to-date set of functionality for GATK variant manipulation.
    • Normalize input variant_regions BED files to avoid overlapping segments. Avoids out of order errors with FreeBayes caller which will call in each region without flattening the input BED.
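
To illustrate the variant_regions normalization in the last item, here is a minimal sketch of flattening overlapping BED segments. It is not the bcbio-nextgen implementation: the function name merge_bed_intervals is hypothetical, only the first three BED columns are kept, and touching intervals are merged along with overlapping ones, which is an assumption about the desired behavior.

```python
from collections import defaultdict


def merge_bed_intervals(lines):
    """Merge overlapping or touching BED intervals per contig.

    Hypothetical helper for illustration. Contigs are returned in order
    of first appearance; columns past chrom/start/end are dropped.
    """
    by_contig = defaultdict(list)
    contig_order = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        if chrom not in by_contig:
            contig_order.append(chrom)
        by_contig[chrom].append((int(start), int(end)))

    merged = []
    for chrom in contig_order:
        current_start, current_end = None, None
        for start, end in sorted(by_contig[chrom]):
            if current_end is None:
                # First interval on this contig.
                current_start, current_end = start, end
            elif start <= current_end:
                # Overlaps or touches the open interval: extend it.
                current_end = max(current_end, end)
            else:
                # Gap found: emit the open interval and start a new one.
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged


example = [
    "chr1\t100\t200",
    "chr1\t150\t300",
    "chr1\t400\t500",
    "chr2\t10\t20",
]
flattened = merge_bed_intervals(example)
```

With this input, the two overlapping chr1 segments collapse into a single 100-300 region, so a caller that processes each region independently sees each position only once.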