Our new verification system, which tests for correct behavior in bioinformatics software packages. We crafted tests to unify correct behavior when tools encounter various edge cases—potentially unexpected inputs that exemplify the limits of the format. Inspired by the web standards Acid tests.
The free Polyidus software identifies the exact genomic regions for integration of a known virus. We developed Polyidus to identify viral integration sites with chimeric sequencing reads from any paired-end sequencing data. First, Polyidus aligns reads to a viral genome. It allows for partial mapping using local alignment, and removes any sequencing fragment where neither read maps to the virus. Second, Polyidus aligns the selected reads to the host genome, permitting partial mapping. Third, Polyidus identifies chimeric reads: those reads mapped partially to the host genome and partially to the virus genome. Fourth, for each chimeric read, Polyidus reports the start and strand of integration in both the host and viral genomes. Polyidus also reports the number of chimeric reads supporting each integration site. [more inside]
We have created a new method to find transcription factor motifs in ChIP-seq data using knockout controls. Available on PyPI and GitHub.
Our new method predicts transcription factor and chromatin factor locations in a cell type using new kinds of data like chromatin factor binding in other cell types and learning the association of gene expression patterns with chromatin factor binding patterns. We've made free software available and a track hub that can load our predictions for 36 chromatin factors in 33 human tissue types into the UCSC Genome Browser.
Many biological experiments use DNA sequencing as a readout. We can often map the sequenced DNA back to a specific region of the genome. Sometimes, however, we can't. Genomic data is less reliable in those regions. My lab has developed software that makes it easy to identify these regions. We also developed a new method that lets us find those regions in the context of bisulfite sequencing, a technique used to determine where DNA is chemically modified. [more inside]
I taught a learn-to-program course this summer for biologists with no previous programming experience. I got lots of requests from the participants for an online version of the course, so here it is! Learn to program in Python with no previous experience required, using lots of biological / bioinformatics examples. [more inside]
BEDOPS is a suite of tools to address common questions raised in genomic studies, mostly with regard to overlap and proximity relationships between data sets. BEDOPS aims to be scalable, flexible and performant, facilitating the efficient and accurate analysis and management of large-scale genomic data.