Jason L Weirather edited this page Dec 23, 2017 · 59 revisions

Welcome to AlignQC

How to cite:

Weirather JL, de Cesare M, Wang Y et al. Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis [version 1; referees: awaiting peer review]. F1000Research 2017, 6:100 (http://dx.doi.org/10.12688/f1000research.10571.1)

Long read alignment analysis

AlignQC is a pure-Python solution designed to analyze BAM format alignment data and report useful information about the composition and quality of the data, including: mappability, error rates, error patterns, transcriptome coverage, full/partial-length detection of isoforms, rarefaction curves, and 5' to 3' bias. AlignQC is free to use and modify under the Apache 2.0 license.

The Python libraries used in AlignQC are also free to use and to build tools on under the Apache 2.0 license, and are available from Au-public.

Warning: if you want to use this for Illumina/short-read data, please see the short reads section of the manual.

What does AlignQC do?

The purpose of AlignQC is to:

  1. Give the user as much information as possible about a long-read alignment.
  2. Make it convenient to generate reports on Linux with a single-step command-line interface executing Python/R code.
  3. Make the reports EASY to share and view on any operating system with collaborators who may have fewer computational resources.
  4. Give users a compressed all-in-one archive of the analysis with both Command Line Interface and browser access to all of the data generated in the analysis for future use.

The way we do this was inspired by the ever-useful FastQC software: we provide users with an HTML output in which all of the images and data are embedded in the document as data URIs.
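To illustrate the general idea of data URI embedding, here is a minimal Python sketch. It is not AlignQC's actual code: the function names `to_data_uri` and `build_report` and the page layout are hypothetical, chosen only to show how binary content can be carried inside a single self-contained document.

```python
import base64
from xml.sax.saxutils import escape


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI (RFC 2397)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_report(title: str, image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return a tiny XHTML page with one image embedded inline.

    Hypothetical layout for illustration; a real report would hold
    many figures and tables.
    """
    uri = to_data_uri(image_bytes, mime_type)
    safe_title = escape(title)
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f"<head><title>{safe_title}</title></head>\n"
        f'<body><h1>{safe_title}</h1><img alt="figure" src="{uri}"/></body>\n'
        "</html>\n"
    )


# Placeholder bytes stand in for a real PNG file here.
page = build_report("Example report", b"not-a-real-png")
```

Because the image travels inside the `src` attribute, the resulting file can be emailed or opened in a browser without any accompanying folder of assets.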

This way we can provide collaborators with either

  • A "portable" xhtml file with just the viewable images and primary statistics in a single file.
  • A "full" xhtml file with ALL data and high resolution figures in a single file.

And the bioinformatician can continue to make use of

  • A folder with all the analysis data
  • The xhtml file with CLI access to any of the analysis data

The full xhtml may seem ideal, but as data sizes grow, this file type may become difficult for browsers to render and slow for standard XML parsers to extract data from.
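As a rough sketch of what extracting data with a standard XML parser could look like, the Python below walks an XHTML document and decodes any base64 data URIs it finds. The markup it expects (an `a` element with `download` and `href` attributes) and the function name `extract_embedded` are assumptions made for illustration; AlignQC's real report layout and its own CLI may differ.

```python
import base64
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def extract_embedded(xhtml_bytes: bytes) -> dict:
    """Map each download name to the decoded bytes of its data URI.

    Assumes (hypothetically) that data is stored as
    <a download="NAME" href="data:...;base64,...">.
    """
    root = ET.fromstring(xhtml_bytes)  # parses the whole file into memory
    found = {}
    for link in root.iter(XHTML_NS + "a"):
        name = link.get("download")
        href = link.get("href", "")
        if name is None or not href.startswith("data:"):
            continue
        header, _, body = href.partition(",")
        if header.endswith(";base64"):
            found[name] = base64.b64decode(body)
    return found


# Build a small sample document to exercise the function.
payload = b"reads\t42\n"
encoded = base64.b64encode(payload).decode("ascii")
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    f'<a download="stats.txt" href="data:text/plain;base64,{encoded}">stats</a>'
    '<a href="https://example.org/">not embedded</a>'
    "</body></html>"
).encode("utf-8")

extracted = extract_embedded(sample)
```

Note that `ET.fromstring` loads the entire document before any data can be read, which is one reason a very large all-in-one file can be slow to work with.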
