Goby is a next-gen data management framework designed to facilitate the implementation of
    efficient next-gen data analysis pipelines. The program is distributed under the
    GNU General Public License (GPL). See the download page for the most recent distribution.
Goby provides compressed file formats that are time and space efficient. It also provides
    a few utilities that support the most common secondary data analyses. Goby defines and uses
    several file formats. These formats include:
- compact reads
- An alternative to FASTA/FASTQ that is fast to parse, unambiguous, compact, and chunkable.
    Chunkability means that a very large file can be processed in independent chunks: only the
    chunk of interest is read, without traversing the entire file. This property is
    leveraged by GobyWeb to support parallel alignments.
- compact alignments
- An alternative to the ELAND text format, MAQ, or SAM. Goby alignments are chunkable,
    compact, unambiguous, and fast to parse.
- counts
- A representation of the histogram of read counts along a reference sequence, at single
    base-pair resolution. This representation is highly space efficient. Each count transition
    (a position where the value of the count changes along the histogram) is generally encoded
    in about 13 bits.
- count archives
- An archive of counts, one histogram per reference sequence in an alignment. Archives
    can store histogram data for a complete genome. They are very space efficient, with only
    about 20 MB needed to store a histogram of reads aligned against the human genome at
    base-pair resolution. In contrast, a wiggle plot stored at 20 bp resolution needs about 45 MB.
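The idea of storing count transitions rather than one value per base can be illustrated with a
    minimal Python sketch. This is not Goby's on-disk encoding (the text above gives only the
    approximate cost per transition, not the bit layout); the function names `to_transitions`
    and `from_transitions` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def to_transitions(counts: List[int]) -> List[Tuple[int, int]]:
    """Return (position, new_count) pairs wherever the count changes.

    Assumption for this sketch: the count before position 0 is 0.
    """
    transitions = []
    previous = 0
    for position, count in enumerate(counts):
        if count != previous:
            transitions.append((position, count))
            previous = count
    return transitions


def from_transitions(transitions: List[Tuple[int, int]], length: int) -> List[int]:
    """Rebuild the per-base counts from the transition list."""
    counts = []
    current = 0
    pending = iter(transitions)
    upcoming = next(pending, None)
    for position in range(length):
        if upcoming is not None and upcoming[0] == position:
            current = upcoming[1]
            upcoming = next(pending, None)
        counts.append(current)
    return counts


# Ten positions collapse to three transitions.
per_base = [0, 0, 3, 3, 3, 5, 5, 0, 0, 0]
transitions = to_transitions(per_base)
restored = from_transitions(transitions, len(per_base))
```

Because read coverage tends to stay constant over runs of adjacent bases, the number of
    transitions is usually much smaller than the number of positions, which is where the space
    saving comes from.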
In addition to these file formats, Goby provides utilities that implement common next-gen data
    computations. See http://goby.campagnelab.org/ for details.
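The chunkability property described for compact reads and alignments can be sketched as follows.
    The example is a stand-in, not Goby's format: Goby files are binary and their chunk layout
    is not described here, so the sketch assumes newline-terminated records instead. The
    function `read_slice` and the sample records are hypothetical.

```python
import io
from typing import BinaryIO, Iterator


def read_slice(stream: BinaryIO, start: int, end: int) -> Iterator[bytes]:
    """Yield each newline-terminated record whose first byte lies in [start, end)."""
    if start > 0:
        # Step back one byte and discard through the next newline. A record
        # that starts exactly at `start` is kept; a record that began earlier
        # belongs to the previous slice and is skipped here.
        stream.seek(start - 1)
        stream.readline()
    else:
        stream.seek(0)
    while stream.tell() < end:
        record = stream.readline()
        if not record:
            break  # end of data
        yield record.rstrip(b"\n")


# Build a small in-memory "file" of ten records.
records = [b"read-%d:ACGT" % i for i in range(10)]
data = b"\n".join(records) + b"\n"
stream = io.BytesIO(data)

# Cut the byte range into three slices that could be handed to separate workers.
n_slices = 3
bounds = [len(data) * k // n_slices for k in range(n_slices + 1)]
slices = [list(read_slice(stream, bounds[k], bounds[k + 1])) for k in range(n_slices)]
merged = [record for part in slices for record in part]
```

Each slice seeks directly to its byte offset, so no worker reads the data that belongs to
    another slice, and together the slices cover every record exactly once.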