Workflows - Assembly
Sequence assembly is a major challenge in metagenomic studies. The raw reads from a typical metagenomic sample usually cannot be assembled into full genomes because coverage is low and reads are short. Even the higher coverage provided by next-generation short-read sequencers (e.g. GA) is not enough for low-abundance species within metagenomic samples. Chimerism is another problem, caused by high sequence similarity and unknown variation among closely related species. For these reasons, many metagenomic studies analyze the raw reads directly, without sequence assembly.
Assembly:
CAMERA has developed a three-step meta-assembly procedure for metagenomic data. Because pyrosequencing is the dominant platform in metagenomics today, this work focuses on the assembly of 454 sequences. We selected seven assembly programs, originally developed for single genomes, that can handle typical pyrosequencing datasets. These programs are run first to generate a pool of different types of contigs, and the pooled contigs are then assembled again. Our results show that this meta-assembler performs significantly better than any of its component assembly algorithms.
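The pooling step described above can be illustrated with a minimal Python sketch. It is not the CAMERA implementation: the function names (`parse_fasta`, `pool_contigs`), the `assembler|contig_id` naming scheme, and the optional `min_length` filter are all assumptions made for illustration. It only shows how contigs from several assemblers might be merged into one FASTA pool, with unique identifiers, before a second round of assembly.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pool_contigs(contigs_by_assembler, min_length=0):
    """Merge contigs from several assemblers into one FASTA string.

    contigs_by_assembler maps a (hypothetical) assembler label to the
    FASTA text of its contigs. Each contig ID is prefixed with the
    label so that IDs stay unique in the pooled file.
    """
    records = []
    for assembler, fasta_text in contigs_by_assembler.items():
        for header, seq in parse_fasta(fasta_text):
            if len(seq) < min_length:
                continue  # assumed filter: drop very short contigs
            fields = header.split()
            contig_id = fields[0] if fields else "contig"
            records.append(f">{assembler}|{contig_id}\n{seq}")
    return "\n".join(records) + ("\n" if records else "")


if __name__ == "__main__":
    pooled = pool_contigs({
        "velvet": ">c1 len=4\nACGT\n",
        "abyss": ">c1\nGG\nCC\n",
    })
    print(pooled, end="")
```

Prefixing each identifier with its source keeps contigs from different programs distinguishable when two assemblers happen to emit the same contig name.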
This workflow accepts a FASTA file as input and produces two output files: a FASTA file of assembled contigs and a read-to-contig mapping table.
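Once the contig FASTA file has been downloaded, a few summary statistics give a quick impression of the assembly. The following Python sketch is one possible way to compute them. The function names and the choice of statistics (contig count, total bases, longest contig, N50) are assumptions for illustration, not part of the workflow. The layout of the mapping table is not described here, so the sketch reads only the contig file.

```python
def contig_lengths(fasta_text):
    """Return the length of every sequence in FASTA-formatted text."""
    lengths = []
    current = None  # running length of the record being read
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no contigs


def summarize(fasta_text):
    """Collect basic assembly statistics into a dictionary."""
    lengths = contig_lengths(fasta_text)
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # "contigs.fasta" is a placeholder for the downloaded output file.
    with open("contigs.fasta") as handle:
        print(summarize(handle.read()))
```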
Workflow Components:
The assembly workflow is composed of several components, including:
- Velvet
- ABySS
- SSAKE
- Taipan
- Newbler
- Celera
- SOAPdenovo
Notes:
- To adjust default parameters, click on the “Advanced Parameters” tab on the workflow submission form.
- Currently the assembly workflow does not have a graphical output. Please download the results to your local computer for viewing.
