June 25, 2013, 8:05 p.m. by Rosalind Team
Topics: Bioinformatics Tools, NGS
A Good Sort
Filtering out poor-quality reads from a sequencing run can have multiple benefits. Not only is it likely to improve results, but it will generally reduce the time and memory overhead required for assembly or analysis.
Poor-quality reads can be filtered out using the
FASTQ Quality Filter tool from the FASTX toolkit.
A command-line version of FASTX can be downloaded for Linux or MacOS from its website.
An online interface for the
FASTQ Quality Filter is also available here within the Galaxy web platform.
Given: A quality threshold value
Return: Number of reads in filtered FASTQ entries
20 90 @Rosalind_0049_1 GCAGAGACCAGTAGATGTGTTTGCGGACGGTCGGGCTCCATGTGACACAG + FD@@;C<AI?4BA:=>C<G=:AE=><A??>764A8B797@A:58:527+, @Rosalind_0049_2 AATGGGGGGGGGAGACAAAATACGGCTAAGGCAGGGGTCCTTGATGTCAT + 1<<65:793967<4:92568-34:.>1;2752)24')*15;1,.3*3+*! @Rosalind_0049_3 ACCCCATACGGCGAGCGTCAGCATCTGATATCCTCTTTCAATCCTAGCTA + B:EI>JDB5=>DA?E6B@@CA?C;=;@@C:6D:3=@49;@87;::;;?8+