System for generating genomics data, with adjusted quality scores, and device, method, and software product for use therein
Abstract:
Techniques for generating output genomics data are disclosed. The techniques include receiving a genome sequence read comprising at least one sequence of bases and associated quality scores, and processing the genome sequence read to generate the output genomics data at least in part by: (i) searching a reference genome corpus, comprising n-mers from a reference genome, for the at least one sequence of bases according to a similarity criterion; (ii) calculating, from the results of the search, an adjustment for one or more of the associated quality scores, where the adjustment for the quality score associated with a given base in the read uses a Bayesian estimate of the likelihood of a sequencing error at that base given the sequence of the read, the estimate itself utilising the results of the search; and (iii) adjusting the one or more associated quality scores according to the calculated adjustment.
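The three steps of the abstract (n-mer search, Bayesian estimation, score adjustment) can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes Phred-scaled quality scores, uses exact n-mer membership as the simplest possible similarity criterion, and takes the two likelihood values (`p_match_given_ok`, `p_match_given_err`) as arbitrary illustrative constants. All function and parameter names are hypothetical.

```python
import math


def build_nmer_corpus(reference, n):
    """Collect every n-mer of the reference genome into a set."""
    return {reference[i:i + n] for i in range(len(reference) - n + 1)}


def phred_to_prob(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def prob_to_phred(p, q_max=60):
    """Convert an error probability to a rounded Phred score, capped at q_max."""
    p = max(p, 10 ** (-q_max / 10))
    return round(-10 * math.log10(p))


def posterior_error(prior, matched,
                    p_match_given_ok=0.99, p_match_given_err=0.01):
    """Bayes' rule: P(error | search result).

    prior   -- error probability implied by the original quality score
    matched -- True if the n-mer covering the base was found in the corpus
    The two likelihoods are assumed values chosen for illustration only.
    """
    if matched:
        l_err, l_ok = p_match_given_err, p_match_given_ok
    else:
        l_err, l_ok = 1 - p_match_given_err, 1 - p_match_given_ok
    return prior * l_err / (prior * l_err + (1 - prior) * l_ok)


def adjust_qualities(bases, quals, corpus, n):
    """Return adjusted Phred scores for a read of length >= n."""
    adjusted = []
    for i, q in enumerate(quals):
        # One n-mer window roughly centred on base i, clamped to the read.
        start = min(max(i - n // 2, 0), len(bases) - n)
        matched = bases[start:start + n] in corpus
        post = posterior_error(phred_to_prob(q), matched)
        adjusted.append(prob_to_phred(post))
    return adjusted


# Toy example: the read differs from the reference at one position.
reference = "ACGTACGTTAGC"
corpus = build_nmer_corpus(reference, 4)
read_bases = "ACGTAGGT"
read_quals = [20] * len(read_bases)
new_quals = adjust_qualities(read_bases, read_quals, corpus, 4)
print(new_quals)
```

In this toy run, bases whose covering n-mer is found in the corpus receive a higher score than the original 20, and bases whose covering n-mer is absent receive a lower one. Because only one window is examined per base, a single mismatch lowers the scores of its neighbours as well; a fuller treatment would combine evidence from every n-mer overlapping the base and would admit approximate matches under the similarity criterion.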