Biological sequence variant characterization
Abstract:
Short, fixed-length sub-sequences, termed reference sub-sequences, are extracted from a collection of reference sequences, and an index is constructed that records which reference sequences contain each reference sub-sequence. Sub-sequences of the same length, termed source sub-sequences, are extracted from a collection of source sequences derived from the sample whose signature is to be determined, and are tallied to determine the frequency of each within that collection. The presence or absence of each source sub-sequence, in combination with the index, is used to infer the presence or absence in the sample of each reference sequence from the reference collection.
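The three steps described in the abstract (index the reference sub-sequences, tally the source sub-sequences, infer which references are present) can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the sub-sequence length k, and the min_fraction threshold used as the inference rule are all assumptions chosen for illustration, and the abstract does not specify how presence is actually decided.

```python
from collections import Counter, defaultdict


def subsequences(seq, k):
    """Yield every overlapping length-k sub-sequence of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each reference sub-sequence to the set of reference names containing it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for sub in subsequences(seq, k):
            index[sub].add(name)
    return index


def count_source_subsequences(source_sequences, k):
    """Tally how often each length-k sub-sequence occurs across the source sequences."""
    counts = Counter()
    for seq in source_sequences:
        counts.update(subsequences(seq, k))
    return counts


def infer_presence(index, counts, min_fraction=1.0):
    """Call a reference present when at least min_fraction of its distinct
    sub-sequences were observed in the source (hypothetical decision rule)."""
    totals = Counter()  # distinct sub-sequences per reference
    hits = Counter()    # of those, how many were seen in the source
    for sub, names in index.items():
        observed = counts.get(sub, 0) > 0
        for name in names:
            totals[name] += 1
            if observed:
                hits[name] += 1
    return {name: hits[name] / totals[name] >= min_fraction for name in totals}


if __name__ == "__main__":
    refs = {"refA": "ACGTAC", "refB": "TTGACA"}  # toy reference collection
    reads = ["ACGTA", "GTACG"]                   # toy source sequences
    idx = build_index(refs, k=3)
    freq = count_source_subsequences(reads, k=3)
    print(infer_presence(idx, freq))
```

In this toy input every length-3 sub-sequence of refA appears in the reads and none of refB's do, so only refA is called present. A real implementation would likely lower min_fraction or weight by frequency to tolerate sequencing errors and sub-sequences shared between references; those choices are outside what the abstract states.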