Corpus quality analysis
Abstract:
A mechanism is provided in a data processing system for corpus quality analysis. The mechanism applies at least one filter to a candidate corpus to determine a degree to which the candidate corpus supplements existing corpora for performing a natural language processing (NLP) operation. Responsive to a determination to add the candidate corpus to the existing corpora based on a result of applying the at least one filter, the mechanism adds the candidate corpus to the existing corpora to form modified corpora. The mechanism performs the NLP operation using the modified corpora.
Public/Granted literature
Information query
Patent Agency Ranking
0/0