Invention Grant
US08352469B2 Automatic generation of stop word lists for information retrieval and analysis (In force)

  • Patent Title: Automatic generation of stop word lists for information retrieval and analysis
  • Application No.: US12555962
    Application Date: 2009-09-09
  • Publication No.: US08352469B2
    Publication Date: 2013-01-08
  • Inventor: Stuart J Rose
  • Applicant: Stuart J Rose
  • Applicant Address: US WA Richland
  • Assignee: Battelle Memorial Institute
  • Current Assignee: Battelle Memorial Institute
  • Current Assignee Address: US WA Richland
  • Agent: Allan C. Tuan
  • Main IPC: G06F7/00
  • IPC: G06F7/00 G06F17/30
Automatic generation of stop word lists for information retrieval and analysis
Abstract:
Methods and systems for automatically generating lists of stop words for information retrieval and analysis. Generation of the stop words can include providing a corpus of documents and a plurality of keywords. From the corpus of documents, a term list of all terms is constructed and both a keyword adjacency frequency and a keyword frequency are determined. If a ratio of the keyword adjacency frequency to the keyword frequency for a particular term on the term list is less than a predetermined value, then that term is excluded from the term list. The resulting term list is truncated based on predetermined criteria to form a stop word list.
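The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the function name generate_stop_words and its parameters min_ratio and max_words are hypothetical; "keyword adjacency frequency" is taken to mean the number of times a term occurs immediately before or after a keyword occurrence; "keyword frequency" is taken to mean the number of times a term occurs inside a keyword occurrence; terms never seen next to a keyword are dropped; and the "predetermined criteria" for truncation are assumed to be the most frequent remaining terms.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens are an adequate notion of "term".
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_stop_words(documents, keywords, min_ratio=1.0, max_words=10):
    """Sketch of the abstract's steps; all names and defaults are hypothetical."""
    keyword_tokens = [tokenize(k) for k in keywords]
    term_freq = Counter()       # occurrences of each term in the corpus
    adjacency_freq = Counter()  # occurrences directly next to a keyword
    keyword_freq = Counter()    # occurrences inside a keyword

    for text in documents:
        tokens = tokenize(text)
        term_freq.update(tokens)
        for kw in keyword_tokens:
            n = len(kw)
            if n == 0:
                continue
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != kw:
                    continue
                keyword_freq.update(kw)
                if i > 0:
                    adjacency_freq[tokens[i - 1]] += 1
                if i + n < len(tokens):
                    adjacency_freq[tokens[i + n]] += 1

    kept = []
    for term in term_freq:
        adj = adjacency_freq[term]
        kf = keyword_freq[term]
        # Assumption: a term never adjacent to any keyword is not a candidate.
        if adj == 0:
            continue
        # Exclude when adj / kf < min_ratio; the comparison is cross-multiplied
        # so that kf == 0 does not cause a division by zero.
        if adj < min_ratio * kf:
            continue
        kept.append(term)

    # Assumed truncation rule: keep the most frequent remaining terms,
    # breaking ties alphabetically.
    kept.sort(key=lambda t: (-term_freq[t], t))
    return kept[:max_words]


documents = [
    "the analysis of linear constraints over the set of natural numbers",
    "criteria of compatibility of a system of linear equations",
]
keywords = ["linear constraints", "natural numbers", "linear equations", "compatibility"]
stop_words = generate_stop_words(documents, keywords)
```

In this toy corpus, "of" and "over" border keyword occurrences without appearing inside any keyword, so they survive the ratio test, while words such as "linear" occur only inside keywords and are excluded.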