Invention Grant
US09026518B2 System and method for clustering content according to similarity (Active)

System and method for clustering content according to similarity
Abstract:
Systems and methods for clustering content according to similarity are provided that identify and group similar content using a set of tags associated with the content. A topic model of a group of content is built, producing a probability distribution of topic membership for the content. Individual items of content are then clustered using a clustering algorithm, and a distance matrix from the probability distribution is built. Based on the distance matrix, individual items of content are labeled as “must-link” or “cannot-link” pairs with the group of content. The topic model is then embedded into successively smaller dimensions using a kernel method, until the clustering is stable with respect to both the behavioral and content domains.
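The distance-matrix and pair-labeling steps described in the abstract might be sketched as follows. This is a minimal illustration and not the patented method. It assumes the topic model has already produced a per-item probability distribution over topics. The Hellinger distance, the two thresholds, and the sample data are assumptions chosen for the example, since the abstract names no specific metric or cutoff. The kernel embedding and the stability check are not covered.

```python
import math
from itertools import combinations


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    The choice of this metric is an assumption of the sketch.
    """
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def distance_matrix(topic_dists):
    """Build a symmetric pairwise distance matrix from topic distributions."""
    n = len(topic_dists)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = hellinger(topic_dists[i], topic_dists[j])
    return dist


def label_pairs(dist, must_below=0.2, cannot_above=0.6):
    """Label item pairs as must-link or cannot-link.

    Both thresholds are hypothetical; pairs falling between them are
    left unlabeled.
    """
    must_link, cannot_link = [], []
    for i, j in combinations(range(len(dist)), 2):
        if dist[i][j] <= must_below:
            must_link.append((i, j))
        elif dist[i][j] >= cannot_above:
            cannot_link.append((i, j))
    return must_link, cannot_link


# Hypothetical topic-membership distributions: four items, three topics.
topic_dists = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.05, 0.05, 0.90],
    [0.02, 0.03, 0.95],
]

dist = distance_matrix(topic_dists)
must_link, cannot_link = label_pairs(dist)
print("must-link:", must_link)
print("cannot-link:", cannot_link)
```

With this sample data, items 0 and 1 share a dominant topic, as do items 2 and 3, so those two pairs fall under the must-link threshold while every cross pair exceeds the cannot-link threshold. In a fuller pipeline these pair labels would act as constraints on a later clustering pass.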