String processing of clickstream data
Abstract:
A method includes assigning unique symbols respectively to potential interactions with a website. The method includes obtaining asserted interaction symbol sequences of multiple browsing sessions, respectively. Each browsing session corresponds to a visitor of the website. For each browsing session, the asserted interaction symbol sequence of the respective browsing session is a sequence of symbols, from among the unique symbols, that corresponds, respectively, to a sequence of asserted interactions with the website performed during the respective browsing session by the corresponding visitor. The method includes generating a master string including the asserted interaction symbol sequences by concatenating the asserted interaction symbol sequences together with sentinel symbols such that at least one sentinel symbol exists between each consecutive pair of asserted interaction symbol sequences. The method includes generating a suffix array corresponding to the master string. The method includes generating a longest common prefix (LCP) array corresponding to the suffix array.
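The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the interaction names, the use of positive integers as symbols, and the use of a distinct negative integer as the sentinel after each session are all assumptions made for illustration. The suffix array is built by naive sorting and the LCP array by Kasai's algorithm, both swapped in here as well-known techniques; the abstract does not specify how either array is generated.

```python
def assign_symbols(interactions):
    """Map each potential interaction to a unique positive integer symbol (assumed encoding)."""
    return {name: i + 1 for i, name in enumerate(interactions)}


def build_master(sessions, symbols):
    """Concatenate the per-session symbol sequences, each followed by a sentinel.

    Assumption: session k is followed by the sentinel -(k + 1), so every
    sentinel is distinct and no common prefix can span two sessions.
    """
    master = []
    for k, session in enumerate(sessions):
        master.extend(symbols[name] for name in session)
        master.append(-(k + 1))
    return master


def suffix_array(s):
    """Naive suffix array: sort suffix start positions by the suffix itself."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """Kasai's algorithm: lcp[k] is the common prefix length of suffixes sa[k-1] and sa[k]."""
    n = len(s)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


# Hypothetical interactions and sessions, for illustration only.
symbols = assign_symbols(["home", "search", "cart"])
sessions = [["home", "search", "cart"], ["search", "cart"]]
master = build_master(sessions, symbols)
sa = suffix_array(master)
lcp = lcp_array(master, sa)
```

In this toy input the largest LCP value is 2, which corresponds to the two-interaction path search, cart that appears in both sessions.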