Graph representations for identifying a next word
Abstract:
Systems, methods, and computer-executable instructions for approximating a softmax layer are disclosed. A small world graph that includes a plurality of nodes is constructed for a vocabulary of a natural language model. A context vector is transformed. The small world graph is searched using the transformed context vector to identify a top-K hypothesis. A distance from the context vector for each of the top-K hypothesis is determined. The distance is transformed to an original inner product space. A softmax distribution is computed for the softmax layer over the inner product space of the top-K hypothesis. The softmax layer is useful for determining a next word in a speech recognition or machine translation.
Public/Granted literature
Information query
Patent Agency Ranking
0/0