Machine learning-based automated narrative text scoring including emotion arc characterization

    Publication No.: US12271694B1

    Publication date: 2025-04-08

    Application No.: US17238368

    Filing date: 2021-04-23

    Abstract: Quality of a narrative is characterized by receiving data that includes a narrative text. This narrative text is then tokenized and events are extracted from the tokenized words. The extraction can use, in parallel, two or more different extraction techniques. The extracted events are then aggregated so that a waveform can be generated, based on the aggregated events, that characterizes a plurality of emotional arcs within the narrative text. Subsequently, a plurality of waveform elements are extracted from the waveform. The narrative quality (or other quality) of the narrative text is then scored based on the extracted plurality of waveform elements and using a machine learning model trained to correlate emotional arc waveforms with narrative quality scores. Related apparatus, systems, techniques and articles are also described.
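
    The waveform idea in this abstract can be illustrated with a minimal sketch. It does not reproduce the patented event extraction: it assumes a tiny hand-made valence lexicon (`VALENCE`) in place of extracted events, a moving average as the aggregation step, and three hypothetical waveform elements (peak, trough, number of turns). A trained model would then consume such elements; that step is not shown.

```python
from statistics import mean

# Hypothetical toy lexicon (word -> valence in [-1, 1]); not from the patent.
VALENCE = {"lost": -0.8, "afraid": -0.6, "found": 0.7, "happy": 0.9, "won": 0.8}

def emotion_waveform(text, window=3):
    """Return a smoothed valence series with one value per token."""
    tokens = text.lower().split()
    raw = [VALENCE.get(t.strip(".,!?"), 0.0) for t in tokens]
    half = window // 2
    # Moving average over a centered window, truncated at the edges.
    return [mean(raw[max(0, i - half): i + half + 1]) for i in range(len(raw))]

def waveform_elements(wave):
    """Extract simple elements: peak, trough, and number of direction changes."""
    deltas = [b - a for a, b in zip(wave, wave[1:])]
    signs = [d > 0 for d in deltas if d != 0]
    turns = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return {"peak": max(wave), "trough": min(wave), "turns": turns}

wave = emotion_waveform(
    "She lost the map and was afraid. Then she found it and was happy."
)
features = waveform_elements(wave)
```

    Here `features` is the kind of compact feature dictionary that could be passed to a scoring model; the choice of features is an assumption made for illustration.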

    Training and domain adaptation for supervised text segmentation

    Publication No.: US12204856B1

    Publication date: 2025-01-21

    Application No.: US17482548

    Filing date: 2021-09-23

    Abstract: Data such as unstructured text is received that includes a sequence of sentences. This received data is then tokenized into a plurality of tokens. The received data is segmented using a hierarchical transformer network model including a token transformer, a sentence transformer, and a segmentation classifier. The token transformer contextualizes tokens within sentences and yields sentence embeddings. The sentence transformer contextualizes sentence representations based on the sentence embeddings. The segmentation classifier predicts segments of the received data based on the contextualized sentence representations. Data can be provided which characterizes the segmentation of the received data. Related apparatus, systems, techniques and articles are also described.

    Systems and methods for neural content scoring

    Publication No.: US11790227B1

    Publication date: 2023-10-17

    Application No.: US17148742

    Filing date: 2021-01-14

    CPC classification number: G06N3/08 G06F40/232 G06F40/284

    Abstract: Systems and methods are disclosed for automatically scoring a constructed response using a neural network. In embodiments, a constructed response received by a processing system may be processed to divide the constructed response into multiple series of word tokens, wherein each word token includes a sequence of characters. The constructed response may be further processed to correct one or more spelling errors. The word tokens may be encoded to generate representation vectors for the constructed response. A set of nonlinear operations may be applied to the plurality of representation vectors in a neural network to generate a single vector output. A set of predetermined network weights may be applied to the vector output of the neural network to generate a scalar output for scoring the constructed response.
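
    The data flow described here (tokens, representation vectors, nonlinear operations, a single pooled vector, then weights producing a scalar) can be sketched in plain Python. Everything concrete below is an assumption for illustration: the character-folding `encode` function, the tanh-plus-mean-pooling `network`, and the placeholder `WEIGHTS` are not taken from the patent, and the spelling-correction step is omitted.

```python
import math

def tokenize(response):
    """Split a response into lowercase word tokens (each a sequence of characters)."""
    stripped = (w.strip(".,!?;:") for w in response.split())
    return [w.lower() for w in stripped if w]

def encode(token, dim=4):
    """Hypothetical deterministic encoding: fold character codes into `dim` slots."""
    vec = [0.0] * dim
    for i, ch in enumerate(token):
        vec[i % dim] += (ord(ch) % 32) / 32.0
    return vec

def network(vectors):
    """Apply tanh to every component, then mean-pool into one output vector."""
    activated = [[math.tanh(x) for x in v] for v in vectors]
    return [sum(col) / len(activated) for col in zip(*activated)]

def score(response, weights, bias=0.0):
    """Apply predetermined weights to the pooled vector to obtain a scalar."""
    pooled = network([encode(t) for t in tokenize(response)])
    return sum(w * x for w, x in zip(weights, pooled)) + bias

WEIGHTS = [0.5, -0.25, 0.75, 0.1]  # placeholder values, not trained
s = score("Plants make food from sunlight.", WEIGHTS)
```

    In a real system the encoder and the weights would be learned from scored responses; the sketch only shows how a variable-length response is reduced to one number.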

    Detection of off-topic spoken responses using machine learning

    Publication No.: US11455999B1

    Publication date: 2022-09-27

    Application No.: US16844439

    Filing date: 2020-04-09

    Abstract: Data is received that encapsulates a spoken response to a prompt text comprising a string of words. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with the prompt text so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words in the spoken response and the string of words in the prompt text. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been off-topic. Data encapsulating the score can then be provided. Related apparatus, systems, techniques and articles are also described.
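
    A similarity grid of this kind can be sketched as a matrix with one row per response word and one column per prompt word. The sketch assumes exact string match as the similarity measure and uses a hand-written `overlap_ratio` as a stand-in for the machine learning model that scores the grid; both are illustrative assumptions, and transcription is taken as already done.

```python
def similarity_grid(response_words, prompt_words):
    """Rows are response words, columns are prompt words; 1.0 on exact match, else 0.0."""
    return [[1.0 if r == p else 0.0 for p in prompt_words] for r in response_words]

def overlap_ratio(grid):
    """Fraction of response words that match at least one prompt word."""
    return sum(1 for row in grid if any(row)) / len(grid)

prompt = "describe your favorite city and why you like it".split()
on_topic = "my favorite city is paris and i like it".split()
off_topic = "the recipe needs two eggs".split()

grid_on = similarity_grid(on_topic, prompt)
grid_off = similarity_grid(off_topic, prompt)
```

    With these inputs, five of the nine on-topic response words match a prompt word, while the off-topic response matches none, so the two grids are easy to tell apart even with this crude statistic.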

    Detection of plagiarized spoken responses using machine learning

    Publication No.: US11417339B1

    Publication date: 2022-08-16

    Application No.: US16695348

    Filing date: 2019-11-26

    Abstract: Data is received that encapsulates a spoken response to a test question. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with at least one source string so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words and the at least one source string. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been plagiarized. Data encapsulating the score can then be provided. Related apparatus, systems, techniques and articles are also described.

    Systems and methods for treatment of aberrant responses

    Publication No.: US11049409B1

    Publication date: 2021-06-29

    Application No.: US14974721

    Filing date: 2015-12-18

    Abstract: Systems and methods are provided for automatically scoring a response and statistically evaluating whether it can be considered aberrant. In one embodiment, a constructed response is evaluated via a pre-screening stage and a post-hoc screening stage. The pre-screening stage attempts to determine whether the constructed response is aberrant based on a variety of aberration metrics and criteria. If the constructed response is deemed not to be aberrant, then the post-hoc screening stage attempts to predict a discrepancy between the score an automated scoring system would assign and the score a human rater would assign to the response. If the discrepancy is sufficiently low, then the constructed response may be scored by an automated scoring engine. On the other hand, if the constructed response fails to pass either of the two stages, then a flag may be raised to indicate that additional human review may be needed.
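
    The two-stage routing logic can be sketched as follows. The aberration checks (minimum length, repetitiveness), the thresholds, and the function name `route_response` are hypothetical; the predicted human/machine discrepancy is passed in as a number because the model that would predict it is outside the scope of the sketch.

```python
def route_response(response, predicted_discrepancy, min_words=5, max_discrepancy=0.5):
    """Return 'auto' to score automatically or 'human' to flag for review."""
    words = response.split()

    # Stage 1 (pre-screening): hypothetical aberration metrics.
    too_short = len(words) < min_words
    repetitive = len(words) > 0 and len(set(words)) / len(words) < 0.3
    if too_short or repetitive:
        return "human"

    # Stage 2 (post-hoc screening): the predicted gap between the automated
    # score and a human rater's score must be sufficiently low.
    if predicted_discrepancy > max_discrepancy:
        return "human"

    return "auto"
```

    For example, `route_response("the cat sat on the mat today", 0.1)` passes both stages, while the same text with a predicted discrepancy of 0.9 is flagged for human review.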

    Platform for administering and evaluating narrative essay examinations

    Publication No.: US10885274B1

    Publication date: 2021-01-05

    Application No.: US16014021

    Filing date: 2018-06-21

    Abstract: Systems and methods are provided for processing a response to essay prompts that request a narrative response. A data structure associated with a narrative essay is accessed. The essay is analyzed to generate an organization subscore, where the organization subscore is generated using a graph metric by identifying content words in each sentence of the essay and populating a data structure with links between related content words in neighboring sentences, wherein the organization subscore is determined based on the links. The essay is analyzed to generate a development subscore, where the development subscore is generated using a transition metric by accessing a transition cue data store and identifying transition words in the essay, wherein the development subscore is based on a number of words in the essay that match words in the transition cue data store. A narrative quality metric is determined based on the organization subscore and the development subscore.
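
    The two subscores can be sketched under simplifying assumptions: content words count as "related" only when they are identical, the transition cue data store is a small hard-coded set, and the final metric is a plain sum. None of these choices comes from the patent; the word lists and function names are hypothetical.

```python
TRANSITION_CUES = {"then", "however", "finally", "because", "meanwhile"}  # hypothetical store
STOPWORDS = {"the", "a", "an", "and", "was", "it", "to", "of", "in", "she", "he"}

def sentences(essay):
    """Split an essay into lowercase sentences, each a list of raw word strings."""
    text = essay.lower().replace("!", ".").replace("?", ".")
    return [s.split() for s in text.split(".") if s.strip()]

def content_words(sentence):
    """Words of a sentence that are neither stopwords nor transition cues."""
    return {w.strip(",;:") for w in sentence} - STOPWORDS - TRANSITION_CUES

def organization_subscore(essay):
    """Average number of links (shared content words) between neighboring sentences."""
    sents = sentences(essay)
    links = [
        (i, w)
        for i in range(len(sents) - 1)
        for w in content_words(sents[i]) & content_words(sents[i + 1])
    ]
    return len(links) / max(1, len(sents) - 1)

def development_subscore(essay):
    """Number of words in the essay that match the transition cue store."""
    words = [w.strip(",;:") for s in sentences(essay) for w in s]
    return sum(1 for w in words if w in TRANSITION_CUES)

def narrative_quality(essay):
    """Combine the two subscores; a plain sum is an assumption."""
    return organization_subscore(essay) + development_subscore(essay)

essay = "The dog ran to the park. Then the dog found a ball. Finally, the ball rolled away."
quality = narrative_quality(essay)
```

    In the sample essay, "dog" links the first two sentences and "ball" links the last two, while "then" and "finally" are counted as transition cues.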

    Computer-implemented systems and methods for generating a supervised model for lexical cohesion detection

    Publication No.: US10515314B2

    Publication date: 2019-12-24

    Application No.: US14957769

    Filing date: 2015-12-03

    Abstract: Systems and methods are provided for a computer-implemented method for identifying pairs of cohesive words within a text. A supervised model is trained to detect cohesive words within a text to be scored. Training the supervised model includes identifying a plurality of pairs of candidate cohesive words in a training essay and an order associated with the pairs of candidate cohesive words based on an order of words in the training essay. The pairs of candidate cohesive words are filtered to form a set of evaluation pairs. The evaluation pairs are provided via a graphical user interface based on the order associated with the pairs of candidate cohesive words. An indication of cohesion or no cohesion is received for the evaluation pairs via the graphical user interface. The supervised model is trained based on the evaluation pairs and the received indications.
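
    The data preparation steps (candidate pairs in text order, filtering, pairing with annotator indications) can be sketched as below. The stopword list, the distance-based filter, and the `labels` list standing in for input collected through the graphical user interface are all assumptions for illustration; fitting the supervised model itself is not shown.

```python
from itertools import combinations

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in"}  # hypothetical list

def candidate_pairs(text):
    """Pairs of distinct content words with positions, in order of appearance."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    content = [(i, w) for i, w in enumerate(words) if w and w not in STOPWORDS]
    return [((i, a), (j, b)) for (i, a), (j, b) in combinations(content, 2) if a != b]

def filter_pairs(pairs, max_distance=3):
    """Hypothetical filter: keep pairs whose words lie close together in the text."""
    return [p for p in pairs if p[1][0] - p[0][0] <= max_distance]

def training_examples(pairs, labels):
    """Join evaluation pairs with annotator indications (True means cohesive)."""
    return [((a, b), label) for ((_, a), (_, b)), label in zip(pairs, labels)]

pairs = candidate_pairs("The storm flooded the river and the village.")
evaluation = filter_pairs(pairs)
labels = [True, True, True, False]  # stand-in for indications received via a GUI
examples = training_examples(evaluation, labels)
```

    The resulting `examples` list is the kind of labeled data on which a supervised cohesion classifier could then be trained.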
