-
1.
Publication No.: US12271694B1
Publication Date: 2025-04-08
Application No.: US17238368
Application Date: 2021-04-23
Applicant: Educational Testing Service
Inventor: Swapna Somasundaran , Xianyang Chen , Michael Flor
IPC: G06F40/284 , G06F40/30 , G06N20/00
Abstract: Quality of a narrative is characterized by receiving data that includes a narrative text. This narrative text is then tokenized and events are extracted from the tokenized words. The extraction can use, in parallel, two or more different extraction techniques. The extracted events are then aggregated so that a waveform can be generated based on the aggregated extracted events that characterizes a plurality of emotional arcs within the narrative text. Subsequently, a plurality of waveform elements are extracted from the waveform. The narrative quality (or other quality) of the narrative text is then scored based on the extracted plurality of waveform elements and using a machine learning model trained to correlate emotional arc waveforms with narrative quality scores. Related apparatus, systems, techniques and articles are also described.
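The pipeline in this abstract (tokenize, extract events, aggregate into a waveform, extract waveform elements) can be illustrated with a minimal sketch. It is not the patented method: the `POLARITY` lexicon, the single lexicon-based extractor (the abstract allows two or more in parallel), the equal-width binning, and the chosen features are all assumptions made for illustration, and the trained scoring model is left out.

```python
import re

# Hypothetical toy lexicon mapping event words to emotional polarity.
POLARITY = {"lost": -1.0, "cried": -1.0, "found": 1.0, "smiled": 1.0, "won": 1.0}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_events(tokens):
    """Return (position, polarity) pairs for tokens found in the lexicon."""
    return [(i, POLARITY[t]) for i, t in enumerate(tokens) if t in POLARITY]


def waveform(events, n_tokens, n_bins=4):
    """Aggregate event polarities into n_bins equal-width bins over the text."""
    bins = [0.0] * n_bins
    for pos, pol in events:
        bins[min(pos * n_bins // n_tokens, n_bins - 1)] += pol
    return bins


def waveform_elements(wave):
    """Extract simple features: peak, trough, and number of direction changes."""
    diffs = [b - a for a, b in zip(wave, wave[1:])]
    signs = [d > 0 for d in diffs if d != 0]
    turns = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return {"peak": max(wave), "trough": min(wave), "turns": turns}


tokens = tokenize("She lost her dog and cried. Then she found him and smiled.")
wave = waveform(extract_events(tokens), len(tokens))
features = waveform_elements(wave)
```

The resulting `features` dictionary stands in for the waveform elements that a trained model would map to a narrative quality score.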
-
2.
Publication No.: US12204856B1
Publication Date: 2025-01-21
Application No.: US17482548
Application Date: 2021-09-23
Applicant: Educational Testing Service
Inventor: Swapna Somasundaran , Goran Glavaš
IPC: G06F40/284 , G06F18/214 , G06N3/04 , G06N3/08
Abstract: Data such as unstructured text is received that includes a sequence of sentences. This received data is then tokenized into a plurality of tokens. The received data is segmented using a hierarchical transformer network model including a token transformer, a sentence transformer, and a segmentation classifier. The token transformer contextualizes tokens within sentences and yields sentence embeddings. The sentence transformer contextualizes sentence representations based on the sentence embeddings. The segmentation classifier predicts segments of the received data based on the contextualized sentence representations. Data can be provided which characterizes the segmentation of the received data. Related apparatus, systems, techniques and articles are also described.
-
3.
Publication No.: US11790227B1
Publication Date: 2023-10-17
Application No.: US17148742
Application Date: 2021-01-14
Applicant: Educational Testing Service
Inventor: Brian W. Riordan , Kenneth Steimel , Michael Flor , Robert A. Pugh
IPC: G06N3/08 , G06F40/284 , G06F40/232
CPC classification number: G06N3/08 , G06F40/232 , G06F40/284
Abstract: Systems and methods are disclosed for automatically scoring a constructed response using a neural network. In embodiments, a constructed response received by a processing system may be processed to divide the constructed response into multiple series of word tokens, wherein each word token includes a sequence of characters. The constructed response may be further processed to correct one or more spelling errors. The word tokens may be encoded to generate representation vectors for the constructed response. A set of nonlinear operations may be applied to the plurality of representation vectors in a neural network to generate a single vector output. A set of predetermined network weights may be applied to the vector output of the neural network to generate a scalar output for scoring the constructed response.
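The data flow in this abstract (word tokens, representation vectors, nonlinear operations yielding a single vector, then weights yielding a scalar) can be traced with a toy sketch. Everything concrete here is an assumption: the character-bucket `encode` function, the tanh-and-average pooling, and the hand-picked weights are placeholders rather than the neural network the patent describes, and the spelling-correction step is omitted.

```python
import math


def tokenize(response):
    """Split a response into lowercase word tokens (sequences of characters)."""
    return response.lower().split()


def encode(token, dim=4):
    """Hypothetical encoding: count characters by code point modulo dim."""
    vec = [0.0] * dim
    for ch in token:
        vec[ord(ch) % dim] += 1.0
    return vec


def pool(vectors):
    """Apply tanh element-wise, then average into a single vector."""
    dim = len(vectors[0])
    return [sum(math.tanh(v[i]) for v in vectors) / len(vectors) for i in range(dim)]


def score(response, weights, bias=0.0):
    """Map a response to a scalar: encode, pool, then take a weighted sum."""
    vectors = [encode(t) for t in tokenize(response)]
    if not vectors:
        return bias  # an empty response contributes only the bias
    return sum(w * x for w, x in zip(weights, pool(vectors))) + bias


# Placeholder weights; a real system would use weights learned in training.
example = score("plants need sunlight", [0.5, 0.5, 0.5, 0.5])
```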
-
4.
Publication No.: US11455999B1
Publication Date: 2022-09-27
Application No.: US16844439
Application Date: 2020-04-09
Applicant: Educational Testing Service
Inventor: Xinhao Wang , Su-Youn Yoon , Keelan Evanini , Klaus Zechner , Yao Qian
Abstract: Data is received that encapsulates a spoken response to a prompt text comprising a string of words. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with the prompt text so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words in the spoken response and the string of words in the prompt text. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been off-topic. Data encapsulating the score can then be provided. Related apparatus, systems, techniques and articles are also described.
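A similarity grid of the kind described can be sketched as a response-by-prompt matrix. The sketch below assumes exact word match as the similarity measure and uses a simple overlap ratio in place of the machine learning model that scores the grid in the abstract; both choices are illustrative stand-ins.

```python
def similarity_grid(response_words, prompt_words):
    """Rows are response words, columns are prompt words; 1.0 marks an exact match."""
    return [[1.0 if r == p else 0.0 for p in prompt_words] for r in response_words]


def overlap_ratio(grid):
    """Fraction of response words that match at least one prompt word."""
    if not grid:
        return 0.0
    return sum(1 for row in grid if any(row)) / len(grid)


response = "i like my city".split()
prompt = "describe your city".split()
grid = similarity_grid(response, prompt)
ratio = overlap_ratio(grid)
```

A low ratio would hint that the response shares little vocabulary with the prompt; a real system would feed the whole grid to a trained model rather than reduce it to one number.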
-
5.
Publication No.: US11417339B1
Publication Date: 2022-08-16
Application No.: US16695348
Application Date: 2019-11-26
Applicant: Educational Testing Service
Inventor: Xinhao Wang , Keelan Evanini , Yao Qian , Klaus Zechner
IPC: G10L15/26 , G10L15/197 , G10L25/51 , G10L15/16
Abstract: Data is received that encapsulates a spoken response to a test question. Thereafter, the received data is transcribed into a string of words. The string of words is then compared with at least one source string so that a similarity grid representation of the comparison can be generated that characterizes a level of similarity between the string of words and the at least one source string. The grid representation is then scored using at least one machine learning model. The score indicates a likelihood of the spoken response having been plagiarized. Data encapsulating the score can then be provided. Related apparatus, systems, techniques and articles are also described.
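In a word-by-word similarity grid, a passage copied from a source shows up as a diagonal run of matches. As a hedged illustration of one feature such a grid exposes, the sketch below computes the longest diagonal run with dynamic programming. The function name and the exact-match criterion are assumptions, and the abstract's machine learning scorer is not reproduced.

```python
def longest_diagonal_run(response_words, source_words):
    """Length of the longest run of consecutive matching words (a grid diagonal)."""
    best = 0
    prev = [0] * (len(source_words) + 1)
    for r in response_words:
        cur = [0] * (len(source_words) + 1)
        for j, s in enumerate(source_words, start=1):
            if r == s:
                # Extend the diagonal that ended at the previous word pair.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


response = "well the quick brown fox is nice".split()
source = "the quick brown fox jumps".split()
run = longest_diagonal_run(response, source)
```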
-
6.
Publication No.: US11049409B1
Publication Date: 2021-06-29
Application No.: US14974721
Application Date: 2015-12-18
Applicant: Educational Testing Service
Inventor: Mo Zhang , Jing Chen , Andre Alexander Rupp , David Michael Williamson
Abstract: Systems and methods are provided for automatically scoring a response and statistically evaluating whether it can be considered as aberrant. In one embodiment, a constructed response is evaluated via a pre-screening stage and a post-hoc screening stage. The pre-screening stage attempts to determine whether the constructed response is aberrant based on a variety of aberration metrics and criteria. If the constructed response is deemed not to be aberrant, then the post-hoc screening stage attempts to predict a discrepancy between what score an automated scoring system would assign and what score a human rater would assign to the response. If the discrepancy is sufficiently low, then the constructed response may be scored by an automated scoring engine. On the other hand, if the constructed response fails to pass either of the two stages, then a flag may be raised to indicate that additional human review may be needed.
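The two-stage routing logic reads naturally as a short decision function. This is a sketch under stated assumptions: the aberration checks in `pre_screen` (minimum length, one word dominating the response), the thresholds, and the `predict_discrepancy` callable are hypothetical placeholders for the metrics and statistical model the abstract refers to.

```python
def pre_screen(response, min_words=5):
    """Hypothetical aberration checks: too short, or dominated by one repeated word."""
    words = response.lower().split()
    if len(words) < min_words:
        return True
    most_common = max(words.count(w) for w in set(words))
    return most_common / len(words) > 0.5


def route(response, predict_discrepancy, max_discrepancy=0.5):
    """Send a response to automated scoring only if it passes both stages."""
    if pre_screen(response):
        return "human_review"  # stage 1: flagged as aberrant
    if predict_discrepancy(response) > max_discrepancy:
        return "human_review"  # stage 2: predicted human/machine gap too large
    return "automated_scoring"


# predict_discrepancy is supplied by the caller; a constant stands in here.
decision = route("this response has enough distinct words", lambda r: 0.1)
```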
-
7.
Publication No.: US10885274B1
Publication Date: 2021-01-05
Application No.: US16014021
Application Date: 2018-06-21
Applicant: Educational Testing Service
Inventor: Swapna Somasundaran , Michael Flor , Martin Chodorow , Binod Gyawali , Hillary Molloy , Laura McCulla
IPC: G06F40/279 , G06F16/34 , G06F16/31 , G06F40/30
Abstract: Systems and methods are provided for processing a response to essay prompts that request a narrative response. A data structure associated with a narrative essay is accessed. The essay is analyzed to generate an organization subscore, where the organization subscore is generated using a graph metric by identifying content words in each sentence of the essay and populating a data structure with links between related content words in neighboring sentences, wherein the organization subscore is determined based on the links. The essay is analyzed to generate a development subscore, where the development subscore is generated using a transition metric by accessing a transition cue data store and identifying transition words in the essay, wherein the development subscore is based on a number of words in the essay that match words in the transition cue data store. A narrative quality metric is determined based on the organization subscore and the development subscore.
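The two subscores can be approximated in a few lines. In this sketch, "related" content words are simplified to identical words shared by neighboring sentences, the `STOPWORDS` and `TRANSITION_CUES` sets are tiny hypothetical stand-ins for real resources, and the equal-weight combination in `narrative_quality` is an assumption rather than the patent's formula.

```python
import re

# Hypothetical word lists for illustration only.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "was"}
TRANSITION_CUES = {"then", "finally", "suddenly", "later", "first"}


def split_sentences(essay):
    """Split on sentence-ending punctuation and drop empty pieces."""
    return [s for s in re.split(r"[.!?]+", essay) if s.strip()]


def words(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def organization_subscore(essay):
    """Average number of shared content words between neighboring sentences."""
    sents = [set(words(s)) - STOPWORDS for s in split_sentences(essay)]
    if len(sents) < 2:
        return 0.0
    links = sum(len(a & b) for a, b in zip(sents, sents[1:]))
    return links / (len(sents) - 1)


def development_subscore(essay):
    """Number of words in the essay that match the transition cue store."""
    return sum(1 for w in words(essay) if w in TRANSITION_CUES)


def narrative_quality(essay, w_org=1.0, w_dev=1.0):
    """Combine the two subscores with assumed weights."""
    return w_org * organization_subscore(essay) + w_dev * development_subscore(essay)


essay = "The dog ran away. Then the dog came home. Finally the family smiled."
quality = narrative_quality(essay)
```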
-
8.
Publication No.: US10783873B1
Publication Date: 2020-09-22
Application No.: US16221980
Application Date: 2018-12-17
Applicant: Educational Testing Service
Inventor: Yao Qian , Keelan Evanini , Patrick Lange , Robert A. Pugh , Rutuja Ubale
Abstract: Systems and methods for identifying a person's native language are presented. A native language identification system, comprising a plurality of artificial neural networks, such as time delay deep neural networks, is provided. Respective artificial neural networks of the plurality of artificial neural networks are trained as universal background models, using separate native language and non-native language corpora. The artificial neural networks may be used to perform voice activity detection and to extract sufficient statistics from the respective language corpora. The artificial neural networks may use the sufficient statistics to estimate respective T-matrices, which may in turn be used to extract respective i-vectors. The artificial neural networks may use i-vectors to generate a multilayer perceptron model, which may be used to identify a person's native language, based on an utterance by the person in his or her non-native language.
-
9.
Publication No.: US10585985B1
Publication Date: 2020-03-10
Application No.: US15841568
Application Date: 2017-12-14
Applicant: Educational Testing Service
Inventor: Michael Flor , Beata Beigman Klebanov
Abstract: Methods and systems for scoring written text based on use of idiomatic expressions, including reading pre-selected idiomatic expressions in a canonical form into memory, expanding idiomatic expressions from the canonical form, reading a written response into the memory, pre-processing the written response, searching the pre-processed written response for idiomatic expressions, and assigning a score to the written response. The score may be based at least in part on the number of idiomatic expressions in the written response. Corresponding apparatuses, systems, and methods are also disclosed.
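The steps listed in this abstract (load canonical idioms, expand them, pre-process the response, search, count) can be sketched as follows. The `{poss}` slot notation, the two sample idioms, and the possessive list are hypothetical. The sketch matches exact surface forms only, so inflected variants such as "kicked the bucket" are not found.

```python
import re

# Hypothetical canonical idioms; "{poss}" marks a slot for a possessive pronoun.
CANONICAL = ["kick the bucket", "lose {poss} temper"]
POSSESSIVES = ["my", "your", "his", "her", "its", "our", "their"]


def expand(idiom):
    """Expand a canonical form into its concrete surface variants."""
    if "{poss}" in idiom:
        return [idiom.replace("{poss}", p) for p in POSSESSIVES]
    return [idiom]


def preprocess(text):
    """Lowercase and strip punctuation, keeping single spaces between words."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def count_idioms(response, canonical=CANONICAL):
    """Count whole-word occurrences of any expanded idiom in the response."""
    text = preprocess(response)
    variants = [v for idiom in canonical for v in expand(idiom)]
    return sum(len(re.findall(rf"\b{re.escape(v)}\b", text)) for v in variants)


n_idioms = count_idioms("He might lose his temper, or kick the bucket!")
```

The count could then feed a score, since the abstract bases the score at least in part on the number of idiomatic expressions found.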
-
10.
Publication No.: US10515314B2
Publication Date: 2019-12-24
Application No.: US14957769
Application Date: 2015-12-03
Applicant: Educational Testing Service
Inventor: Beata Beigman Klebanov , Michael Flor , Daniel Blanchard
Abstract: Systems and methods are provided for a computer-implemented method for identifying pairs of cohesive words within a text. A supervised model is trained to detect cohesive words within a text to be scored. Training the supervised model includes identifying a plurality of pairs of candidate cohesive words in a training essay and an order associated with the pairs of candidate cohesive words based on an order of words in the training essay. The pairs of candidate cohesive words are filtered to form a set of evaluation pairs. The evaluation pairs are provided via a graphical user interface based on the order associated with the pairs of candidate cohesive words. An indication of cohesion or no cohesion is received for the evaluation pairs via the graphical user interface. The supervised model is trained based on the evaluation pairs and the received indications.
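The candidate-generation and filtering steps can be sketched as below. The distance window, the stopword filter, and the de-duplication rule are assumptions chosen for illustration; the graphical annotation step and the supervised training itself are not shown.

```python
from itertools import combinations

# Hypothetical stopword list used as a filtering criterion.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in"}


def candidate_pairs(words, max_distance=5):
    """Pairs (earlier word, later word) within max_distance, in text order."""
    pairs = []
    for i, j in combinations(range(len(words)), 2):
        if j - i <= max_distance:
            pairs.append((words[i], words[j]))
    return pairs


def filter_pairs(pairs):
    """Drop pairs with stopwords, identical words, or repeats; keep the order."""
    seen, kept = set(), []
    for a, b in pairs:
        if a in STOPWORDS or b in STOPWORDS or a == b or (a, b) in seen:
            continue
        seen.add((a, b))
        kept.append((a, b))
    return kept


words = "the teacher praised the student".split()
evaluation_pairs = filter_pairs(candidate_pairs(words))
```

Each pair in `evaluation_pairs` would then be shown to an annotator in this order, and the cohesion or no-cohesion labels collected would form the training data.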