Illinois Institute of Technology
Adjunct Professor  Computer Science
Blue Cross and Blue Shield of Illinois, Montana, New Mexico, Oklahoma & Texas
Senior Director of Data Science  Enterprise Analytics CoE
American Family Insurance Aug 2016 - Mar 2019
Data Science Manager  Identification and Adoption of Machine Learning and Data Science Methods
Civis Analytics Apr 2014 - Jul 2016
Lead Data Scientist  Complete Ownership of Unstructured Data Strategy, Processes and Implementation
Educational Testing Service (ETS) Feb 2003 - Apr 2014
Director, NLP and Speech Research  Group Leader  Research Scientist
Education:
University of Chicago 1996 - 2002
Doctorates, Doctor of Philosophy, Linguistics
University of Chicago 1992 - 1996
Bachelors, Bachelor of Arts, Linguistics
Freie Universität Berlin 1995 - 1995
Missouri State University
Skills:
Statistics, Computational Linguistics, Natural Language Processing, Machine Learning, Research, Data Mining, Artificial Intelligence, Text Mining, Linguistics, Information Retrieval, Computer Science, Python, Semantics, Analysis, LaTeX, Science, Educational Technology, Higher Education, Data Visualization, Data Analysis, Teaching
Jill Burstein - Princeton NJ, US Derrick Higgins - Highland Park NJ, US Claudia Gentile - Ewing NJ, US Daniel Marcu - Hermosa Beach CA, US
Assignee:
Educational Testing Service - Princeton NJ
International Classification:
G06F 17/27
US Classification:
704 9, 704 1, 704 10
Abstract:
A method and system for determining text coherence in an essay are disclosed. A method of evaluating the coherence of an essay includes receiving an essay having one or more discourse elements and text segments. The one or more discourse elements are annotated either manually or automatically. A text segment vector is generated for each text segment in a discourse element using sparse random indexing vectors. The method or system then identifies one or more essay dimensions and measures the semantic similarity of each text segment based on the essay dimensions. Finally, a coherence level is assigned to the essay based on the measured semantic similarities.
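As a rough illustration of the vector step in this abstract, the following is a minimal sketch of random indexing, not the patented implementation. The dimensionality, the number of non-zero entries, the context window, and all function names are assumptions made for the example. Each word gets a sparse index vector of +1/-1 entries, a word's context vector sums the index vectors of its neighbours in a training corpus, and a text segment vector sums the context vectors of its words. Segments can then be compared by cosine similarity.

```python
import math
import random
import re
import zlib
from collections import defaultdict

DIM = 512      # vector dimensionality (assumed value)
NONZERO = 8    # +1/-1 entries per sparse index vector (assumed value)

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def index_vector(word):
    """Sparse ternary index vector as {position: +1 or -1}.

    Seeding from a CRC of the word keeps the vector stable across runs.
    """
    rng = random.Random(zlib.crc32(word.encode("utf-8")))
    positions = rng.sample(range(DIM), NONZERO)
    return {p: rng.choice((-1, 1)) for p in positions}

def train_context_vectors(corpus_sentences, window=2):
    """Sum the index vectors of each word's neighbours within the window."""
    context = defaultdict(lambda: [0.0] * DIM)
    for sentence in corpus_sentences:
        tokens = tokenize(sentence)
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    for pos, sign in index_vector(tokens[j]).items():
                        context[word][pos] += sign
    return context

def segment_vector(segment, context):
    """Text segment vector: sum of context vectors of its known words."""
    vec = [0.0] * DIM
    for word in tokenize(segment):
        if word in context:
            for k, value in enumerate(context[word]):
                vec[k] += value
    return vec

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

In use, one would train context vectors on a reference corpus, build a vector for each text segment, and compare segments (for example, each sentence against the thesis statement) with `cosine`. How those similarities map to a coherence level is not specified in the abstract and is left out here.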
Method And System For Assessing Pronunciation Difficulties Of Non-Native Speakers By Entropy Calculation
The present disclosure presents a useful metric for assessing the relative difficulty which non-native speakers face in pronouncing a given utterance and a method and systems for using such a metric in the evaluation and assessment of the utterances of non-native speakers. In an embodiment, the metric may be based on both known sources of difficulty for language learners and a corpus-based measure of cross-language sound differences. The method may be applied to speakers who primarily speak a first language speaking utterances in any non-native second language.
Method And System For Text Retrieval For Computer-Assisted Item Creation
A tool, method, and system for use in the development of sentence-based test items are disclosed. The tool may include a user interface that may include a database selection field, a sentence pattern entry field, an option pane, and an output pane. The tool may search a database for one or more sentences and may generate one or more responses to the one or more sentences. The one or more sentences and one or more responses may be used to produce the sentence-based test items. The tool may allow test items to be developed more quickly and easily than manual test item authoring. Accordingly, test item development costs may be lowered and test security may be enhanced.
Paul Deane - Lawrenceville NJ, US Derrick Higgins - Highland Park NJ, US
International Classification:
G01S001/00 G09B017/00
US Classification:
434/178000, 434/322000, 434/004000
Abstract:
A method and system for using a natural language generator for automatic assessment item generation are disclosed. The natural language generator includes a document structure generator that produces an abstract document specification defining a structure for an assessment item based on user input. The abstract document specification is input into a logical schema generator, which produces a logical schema specification that creates a more detailed specification for an assessment item. Finally, a sentence generator receives the logical schema specification and creates natural language for the assessment item based on the variables defined in the logical schema specification.
Method And System For Detecting Off-Topic Essays Without Topic-Specific Training
Derrick Higgins - Highland Park NJ, US Jill Burstein - Princeton NJ, US
International Classification:
G09B 7/00
US Classification:
434/362000
Abstract:
Methods and systems for detecting off-topic essays are described that do not require training using human-scored essays. The methods can detect different types of off-topic essays, such as unexpected topic essays and bad faith essays. Unexpected topic essays are essays that address an incorrect topic. Bad faith essays address no topic. The methods can use content vector analysis to determine the similarity between the essay and one or more prompts. If the essay prompt with which an essay is associated is among the most similar to the essay, the essay is on-topic. Otherwise, the essay is considered to be an unexpected topic essay. Similarly, if the essay is sufficiently dissimilar to all essay prompts, the essay is considered to be a bad faith essay.
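The decision rule in this abstract can be sketched as follows. This is a minimal illustration rather than the patented method: it uses raw term-frequency vectors with cosine similarity as a simple stand-in for content vector analysis, and the function names, the `top_k` cutoff, and the `min_similarity` threshold are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector over lowercased alphabetic tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def classify_essay(essay, assigned_prompt_id, prompts,
                   top_k=1, min_similarity=0.05):
    """Label an essay without any human-scored training essays.

    prompts maps prompt id -> prompt text and must be non-empty.
    top_k and min_similarity are illustrative values, not from the source.
    """
    essay_vec = tf_vector(essay)
    sims = {pid: cosine(essay_vec, tf_vector(text))
            for pid, text in prompts.items()}
    # Sufficiently dissimilar to every prompt: bad faith essay.
    if max(sims.values()) < min_similarity:
        return "bad_faith"
    # On-topic if the assigned prompt is among the most similar prompts.
    ranked = sorted(sims, key=sims.get, reverse=True)
    if assigned_prompt_id in ranked[:top_k]:
        return "on_topic"
    return "unexpected_topic"
```

For example, with one prompt about school uniforms and another about ocean pollution, an essay about uniforms submitted under the pollution prompt would be labeled `unexpected_topic`, while a string of unrelated tokens would be labeled `bad_faith`.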