Measuring Language Complexity Using Word Embeddings

Peter A. Whigham; Mansi Chugh; Grant Dick

doi:10.1007/978-3-030-03991-2_76

Conference proceeding

Peer reviewed

AI 2018: Advances in Artificial Intelligence, pp.843-854

Lecture Notes in Computer Science

2018

DOI: https://doi.org/10.1007/978-3-030-03991-2_76

Handle: https://hdl.handle.net/10523/39718

Keywords: Grammar; Language complexity; Network; Word embedding; Word2vec

Abstract

The analysis of word patterns from a corpus has previously been examined using a number of different word embedding models. These models create a numeric representation of word co-occurrence and are able to capture some of the syntactic and semantic relationships of words in a document. Language complexity has been assessed for many years through simple indexes and basic statistical properties (word frequency, etc.); however, little work has been done on using word embeddings to develop language complexity measures. This paper describes preliminary work on measuring language complexity using clustered word embeddings to produce network transition models. The structural measures of these transition networks are shown to represent basic properties of language complexity and may be used to infer some aspects of the underlying generative grammar.
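
The pipeline outlined in the abstract (embed words, cluster the embeddings, treat clusters as nodes of a transition network, then measure the network's structure) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-D vectors, the use of plain k-means, the function names `kmeans`, `transition_network` and `structural_measures`, and the two measures shown (edge density and mean out-degree) are all assumptions chosen for illustration.

```python
from collections import Counter


def kmeans(points, initial_centers, iterations=10):
    """Plain Lloyd's k-means (an assumed clustering choice); returns one label per point."""
    centers = list(initial_centers)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Move each center to the mean of its members (unchanged if it has none).
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def transition_network(tokens, word_to_cluster):
    """Count directed transitions between the clusters of adjacent tokens."""
    seq = [word_to_cluster[t] for t in tokens]
    return Counter(zip(seq, seq[1:]))


def structural_measures(transitions, k):
    """Two illustrative (assumed) measures: edge density and mean out-degree."""
    edges = set(transitions)
    density = len(edges) / (k * k)      # self-transitions count as possible edges
    mean_out_degree = len(edges) / k    # distinct outgoing edges per cluster node
    return density, mean_out_degree


# Hand-made 2-D "embeddings" standing in for vectors from a model such as Word2vec.
embeddings = {
    "the": (0.0, 0.1), "a": (0.1, 0.0),
    "cat": (1.0, 1.0), "dog": (1.1, 0.9),
    "sees": (2.0, 0.0), "chases": (2.1, 0.1),
}
tokens = "the cat sees a dog the dog chases a cat".split()

words = list(embeddings)
vectors = [embeddings[w] for w in words]
# Deterministic initial centers (hypothetical choice) so the example is repeatable.
initial = [embeddings["the"], embeddings["cat"], embeddings["sees"]]
word_to_cluster = dict(zip(words, kmeans(vectors, initial)))

transitions = transition_network(tokens, word_to_cluster)
density, mean_out_degree = structural_measures(transitions, k=3)
print(word_to_cluster)
print(dict(transitions))
print(density, mean_out_degree)
```

In this toy corpus the three clusters behave like coarse word classes (determiners, nouns, verbs), so the transition counts resemble a small class-level grammar; a corpus with more varied structure would be expected to yield a denser network under these assumed measures.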

Details

Record Identifier: 9926549498101891
Title: Measuring Language Complexity Using Word Embeddings
Creators: Peter A. Whigham; Mansi Chugh; Grant Dick
Contributors: Tanja Mitrovic (Editor); Bing Xue (Editor); Xiaodong Li (Editor)
Publication Details: AI 2018: Advances in Artificial Intelligence, pp.843-854
Academic Unit: Information Science
Publisher: Springer International Publishing
Date published ; e-published: 2018
Language: English
Resource Type ; Subtype: Conference proceeding
