Abstract
The analysis of word patterns from a corpus has previously been examined using a number of different word embedding models. These models create a numeric representation of word co-occurrence and are able to capture some of the syntactic and semantic relationships of words in a document. Assessing language complexity has been considered for many years through the use of simple indexes and basic statistical properties (word frequency, etc.), however little work has been done on using word embeddings to develop language complexity measures. This paper describes preliminary work on measuring language complexity using clustered word embeddings to produce network transition models. The structural measures of these transition networks are shown to represent basic properties of language complexity and may be used to infer some aspects of the underlying generative grammar.