This page contains short descriptions of, and links to, some projects I’ve worked on.
Distilling Morphological Information from Word Embeddings in Korean:
Intrinsic evaluation methods for word embeddings have largely focused on lexical semantics, with common word-analogy tasks being an obvious example. Extrinsic evaluation methods, by contrast, have largely focused on discovering grammatical information within embeddings. This work, done in collaboration with Taeuk Kim, devised a method of intrinsic evaluation for Korean word embeddings that tests for grammatical information at the featural level, leveraging characteristics of Korean morphology to test for phonological, syntactic, semantic, and pragmatic information. We did this by distilling morpheme representations from word embeddings and comparing the geometric properties of these continuous embeddings with gold-standard, linguistically inspired discrete embeddings of the same morphemes. The proceedings paper from PACLIC 33 can be found HERE.
Dependency Parsing of Korean with StAffNet:
In joint work with Karl Stratos, I devised a dynamic neural network architecture to construct word representations from underlying morphemes. The network distinguished between stems (treated as vectors) and affixes (treated as (non-)linear functions), hence the name StAffNet. Using StAffNet-constructed representations, we achieved a new state-of-the-art result on the dependency parsing task for Korean. The proceedings paper from RELNLP 2018 can be found HERE.
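The stem-as-vector, affix-as-function idea can be illustrated with a toy sketch. This is not the architecture from the paper: the dimension, the random weights, the tanh nonlinearity, and the names `make_affix` and `compose_word` are all assumptions made for illustration. It only shows how a word representation might be built by applying a sequence of affix functions to a stem vector, so that the computation depends on the morphemes of each word.

```python
import math
import random

DIM = 4  # toy embedding size (assumption, not from the paper)

def random_vector(rng, dim=DIM):
    """Return a random vector standing in for a learned stem embedding."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def make_affix(rng, dim=DIM, nonlinear=True):
    """Return a function R^dim -> R^dim standing in for a learned affix.

    The function is x -> tanh(Wx + b) when nonlinear, else x -> Wx + b.
    """
    W = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    b = random_vector(rng, dim)

    def apply(x):
        y = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        return [math.tanh(v) for v in y] if nonlinear else y

    return apply

def compose_word(stem, affix_seq, stem_table, affix_table):
    """Look up the stem vector, then apply each affix function in order."""
    vec = stem_table[stem]
    for affix in affix_seq:
        vec = affix_table[affix](vec)
    return vec

rng = random.Random(0)
stems = {"먹": random_vector(rng)}                      # stem "eat"
affixes = {"었": make_affix(rng), "다": make_affix(rng)}  # past, declarative

# 먹었다 ("ate") = stem 먹 + affix 었 + affix 다
word_vec = compose_word("먹", ["었", "다"], stems, affixes)
```

In a trained model the stem vectors and affix parameters would be learned jointly with the downstream parser rather than sampled at random.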
Distributed Morphology over Strings:
Under the assumption that structural dependencies in natural language morphology can be adequately described using no more than regular languages, Marina Ermolaeva and I sought to formalize the Distributed Morphology framework to operate over strings of features, rather than binary trees of features as is often done in the literature. If the morphological module of language has access only to strings of features rather than trees, this provides an immediate explanation for the apparent regularity of morphological phenomena. This is a work in progress, currently on the back burner. An extended abstract from the proceedings of SCiL 2018 can be found HERE.
Linearization at PF and Extraposition in Malagasy:
Linearization at PF:
In this project, Eric Potsdam and I argue that the extraposition of embedded clauses in Malagasy is the result of phonological/prosodic considerations, rather than syntactic ones. The implication of such phonologically motivated movement is that, in addition to overt movement (pre-spell-out) and covert movement (between spell-out and LF), there is a third type of movement between spell-out and PF. This is a continuation of work from the paper below. The proceedings paper from NELS 2016 can be found HERE.
Extraposition in Malagasy:
In this project, also joint work with Eric Potsdam, we survey the unique properties of extraposition in Malagasy and provide a first explanation of mandatory CP extraposition in Malagasy in prosodic terms. The proceedings paper from AFLA 2015 can be found HERE.