Transformers: Attention in Disguise

We discuss the Transformer, a purely attention-based architecture that achieves better performance than recurrent network-based models while being more efficient and more parallelizable.

Deep Contextualized Word Representations with ELMo

I describe ELMo, a recently released set of neural word representations that is pushing the state of the art in pretraining methodology for natural language processing.

Fundamental Deep Learning Algorithms To Learn

A discussion of the fundamental deep learning algorithms that people new to the field should learn, along with a recommended course of study.

Why All The Excitement About Artificial Intelligence

Given all the recent buzz around artificial intelligence, I discuss three reasons why we are seeing such widespread interest in the field today.

Being a Good Machine Learning Engineer/Data Scientist

A discussion of the skills most essential to being an effective machine learning engineer or data scientist.

Review of SIGDial/SemDial 2017

Following my attendance at the 18th Annual Meeting on Discourse and Dialogue, I summarize the most promising directions for future dialogue research, as gleaned from discussions with other researchers at the conference.