Fri, 29 March 2019
ELMo (Embeddings from Language Models) introduced the idea of deep contextualized word representations. It extends previous ideas like word2vec and GloVe. The ELMo model is a neural network able to map natural language into a vector space. This vector space, out of the box, proved to be incredibly useful in a wide variety of seemingly unrelated NLP tasks like sentiment analysis and named entity recognition.
Fri, 22 March 2019
Bilingual evaluation understudy (or BLEU) is a metric for evaluating the quality of machine translation, using human translations as examples of acceptable-quality results. This metric has become a widely used standard in the research literature. But is it the perfect measure of machine translation quality?
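For listeners who want to see the idea in code, here is a minimal sketch of a sentence-level BLEU score. It assumes whitespace-tokenized input, uniform weights over 1- to 4-grams, and no smoothing. The function names (ngrams, modified_precision, bleu) are made up for this illustration and are not taken from any particular library.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of the candidate against the references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # For each n-gram, the most times it appears in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram only gets credit up to that reference count.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed BLEU: brevity penalty times geometric mean of precisions."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (shorter one on ties).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "the cat is on the mat".split()
print(bleu("the cat is on the mat".split(), [reference]))
print(modified_precision("the the the the the the the".split(), [reference], 1))
```

The second call shows why the counts are clipped: a candidate that just repeats "the" only gets credit for as many copies as the reference contains. Because this sketch does no smoothing, any candidate with no matching 4-gram scores zero, which hints at why BLEU is usually reported over a whole corpus rather than single sentences.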
Fri, 15 March 2019
While at NeurIPS 2018, Kyle chatted with Liang Huang about his work with Baidu Research on simultaneous translation, which was demoed at the conference.
Fri, 8 March 2019
Machine transcription (the process of converting audio recordings of speech to text) has come a long way in recent years. But how do the errors made during machine transcription compare to the errors made by a human transcriber? Find out in this episode!
Fri, 1 March 2019
A sequence-to-sequence (or seq2seq) model is a neural architecture used for translation (and other tasks) which consists of an encoder and a decoder.
The encoder/decoder architecture has obvious promise for machine translation, and has been successfully applied this way. Encoding an input into a small number of hidden nodes, which can then be decoded into a matching string, requires the model to learn an efficient representation of the essence of the strings.
In addition to translation, seq2seq models have been used in a number of other NLP tasks such as summarization and image captioning.
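To make the encoder/decoder idea concrete, here is a toy sketch in plain Python. It is an illustration under heavy assumptions, not a real translation system: the weights are random and untrained, the recurrent cell is a simple tanh unit, tokens are integer ids, and the class and method names (Seq2Seq, encode, decode) are invented for this example. The point is only the shape of the computation: the encoder squeezes an input of any length into a fixed number of hidden values, and the decoder generates output tokens from those values alone.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """Vector of zeros with a single 1.0 at the given index."""
    return [1.0 if j == index else 0.0 for j in range(size)]


class Seq2Seq:
    """Toy encoder/decoder with untrained random weights (illustration only)."""

    def __init__(self, src_vocab, tgt_vocab, hidden=8, seed=0):
        rng = random.Random(seed)
        self.src_vocab, self.tgt_vocab, self.hidden = src_vocab, tgt_vocab, hidden
        self.enc_in = make_matrix(hidden, src_vocab, rng)
        self.enc_rec = make_matrix(hidden, hidden, rng)
        self.dec_in = make_matrix(hidden, tgt_vocab, rng)
        self.dec_rec = make_matrix(hidden, hidden, rng)
        self.dec_out = make_matrix(tgt_vocab, hidden, rng)

    @staticmethod
    def _step(w_in, w_rec, x, h):
        """One recurrent update: new hidden state from input and old state."""
        return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]

    def encode(self, src_ids):
        """Read the whole input; return the final fixed-size hidden state."""
        h = [0.0] * self.hidden
        for token in src_ids:
            h = self._step(self.enc_in, self.enc_rec, one_hot(token, self.src_vocab), h)
        return h

    def decode(self, h, start_id, end_id, max_len=10):
        """Greedily emit tokens, starting from the encoder's hidden state."""
        output, prev = [], start_id
        for _ in range(max_len):
            h = self._step(self.dec_in, self.dec_rec, one_hot(prev, self.tgt_vocab), h)
            scores = matvec(self.dec_out, h)
            prev = max(range(self.tgt_vocab), key=scores.__getitem__)
            if prev == end_id:
                break
            output.append(prev)
        return output


model = Seq2Seq(src_vocab=5, tgt_vocab=6)
state = model.encode([1, 2, 3, 4])  # always 8 numbers, whatever the input length
print(len(state), model.decode(state, start_id=0, end_id=1))
```

Since nothing here is trained, the decoded ids are meaningless. In a real system the weights would be learned from pairs of input and output sequences, and that training is what forces the hidden state to capture the essence of the input.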