Data Skeptic

ELMo (Embeddings from Language Models) introduced the idea of deep contextualized word representations, extending earlier word embedding approaches such as word2vec and GloVe. The ELMo model is a neural network that maps natural language into a vector space. Out of the box, this vector space proved useful in a wide variety of seemingly unrelated NLP tasks, such as sentiment analysis and named entity recognition.

Direct download: elmo.mp3
Category:general -- posted at: 8:00am PDT

Bilingual evaluation understudy (BLEU) is a metric for evaluating the quality of machine translation by comparing its output against human translations, which serve as examples of acceptable quality. This metric has become a widely used standard in the research literature. But is it a perfect measure of machine translation quality?
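To make the idea concrete, here is a minimal sketch of a sentence-level BLEU score in Python. It is an illustration under stated assumptions, not a reference implementation: the function names (`ngrams`, `modified_precision`, `bleu`) are made up for this example, inputs are assumed to be already tokenized into lists of words, and no smoothing is applied, so a candidate that misses every n-gram of some order scores 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Geometric mean of the 1..max_n modified precisions, multiplied by a
    brevity penalty. Unsmoothed (an assumption of this sketch)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Use the reference length closest to the candidate length (ties: shorter).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


reference = "the cat is on the mat".split()
print(bleu("the cat is on the mat".split(), [reference]))
print(modified_precision("the the the the the the the".split(), [reference], 1))
```

The second call shows why the counts are clipped: a candidate that repeats "the" seven times matches a reference word on every token, yet it is only credited for the two occurrences of "the" in the reference. The score still rewards only surface n-gram overlap, which is one reason to ask whether it is a perfect measure.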

Direct download: bleu.mp3
Category:general -- posted at: 9:16pm PDT

While at NeurIPS 2018, Kyle chatted with Liang Huang about his work with Baidu Research on simultaneous translation, which was demoed at the conference.

Direct download: simultaneous-translation.mp3
Category:general -- posted at: 8:00am PDT

Machine transcription (the process of converting audio recordings of speech to text) has come a long way in recent years. But how do the errors made during machine transcription compare to the errors made by a human transcriber? Find out in this episode!

Direct download: human-vs-machine-transcription-errors.mp3
Category:general -- posted at: 8:00am PDT

A sequence-to-sequence (or seq2seq) model is a neural architecture used for translation (and other tasks) which consists of an encoder and a decoder.

The encoder/decoder architecture has obvious promise for machine translation, and has been successfully applied this way. Encoding an input into a small number of hidden nodes that can then be decoded into a matching string requires the model to learn an efficient representation of the essence of the strings.

In addition to translation, seq2seq models have been used in a number of other NLP tasks such as summarization and image captioning.
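The encoder/decoder data flow described above can be sketched in a few lines of plain Python. This is a toy illustration only: the weights are random and untrained, so the output tokens are meaningless, and every name here (`init_params`, `encode`, `decode`, the start and end token ids 0 and 1) is an assumption made for the example. A real seq2seq model would use a deep learning library, trained parameters, and typically LSTM, GRU, or attention layers.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix as a list of rows (untrained, illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_step(w_in, w_hidden, token_id, hidden):
    """One recurrent step: tanh(w_in column for the token + w_hidden @ hidden)."""
    size = len(hidden)
    return [
        math.tanh(w_in[i][token_id]
                  + sum(w_hidden[i][j] * hidden[j] for j in range(size)))
        for i in range(size)
    ]


def init_params(src_vocab, tgt_vocab, hidden_size, seed=0):
    rng = random.Random(seed)
    return {
        "hidden_size": hidden_size,
        "sos": 0,  # hypothetical start-of-sequence id in the target vocabulary
        "eos": 1,  # hypothetical end-of-sequence id in the target vocabulary
        "enc_in": make_matrix(hidden_size, src_vocab, rng),
        "enc_hidden": make_matrix(hidden_size, hidden_size, rng),
        "dec_in": make_matrix(hidden_size, tgt_vocab, rng),
        "dec_hidden": make_matrix(hidden_size, hidden_size, rng),
        "dec_out": make_matrix(tgt_vocab, hidden_size, rng),
    }


def encode(params, source_ids):
    """Compress a source sequence of any length into one fixed-size vector."""
    hidden = [0.0] * params["hidden_size"]
    for token_id in source_ids:
        hidden = rnn_step(params["enc_in"], params["enc_hidden"], token_id, hidden)
    return hidden


def decode(params, context, max_len=10):
    """Greedily emit target tokens, starting from the encoder's final state."""
    hidden = context
    token_id = params["sos"]
    output = []
    for _ in range(max_len):
        hidden = rnn_step(params["dec_in"], params["dec_hidden"], token_id, hidden)
        scores = [sum(w * h for w, h in zip(row, hidden))
                  for row in params["dec_out"]]
        token_id = max(range(len(scores)), key=scores.__getitem__)
        if token_id == params["eos"]:
            break
        output.append(token_id)
    return output


params = init_params(src_vocab=20, tgt_vocab=15, hidden_size=8)
context = encode(params, [3, 7, 7, 12, 5])
print(len(context), decode(params, context))
```

The point of the sketch is the bottleneck: whatever the length of the source sequence, the decoder sees only the fixed-size `context` vector, so training has to push the essence of the input into those few hidden values.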

Direct download: seq2seq.mp3
Category:general -- posted at: 8:00am PDT
