Data Skeptic

Recommender systems play an important role in providing personalized content to online users. Yet, typical data mining techniques are not well suited for the unique challenges that recommender systems face. In this episode, host Kyle Polich joins Dr. Joseph Konstan from the University of Minnesota at a live recording at FARCON 2017 in Minneapolis to discuss recommender systems and how machine learning can create better user experiences. 

Direct download: recommender-systems-live-from-farcon.mp3
Category:general -- posted at: 8:00am PDT

Thanks to our sponsor

A Long Short Term Memory (LSTM) is a neural unit, often used in Recurrent Neural Network (RNN) which attempts to provide the network the capacity to store information for longer periods of time. An LSTM unit remembers values for either long or short time periods. The key to this ability is that it uses no activation function within its recurrent components. Thus, the stored value is not iteratively modified and the gradient does not tend to vanish when trained with backpropagation through time.

Direct download: long-short-term-memory.mp3
Category:general -- posted at: 8:00am PDT

Zillow is a leading real estate information and home-related marketplace. We interviewed Andrew Martin, a data science Research Manager at Zillow, to learn more about how Zillow uses data science and big data to make real estate predictions.

Direct download: zillow-zestimate.mp3
Category:general -- posted at: 8:00am PDT

Our guest Pranav Rajpurkar and his coauthored recently published Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks, a paper in which they demonstrate the use of Convolutional Neural Networks which outperform board certified cardiologists in detecting a wide range of heart arrhythmias from ECG data.

Direct download: cardiologist-level-arrhythmia-detection-with-cnns.mp3
Category:general -- posted at: 8:00am PDT

RNNs are a class of deep learning models designed to capture sequential behavior.  An RNN trains a set of weights which depend not just on new input but also on the previous state of the neural network.  This directed cycle allows the training phase to find solutions which rely on the state at a previous time, thus giving the network a form of memory.  RNNs have been used effectively in language analysis, translation, speech recognition, and many other tasks.

Direct download: recurrent-neural-networks.mp3
Category:general -- posted at: 8:00am PDT

Thanks to our sponsor Springboard.

In this week's episode, guest Andre Natal from Mozilla joins our host, Kyle Polich, to discuss a couple exciting new developments in open source speech recognition systems, which include Project Common Voice.

In June 2017, Mozilla launched a new open source project, Common Voice, a novel complementary project to the TensorFlow-based DeepSpeech implementation. DeepSpeech is a deep learning-based voice recognition system that was designed by Baidu, which they describe in greater detail in their research paper. DeepSpeech is a speech-to-text engine, and Mozilla hopes that, in the future, they can use Common Voice data to train their DeepSpeech engine.

Direct download: project-common-voice.mp3
Category:general -- posted at: 8:00am PDT

A Bayesian Belief Network is an acyclic directed graph composed of nodes that represent random variables and edges that imply a conditional dependence between them. It's an intuitive way of encoding your statistical knowledge about a system and is efficient to propagate belief updates throughout the network when new information is added.

Direct download: bayesian-belief-networks.mp3
Category:general -- posted at: 11:58pm PDT

In this episode, Tony Beltramelli of UIzard Technologies joins our host, Kyle Polich, to talk about the ideas behind his latest app that can transform graphic design into functioning code, as well as his previous work on spying with wearables.

Direct download: pix2code.mp3
Category:general -- posted at: 8:00am PDT

In statistics, two random variables might depend on one another (for example, interest rates and new home purchases). We call this conditional dependence. An important related concept exists called conditional independence. This phrase describes situations in which two variables are independent of one another given some other variable.

For example, the probability that a vendor will pay their bill on time could depend on many factors such as the company's market cap. Thus, a statistical analysis would reveal many relationships between observable details about the company and their propensity for paying on time. However, if you know that the company has filed for bankruptcy, then we might assume their chances of paying on time have dropped to near 0, and the result is now independent of all other factors in light of this new information.

We discuss a few real world analogies to this idea in the context of some chance meetings on our recent trip to New York City.

Direct download: conditional-independence.mp3
Category:general -- posted at: 8:00am PDT

Animals can't tell us when they're experiencing pain, so we have to rely on other cues to help treat their discomfort. But it is often difficult to tell how much an animal is suffering. The sheep, for instance, is the most inscrutable of animals. However, scientists have figured out a way to understand sheep facial expressions using artificial intelligence.

On this week's episode, Dr. Marwa Mahmoud from the University of Cambridge joins us to discuss her recent study, "Estimating Sheep Pain Level Using Facial Action Unit Detection." Marwa and her colleague's at Cambridge's Computer Laboratory developed an automated system using machine learning algorithms to detect and assess when a sheep is in pain. We discuss some details of her work, how she became interested in studying sheep facial expression to measure pain, and her future goals for this project.

If you're able to be in Minneapolis, MN on August 23rd or 24th, consider attending Farcon. Get your tickets today via

Direct download: estimating-sheep-pain-with-facial-recognition.mp3
Category:general -- posted at: 8:00am PDT