Data Skeptic

This episode kicks off our new theme of "Fake News" with guests Robert Sheaffer and Brad Schwartz.

Fake news is a new label for an old idea. For our purposes, we will define fake news as information created to deliberately mislead while masquerading as a legitimate, journalistic source of truth. It has become a modern topic of discussion as our cultures adapt to the fledgling mechanisms of communication introduced by online platforms.

What was the earliest incident of fake news? That's a question for which we may never find a satisfying answer. While not the earliest, we present a dramatization of an early example of fake news, which leads us into a discussion with UFO Skeptic Robert Sheaffer. Following that we get into our main interview with Brad Schwartz, author of Broadcast Hysteria: Orson Welles's War of the Worlds and the Art of Fake News.

Direct download: fake-news.mp3
Category:general -- posted at: 8:00am PDT

We revisit the 2018 Microsoft Build in this episode, focusing on the latest ideas in DevOps. Kyle interviews Cloud Developer Advocates Damian Brady, Paige Bailey, and Donovan Brown to talk about DevOps for data science and for databases.

For a data scientist, what does it even mean to “build”? Packaging and deployment are things that a data scientist doesn't normally have to consider in their day-to-day work. The process of making an AI app is usually divided into two streams of work: data scientists building machine learning models and app developers building the application for end users to consume.

DevOps brings together all the parties involved in getting an application deployed and maintained, and asks each of them to think about the phases that precede and follow their own part of the end solution. So what does DevOps mean for data science? Why should you adopt DevOps best practices?

In the first half, Paige and Damian share their views on what DevOps for data science would look like and how it can be introduced to provide continuous integration, delivery, and deployment of data science models. In the second half, Donovan and Damian talk about the DevOps life cycle of putting a database under version control and carrying out deployments through a release pipeline.

Direct download: devops-for-data-science.mp3
Category:general -- posted at: 1:23pm PDT

Logic is fundamental to mathematical systems. Its roots are the values true and false, and its power is in what its rules allow you to prove. Propositional logic provides its user with variables. This episode gets into First Order Logic, an extension to propositional logic.
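As a rough illustration of the difference, the sketch below contrasts a propositional statement with first-order statements that use predicates and quantifiers. The finite domain and the predicates `is_even` and `is_positive` are hypothetical choices made for this example, not something taken from the episode.

```python
# Propositional logic: variables are plain truth values.
p, q = True, False
implication = (not p) or q  # "p implies q"

# First-order logic adds predicates over objects, plus quantifiers.
# The domain and predicates here are hypothetical, chosen for illustration.
domain = [1, 2, 3, 4]


def is_even(x):
    return x % 2 == 0


def is_positive(x):
    return x > 0


# "For all x, Positive(x)" evaluated over the finite domain.
forall_positive = all(is_positive(x) for x in domain)

# "There exists x such that Even(x)".
exists_even = any(is_even(x) for x in domain)

# "For all x, Even(x)" fails because 1 and 3 are in the domain.
forall_even = all(is_even(x) for x in domain)
```

Checking quantifiers with `all` and `any` only works because the domain is finite; first-order logic in general also covers infinite domains, where truth cannot be decided by enumeration.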

Direct download: first-order-logic.mp3
Category:general -- posted at: 8:00am PDT

An intelligent agent trained in a simulated environment may be prone to making mistakes in the real world due to discrepancies between the training and real-world conditions. The areas where an agent makes mistakes, known as "blind spots," are hard to find and can stem from various causes. In this week’s episode, Kyle is joined by Ramya Ramakrishnan, a PhD candidate at MIT, to discuss the idea of “blind spots” in reinforcement learning and approaches to discover them.

Direct download: blind-spots-in-reinforcement-learning.mp3
Category:data science -- posted at: 8:00am PDT

In this week’s episode, our host Kyle interviews Gokula Krishnan from ETH Zurich about his recent contributions to defenses against adversarial attacks. The discussion centers around his latest paper, titled “Defending Against Adversarial Attacks by Leveraging an Entire GAN,” and his proposed algorithm, aptly named ‘Cowboy.’

Direct download: defending-against-adversarial-attacks.mp3
Category:general -- posted at: 8:00am PDT

On a long car ride, Linhda and Kyle record a short episode. This discussion is about transfer learning, a technique used in machine learning to leverage training from one domain to get a head start on learning in another domain.

Transfer learning has some obvious appealing features. Take the example of an image recognition problem. There are now many widely available models that do general image recognition. Detecting that an image contains a "sofa" is an impressive feat. However, for a furniture company interested in more specific details, this classifier is absurdly general. Should the furniture company build a massive corpus of tagged photos, effectively starting from scratch? Or is there a way they can transfer the learnings from the general task to the specific one?

A general definition of transfer learning in machine learning is the practice of taking some or all aspects of a pre-trained model as the basis for training a new model with a specific and potentially limited dataset.
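A minimal sketch of that definition, using only the Python standard library: a frozen layer stands in for the pre-trained model, and only a small new classifier on top of it is trained on a handful of labeled examples. The weights in `PRETRAINED_W`, the four data points, and all function names are hypothetical and invented for illustration; a real project would load an actual pre-trained network instead.

```python
import math

# Hypothetical stand-in for a pre-trained model: these weights are
# treated as already learned on a large general task and stay frozen.
PRETRAINED_W = [[1.0, -1.0], [0.5, 0.5]]


def frozen_features(x):
    """Apply the frozen layer with a ReLU; never updated during training."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_W]


def train_head(data, epochs=200, lr=0.5):
    """Train only a new logistic-regression head on the frozen features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = frozen_features(x)
            score = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-score))
            g = p - y  # gradient of the log loss with respect to the score
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(x, w, b):
    """Classify x as 1 or 0 using the frozen features and the trained head."""
    f = frozen_features(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


# A small, task-specific dataset (hypothetical): label 1 when x[0] > x[1].
small_dataset = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 2.0], 0), ([1.0, 3.0], 0)]
head_w, head_b = train_head(small_dataset)
```

Because the frozen layer already produces useful features, four examples are enough to fit the new head here. Calling `predict([2.5, 0.5], head_w, head_b)` then classifies an unseen point with the combined model.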

Direct download: transfer-learning.mp3
Category:general -- posted at: 8:00am PDT

Medical imaging is a highly effective tool used by clinicians to diagnose a wide array of diseases and injuries. However, it often requires exceptionally trained specialists such as radiologists to interpret accurately. In this episode of Data Skeptic, our host Kyle Polich is joined by Gabriel Maicas, a PhD candidate at the University of Adelaide, to discuss machine learning systems that can be used by radiologists to improve their accuracy and speed of diagnosis.

Direct download: medical-imaging-training-techniques.mp3
Category:data science -- posted at: 7:00am PDT

Thanks to our sponsor Galvanize

A Kalman filter is a technique for taking a sequence of observations about an object or variable and determining the most likely current state of that object. In this episode, we discuss it in the context of tracking our lilac-crowned amazon parrot Yoshi.

Kalman filters have many applications but the one of particular interest under our current theme of artificial intelligence is to efficiently update one's beliefs in light of new information.

The Kalman filter is based upon the Gaussian distribution. This distribution is described by two parameters: \mu (the mean) and \sigma (the standard deviation). The procedure for updating these values in light of new information has a closed form. This means that it can be described with straightforward formulae and computed very efficiently.
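To make the closed-form update concrete, here is a minimal sketch of the measurement step for a single one-dimensional variable. It assumes the prior belief and the measurement noise are both Gaussian, and it leaves out the motion (prediction) step of a full Kalman filter. The function name `kalman_update` and the example numbers are hypothetical.

```python
def kalman_update(mu, sigma, z, sigma_z):
    """Fuse a Gaussian prior N(mu, sigma^2) with a measurement z
    whose noise has standard deviation sigma_z.

    Returns the posterior mean and standard deviation.
    """
    var, var_z = sigma ** 2, sigma_z ** 2
    gain = var / (var + var_z)      # Kalman gain: how much to trust z
    new_mu = mu + gain * (z - mu)   # move the mean toward the measurement
    new_var = (1.0 - gain) * var    # uncertainty shrinks after the update
    return new_mu, new_var ** 0.5


# Hypothetical example: prior belief about a position is N(0, 1),
# and a sensor with noise standard deviation 1 reports 2.0.
posterior_mu, posterior_sigma = kalman_update(0.0, 1.0, 2.0, 1.0)
```

With equal prior and measurement uncertainty, the posterior mean lands halfway between the two at 1.0, and the posterior standard deviation drops to about 0.707. Each update takes only a few arithmetic operations.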

You may gain a greater appreciation for Kalman filters by considering what would happen if you could not rely on the Gaussian distribution to describe your posterior beliefs. If the probability distribution over the variables describing some object cannot be computed efficiently, then keeping your posterior beliefs up to date can be a significant challenge.

Kyle will be giving a talk at Skeptical 2018 in Berkeley, CA on June 10.

Direct download: kalman-filters.mp3
Category:general -- posted at: 12:47am PDT

There's so much to discuss on the AI side, it's hard to know where to begin. Luckily, Steve Guggenheimer, Microsoft’s corporate vice president of AI Business, and Carlos Pessoa, a software engineering manager for the company’s Cloud AI Platform, talked to Kyle about announcements related to AI in industry.

Direct download: ms-ai.mp3
Category:data science -- posted at: 8:00am PDT

Today's interview is with the authors of the textbook Artificial Intelligence and Games.

Direct download: ai-in-games-master.mp3
Category:general -- posted at: 6:00am PDT