Fri, 28 October 2016
Platform as a service is a growing trend in data science where services like fraud analysis and face detection can be provided via APIs. Such services turn the actual model into a black box to the consumer. But can the model be reverse engineered?
Florian Tramèr shares his work in this episode showing that it can. The paper Stealing Machine Learning Models via Prediction APIs is definitely worth your time to read if you enjoy this episode. Related source code can be found in https://github.com/ftramer/Steal-ML.
Fri, 21 October 2016
For machine learning models created with the random forest algorithm, there is no obvious diagnostic to inform you which features are more important in the output of the model. Some straightforward but useful techniques exist revolving around removing a feature and measuring the decrease in accuracy or Gini values in the leaves. We broadly discuss these techniques in this episode.
Fri, 14 October 2016
As cities provide bike sharing services, they must also plan for how to redistribute bicycles as they inevitably build up at more popular destination stations. In this episode, Hui Xiong talks about the solution he and his colleagues developed to rebalance bike sharing systems.
Fri, 7 October 2016
Random forest is a popular ensemble learning algorithm which leverages bagging both for sampling and feature selection. In this episode we make an analogy to the process of running a bookstore.