Machine Learning Algorithms In Layman’s Terms, Part 1

(i.e. how to explain machine learning algorithms to your grandma) As a recent graduate of the Flatiron School’s Data Science Bootcamp, I’ve been inundated with advice on how to ace technical interviews. A soft skill that keeps coming to the forefront is the ability to explain complex machine learning algorithms to a non-technical person. This …

Finding Lane Lines — Simple Pipeline For Lane Detection.

Identifying the lanes of the road is a very common task that a human driver performs. It is important for keeping the vehicle within the constraints of the lane, and it is also a critical task for an autonomous vehicle. A very simple lane detection pipeline is possible with basic computer vision techniques. This article will describe …

All the Steps to Build your first Image Classifier (with code)

Now that you know the basics of convolution, we can start building one! Preparing the data: This part is useful only if you want to use your own data, or data that can’t be found on the web easily, to build a convolutional neural network that may be better suited to your needs. Otherwise, here is the …

Set Your Jupyter Notebook up Right with this Extension

Solution: The Setup Jupyter Notebook Extension. Rather than just complaining about the problem (it’s easy to be a critic but a lot harder to do something positive), I decided to see what could be done with Jupyter Notebook extensions. The result is an extension that, on opening a new notebook, automatically: Creates a template to …

An architecture for real-time scoring with R

Let’s say you’ve developed a predictive model in R, and you want to embed predictions (scores) from that model into another application (like a mobile or Web app, or some automated service). If you expect a heavy load of requests, R running on a single server isn’t going to cut it: you’ll need some kind …

R Journal publication

The R Journal is the open access, refereed journal of the R project for statistical computing. It features short to medium length articles covering topics that should be of interest to users or developers of R. Christoph Weiss, Gernot Roetzer and I have joined forces to write an R package and the accompanying paper: Forecast …

Bayesian state space modelling of the Australian 2019 election by @ellis2013nz

So I’ve been back in Australia for five months now. While things have been very busy in my new role at Nous Group, it’s not so busy that I’ve failed to notice there’s a Federal election due some time by November this year. I’m keen to apply some of the techniques I used in New …

Weekly Selection — Mar 1, 2019

A brief introduction to Markov chains, by Joseph Rocca (19 min read). In 1998, Lawrence Page, Sergey Brin, Rajeev Motwani and Terry Winograd published “The PageRank Citation Ranking: Bringing Order to the Web”, an article in which they introduced the now famous PageRank algorithm at the origin of Google.

What is logistic in the logistic regression?

Suppose that you are interviewed for a data scientist role. You are asked about logistic regression, and you answer all sorts of questions: how to run it in Python, how you would perform feature selection, and how you would use it for prediction. For the last question you answer that if you have the estimated of the regression …

How I created over 100,000 labeled LEGO training images

Some of the 300,000+ images I captured while leaving the machine running for a few days. If you are a hobbyist or researcher working on an AI project, it’s quite likely that you’ve run into the unfortunate situation of having to generate a large amount of labeled training data. Of course, having spent all your funding …

How does Apache Spark run on a cluster?

Whether you are a Data Engineer or a Data Scientist, getting up and running with Apache Spark is a relatively easy process from a development perspective. It does require a slight shift in thinking, and understanding how Spark executes code and how it functions on our clusters is an important part of being efficient …