Simulating the Six Nations 2019 Rugby Tournament in R: Final Round Update

In an earlier post I blogged how I had made a Monte Carlo simulation model of the Six Nations Rugby Tournament. With the final round of the tournament approaching this Saturday, I decided to do a quick update. Who can win at this stage? Wales, England, or Ireland can still win. Scotland, France and Italy do not have …

Paper review: The power of choice in data-aware cluster scheduling

In this post, we’ll cover KMN, a scheduler that aims to solve the problem of scheduling I/O-intensive tasks in distributed computing frameworks like Spark or MapReduce. This scheduler differs from the ones we discussed previously in that it emphasizes data-aware scheduling, which we’ll cover in this post. Background In today’s batch computing frameworks like …

Introducing Students to Automation & Robotics

Photo by Andy Kelly on Unsplash This field doesn’t have to be that scary Automation has quickly become the talk of the town. Politicians are now running campaigns centering around it. Entire companies are built off of artificial intelligence (AI) and robotic algorithms. Related technology has even begun to seep into the domestic sphere. In a decade …

Visually explore Probability Distributions with vistributions

We are happy to introduce the vistributions package, a set of tools for visually exploring probability distributions. Installation # Install release version from CRAN install.packages("vistributions") # Install development version from GitHub # install.packages("devtools") devtools::install_github("rsquaredacademy/vistributions") Shiny App vistributions includes a Shiny app which can be launched using vdist_launch_app(); you can also try the live version here. Read …

Zotero hacks: unlimited synced storage and its smooth use with rmarkdown

Here is a slightly refreshed translation of my 2015 blog post, initially published on the Russian blog platform habr.com. The post shows how to organize a personal academic library of unlimited size for free. This is a funny case of a self-written manual which I came back to multiple times myself and many many more …

Plotly Experiments — Column and Line Plots

Bar plots In my previous post, I explained how to create scatterplots using Plotly with examples from the King County housing dataset. Another popular type of plot is the column plot or bar plot. Unlike a scatterplot, which is used to compare two numerical variables against each other and examine relationships, bar / column …

Logistic Regression as a Nonlinear Classifier

Logistic Regression has traditionally been used as a linear classifier, i.e. when the classes can be separated in the feature space by linear boundaries. That can be remedied, however, if we happen to have a better idea as to the shape of the decision boundary… Logistic regression is known and used as a linear classifier. …

Unit Tests in R

I am collecting here some notes on testing in R. There seems to be a general (false) impression among non-R-core developers that to run tests, R package developers need a test management system such as RUnit or testthat. And a further false impression that testthat is the only R test management system. This is …

What AI product managers are not getting right yet

Natural Intelligence teaches Artificial Intelligence the #1 method to succeed in the field my son Gabriel Versace about to attack a robot… I am writing this post while watching my fourth child, newborn Leonardo Versace, take a nap on his fifth day of life. As with all babies I have heard about or watched in action, Leonardo …

Plotly Experiments — Scatterplots

Let me start this post with a somewhat unpopular opinion: Data visualization in Python is an absolute mess. Unlike R, where ggplot pretty much rules the roost when it comes to graphing, Python has too many options to choose from. This is best summarized by this picture: Courtesy: Jake VanderPlas (@jakevdp on Twitter) Undoubtedly, matplotlib is …

The First Question I Have For Every Data Request

And how I use it to build partnerships with cross-functional teams This is about building a data-driven culture at a company from the ground up, in this case, by building partnerships across teams. It lives in the same context as my earlier post on succeeding as a data scientist in startups. There are very few …

Person Search

What is Person Search Person search is a task that aims to find the corresponding instances in gallery images without bounding box annotations. The associated data is similar to that in person re-identification. The key difference is that the bounding box is unavailable in this task. Actually it can be seen as a …

Clean a complex dataset for modelling with recommendation algorithms

Overview Recently I wanted to learn something new and challenged myself to carry out an end-to-end Market Basket Analysis. To continue to challenge myself, I’ve decided to put the results of my efforts before the eyes of the data science community. And what better forum for my first ever series of posts than one of …

Speed up predictions on low-power devices using Neural Compute Stick and OpenVINO

Getting started OpenVINO on the main machine As the first step, you need to install the OpenVINO Toolkit on the main machine. Download and install OpenVINO First, go to the download page. Then, register and choose the Ubuntu version of OpenVINO, selecting the Full Package button. After that, download the file, placing it inside Downloads. OpenVINO Download …

Computer Vision for Beginners: Part 1

Computer Vision is one of the hottest topics in artificial intelligence. It is making tremendous advances in self-driving cars, robotics as well as in various photo correction apps. Steady progress in object detection is being made every day. GANs are also something researchers have their eyes on these days. Vision is showing us …

Machine Learning Algorithms from the Easy Side (Part 2)

In the first part, we explained what ML is and covered some machine learning algorithms. In this new part we will go deeper and look at further methods that can help us teach the machine to perform a task. SVM (Support Vector Machine) Now we look at another method to think about. In the example we …

Machine Learning Algorithms from the Easy Side (Part 1)

Machine Learning (ML) is generally used for many different things, such as fraud detection, image recognition, voice recognition, video recommendations, self-driving cars, spam detection, teaching a computer how to cook or how to play chess and much more. There are many benefits of ML, one of which is that it is not difficult to …

Generating Synthetic Classification Data using Scikit

Generating Synthetic Data Why do we need Data Generators? Data generators help us create data with different distributions and profiles to experiment on. If you are testing various algorithms available to you and you want to find which one works in which cases, then these data generators can help you generate case-specific data and …

Unpacking immigration collocations

As part of our road to detecting metaphors we got stuck on a simple problem: compound nouns. If you take the sentence: series of immigration policy changes Series modifies changes in reference to immigration policy, which is a compound noun. “Series of changes” is not what we would consider metaphorical usage, but our detector would label …

An Interesting Subtlety of Statistics: The Hot Hand Fallacy Fallacy

Last week I stumbled across a very interesting recent Econometrica article by Joshua Miller and Adam Sanjurjo. I was really surprised by the statistical result they discovered and guess the issue may even have fooled Nobel Prize-winning behavioral economists. Before showing the statistical subtlety, let me briefly explain the Hot Hand Fallacy. Consider a …

Running Mobilenet on STM32 MCUs at the edge

By: Alessandro Capotondi & Manuele Rusci, Università di Bologna AI-powered and energy efficient edge devices Internet-of-Things is now a reality. Today, major vendors are offering dozens of cloud services through their online platforms to process and store personal data. On the other side of the spectrum, technology for edge devices still presents several limitations. For instance, …

“X affects Y”. What does that even mean?

In my last post I gave an intuitive demonstration of what causal inference is and how it differs from classic ML. After receiving some feedback I realized that while the post was easy to digest, some confusion remains.

World population growth through time

A few months ago I made an attempt to visualize the world population changes from 1800 to 2100: Inspired by @MaxCRoser and @jkottke, I’ve tried to visualize the world population changes from 1800 to 2100. My new blog post at https://t.co/XpBpkZLO9s describes how this animation was made using #rstats and #OpenData. pic.twitter.com/WI3gj0xUwU — Jakub …

RStudio Package Manager 1.0.6 – README

The 1.0.6 release of RStudio Package Manager helps R users understand packages. The primary feature in this release is embedded package READMEs, detailed below. If you’re new to Package Manager, it is an on-premise product built to give teams and organizations reliable and consistent package management. Download an evaluation today. View package READMEs in Package Manager Package READMEs …

Python Cheatsheet

The following gives an overview of the most basic concepts found in Python. It serves as a quick reminder when you have not coded in Python for a while. Source: Download text file or Fork me on GitHub Main if __name__ == '__main__': main() List <list> = <list>[from_inclusive : to_exclusive : step_size] <list>.append(<el>) <list>.extend(<collection>) <list> += …

R 3.5.3 now available

The R Core Team announced yesterday the release of R 3.5.3, and updated binaries for Windows and Linux are now available (with Mac sure to follow soon). This update fixes three minor bugs (to the functions writeLines, setClassUnion, and stopifnot), but you might want to upgrade just to avoid the “package built under R 3.5.3” …

RISE OF THE MACHINE EMPATHS

It’s not just that millennials have astonishing facility with computers, taking to every cell phone and software or video game release like birds to the sky. And it is not only that they seem more mechanically adept than other generations, with fine motor skills far beyond those of older people (with their ham-handed, clunky attempts …

Exploring Univariate Data

Wikipedia states that “univariate analysis is perhaps the simplest form of statistical analysis. . . The key fact is that only one variable is involved.” Because univariate analysis is so simple, it’s a good place to start in an exploratory analysis. Some questions to consider when getting started can include: How many variables do I have? Do …

10 Ways Machine Learning will Affect your life.

The Rise of Machine Learning: How Our Daily Lives Are Revolutionizing? Back in the day, machine experiences were a drag. Hit a button, pull a lever, and get the task done. Decades later, with subsequent computing innovation, machines have transformed into their ultra-smart, self-learning, automated versions that are sweeping the human landscape. The underlying technology …

Installing Socviz

I’ve gotten a couple of reports from people having trouble installing the socviz library that’s meant to be used with Data Visualization: A Practical Introduction. As best as I can tell, the difficulties are being caused by GitHub’s rate limits. The symptom is that, after installing the tidyverse and devtools libraries, you try install_github("kjhealy/socviz") and …

Uber/Lyft Maximization: More Money for The Time

Motivation Uber and Lyft, the main ridesharing companies, can make more money at a faster rate by filling their cars with passengers at peak times when they are on the road. Typical Uber/Lyft drivers have full-time jobs, are full-time students, or are between jobs. Being an Uber/Lyft driver to make …

10 Free tools to get started with Data Visualisation-Easily & Instantly.

2. RAWGraphs RAWGraphs is an open web tool to create custom vector-based visualizations on top of the amazing d3.js library. Primarily conceived as a tool for designers and vis geeks, RAWGraphs aims at providing a missing link between spreadsheet applications (e.g. Microsoft Excel, Apple Numbers, Google Docs) and vector graphics editors (e.g. Adobe Illustrator, Inkscape …

Sigmoid Neuron Learning Algorithm Explained With Math

This is the 2nd part in the series of posts discussing the workings of the sigmoid neuron and its learning algorithm: 1 | Sigmoid Neuron — Building Block of Deep Neural Networks 2 | Sigmoid Neuron Learning Algorithm Explained With Math (current story) In this post, we will discuss the mathematical intuition behind the sigmoid neuron learning algorithm …

Guide to Coding a Custom Convolutional Neural Network in TensorFlow Core

Tutorial for Developing in the Low-Level API The following demonstrates how to use the low-level TensorFlow Core to create Convolutional Neural Network (ConvNet) models without high-level APIs such as Keras. The goal of this tutorial is to provide a better understanding of the background processes in a deep neural network and to demonstrate concepts on …