Pre-training BERT from scratch with cloud TPU

Step 1: Setting up the training environment. First and foremost, we get the packages required to train the model. The Jupyter environment allows executing bash commands directly from the notebook by using an exclamation mark ‘!’, like this:
!pip install sentencepiece
!git clone https://github.com/google-research/bert
I will be exploiting this approach to make use of several other bash commands … Read more Pre-training BERT from scratch with cloud TPU

How are the predicted food trends of 2019 holding up so far in the US?

Towards the end of every year, industry experts, local businesses, journalists, basically everybody will try to predict which foods will be popular in the following year. A variety of things are predicted from the next big beverage, to the new ‘kale’, and even to the new hot restaurant trends. There are extensive lists like this … Read more How are the predicted food trends of 2019 holding up so far in the US?

Metrics for Imbalanced Classification

The notion of metrics in Data Science is extremely important. If you don’t know how to evaluate current results properly, you cannot improve them either. A wrong understanding of metrics also leads to a wrong estimate of the model’s capacity and a wrong picture of the state of the problem. The current story will … Read more Metrics for Imbalanced Classification

The Remarkable world of Recommender Systems

Recommender Systems Recommendation Engines try to make a product or service recommendation to people. In a way, Recommenders try to narrow down choices for people by presenting them with suggestions that they are most likely to buy or use. Recommendation systems are almost everywhere from Amazon to Netflix; from Facebook to Linkedin. In fact, a … Read more The Remarkable world of Recommender Systems

TCAV: Interpretability Beyond Feature Attribution

Working TCAV essentially learns ‘concepts’ from examples. For instance, TCAV needs a couple of examples of ‘female’, and something ‘not female’ to learn a “gender” concept. The goal of TCAV is to determine how much a concept (e.g., gender, race) was important for a prediction in a trained model. …even if the concept was not part … Read more TCAV: Interpretability Beyond Feature Attribution

Truly Understanding the Kernel Trick

Here, we learn the fundamentals behind the Kernel Trick. How does it work? How does the Kernel Trick compute the dot product (or similarity) in infinite dimensions without an increase in computation? What is a Kernel Trick? In spite of its profound impact on the Machine Learning world, little is found that explains the fundamentals behind the Kernel … Read more Truly Understanding the Kernel Trick

Do not rush to code. 4 principles for AI projects in enterprise.

Think together before doing alone, an ants’ principle. No, the AI doesn’t understand by itself. No, Data Science is not automatic. An agile method doesn’t mean chaos. In a word: no, it’s not magic. What needs to be done before rushing to code? Here I share with you 4 principles I learned from my professional and … Read more Do not rush to code. 4 principles for AI projects in enterprise.

ML Models — Prototype to Production

So you have a model, now what? Through the powers of machine learning and the promise of deep learning, today’s conferences, thought leaders and experts in ML and AI have been painting a vision of businesses powered by data. However, despite the groundbreaking research and the constant flood of new papers in the fields of … Read more ML Models — Prototype to Production

AI & Ethics: Are We Making It More Difficult On Ourselves?

Not too long ago we discussed the AI Apocalypse as it pertained to the Facebook #TenYearChallenge. Is Facebook evil? Are we evil for helping usher in our own demise? As we put it: not quite. However, AI & ethics seem inexorably linked and for good reason. This is part of an ongoing series on the … Read more AI & Ethics: Are We Making It More Difficult On Ourselves?

RStudio Addin

If you want to create your own RStudio addins, all you need to do is:
1. Create an R package
2. Create some R functions
3. Create a file at inst/rstudio/addins.dcf
Links
1. Create an R Package
Set up tools for package development:
library(devtools)
library(roxygen2)
# getwd()
# setwd("path/to/repo")
Create Package. I am mainly following: https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
create("rstudio_addin")
This … Read more RStudio Addin

Robot Thinking Will Power New Frontiers in Deep Learning AI

© Agsandrew | Dreamstime.com ID 36480720 Deep learning has advanced to the point where we’re seeing computers do things that would have been considered science fiction just a few years ago. Areas such as language translation, image captioning, picture generation, and facial recognition display major advances on a regular basis. But certain artificial intelligence problems don’t … Read more Robot Thinking Will Power New Frontiers in Deep Learning AI

Rapid Computer Vision Prototyping with Azure

This post has been co-authored by Alex Akulov and Ryan Peyman from Omnia AI, Deloitte Canada’s AI practice Object-detecting trucks using a computer vision model on Azure Custom Vision Imagine building and deploying a state-of-the-art object detection model quickly and without writing a single line of code. Computer Vision tools and software have come a long … Read more Rapid Computer Vision Prototyping with Azure

Comprehensive Introduction to Turing Learning and GANs: Part 2

Building an Image GAN As we have already discussed several times, training a GAN can be frustrating and time-intensive. We will walk through a clean minimal example in Keras. The results are only on the proof-of-concept level to enhance understanding. In the code example, if you don’t tune parameters carefully, you won’t surpass this level (see … Read more Comprehensive Introduction to Turing Learning and GANs: Part 2

The ultimate guide to Google Sheets as a reliable data source

Keep calm and use lots of data validation I occasionally need to grant a non-technical colleague the ability to input information into our data warehouse on an ad-hoc basis. For example, our customer service team at Milk Bar maintains a list of special wedding cake orders in Google Sheets that we need to collect data … Read more The ultimate guide to Google Sheets as a reliable data source

Outlier Detection and Treatment: A Beginner’s Guide

One of the most important steps in data pre-processing is outlier detection and treatment. Machine learning algorithms are very sensitive to the range and distribution of data points. Data outliers can deceive the training process resulting in longer training times and less accurate models. Outliers are defined as samples that are significantly different from the … Read more Outlier Detection and Treatment: A Beginner’s Guide

In the future, you may be fired by an algorithm

Photo by Adam Fossier on Unsplash Algorithms determine the people we meet on Tinder, recognize your face to open the keyless door, or fire you when your productivity drops. Machines are used to make decisions about health, employment, education, vital financial matters and criminal sentencing. Algorithms are used to decide who gets a job interview, who gets … Read more In the future, you may be fired by an algorithm

Humans choose, AI does not

Artificial intelligence isn’t human “Artificial Intelligence Will Best Humans at Everything by 2060, Experts Say”. Well. First, as Yogi Berra said, “It’s tough to make predictions, especially about the future.” Where is my flying car? Second, the title reads like clickbait, but surprisingly it appears to be pretty close to the actual survey, which asked AI … Read more Humans choose, AI does not

From Error to Autonomy:

Up to this point, the idea of computational autonomy is questionable and has not reached a consensus in the scientific community. At the beginning of computation and data analysis in the 19th century, Ada Lovelace structured an idea that legitimizes the machine to create diverse compositions; however, Lovelace considers that it would not … Read more From Error to Autonomy:

Simple Neural Network Model Using TensorFlow Eager Execution

Building the Model What I did here was to create a Python class that stores the codes responsible for weight and bias initialization, forward pass, backpropagation and updates to weights and biases. The weights and biases were initialized by sampling random values from a standard normal distribution. Random initialization of weights is typically preferred over … Read more Simple Neural Network Model Using TensorFlow Eager Execution

Visual Perception-From Human Vision to Computer Vision

A brief about the inspiring ideas from human cognition Made up of 140 million neurons, the human visual cortex is one of the most mysterious parts of the brain responsible for processing and interpreting visual data to give perception and formulate memories. Humans can tell a lot about a scene from a little information. For … Read more Visual Perception-From Human Vision to Computer Vision

How to Generate Prediction Intervals with Scikit-Learn and Python

Problem Set-Up For this walk-through, we’ll use real-world building energy data from a machine learning competition which was hosted on DrivenData. You can get the raw data here, but I’ve provided a cleaned-up version in GitHub which has energy and eight features measured at 15-minute intervals. data.head() Cleaned building energy data The objective is to predict … Read more How to Generate Prediction Intervals with Scikit-Learn and Python

An Intuitive Understanding to Neural Style Transfer

Neural style transfer is a machine learning technique that merges the “content” of one image with the “style” of another Creating a rainbow, mosaic hummingbird with neural style transfer Content: High level features describing objects and their arrangement in the image. Examples of content are people, animals, and shapes Style: The texture of the image. Examples … Read more An Intuitive Understanding to Neural Style Transfer

Creating a Bubbles Map using React-Leaflet

Look ma, a colourful world map! Introduction to Leaflet Leaflet is an open-source JavaScript library for mobile-friendly interactive maps. The above map and much more can be plotted using Leaflet. Since we are using React we will be using React-Leaflet instead which provides an abstraction of 🍃 Leaflet as ⚛️ React components. Setting up React App Let’s quickly … Read more Creating a Bubbles Map using React-Leaflet

Federal Election… Who To Vote For According To Twitter

Introduction The Australian 2019 Federal Election is almost upon us (18th of May) and as usual, I have no idea who to vote for. Well that’s not true, usually, I’m what is colloquially described as a “donkey voter” — someone who doesn’t vote. You see, in democratic Australia, it is compulsory to vote. From the Australian Electoral … Read more Federal Election… Who To Vote For According To Twitter

Mathematical Intuition behind Gradient Descent

Gradient descent is an iterative optimization algorithm for finding the minimum of a function, most commonly used in Machine Learning and Deep Learning. Introduction If you’ve ever seen or heard of the term ‘Gradient Descent’ in your life, you must have most certainly come across the following equation: Gradient Descent – parameter update step And the … Read more Mathematical Intuition behind Gradient Descent
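The excerpt refers to the parameter update step, but the equation itself does not appear in it. As a reminder only, the update is usually written in the form below. The symbols (θ for the parameters, α for the learning rate, J for the cost function, t for the iteration index) are the conventional textbook choices and are an assumption here, not necessarily the article’s notation.

```latex
\theta_{t+1} = \theta_t - \alpha \, \nabla_{\theta} J(\theta_t)
```

Each iteration moves the parameters a small step against the gradient. For a smooth J and a sufficiently small α, repeating this step drives J toward a (local) minimum.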

Data Engineering — the Cousin of Data Science, is Troublesome

How do you get your analysts to realize the importance of expanding their toolkit? I guess I’ve found the answer. We always deem data science the “sexiest job of the 21st century”. When it comes to the transformation from a traditional company to an analytical company, either the company or the data scientists would expect to dive … Read more Data Engineering — the Cousin of Data Science, is Troublesome

Simplified Math behind Dropout in Deep Learning

With this background, let’s dive into the Mathematics of Dropout. You may skip directly to the “Dropout equivalent to regularized Network” section for the inferences. Math behind Dropout Consider a single layer linear unit in a network as shown in Figure 4 below. Refer to [2] for details. Figure 4. A single layer linear unit out of network. This … Read more Simplified Math behind Dropout in Deep Learning

Artificial Intelligence and Robotic Process Automation

3. Design and Implementation Techniques Figure 1 shows a typical end-to-end component design depicting implementation of a fully functional AI driven Robotic Process Automation (RPA). Figure 1. Architecture of Robotic Process Automation with Conversational AI. Process Description Starting from the top, we have a user interface which will allow to capture user requests. The user … Read more Artificial Intelligence and Robotic Process Automation

Who Is a Data Engineer & How to Become a Data Engineer?

A simple guide on how to ride the waves of Data Engineering and not let them pull you under. It seems like these days everybody wants to be a data scientist. But what about Data Engineering? At its heart, it is a hybrid of sorts between a data analyst and a data scientist; Data Engineer … Read more Who Is a Data Engineer & How to Become a Data Engineer?

Neural Networks and Fibonacci Numbers

Appendix
Code snippet for plotting 3-D Plots for visualizing initializations.
# import statements for 3-D Weight Matrix Surface Plot
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import numpy as np
import random

def plot_3d(X_val, Y_val, W):
    fig = plt.figure()
    ax = fig.gca(projection='3d')
    X = np.arange(0, X_val, 1)
    Y = np.arange(0, Y_val, 1)
    X_dash = X
    Y_dash = Y
    X, Y …
Read more Neural Networks and Fibonacci Numbers

Complete Guide to Enterprise Chatbot Development

‘Customer service is the new marketing.’ The present-day customer has information at their fingertips. Enterprises are always on the lookout to make sure that they build a water-tight customer support process and have the right systems in place. As of today, major brands and enterprises are looking at getting started with bot … Read more Complete Guide to Enterprise Chatbot Development

AI Search Algorithms Every Data Scientist Should Know

TL;DR The post below outlines a few of the key search algorithms in AI, why they are important, and what they are used for. While search and planning algorithms have taken a back seat to machine and deep learning methods in recent years, a better understanding of these algorithms can boost the performance of your models. … Read more AI Search Algorithms Every Data Scientist Should Know

A Bird’s Eye View: How Machine Learning Can Help You Charge Your E-Scooters

Log-Scale Transformation For each feature, I plotted the distribution to explore the data for feature engineering opportunities. For features with a right-skewed distribution, where the mean is typically greater than the median, I applied these log transformations to normalize the distribution and reduce the variability of outlier observations. This approach was used to generate a … Read more A Bird’s Eye View: How Machine Learning Can Help You Charge Your E-Scooters
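As a minimal sketch of the log-scale transformation described above, assuming a non-negative numeric feature: the sample values and the helper name `log_transform` are hypothetical, and `np.log1p` (that is, log(1 + x)) is one common choice rather than necessarily the exact transformation the author applied.

```python
import numpy as np


def log_transform(values):
    """Apply log(1 + x) to a non-negative, right-skewed feature."""
    values = np.asarray(values, dtype=float)
    if np.any(values < 0):
        # This sketch assumes counts or amounts, which are never negative.
        raise ValueError("log_transform assumes non-negative values")
    return np.log1p(values)


# Hypothetical right-skewed feature: mostly small values plus one large outlier.
feature = np.array([1.0, 2.0, 2.0, 3.0, 4.0, 250.0])
transformed = log_transform(feature)

# The gap between mean and median is a rough indicator of right skew.
skew_gap_before = feature.mean() - np.median(feature)
skew_gap_after = transformed.mean() - np.median(transformed)
```

On this toy data the mean sits far above the median before the transformation and much closer to it afterwards, which is the effect the paragraph describes for outlier observations.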

When Job Hunting Meets Data Science (Part 1)

Endless challenges. That’s how we grow. In our Data Science Immersive program, the last major project before the Capstone is to build predictive models for various aspects of job hunting, such as salary and job categories. The project resembles the real-world scenario: Your boss gives you a target and/or a problem statement and you find … Read more When Job Hunting Meets Data Science (Part 1)

Insight to the Fourier Transform and The Simple Implementation of It

source: https://pa1.narvii.com/6397/fbeec74f0468cf51eb46f4f869190563cf50829b_hq.gif In this post, I will not go into detail about the derivation of the Fourier transform or Fourier series, etc. Instead, we will explore what this transformation outputs and how it works. So, the formula of the Fourier transform we will discuss in this story is called the Discrete Fourier Transform (DFT). … Read more Insight to the Fourier Transform and The Simple Implementation of It
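The excerpt stops before the formula, so the following is only a sketch based on the standard DFT definition, X[k] = Σ x[n]·exp(−2πi·k·n/N) summed over n = 0 … N−1. It is not the article’s own implementation. The function name `dft` and the 1 Hz test signal are hypothetical.

```python
import numpy as np


def dft(signal):
    """Naive O(N^2) Discrete Fourier Transform of a 1-D signal."""
    x = np.asarray(signal, dtype=complex)
    n_samples = x.shape[0]
    n = np.arange(n_samples)           # time indices 0..N-1 (row)
    k = n.reshape((n_samples, 1))      # frequency indices 0..N-1 (column)
    # basis[k, n] = exp(-2j * pi * k * n / N)
    basis = np.exp(-2j * np.pi * k * n / n_samples)
    return basis @ x


# Hypothetical test signal: a 1 Hz sine sampled at 8 Hz for one second.
sample_rate = 8
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 1 * t)
spectrum = dft(signal)

# Magnitude per frequency bin; bin k corresponds to k Hz in this setup.
magnitudes = np.abs(spectrum)
```

The output is one complex number per frequency bin. Its magnitude says how strongly that frequency is present in the signal, and comparing `spectrum` against `np.fft.fft(signal)` is a quick sanity check of the naive version.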

A Simple Breast Cancer Classifier using ANN

I won’t repeat cliche statements like “… deep learning is the next big thing.” No. If you are here reading this article, you most certainly know what Deep Learning or a Neural Network is and how it is going to evolve. Let’s cut to the chest and build a classifier using a Neural Network that will … Read more A Simple Breast Cancer Classifier using ANN

How to prevent misinformation in data visualization?

How does misinformation happen? This is not a surprise. A graphic can trigger false insights through the use of collective meanings that are often biased toward self-confirmation. Either by exposure, oversimplification or worldview, misinformation through data visualization is based on individual and various components. I personally have a preference for the Travis Syndrome by which a person … Read more How to prevent misinformation in data visualization?

What Do You Need To Do Before Hiring A Data Scientist?

Here are some recommendations for you to prepare before investing in a Data Science function at your company: 1. Have A Clear Understanding Of Why You Want To Hire A Data Scientist You can begin by identifying the business problems and opportunities you want them to address. You don’t necessarily need to have large amounts of … Read more What Do You Need To Do Before Hiring A Data Scientist?

Scraping the Top 5 Tech Company Job Boards

How to scrape Facebook’s job board, along with Apple, Amazon, Google and Netflix. Gustave Caillebotte [Public domain] In this project, I wanted to scrape the job search results from Apple, Amazon, Facebook, Google, and Netflix to help expedite my job search. It is tedious to go to each site to get all the job results … Read more Scraping the Top 5 Tech Company Job Boards

Tesla’s Deep Learning at Scale: Using Billions of Miles to Train Neural Networks

What Tesla can do that Waymo can’t Training data is one of the fundamental factors that determine how well deep neural networks perform. (The other two are the network architecture and optimization algorithm.) As a general principle, more training data leads to better performance. This is why I believe Tesla, not Waymo, has the most promising … Read more Tesla’s Deep Learning at Scale: Using Billions of Miles to Train Neural Networks

Tutorial: Using Deep Learning and CNNs to make a Hand Gesture recognition model

First, here’s the Github repository with the code. The project is in the format of a Jupyter Notebook, which can be uploaded to Google Colaboratory to work without environment issues. Machine Learning is very useful for a variety of real-life problems. It is commonly used for tasks such as classification, recognition, detection and predictions. Moreover, … Read more Tutorial: Using Deep Learning and CNNs to make a Hand Gesture recognition model

How I Improved Accuracy Of My Machine Learning Project?

Follow these tips to get better results Working on a machine learning project can be a tedious task, in particular when you have gathered all of the available data and yet the model yields poor results. This article should provide you with the tips that you can follow to improve the accuracy of your machine learning … Read more How I Improved Accuracy Of My Machine Learning Project?

GitHub Autocompletion with Machine Learning

Written by Óscar D. Lara Yejas and Ankit Jha As data scientists, one of the fields that comes closest to our hearts is software development since, after all, we are avid users of all sorts of packages and frameworks that help us build our models. GitHub is one of the key technologies to support the software … Read more GitHub Autocompletion with Machine Learning

Lyft data scientist shares five pieces of career advice

Troy Shu shares his advice on starting a new role, becoming a better storyteller, stakeholder management, managing up, and building a consultancy Troy Shu is currently a data scientist at Lyft, a transportation company with over 23 million users. He’s located in New York, where he’s helping build out the bikes and scooters side of … Read more Lyft data scientist shares five pieces of career advice