Amazon Customer Analysis

User review networks for customer segmentation Over the past decade or two, Americans have continued to prefer payment methods that are traceable, providing retailers and vendors with a rich source of data on their customers. This data is used by data scientists to help businesses make more informed decisions with respect to inventory, marketing, and … Read more Amazon Customer Analysis

A Guide for Building Convolutional Neural Networks

Computer Vision is at the forefront of advancements in Artificial Intelligence (AI). It’s moving fast with new research coming out each and every day allowing us to do truly amazing things that we couldn’t do before with computers and AI. Convolutional Neural Networks (CNNs) are the driving force behind every advancement in Computer Vision research … Read more A Guide for Building Convolutional Neural Networks

The invisible workers of the AI era

50 ways to label data There are different ways to get your data labeled. Some firms label their data themselves — although this can be costly, as hiring people simply for these tasks costs firms both money and flexibility. Other companies even find ways to get people to label their data for free. Ever wonder why Google’s reCAPTCHA … Read more The invisible workers of the AI era

AI and Machine Learning: Moving from Training to Education

The debate over whether AI will ever achieve capabilities on par with or beyond human intelligence is ongoing. It has certainly intensified with the recent advancements in AI, Machine Learning (ML), and Deep Learning (DL), with some believing that the current technologies are already capable of paving the way for Artificial General Intelligence (AGI). You … Read more AI and Machine Learning: Moving from Training to Education

Towards Ethical Machine Learning

https://initiatives.provost.uci.edu/event/philosophy-machine-learning-knowledge-causality/ I quit my job to enter an intensive data science bootcamp. I understand the value behind the vast amount of data available that enables us to create predictive machine learning algorithms. In addition to recognizing its value on a professional level, I benefit from these technologies as a consumer. Whenever I find myself in … Read more Towards Ethical Machine Learning

Parsing XML, Named Entity Recognition in One-Shot

Photo credit: Lynda.com Conditional Random Fields, Sequence Prediction, Sequence Labelling Parsing XML is a process that is designed to read XML and create a way for programs to use XML. An XML parser is the piece of software that reads XML files and makes the information from those files available to applications. While reading an … Read more Parsing XML, Named Entity Recognition in One-Shot

An introduction to web scraping with Python

Introduction As a data scientist, I often find myself looking for external data sources that could be relevant for my machine learning projects. The problem is that it is uncommon to find open source data sets that perfectly correspond to what you are looking for, or free APIs that give you access to data. In … Read more An introduction to web scraping with Python

Top Examples of Why Data Science is Not Just .fit().predict()

In this post, I’m going to review some of the top concepts I learned that turned me from a technical data scientist to a good data scientist Two months ago, I finished my second year as a data scientist at YellowRoad so I decided to do a retrospective analysis on my projects, what did I … Read more Top Examples of Why Data Science is Not Just .fit().predict()

Pew Study Answers on Artificial Intelligence and the Future of Humans

The AI future is uncertain, but generally, I think it will improve life. I was one of the 900+ futurists interviewed for The Pew Research study released yesterday, “Artificial Intelligence and the Future of Humans.” Conducted with Elon University, the study revolved around AI and the 50th anniversary of the Internet. The report asked three questions … Read more Pew Study Answers on Artificial Intelligence and the Future of Humans

Classification (Part 2) — Linear Discriminant Analysis

An explanation of Bayes’ theorem and linear discriminant analysis Photo by Jerry Kiesewetter on Unsplash Overview Previously, logistic regression was introduced for classification. Unfortunately, like any model, it presents some flaws: when classes are well separated, parameter estimates from logistic regression tend to be unstable; when the data set is small, logistic regression is also unstable … Read more Classification (Part 2) — Linear Discriminant Analysis

AWS Architecture For Your Machine Learning Solutions

The Undertaking Recently, I was involved in developing a machine learning solution for one of the largest North American steel manufacturers. The company wanted to leverage the power of ML to get insights on customer segmentation, order prediction and product-volume recommendations. This article revolves around why and how we leveraged AWS for deploying our deliverables … Read more AWS Architecture For Your Machine Learning Solutions

How to tune a BigQuery ML classification model to achieve a desired precision or recall

Select the probability threshold based on the ROC curve BigQuery provides an incredibly convenient way to train machine learning models on large, structured datasets. In an earlier article, I showed you how to train a classification model to predict flight delays. Here’s the SQL query that will predict whether a flight is going to be late … Read more How to tune a BigQuery ML classification model to achieve a desired precision or recall

Implementing Defensive Design in AI Deployments

A series of insights and battle scars from the world of medical device design With the upcoming launch of one of our AI products, there has been a repeating question that clients kept asking. This same question also shows up once in a while with our consulting engagements, to a lesser degree, but still demands an … Read more Implementing Defensive Design in AI Deployments

Object detection and tracking in PyTorch

Detecting multiple objects in images and tracking them in videos In my previous story, I went over how to train an image classifier in PyTorch, with your own images, and then use it for image recognition. Now I’ll show you how to use a pre-trained classifier to detect multiple objects in an image, and later track … Read more Object detection and tracking in PyTorch

10 Lessons Learned From Participating in Google AI Challenge

Key Points of My Work Disclaimers: I will present only a portion of the code I wrote for this competition, my teammates are absolutely not responsible for my awful and buggy code. A portion of this code is inspired by great Kagglers sharing their insights and code in Kaggle kernels and forums. I hope I did … Read more 10 Lessons Learned From Participating in Google AI Challenge

AI: the silver bullet to stop Technical Debt from sucking you dry

You’ve heard a lot about student debt, but what about technical debt? It’s Friday evening in the Bahamas. You’re relaxing under a striped red umbrella with a succulent glass of wine and your favorite book — it’s a great read and you love the way the ocean breeze moves the pages like leaves on a tree. As … Read more AI: the silver bullet to stop Technical Debt from sucking you dry

Pitching Artificial Intelligence to Business People

From silver bullet syndrome to silver linings In this article I plan to share with you our recent experience pitching AI to business folk, and what lessons we learned along the way. As a small firm of AI experts, we follow an awareness marketing approach. Rather than relying solely on one marketing channel, we attend conferences … Read more Pitching Artificial Intelligence to Business People

A Thought on Using Machine Learning Models

During my training classes, during or after the discussion of the common machine learning models, I usually bring up one topic: how insights from these models are used, or how the model is implemented in a business or organizational process. For instance, we can get the most accurate model, where it’s very good at ‘predicting’ which … Read more A Thought on Using Machine Learning Models

Improving Patient Flows With Data Science And Analytics

Reducing Costs By Improving Processes Our team was recently asked how data analytics and data science can be used to improve bottlenecks and patient flows in hospitals. Healthcare providers and hospitals can have very complex patient flows. Many steps can intertwine, resources have to shift in between tasks all the time, and severity of patients … Read more Improving Patient Flows With Data Science And Analytics

How a High School Junior Made a Self-Driving Car

Questions related to this repository from a project I created almost three years ago are among the most numerous questions I receive. The repository itself is really nothing too special, just an implementation of an Nvidia paper that was released about a year prior. A graduate student later managed to implement my code in an … Read more How a High School Junior Made a Self-Driving Car

Word Representation in Natural Language Processing Part II

In the previous part (Part I) of the word representation series, I talked about fixed word representations that make no assumption about semantics (meaning) and similarity of words. In this part, I will describe a family of distributed word representations. The main idea is to represent words as feature vectors. Each entry in the vector stands … Read more Word Representation in Natural Language Processing Part II

AlphaZero implementation and tutorial

A walk-through of implementing AlphaZero using custom TensorFlow operations and a custom Python C module I describe here my implementation of the AlphaZero algorithm, available on GitHub, written in Python with custom TensorFlow GPU operations and a few accessory functions in C for the tree search. The AlphaZero algorithm has gone through three main iterations, first … Read more AlphaZero implementation and tutorial

To all Data Scientists — The one Graph Algorithm you need to know

Dec 8, 2018 Photo by Alina Grubnyak on Unsplash Graphs provide us with a very useful data structure. They can help us to find structure within our data. With the advent of Machine learning and big data, we need to get as much information as possible about our data. Learning a little bit of graph theory … Read more To all Data Scientists — The one Graph Algorithm you need to know

A Data Analysis of Riding The Bus

What should I expect before a round of the popular drinking game? Recommended equipment for Ride The Bus College. It’s a time for things like exploring your personality, finding your values, and making lifelong friends. Those are all well and good, but college is also a time for drinking games! There’s plenty of time in the … Read more A Data Analysis of Riding The Bus

Building a molecular charge classifier

The intersection of Chemistry and A.I. A.I. has seen unprecedented growth in the past couple of years. Although machine learning architectures like Neural Networks (NN) have been known for a long time thanks to breakthroughs from top researchers like Geoffrey Hinton, only recently have NNs become powerful tools in an A.I. specialist’s toolbox. This is credited mainly … Read more Building a molecular charge classifier

Comparative study on Classic Machine learning Algorithms

2. Logistic Regression Just like linear regression, logistic regression is the right algorithm to start with for classification problems. Even though the name contains ‘Regression’, it is not a regression model but a classification model. It uses a logistic function to model a binary output. The output of logistic regression is a probability (0≤x≤1), … Read more Comparative study on Classic Machine learning Algorithms

How to determine the best model?

Machine learning models play a critical role in many aspects of today’s business. The use of a predictive model can improve the business bottom line, and a slightly improved model can result in an increase of millions of dollars. Although you may not know all the popular algorithms (and more powerful algorithms in the future), … Read more How to determine the best model?

Image Processing Class (EGBE443) #3 — Point Operation

Applying a point operation affects the histogram. Raising the brightness shifts the histogram to the right, and increasing the contrast of the image expands the histogram. These point operations map intensities through a mapping function containing constants derived from the image content, such as the highest and the lowest intensity. Automatic … Read more Image Processing Class (EGBE443) #3 — Point Operation
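The two point operations named in the excerpt above can be sketched on a flat list of 8-bit grayscale intensities. This is an illustration under assumptions, not the class's own code: the function names, the 0 to 255 range, and the linear min/max stretch are choices made here.

```python
def adjust_brightness(pixels, offset):
    """Add a constant to every intensity, clamping to [0, 255].

    A positive offset moves the whole histogram to the right.
    """
    return [min(255, max(0, p + offset)) for p in pixels]


def stretch_contrast(pixels):
    """Linearly map [lowest, highest] intensity onto [0, 255].

    The constants of the mapping come from the image content itself,
    which widens (expands) the histogram.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch; return it unchanged.
        return list(pixels)
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]


if __name__ == "__main__":
    image = [50, 75, 150]  # hypothetical intensities
    print(adjust_brightness(image, 20))  # [70, 95, 170]
    print(stretch_contrast(image))       # [0, 64, 255]
```

Each output pixel depends only on the input pixel at the same position, which is what makes these point operations rather than neighbourhood filters.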

Part 2: Gradient descent and backpropagation

Dec 3, 2018 In this article you will learn how a neural network can be trained by using backpropagation and stochastic gradient descent. The theory is described thoroughly, and a detailed example calculation is included in which both weights and biases are updated. This is the second part in a series of articles: I assume … Read more Part 2: Gradient descent and backpropagation

Machine Learning Introduction: A Comprehensive Guide

Dec 3, 2018 This is the first of a series of articles in which I will describe machine learning concepts, types, algorithms and Python implementations. The main goals of this series are: Creating a comprehensive guide to machine learning theory and intuition. Sharing and explaining machine learning projects, developed in Python, to show in a … Read more Machine Learning Introduction: A Comprehensive Guide

Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

Abstract: This post provides an overview of a phenomenon called “Super Convergence”, in which a deep neural network can be trained an order of magnitude faster than with conventional training methods. One of the key elements is training the network using the “One-cycle policy” with maximum … Read more Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

The Hidden Dangers in Algorithmic Decision Making

A robot judge in Futurama was all fun and games, until COMPAS was created. The quiet revolution of artificial intelligence looks nothing like the way movies predicted; AI seeps into our lives not by overtaking our lives as sentient robots, but instead, steadily creeping into areas of decision-making that were previously exclusive to humans. Because it … Read more The Hidden Dangers in Algorithmic Decision Making

Image Processing Class #0.2 — Digital Image

Image File Format When images are stored in memory, a raster image contains pixel values arranged in a regular matrix. Conversely, a vector image represents geometric objects using continuous coordinates. If you scale up a raster image, its resolution is lost, but this does not happen with a vector image. Raster image and Vector image Tagged Image File Format … Read more Image Processing Class #0.2 — Digital Image

Full-Fledged Recommender System

Nov 29, 2018 The rapid rise in AI applications and decreasing processor and memory costs have allowed the last decade to show incredible progress in Recommender Systems. Given their rising importance in the retail industry, they are undoubtedly one of the more popular topics in Artificial Intelligence. https://thedatascientist.com/wp-content/uploads/2018/05/recommender_systems.png However, creating a full-fledged, ready-for-production, recommender system can … Read more Full-Fledged Recommender System

Attention Seq2Seq with PyTorch: learning to invert a sequence

Nov 29, 2018 TL;DR: In this article you’ll learn how to implement sequence-to-sequence models with and without attention on a simple case: inverting a randomly generated sequence. You might already have come across thousands of articles explaining sequence-to-sequence models and attention mechanisms, but few are illustrated with code snippets. Below is a non-exhaustive list of articles … Read more Attention Seq2Seq with PyTorch: learning to invert a sequence

Neural Networks II: First Contact

Gentle introduction on Neural Networks Nov 29, 2018 This series of posts on Neural Networks is part of the collection of notes taken during the Facebook PyTorch Challenge, prior to the Deep Learning Nanodegree Program at Udacity. Contents Introduction Forward Pass Backward Propagation Learning Testing Conclusion 1. Introduction In the next illustration, an Artificial Neural Network is … Read more Neural Networks II: First Contact

Part 1: A neural network from scratch — Foundation

Nov 27, 2018 In this series of articles I will explain the inner workings of a neural network. I will lay the foundation for the theory behind it as well as show how a competent neural network can be written in a few easy-to-understand lines of Java code. This is the first part … Read more Part 1: A neural network from scratch — Foundation

Map the solar system to a place near you –A NatGeo’s MARS inspired Shiny web app

Nov 27, 2018 I recently leveled up to fatherhood. That’s why I am currently on 5 months of parental leave (thanks to the awesome team @store2be for going along with this!). Every morning at around 5am, I leave the bedroom with my son for the kitchen so his mom can have two real hours of … Read more Map the solar system to a place near you –A NatGeo’s MARS inspired Shiny web app

Being a Machine Learning Engineer: 7-months in

What kind of data is there? Is it only numerical? Are there categorical features which could be incorporated into the model? Heads up, categorical features can be considered any type of data which isn’t immediately available in numerical form. In the problem of trying to predict housing prices, you might have number of bathrooms as … Read more Being a Machine Learning Engineer: 7-months in

The Power of Data

Reflections on how data (or lack thereof) helps (or fails) policy makers in developing countries Foreword When I stood up to speak last Friday at the Steering Committee meeting between the Ministry of Education of Ivory Coast and TRECC — a partnership for transforming Education in cocoa producing regions, led by the Jacobs Foundation –, it had … Read more The Power of Data

Neural Networks I: Notation and building blocks

Gentle introduction on Neural Networks Nov 25, 2018 This series of posts on Neural Networks is part of the collection of notes taken during the Facebook PyTorch Challenge, prior to the Deep Learning Nanodegree Program at Udacity. Contents Neurons Connections Layers — Neurons vs Connections 3.1 Layers of Neurons 3.2. Layers of Connections — PyTorch Example 4. Notation ambiguity: Y = … Read more Neural Networks I: Notation and building blocks

Exploratory Data Analysis (EDA) techniques for kaggle competition beginners

A hands-on guide for beginners on EDA and Data Science competitions Exploratory Data Analysis (EDA) is an approach to analysing data sets to summarize their main characteristics, often with visual methods. The steps involved in EDA are: Data Collection, Data Cleaning, Data Preprocessing, Data Visualisation. Data Collection Data collection is the process … Read more Exploratory Data Analysis (EDA) techniques for kaggle competition beginners

Blockchain can be the new paradigm of the net

The popularization of blockchain will not depend on the users understanding its operation but on the existence of friendly and effective applications that solve real problems. Nov 22, 2018 Historically, each paradigm of the internet has had its killer application: before the web, it was email, with the original web it was Google and with … Read more Blockchain can be the new paradigm of the net

Combating media bias with AWS Comprehend

Nov 19, 2018 Photo by Randy Colas on Unsplash In the world of fake news and ideology-driven subjective media coverage, it is questionable which sources of journalism can be considered “reliable”. It happens many times that two different news outlets share two completely different takes on the same story. “Experts” point out different consequences of events … Read more Combating media bias with AWS Comprehend

Becoming An Analytics Manager Isn’t A Promotion.

(Photo by rawpixel on Unsplash) Nov 18, 2018 It’s A Career Change. Starting out as a data scientist may be the modern version of becoming a rock star, but no one really seems to be talking about what happens a few years further into your career. Analysing big data sets. Building models. Connecting data pipelines. The challenges … Read more Becoming An Analytics Manager Isn’t A Promotion.