‘I want to learn Artificial Intelligence and Machine Learning. Where can I start?’

I bought a plane ticket to the US with no return flight. I’d been studying for a year and I figured it was about time I started putting my skills into practice. My plan was to rock up to the US and get hired. Then Ashlee messaged me on LinkedIn, “Hey I’ve seen your posts …

Object Detection using Deep Learning Approaches: An End to End Theoretical Perspective

Fast RCNN: The next idea from the same authors: why not create a convolution map of the input image and then just select the regions from that convolutional map? Do we really need to run so many convnets? What we can do is run just a single convnet and then apply region proposal crops on the …

Image Processing Class (EGBE443) #2 -Histogram

Computing the histogram: In this section, the histogram is calculated with Python code (Python 3.6). Python 3.6 has many modules commonly used in image processing, such as Pillow, Numpy, and OpenCV, but this program uses only Pillow and Numpy. To import the image from your computer, …
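The counting step behind an intensity histogram can be sketched without any dependencies. This is only an illustration under stated assumptions: the post itself uses Pillow and Numpy, while the `gray_histogram` helper and the toy 3x4 image below are hypothetical and not taken from the original code.

```python
def gray_histogram(pixels, levels=256):
    """Count how many pixels take each intensity value in [0, levels)."""
    hist = [0] * levels
    for row in pixels:
        for value in row:
            hist[value] += 1
    return hist


# Hypothetical 3x4 grayscale image with 8-bit intensities (an assumption,
# standing in for an image loaded from disk).
image = [
    [0, 0, 255, 128],
    [0, 128, 128, 255],
    [64, 64, 0, 255],
]

hist = gray_histogram(image)
print(hist[0], hist[64], hist[128], hist[255])
```

Each entry `hist[v]` is the number of pixels with intensity `v`, so the entries always sum to the number of pixels in the image.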

Get system metrics for 5 min with Docker, Telegraf, Influxdb and Grafana

Hi all, here is a very quick guide on how to configure system monitoring for one or more servers using a modern stack of technologies: Grafana, Docker, and Telegraf with Influxdb. The main goal of this article is to show how to start getting system metrics from your servers quickly and easily, without spending …

How to get fbprophet working on AWS Lambda

Solving package size issues of fbprophet serverless deployment. I assume you’re reading this post because you’re looking for ways to use the awesome fbprophet (Facebook open source forecasting) library on AWS Lambda and you’re already familiar with the various issues around getting it done. I will be using a Python 3.6 …

Machine Learning – Particle Swarm Optimization (PSO) and Twitter

We all live in a world where analyzing a massive set of unstructured data is becoming a business need. And the time we spend on the internet is basically the time we spend on social media. Even our daily life is affected by the people around us. And we tend to change our opinions …

Multi-Layer perceptron using Tensorflow

Sep 11, 2018. In this blog, we are going to build a neural network (multilayer perceptron) using TensorFlow and successfully train it to recognize digits in images. TensorFlow is a very popular deep learning framework released by Google, and this notebook will guide you through building a neural network with this library. If you want to understand …

Diving into K-Means…

Sep 9, 2018. We completed our first basic supervised learning model, the Linear Regression model, in the last post. In this post we get started with the most basic unsupervised learning algorithm: K-means Clustering. Let’s get started without further ado! Background: K-means clustering, as the name itself suggests, is a clustering algorithm, …

3 approaches for backtesting historical data

Reading and processing data for statistical and quantitative analysis in trading. Sep 8, 2018. Anyone interested in the statistical analysis of financial markets needs to process historical data. Historical data is needed in order to backtest or train: quantitative trading, statistical trading, and price action replay/walkthrough. Each need comes from different goals. 3 examples on …

Why feature weights in a machine learning model are meaningless

Don’t make decisions based on the weights of an ML model. Aug 31, 2018. As I see our customers fall in love with BigQuery ML, an old problem rears its head: I find that they cannot resist the temptation to assign meaning to feature weights. “The largest weight in my model to predict customer lifetime value,” …

Doing XGBoost hyper-parameter tuning the smart way — Part 1 of 2

Aug 29, 2018. In this post and the next, we will look at one of the trickiest and most critical problems in Machine Learning (ML): Hyper-parameter tuning. After reviewing what hyper-parameters, or hyper-params for short, are and how they differ from plain vanilla learnable parameters, we introduce three general purpose discrete optimization …

Automatic Image Quality Assessment in Python

Aug 28, 2018. Image quality is a notion that highly depends on observers. Generally, it is linked to the conditions in which it is viewed; therefore, it is a highly subjective topic. Image quality assessment aims to quantitatively represent the human perception of quality. These metrics are commonly used to analyze the performance of algorithms in …

The One Probability Review That You Need

Probability and statistics are everywhere: from finance and demographic projections to casino games, these disciplines help us make sense of the world. They also underlie much of the machine learning apparatus that is the rage nowadays. What resources should we turn to, if we were to dust off our knowledge of them? (Disclaimer: I received …

Mapping the UK’s Traffic Accident Hotspots

While looking for some interesting geographical data to work with, I came across the Road Safety Data published by the UK government. This is a very comprehensive road accident data set that includes the incident’s geographical coordinates, as well as other related data such as the local weather conditions, visibility, police attendance and more. There …

What Does It Really Mean to Operationalize a Predictive Model?

It is not enough to just stand up a web service that can make predictions. Aug 13, 2018. In a 2017 SAS survey, 83% of organizations have made moderate-to-significant investments in big data, but only 33% say they have derived value from their investments. Other more recent surveys have …

Practical tips for class imbalance in binary classification

4. Class weighted / cost sensitive learning: Without resampling the data, one can also make the classifier aware of the imbalanced data by incorporating the weights of the classes into the cost function (aka objective function). Intuitively, we want to give a higher weight to the minority class and a lower weight to the majority class. scikit-learn has a …
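As a rough illustration of how class weights enter a cost function, the sketch below scales each sample's log-loss term by the weight of its class. The inverse-frequency weighting rule and both helper names (`balanced_class_weights`, `weighted_log_loss`) are assumptions chosen for this example, not code from the post; the toy labels and probabilities are made up.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Binary cross-entropy with each sample's term scaled by its class weight."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0
        term = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * term
    return total / len(labels)


# Toy data (assumed): 8 majority-class samples, 2 minority-class samples.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.1] * 8 + [0.4, 0.6]  # hypothetical predicted P(y = 1)

weights = balanced_class_weights(labels)
loss = weighted_log_loss(labels, probs, weights)
print(weights, loss)
```

With these weights, a mistake on a minority-class sample costs four times as much as the same mistake on a majority-class sample, which is the intuition described above.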

Feature Engineering for Healthcare Fraud Detection

The nature of the problem: medical fraud and abuse. The U.S. Department of Health and Human Services, in its pamphlet Avoiding Medicare Fraud and Abuse: A Roadmap for Physicians, states that “most physicians strive to work ethically, render high-quality medical care to their patients, and submit proper claims for payment,” yet “the presence of some dishonest …

Math Behind Reinforcement Learning, the Easy Way

Aug 2, 2018. Look at this equation: [equation image: value function of Reinforcement Learning]. If it does not intimidate you, then you are mathematically savvy and there is no point in reading this article 🙂 This article is not about teaching Reinforcement Learning (RL) but about explaining the math behind it. So it …

Cooking with Machine Learning: Dimension Reduction

Recently I came across this cooking recipes data set on Kaggle, and it inspired me to combine two of my main interests in life: food and machine learning. What makes this data set special is that it contains recipes from 20 different cuisines, 6714 different ingredients, but only 26648 samples. Some cuisines have way fewer …

An In-depth Review of Andrew Ng’s deeplearning.ai Specialization

So you’ve seen the recent news about how artificial intelligence (AI) is changing everything. However, the idea of AI has been around for a long time. Machines that think and talk like humans have been the inspiration for movies and stories for decades. But what’s the deal? Why has AI been getting better and better …

An Advanced Example of Tensorflow Estimators Part (1/3)

Estimators were introduced in version 1.3 of the Tensorflow API, and are used to abstract and simplify training, evaluation and prediction. If you haven’t worked with Estimators before, I suggest starting by reading this article to get some familiarity, as I won’t be covering all of the basics of using estimators. By no means …

Hypothesis Analysis Explained

Jul 19, 2018. Hypothesis analysis is a widely known concept and is used extensively by researchers, statisticians and quantitative analysts. It allows them to follow a set of formal steps to perform calculated analysis on their data. It is also widely used in machine learning and artificial intelligence. In this article, I will be explaining core concepts of …

PySpark ML and XGBoost full integration tested on the Kaggle Titanic dataset

Jul 8, 2018. In this tutorial we will discuss integrating PySpark and XGBoost using a standard machine learning pipeline. We will use data from Titanic: Machine Learning from Disaster, one of the many Kaggle competitions. Before getting started, please know that you should be familiar with Apache Spark, XGBoost and Python. The …

Acoustic Noise Cancellation by Machine Learning

DIY Noise-Cancellation System prototype made with TensorFlow. Jun 25, 2018. In this post I describe how I built an active noise cancellation system by means of neural networks on my own. I’ve just got my first results, which I am sharing, but the system looks like a ravel of scripts, binaries, …

Finding Good Learning Rate and The One Cycle Policy.

Introduction: The learning rate might be the most important hyperparameter in deep learning, as it decides how much of the back-propagated gradient is applied in each weight update. This in turn decides how far we move towards the minimum. A small learning rate makes the model converge slowly, while a large learning rate makes it diverge. So, the learning rate …
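The effect of the learning rate can be seen on a toy problem. The sketch below runs plain gradient descent on f(x) = x², a stand-in chosen for this illustration rather than anything from the post, with three assumed learning rates; the function name and values are hypothetical.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent; return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x       # derivative of x**2
        x = x - lr * grad  # the learning rate scales the size of each step
    return x


x_small = gradient_descent(lr=0.01)  # creeps slowly toward the minimum at 0
x_good = gradient_descent(lr=0.4)    # converges quickly
x_large = gradient_descent(lr=1.1)   # overshoots further each step and diverges

print(x_small, x_good, x_large)
```

On this function each update multiplies x by (1 - 2·lr), so the iterate shrinks toward 0 only when that factor has magnitude below 1; with lr = 1.1 the factor is -1.2 and the iterate grows at every step.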

Recommendation Systems — Models and Evaluation

I’ve been involved in building several different types of recommendation systems, and one thing I’ve noticed is that each use case is different from the next, as each aims to solve a different business problem. Let’s consider a few examples. Movie/Book/News Recommendations: suggest new content that increases user engagement. The aim is to introduce users to …

R vs Python: Image Classification with Keras

Many data professionals are strict about the language used for ANN models, limiting their development environment exclusively to Python. I decided to test the performance of Python vs. R in terms of the time required to train a convolutional neural network based model for image recognition. As the starting point, I took the blog post …

IoT Made Easy: ESP-MicroPython-MQTT-ThingSpeak

Using the MQTT protocol, we will capture data from sensors and log it to an IoT service, ThingSpeak.com, and to a mobile app, Thingsview. 1. Introduction: In my previous article, MicroPython on ESP using Jupyter, we learned how to install and run MicroPython on an ESP device. Using Jupyter Notebook as our development environment, we also …

From Git to Colab, via SSH

When you are using Google’s Colaboratory (Colab) for running your Deep Learning models, the most obvious way to access large datasets is by storing them on Google Drive and then mounting Drive onto the Colab environment. But a lot of open-source large datasets that are available for research purposes are hosted on Github/Gitlab. …

DevOps: To do or not to do?

Over the past few decades, four key change initiatives have been taking place in organizations: strategic planning, re-engineering, total quality management and downsizing. The aim of these initiatives was to achieve economic effectiveness, but around 75% of them failed or created problems that were serious enough to threaten the organization’s survival (1). It has been …

The Mathematics of Decision Trees, Random Forest and Feature Importance in Scikit-learn and Spark

This post attempts to consolidate information on tree algorithms and their implementations in Scikit-learn and Spark. In particular, it was written to provide clarification on how feature importance is calculated. There are many great resources online discussing how decision trees and random forests are created and this post is not intended to be that. Although …

Implementing QANet (Question Answering Network) with CNNs and self attentions

Apr 15, 2018. In this post, we will tackle one of the most challenging yet interesting problems in Natural Language Processing: Question Answering. We will implement Google’s QANet in Tensorflow. Just like its machine translation counterpart, the Transformer network, QANet doesn’t use RNNs at all, which makes it faster to train / test. I’m assuming …

What I wish I’d done differently as a data science manager

On centralizing siloed data. Apr 12, 2018. I still get nostalgic looking at the very first Pebbles. (Photo courtesy of Pebble’s first Kickstarter) In 2014, I joined Pebble, the smartwatch maker later acquired by Fitbit, to lead their data science & analytics team. I was interested in the challenges of managing a data organization at a …

Machine Learning for People Who Don’t Care About Machine Learning

Greg Lamp, former co-founder of the data science startup Yhat and current co-founder & CTO of Waldo, shares his thoughts on Machine Learning for those of us who just don’t care about Machine Learning. What is Machine Learning? The definition I have come up with for Machine Learning is as follows… machine learning is using …

Hierarchical Clustering on Categorical Data in R

Dissimilarity Matrix: Arguably, this is the backbone of your clustering. A dissimilarity matrix is a mathematical expression of how different, or distant, the points in a data set are from each other, so you can later group the closest ones together or separate the furthest ones, which is a core idea of clustering. This is the step where …
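To make the idea of a dissimilarity matrix concrete, here is a small sketch. The post works in R; this Python version, the simple matching measure it uses (the fraction of attributes on which two records differ), the helper names and the toy records are all assumptions made for illustration and are not the post's own method.

```python
def simple_matching_dissimilarity(a, b):
    """Fraction of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def dissimilarity_matrix(records):
    """Symmetric matrix of pairwise dissimilarities with a zero diagonal."""
    n = len(records)
    return [
        [simple_matching_dissimilarity(records[i], records[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical categorical records: (colour, size, shape).
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
]

D = dissimilarity_matrix(records)
for row in D:
    print(row)
```

The first two records differ in one attribute out of three, so they are the closest pair and would be the first candidates to group together.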

Which Leading Artificial Intelligence Course Should You Take and What Should You Do After?

4. Course Content: The content of each course is world-class. I’ve frequently stated in my videos these are the best courses I’ve ever taken. The DLND is broken into six parts with five of the parts having significant projects attached.
1. Introduction
2. Neural Networks: creating your first neural network.
3. Convolutional Neural Networks: building …

Automatic GPUs

A reproducible R / Python approach to getting up and running quickly on GCloud with GPUs in Tensorflow. Backstory: After completing Google’s excellent Data Engineering Certified Specialization on Coursera recently (*which I highly recommend), I …

Crossing Your Data Science Chasm

An analytics roadmap for growth. Scenario: You’re an up-and-coming ecommerce/SaaS startup. You’ve got your site up, you have A/B tested your message, and you’ve got your SEO and social ad buys. You’ve set up your email drip campaign and reminders. You also have basic BI reporting telling you channel traffic and conversions. Traffic is …

Quick implementation of Yolo V2 with Keras!

Feb 22, 2018. I do not hold ownership of any of the above pictures. These are merely used for educational purposes to describe the concepts. Real-time multiple object localization has remained a grand debate in the field of digital image processing for many years. With the advent of deep learning and convolutional neural networks, the …

Ordinal Logistic Regression

An overview and implementation in R. Feb 19, 2018. Fig 1: Performance of an individual (Poor, Fair, Excellent). Can you guess the common link among the variables mentioned below? Job satisfaction level: Dissatisfied, Satisfied, Highly Satisfied. Performance of an individual: Poor, Fair, Excellent. Impact of a regulation on a bank’s performance: Positive, Neutral, Negative. The variables are not only …

Writing Custom Keras Generators

The idea behind using a Keras generator is to get batches of input and corresponding output on the fly during the training process, e.g. reading in 100 images, getting the corresponding 100 label vectors and then feeding this set to the GPU for a training step. The problem I faced was the memory requirement of the standard Keras generator. …
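The "batches on the fly" idea can be sketched with a plain Python generator. This is a framework-free illustration, not the post's implementation: `batch_generator`, the file names and the labels are hypothetical, and a real version would read and decode the images at the marked line instead of passing paths through.

```python
import random


def batch_generator(samples, labels, batch_size, shuffle=True, seed=0):
    """Yield (inputs, targets) batches forever, touching one batch at a time."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    while True:  # loop indefinitely; the caller decides when to stop
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch_idx = indices[start:start + batch_size]
            # A real pipeline would load the files for this batch here.
            yield [samples[i] for i in batch_idx], [labels[i] for i in batch_idx]


paths = [f"img_{i}.png" for i in range(10)]  # hypothetical file names
targets = [i % 2 for i in range(10)]         # hypothetical labels

gen = batch_generator(paths, targets, batch_size=4)
first_x, first_y = next(gen)
print(first_x, first_y)
```

Because only the indices are held in memory and each batch is materialized when requested, memory use is bounded by the batch size rather than by the size of the data set.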

Tips for Using Data to Solve Company Issues that You Can Master Today

It’s not enough to do data analysis. Stop management from ignoring your analysis. Feb 3, 2018. To a data analyst, there is nothing more exciting than the data revealing insights about real organizational issues. However, it is completely deflating to present the insights to management and accomplish nothing. From talking to other data …

On the importance of DSLs in ML and AI

4) Under the hood: expressing computations. TensorFlow could be considered a programming system and runtime, not just a “library” in the traditional sense: TensorFlow’s graph even supports constructs like variable scoping and control flow, but rather than using Python syntax, you manipulate these constructs through an API. (Innes2017) TensorFlow and similar tools present themselves as “just …

Bootstrapping microservices — your microservice architecture ready

Jan 13, 2018. The computing world has seen increasing attention on microservices software architecture in order to enhance software scalability and efficiency. Microservices bring many benefits for tech organizations. However, it is also clear that despite the benefits of modularization and containerization, many organizations continue to struggle with microservices. A microservices-based application comprises numerous …

Reinforcement Learning Series — 01 (Key Concepts)

Reinforcement Learning (RL) is one of the most active fields of Machine Learning (ML) and Artificial Intelligence (AI). Though RL has existed for many decades, only recently has the giant awakened, after the explosion in neural network based Deep Learning. This blog is an attempt to explain the basic concepts of Reinforcement Learning using simple examples and explanations that …

ToneNet : A Musical Style Transfer

Nov 27, 2017. By: Team Vesta, University of Southern California. CSCI:599 Deep Learning and Its Applications. Suraj Jayakumar, Rakesh Ramesh, Pradeep Thalasta. Introduction: The recent success of Generative Adversarial Networks (GANs) in the vision domain, in tasks such as style transfer, inspired us to experiment with these techniques in the musical domain. Music generation mainly delves …