Introduction to gradient boosting on decision trees with Catboost

Today I would like to share my experience with an open-source machine learning library based on gradient boosting on decision trees, developed by the Russian search engine company Yandex. The library is released under the Apache license and offered as a free service. ‘Cat’, by the way, is short for ‘category’, … Read more

Deeper into DCGANs

My last post about DCGANs was primarily focused on the idea of replacing fully connected layers with convolutions and implementing upsampling convolutions with Keras. This article will further explain the architectural guidelines mentioned by Radford et al. [1], as well as additional topics mentioned in the paper such as Unsupervised Feature Learning with GANs, GAN … Read more

FastAI Image Classification

Creating the model and initial training The FastAI library is designed to let you create models (FastAI calls them learners) with only a few lines of code. It provides a method called create_cnn, which can be used to create a convolutional neural network. The method needs two arguments, the data and the architecture, but it also supports many … Read more

Finding Your Flavor of Data Science Career

Three Approaches to Guide You in Choosing Your Path Does your concept of a Data Scientist look something like a fictional super hero, possessing such a broad and deep skillset that it is simply humanly impossible? And yet, does that unrealistic image make you sometimes feel like a data science imposter? (It’s really worth reading that … Read more

Building fully custom machine learning models on AWS SageMaker: a practical guide

AWS SageMaker is a cloud machine learning SDK designed for speed of iteration, and it’s one of the fastest-growing toys in the Amazon AWS ecosystem. Since launching in late 2017, SageMaker’s growth has been remarkable — at last year’s AWS re:Invent it was stated that there are now over 10,000 companies using SageMaker to standardize their machine learning processes. SageMaker … Read more

Learning from Graph data using Keras and Tensorflow

Motivation: There is a lot of data out there that can be represented in the form of a graph in real-world applications, as in citation networks, social networks (followers graph, friends network, … ), biological networks or telecommunications. Using graph-extracted features can boost the performance of predictive models by relying on information flow between neighboring nodes. … Read more

Performing multidimensional matrix operations using numpy’s broadcasting

Numpy’s broadcasting feature can be somewhat confusing for new users of this library, but it allows for very clean, elegant and FUN coding, so it is definitely worth the effort of getting used to. In this short article, I wanted to show a nice implementation of broadcasting to save some for-loops and even computation … Read more
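As a quick taste of the technique this teaser describes (an illustrative sketch, not code from the article): broadcasting can replace the nested for-loops you would otherwise write to compare every row of one array against every row of another.

```python
import numpy as np

# Hypothetical example: pairwise squared distances between two point sets,
# with no explicit loops, using broadcasting.
a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])   # shape (3, 2)
b = np.array([[0.0, 1.0], [1.0, 0.0]])               # shape (2, 2)

# a[:, None, :] has shape (3, 1, 2); b[None, :, :] has shape (1, 2, 2).
# Broadcasting stretches both to (3, 2, 2), so diff[i, j] = a[i] - b[j].
diff = a[:, None, :] - b[None, :, :]
sq_dist = (diff ** 2).sum(axis=-1)                   # shape (3, 2)

print(sq_dist)
```

The two `None` insertions are what let NumPy line the arrays up; the same pattern scales to any "all pairs" computation.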

Machine Learning: Regularization and Over-fitting Simply Explained

http://www.discovery.com/tv-shows/mythbusters/about-this-show/physics-of-seesaws/ I am going to give an intuitive understanding of the Regularization method in as simple words as possible. Firstly, I will discuss some basic ideas, so if you think you are already familiar with those, feel free to move ahead. A Linear Model A linear model is one that follows a straight line in the prediction … Read more

Limitations of Deep Learning in AI Research

Related articles: References: [1] Deep Learning Review | Yann LeCun, Yoshua Bengio, Geoffrey Hinton | http://pages.cs.wisc.edu/~dyer/cs540/handouts/deep-learning-nature2015.pdf [2] 30 Amazing Applications of Deep Learning | Yaron Hadad | http://www.yaronhadad.com/deep-learning-most-amazing-applications/ [3] Introduction to Deep Learning | Bhiksha Raj | Carnegie Mellon University | http://deeplearning.cs.cmu.edu/ [4] Understanding LSTM Networks | Christopher Olah | http://colah.github.io/posts/2015-08-Understanding-LSTMs/ [5] Memory Augmented Neural-Networks | … Read more

Winning Blackjack using Machine Learning

Genetic Algorithm Configurations One of the unusual aspects to working with a GA is that it has so many settings that need to be configured. The following items can be configured for a run: Population Size Selection Method Mutation Rate and Impact Termination Conditions Varying each of these gives different results. The best way to … Read more
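The settings the teaser lists can be made concrete with a toy GA. This is an illustrative sketch on the classic "OneMax" problem (maximize the number of 1-bits), not the article's blackjack code; the parameter values are arbitrary choices.

```python
import random

random.seed(0)

POPULATION_SIZE = 30      # Population Size
TOURNAMENT_SIZE = 3       # Selection Method: tournament selection
MUTATION_RATE = 0.02      # Mutation Rate and Impact: per-bit flip probability
MAX_GENERATIONS = 200     # Termination Conditions: generation cap
GENOME_LENGTH = 20

def fitness(genome):
    return sum(genome)    # count of 1-bits; the maximum is GENOME_LENGTH

def tournament(population):
    # Pick a few genomes at random and keep the fittest.
    return max(random.sample(population, TOURNAMENT_SIZE), key=fitness)

def mutate(genome):
    return [1 - g if random.random() < MUTATION_RATE else g for g in genome]

def crossover(a, b):
    point = random.randrange(1, GENOME_LENGTH)
    return a[:point] + b[point:]

population = [[random.randint(0, 1) for _ in range(GENOME_LENGTH)]
              for _ in range(POPULATION_SIZE)]
best = max(population, key=fitness)

for generation in range(MAX_GENERATIONS):
    if fitness(best) == GENOME_LENGTH:   # second termination condition
        break
    population = [mutate(crossover(tournament(population), tournament(population)))
                  for _ in range(POPULATION_SIZE)]
    best = max(population + [best], key=fitness)  # keep the all-time best

print(fitness(best))
```

Varying any one of the four settings above changes how quickly (and whether) the run converges, which is exactly the tuning problem the article describes.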

Predicting presence of Heart Diseases using Machine Learning

Machine Learning is used across many spheres around the world. The healthcare industry is no exception. Machine Learning can play an essential role in predicting the presence or absence of locomotor disorders, heart diseases and more. Such information, if predicted well in advance, can provide important insights to doctors, who can then adapt their diagnosis and treatment per … Read more

KubernetesExecutor for Airflow

Scale Airflow natively on Kubernetes In the 1.10 release, Airflow introduced a new executor to run workers at scale: the Kubernetes executor. In this article we’ll look into: what Airflow is and which problems it solves; the Kubernetes executor and how it compares to the Celery executor; an example deployment on minikube. TL;DR Airflow has … Read more

What makes an active timebank?

In this article, I explain my process for collecting data, feature engineering, and using linear regression to identify predictors of active timebanks. But first… What’s a timebank? Without fail, when I say I am working on a project about timebanks, the first question someone asks is, “What’s a timebank?” As explained on the TimeBanks USA … Read more

Around the world in 90.414 kilometers

This article compares several search algorithms applied to a Traveling Salesman Problem of 85 cities. The goal is to show the intuition behind some well-known and effective search algorithms to people new to the subject of optimization. I chose to build less complex algorithms and attempted to describe them as understandably as possible. If you … Read more

Deep Compression: Optimization Techniques for Inference & Efficiency

As technology cozies up to the physical limits of Moore’s law, computation is increasingly limited by heat dissipation rather than the number of transistors that can be packed onto a given area of silicon. Modern chips already routinely idle whole sections of their area, forming what’s known as “dark silicon,” referring to design constraining … Read more

Predict malignancy in breast cancer tumors with your own neural network and the Wisconsin Dataset

In the final part of this series, we predict malignancy in breast cancer tumors using the network we coded from scratch. In part 1 of this series, we understood in depth the architecture of our neural network. In part 2, we built it using Python. We also understood in depth back-propagation and the gradient descent optimization … Read more

Quantum rush — New light on pushing AI & Machine learning boundaries

Quantum computers are the successors to the classical digital computer, adding an extra dimension that existing computers lack for solving optimization problems and doing things in parallel. Quantum computers work on the principles of quantum mechanics, the body of scientific laws that describe the motion and interaction of photons, electrons, … Read more

Towards Fast Neural Style Transfer

The seminal paper of Neural Style Transfer presented by Gatys et al. [1] demonstrates a remarkable characteristic of Deep Convolutional Neural Networks. The sequential representations learned from layers of parametric convolutions can be separated into ‘content’ and ‘style’. The fundamental idea behind Style Transfer is that pre-trained DCNNs on tasks such as ImageNet classification can … Read more

Embeddings-free Deep Learning NLP model

Word embeddings (e.g. word2vec, GloVe) were introduced several years ago and have changed NLP tasks fundamentally. With embeddings, we do not need one-hot encoding, which causes very high-dimensional features in most NLP tasks. We can use 300 dimensions to represent over 1 million words. Different kinds of embeddings such as character embeddings, sentence embeddings … Read more

How to make GDPR and ONA work together?

Searching online for ONA (Organizational Network Analysis) gets you various definitions, including those with curly math symbols and graph theory. However, in a nutshell, ONA is about who communicates with whom in an organization. Although ONA is regarded as one of the latest buzzwords, it can be traced back at least to the 1980s. In … Read more

How to “farm” Kaggle in the right way

This article describes advice and approaches on how to use Kaggle as a competition platform to improve practitioner skills in Data Science with maximum efficiency and profitability. farm (farming) — a gaming tactic where a player performs repetitive actions to gain experience, points or some form of in-game currency. Description These methods helped me with getting … Read more

Where should you live in San Francisco?

Moving to San Francisco would be awesome except for the cost of rent. I use Zillow data in this project to find cheap/high-value rental opportunities in San Francisco. The following analysis was scripted in R, and analysis is neighborhood-based. Credits to Ken Steif and Keith Hassel for their great tutorial on exploring home prices in … Read more

Cultural overfitting and underfitting. Or why the “Netflix Culture” won’t work in your company.

A couple of weeks back, I gave a talk in the SFELC (San Francisco Engineering Leadership Conference). As I was preparing the slides for the talk I reflected on something pretty interesting: I have been managing technical teams for over 25 years. I have also been giving public talks for about the same time. However, … Read more

Holy Grail for Bias-Variance tradeoff, Overfitting & Underfitting

There are two more important terms related to bias and variance that we must understand now- Overfitting and Underfitting. I am again going to use a real life analogy here. I have referred to the blog of Machine learning@Berkeley for this example. There is a very delicate balancing act when machine learning algorithms try to … Read more

An Introduction to Time Series Forecasting with Simple Neural Networks & LSTM

How to develop Artificial Neural Networks and LSTM recurrent neural networks for time series prediction in Python with the Keras deep learning library The purpose of this article is to explain Artificial Neural Networks (ANN) and Long Short-Term Memory Recurrent Neural Networks (LSTM RNN) and enable you to use them in real life and build the … Read more
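One step every such tutorial needs is framing the series as supervised (window → next value) pairs before any ANN or LSTM can train on it. A minimal sketch of that framing, with the function name and window size as assumptions for illustration:

```python
import numpy as np

def make_windows(series, window=3):
    """Turn a 1-D series into (window, next-value) training pairs."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])   # the last `window` observations
        y.append(series[i + window])     # the value to predict
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)      # stand-in for a real time series
X, y = make_windows(series, window=3)
print(X.shape, y.shape)                  # → (7, 3) (7,)
```

For an LSTM the same `X` would typically be reshaped to `(samples, timesteps, features)`, i.e. `X[..., None]` here.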

Why isn’t out-of-time validation more ubiquitous?

Train, validate and test partitions for out-of-time performance take planning and thought. The purpose of supervised machine learning is to classify unlabeled data. We want algorithms to tell us whether a borrower will default, whether a customer will make a purchase, or whether an image contains a cat, a dog, a malignant tumor or a benign polyp. The algorithms “learn” how to … Read more
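The core idea is simple to sketch: instead of a random shuffle, cut the data at dates so that validation and test sets come strictly after training. The cut-off dates below are assumptions for illustration, not from the article.

```python
from datetime import date

# Toy records, one per month of 2018, with a dummy label.
records = [{"obs_date": date(2018, m, 1), "label": m % 2} for m in range(1, 13)]

# Out-of-time split: train on the past, validate/test on the future.
train = [r for r in records if r["obs_date"] < date(2018, 9, 1)]
valid = [r for r in records if date(2018, 9, 1) <= r["obs_date"] < date(2018, 11, 1)]
test  = [r for r in records if r["obs_date"] >= date(2018, 11, 1)]

print(len(train), len(valid), len(test))  # → 8 2 2
```

This mirrors deployment: the model is always scored on data from a later period than it was fit on.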

XLM — Enhancing BERT for Cross-lingual Language Model

Cross-lingual Language Model Pretraining Attention models, and BERT in particular, have achieved promising results in Natural Language Processing, in both classification and translation tasks. A new paper by Facebook AI, named XLM, presents an improved version of BERT to achieve state-of-the-art results in both types of tasks. XLM uses a known pre-processing technique (BPE) and … Read more

Explaining Feature Importance by example of a Random Forest

Source: https://unsplash.com/photos/BPbIWva9Bgo In many (business) cases it is equally important to not only have an accurate, but also an interpretable model. Oftentimes, apart from wanting to know what our model’s house price prediction is, we also wonder why it is this high/low and which features are most important in determining the forecast. Another example might … Read more

Train neural networks using AMD GPUs and Keras

Train a neural network with Keras In the last section of this tutorial, we will train a simple neural network on the MNIST dataset. We will first build a fully connected neural network. Fully connected neural network Let’s create a new notebook by selecting Python3 from the upper-right menu in the Jupyter root directory. The upper-right menu in … Read more

Tikhonov regularization. An example other than L2.

Example: Exploiting the underlying structure of spectrometer data Consider a machine learning model that uses spectrometer data as its input features. Example of spectrometer readings of skin A scientist starts the process of collecting data, and after a while she has, say, 10 data points with spectrometer readings from 400 to 700 nm with a spacing of … Read more

A Comprehensive Introduction to Different Types of Convolutions in Deep Learning

Towards intuitive understanding of convolutions through visualizations If you’ve heard of different kinds of convolutions in Deep Learning (e.g. 2D / 3D / 1×1 / Transposed / Dilated (Atrous) / Spatially Separable / Depthwise Separable / Flattened / Grouped / Shuffled Grouped Convolution), and got confused about what they actually mean, this article is written for … Read more
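A quick bit of arithmetic shows why one of the listed variants, the depthwise separable convolution, matters in practice. The layer sizes below are assumed values for illustration, and biases are ignored:

```python
# Parameter counts: standard convolution vs. depthwise separable convolution.
k, c_in, c_out = 3, 128, 256          # kernel size, input/output channels

standard = k * k * c_in * c_out       # one k x k filter per (in, out) pair
depthwise = k * k * c_in              # one k x k filter per input channel
pointwise = 1 * 1 * c_in * c_out      # 1 x 1 convolution to mix channels
separable = depthwise + pointwise

print(standard, separable, round(standard / separable, 1))  # → 294912 33920 8.7
```

Splitting spatial filtering from channel mixing cuts the parameter (and multiply) count by roughly 9x here, which is the usual motivation for the depthwise separable design.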

Predicting Friendship

Who’s Most Likely Your Next Friend? Source: https://gph.is/16bpHh9 Introduction Friendship is extremely important. Having friends as a child can increase your chances of happiness as an adult (Holder and Coleman, 2009). At the same time, we aren’t taught a lot about friendship. Friendships form due to circumstance and chance and develop intuitively. But do they really? … Read more

Inference using EM algorithm

Learn in-depth about the magic of EM algorithm and start training your own graphical models What happens when your inference algorithm is not good? [Source: https://bit.ly/2RSKuoI] Introduction The goal of this post is to explain a powerful algorithm in statistical analysis: the Expectation-Maximization (EM) algorithm. It is powerful in the sense that it has the … Read more
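To make the E-step/M-step loop concrete, here is a toy sketch (not from the article): fitting the means of a two-component 1-D Gaussian mixture, assuming equal weights and unit variances to keep the updates short.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic data: two well-separated Gaussian clusters.
data = np.concatenate([rng.normal(-4, 1, 200), rng.normal(4, 1, 200)])

mu = np.array([-1.0, 1.0])            # rough initial guesses for the means
for _ in range(50):
    # E-step: responsibility of each component for each point
    # (unnormalized Gaussian densities with unit variance).
    dens = np.exp(-0.5 * (data[:, None] - mu[None, :]) ** 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: each mean becomes a responsibility-weighted average.
    mu = (resp * data[:, None]).sum(axis=0) / resp.sum(axis=0)

print(mu)   # close to the true means (-4, 4)
```

The full EM algorithm also updates mixture weights and variances in the M-step; they are held fixed here purely for brevity.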

Test Time Augmentation (TTA) and how to perform it with Keras

What is Test Time Augmentation? Similar to what Data Augmentation does to the training set, the purpose of Test Time Augmentation is to perform random modifications to the test images. Thus, instead of showing the regular, “clean” images only once to the trained model, we will show it the augmented images several times. We … Read more
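The mechanics fit in a few lines. This is a framework-free sketch with a stand-in model and a single flip augmentation, both assumptions for illustration (the article itself uses Keras):

```python
import numpy as np

rng = np.random.RandomState(0)

def fake_model(image):
    # Stand-in for a trained classifier: toy 2-class scores from pixel stats.
    return np.array([image.mean(), 1 - image.mean()])

def augment(image):
    # One possible test-time modification: a random horizontal flip.
    return np.fliplr(image) if rng.rand() < 0.5 else image

image = rng.rand(4, 4)                       # stand-in test image
tta_runs = [fake_model(augment(image)) for _ in range(8)]
tta_prediction = np.mean(tta_runs, axis=0)   # average over augmented copies

print(tta_prediction)
```

Averaging the per-copy predictions is the usual aggregation; some practitioners use a geometric mean or majority vote instead.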

Deceiving Your Mind With Your Eyes

A few months ago, I was asked to assist in keeping an eye on a specific group of people on Twitter because there was a perception they might present a danger to others. I have a trigger point for people who do bad things so I figured I would do what I do best and … Read more

6 steps to understanding a heap with Python

Today I will explain the heap, which is one of the basic data structures. Famous search algorithms like Dijkstra’s algorithm or A* also use the heap. A* can appear with the Hidden Markov Model (HMM), which is often applied to time-series pattern recognition. Please note that this post isn’t about search algorithms. I’ll explain … Read more
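For readers who want to try a heap before diving in, Python ships one in the standard library. A quick illustration with `heapq`, which maintains a binary min-heap inside a plain list (and is what a Dijkstra priority queue is typically built on):

```python
import heapq

nums = [5, 1, 8, 3, 2]
heapq.heapify(nums)              # rearrange the list into heap order, O(n)
heapq.heappush(nums, 0)          # insert a new element, O(log n)
smallest = heapq.heappop(nums)   # the root is always the minimum, O(log n)

print(smallest)                                          # → 0
print([heapq.heappop(nums) for _ in range(len(nums))])   # → [1, 2, 3, 5, 8]
```

Repeatedly popping yields the elements in sorted order, which is also the idea behind heapsort.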

Hands-On Big Data Streaming, Apache Spark at scale

Introduction In a world where data is being generated at an extremely fast rate, the correct analysis of the data and providing useful and meaningful results at the right time can provide helpful solutions for many domains dealing with data products. This can be applied in Health Care and Finance to Media, Retail, Travel Services … Read more

The magic behind the perceptron network

Preface note This story is part of a series I am creating about neural networks. This chapter is dedicated to a very simple type of neural network whose creation is attributed to Frank Rosenblatt after his research during the 50s and 60s. I am going to talk about the perceptron. The perceptron neuron The perceptron … Read more
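Rosenblatt's neuron is simple enough to fit in a teaser-sized sketch. This illustrative example (not the article's code) trains a single perceptron on the AND function with the classic update rule; the learning rate and epoch count are arbitrary choices:

```python
def predict(weights, bias, x):
    # Fire (output 1) if the weighted sum plus bias crosses the threshold.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

for _ in range(20):                       # epochs
    for x, target in data:
        error = target - predict(weights, bias, x)
        # Classic perceptron rule: nudge weights toward the missed target.
        weights = [w + lr * error * xi for w, xi in zip(weights, x)]
        bias += lr * error

print([predict(weights, bias, x) for x, _ in data])  # → [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop settles on a correct set of weights.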

Pump up the Volumes: Data in Docker

Spices Pushing the food metaphor running through these articles to the breaking point, let’s compare data in Docker to spices. Just as there are many spices in the world, there are many ways to save data with Docker. Quick FYI: this guide is current for Docker Engine Version 18.09.1 and API version 1.39. Data in … Read more

Predicting bankruptcy

Vikram Devatha & Devashish Dhiman The economic meltdown of 2008 initiated a conversation about market sustainability, and the tools that can be used to predict it. The need for better predictive models became apparent, in order to avoid such devastating events in the future. Bankruptcy of companies and enterprises affects the financial market at multiple … Read more

Training Regression Models

Training a Regression Model — deciding on a loss function as an evaluation metric for regression models Assume that we have a training data set of employee happiness index and employee productivity, and plotting them yields the following graph. The graph shows that the underlying pattern of the training data is a linear relationship between the two variables. Hence, training … Read more
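Two common candidates for that loss function can be compared on a toy version of the happiness/productivity data. All numbers below are invented for illustration:

```python
# Toy data (assumed): happiness index vs. productivity.
happiness    = [1.0, 2.0, 3.0, 4.0]
productivity = [2.1, 3.9, 6.2, 7.8]

w, b = 2.0, 0.0                                  # a candidate line y = w*x + b
preds = [w * x + b for x in happiness]
errors = [p - y for p, y in zip(preds, productivity)]

mse = sum(e * e for e in errors) / len(errors)   # mean squared error
mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error
print(round(mse, 3), round(mae, 3))
```

MSE penalizes large misses quadratically, so it is more sensitive to outliers than MAE; which behavior you want is exactly the "deciding on a loss function" question.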

Creating a Smart Scale with Tensorflow

Xiaowen | Feb 10 Looking to get your feet wet with machine learning and build something useful at the same time? Here’s a fun weekend project to use TensorFlow to automatically read your weight from pictures of your scale and chart it over time. You’ll learn the basics of the TensorFlow Object Detection API and be … Read more

Modern Parallel and Distributed Python: A Quick Tutorial on Ray

Starting Ray The ray.init() command starts all of the relevant Ray processes. On a cluster, this is the only line that needs to change (we need to pass in the cluster address). These processes include the following: A number of worker processes for executing Python functions in parallel (roughly one worker per CPU core). A … Read more

Linear Regression

Likelihood Function To apply maximum likelihood, we first need to derive the likelihood function. First, let’s rewrite our model from above as a single conditional distribution given x: Given x, y is drawn from a Gaussian centered on our line. This is equivalent to pushing our x through the equation of the line and then adding … Read more
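The punchline of that derivation is that, with Gaussian noise, maximizing the likelihood is equivalent to minimizing squared error, so the MLE line is the least-squares line. A numerical sketch (synthetic data, values assumed):

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic data from a known line, plus Gaussian noise.
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=x.shape)  # true slope 2, intercept 1

# The MLE under Gaussian noise solves the least-squares problem,
# here via the normal equations for the design matrix [x, 1].
X = np.column_stack([x, np.ones_like(x)])
slope, intercept = np.linalg.lstsq(X, y, rcond=None)[0]

print(round(slope, 2), round(intercept, 2))   # close to the true (2, 1)
```

The noise variance drops out of the argmax, which is why it does not appear in the fitted coefficients at all.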

Finding Familiar Faces with a Tensorflow Object Detector, Pytorch Feature Extractor, and Spotify’s…

In a few different posts I have put together facial recognition pipelines or something similar, using object detectors and something like a Siamese network for feature extraction to find similar images, but I haven’t really dug into how to use those feature vectors in a more practical, larger-scale way. What I mean here is … Read more

Transform Grayscale Images to RGB Using Python’s Matplotlib

Learn about image data structures while adding two dimensions for computer vision & deep learning pipelines R, G, & B — Arabic numeral ‘3’ Data pre-processing is critical for computer vision applications, and properly converting grayscale images to the RGB format expected by current deep learning frameworks is an essential technique. What does that mean? Understanding Color Image … Read more
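The core conversion is a one-liner once you see the array shapes. A minimal NumPy sketch (the article does this with Matplotlib; the array sizes here are assumptions):

```python
import numpy as np

# Stand-in grayscale image: a 2-D (H, W) array of intensities.
gray = np.random.rand(28, 28).astype(np.float32)

# RGB layout expected by most deep learning frameworks: (H, W, 3),
# built by repeating the single channel three times.
rgb = np.stack([gray, gray, gray], axis=-1)
# Equivalent: np.repeat(gray[..., None], 3, axis=-1)

print(gray.shape, rgb.shape)   # → (28, 28) (28, 28, 3)
```

Because all three channels are identical, the image still looks gray; the point is purely to match the `(H, W, 3)` shape a pretrained network's input layer demands.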