Presidential Charisma: Who Should You Vote for?

Be it a presidential candidate or TV show host, people with charisma get their message across, communicate better with others and gain their trust. But for a president, does charisma give away anything about how good of a president they’ll likely be? When you decide to vote for a candidate, do you vote for the … Read more

Reflections on the State of AI: 2018

Today, with hundreds of companies deeply engaged in the AI space, and even more working to figure out their strategy as related to the field, it might be a bit hard to pinpoint specific players that are best positioned to lead the way in the future. Still, if we look at any of the many … Read more

The Other Type of Machine Learning

A brief introduction to Reinforcement Learning This is the first in a series of articles on reinforcement learning and OpenAI Gym. Introduction Suppose you’re playing a video game. You enter a room with two doors. Behind Door 1 are 100 gold coins, followed by a passageway. Behind Door 2 is 1 gold coin, followed by … Read more

The power of Brain-Computer interface: use your brain to play your video game

Here we show one application of machine learning and signal processing in Neuroscience; translating thoughts into actions with our Game-Based Brain-Computer interface (BCI). Video-Game based BCI. The gaming part increases user engagement and makes it easier to acquire the new skill of controlling the BCI device. BCI enables direct control of brain activity over external … Read more

What A.I. Isn’t

(¹) A lazy leap into the future It isn’t intuitive, creative, inspired, generalized, or conscious. Will it ever be like us? Will it ever think like us? As I study data science I learn a little more about artificial intelligence each day. I practice wielding the tools in my machine learning tool box, and I read … Read more

PySpark in Google Colab

Creating a simple linear regression model with PySpark in Colab Photo by Ashim D’Silva on Unsplash With broadening sources of the data pool, the topic of Big Data has received an increasing amount of attention in the past few years. Besides dealing with the gigantic data of all kinds and shapes, the target turnaround time of the … Read more

Why do Data Visualizations Fail?

Easy … Charts masquerading as useful insights Photo by Braydon Anderson on Unsplash When a data visualization fails, there can be multiple reasons. The most common reason is that the author didn’t understand the message. Thus the meaning in the data is unclear or even hidden. The author didn’t consider what question the audience was asking. Consider the … Read more

Interpreting Data through Visualization with Python Matplotlib

What did the IBM Data Visualization Course Teach Me? Matplotlib, even though it is aging, still remains one of the most vital tools for data visualization, and this post is about using matplotlib effectively to gain knowledge from a data set. The IBM data science professional certificate program, which I started taking around a month … Read more

Breaking neural networks with adversarial attacks

Are the machine learning models we use intrinsically flawed? As many of you may know, Deep Neural Networks are highly expressive machine learning networks that have been around for many decades. In 2012, with gains in computing power and improved tooling, a family of these machine learning models called ConvNets started achieving state of the … Read more

Time Series in Python — Exponential Smoothing and ARIMA processes

ARIMA ARIMA models (which include ARMA, AR and MA models) are a general class of models to forecast stationary time series. ARIMA models are made of three parts: A weighted sum of lagged values of the series (Auto-regressive (AR) part) A weighted sum of lagged forecasted errors of the series (Moving-average (MA) part) A difference … Read more
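
The parts listed above can be sketched in a few lines of plain Python. This is only an illustration: the function names, weights, and sample numbers are assumptions and do not come from the article, and the third part (cut off in the excerpt) is assumed here to be differencing, the "I" in ARIMA.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing (assumed to be the 'I' part)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def arma_one_step(values, errors, ar_weights, ma_weights, const=0.0):
    """One-step forecast from lagged values (AR) and lagged errors (MA).

    ar_weights[0] applies to the most recent value, ar_weights[1] to the
    one before it, and so on; ma_weights works the same way on errors.
    """
    ar_part = sum(w * v for w, v in zip(ar_weights, reversed(values)))
    ma_part = sum(w * e for w, e in zip(ma_weights, reversed(errors)))
    return const + ar_part + ma_part


# Hypothetical data: a trending series made closer to stationary by differencing.
raw = [1, 3, 6, 10]
stationary = difference(raw)  # [2, 3, 4]
forecast = arma_one_step(stationary, errors=[0.5],
                         ar_weights=[0.5, 0.25], ma_weights=[0.2])
print(stationary, forecast)
```

In practice the weights are estimated from data rather than chosen by hand; they are fixed here only to keep the arithmetic visible.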

Security & Privacy in Artificial Intelligence & Machine Learning — Part-6: Up close with Privacy

Note: This is part-6 of a series of articles on ‘Security and Privacy in Artificial Intelligence & Machine Learning’. Here are the links to all articles (so far): Photo by Jason Blackeye on Unsplash In the previous article of the series, we looked at the nature and extent of damage that attackers can inflict if they, … Read more

Hyperparameters in Deep Learning

1. Optimizer Hyperparameters: these relate more to the optimization and training process. 1.1 Learning rate: “The single most important hyperparameter and one should always make sure that has been tuned” (Yoshua Bengio). Good starting point = 0.01. If our learning rate is smaller than the optimal value, then it would take a much longer time (hundreds … Read more
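
To make the point about small learning rates concrete, here is a toy sketch that counts gradient-descent steps on f(x) = x². The objective, starting point, tolerance, and function name are all assumptions chosen for illustration, not anything from the article.

```python
def steps_to_converge(lr, start=10.0, tol=1e-3, max_steps=100_000):
    """Count gradient-descent steps on f(x) = x**2 until |x| < tol."""
    x = start
    for step in range(1, max_steps + 1):
        x -= lr * 2 * x  # the gradient of x**2 is 2x
        if abs(x) < tol:
            return step
    return max_steps  # gave up: did not converge within the step budget


for lr in (0.1, 0.01, 0.0001):
    print(f"lr={lr}: {steps_to_converge(lr)} steps")
```

On this toy problem, each tenfold reduction in the learning rate costs roughly ten times as many steps, which is the slowdown the excerpt describes.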

Why The World Needs Trustworthy Chatbots

Trust is such a human trait; to advance we need to learn how to trust bots Photo by Jehyun Sung on Unsplash The notion of trust underpins so much of society, whether we realise it or not. In modern times, trust is driving the success of new decentralised business models. Trust expert, Rachel Botsman, describes how businesses … Read more

Succeeding as a data scientist in small companies/startups

It’s nothing like at a big mature company. This’ll probably be an unbounded series of posts that spawned from this question that came across the awesome community that is the data-nerd twitter cluster: Some Background I’ve spent almost 12 years now at companies sized between 15–150 wearing various hats of “data analyst, engineer, and occasionally, … Read more

Can you Solve TED’s Frog Riddle? Can TED?

Using Bayes’ Rule to Solve a Controversial Problem As part of its riddle series TED-Ed, the youth and education initiative of TED, released a video called “Can you solve the frog riddle?” The video presents a riddle about conditional probability and solves it in a simple way. But, is the solution correct? Critics argue that … Read more

Set Theory — Functions

Today we’re going to expand on functions within the world of set theory. Similar to previous concepts introduced, the nomenclature for standard functions within sets is slightly different than other branches of math, & therefore requires reviewing. There are quite a few terms to introduce, so let’s jump right in! This first table of function … Read more

Word Level English to Marathi Neural Machine Translation using Seq2Seq Encoder-Decoder LSTM Model

A Guide to build Sequence to sequence models using LSTM Table of Contents Introduction Prerequisites Encoder — Decoder Architecture Encoder LSTM Decoder LSTM — Training mode Decoder LSTM — Inference mode Code Walk through Results and Evaluation Future Work End Notes References 1. Introduction Recurrent Neural Networks (or more precisely LSTM/GRU) have been found to be very effective in solving complex sequence related … Read more

Natural Language Processing Using Stanford’s CoreNLP

Introduction Analyzing text data using Stanford’s CoreNLP makes text data analysis easy and efficient. With just a few lines of code, CoreNLP allows for the extraction of all kinds of text properties, such as named-entity recognition or part-of-speech tagging. CoreNLP is written in Java and requires Java to be installed on your device but offers … Read more

Predictive Modeling: Picking the best model

Testing out different types of models on the same data Whether you are working on predicting data in an office setting or just competing in a Kaggle competition, it’s important to test out different models to find the best fit for the data you are working with. I recently had the opportunity to compete with some … Read more

Policy Based Reinforcement Learning, the Easy Way

Step by step approach to understanding Policy Based methods in Reinforcement Learning Photo by Jomar on Unsplash Introduction Suppose you are in a new town and you have no map nor GPS, and you need to reach downtown. You can try to assess your current position relative to your destination, as well as the effectiveness (value) of each … Read more

Deep Learning & Handwritten Arabic Digits

Using the library to classify the AHCD at 99% accuracy! photo: Morocco, 2000 The ‘hello world’ of deep learning is often the MNIST handwritten number dataset, and I wanted to apply the same techniques to a more interesting application: the Arabic Handwritten Characters Dataset (AHCD), a dataset developed by the American University in Cairo.¹ In … Read more

K-Means Clustering

Data set and Code As I mentioned before, we are going to be using text data and in particular, we will be taking a look at the Enron email data set which is available on Kaggle. For those of you that don’t know the story/scandal surrounding Enron, I would suggest checking out the smartest guys in … Read more

AnzoGraph: A W3C Standards-Based Graph Database

Introduction In this interview, I’m catching up with Barry Zane, Vice President at Cambridge Semantics. Barry is the creator of AnzoGraph™, a native, massively parallel processing (MPP) distributed graph database. Barry has had quite a journey in the database world. He served as Vice President of Technology of Netezza Corporation from 2000 to 2005, and was responsible … Read more

Superhuman “cell-sight” with Deep Learning

Using “in silico labeling” to predict fluorescent labels in unlabeled images and cell morphology, components, and structures. An analysis of the paper In Silico Labeling: Predicting Fluorescent Labels in Unlabeled Images published in Cell. Fluorescently tagged neuronal cell culture. Source Take a look at this image, and tell me what you see. Figure 1. Source: Finkbeiner … Read more

Artificial Intelligence and Business Value

Digital technologies are pervasive. Nearly 5 billion people in the world now have a mobile phone connection and more than 7 billion mobile phones are in use (some people have more than one phone). Approximately 2.5 billion of the phones are smartphones. Cell phone penetration is now approaching that of electricity — about 88% of the world’s … Read more

NLP Learning Series: Part 2 — Conventional Methods for Text Classification

NLP Learning Series (Part 2) Teaching Machines to Learn Text This is the second post of the NLP Text classification series. To give you a recap, recently I started up with an NLP text classification competition on Kaggle called Quora Question insincerity challenge. And I thought to share the knowledge via a series of blog posts on … Read more

Review: YOLOv3 — You Only Look Once (Object Detection)

Improved YOLOv2, Comparable Performance with RetinaNet, 3.8× Faster! YOLOv3 In this story, YOLOv3 (You Only Look Once v3), by University of Washington, is reviewed. YOLO is a very famous object detector. I think everybody must know it. Below is the demo by the authors: YOLOv3 As the author was busy on Twitter and GAN, and also helped out … Read more

Data Science with Optimus. Part 1: Intro.

Breaking down data science with Python, Spark and Optimus. Don’t worry if you don’t know what these logos are, I’ll explain them in the next articles 🙂 Data science has reached new levels of complexity and of course awesomeness. I’ve been doing this for years now, and what I want for people is to have a clear and … Read more

Web scraping with Python — A to Z

Handling BeautifulSoup, avoiding blocks, enriching with API, storing in a DB and visualizing the data Photo by michael podger on Unsplash Introduction What is web scraping and when would you want to use it? The act of going through web pages and extracting selected text or images. An excellent tool for getting new data or enriching your … Read more

Naive Bayes: Intuition and Implementation

Introduction: What Are Naive Bayes Models? In a broad sense, Naive Bayes models are a special kind of classification machine learning algorithm. They are based on a statistical classification technique called ‘Bayes Theorem’. Naive Bayes models are called ‘naive’ algorithms because they make an assumption that the predictor variables are independent of each other. In other … Read more
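
The independence assumption means the score for each class is just its prior multiplied by one likelihood per feature. The sketch below shows that calculation; the function name, class labels, and probabilities are made up for illustration and are not from the article.

```python
from math import prod


def naive_bayes_scores(priors, likelihoods, features):
    """Posterior per class: prior times the product of per-feature likelihoods.

    Multiplying the likelihoods is only valid because the features are
    assumed independent given the class (the 'naive' assumption).
    """
    scores = {
        label: prior * prod(likelihoods[label][f] for f in features)
        for label, prior in priors.items()
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Hypothetical probabilities for a two-class text example.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.5, "meeting": 0.1},
    "ham": {"offer": 0.1, "meeting": 0.4},
}
print(naive_bayes_scores(priors, likelihoods, ["offer"]))
print(naive_bayes_scores(priors, likelihoods, ["offer", "meeting"]))
```

With these made-up numbers, "offer" alone favors spam, while adding "meeting" tips the result toward ham.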

Clustered Globe

Setting Constraints & Variables First, we’re gonna set the boundaries of what detail we are going to cluster. At this stage I want to keep countries separated and only cluster activities within a single country. Therefore, by the nature of clustering, small countries will probably become a single cluster. And although there could be cross-border … Read more

Using NLP to build a search & discovery app for Regulators

Regulations need to be updated constantly in this era of rapid socio-economic and technological change. Regulators spend a substantial amount of time assessing the current stock of Acts to identify inconsistent use of language or markers that don’t support innovation and create a burden for businesses. Given the large number of Acts and their complex … Read more

Hybrid Humans and Conscious Robots

Musings on the intersection of Artificial Intelligence, Consciousness, and Reinforcement Learning At what level are you conscious? Staring into the eyes of a comatose loved one, many of us have agonized over whether the patient was conscious of caresses received or whispered prayers. Increasingly we will have answers to such questions, thanks in a large … Read more

People Tracking using Deep Learning

Doing cool things with data! Introduction Object Tracking is an important domain in computer vision. It involves the process of tracking an object which could be a person, ball or a car across a series of frames. For people tracking we would start with all possible detections in a frame and give them an ID. In … Read more

Supervised Machine Learning: Model Validation, a Step by Step Approach

Model validation is the process of evaluating a trained model on a test data set. This measures the generalization ability of a trained model. Here I provide a step by step approach to complete a first iteration of model validation in minutes. The basic steps for applying a supervised machine learning model are: Choose a class of model … Read more
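
As a rough illustration of evaluating a trained model on held-out data, here is a standard-library sketch. The helper names, the toy threshold "model", and the synthetic data are all assumptions for demonstration; the article's own steps are cut off in the excerpt and may differ.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def fit_threshold(rows):
    """Toy model: split at the midpoint between the two class means."""
    zeros = [x for x, y in rows if y == 0]
    ones = [x for x, y in rows if y == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x >= threshold)


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Synthetic data: the label is 1 when x is at least 50.
data = [(x, int(x >= 50)) for x in range(100)]
train_rows, test_rows = train_test_split(data)
model = fit_threshold(train_rows)  # fit on training rows only
print(f"held-out accuracy: {accuracy(model, test_rows):.2f}")
```

The key point is that the model never sees `test_rows` during fitting, so the reported accuracy estimates how it would do on new data.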

BigQuery without a credit card: Discover, learn and share

If you ever had trouble signing up for BigQuery, worry no more — now it’s easier than ever to sign up and start querying. The new sandbox mode even includes free storage, no credit card required. See the official blog post “Query without a credit card: introducing BigQuery sandbox” for more details. Here we are going to … Read more

Learn Enough Python to be Useful Part 2

How to Use if __name__ == “__main__” This article is one in a series to help you become comfortable in Python scripting land. It’s for data scientists and anyone new to Python programming. if __name__ == “__main__”: is one of those things you see in Python scripts that often isn’t explained. You might have … Read more
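
A minimal sketch of the idiom, assuming a hypothetical `main` function: Python sets `__name__` to `"__main__"` only when the file is run directly, so the guarded block is skipped when the file is imported as a module.

```python
def main():
    """Hypothetical entry point; returns a message so it is easy to test."""
    return "running as a script"


if __name__ == "__main__":
    # Executed for `python this_file.py`, but not for `import this_file`.
    print(main())
```

This lets the same file serve both as a runnable script and as an importable module whose functions can be reused without side effects.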

Intuitive Deep Learning Part 1a: Introduction to Neural Networks

As mentioned above, Deep Learning is simply a subset of the architectures (or templates) that employs “neural networks” which we can specify during Step 1. “Neural networks” (more specifically, artificial neural networks) are loosely based on how our human brain works, and the basic unit of a neural network is a neuron. At the basic … Read more

Fashion Science takes on Seasonal Color Analysis

Turns out ‘seasons’ just aren’t found in the data Wear the right color clothes and be more attractive! That’s the allure of seasonal color analysis. By appropriately placing you into one of four seasons — spring, summer, autumn, and winter — each has its own palette of colors appropriate for you. This paper applies Fashion Science to explore a simple … Read more

Image Classification for E-commerce [Part I]

System Requirements Download or clone the ResNet model from Facebook’s Github link. Install the Torch ResNet dependencies on Ubuntu 14.04+: Install Torch on a machine with CUDA GPU (NVIDIA GPU with compute capability 3.5 or above) Install cuDNN v4 or v5 and the Torch cuDNN bindings See the installation instructions for a step-by-step guide. Let’s … Read more

Perfume Recommendations using Natural Language Processing

Introduction Natural Language Processing(NLP) has many intriguing applications to Recommender Systems and Information Retrieval. As a perfume lover and a Data Scientist, the unusual and highly descriptive language used in the niche perfume community inspired me to use NLP to create a model to help me discover perfumes I might want to purchase. “Niche” perfumes … Read more

Hand Keypoints Detection

Detect the keypoint positions on hand images with a small training data set. How many labelled images are needed to train a network to accurately predict finger and palm line locations? I was inspired by this blog post where the author reported 97.5% classification accuracy to classify if a human was wearing glasses or not with … Read more