Estimating Intervention Effects using Bayesian Models in R

Measuring the effect of an intervention on some metric is an important problem in many areas of business and academia. Imagine you want to know the effect of a recently launched advertising campaign on product sales. In an ideal setting, you would have a treatment and a control group so that you can measure the …

A Framework to tackle tough Data Science Problems

One of the things I particularly like about working in data science is the science part: figuring out the right questions to ask, how to frame a problem correctly, and finally trying to solve it. While there are many problems that you can simply solve by library(caret) or from sklearn import * and dumping your …

The Mathematics of Decision Trees, Random Forest and Feature Importance in Scikit-learn and Spark

This post attempts to consolidate information on tree algorithms and their implementations in Scikit-learn and Spark. In particular, it was written to provide clarification on how feature importance is calculated. There are many great resources online discussing how decision trees and random forests are created, and this post is not intended to be that. Although …