Data Science Austria

10 Steps to Teaching Data Science Well

1. Ensure everyone in your class feels comfortable to participate. Create an inclusive learning environment by ensuring everyone feels like they belong to the classroom and can participate, whether it’s through answering questions in front of everyone or approaching the instructor one-on-one. Prioritize people of color, women, LGBTQIA+, and folks … Read more10 Steps to Teaching Data Science Well

Will your income be more than $50K/yr? Machine Learning can tell

Machine learning is breaking grounds in numerous fields including Finance. What if we could use Machine Learning models to identify incomes of individuals? I found just the right dataset for this, called Census Income Dataset. I used the information in the dataset to predict if someone would earn an income … Read moreWill your income be more than $50K/yr? Machine Learning can tell

Building a Content Based Recommender System for Hotels in Seattle

Photo Credit: Pixabay How to use description of a hotel to recommend similar hotels. The cold start problem is a well known and well researched problem for recommender systems, where system is not able to recommend items to users. due to three different situation i.e. for new users, for new products and … Read moreBuilding a Content Based Recommender System for Hotels in Seattle

Monotonic Binning with GBM

In addition to monotonic binning algorithms introduced in my previous post (https://statcompute.wordpress.com/2019/03/10/a-summary-of-my-home-brew-binning-algorithms-for-scorecard-development), two more functions based on Generalized Boosted Regression Models have been added to my GitHub repository, gbm_bin() and gbmcv_bin(). The function gbm_bin() estimates a GBM model without the cross validation and tends to generate a more granular binning … Read moreMonotonic Binning with GBM

Multivariate Time Series Forecasting Using Random Forest

Introduction In my earlier post (Understanding Entity Embeddings and It’s Application) [1], I’ve talked about solving a forecasting problem using entity embeddings — basically using tabular data that have been represented as vectors and using them as input to a neural network based model to solve a forecasting problem. This time around … Read moreMultivariate Time Series Forecasting Using Random Forest