Data Science Austria

Pandas Index Explained

The following notebook is easy to follow and includes small tips and tricks to make daily work a little better.

    import pandas as pd

    # Load the UCI Adult dataset; the file has no header row, so column names are supplied.
    adult = pd.read_csv(
        "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
        names=['age', 'workclass', 'fnlwgt', 'education', 'education_num',
               'marital_status', 'occupation', 'relationship', 'race', 'sex',
               'capital_gain', 'capital_loss', 'hours_per_week',
               'native_country', 'label'],
        index_col=False,
    )
    print("Shape of data: {}".format(adult.shape))
    adult.head()

The dataset has 32561 rows and 15 features, the leftmost series …
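As a rough illustration of the index the excerpt refers to, the sketch below reads a hypothetical three-row sample instead of the full file. The sample values, the shortened column list, and the variable names `csv_text`, `sample`, and `by_age` are assumptions made for this example, not part of the original notebook.

```python
import io

import pandas as pd

# Hypothetical three-row sample in a comma-separated, header-less layout.
csv_text = "39,State-gov,77516\n50,Self-emp-not-inc,83311\n38,Private,215646\n"

sample = pd.read_csv(
    io.StringIO(csv_text),
    names=["age", "workclass", "fnlwgt"],
    index_col=False,  # do not use any column as the index
)

# With no index column, pandas assigns a default integer index (0, 1, 2, ...).
print(sample.index)

# Promoting a column to the index allows label-based lookups with .loc.
by_age = sample.set_index("age")
print(by_age.loc[50, "workclass"])
```

Whether to keep the default integer index or promote a column depends on how the rows will be looked up later; the sketch only shows that both options exist.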

How to Perform Exploratory Data Analysis with Seaborn

Data Preparation
Data preparation is the first step of any data analysis; it ensures the data is cleaned and transformed into a form that can be analyzed. We will be performing EDA on the Ames Housing dataset. This dataset is popular among those beginning to learn data science and machine learning …

Customer Segmentation Analysis with Python

In this article I’ll explore a data set on mall customers to see whether there are any discernible segments and patterns. Customer segmentation helps you understand which demographic and psychographic sub-populations exist among your customers in a business case. By understanding this, you can better understand …

Jump Out of the Jupyter Notebook with nbconvert

Easily Convert Notebooks to Python Scripts and Sharable Files
If you’re a data scientist, nbconvert is a great tool to add to your tool belt. With nbconvert you can easily turn your Jupyter Notebook into a Python script from the command line. It also allows you to turn your Jupyter notebook …

Time Series Forecasting with Prophet

Learn how to use Facebook’s Prophet to predict air quality
Photo by Frédéric Paulussen on Unsplash
Producing high quality forecasts is hard for many machine learning engineers. It requires a substantial amount of experience and very specific skills. Also, other forecasting tools were too inflexible to incorporate useful assumptions. For those …

Setting up Email Updates for Your Scraper using Python and a Gmail Account

Photo by Jamie Street on Unsplash
Very often when building web scrapers to collect data, you’ll run into one of these situations:
- You want to send the program’s results to someone else
- You’re running the script on a remote server and you want automatic, real-time reports on results (e.g. updates on …
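A minimal sketch of how such an update could be sent with Python's standard library is shown here; it is not taken from the article itself. The helper names `build_report` and `send_report`, the placeholder addresses, and the use of a Gmail app password over SSL on port 465 are assumptions made for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_report(sender, recipient, subject, body):
    """Assemble a plain-text email message (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_report(msg, username, app_password):
    """Send the message through Gmail's SMTP server over SSL (assumed setup)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(username, app_password)
        server.send_message(msg)


# Placeholder addresses and text; nothing is sent until send_report is called.
report = build_report(
    "scraper@example.com",
    "me@example.com",
    "Scraper update",
    "Collected 120 rows.",
)
```

Keeping message construction separate from sending makes it possible to inspect or log the report without touching the network.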

What to do when your data fails OLS Regression assumptions

Regression analysis falls in the realm of inferential statistics. Consider the following equation:

y ≈ β0 + β1x + e

The approximate equals sign indicates that there is an approximate linear relationship between x and y. The error term e indicates this model isn’t going to fully reflect reality via …
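To make the equation concrete, here is a small sketch that estimates β0 and β1 with the standard closed-form least-squares formulas and then computes the residuals, which play the role of e. The function name `fit_simple_ols` and the five data points are made up for this example and do not come from the article.

```python
def fit_simple_ols(xs, ys):
    """Return (beta0, beta1) minimising the squared error of y ≈ beta0 + beta1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Hypothetical data that is roughly, but not exactly, linear.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

beta0, beta1 = fit_simple_ols(xs, ys)

# Residuals: the part of y the fitted line does not explain.
residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
print(beta0, beta1, residuals)
```

With an intercept in the model the residuals sum to zero up to floating-point error, so checks of the OLS assumptions look at their pattern and spread rather than their total.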