Data Science Austria

Modeling Price with Regularized Linear Model & Xgboost

Developing statistical models for predicting individual house prices. We would like to model the price of a house. We know that the price depends on the location of the house, its square footage, year built, year renovated, number of bedrooms, number of garages, and so on. So those factors contribute to …
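As a rough illustration of what "regularized" means here, the sketch below fits a one-feature linear model with an L2 (ridge) penalty using the closed-form solution. It is not the article's code: the function name `fit_ridge_1d`, the toy numbers, and the choice of square footage as the single feature are all assumptions made for the example.

```python
from statistics import mean


def fit_ridge_1d(x, y, alpha):
    """Fit y ~ intercept + slope * x with an L2 penalty `alpha` on the slope."""
    x_bar, y_bar = mean(x), mean(y)
    # Centering both variables lets the unpenalized intercept drop out,
    # leaving slope = Sxy / (Sxx + alpha).
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: square footage (thousands) and price (thousands of dollars).
sqft = [1.0, 1.5, 2.0, 2.5, 3.0]
price = [200.0, 290.0, 410.0, 500.0, 600.0]

ols_intercept, ols_slope = fit_ridge_1d(sqft, price, alpha=0.0)      # plain least squares
ridge_intercept, ridge_slope = fit_ridge_1d(sqft, price, alpha=1.0)  # penalized fit

print(f"OLS slope:   {ols_slope:.2f}")
print(f"Ridge slope: {ridge_slope:.2f}")
```

With `alpha=0.0` the function reduces to ordinary least squares; any positive `alpha` shrinks the slope toward zero, which is the basic trade-off a regularized linear model makes.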

Jupytext 1.0 highlights

In version 1.0 the jupytext command was extended with new modes: --sync to synchronize the multiple representations of a notebook; --set-formats (and optionally, --sync) to set or change the pairing of a notebook or a text file; and --pipe to pipe the text representation of a notebook into another program. Perhaps …

Pandas Index Explained

The following notebook is very easy to follow and also has small tips and tricks to make daily work a little better.

    import pandas as pd
    adult = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", names=['age', 'workclass', 'fnlwgt', 'education', 'education_num', 'marital_status', 'occupation', 'relationship', 'race', 'sex', 'capital_gain', 'capital_loss', 'hours_per_week', 'native_country', 'label'], index_col=False)
    print("Shape of data {}".format(adult.shape))
    adult.head()

The dataset has 32561 rows and 15 features, the leftmost series …

How to Perform Exploratory Data Analysis with Seaborn

Data Preparation. Data preparation is the first step of any data analysis, ensuring that the data is cleaned and transformed into a form that can be analyzed. We will be performing EDA on the Ames Housing dataset. This dataset is popular among those beginning to learn data science and machine learning …

What should I Read Next?

Series, New Authors, and My Ratings. I noticed as I ran the algorithm that books that were part of a series dominated the results even as I played around with the inputs. It’s logical: when someone reads and enjoys book 1 in a series, they’re likely to read book 2. If …

Customer Segmentation Analysis with Python

In this article I’ll explore a data set on mall customers to see whether there are any discernible segments and patterns. Customer segmentation is useful for understanding which demographic and psychographic sub-populations exist among your customers in a business case. By understanding this, you can better understand …

Jump Out of the Jupyter Notebook with nbconvert

Easily Convert Notebooks to Python Scripts and Sharable Files. If you’re a data scientist, nbconvert is a great tool to add to your tool belt. With nbconvert you can easily turn your Jupyter Notebook into a Python script from the command line. It also allows you to turn your Jupyter notebook …
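On the command line, the conversion the excerpt mentions is typically run as `jupyter nbconvert --to script notebook.ipynb`. To make the idea concrete, here is a minimal standard-library sketch of the core step, pulling the code cells out of a notebook's JSON. It is a simplified illustration, not nbconvert's implementation: the helper name `notebook_to_script` and the two-cell example notebook are hypothetical, and real nbconvert also handles templates, magics, and other output formats.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Concatenate the code cells of an nbformat-4 notebook into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # nbformat stores cell source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    # Separate cells with one blank line and end the script with a newline.
    return "\n\n".join(chunks) + "\n"


# Hypothetical notebook with one markdown cell and two code cells.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ["x = 1\n", "print(x)"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": "y = x + 1"},
    ],
})

script = notebook_to_script(example)
print(script)
```

The markdown cell is dropped and the two code cells come out in order, which is the essence of a notebook-to-script export.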

Time Series Forecasting with Prophet

Learn how to use Facebook’s Prophet to predict air quality. (Photo by Frédéric Paulussen on Unsplash.) Producing high-quality forecasts is hard for many machine learning engineers. It requires a substantial amount of experience and very specific skills. Also, other forecasting tools were too inflexible to incorporate useful assumptions. For those …

How to PyTorch in Production

(Photo by Sharon McCutcheon on Unsplash.) ML is fun, ML is popular, ML is everywhere. Most companies use either TensorFlow or PyTorch. There are some old-timers who prefer Caffe, for instance. Mostly it’s all about the Google vs. Facebook battle. Most of my experience is with PyTorch, even though most …