Data Science Austria

Exploring Toronto Bike Share Ridership using Python

Analysis and Visualization: Personally, I prefer to create a new Jupyter notebook for analysis only. In the new notebook, I first imported the libraries and the cleaned data, then created new pandas Categorical datatypes for the day of the week (Monday, Tuesday, etc.) and month names to ensure fixed sorting …
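The fixed-sorting idea in this excerpt can be sketched as follows. This is a minimal illustration, not the author's notebook code: the `trips` DataFrame, its `start_time` column, and the sample timestamps are all hypothetical stand-ins for the cleaned ridership data.

```python
import pandas as pd

# Hypothetical sample of trip start times (assumption: the real cleaned data is not shown).
trips = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2017-01-08 09:00",  # Sunday
        "2017-01-04 17:30",  # Wednesday
        "2017-01-02 08:15",  # Monday
        "2017-01-06 12:45",  # Friday
    ])
})

# An ordered categorical dtype fixes the sort order to calendar order
# instead of the default alphabetical order of the day names.
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
day_type = pd.CategoricalDtype(categories=day_order, ordered=True)

# Derive the day name from the timestamp and cast it to the ordered dtype.
trips["day_of_week"] = trips["start_time"].dt.day_name().astype(day_type)

# Sorting now follows Monday..Sunday rather than Friday, Monday, Sunday, ...
sorted_days = trips.sort_values("day_of_week")["day_of_week"].tolist()
print(sorted_days)
```

The same pattern would apply to month names: list them in calendar order and pass that list as the `categories` of another ordered `CategoricalDtype`.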

Predicting Airbnb prices with deep learning, part 1: how to clean up Airbnb data

Project aims and background: Airbnb is a home-sharing platform that allows home-owners and renters (‘hosts’) to put their properties (‘listings’) online, so that guests can pay to stay in them. Hosts are expected to set their own prices for their listings. Although Airbnb and other sites provide some general guidance, …

Scalable Python Code with Pandas UDFs: A Data Science Application

Source: https://pxhere.com/en/photo/1417846. Making Python code run at massive scale in the cloud: PySpark is a really powerful tool, because it enables writing Python code that can scale from a single machine to a large cluster. While libraries such as MLlib provide good coverage of the standard tasks that a data scientist …

Classification and Regression Analysis with Decision Trees

The Fundamentals of Decision Trees: A decision tree is constructed by recursive partitioning: starting from the root node (known as the first parent), each node can be split into left and right child nodes. These nodes can then be split further, becoming parent nodes of their own child nodes. …
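The recursive partitioning described in this excerpt can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the article's implementation: the Gini impurity criterion, the `max_depth` stopping rule, the dictionary node layout, and the function names `gini`, `best_split`, `build_tree`, and `predict` are all choices made here for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (weighted impurity, feature index, threshold) or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split a parent node into left and right child nodes."""
    split = best_split(rows, labels)
    if len(set(labels)) == 1 or depth == max_depth or split is None:
        # Leaf node: predict the majority class of the samples that reached it.
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        # Each child becomes the parent of its own subtree.
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf by following the split rules."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Toy one-feature dataset (hypothetical values for illustration).
rows = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
labels = ["a", "a", "a", "b", "b", "b"]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, [2.5]), predict(tree, [11.5]))
```

On this toy data a single split separates the two classes, so the root has two leaf children. Real datasets typically need deeper trees, which is why a depth limit or similar stopping rule is usually included.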

Practical Statistics & Visualization With Python & Plotly

Photo credit: Pixabay. How to use Python and Plotly for statistical visualization, inference, and modeling: One day last week, I was googling “statistics with Python”, and the results were somewhat unfruitful. Most literature, tutorials and articles focus on statistics with R, because R is a language dedicated to statistics and has more statistical …

Get started with Object Oriented Programming in Python: Classes and Instances

New to OOP? Learn how to write a class and create instances in Python. There are a lot of articles popping up on object-oriented programming in Python at the moment. Many data scientists, myself included, find ourselves in roles that focus on writing functional code, often in small scripts or prototypes. …

Challenges in sentiment analysis: a case for word clouds (for now)

Exploring simple Python code visualizations for marketing: Machine understanding and capability get merged together in popular culture. When I think about artificial intelligence, I get into this tricky habit of mixing understanding with capability. I imagine that there are ways we can tell how much a machine knows by what …