Data Science Austria

A Complete Exploratory Data Analysis and Visualization for Text Data

How to combine visualization and NLP to generate insights in an intuitive way. Visually representing the content of a text document is one of the most important tasks in the field of text mining. As data scientists or NLP specialists, not only do we explore the content of …

How to use Test Driven Development in a Data Science Workflow

TDD Example: Tweet Preparation for NLP Tasks. For this example, I used pytest instead of unittest from the Python standard library. If you are looking for an introduction to the latter, see the link at the bottom of this post. To walk you through the TDD process, I chose a simple …
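As a rough illustration of the test-first workflow that the excerpt above describes, here is a minimal sketch. The function name prepare_tweet and its cleaning rules (dropping URLs and @mentions, stripping the '#' sign, lowercasing) are assumptions made for this example and are not taken from the post itself.

```python
import re


def prepare_tweet(text: str) -> str:
    """Hypothetical helper: normalise a tweet for downstream NLP tasks."""
    text = re.sub(r"https?://\S+", "", text)  # drop URLs
    text = re.sub(r"@\w+", "", text)          # drop @mentions
    text = text.replace("#", "")              # keep the hashtag word, drop '#'
    return " ".join(text.lower().split())     # lowercase, collapse whitespace


# pytest collects functions named test_*. In a TDD loop these would be
# written first, fail, and then drive the implementation above.
def test_removes_urls_and_mentions():
    assert prepare_tweet("@bob check https://t.co/abc now") == "check now"


def test_keeps_hashtag_words():
    assert prepare_tweet("Loving #NLP") == "loving nlp"
```

Saved to a file such as test_tweets.py (a hypothetical name), the tests can be run with the pytest command. Each new cleaning rule would start as another failing test.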

Optimizing Jupyter Notebook: Tips, Tricks, and nbextensions

nbextensions: The benefit of this extension is that it changes the defaults. To install nbextensions, run the following commands in Anaconda Prompt:

conda install -c conda-forge jupyter_contrib_nbextensions
conda install -c conda-forge jupyter_nbextensions_configurator

Alternatively, you can install nbextensions using pip:

pip install jupyter_contrib_nbextensions

Run pip show jupyter_contrib_nbextensions to find where notebook extensions …

Bayesian Modeling of Pro Overwatch Matches with PyMC3

Photo by AC De Leon on Unsplash. Professional eSports are becoming increasingly popular, and the industry is growing rapidly. Many professional leagues are built around games in which two teams battle it out. Call of Duty, League of Legends, and Overwatch are all examples. Although these are …

Six Recommendations for Aspiring Data Scientists

Source: https://www.maxpixel.net/Art-Colourful-Gears-Creativity-Cogs-Colorful-1866468. Building experience before landing a job: Data science is a field with huge demand, in part because it seems to require experience as a data scientist to be hired as a data scientist. But many of the best data scientists I’ve worked with have diverse backgrounds ranging from …

How to set up the PySpark environment for development, with good software engineering practices

In this article, we will discuss how to set up our development environment to write good-quality Python code and how to automate some of the tedious tasks to speed up deployments. We will go over the following steps: set up our dependencies in an isolated virtual environment …

Let’s build an Article Recommender using LDA

Because of my keen interest in learning new topics, I decided to work on a project in which a Latent Dirichlet Allocation (LDA) model recommends Wikipedia articles based on a search phrase. This article explains my approach to building the project in Python. Check out the project on GitHub below. Structure …