Data Science Austria

Perfume Recommendations using Natural Language Processing

Introduction: Natural Language Processing (NLP) has many intriguing applications to Recommender Systems and Information Retrieval. As a perfume lover and a data scientist, I was inspired by the unusual and highly descriptive language of the niche perfume community to use NLP to build a model that helps me discover perfumes I might …

Using Mode definitions (and dbt) to bootstrap your data model

Penny model. Oh, if our network graphs were so simple. At Landed, we track a variety of customer journeys through a variety of lifecycle stages across a variety of products. To explore all this variety, our teams rely on a collection of analytics dashboards to help them anticipate the needs of …

A Journey through a Buyer’s life and Shop similarity

A friend recently pointed me toward “Interactive Map of Reddit and Subreddit Similarity Calculator” and asked me if a similar approach could be used to cluster online shops and get a shop similarity calculator. The idea was interesting enough for me to spend the last few weeks exploring it. Working …

Algorithms for Text Classification — Part 1: Naive Bayes

Next, let’s see how to run this algorithm using Python with real data:

    import pandas as pd
    import numpy as np

    spam_data = pd.read_csv('spam.csv')
    spam_data['target'] = np.where(spam_data['target'] == 'spam', 1, 0)
    print(spam_data.shape)
    spam_data.head(10)

    from sklearn.model_selection import train_test_split

    # Split data into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(spam_data['text'], spam_data['target'], random_state=0)

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.metrics …
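To make the idea behind the excerpt concrete, here is a minimal sketch of a multinomial Naive Bayes spam classifier written with only the Python standard library. It is not the article's code: the toy messages, the labels, and the helper names `train_nb` and `predict_nb` are all hypothetical, and whitespace tokenization stands in for a real vectorizer. It uses add-one smoothing, which matches the default `alpha=1.0` of scikit-learn's `MultinomialNB`.

```python
import math
from collections import Counter

def train_nb(texts, labels):
    """Count words per class and documents per class (hypothetical helper)."""
    word_counts = {}
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict_nb(text, word_counts, class_counts, vocab):
    """Return the class with the highest log-posterior score."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(n_docs / total_docs)
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the word.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data standing in for spam.csv (1 = spam, 0 = not spam).
texts = [
    "win a free prize now",
    "free cash offer click now",
    "claim your free prize",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
    "can you send me the report",
]
labels = [1, 1, 1, 0, 0, 0]

model = train_nb(texts, labels)
new_messages = ["free prize offer", "lunch at the office tomorrow"]
predictions = [predict_nb(message, *model) for message in new_messages]
print(predictions)
```

On real data, the same counting and scoring would typically be delegated to a vectorizer and a library classifier, with accuracy measured on a held-out test set rather than on hand-picked messages.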

How Do I Write About Data Science On Medium

5 Core Principles to Write about Data Science, and Beyond. 1. Be conversational: Your articles are always read by individual readers, one reader at any given time. In other words, readers mostly read your articles on their own, without anyone beside them. Therefore, to really attract and engage with readers, your …

Community Forums Meets Data Science

Analysis of forum members’ activity, posts, and behavior. Summary: As a community builder and strategist with a passion for data science, I have found that the use of data science techniques has deepened my understanding of the communities I manage, allowing me to make better strategic and operational decisions. In this article, …

Community detection of survey responses based on Pearson correlation coefficient with Neo4j

Just a few days ago, a new version of the Neo4j graph algorithms plugin was released. The new release brings new algorithms, and the Pearson correlation algorithm is one of them. To demonstrate how to use the Pearson correlation algorithm in Neo4j, we will use the data from “Young People Survey” Kaggle …
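For readers unfamiliar with the measure itself, the following is a small plain-Python sketch of the Pearson correlation coefficient between two respondents' answers. It does not use Neo4j or its plugin: the function name `pearson` and the 1-to-5 survey answers are hypothetical and serve only to show what the coefficient expresses.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical answers of three respondents to five survey questions (scale 1-5).
alice = [5, 4, 3, 2, 1]
bob = [4, 5, 3, 1, 2]
carol = [1, 2, 3, 4, 5]

print(round(pearson(alice, bob), 3))    # similar preferences: close to +1
print(round(pearson(alice, carol), 3))  # opposite preferences: close to -1
```

Respondent pairs with a high coefficient can then be linked in a similarity graph, which is the kind of structure a community detection algorithm operates on.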