How A.I Will Enhance Content Marketing in the Future

Advances in A.I software and how they could improve the content marketing industry. Artificial intelligence is starting to shape critical industries across the world on a considerable scale; this is evident in the marketing, gaming, healthcare, tech, and finance industries. The profits gained from utilizing A.I technology can be monumental; this can …

Thompson Sampling For Multi-Armed Bandit Problems (Part 1)

Using Bayesian Updating For Online Decision Making. “Multi-armed bandit” is perhaps the coolest term in data science, excluding financial applications to “naked European call options” or “short iron butterflies”. Multi-armed bandits are also among the most commonly encountered practical applications. The term has a helpful motivating story. A “one-armed bandit” is a slot machine: pull the …
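The Bayesian updating named in the teaser can be illustrated with a minimal sketch. It assumes each arm pays out as a Bernoulli (win/lose) trial and starts from a uniform Beta(1, 1) prior; the payout rates, round count, and function name are hypothetical and are not taken from the post.

```python
import random

def thompson_sampling(true_rates, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over arms with hidden payout rates."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)    # observed successes per arm
    losses = [0] * len(true_rates)  # observed failures per arm
    for _ in range(n_rounds):
        # Draw one plausible payout rate per arm from its Beta posterior.
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        # Pull the chosen arm and update its posterior counts.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses

# Hypothetical slot machines with payout rates unknown to the player.
wins, losses = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
pulls = [w + l for w, l in zip(wins, losses)]
```

Because arms are chosen by sampling from the posterior rather than by taking its mean, poorly explored arms still get tried occasionally, while most pulls drift toward the arm that looks best.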

Auria Kathi Powered by Microsoft Azure Machine Learning Pipelines

On January 1 this year, Fabin Rasheed and I launched Auria Kathi, the AI Poet Artist living in the cloud. Auria writes a poem, draws an image according to the poem, then colors it with a random mood. All these creative actions are carried out without any human intervention. Auria Kathi is an anagram …

Behind the Models: Beta, Dirichlet, and GEM Distributions

Building Blocks For Non-Parametric Bayesian Models. In a future post I want to cover non-parametric Bayesian models; these models are infinite-dimensional and allow for expansive online learning. But first I want to cover some of the building blocks: Beta, Dirichlet, and GEM distributions. These distributions have several helpful properties that provide for a wide variety of machine …

Getting an environment’s name in R: the envnames package

Looking for an object in nested environments. The following picture shows an environment space that highlights the connections between package and system environments (child -> parent relationships) and in particular the use of user-defined environments (outer_env and nested_env), which are part of the global environment and may be regarded as nested environments (within the global …

5 Questions to Ask Before Building a Readmissions Model

1. What Intervention? Before you jump to exact details, think about the big picture for a moment. Brainstorm how you’re going to use these predictions. Does your organization have interventions in place for patients who are deemed “high-risk” for readmissions? Will the patients be assigned a dedicated nurse while they are in the hospital? Will …

Making a Command Line HTML Rendering Script for “The Art of the Command Line” (in R)

The Feedly category I have set up for git-stalking has indicated a fairly massive interest in Joshua Levy’s The Art of the Command Line. What is “The Art of the Command Line”? To quote the author(s): Fluency on the command line is a skill often neglected or considered arcane, but it improves your flexibility and productivity …

Full EARL London 2019 agenda available

Once again, we are delighted to announce a stellar line-up of speakers for this year’s EARL Conference; from Retail and Insurance to Media, Manufacturing and Pharmaceutical, the range of industries now using R stats in their workflow continues to grow. If you are interested in hearing why companies such as BBC News, BMW Group, Arla …

Interested in AI Policy? Start writing

Recently, OpenAI’s Amanda Askell, Miles Brundage, and Jack Clark joined Rob Wiblin on the 80,000 Hours podcast to discuss a wide range of topics related to AI philosophy, policy, and publication norms. During the conversation, they also discussed where to start if you’re trying to understand AI and AI …

What is Wavelet and How We Use It for Data Science

Hello, this is my second post on the signal processing topic. For now, I’m interested in learning more about signal processing to understand a certain paper. And to be honest, for me this wavelet thing is harder to understand than the Fourier transform. After I felt I understood this topic fairly well, I realized something. …

Norms, Penalties, and Multitask learning

Introduction: A regularizer is commonly used in machine learning to constrain a model’s capacity to certain bounds, based either on a statistical norm or on prior hypotheses. This adds preference for one solution over another in the model’s hypothesis space, or the set of functions that the learning algorithm is allowed to select as being …

Data Analysis: predicting the housing market using Python

Home sales in the second half of 2018 and the first half of 2019 by bedroom size. What do the sold homes tell us? I use Python to calculate the number of bedrooms and the sale price so as to observe any relationship between the two. I came up with three numbers for the sold …

Role of Machine Learning in redefining Retail Banking

The banking industry is going through a transformational journey with the comprehensive use of advanced analytics algorithms in the day-to-day business of core banking. Customer acquisition through various channels, existing customer engagement, and predicting defaulters on credit card or loan applications are a few of the areas where analytics is doing a tremendous job. I will …

End-to-end learning, the (almost) every purpose ML method

Can E2E be used to solve every Machine Learning problem? One of the most important skills for those who work with Machine Learning is to know which method is the right choice for a given problem. Some choices are trivial (e.g. supervised or unsupervised, regression or classification), because they …

Why you should Double-DIP for Natural Image Decomposition

“Double-DIP”: Unsupervised Image Decomposition via Coupled Deep-Image-Priors. The key aspect of Double-DIP is inherent in the fact that the distribution of small patches within each decomposed layer is “simpler” (more uniform) than in the original mixed image. Let’s simplify it with an example. Observe the illustrative example in Figure 3a. Two different textures, X …

K-Means Clustering with scikit-learn

Fundamentals of K-Means Clustering. As we will see, the k-means algorithm is extremely easy to implement and is also computationally very efficient compared to other clustering algorithms, which might explain its popularity. The k-means algorithm belongs to the category of prototype-based clustering. Prototype-based clustering means that each cluster is represented by a prototype, which can …
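The prototype idea can be sketched in plain Python. This is a minimal illustration of Lloyd's algorithm with the centroid (mean) as each cluster's prototype; it deliberately does not use scikit-learn, and the sample points, starting centroids, and function name are hypothetical.

```python
from math import dist

def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    clusters = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest prototype.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each prototype to the mean of its members
        # (an empty cluster keeps its previous prototype).
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Two visually separate groups of 2-D points (made-up data).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Each returned centroid summarizes its cluster, which is what "represented by a prototype" means in practice.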

What is machine learning and deep learning?

A series on the fundamentals of machine learning and deep learning. The best introduction you can get to machine learning and deep learning. Throughout this series, links will be provided where you can find more information about the subjects covered. Feel free to explore them during or after reading. I was searching …

How to start a new package with testing in R

# Navigate where you want your folder to be located
setwd("C:/Users/chief/Documents/Github")
# Assumes usethis is installed
usethis::create_package("foo")
# Say yes or no to next (annoying) popup window, it doesn't matter.
# Add a test environment
setwd("foo")
usethis::use_testthat()
# Add first test function to at least get something in that folder.
# Go to foo\tests\testthat
# and add this file.
context("foo")
library(foo)
test_that("I'm testing something", {
  # do something …

More Bayes and multiple comparisons

In my last post I had a little fun comparing perspectives among Bayesian, frequentist and programmer methodologies. I took a nice post from Anindya Mozumdar from the R Bloggers feed and investigated the world’s fastest man. I’ve found that in writing these posts two things always happen. I learn a lot, and I have follow-on questions or thoughts. This time is no exception, …

An Introduction to Virtual Adversarial Training

Virtual Adversarial Training is an effective regularization technique which has given good results in supervised learning, semi-supervised learning, and unsupervised clustering. This is a re-post of the original post: https://divamgupta.com/unsupervised-learning/semi-supervised-learning/2019/05/31/introduction-to-virtual-adversarial-training.html Virtual adversarial training has been used for: improving supervised learning performance, semi-supervised learning, deep unsupervised …

78th #TokyoR Meetup Roundup!

With the arrival of summer, another TokyoR User Meetup! On May 25th, useRs from all over Tokyo (and some even from further afield – including Kan Nishida of Exploratory, all the way from California!) flocked to Jimbocho, Tokyo for another jam-packed session of R hosted by Mitsui Sumitomo Insurance Group. Like my previous round up posts (for TokyoR #76 and TokyoR #77) I …

A Beginner’s Guide to Word Embedding with Gensim Word2Vec Model

1. Introduction of Word2vec. Word2vec is one of the most popular techniques for learning word embeddings using a two-layer neural network. Its input is a text corpus and its output is a set of vectors. Word embedding via word2vec can make natural language computer-readable; mathematical operations on words can then be used to …

Creating Azure Logic Apps from R using httr

Logic Apps is a serverless framework in Azure, quite similar to IFTTT (if this, then that) and Zapier, that allows you to connect different services and create workflows. You can define different types of triggers based on time and events (e.g. HTTP requests, messages received, …) to start workflows. Logic Apps can be created using a …

Hands on Graph Neural Networks with PyTorch & PyTorch Geometric

In my last article, I introduced the concept of the Graph Neural Network (GNN) and some of its recent advancements. Since this topic is getting seriously hyped up, I decided to make this tutorial on how to easily implement a Graph Neural Network in your project. You will learn how to construct your own GNN with …

Reinventing Personalization For Customer Experience

Why? What? How? Atif M., May 30. “Remember that a person’s name is, to that person, the sweetest and most important sound in any language.” (Dale Carnegie, How to Win Friends and Influence People) When it comes to building good relationships with customers, learning their names is an essential step for businesses at any level. Consumers expect …

How to use ggplot2 in Python

Introduction: Thanks to its strict implementation of the grammar of graphics, ggplot2 provides an extremely intuitive and consistent way of plotting your data. Not only does ggplot2’s approach to plotting ensure that each plot comprises certain basic elements, but it also greatly improves the readability of your code. However, if you are …

Introduction to Latent Matrix Factorization Recommender Systems

Latent Factors are “Hidden Factors” unseen in the data set. Let’s use their power. Latent Matrix Factorization is an incredibly powerful method to use when creating a Recommender System. Ever since Latent Matrix Factorization was shown to outperform other recommendation methods in the Netflix Recommendation contest, it’s been a cornerstone in building …

Which 2020 Candidate is the Best at Twitter?

A Data Analysis of the 2020 Democratic Candidate Twitter Accounts. The contest for the 2020 Democratic party nomination will be fought in many arenas. Before the first debates in a month, before the campaign rallies in key states, and even before prime time TV interviews, the fight for the nomination …

An Easy Introduction to SQL for Data Scientists

SQL (Structured Query Language) is a standardised programming language designed for data storage and management. It allows one to create, parse, and manipulate data quickly and easily. With the AI hype of recent years, technology companies serving all kinds of industries have been forced to become more data-driven. When a company that serves thousands of …
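As a rough illustration of creating, manipulating, and querying data with SQL, here is a minimal sketch that runs against an in-memory SQLite database through Python's standard `sqlite3` module. The `customers` table, its columns, and its rows are hypothetical and are not taken from the post.

```python
import sqlite3

# An in-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# Create: define a table and load a few made-up rows.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, spend REAL)")
conn.executemany(
    "INSERT INTO customers (name, spend) VALUES (?, ?)",
    [("Ada", 120.0), ("Grace", 80.0), ("Alan", 200.0)],
)

# Manipulate: update one row in place.
conn.execute("UPDATE customers SET spend = spend + 20 WHERE name = ?", ("Grace",))

# Query: filter and sort with a single declarative statement.
rows = conn.execute(
    "SELECT name, spend FROM customers WHERE spend >= 100 ORDER BY spend DESC"
).fetchall()
conn.close()
```

The `?` placeholders let the driver bind the values, which avoids building SQL strings by hand.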

A Step-by-Step Implementation of Gradient Descent and Backpropagation

One example of building a neural network from scratch. The original intention behind this post was merely to brush up on the mathematics of neural networks, as I like to be well versed in the inner workings of algorithms and get to the essence of things. I then thought I might as well put together a story rather …

Databricks: How to Save Files in CSV on Your Local Computer

3. Download the CSV file to your local computer. In order to download the CSV file located in DBFS FileStore to your local computer, you will have to change the highlighted URL to the following: https://westeurope.azuredatabricks.net/files/df/Sample.csv/part-00000-tid-8365188928461432060–63d7293d-3b02–43ff-b461-edd732f9e06e-4704-c000.csv?o=3847738880082577 As you may have noticed, the CSV path in bold (df/Sample.csv/part-00000-tid-8365188928461432060–63d7293d-3b02–43ff-b461-edd732f9e06e-4704-c000.csv) is from step 2. The number (3847738880082577) is from the original …

RoboSomm Chapter 3: Wine Embeddings and a Wine Recommender

One of the cornerstones of previous chapters of the RoboSomm series has been to extract descriptors from professional wine reviews, and to convert these into quantitative features. In this article, we will explore a way of extracting features from wine reviews that combines the best of the existing RoboSomm series and academic literature on this …

Quick and easy t-SNE analysis in R

t-SNE is a useful dimensionality reduction method that allows you to visualise data embedded in a lower number of dimensions, e.g. 2, in order to see patterns and trends in the data. It can deal with more complex patterns of Gaussian clusters in multidimensional space compared to PCA. Although it is not suited to finding outliers …

xaibot – conversations with predictive models!

If you could talk to a predictive machine learning model, what would you ask? Try! Michał Kuźba is developing a mind-blowing project – xai chat-bot. It is a dialog-based system that helps to explore and understand predictive models through natural language conversations (type, speak or phone the model). For example, imagine that you have a …

Hypothesis testing visualized

Literally seeing how stat tests work. In this article, we’ll get an intuitive, visual feel for hypothesis testing. While there are many articles online that explain it in words, there aren’t nearly enough that rely primarily on visuals, which is surprising since the subject lends itself quite well to exposition through pictures and movies. But before …

How to Automate Hyperparameter Optimization

In the machine learning and deep learning paradigm, model “parameters” and “hyperparameters” are two frequently used terms. “Parameters” are configuration variables that are internal to the model and whose values can be estimated from the training data; “hyperparameters” are configuration variables that are external to the model and whose values cannot be estimated …
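The distinction can be made concrete with a minimal sketch, assuming a one-parameter linear model fitted by gradient descent and a plain grid search over the learning rate. The data, the candidate values, and the function names are hypothetical; the post itself may automate the search differently.

```python
def fit_slope(xs, ys, learning_rate, n_steps=50):
    """Estimate the parameter w of y = w * x from data by gradient descent."""
    w = 0.0  # parameter: internal to the model, learned from the training data
    for _ in range(n_steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad  # learning_rate is a hyperparameter set from outside
    return w

def mse(w, xs, ys):
    """Mean squared error of the fitted line on a data set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up data generated by y = 2x, split into training and validation sets.
train_x, train_y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
valid_x, valid_y = [4.0, 5.0], [8.0, 10.0]

# Grid search: fit once per candidate and keep the lowest validation error.
candidates = [0.001, 0.01, 0.1]
results = {lr: mse(fit_slope(train_x, train_y, lr), valid_x, valid_y) for lr in candidates}
best_lr = min(results, key=results.get)
```

Here `w` is estimated from the training data, while `learning_rate` has to be chosen by comparing validation error across runs.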

Cognitive capitalism chapter reworked

The Cognitive capitalism chapter of my evidence-based software engineering book took longer than expected to polish; in fact it got reworked, rather than polished (which still needs to happen, and there might be more text moving from other chapters). Changing the chapter title, from Economics to Cognitive capitalism, helped clarify lots of decisions about the …

What I Learned from (Two-time) Kaggle Grandmaster Abhishek Thakur

Drawing insights from Abhishek Thakur’s NLP kernel. Quick Bio: Before his many data scientist stints in companies scattered throughout Germany, Abhishek Thakur earned his bachelor’s in electrical engineering at NIT Surat and his master’s in computer science at the University of Bonn. Currently, he holds the title of Chief Data Scientist …

Buyers beware, Fake product reviews are plaguing the internet.

Spotting Fake Product Reviews using Machine Learning. Opinion spamming is a worsening problem. For instance, CBS News reports that 52% of product reviews posted on Walmart.com are “inauthentic or unreliable”, while at least 30% of reviews posted on Amazon are fake. The problem of identifying opinion spamming remains an open topic, despite the fact …

April 2019: “Top 40” New CRAN Packages

One hundred eighty-seven new packages made it to CRAN in April. Here are my picks for the “Top 40”, organized into ten categories: Biotechnology, Data, Econometrics, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities, and Visualization. Biotechnology. genpwr v1.00: Provides functions for power and sample size calculations for genetic association studies allowing for mis-specification of …

Projects

Currently I am primarily working on the following projects: how to implement decision engineering, aka automated decision making, in a large company at scale, from both a cultural and an engineering perspective; and time series learning at scale by combining methods from statistics (state-space models) with machine learning techniques (mostly gradient boosting). I plan to put some …