Interpreting recurrent neural networks on multivariate time series

A guide on how to get theoretically sound explanations from complex deep learning models trained on multivariate time series. In this article, we’ll explore a state-of-the-art method of machine learning interpretability and adapt it to multivariate time series data, a use case which it wasn’t previously prepared to work on. You’ll find explanations to core … Read more

Creating word clouds with python

During a recent NLP project, I came across an article where word clouds were created in the shape of US Presidents using words from their inauguration speeches. Whilst I had used word clouds to visualise the most frequent words in a document, I’d not considered using this with a mask to represent the topic or … Read more

Understanding Decision Trees for Classification (Python)

This tutorial goes into extreme detail about how decision trees work. Decision trees are a popular supervised learning method for a variety of reasons. Benefits of decision trees include that they can be used for both regression and classification, they are easy to interpret and they don’t require feature scaling. They have several flaws including … Read more

Humans vs. machines. What does it mean in retail?

Last month I took part in a debate on the ‘future of technology in retail and whether technology could replace humans….’ Subsequently, I’ve pulled together some thoughts for Inside Retail (also published here). For the past decade, advancements in technology have been disrupting what seems like most facets of the human experience. And with what … Read more

Sentiment Analysis on raw text using Amazon, IMDB, and Yelp!

This post is a direct continuation of my previous post about Text Preprocessing. This is a practical implementation of some important text preprocessing steps applied to raw text before it is fed to a machine learning model. Instead of using conventional preprocessing and learning methods through coding scripts, I’ve used a tool called KNIME. The dataset is … Read more

What is Probability?

The Subjective interpretation takes a different approach from the previous two. Instead of concerning itself with frequencies or counts, the Subjective approach posits that probabilities stem from a person’s (subjective) degree of belief that a particular event will occur, based on all relevant information available to them. This perspective aligns well with Bayesian statistics, which … Read more

The How of Explainable AI: Post-modelling Explainability

In the first two parts of our overview of the How of XAI, we looked into pre-modelling explainability and explainable modelling methodologies, which focus on explainability at the dataset stage and during model development. Yet these are relatively minor areas of interest compared with explainability after the fact, and post-modelling explainability is where the majority … Read more

The How of Explainable AI: Explainable Modelling

In the first part of our overview of the How of Explainable AI, we looked at pre-modelling explainability. However, the true scope of explainability is much broader. Explainability can be considered at all stages of AI development, namely, pre-modelling, model development, and post-modelling. The majority of AI explainability literature aims at explaining a black-box model … Read more

The How of Explainable AI: Pre-modelling Explainability

AI explainability is a broad and multi-disciplinary domain, being studied in several fields including machine learning, knowledge representation and reasoning, human-computer interaction, and the social sciences. Accordingly, XAI literature includes a large and growing number of methodologies. There are many factors that could contribute to how an AI model operates and makes its predictions, and … Read more

Getting started with Tensorflow, Keras in Python and R

The Pale Blue Dot “From this distant vantage point, the Earth might not seem of any particular interest. But for us, it’s different. Consider again that dot. That’s here, that’s home, that’s us. On it everyone you love, everyone you know, everyone you ever heard of, every human being who ever was, lived out their … Read more

Categories: R

Causality in model explanations and in the real world

You can’t always change a human’s input to see the output. At Fiddler Labs, we place great emphasis on model explanations being faithful to the model’s behavior. Ideally, feature importance explanations should surface and appropriately quantify all and only those factors that are causally responsible for the prediction. This is especially important if we want … Read more

The Complete Guide to Unsupervised Learning

Understand principal component analysis (PCA) and clustering methods, and implement each algorithm in two mini projects. Unsupervised learning is a set of statistical tools for scenarios in which there is only a set of features and no targets. Therefore, we cannot make predictions, since there are no associated responses to each observation. Instead, we are … Read more

Similarity measures in Recommendation Systems

In a previous article introducing Recommendation Systems, we mentioned several times the concept of ‘similarity measures’. Why? Because in Recommendation Systems, both Content-Based filtering and Collaborative filtering algorithms use a specific similarity measure to determine how similar two user or item vectors are. So in the end, a similarity measure is … Read more
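As a rough illustration of what comparing two user or item vectors can look like, here is a minimal sketch using cosine similarity. The excerpt does not say which measure the article uses, so cosine similarity is an assumption chosen because it is a common option; the names `alice`, `bob`, and `carol` and their ratings are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical ratings given by three users to the same four items.
alice = [5, 3, 0, 1]
bob = [4, 2, 0, 1]
carol = [0, 1, 5, 4]

sim_alice_bob = cosine_similarity(alice, bob)
sim_alice_carol = cosine_similarity(alice, carol)
```

With these made-up ratings, `sim_alice_bob` comes out larger than `sim_alice_carol`, which matches the intuition that Alice and Bob rate items in a similar pattern while Carol prefers different items.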

What did Puerto Rico say after its governor resigned? A Twitter data analysis

Interpreting tweets containing the hashtag #RickyRenunció using spaCy, Google Cloud, and NLP. The island of Puerto Rico and its people are currently making history. On July 13, Puerto Rico’s Center for Investigative Journalism published a document consisting of 889 pages of Telegram messages exchanged between the governor, Ricardo Rosselló, and inner members of his … Read more

Deep Latent Variable Models: Unravel Hidden Structures

A central problem in machine learning is to learn a complicated probability distribution p(x) with only a limited set of high-dimensional data points x drawn from this distribution. For example, to learn the probability distribution over images of cats we need to define a distribution which can model complex correlations between all pixels which form … Read more

Support Vector Machine Explained

Theory, Implementation, and Visualization. Support Vector Machine (SVM) is probably one of the most popular ML algorithms used by data scientists. SVM is powerful, easy to explain, and generalizes well in many cases. In this article, I’ll explain the rationales behind SVM and show the implementation in Python. For simplicity, I’ll focus on binary classification … Read more

Modeling Motivation and Emotion using Feedback Loops

If you’re anything like me, you probably set a lot of goals. Whether it’s to finish a paper by the end of the summer or to spend more time with friends and family, goals are what help motivate us to do something. Goals are also intimately tied to our feelings. You may have had the … Read more

Categories: R

Validating Type I and II Errors in A/B Tests in R

In this post, we seek to develop an intuitive sense of what type I (false-positive) and type II (false-negative) errors represent when comparing metrics in A/B tests, in order to gain an appreciation for “peeking”, one of the major problems plaguing the analysis of A/B tests today. To better understand what “peeking” is, it helps … Read more
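To make the link between “peeking” and type I errors concrete, here is a minimal simulation sketch. It is not the post’s own code (the post works in R); Python is used here, and every name and parameter (`simulate`, `n_tests`, `n_obs`, `look_every`, the known unit variance, and the 1.96 threshold) is an assumption made for illustration. Both arms are drawn from the same distribution, so any “significant” result is a false positive.

```python
import math
import random


def z_statistic(a, b):
    """Two-sample z statistic for equal-sized arms with known unit variance."""
    n = len(a)
    diff = sum(a) / n - sum(b) / n
    return diff / math.sqrt(2.0 / n)


def simulate(n_tests=300, n_obs=500, look_every=50, z_crit=1.96, seed=0):
    """Return (fixed-horizon, peeking) false-positive rates for A/A tests."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_tests):
        # Both arms share the same distribution: there is no real effect.
        a = [rng.gauss(0, 1) for _ in range(n_obs)]
        b = [rng.gauss(0, 1) for _ in range(n_obs)]

        # Fixed horizon: test exactly once, after all data has arrived.
        if abs(z_statistic(a, b)) > z_crit:
            fixed_hits += 1

        # Peeking: test at every checkpoint and stop at the first "win".
        checkpoints = range(look_every, n_obs + 1, look_every)
        if any(abs(z_statistic(a[:n], b[:n])) > z_crit for n in checkpoints):
            peeking_hits += 1
    return fixed_hits / n_tests, peeking_hits / n_tests


fixed_rate, peeking_rate = simulate()
```

Because the final checkpoint coincides with the fixed-horizon test, `peeking_rate` can never be lower than `fixed_rate`; with repeated looks it usually ends up noticeably above the nominal 5% level, which is the problem the post refers to.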

Categories: R


The mlr-org team is very proud to present the initial release of the mlr3 machine-learning framework for R. mlr3 comes with a clean object-oriented design using the R6 class system. With this, it overcomes the limitations of R’s S3 classes. It is a rewrite of the well-known mlr package which provides a convenient way of accessing many algorithms … Read more

Categories: R

Network model trees

The effect of covariates on correlations in psychometric networks is assessed with either model-based recursive partitioning (MOB) or conditional inference trees (CTree). Citation: Jones PJ, Mair P, Simon T, Zeileis A (2019). “Network Model Trees”, OSF ha4cw, OSF Preprints. doi:10.31219/ Abstract: In many areas of psychology, correlation-based network approaches (i.e., psychometric networks) have become a … Read more

Categories: R

Cluster Analysis: Create, Visualize and Interpret Customer Segments

Exploring methods for cluster analysis, visualizing clusters through dimensionality reduction and interpreting clusters through exploring impactful features. Although we have seen a large influx of supervised machine learning techniques being used in organizations, these methods typically suffer from one large issue: a need for labeled data. Fortunately, many unsupervised methods exist for clustering … Read more

Trolls and bots are disrupting social media — here’s how AI can stop them (Part 1)

Trolls and bots have a huge and often unrecognized influence on social media. They are used to influence conversations for commercial or political reasons. They allow small hidden groups of people to promote information supporting their agenda at a large scale. They can push their content to the top of people’s news feeds, search results, … Read more

Identifying trolls and bots on Reddit with machine learning (Part 2)

Trolls and bots are widespread across social media, and they influence us in ways we are not always aware of. Trolls can be relatively harmless, just trying to entertain themselves at others’ expense, but they can also be political actors sowing mistrust or discord. While some bots offer helpful information, others can be used to … Read more

Deep Learning Analysis Using Large Model Support

Optimize your Deep Learning model memory consumption with IBM Large Model Support. Memory management is now a really important topic in Machine Learning. Because of memory constraints, it is becoming quite common to train Deep Learning models using cloud tools such as Kaggle and Google Colab thanks to their free NVIDIA Graphics Processing Unit (GPU) … Read more

The machine learning lifecycle

According to Gartner, companies struggle to operationalize machine learning models: “The Gartner Data Science Team Survey of January 2018 found that over 60% of models developed with the intention of operationalizing them were never actually operationalized.” Based on our experience working together with various clients, we believe that this inability to operationalize is partly due … Read more

ConvNet Playground: An Interactive Visualization Tool for Exploring Convolutional Neural Networks

ConvNet Playground is focused on the task of semantic image search using CNNs. Our (rather simple) approach is implemented in two stages: (i) we extract features from all images in our datasets using a pre-trained CNN (think VGG16, InceptionV3, etc. pre-trained on ImageNet); (ii) we compute similarity as a measure of the distance between these … Read more

Microsoft ML Server 9.4 now available

To leave a comment for the author, please follow the link and comment on their blog: Revolutions. … Read more

Categories: R

Pruning Deep Neural Networks

TL;DR: Different approaches to pruning. By pruning, a VGG-16 based classifier is made 3x faster and 4x smaller. Deep Learning models these days require a significant amount of computing, memory, and power, which becomes a bottleneck in the conditions where we need real-time inference or to run models on edge devices and browsers with … Read more

Dash has gone full R

This is a reblog of the “Announcing Dash for R” announcement originally published July 10. Dash, the fastest growing framework for building analytic web applications on top of Python models, is now available for the R programming language. Dash was released in 2017 as the latest evolution in Plotly’s open-source analytics tools. At the time, Plotly was … Read more

Categories: R

Have you Optimized your Deep Learning Model Before Deployment?

Use NVIDIA TensorRT to optimize and speed up inference time on GPU. Illustration on an AI-based Computer Vision application with YOLO. This article is organized as follows: Introduction; What is NVIDIA TensorRT?; Set up the Development Environment using Docker; Computer Vision Application: Object detection with YOLOv3 model; References; Conclusion. This document presents how to use TensorRT to optimize a … Read more

AI Often Adds To Bias In Recruiting — But There’s A New Approach That Could Change The Game

Most people aren’t trying to be biased, but bias is inherent — it influences how we view any situation, often unconsciously. When you think of bias, characteristics like race, gender, and religion likely come to mind. But there’s a much broader context of what bias can actually be. Bias comes in many forms. For example, … Read more

Grades Aren’t Normal

This article is also available in PDF form. A while back someone posted on Reddit about the grading policies of their academic department. Specifically, the department chair made a statement claiming that grades should be Normally distributed with a C average. I responded, claiming that no statistician would ever take the idea that grades follow … Read more

Categories: R

Some Notes on GNU Licenses in R Packages

I was recently asked if Win-Vector LLC would move the R wrapr package from a GPL-3 license to an LGPL license. In the end I decided to move wrapr distribution to a “GPL-2 | GPL-3” license. This means the package is now available under both GPL-2 and GPL-3 licensing, allowing the user to pick which … Read more

Categories: R

Learn Classification with Decision Trees in R

When it comes to classification, a decision tree classifier is one of the easiest to use. It is incredibly easy to interpret. It handles missing data & outliers very well and as such requires far less up-front cleaning. You get to forgo the categorical variable encoding, as decision trees handle categoricals well! Without diving into … Read more

EARLy bird ticket offer ends tomorrow!

R fans, you have just one more day to get your hands on discounted EARL London 2019 tickets. Our early bird offer gets you £100 off the full price ticket, so it makes persuading your boss easier! Visit the EARL website for more details and see 2018’s highlights below: … Read more

Categories: R

How to become an AI-driven company

AI is transforming how we do business at an unprecedented pace, but the transition to becoming AI-driven is easier than you think. Now is the time to invest and remain at the top of your game. A few weeks ago Artificial Intelligence was thrown into the spotlight as the winners of this year’s Turing award, … Read more

The Fastest Way to Learn to Code? Be Invested in Your Numbers

Learning programming by interpreting numbers that matter to you. Curiosity is a universal human trait. Every single person asks questions. Every single person has interests. Every single person wants to know more about the way the world works — not necessarily for any personal gain, but just to know a bit more about the world … Read more

useR!2019 Toulouse recap

Conferences like useR! & EARL are the R events to attend every year, and personally, and as a company, I can’t imagine skipping one. It’s an important place to be if you want to be up-to-date with the R technology and build up your presence in the community. Our team have given rave reviews after … Read more

Categories: R

I Placed 4th in my First AI Competition. Takeaways from the Unity Obstacle Tower Competition

Over the last few years, most of my spare time has been spent tinkering, learning, and researching machine learning, specifically reinforcement learning and digital actors. Recently I decided to participate in the Obstacle Tower Challenge. To my surprise, my early efforts briefly topped the table, and I placed 2nd in the first round as my … Read more

The Future of Lending Money Is Deep Learning

Deep learning is becoming the standard way of assessing credit risk and soon will surpass human decision-making. According to Wikipedia, a bank is a “financial institution that accepts deposits from the public and creates credit” which means that one of the two main responsibilities of a bank is to lend money to commercial and corporate … Read more


The RIGHT JOIN keyword. Just like you’d expect, the RIGHT JOIN is similar to the LEFT JOIN. This join returns all of the rows of the table on the right side of the join and matching rows for the table on the left side of the join. And, for any rows where there are now … Read more
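The behaviour described above can be sketched in plain Python rather than SQL. This is only an illustration of the semantics, not the article’s code: the helper `right_join` and the `customers` and `orders` tables are hypothetical, and the sketch assumes the standard SQL behaviour in which a right-side row without a match is kept with the left side filled by NULL (`None` here).

```python
def right_join(left_rows, right_rows, key):
    """Return every row of right_rows paired with its matching left rows, or None."""
    # Index the left table by key so each lookup is a dictionary access.
    left_index = {}
    for row in left_rows:
        left_index.setdefault(row[key], []).append(row)

    joined = []
    for right in right_rows:
        matches = left_index.get(right[key])
        if matches:
            # One output row per matching left row.
            for left in matches:
                joined.append((left, right))
        else:
            # No match on the left: keep the right row, left side is None.
            joined.append((None, right))
    return joined


# Hypothetical tables: order 11 refers to a customer that is not in customers.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
]

result = right_join(customers, orders, "customer_id")
```

Here `result` keeps both orders: order 10 is paired with Ana’s row, while order 11 is paired with `None`. Ben has no orders, so he does not appear at all, which is the mirror image of what a LEFT JOIN from `customers` would return.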

Model Tuning & Feature Engineering using XGBoost

In this article I will be discussing data-preprocessing techniques, as an extension of my previous series of TensorFlow for beginners posts. The first stage in constructing an AI or machine learning model is to preprocess the data to ensure proper representation by the model. This stage is the most critical part which … Read more

From Research to Production: Containerized Training Jobs

So far, we have some training code that outputs a tested, robust ML model that we now want to somehow persist (or productionalize) and possibly deploy as a service. Further, let’s suppose that the model’s test performance meets our expectations and we conclude the research phase. For this demo, I’ll adopt the training procedure … Read more

Review: IDW-CNN — Learning from Image Descriptions in the Wild Dataset Boosts the Accuracy…

In this story, IDW-CNN, by Sun-Yat-sen University, The Chinese University of Hong Kong, SenseTime Group (Limited), is briefly reviewed. Segmentation accuracy is increased by learning from an Image Descriptions in the Wild (IDW) dataset. Unlike previous image captioning datasets, where captions were manually and densely annotated, images and their descriptions in IDW are automatically downloaded … Read more

Summer Break: A Look back… and ahead

It was November last year when I seriously started blogging and it is time to share with you some experiences and highlights before the summer break… so read on! The first thing that really surprised me (and still surprises me) is the popularity of my blog – and I say this without false modesty: when … Read more

Categories: R