Building Big Shiny Apps — A Workflow (1/2)
During rstudio::conf(2019L), I presented an e-poster called “Building Big Shiny Apps — A Workflow”. You can find the poster here, and this blog post is an attempt at a … Read more
As of late I’ve been reading a lot on entity embeddings after being tasked to work on a forecasting problem. The task at hand was to predict the salary of … Read more
For quite some time I had been meaning to play with Google Colab (yes, free access to GPU…). I think this is a really awesome initiative, which enables people with … Read more
Intuitive explanation of Degrees of Freedom and How Degrees of Freedom affects Sudoku. Source: Pixabay. A lot of aspiring Data Scientists take courses on statistics and get befuddled with … Read more
Matt.0, Jan 27. Photo by Pakata Goh on Unsplash. Clustering is one of the most common unsupervised machine learning problems. Similarity between observations is defined using some inter-observation distance measures or … Read more
A higher-order function is a function that takes one or more functions as arguments and/or returns a function as its result. This can be super handy in programming when you … Read more
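The definition above can be illustrated with a minimal Python sketch. The names `apply_twice` and `make_adder` are hypothetical, chosen only for illustration; they are not taken from the post, which may well use a different language.

```python
from typing import Callable


def apply_twice(f: Callable[[int], int], x: int) -> int:
    """Higher-order: takes a function as an argument and applies it twice."""
    return f(f(x))


def make_adder(n: int) -> Callable[[int], int]:
    """Higher-order: returns a new function that adds n to its input."""
    def add(x: int) -> int:
        return x + n
    return add


# make_adder returns a function; apply_twice receives one as an argument.
add_three = make_adder(3)
result = apply_twice(add_three, 10)  # (10 + 3) + 3

# The built-in map is also higher-order: it takes a function and an iterable.
squares = list(map(lambda v: v * v, [1, 2, 3]))
```

Both flavours of the definition appear here: `apply_twice` and `map` take a function as input, while `make_adder` produces one as output.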
PCA & Linear Algebra (Advance). Photo by Antoine Dautry. “You can’t build a great building on a weak foundation.” This quote truly justifies what I am trying to do here: you cannot learn … Read more
Lessons learned from a hate speech detection task to improve supervised NLP models Note: this post is mainly targeted at an audience unfamiliar with Natural Language Processing and will hence cover … Read more
At work on Friday, I was trying to figure out the best way to display some rank data. What I had were rankings from 1-5 for 10 factors considered most … Read more
Learning to Drive in Minutes — The Updated Approach. Although the Wayve.ai technique may work in principle, it has some issues that need to be addressed to apply it to a self-driving RC car. … Read more
The emerging AI market model is dominated by tech giants such as Google, Amazon and Microsoft, who offer cloud-based AI solutions and APIs. This model offers users little control over … Read more
A modularized notebook to tune and compare 11 regression algorithms with minimal coding in a control panel fashion. This article summarizes and explains key modules of my regression block (One … Read more
A peek into the design, training, loss functions and arithmetic behind GANs Let’s say we have a dataset of images of bedrooms and an image classifier CNN that was trained on … Read more
Machine Learning ~ Applied Mathematics https://bit.ly/2Wns7eN Roadmap Goal: First and foremost machine learning carries with it this connotation that it is extremely complex. While it is mathematically rigorous it is … Read more
[source: https://tensorflow.org] Like most machine learning libraries, TensorFlow is “concept-heavy and code-lite”. The syntax is not very difficult to learn. But it is very important to understand its concepts. What … Read more
To be honest, I planned on writing a review of this past weekend’s rstudio::conf 2019, but several other people have already done a great job of doing that; just check out Karl Broman’s aggregation … Read more
At a high level, a Markov Decision Process (MDP) is a type of mathematical model that is very useful for machine learning, and for reinforcement learning in particular. The model allows … Read more
I am working on moderately large financial price data sets. By moderately large I mean less than 4 million rows per asset. 4 million rows can cover the last 20 … Read more
Part 2: Applying Language Models to Real Data Data Source and Pre-Processing For this demonstration, we will be using the IMDB large movie review dataset made available by Stanford. The data … Read more
A summarized view of the challenges in implementing recommender systems from an industry point of view. Most of the time, data science projects stop at achieving some satisfactory accuracy based on … Read more
Logistic Regression Prediction A basic logistic regression using demographic variables can correctly predict about 60% of zip codes that have a Starbucks and 90% of those that don’t. Given the … Read more
I was starting a project where I had to quickly check if a package, Flask, worked with the Python installed on my machine. As I ran the command to install … Read more
The seventh release of littler as a CRAN package is now available, following in the now more than twelve-year history as a package started by Jeff in 2006, and joined … Read more
From “alpha” to “z-score” TL;DR — Jump to glossary Why So Weary? When I try to read about statistics I get mired in the jargon. Even just moving past the phrase, “For a given … Read more
This tutorial builds artificial neural network in Python using NumPy from scratch in order to do an image classification application for the Fruits360 dataset. Everything (i.e. images and source codes) … Read more
If you are looking for an interactive way to run your Python script, say you want to start a machine learning project with a couple of friends, look no further — Google … Read more
And how to create your own bag of words classifier With the nifty extractive text summarizer we created in Part 1, we were able to take news articles and cut … Read more
And how to create your own extractive text summarizer Imagine you are given an assignment from school or work that involves A LOT of research. You spend all night grinding … Read more
Know Your Medium Post Better with Data Origin As a regular writer on Medium as well as a data geek, after the busy year of 2018, I’d like to reflect what … Read more
The ggraph package provides a ggplot-like grammar for plotting graphs and as such you can produce very neat network visualizations. But as with ggplot, it takes a while to get used to … Read more
Improving productivity of a hardware design and manufacturing professional with an advanced AI tool Authors: Partha Deka and Rohit Mittal What is a citizen data scientist? Expert data scientists rely … Read more
Optimization of arbitrary functions on Cloud ML Engine Google Cloud ML Engine offers a hyper-parameter tuning service that uses Bayesian methods. It is not restricted to TensorFlow or scikit-learn. In fact, … Read more
Welcome to part 2 of Creating an AI for GameBoy! If you missed Part 1: Coding a Controller, click here to catch up. In this edition, I will be going … Read more
Probability Distribution Functions A probability distribution is a function that describes the likelihood of an event or outcome. We will now delve into the different types of distributions, in terms … Read more
Let’s Specialize and Break it Up! I would not be opposed to downplaying the term “data science” and breaking it up into specialized disciplines. Do not misunderstand, I think the global … Read more
How It Works. Autonomous cars are vehicles that drive without a user. At its core, an autonomous car is a machine that takes in inputs from various sensors and cameras, interprets the inputs, and … Read more
Monte Carlo simulations can be used to simulate games at a casino (Pic courtesy of Pawel Biernacki). This is the first of a three-part series on learning to do Monte … Read more
This article will explain the experiments and theory behind an interesting paper that converts natural language text descriptions such as “A small bird has a short, point orange beak and … Read more
We are pleased to announce the very first Satrday event in Newcastle upon Tyne (and England). satRdays Newcastle is a one-day, low-cost, community organised R conference in the heart of … Read more
In the previous article I shared my little toolbox for data cleaning after realizing that some code is applicable to the most common scenarios of messy data. In other words, … Read more
A guide for managers and business owners. ‘Big data’ is all the rage right now. It is one of those phrases that has for a number of years enjoyed a comfortable … Read more
The Facts on Public Water Fluoridation With any health topic, especially one that has attracted controversy, we must be careful about where we get our data. Even studies in peer-reviewed … Read more
Noise Reduction. Since the mathematics involved behind the scenes is mainly based on derivatives (cf. Step 2: Gradient calculation), edge detection results are highly sensitive to image noise. One way … Read more
Attn: Illustrated Attention, by Raimi Karim. For decades, Statistical Machine Translation has been the dominant translation model, until the birth of Neural Machine Translation (NMT). NMT is an … Read more
A serious, decades-long attempt to understand different peoples’ values David Hood (@Thoughfulnz) has been posting some interesting snippets of analysis using the World Values Survey data (like this example). This … Read more
Hannah Fry, in her book “Hello World”, talks of how computers can be programmed to mimic music, nearly perfectly. A program was written that perfectly mimicked Bach’s musical lexicon, right down … Read more
One of the most frustrating things about Tell Your Children, the anti-marijuana tract from former New York Times reporter (and spy novelist!) Alex Berenson, is that its most interesting points … Read more
Last year we gave you a price break down of some of the most popular R conferences around the globe for 2017. We’re going to do it again for 2018. … Read more
Authors: David Flórez Fernández, Data and AI Solution Architect @ Microsoft; Pablo Peris, Digital Architect @ Microsoft. Companies struggle to thrive with Analytics projects. In the present days of data accumulation there … Read more
Introduction Data can take on a variety of forms. For processing visual information, images are extremely common. Images store a two-dimensional grid of pixels that often represent our three-dimensional world. … Read more