Quick Hit: Using seymour to Subscribe to your Git[la|hu]b Repo Issues in Feedly

The seymour Feedly API package has been updated to support subscribing to RSS/Atom feeds. Previously the package was intended to just treat your Feedly as a data source, but there was a compelling use case for enabling subscription support: subscribing to code repository issues. Sure, there’s already email notice integration for repository issues on most …

A Dog Detector and Breed Classifier

In a field like physics, things keep getting harder, to the point that it’s very difficult to understand what’s going on at the cutting edge unless it’s in highly simplified terms. In computer science though, and artificial intelligence in particular, knowledge built up slowly over 70+ years by people all over the world is still …

Build a Pipeline for Harvesting Medium Top Author Data

Nuts and Bolts: One key requirement was to make deployment of my Luigi workflow very simple. I wanted to assume only one thing about the deployment environment: that the Docker daemon would be available. With Docker, I wouldn’t need to be concerned with Python version mismatches or other environmental discrepancies. It took me a little while …

New R package: load and chart oceanic storms

Mapping historical storms data is now a little bit easier. Off the back of this blog, I have authored an R package (available at basilesimon/noaastorms) that downloads, cleans and parses NOAA IBtrack data for you. The National Oceanic and Atmospheric Administration releases datasets known as International Best Track Archive for Climate Stewardship. These datasets are …

Time Travel with RStudio Package Manager 1.0.4

We all love packages. We don’t love when broken package environments prevent us from reproducing our work. In version 1.0.4 of RStudio Package Manager, individuals and teams can navigate through repository checkpoints, making it easy to recreate environments and reproduce work. The new release also adds important security updates, improvements for Git sources, further access to retired packages, and beta …

December 2018: “Top 40” New CRAN Packages

By my count, 157 new packages stuck to CRAN in December. Below are my “Top 40” picks in ten categories: Computational Methods, Data, Finance, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities and Visualization. This is the first time I have used the Medicine category. I am pleased that a few packages that appear to …

Power BI

Using Power BI and R. Tutorial here: Run R scripts in Power BI Desktop. The only twist that I want to add is an idea on how to enable users without admin access to run R code. This can be achieved by storing a portable R installation on mountable file storage. R Download the …

How Does Back-Propagation in Artificial Neural Networks Work?

Our Neural Network Let’s finally draw a diagram of our long-awaited neural net. It should look something like this: The leftmost layer is the input layer, which takes X0 as the bias term of value 1, and X1 and X2 as input features. The layer in the middle is the first hidden layer, which also takes …

Pix2Pix

Shocking result of Edges-to-Photo Image-to-Image translation using the Pix2Pix GAN Algorithm. This article will explain the fundamental mechanisms of a popular paper on Image-to-Image translation with Conditional GANs, Pix2Pix. Following is a link to the paper: Article Outline: I. Introduction II. Dual Objective Function with Adversarial and L1 Loss III. U-Net Generator IV. PatchGAN Discriminator …

Probability — Fundamentals of Machine Learning (Part 1)

The Mathematics of Probability In the beginning, I suggested that probability theory is a mathematical framework. As with any mathematical framework, there are some vocabulary and important axioms needed to fully leverage the theory as a tool for machine learning. Probability is all about the possibility of various outcomes. The set of all possible outcomes …

Why visual literacy is essential to good data visualization

We know data literacy matters. But visual literacy matters too. Here’s why. Photo by Markus Spiske on Unsplash Data is all around us, and the way people work has changed because of it. Companies are now investing more in roles like Chief Data Officer, building their data science teams and talking about things like “data literacy” in …

Correlation analysis of cyclically adjusted valuation measures and subsequent returns

In this post we’ll test three different cyclically-adjusted valuation measures: CAPE (earnings), CAPD (dividends) and CAPB (book value). CAPE is calculated like the P/E ratio, but by dividing the current real price by the last ten years’ average inflation-adjusted earnings. CAPD uses dividends instead of earnings, and CAPB uses book value. We’ll test the optimal …
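The CAPE definition in this excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the post's own code: the function name `cyclically_adjusted_ratio`, its parameters, and the sample numbers are hypothetical, and the inputs are assumed to be already inflation-adjusted, one value per year.

```python
from typing import Sequence


def cyclically_adjusted_ratio(real_price: float,
                              real_fundamentals: Sequence[float],
                              window: int = 10) -> float:
    """Divide the current real price by the trailing average of a
    real (inflation-adjusted) per-share fundamental.

    Passing earnings gives a CAPE-style ratio; passing dividends or
    book value gives the CAPD- and CAPB-style variants.
    """
    if len(real_fundamentals) < window:
        raise ValueError("need at least `window` yearly observations")
    recent = real_fundamentals[-window:]   # the last `window` years
    average = sum(recent) / window         # trailing average
    if average <= 0:
        raise ValueError("trailing average must be positive")
    return real_price / average


# Hypothetical numbers: a real price of 100 and flat real earnings of 5.
example_cape = cyclically_adjusted_ratio(100.0, [5.0] * 10)
```

Using one function for all three measures reflects the excerpt's description that CAPD and CAPB differ from CAPE only in the fundamental that is averaged.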

Master Python through building real-world applications (Part 7)

Data Collector Web App with PostgreSQL and Flask Working with databases and queries can be pretty daunting to some, or maybe most of us. Perhaps, I have lost 100 readers by now just because there’s PostgreSQL written in the subtitle. But as you are here, I want you to know that it is an important thing …

Using Travis-CI to Create R-bloggers for Taiwan

R-bloggers.com is a great platform for R users, but I sometimes feel awkward publishing posts on R-bloggers when I have things to share that are only relevant to users in Taiwan. Inspired by R-bloggers, I thought maybe I could use Travis-CI and GitHub to create a blog that automatically updates its posts by retrieving …

A.I enhanced molecular discovery and optimization

Awesome! But how do we get there? Researchers at the forefront of their fields have been trying to use the existing tools we have on hand to solve this problem. There is a pattern in the modus operandi of current research, and the same general process applies to any A.I based science project. Researchers are carpenters, …

Hash Me If You Can

We are living in the era of Big Data, but the problem, of course, is that the bigger our data sets become, the slower even simple search operations get. I will now show you a trick that is the next best thing to magic: building a search function that practically doesn’t slow down even for …

Everything you need to know about Scatter Plots for Data Visualisation

If you’re a Data Scientist there’s no doubt that you’ve worked with scatter plots before. Despite their simplicity, scatter plots are a powerful tool for visualising data. There are a lot of options, flexibility, and representational power that come with the simple change of a few parameters like color, size, shape, and regression plotting. Here you’ll …

Making Music with Machine Learning

Image from https://www.maxpixel.net/Circle-Structure-Music-Points-Clef-Pattern-Heart-1790837 Music is not just an art, music is an expression of the human condition. When an artist is making a song you can often hear the emotions, experiences, and energy they have in that moment. Music connects people all over the world and is shared across cultures. So there is no way …

Exploration of the Social News TV: The Communication Behavior of #ajnewsgrid

What is NewsGrid? NewsGrid is a young news program broadcast globally by Al Jazeera since 2016. It is Al Jazeera’s first interactive news hour. The show is produced in three parts: top stories of the day presented by one presenter, stories that create a huge social reaction on Twitter presented by a social media presenter, and the …

Hey, Who Moved the Goalposts?

Part of the “10 reasons why Software Development projects fail” series. The most successful software development projects have a timeline and a series of milestones to accomplish that project within a set period of time. Those milestones are critical, because they help to divide a large project into a series of much smaller projects, and they …

Interacting with The Demographic and Health Surveys (DHS) Program data

There seem to be a lot of ways to write about your R package, and rather than have to decide on what to focus on I thought I’d write a little bit about everything. To begin with I thought it best to describe what problem rdhs tries to solve, why it was developed and how I came to …

Watch if R is running from Shiny

Today I discovered that the tag of a Shiny App gets the shiny-busy class when computation is done in the R process. Which means that you can potentially watch with JavaScript if the R process is running. TIL: Shiny Apps switch to the ‘shiny-busy’ class when R is performing computation in the background. So it’s basically possible to use JavaScript …

Playing Around with Phyllotactic Spirals

I wanted to figure out how to create a GIF animation using magick, so I decided I’ll try that out with ggplot2 spiral art. Loading up packages I’m definitely in love with “magick” right now. library(tidyverse) ## for pretty much everything… library(magick) ## I’m now a magick fan!!! library(scales) ## Handy when it comes …

Binary Tree: The Diameter.

Dynamic programming sequences sub-problems together, having each sub-problem lead to the solution. With dynamic programming we no longer have to visit paths we’ve been down before; instead we can prune the shorter branches and track the diameter at each step. With the dynamic approach the algorithm travels down the tree and counts lengths on the …
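One common way to realise the idea in this excerpt is a single post-order traversal that returns each subtree's height while tracking the best diameter seen so far. The sketch below is an assumption about the general technique, not the post's own code: the `Node` class and the `diameter` function are hypothetical names, and the diameter is measured in edges.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def diameter(root: Optional[Node]) -> int:
    """Return the longest path between any two nodes, counted in edges."""
    best = 0

    def height(node: Optional[Node]) -> int:
        nonlocal best
        if node is None:
            return 0
        left = height(node.left)       # tallest branch on the left
        right = height(node.right)     # tallest branch on the right
        # The longest path passing through this node joins both branches.
        best = max(best, left + right)
        # Only the taller branch can extend a path through the parent.
        return 1 + max(left, right)

    height(root)
    return best


# Example tree: node 1 has children 2 and 3; node 2 has children 4 and 5.
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
tree_diameter = diameter(tree)
```

Each node is visited once, so the shorter branch at every step is discarded rather than explored again.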

Revisiting Adam Smith’s Invisible Hand in the Data Economy

Fundamental paradigms of the free market should also be scrutinized by data science. An unobservable market force that helps the demand and supply of goods in a free market to reach equilibrium automatically and efficiently is what we call the invisible hand. But I am a data scientist, I don’t deal in unobservable forces, no observations …

Set Theory — Cardinality & Power Sets

With basic notation & operations cleared in articles one & two in this series, we’ve now built a fundamental understanding of Set Theory. This third article further compounds this knowledge by zeroing in on the most important property of any given set: the total number of unique elements it contains. Also known as the cardinality, …
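The two ideas named in this entry's title can be illustrated briefly in Python. This is a minimal sketch, not taken from the article: the helper names `cardinality` and `power_set` are hypothetical, and it relies only on the standard fact that a set with n elements has 2^n subsets.

```python
from itertools import combinations
from typing import Hashable, Iterable, List


def cardinality(items: Iterable[Hashable]) -> int:
    """Count the unique elements; duplicates do not add to the total."""
    return len(set(items))


def power_set(items: Iterable[Hashable]) -> List[frozenset]:
    """Return every subset of the unique elements, including the empty set."""
    unique = list(dict.fromkeys(items))   # drop duplicates, keep first-seen order
    return [
        frozenset(combo)
        for size in range(len(unique) + 1)
        for combo in combinations(unique, size)
    ]


# A collection with a repeated element still has cardinality 3,
# and its power set has 2 ** 3 subsets.
sample = ["a", "b", "b", "c"]
sample_cardinality = cardinality(sample)
sample_subsets = power_set(sample)
```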

Fast, static D3 maps built with Turf.js and the command-line

Combining Mike Bostock’s command-line cartography tutorial with the flexibility of Node.js Estimated percent of undocumented residents in U.S. metro areas. Source: Pew Research Center Recently, I needed to build a handful of U.S. state bubble maps to be embedded in a story for San Antonio Express-News. I wanted to use D3 but was concerned about slow asset …

Introduction to Unsupervised Learning

Understand principal component analysis (PCA) and clustering methods Photo by Oscar Keys on Unsplash Unsupervised learning is a set of statistical tools for scenarios in which there is only a set of features and no targets. Therefore, we cannot make predictions, since there are no associated responses to each observation. Instead, we are interested in finding …

The Basics of Cryptography

With Applications in R. Have you ever wondered how companies securely store your passwords? Or how your credit card information is kept private when making online purchases? The answer is cryptography. The vast majority of internet sites now use some form of cryptography to ensure the privacy of their users. Even information such as emails …

Assessing NHL award winners using K-means

Data sets The final data-set used is a combination of traditional and advanced player metrics. Traditional statistics concern metrics like goals and assists (total being known as points), plus-minus, penalty minutes and time on ice, whilst advanced player metrics deal more with player behavior and puck possession. Using Python’s beautifulsoup library, I scraped more traditional …

Reinforcement Learning and Deep Reinforcement Learning with Tic Tac Toe

In this article I want to share my project on implementing reinforcement learning and deep reinforcement learning methods on a Tic Tac Toe game. The article contains: 1. Rigorous definition of the game as a Markov decision process. 2. How to implement the reinforcement learning method, called TD(0), to create an agent that plays the …

Software 2.0 — Playing with Neural Networks (Part 1)

In this article we are going to discuss neural networks (from scratch), the innovative concept that has taken the world by storm. I will assume that the reader is already familiar with the following concepts: Cost function (MSE and Cross Entropy), Gradient Descent, Logistic regression, Activation Function, Binary Classification. In particular, this article will try …

Problems in Estimating GARCH Parameters in R (Part 2; rugarch)

Now here is a blog post that has been sitting on the shelf far longer than it should have. Over a year ago I wrote an article about problems I was having when estimating the parameters of a GARCH(1,1) model in R. I documented the behavior of parameter estimates (with a focus on ) and …

Sentiment Analysis with Word Bags and Word Sequences

For generic text, word bag approaches are very efficient at text classification. For a binary text classification task studied here, LSTM working with word sequences is on par in quality with SVM using tf-idf vectors. But performance is a different matter… The bag-of-words approach to turning documents into numerical vectors ignores the sequence of words …

Visualising Machine Learning Datasets with Google’s FACETS.

Data Although you can work with data provided on the demo page, I shall be working with another set of data. I will be doing EDA with FACETS on the Loan Prediction Dataset. The problem statement is to predict whether an applicant who has been granted a loan by a company will repay it …

The Science Behind AlphaStar

How DeepMind Uses Reinforcement Learning to Beat Human Pros in StarCraft II Long term strategic planning has long been considered a unique quality of the human mind that would be very difficult to imitate by artificial intelligence (AI) agents. Conceptually, strategic thinking involves evaluating a large number of data points in the present in order to …

Main benefits of using a Chatbot for your business

I’m going to tell you about the future: messenger chatbots. What is a Messenger chatbot, and why is it crucial for your business? Basically, it’s a digital assistant, most of the time based on AI, that follows various commands in what looks like a natural-sounding conversation with your customers. Now you have the opportunity …