My presentations on ‘Elements of Neural Networks & Deep Learning’ -Parts 4,5

This is the next set of presentations on “Elements of Neural Networks and Deep Learning”.  In the 4th presentation I discuss and derive the generalized equations for a multi-unit, multi-layer Deep Learning network.  The 5th presentation derives the equations for a Deep Learning network when performing multi-class classification along with the derivations for cross-entropy loss. The corresponding … Read more

Categories R Tags ExcerptFavorite

splashr 0.6.0 Now Uses the CRAN-nascent stevedore Package for Docker Orchestration

The splashr package [srht|GL|GH] — an alternative to Selenium for javascript-enabled/browser-emulated web scraping — is now at version 0.6.0 (still in dev-mode but on its way to CRAN in the next 14 days). The major change from version 0.5.x (which never made it to CRAN) is a swap out of the reticulated docker package with … Read more

R Tip: Use Inline Operators For Legibility

R Tip: use inline operators for legibility. A Python feature I miss when working in R is the convenience of Python’s inline + operator. In Python, + does the right thing for some built-in data types: It concatenates lists: [1,2] + [3] is [1, 2, 3]. It concatenates strings: 'a' + 'b' is 'ab'. … Read more

ggeffects 0.8.0 now on CRAN: marginal effects for regression models #rstats

I’m happy to announce that version 0.8.0 of my ggeffects-package is on CRAN now. The update has fixed some bugs from the previous version and comes along with many new features or improvements. One major part that was addressed in the latest version is fixes and improvements for mixed models, especially zero-inflated mixed models (fitted … Read more

pcLasso: a new method for sparse regression

I’m excited to announce that my first package has been accepted to CRAN! The package pcLasso implements principal components lasso, a new method for sparse regression which I’ve developed with Rob Tibshirani and Jerry Friedman. In this post, I will give a brief overview of the method and some starter code. (For an in-depth description … Read more

rOpenSci’s new Code of Conduct

We are pleased to announce the release of our new Code of Conduct. rOpenSci’s community is our best asset and it’s important that we put strong mechanisms in place before we have to act on a report. As before, our Code applies equally to members of the rOpenSci team and to anyone from the community … Read more

R Coding Style Guide

Language is a tool that allows human beings to interact and communicate with each other. The clearer we express ourselves, the better the idea is transferred from our mind to the other. The same applies to programming languages: concise, clear and consistent codes are easier to read and edit. It is especially important, if you … Read more

colorspace: New Tools for Colors and Palettes

A major update (version 1.4.0) of the R package colorspace has been released to CRAN, enhancing many of the package’s capabilities, e.g., more refined palettes, named palettes, ggplot2 color scales, visualizations for assessing palettes, shiny and Tcl/Tk apps, color vision deficiency emulation, and much more. Overview The colorspace package provides a broad toolbox for selecting … Read more

Travis CI for R — Advanced guide

Continuous integration for building an R project in Travis CI, including code coverage, pkgdown documentation, OSX and multiple R versions. Travis CI is a common tool to build R packages. It is in my opinion the best platform to use R in continuous integration. Some of the … Read more

Showing a difference in means between two groups

Visualising a difference in mean between two groups isn’t as straightforward as it should be. After all, it’s probably the most common quantitative analysis in science. There are two obvious options: we can either plot the data from the two groups separately, or we can show the estimate of the difference with an interval around it. … Read more

Medium + r-bloggers — How to integrate?

Build up a PHP script that allows you to post your Medium articles on r-bloggers.com. The script filters an RSS feed by item tags. Motivation: I started my blog about R on Medium. Medium is a wonderful platform with a great user interface. The idea to … Read more

XmR Chart | Step-by-Step Guide by Hand and with R

Is your process in control? The XmR chart is a great statistical process control (SPC) tool that can help you answer this question, reduce waste, and increase productivity. We’ll cover the concepts behind XmR charting and explain the XmR control constant with some super simple R code. Lastly, we’ll cover how to make the XmR … Read more

Generating Synthetic Data Sets with ‘synthpop’ in R

Synthpop – A great music genre and an aptly named R package for synthesising population data. I recently came across this package while looking for an easy way to synthesise unit record data sets for public release. The goal is to generate a data set which contains no real units, therefore safe for public release … Read more

Making sense of the METS and ALTO XML standards

Last week I wrote a blog post where I analyzed one year of newspaper ads from 19th century newspapers. The data is made available by the national library of Luxembourg. In this blog post, which is part 1 of a 2 part series, I extract data from the 257 GB archive, which contains 10 years of publications of the L’Union, … Read more

Practical Data Science with R, 2nd Edition discount!

Please help share our news and this discount. The second edition of our best-selling book Practical Data Science with R2, Zumel, Mount is featured as deal of the day at Manning. The second edition isn’t finished yet, but chapters 1 through 4 are available in the Manning Early Access Program (MEAP), and we have finished … Read more

10 years of playback history on Last.FM: “Just sit back and listen”

Alright, seems like this is developing into a blog where I am increasingly investigating my own music listening habits. Recently, I’ve come across the analyzelastfm package by Sebastian Wolf. I used it to download my complete listening history from Last.FM for the last ten years. That’s a complete dataset from 2009 to 2018 with exactly 65,356 … Read more

How to combine Multiple ggplot Plots to make Publication-ready Plots

The life cycle of Data science can never be completed without communicating the results of the analysis/research. In fact, Data Visualization is one of the areas where R as a language for Data science has got an edge over the most-celebrated Python. With ggplot2 … Read more

GetDFPData Ver 1.4

I just released a major update to package GetDFPData. Here are the main changes: Naming conventions for caching system are improved so that it reflects different versions of FRE and DFP files. This means the old caching system no longer works. If you have built yourself your own cache folder with many companies, do clean … Read more

Parallelize a For-Loop by Rewriting it as an Lapply Call

A commonly asked question in the R community is: How can I parallelize the following for-loop? The answer almost always involves rewriting the for (…) { … } loop into something that looks like a y <- lapply(…) call. If you can achieve that, you can parallelize it via for instance y <- future.apply::future_lapply(…) or … Read more
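
The rewrite described in this excerpt can be sketched as follows. This is a minimal illustration, not code from the post: `xs` and `slow_square()` are hypothetical stand-ins for the real inputs and the real per-iteration work, and the parallel call is left as a comment so the block runs with base R alone.

```r
# Hypothetical inputs and per-element work (not from the original post).
xs <- 1:5
slow_square <- function(x) x^2

# Original style: a for-loop that fills a preallocated list.
y_loop <- vector("list", length(xs))
for (i in seq_along(xs)) {
  y_loop[[i]] <- slow_square(xs[[i]])
}

# Rewritten style: each iteration becomes one function call, so the loop
# body no longer depends on shared state between iterations.
y_lapply <- lapply(xs, slow_square)

# Both forms produce the same list.
stopifnot(identical(y_loop, y_lapply))

# Once the code has this shape, the excerpt's suggestion is a one-line swap,
# assuming the future.apply package is installed and a plan has been set:
#   future::plan(future::multisession)
#   y_parallel <- future.apply::future_lapply(xs, slow_square)
```

The rewrite only works cleanly when iterations are independent of one another; a loop that reads results from earlier iterations needs restructuring first.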

R Tip: Use seqi() For Indexes

R Tip: use seqi() for indexing. R’s “1:0 trap” is a mal-feature that confuses newcomers and is a reliable source of bugs. This note will show how to use seqi() to write more reliable code and document intent. The issue is, contrary to expectations (formed in working with other programming languages) the sequence 1:0 is … Read more
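
A small sketch of the trap, using only base R. Note the assumption: it uses `seq_len()` as a stand-in safe index generator rather than the `seqi()` function the post recommends, so it illustrates the problem and not the post's exact remedy.

```r
n <- 0  # e.g. the length of an empty input

# The trap: 1:n with n = 0 is the descending sequence 1, 0,
# so a loop intended to run zero times runs twice.
trap_iterations <- 0
for (i in 1:n) {
  trap_iterations <- trap_iterations + 1
}

# seq_len(0) is an empty integer vector, so the loop body never runs.
safe_iterations <- 0
for (i in seq_len(n)) {
  safe_iterations <- safe_iterations + 1
}

print(trap_iterations)
print(safe_iterations)
```

With `n <- 0`, the first loop executes twice and the second not at all, which is why index generators that return an empty sequence are the safer default.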

pinp 0.0.7: More small YAML options

A good six months after the previous release, another small feature release of our pinp package for snazzier one or two column Markdown-based pdf vignettes got onto CRAN minutes ago as another [CRAN-pretest-publish] release indicating a fully automated process (as can be done for packages free of NOTES, WARNING, ERRORS, and without ‘changes to worse’ … Read more

Add a static pdf vignette to an R package

Most vignettes are built when a package is built, but there are occasions where you just want to include a pdf. For example when you want to include a paper. Of course there is a package supporting this, but in this post I will show you how to do it yourself with ease. The idea … Read more

epubr 0.6.0 CRAN release

The epubr R package provides functions supporting the reading and parsing of internal e-book content from EPUB files. It has been updated to v0.6.0 on CRAN. This post highlights new functionality. The key improvements focus on cases where EPUB files have poorly arranged text when loaded into R as a result of their metadata entries … Read more

Visualizing the Asian Cup with R!

Another year, another big soccer/football tournament! This time it’s the top international competition in Asia, the Asian Cup hosted in the U.A.E. In this blog post I’ll be covering (responsible) web-scraping, data wrangling (tidyverse FTW!), and of course, data visualization with ggplot2. Let’s get started! Packages pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, glue, extrafont, rvest, ggtextures, cowplot, ggimage, … Read more

Roll Your Own Federal Government Shutdown-caused SSL Certificate Expiration Monitor in R

By now, even remote villages on uncharted islands in the Pacific know that the U.S. is in the midst of a protracted partial government shutdown. It’s having real impacts on the lives of Federal government workers but they aren’t the only ones. Much of the interaction Federal agencies have with the populace takes place online … Read more

Who is the greatest finisher in soccer?

It’s relatively easy to find the player who has scored the most goals in the last 12 years (hello, Lionel Messi). But which professional football (soccer) player is the best finisher, i.e. which player is most likely to put a shot they take into the goal? You can’t simply use the conversion rate (the ratio … Read more

Waffle Geoms & Other Miscellaneous In-Development Package Updates

More than just sergeant has been hacked on recently, so here’s a run-down of various updates: waffle The square pie chart generating waffle package now contains a nascent geom_waffle() so you can do things like this: library(hrbrthemes) library(waffle) library(tidyverse) tibble( parts = factor(rep(month.abb[1:3], 3), levels=month.abb[1:3]), values = c(10, 20, 30, 6, 14, 40, 30, 20, … Read more

Linguistic Signals of Album Quality: A Predictive Analysis of Pitchfork Review Scores Using Quanteda

In this post we will return to the Pitchfork music review data, parts of which I’ve analyzed in previous posts. Our goal here will be to use text mining and natural language processing (NLP) to understand linguistic signals of album quality. This type of analysis helps us understand what Pitchfork reviewers appreciate or dislike, and … Read more

Rcpp, Camarón de la Isla and the Beauty of Maths

Desde que te estoy queriendo / yo no sé lo que me pasa / cualquier vereda que tomo / siempre me lleva a tu casa (Y mira que mira y mira, Camarón de la Isla) The verses that head this post are taken from a song of Camarón de la Isla and illustrate very well what is a strange attractor in the … Read more

My presentations on ‘Elements of Neural Networks & Deep Learning’ -Part1,2,3

I will be uploading a series of presentations on ‘Elements of Neural Networks and Deep Learning’. In these video presentations I discuss the derivations of L-Layer Deep Learning Networks, starting from the basics. The corresponding implementations in vectorized R, Python and Octave are available in my book ‘Deep Learning from first principles: Second … Read more

baRcodeR 0.1.2 release – new linear barcodes

baRcodeR 0.1.2 is released on CRAN today! Download and install by install.packages("baRcodeR") Example linear barcode The major feature of this release is the ability to print linear (a.k.a. normal) barcodes through specifying type = "linear" in create_PDF() rather than type = "matrix" which prints the usual QR code. The github repository is at yihanwu/baRcodeR. Minor … Read more

A Look Back on 2018: Part 1

Welcome to Reproducible Finance 2019! It’s a new year, a new beginning, the Earth has completed one more trip around the sun, and that means it’s time to look back on the previous January to December cycle. Today and next time, we’ll explore the returns and volatilities of various market sectors in 2018. We might … Read more

vitae: Dynamic CVs with R Markdown

Why vitae? The process of maintaining a CV can be tedious. It’s a task I often forget about – that is until someone requests it and I find that my latest is woefully out of date. To make matters worse, these professional updates often need repeating across a variety of sites (such as ORCID and LinkedIn). … Read more

Considering sensitivity to unmeasured confounding: part 2

In part 1 of this 2-part series, I introduced the notion of sensitivity to unmeasured confounding in the context of an observational data analysis. I argued that an estimate of an association between an observed exposure \(D\) and outcome \(Y\) is sensitive to unmeasured confounding if we can conceive of a reasonable alternative data generating … Read more

Updated Review: jamovi User Interface to R

Introduction jamovi (spelled with a lower-case “j”) is a free and open source graphical user interface for the R software that targets beginners looking to point-and-click their way through analyses. It is available for Windows, Mac, Linux, and even ChromeOS. Versions are also planned for servers and tablets. This post is one of a series of reviews which … Read more

On the Road to 0.8.0 — Some Additional New Features Coming in the sergeant Package

It was probably not difficult to discern from my previous Drill-themed post that I’m fairly excited about the Apache Drill 1.15.0 release. I’ve rounded out most of the existing corners for it in preparation for a long-overdue CRAN update and have been concentrating on two helper features: configuring & launching Drill embedded Docker containers and … Read more

R NewYorkers Feeling the Holiday Spirit? Here’s Your Tip

Combining Pivot Billions with R to dive into whether the holiday spirit inspires bigger tips and which parts of New York experience this effect the most. The holiday season brings with it a degree of cheer and joy that many claim makes people act friendlier towards each other. I wanted to see how this effect … Read more

Animating Data Transformations: Part II

In our previous series on Animating Data Transformations, we showed you how to use gganimate to construct an animation which illustrates the process of going between tall and wide representations of data. Today, we will show the same procedure for constructing an animation of the unnest() function. The unnest() function takes a tibble containing a … Read more

An Introduction to R— Merging and filtering data— Part 1

Data understanding by filtering and merging the 2019 Australian Tennis Open data for the Men’s tour. You know it’s summer when the Australian Tennis Open visits Melbourne and everyone is excited that Roger and Serena are in town. Problem: I am interested in predicting who might win the 2019 Australian … Read more

Understanding the maths of Computed Tomography (CT) scans

Noseman is having a headache and as an old-school hypochondriac he goes to see his doctor. His doctor is quite worried and makes an appointment with a radiologist for Noseman to get a CT scan. Because Noseman always wants to know how things work he asks the radiologist about the … Read more

A deep dive into glmnet: offset

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, we will look at the offset option. For reference, here is the full signature of the glmnet function: glmnet(x, y, family=c("gaussian","binomial","poisson","multinomial","cox","mgaussian"), … Read more

Dow Jones Stock Market Index (4/4): Trade Volume GARCH Model

This is the final part of the 4-series posts. In this fourth post, I am going to build an ARMA-GARCH model for Dow Jones Industrial Average (DJIA) daily trade volume log ratio. You can read the other three parts in the following links: part 1, … Read more

An even better rOpenSci website with Hugo

A bit more than one year ago, rOpenSci launched its new website design, by the designer Maru Lango. Not only did the website appearance change (for the better!), but the underlying framework too. ropensci.org is powered by Hugo, like blogdown! Over the last few months, we’ve made the best of this framework, hopefully improving your … Read more

How do Convolutional Neural Nets (CNNs) learn? + Keras example

In this lesson, I am going to explain how computers learn to see; meaning, how do they learn to recognize images or object on images? One of the most commonly used approaches to teach computers “vision” are Convolutional Neural Nets. This lesson builds on top of two other lessons: Computer Vision Basics and Neural Nets. … Read more

You did a sentiment analysis with tidytext but you forgot to do dependency parsing to answer WHY is something positive/negative

A small note on the growing list of users of the udpipe R package. In the last month of 2018, we’ve updated the package on CRAN with some noticeable changes. The default models downloaded with the function udpipe_download_model are now models built on Universal Dependencies 2.3 (released on 2018-11-15). This means udpipe … Read more

French Baccalaureate Results

I. Context The French Baccalaureate (BAC) is the final exam all French students must pass to graduate from high school. Not only is it necessary to graduate, but a student’s performance on the BAC is the American equivalent to one’s performance on the ACT/SAT for college applications. As I am myself a product of the … Read more

Analysis of South African Funds

Packages used in this post Disclaimer: I am no financial advisor, have never been and you should not take any of this analysis as investment advice. These thoughts are my own, please don’t mail me about your money strategies/problems. I enjoy numbers, scraping and data analysis and that is what this post is about. Also, … Read more
