Hash Me If You Can

We are living in the era of Big Data but the problem of course is that the bigger our data sets become the slower even simple search operations get. I will now show you a trick that is the next best thing to magic: building a search function that practically doesn’t slow down even for … Read more
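The excerpt is cut off before the trick is shown, so the following is only a minimal sketch of one common way to get lookups that barely slow down as data grows: storing key-value pairs in a hashed R environment. The names `keys`, `values`, `linear_lookup` and `hashed_lookup` are hypothetical and chosen for illustration; this is not necessarily the approach the post itself takes.

```r
# Hypothetical data: 10,000 keys, each mapped to a numeric value
n <- 10000L
keys <- paste0("id_", seq_len(n))
values <- seq_len(n) * 2

# Baseline: a linear scan compares the key against every element
linear_lookup <- function(key) values[which(keys == key)[1]]

# Hashed alternative: an environment is backed by a hash table,
# so lookup time is roughly independent of the number of keys
hash_table <- list2env(setNames(as.list(values), keys),
                       envir = new.env(hash = TRUE, size = n))
hashed_lookup <- function(key) get(key, envir = hash_table, inherits = FALSE)

linear_lookup("id_9999")
hashed_lookup("id_9999")
```

Both calls return the same value; the difference only shows up in timing as `n` grows, which can be checked with `system.time()` on repeated lookups.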

Categories R Tags ExcerptFavorite

benchmarkme: new version

When discussing how to speed up slow R code, my first question is what is your computer spec? It’s always surprised me that people are wondering why analysing big data is slow, yet they are using a five-year-old cheap laptop. Spending a few thousand pounds would often make their problems disappear. To quantify the impact … Read more

Web Scraping Google Sheets with RSelenium

Photo by freestocks.org on Unsplash. I love to learn new things, and one of the ways I learn best is by doing. Also, it’s been said that you never fully understand a topic until you are able to explain it; I think blogging is a low barrier to explaining things. Someone I met at a local data … Read more

Forecast Framework Demo

(This article was first published on – R, and kindly contributed to R-bloggers) Want to learn how to do some forecasting with R? Here’s your chance to try out a new time-series forecasting package for R whose aim is to standardize and simplify the process of making and evaluating forecasts! The Reich Lab uses an … Read more

Watch if R is running from Shiny

Today I discovered that the tag of a Shiny App gets the shiny-busy class when computation is done in the R process. Which means that you can potentially watch with JavaScript if the R process is running. TIL: Shiny Apps switch to the ‘shiny-busy’ class when R is performing computation in the background. So it’s basically possible to use JavaScript … Read more

Playing Around with Phyllotactic Spirals

I wanted to figure out how to create a gif animation using magick, so I decided to try it out with ggplot2 spiral art. Loading up packages I’m definitely in love with “magick” right now. library(tidyverse) ## for pretty much everything… library(magick) ## I’m now a magick fan!!! library(scales) ## Handy when it comes … Read more

The Basics of Cryptography

With Applications in R. Have you ever wondered how companies securely store your passwords? Or how your credit card information is kept private when making online purchases? The answer is cryptography. The vast majority of internet sites now use some form of cryptography to ensure the privacy of their users. Even information such as emails … Read more

Scaling H2O analytics with AWS and p(f)urrr (Part 3)

This is the final installment of a three part series that looks at how we can leverage AWS, H2O and purrr in R to build analytical pipelines. In the previous posts I looked at starting up the environment through the EC2 dashboard on AWS’ website. The other aspect we looked at, in Part II, was … Read more

SatRday LA – R Conference Announcement

Come and have fun with local useRs at SatRday LA on April 6, 2019. This is the first SatRday in LA, and the second in the states. If you have not heard of SatRday, it is a one-day affordable, inclusive, non-profit R conference organized by local R users. Important Details Who: All levels of R users When: April … Read more

Graphing My Daily Phone Use

How many times do I look at my phone? I set up a small program on my phone to count the screen activations and logged to a file. In this post I show what went wrong and how to plot the results. The data I set up a small program on my phone that counts every day … Read more

Mathematical Notation in Online R/exams

Many R/exams exercises employ mathematical notation that needs to be converted and rendered suitably for inclusion in online exams. While R/exams attempts to set suitable defaults, an overview is provided of possible adjustments and when these might be useful or even necessary. Overview A popular use case of the R/exams package is the generation of … Read more

Building Big Shiny Apps — A Workflow (1/2)

During the rstudio::conf(2019L), I’ve presented an eposter called “Building Big Shiny Apps — A Workflow”. You can find the poster here, and this blog post is an attempt at a transcription of what I’ve been talking about while presenting the poster. As this is a rather long topic, I’ve divided this post into two parts: … Read more

R tips and tricks – higher-order functions

A higher-order function is a function that takes one or more functions as arguments, and/or returns a function as its result. This can be super handy in programming when you want to tilt your code towards readability and still keep it concise. Consider the following code: # Generate some fake data > eps <- rnorm(10, sd= … Read more
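Since the post's own code is truncated in this excerpt, here is a small sketch of both halves of the definition. The helper names `apply_twice` and `make_power` are hypothetical and are not taken from the post.

```r
# Takes a function as an argument: applies f to x, then to the result
apply_twice <- function(f, x) f(f(x))

# Returns a function as its result: a closure that remembers p
make_power <- function(p) function(x) x^p

square <- make_power(2)

apply_twice(square, 3)   # (3^2)^2 = 81
sapply(1:3, square)      # base R's sapply is itself higher-order: 1 4 9
```

Base R functions such as `sapply()`, `Map()` and `Reduce()` follow the same pattern, which is why passing small functions around tends to replace explicit loops.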

Summarizing rstudio::conf 2019 Summaries with Tidy Text Techniques

To be honest, I planned on writing a review of this past weekend’s rstudio::conf 2019, but several other people have already done a great job of doing that; just check out Karl Broman’s aggregation of reviews at the bottom of the page here! (More on this in a second.) In short, my thoughts on the whole experience are captured perfectly by Nick Strayer’s … Read more

Analyzing and Predicting Starbucks’ Location Strategy

Logistic Regression Prediction A basic logistic regression using demographic variables can correctly predict about 60% of zip codes that have a Starbucks and 90% of those that don’t. Given the unbalanced nature of the data set — 31K observations and ~5,500 with a Starbucks — a 60% prediction rate should be sufficient for the purposes of this exercise. Our … Read more

littler 0.3.6: Two neat enhancements

The seventh release of littler as a CRAN package is now available, following in the now more than twelve-year history as a package started by Jeff in 2006, and joined by me a few weeks later. littler is the first command-line interface for R and predates Rscript. And it is (in my very biased eyes) … Read more

An Rstudio Addin for Network Analysis and Visualization

The ggraph package provides a ggplot-like grammar for plotting graphs and as such you can produce very neat network visualizations. But as with ggplot, it takes a while to get used to the grammar. There are already a few amazing Rstudio Addins that assist you with ggplot (for example ggplotAssist and ggThemeAssist), but there have not been any equivalent tools … Read more

satRdays Newcastle 2019 Conference is Here!

We are pleased to announce the very first satRday event in Newcastle upon Tyne (and England). satRdays Newcastle is a one-day, low-cost, community-organised R conference in the heart of Newcastle City Centre. Where? The event will be held at Newcastle University. Getting to Newcastle is really easy. Train: 90 minutes from Edinburgh or 3 … Read more

What the world agrees with by @ellis2013nz

A serious, decades-long attempt to understand different peoples’ values David Hood (@Thoughfulnz) has been posting some interesting snippets of analysis using the World Values Survey data (like this example). This inspired me to have a look at the data myself; something that’s been on my to-do list for years. I have analysed it before, but … Read more

R Conference Costs v2.0

Last year we gave you a price breakdown of some of the most popular R conferences around the globe for 2017. We’re going to do it again for 2018. Remember, you can get up-to-date information on upcoming conferences via our GitHub page. It’s important to note that these costs are the prices of an industry … Read more

My #TidyverseDevDay and #RStudioConf 2019 Reflections!

This was my second RStudio Conference following last year’s edition in San Diego! In addition, at Tidyverse Developer Day I got a really cool chance to work on issues and contribute to making the Tidyverse better. This post won’t be a complete overview of the talks at the conference (others have already released some good blog posts on that note: … Read more

Keeping up to date with R news

I’ve now given my talk about “How to be a resilient R user” three times, at R-Ladies Strasbourg and R-Ladies Paris in person, and at R-Ladies San José via Google Hangouts. It was fun! I covered part of the content of that talk in a blog post about where to get R help. Today, it’s … Read more

more concentration, everywhere

Although it may sound like an excessive notion of optimality, one can hope to obtain an estimator δ of a unidimensional parameter θ that is always closer to θ than any other parameter. In distribution if not almost surely, meaning the cdf of (δ-θ) is steeper than for other estimators enjoying the same cdf at … Read more

Right Now It’s KDA…Asset Allocation.

This post will introduce KDA Asset Allocation. KDA — I.E. Kipnis Defensive Adaptive Asset Allocation is a combination of Wouter Keller’s and TrendXplorer’s Defensive Asset Allocation, along with ReSolve Asset Management’s Adaptive Asset Allocation. This is an asset allocation strategy with a profile unlike most tactical asset allocation strategies I’ve seen before (namely, it barely … Read more

Quick Hit: Automating Production Graphics Uploads in R Markdown Documents with googledrive

As someone who measures all kinds of things on the internet as part of his $DAYJOB, I can say with some authority that huge swaths of organizations are using cloud-services such as Google Apps, Dropbox and Office 365 as part of their business process workflows. For me, one regular component that touches the “cloud” is … Read more

Zen and The Art of Competing Against MBA’s

“I appreciate your ambition, but we’re looking for an MBA…” My senior manager smiled and indicated the topic was closed. Despite the fact I was effectively running our direct mail program in the absence of my recently departed boss, the door was closed and locked. I quit two months later. Within three years, I was promoted … Read more

stringfix : new R package for string manipulation in a %>% way

I usually write around here in French and mainly report on French hospitals’ data management and the statistical tasks it implies. As today’s post is about a new package I have created, I’ll be writing in English. The package is called stringfix because it uses infix operators to manipulate character strings. This post is an … Read more

Let’s call it tidysearch

R became 25 years old last year, and yet it’s only in relatively recent years that the language has really taken off with numerous conferences every year driven by a passionate and vibrant community of users. A large part of this has been driven by an ecosystem of R packages called the Tidyverse, which many … Read more

Animating Data Transformations III – separate()

We have recently published two blog posts on animating data transformations. The first, Animating Data Transformations, illustrated the spread() and gather() functions for going between wide and tall representations of data. The second, Animating Data Transformations II, covered the unnest() function for transforming a list column into a one value per row format. Today, we’re going to … Read more

Le Monde puzzle [#1081]

A “he said-she said” Le Monde mathematical puzzle (again in the spirit of the famous Singapore high-school birthdate problem): Abigail and Corentin are both given a positive integer, a and b, such that a+b is either 19 or 20. They are asked one after the other and repeatedly if they are sure of the other’s … Read more

Introduction to Data Analysis in RStudio

I’ve just started doing one of my favourite parts of my job – teaching a term of Data Analysis in R to about three hundred Bioscientists in their first year of higher education. My blog last week included a figure of their expected level of enjoyment: However,  I find they become very competent in both statistics … Read more

LondonR calling

It’s the New Year and we’re kicking off 2019 with our first LondonR! The meetup took place on the 15th of January, and we were delighted to have about 100 people in attendance. With excellent speakers lined-up and a free bar for networking, we started 2019 with a BANG! Please find all the presentations here. Dawid Kaledkowski, ClickMeeting – … Read more

If wealth had anything to do with intelligence…

…the richest man on earth would have a fortune of no more than $43,000! If you don’t believe me read this post! Have you ever thought about the distribution of wealth as a function of some quality? Especially rich people pride themselves on extraordinary abilities, so that they somehow “deserve” their wealth. Now “abilities” is … Read more

Level up your Data Visualizations with quick plot

K-Means plot for Spotify. Data Visualization is an essential part of a Data Scientist’s workflow. It allows us to visually understand our problem, analyse our models, and provide deep, meaningful understanding to communities. As Data Scientists, we always look for new ways of improving our data science workflow. Why should I use this over … Read more

How to prepare data for NLP (text classification) with Keras and TensorFlow

In the past, I have written and taught quite a bit about image classification with Keras (e.g. here). Text classification isn’t too different in terms of using the Keras principles to train a sequential or function model. You can even use Convolutional Neural Nets (CNNs) for text classification. What is very different, however, is how to … Read more

Onboard and Offboard Data Manipulation in Flexdashboard

Harrison Schramm is a Professional Statistician and Non-Resident Senior Fellow at the Center for Strategic and Budgetary Assessments. The Shiny set of tools, and, by extension, Flexdashboard, give professional analysts tools to rapidly put interactive versions of their work in the hands of clients. Frequently, an end user will interact with data by either uploading … Read more

RStudio Server on Azure

RStudio Server Pro is now available on the Azure Marketplace, the company announced on the RStudio Blog earlier this month. This means you can launch RStudio Server Pro on a virtual machine with the memory, disk, and CPU configuration of your choice, and pay by the minute for the VM instance plus the RStudio … Read more

Image Dithering in R

This January I played the most intriguing computer game I’ve played in ages: The Return of the Obra Dinn. Besides being a masterpiece of murder-mystery storytelling, it also has a unique art style, as it only uses black and white pixels. To pull this off, Obra Dinn makes use of image dithering: the arrangement … Read more

Tibbles, checking examples, & character encodings

Recently I’ve been preparing my gratia package for submission to CRAN. During my pre-flight testing I noticed an issue under Windows when checking the examples in the package against the reference output I generated on Linux. In the latest release of the tibble package, the way tibbles are printed has changed subtly and in a way … Read more

What’s the deal with wind chill?

WIND SPEEDS UP HEAT LOSS Twenty years ago, a German colleague asked me what the deal was with wind chill. I guess they didn’t have it in Germany. I explained it was an attempt to communicate how it feels when there is a low temperature combined with wind.  But my colleague wanted to know how … Read more

rstudio::conf 2019

Last week RStudio hosted their conference, rstudio::conf, in Austin and a whole lot of members of the R community came to see what’s new, where the community and the field might be heading and to enjoy tacos. R in production A major theme from this conference was R in production. Joe Cheng kicked this one off in … Read more

An introduction to Stan with R

In this example, we want to fit the model to Canadian lynx predator and snowshoe hare prey with respective populations between 1900 and 1920, based on the number of pelts collected annually by the Hudson’s Bay Company. Notation and mathematical model We denote \(u(t)\) and \(v(t)\) as the prey and predator population respectively. The differential … Read more
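The excerpt breaks off before the equations are given. Assuming the model is the classic Lotka-Volterra predator-prey system (a common choice for the lynx and hare data, though the excerpt does not confirm it), the differential equations would take the following form. The parameter symbols \(\alpha, \beta, \gamma, \delta\) are assumed here for illustration and may differ from the post's notation.

```latex
\[
\begin{aligned}
\frac{\mathrm{d}u}{\mathrm{d}t} &= \alpha\, u(t) - \beta\, u(t)\, v(t), \\
\frac{\mathrm{d}v}{\mathrm{d}t} &= -\gamma\, v(t) + \delta\, u(t)\, v(t),
\end{aligned}
\qquad \alpha, \beta, \gamma, \delta > 0 .
\]
```

Under this reading, \(\alpha\) is the prey growth rate, \(\beta\) the rate at which predation removes prey, \(\gamma\) the predator death rate, and \(\delta\) the rate at which consumed prey turn into predator growth.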

Correlated longitudinal data with varying time intervals

I was recently contacted to see if simstudy can create a data set of correlated outcomes that are measured over time, but at different intervals for each individual. The quick answer is there is no specific function to do this. However, if you are willing to assume an “exchangeable” correlation structure, where measurements far apart … Read more

A Comprehensive List of Handy R Packages

Stuff I have found super useful for work and life. Gang Su, Jan 21. Whether Python or R is superior for Data Science / Machine Learning is an open debate. Despite its quirkiness and not-so-true-but-generally-perceived slowness, R really shines in exploratory data analysis (EDA), in terms of data wrangling, visualizations, dashboards, myriad choices of … Read more

The Shiny Module Design Pattern

element. The object associated with server is simply a function with no body. To execute this minimal shiny app, we pass the ui and server objects to the shinyApp() function. The app will be spun up either in RStudio’s Viewer pane, in a Viewer window, or in your default Web browser, depending on your settings … Read more

Chaining effect in clustering

In a previous blog post, I explained how we can leverage the k-means clustering algorithm to count the number of red baubles on a Christmas tree. This method fails however if we put Christmas tinsels on it. Let’s find a solution for this more difficult case. Filter red points Let’s first proceed as we did … Read more
