R Package Update: urlscan

The urlscan package (an interface to the urlscan.io API) is now at version 0.2.0 and supports urlscan.io’s authentication requirement when submitting a link for analysis. The service is handy if you want to learn about the details — all the gory technical details — for a website. For instance, say you wanted to check on … Read more

Categories R Tags ExcerptFavorite

Synthesising Multiple Linked Data Sets and Sequences in R

In my last post I looked at generating synthetic data sets with the ‘synthpop’ package, some of the challenges and neat things the package can do. It is simple to use which is great when you have a single data set with independent features. This post will build on the last post by tackling other … Read more


The power of tapping into your community for support

This week the owner of my favorite Mexican restaurant in Baltimore, Rosalyn Vera, got death and arson threats. I could have been a bystander, but I tapped into my network and asked for help and she has received it. It’s been great to see the power of the community in action. The backstory So, I … Read more


Multiple Data (Time Series) Streams Clustering

To leave a comment for the author, please follow the link and comment on their blog: Peter Laurinec. … Read more


Navigate through Decennial Census and American Community Survey

Finding the right content in census data can be daunting. Just to give you an idea of how complex the census data are, there are 1127 tables and 25070 columns of table contents in the 2012-2017 ACS 5-year summary file alone. 2010 decennial census summary file 1: 333 tables, 8959 columns; 2012-2017 5-year ACS summary file: 1127 tables, 25070 columns; 2017 … Read more


Homebrew 2.0.0 Released == homebrewanalytics package updated

A major new release of Homebrew has landed and now includes support for Linux as well as Windows! via the Windows Subsystem for Linux. There are overall stability and speed improvements baked in as well. The aforelinked notification has all the info you need to see the minutiae. Unless you’ve been super-lax in updating, brew … Read more


Simulating the Six Nations 2019 Rugby Tournament in R

I really like running simulation models before sporting events because they can give you so much more depth of understanding compared to the ‘raw’ odds that you get from the media or bookmakers, etc. Yes, we might hear that a team has a “30% chance of winning a tournament”. But there might be another strong … Read more


Setting up your blog with RStudio and blogdown II: Workflow

In Part I of this series of posts we set up our new blog using blogdown and Hugo. Once the blog is configured, this is the typical workflow I follow to write new posts and update my blog online: Open your blog project with RStudio. Load the blogdown library and start the Hugo server and … Read more


Tutorial: Sequential Pattern Mining in R for Business Recommendations

by Allison Koenecke, Data Scientist, AI & Research Group at Microsoft, with acknowledgements to Amita Gajewar and John-Mark Agosta. In this tutorial, Allison Koenecke demonstrates how Microsoft could recommend to customers the next set of services they should acquire as they expand their use of the Azure Cloud, by using a temporal extension to conventional … Read more


Mandalaxies

One cannot escape the feeling that these mathematical formulas have an independent existence and an intelligence of their own, that they are wiser than we are, wiser even than their discoverers (Heinrich Hertz) I love spending my time doing mathematics: transforming formulas into drawings, experimenting with paradoxes, learning new techniques … and R is a perfect … Read more


dqrng v0.0.5: New and updated RNGs

A new version of dqrng has made it onto the CRAN servers after a brief hiccup. Thanks to the CRAN team in general and Uwe Ligges in particular for their relentless efforts. This version adds a new RNG to be used together with the provided distribution functions: The 64 bit version of the 20 rounds … Read more


recogeo: A new R package to reconcile changing geographies boundaries (and corresponding variables)

Demographic information is usually reported in relation to precise boundaries: administrative, electoral, statistical, etc. Comparing demographic information reported at different points in time is often problematic because boundaries keep changing. The recogeo package facilitates reconciling boundaries and their data by a spatial analysis of the boundaries of two different periods. In this post, I explain … Read more


Quantile regression in R

Quantile regression: what is it? Let $y$ be some response variable of interest, and let $x$ be a vector of features or predictors that we want to use to model the response. In linear regression, we are trying to estimate the conditional mean function, $E[y \mid x]$, by a linear combination of the features. While the conditional mean function … Read more
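
To make the contrast with the conditional mean concrete, here is a minimal base R sketch. It uses the standard formulation in which a conditional quantile is estimated by minimising the check (pinball) loss; that may or may not be how the post itself proceeds. The helper names `check_loss` and `fit_quantile` are hypothetical, the data are simulated, and `optim()` is only a convenient stand-in for the solvers that dedicated packages use.

```r
# Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

# Hypothetical helper: fit a linear conditional quantile with one feature
# by minimising the summed check loss over intercept and slope.
fit_quantile <- function(x, y, tau) {
  objective <- function(beta) sum(check_loss(y - beta[1] - beta[2] * x, tau))
  start <- coef(lm(y ~ x))        # least-squares fit as the starting point
  fit <- optim(start, objective)  # Nelder-Mead by default
  setNames(fit$par, c("intercept", "slope"))
}

# Simulated data (an assumption for illustration): noise grows with x,
# so different quantiles of y have different slopes in x.
set.seed(1)
x <- runif(500, 0, 10)
y <- 1 + 2 * x + rnorm(500, sd = 1 + 0.5 * x)

mean_fit   <- coef(lm(y ~ x))               # conditional mean
median_fit <- fit_quantile(x, y, tau = 0.5) # conditional median
upper_fit  <- fit_quantile(x, y, tau = 0.9) # conditional 90th percentile

# Share of observations lying below the fitted 90% line
share_below <- mean(y < upper_fit["intercept"] + upper_fit["slope"] * x)
```

With noise that widens as `x` grows, the 90th-percentile line should come out steeper than the median line, which is the kind of structure a single conditional-mean fit cannot show.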


rOpenSci Software Peer Review: Still Improving

rOpenSci’s suite of packages is comprised of contributions from staff engineers and the wider R community, bringing considerable diversity of skills, expertise and experience to bear on the suite. How do we ensure that every package is held to a high standard? That’s where our software review system comes into play: packages contributed by the … Read more


How GPL makes me leave R for Python :-(

Being a data scientist in a startup I can program with several languages, but often R is a natural choice. Recently I wanted my company to build a product based on R. It simply seemed like a perfect fit. But this turned out to be a slippery slope into the open-source code licensing field, which … Read more


Book review: Beyond Spreadsheets with R

Disclaimer: Manning publications gave me the ebook version of Beyond Spreadsheets with R – A beginner’s guide to R and RStudio by Dr. Jonathan Carroll free of charge. Beyond Spreadsheets with R shows you how to take raw data and transform it for use in computations, tables, graphs, and more. You’ll build on simple programming techniques … Read more


Announcing new software peer review editors: Melina Vidoni and Brooke Anderson

We are pleased to welcome Brooke Anderson and Melina Vidoni to our team of Associate Editors for rOpenSci Software Peer Review. They join Scott Chamberlain, Anna Krystalli, Lincoln Mullen, Karthik Ram, Noam Ross and Maëlle Salmon. With the addition of Brooke and Melina, our editorial board now includes four women and four men, located in … Read more


Using Data Science to read 10 years of Luxembourguish newspapers from the 19th century

I have been playing around with historical newspaper data (see here and here). I have extracted the data from the largest archive available, as described in the previous blog post, and now created a shiny dashboard where it is possible to visualize the most common words per article, as well as read a summary of each article. The summary was made … Read more


missing digit in a 114 digit number [a Riddler’s riddle]

A puzzling riddle from The Riddler (as Le Monde had a painful geometry riddle this week): this number with 114 digits 530,131,801,762,787,739,802,889,792,754,109,70?,139,358,547,710,066,257,652,050,346,294,484,433,323,974,747,960,297,803,292,989,236,183,040,000,000,000 is missing one digit and is a product of some of the integers between 2 and 99. By comparison, 76! and 77! have 112 and 114 digits, respectively. While 99! has 156 digits. … Read more


Fast Static Maps Built with R

Luke Whyte posted an article (apologies for a Medium link) over on Towards Data Science showing how to use a command line workflow involving curl, node and various D3 libraries and javascript source files to build a series of SVG static maps. It’s well written and you should give it a read especially since he … Read more


Price’s Protein Puzzle: 2019 update

Chains of amino acids strung together make up proteins and since each amino acid has a 1-letter abbreviation, we can find words (English and otherwise) in protein sequences. I imagine this pursuit began as soon as proteins were first sequenced, but the first reference to protein word-finding as a sport is, to my knowledge, “Price’s … Read more


R Markdown Template for Business Reports

In this post I’d like to introduce the R Markdown template for business reports by INWTlab. It’s been my aim to have a nice and clean template that is easy to customize in colors, cover and logo. I know there are quite a few templates available, but I was missing one to be used in … Read more


Quick Hit: Using seymour to Subscribe to your Git[la|hu]b Repo Issues in Feedly

The seymour Feedly API package has been updated to support subscribing to RSS/Atom feeds. Previously the package was intended to just treat your Feedly as a data source, but there was a compelling use case for enabling subscription support: subscribing to code repository issues. Sure, there’s already email notice integration for repository issues on most … Read more


Time Travel with RStudio Package Manager 1.0.4

We all love packages. We don’t love when broken package environments prevent us from reproducing our work. In version 1.0.4 of RStudio Package Manager, individuals and teams can navigate through repository checkpoints, making it easy to recreate environments and reproduce work. The new release also adds important security updates, improvements for Git sources, further access to retired packages, and beta … Read more


December 2018: “Top 40” New CRAN Packages

By my count, 157 new packages stuck to CRAN in December. Below are my “Top 40” picks in ten categories: Computational Methods, Data, Finance, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities and Visualization. This is the first time I have used the Medicine category. I am pleased that a few packages that appear to … Read more


Using custom scales with the ‚scales‘ package

Maybe you already heard of the package “scales” – and if you didn’t hear about it, you might have used it without knowing (e.g., in the context of ggplot2 graphs). I want to show you a few of the functionalities of the “scales” package. I will also show you how to create your own scales. … Read more


New R package: load and chart oceanic storms

Mapping historical storms data is now a little bit easier. Off the back of this blog, I have authored an R package (available at basilesimon/noaastorms) that downloads, cleans and parses NOAA IBtrack data for you. The National Oceanic and Atmospheric Administration releases datasets known as International Best Track Archive for Climate Stewardship. These datasets are … Read more


Correlation analysis of cyclically adjusted valuation measures and subsequent returns

In this post we’ll test three different cyclically-adjusted valuation measures: CAPE (earnings), CAPD (dividends) and CAPB (book value). CAPE is calculated like the P/E ratio, but by dividing the current real price by the last ten years’ average inflation-adjusted earnings. CAPD uses dividends instead of earnings, and CAPB uses book value. We’ll test the optimal … Read more
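
The calculation described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the function name `cyclically_adjusted_ratio`, the use of a CPI-style index to restate past values in the latest year's money, and all of the numbers are hypothetical and are not taken from the post.

```r
# Hypothetical helper: price divided by the average of inflation-adjusted
# fundamentals. Passing earnings gives a CAPE-style ratio; passing dividends
# or book values gives CAPD- or CAPB-style ratios.
cyclically_adjusted_ratio <- function(price, fundamentals, cpi) {
  stopifnot(length(fundamentals) == length(cpi), length(cpi) >= 1)
  latest_cpi <- cpi[length(cpi)]
  # Restate each year's figure in the latest year's money (assumed adjustment)
  real_fundamentals <- fundamentals * latest_cpi / cpi
  price / mean(real_fundamentals)
}

# Made-up ten-year history: earnings per share and a price index
earnings <- c(4.0, 4.2, 3.1, 3.8, 4.5, 4.9, 5.2, 5.0, 5.6, 6.0)
cpi      <- c(100, 102, 104, 105, 107, 109, 111, 113, 116, 118)
price    <- 110  # current price, already in latest-year money

cape <- cyclically_adjusted_ratio(price, earnings, cpi)
```

Averaging over ten years smooths out the business cycle, which is why a single weak year (the third one in the made-up series) moves this ratio far less than it would move a one-year P/E.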


EARL London 2019 announcement

EARL London is back for 2019! We are thrilled to announce that the Enterprise Applications of the R Language Conference will be returning to the Tower Hotel from the 10-12 September 2019. If you’d like to see what you can expect during 3 days of EARL, then check out our highlights from last year’s conference. We are pleased … Read more


Using Travis-CI to Create R-bloggers for Taiwan

R-bloggers.com is a great platform for R users, but I sometimes feel awkward publishing posts on R-bloggers when I have things to share that are only relevant to users in Taiwan. Inspired by R-bloggers, I thought maybe I could use Travis-CI and GitHub to create a blog that automatically updates its posts by retrieving … Read more


Creating a word cloud on R-bloggers posts

This post will go through how to create a word cloud of article titles scraped from the awesome R-bloggers. Our goal will be to use R’s rvest package to search through 50 successive pages on the site for article titles. The stringr and tm packages will be used for string cleaning and for creating a … Read more


Hash Me If You Can

We are living in the era of Big Data but the problem of course is that the bigger our data sets become the slower even simple search operations get. I will now show you a trick that is the next best thing to magic: building a search function that practically doesn’t slow down even for … Read more


benchmarkme: new version

When discussing how to speed up slow R code, my first question is what is your computer spec? It’s always surprised me that people are wondering why analysing big data is slow, yet they are using a five-year-old cheap laptop. Spending a few thousand pounds would often make their problems disappear. To quantify the impact … Read more


Web Scraping Google Sheets with RSelenium

I love to learn new things and one of the ways I learn best is by doing. Also, it’s been said that you never fully understand a topic until you are able to explain it; I think blogging is a low barrier to explaining things. Someone I met at a local data … Read more


Playing Around with Phyllotactic Spirals

I wanted to figure out how to create gif animation using the magick package, so I decided to try that out with ggplot2 spiral art. Loading up packages: I’m definitely in love with “magick” right now. library(tidyverse) ## for pretty much everything… library(magick) ## I’m now a magick fan!!! library(scales) ## Handy when it comes … Read more


Forecast Framework Demo

Want to learn how to do some forecasting with R? Here’s your chance to try out a new time-series forecasting package for R whose aim is to standardize and simplify the process of making and evaluating forecasts! The Reich Lab uses an … Read more


Watch if R is running from Shiny

Today I discovered that the <html> tag of a Shiny App gets the shiny-busy class when computation is done in the R process. Which means that you can potentially watch with JavaScript if the R process is running. TIL: Shiny Apps switch to the ‘shiny-busy’ class when R is performing computation in the background. So it’s basically possible to use JavaScript … Read more


The Basics of Cryptography

With Applications in R. Have you ever wondered how companies securely store your passwords? Or how your credit card information is kept private when making online purchases? The answer is cryptography. The vast majority of internet sites now use some form of cryptography to ensure the privacy of their users. Even information such as emails … Read more

Scaling H2O analytics with AWS and p(f)urrr (Part 3)

This is the final installment of a three part series that looks at how we can leverage AWS, H2O and purrr in R to build analytical pipelines. In the previous posts I looked at starting up the environment through the EC2 dashboard on AWS’ website. The other aspect we looked at, in Part II, was … Read more


SatRday LA – R Conference Announcement

Come and have fun with local useRs at SatRday LA on April 6, 2019. This is the first SatRday in LA, and the second in the states. If you have not heard of SatRday, it is a one-day affordable, inclusive, non-profit R conference organized by local R users. Important Details Who: All levels of R users When: April … Read more


Graphing My Daily Phone Use

How many times do I look at my phone? I set up a small program on my phone to count the screen activations and logged to a file. In this post I show what went wrong and how to plot the results. The data I set up a small program on my phone that counts every day … Read more


Mathematical Notation in Online R/exams

Many R/exams exercises employ mathematical notation that needs to be converted and rendered suitably for inclusion in online exams. While R/exams attempts to set suitable defaults, an overview is provided of possible adjustments and when these might be useful or even necessary. Overview A popular use case of the R/exams package is the generation of … Read more


Building Big Shiny Apps — A Workflow (1/2)

During rstudio::conf(2019L), I presented an e-poster called “Building Big Shiny Apps — A Workflow”. You can find the poster here, and this blog post is an attempt at a transcription of what I talked about while presenting the poster. As this is a rather long topic, I’ve divided this post into two parts: … Read more


R tips and tricks – higher-order functions

A higher-order function is a function that takes one or more functions as arguments, and/or returns a function as its result. This can be super handy in programming when you want to tilt your code towards readability and still keep it concise. Consider the following code: # Generate some fake data > eps <- rnorm(10, sd= … Read more
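
As a minimal illustration of that definition (not the post's own example, which is cut off in this excerpt), the sketch below shows both directions in base R. The names `apply_twice`, `make_scaler` and `times_two` are made up for this example; `sapply()` and `Reduce()` are base R functions that themselves take a function as an argument.

```r
# Takes a function as an argument: apply f to x, then apply f to the result
apply_twice <- function(f, x) f(f(x))

# Returns a function as its result: a small "function factory"
make_scaler <- function(k) function(x) k * x

times_two <- make_scaler(2)
result <- apply_twice(times_two, 5)  # 2 * (2 * 5) = 20

# Base R higher-order functions in action: apply several summaries to the
# same vector, and fold a vector with a binary function
values    <- c(1, 2, 3, 10)
summaries <- sapply(list(mean = mean, median = median, max = max),
                    function(f) f(values))
total     <- Reduce(`+`, values)
```

Passing the summary functions around as ordinary values is what keeps the last two lines short: adding another statistic means adding one entry to the list rather than another block of code.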


Summarizing rstudio::conf 2019 Summaries with Tidy Text Techniques

To be honest, I planned on writing a review of this past weekend’s rstudio::conf 2019, but several other people have already done a great job of doing that. Just check out Karl Broman’s aggregation of reviews at the bottom of the page here! (More on this in a second.) In short, my thoughts on the whole experience are captured perfectly by Nick Strayer’s … Read more
