Twins on the up

Are multiple births on the increase? My twin boys turned 5 years old today. Wow, time flies. Life is never dull, because twins are still seen as something of a novelty, so wherever we go, we find ourselves in conversation with strangers, who are intrigued by the whole thing. In order to save time if … Read more

Categories R

Rsampling Fama French

Today we will continue our work on Fama French factor models, but more as a vehicle to explore some of the awesome stuff happening in the world of tidy models. For new readers who want to get familiar with Fama French before diving into this post, see here where we covered importing and wrangling the data, … Read more

My introductory course on Bayesian statistics

So, after having held workshops introducing Bayes for a couple of years now, I finally pulled myself together and completed my DataCamp course: Fundamentals of Bayesian Data Analysis in R! While it’s called a course, it’s more like a 4 hour workshop and, without requiring anything but basic R skills and a vague … Read more

Teaching and Learning Materials for Data Visualization

Data Visualization: A Practical Introduction will begin shipping next week. I’ve written an R package that contains datasets, functions, and a course packet to go along with the book. The socviz package contains about twenty five datasets and a number of utility and convenience functions. The datasets range in size from things with just a … Read more

Day 12 – little helper dive

We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but soon I realised that it … Read more

Visualizing Hurricane Data with Shiny

Motivation for Project: Around the time that I was selecting a topic for this project, my parents and my hometown found themselves in the path of a Category 1 hurricane. Thankfully, everyone was ok, and there was only minor damage to their property. But this event made me think about how long it had been … Read more

Scraping the Turkey Accordion

To leave a comment for the author, please follow the link and comment on their blog: R on datawookie. … Read more

Using ggplot2 for functional time series

I spoke yesterday about using ggplot2 for functional data graphics, rather than the custom-built plotting functionality available in the many functional data packages, including my own rainbow package written with Hanlin Shang. It is a much more powerful and flexible way to work, so I thought it would be useful to share some examples. French … Read more

Network Centrality in R: New ways of measuring Centrality

This is the third post of a series on the concept of “network centrality” with applications in R and the package netrankr. The last part introduced the concept of neighborhood-inclusion and its implications for centrality. In this post, we extend the concept to a broader class of dominance relations by deconstructing indices into a series of building blocks and … Read more

Geocomputation with R – the afterword

I am extremely proud to announce that Geocomputation with R is complete. It took Robin, Jannes, and me almost 2 years of collaborative planning, writing, refinement, and deployment to make the book available for anyone interested in open source, command-line approaches for handling geographic data. We’re very happy that it’s now ready to present to the world … Read more

Reading List Faster With parallel, doParallel, and pbapply

I have several tables that I would like to load as a single data frame. Functions derived from read.table() have a lot of convenient features, but it seems like there are a lot of steps in the implementation that would slow things down. The gain in performance of reading 29 CSV files (about … Read more
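
As a rough illustration of the idea in this excerpt, here is a minimal sketch of reading several CSV files on parallel workers and stacking them into one data frame. It uses only the base `parallel` package; the temporary directory, the file names and the toy data are made up for the example and are not taken from the original post, which also covers doParallel and pbapply.

```r
library(parallel)

# Write a few small CSV files to a temporary directory as stand-in data
# (hypothetical files, only here to make the example self-contained).
csv_dir <- tempfile("csv_demo")
dir.create(csv_dir)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = 1:5),
            file.path(csv_dir, sprintf("part_%d.csv", i)),
            row.names = FALSE)
}

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read the files on a small cluster of worker processes.
cl <- makeCluster(2)
tables <- parLapply(cl, files, read.csv)
stopCluster(cl)

# Stack the individual tables into a single data frame.
combined <- do.call(rbind, tables)
dim(combined)
```

For a handful of tiny files the cost of starting the workers will likely outweigh any gain; the pattern only starts to pay off with many or large files.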

Sharing Modeling Pipelines in R

Reusable modeling pipelines are a practical idea that gets re-developed many times in many contexts. wrapr supplies a particularly powerful pipeline notation, and a pipe-stage re-use system (notes here). We will demonstrate this with the vtreat data preparation system. Our example task is to fit a model on some arbitrary data. Our model will try … Read more

Le Monde puzzle [#1075]

A new Le Monde mathematical puzzle in the digit category: Find the largest number such that each of its internal digits is strictly less than the average of its two neighbours. Same question when all digits differ. For instance, n=96433469 is such a number. When trying pure brute force (with the usual integer2digits function!) le=solz=3 … Read more
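
The condition in the puzzle can be checked with a few lines of R. The sketch below is not the code from the original post: `digits_of()` and `is_valid()` are hypothetical helper names (the post refers to an `integer2digits` function that is not shown in the excerpt), and the check only verifies a candidate number rather than searching for the largest one.

```r
# Split a non-negative whole number into its decimal digits.
digits_of <- function(n) {
  as.integer(strsplit(format(n, scientific = FALSE), "")[[1]])
}

# TRUE if every internal digit is strictly less than the average
# of its two neighbours.
is_valid <- function(n) {
  d <- digits_of(n)
  k <- length(d)
  if (k < 3) return(TRUE)  # no internal digits to check
  inner <- d[2:(k - 1)]
  # inner < (left + right) / 2, written without division
  all(2 * inner < d[1:(k - 2)] + d[3:k])
}

is_valid(96433469)  # TRUE: the example quoted in the excerpt
is_valid(123)       # FALSE: 2 is not strictly less than (1 + 3) / 2
```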

Day 11 – little helper trim

We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but soon I realised that it … Read more

DB connected R application on open-source Shiny server, part 1

As a follow-up to my previous study of Australian politicians on Twitter, I’ve decided to build a more sophisticated, autonomous solution. The idea at a glance: regularly collect tweets from Members of the Australian Parliament, store them in the database, and visualize the findings (in an up-to-date state) in a web dashboard. A goal here is to build a solution that … Read more

Reflections on the 10th anniversary of the Revolutions blog

On December 9 2008, very nearly ten years ago, the first post on Revolutions was published. Way back then, this blog was part of a young startup called Revolution Computing, which later became Revolution Analytics. (That name persists to this day in the URL of this blog.) The idea at that time was to introduce … Read more

5½ Reasons to Ditch Spreadsheets for Data Science: Code is Poetry

The post 5½ Reasons to Ditch Spreadsheets for Data Science: Code is Poetry appeared first on The Lucid Manager. When I studied civil engineering some decades ago, we solved all our computing problems by writing code. Writing in BASIC or PASCAL, I could quickly perform fundamental engineering analysis, such as reinforced concrete beams, with my … Read more

The ‘knight on an infinite chessboard’ puzzle: efficient simulation in R

I’ve recently been enjoying The Riddler: Fantastic Puzzles from FiveThirtyEight, a wonderful book from 538’s Oliver Roeder. Many of the probability puzzles can be productively solved through Monte Carlo simulations in R. Here’s one that caught my attention: Suppose that a knight makes a “random walk” on an infinite chessboard. Specifically, … Read more

Great post Yash!

Great post Yash! For those readers interested in getting data from the fitbit API using R I’ve documented the process here: https://towardsdatascience.com/the-gamification-of-fitbit-how-an-api-provided-the-next-level-of-training-eaf7b267af00 … Read more

ggmap Tutorial Updated!

Y’all it may have taken me a little time, but I did listen. Thank you for your emails. Because of you, I have now updated my ggmap tutorial to address the Google Static Map API service issues! For those of you who have been following along with issue #51 in the ggmap repo, you’ll notice … Read more

Day 10 – little helper %nin%

We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but soon I realised that it … Read more

The Need for Speed Part 1: Building an R Package with Fortran (or C)

Everyone who has ever used R has, at one time or another, wished for an increase in R’s speed. If you haven’t, you’re not using R hard enough! Recently, as part of some research on credibility, I was calculating layer loss costs for millions of simulated loss observations. As I progressed, the R markdown document … Read more

An 8-hour course on R and Data Mining

I will run an 8-hour course on R and Data Mining at Black Mountain, CSIRO, Australia on 10 & 13 December 2018. The course materials, incl. slides, R scripts and datasets, are available at http://www.rdatamining.com/training/course. Below is an outline of the course. Part I: R Programming: basics of the R language and programming, parallel computing, and data … Read more

CRAN Release of R/exams 2.3-2

New minor release of the R/exams package to CRAN, containing a range of smaller improvements and bug fixes. Notably scanning of written NOPS exams is enhanced and made more reliable and a new exercise template demonstrates how to use advanced processing of numeric answers in Moodle. Version 2.3-2 of the one-for-all exams generator R/exams has … Read more

Interesting packages taken from R/Pharma

A few months ago I joined the R/Pharma conference in Cambridge, MA. As a takeaway, I thought about my project and how I could improve it with solutions others provided. Solutions in R are mainly R packages. I’m an R Shiny programmer in a regulated environment, so the list of the solutions I took are mainly … Read more

Canada Map

I taught my Data Visualization seminar in Philadelphia this past Friday and Saturday. It covers most of the content of my book, including a unit on making maps. The examples in the book are from the United States. But what about other places? Two of the participants were from Canada, and so here’s an example … Read more

Smartly select and mutate data frame columns, using dict

Motivation: The dplyr functions select and mutate are nowadays commonly applied to perform data.frame column operations, frequently combined with magrittr’s forward %>% pipe. While working well interactively, however, these methods would often require additional checking if used in “serious” code, for example, to catch column name clashes. In principle, the container package provides a dict-class … Read more

Day 09 – little helper object_size_in_env

We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but soon I realised that it … Read more

It was twenty years ago …

… this week that I made a first cameo in the debian/changelog for the Debian R package: r-base (0.63.1-1) unstable; urgency=low New upstream release Linked html directory to /usr/doc/r-base/doc/html (Dirk Eddelbuettel) – Douglas Bates [email protected] Fri, 4 Dec 1998 14:22:19 -0600 For the next few years I assisted Doug here and there, and then formally … Read more

Timing Grouped Mean Calculation in R

This note is a comment on some of the timings shared in the dplyr-0.8.0 pre-release announcement. The original published timings were as follows: With performance metrics: measurements are marketing. So let’s dig into the above a bit. These timings are of the small-task, large-number-of-repetitions breed that Matt Dowle writes … Read more

Automated Dashboard visualizations with distribution in R

In this article, you will learn how to make Automated Dashboard visualizations with distribution in R. First you need to install the `rmarkdown` package into your R library. Assuming that you have installed `rmarkdown`, next you create a new `rmarkdown` script in R. After this you type … Read more

R:case4base – Sorting data with base R

In this post in the R:case4base series we will examine sorting (ordering) data in base R. We will learn to sort our data based on one or multiple columns, with ascending or descending order and as always look at alternatives to base R, namely the tidyverse’s dplyr and data.table to show how we can achieve … Read more
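
To give a flavour of what sorting in base R looks like, here is a minimal sketch using `order()`. The data frame and its column names are invented for the example and do not come from the original post.

```r
# Toy data (hypothetical, for illustration only).
df <- data.frame(
  name  = c("a", "b", "c", "d"),
  group = c(2, 1, 2, 1),
  score = c(10, 30, 20, 40),
  stringsAsFactors = FALSE
)

# Sort by a single column, ascending.
by_score <- df[order(df$score), ]

# Sort by group ascending, then by score descending.
# Negating a numeric key reverses its order.
sorted <- df[order(df$group, -df$score), ]
sorted$name
```

Negation only works for numeric keys; for other column types, `order()` also accepts a `decreasing` argument.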

Day 08 – little helper intersect2

We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but soon I realised that it … Read more

Categories Featured, R

“Increase sample size until statistical significance is reached” is not a valid adaptive trial design; but it’s fixable.

TLDR: Begin with N of 10, increase by 10 until p < 0.05 or max N reached. This design has inflated type-I error. Lower p-value threshold needed to ensure specified type-I error rate. The number of interim analyses and max N affect the type-I error rate. Threshold can be identified using simulation. A recent Facebook … Read more
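
The inflation described in the TLDR can be seen with a short simulation. This is a sketch under assumptions that are not stated in the excerpt: a one-sample t-test on standard normal data (so the null hypothesis is true), a maximum N of 100, and 2000 simulated trials. The function name `run_trial()` is made up for the example.

```r
# Simulate one trial under the null: start with n_start observations and
# add n_step more until p < alpha or n_max is reached.
run_trial <- function(n_start = 10, n_step = 10, n_max = 100, alpha = 0.05) {
  x <- rnorm(n_start)                      # true mean is 0, so the null holds
  repeat {
    p <- t.test(x)$p.value                 # one-sample t-test at each look
    if (p < alpha) return(TRUE)            # "significant": a false positive
    if (length(x) >= n_max) return(FALSE)  # stop at the maximum sample size
    x <- c(x, rnorm(n_step))
  }
}

set.seed(1)

# Estimated type-I error when every look uses the nominal 0.05 threshold.
type1 <- mean(replicate(2000, run_trial(alpha = 0.05)))

# The same design with a stricter per-look threshold.
type1_strict <- mean(replicate(2000, run_trial(alpha = 0.01)))

c(nominal = type1, strict = type1_strict)
```

Repeating the second call over a grid of thresholds is one way to search for a per-look cutoff that brings the overall rate back to the level you want.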

Categories Featured, R

Automated Dashboard Visualizations with Ranking in R

In this article, you will learn how to make Automated Dashboard Visualizations with Ranking in R. First you need to install the `rmarkdown` package into your R library. Assuming that you have installed `rmarkdown`, next you create a new `rmarkdown` script in R. After this you type … Read more

Shinyfit: Advanced regression modelling in a shiny app

Many of our projects involve getting doctors, nurses, and medical students to collect data on the patients they are looking after. We want to involve many of them in data analysis, without the requirement for coding experience or access to statistical software. To achieve this we have built Shinyfit, a shiny app for linear, logistic, … Read more

R community update: announcing useR Delhi December meetup and CFP

Time really does fly. It’s been 5 months since the Delhi NCR useR group came into being and held its first meetup. It was a successful event which included sessions featuring an R-core member and a veteran data scientist. More importantly, the 50+ community members who’d turned up took part in stimulating discussions and got to … Read more

R Functions for Bayesian Stats and Summaries

A new update of my sjstats-package just arrived at CRAN. This blog post demonstrates those functions of the sjstats-package that deal especially with Bayesian models. The update contains some new and some revised functions to compute summary statistics of Bayesian models, which are now described in more detail.

Automatic GPUs

A reproducible R / Python approach to getting up and running quickly on GCloud with GPUs in Tensorflow. Backstory: After completing Google’s excellent Data Engineering Certified Specialization on Coursera recently (*which I highly recommend), I … Read more