﻿ R – Page 22 – Data Science Austria

## Function basis and regression

In the first part of the course on linear models, we’ve seen how to construct a linear model when the vector of covariates $\boldsymbol{x}$ is given, so that $\mathbb{E}(Y|\boldsymbol{X}=\boldsymbol{x})$ is either simply $\boldsymbol{x}^\top\boldsymbol{\beta}$ (for standard linear models) or a functional of $\boldsymbol{x}^\top\boldsymbol{\beta}$ (in GLMs). But more generally, we can consider transformations of the covariates, so …
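To illustrate what "transformations of the covariates" can look like, here is a minimal sketch, assuming a single covariate $x$ and a family of basis functions $b_0, \dots, b_k$. The symbols $b_j$ and $k$ are notation introduced here for illustration and are not taken from the post; the polynomial basis is just one possible choice.

```latex
\mathbb{E}(Y \mid X = x) = \sum_{j=0}^{k} \beta_j \, b_j(x),
\qquad \text{for example } b_j(x) = x^j .
```

Under this assumption the model remains linear in the coefficients $\beta_j$, even though it is nonlinear in $x$.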

## The probabilities implied by bookmaker odds: Introducing the ‘implied’ package

[This article was first published on R – opisthokonta.net, and kindly contributed to R-bloggers]. My package for converting bookmaker odds into probabilities is now on …

## R tips and tricks – Paste a plot from R to a word file

[This article was first published on R – Eran Raviv, and kindly contributed to R-bloggers]. In this post you will learn how to properly paste …

## Using R: 10 years with R

Yesterday, 29 February 2020, was the 20th anniversary of the release of R 1.0.0. Jozef Hajnala’s blog has a cute anniversary post with some trivia. I realised that it is also (not to the day, but to the year) my R anniversary. I started using R in 2010, during my MSc project in Linköping. Daniel Nätt, …

## Predicting the misclassification cost incurred in air pressure system failure in heavy vehicles

[This article was first published on Stories Data Speak, and kindly contributed to R-bloggers]. Abstract The Air Pressure System (APS) is a type of function …

## SR2 Chapter 2 Hard

[This article was first published on Brian Callander, and kindly contributed to R-bloggers]. Here’s my solution to the hard exercises in chapter 2 of McElreath’s …

## Source code chapter of ‘evidence-based software engineering’ reworked

[This article was first published on The Shape of Code » R, and kindly contributed to R-bloggers]. The Source code chapter of my evidence-based software …

## Log transform or log link? And confounding variables. by @ellis2013nz

Last week I wrote about the relationship between weight and height in US adults, as seen in the US Centers for Disease Control and Prevention (CDC) Behavioral Risk Factor Surveillance System, an annual telephone survey of around 400,000 interviews per year. In particular, I tested the widely-circulated claim that Body Mass Index (BMI) exaggerates the …

## The significance of the sector on the salary in Sweden, a comparison between different occupational groups, part 3

To complete the analysis on the significance of the sector on the salary for different occupational groups in Sweden I will in this post examine the correlation between salary and sector using statistics for education. The F-value from the Anova table is used as the single value to discriminate how much the region and salary …

## What to know before you adopt Hugo/blogdown

Fancy (re-)creating your website using Hugo, with or without blogdown? Feeling a bit anxious? This post is aimed at being the Hugo equivalent of “What to know before you adopt a pet”. We shall go through things that can/will break in the future, and what you can do to prevent future pain. I’m writing this post with R …

## SR2 Chapter 2 Medium

[This article was first published on Brian Callander, and kindly contributed to R-bloggers]. Here are my solutions to the medium exercises in chapter 2 of McElreath’s …

## Drawdowns by the data

[This article was first published on R on OSM, and kindly contributed to R-bloggers]. We’re taking a break from our series on portfolio construction for …

## How to Acquire Large Satellite Image Datasets for Machine Learning Projects

Introduction Historically, only governments and large corporations have had access to quality satellite images. In recent years, satellite image datasets have become available to anyone with a computer and an internet connection. The quality, quantity, and precision of these datasets are continuously improving, and there are many free and commercial platforms at your disposal to …

## All you need to know on PCA …

[This article was first published on François Husson, and kindly contributed to R-bloggers]. All you need to do with PCA is in Factoshiny! PCA – …

## Machine Learning with R: A Hands-on Introduction from Robert Muenchen at Machine Learning Week, Las Vegas

[This article was first published on R-posts.com, and kindly contributed to R-bloggers]. Join Robert Muenchen’s workshop about Machine Learning with R at Machine Learning Week …

## XGBoostLSS – An extension of XGBoost to probabilistic forecasting

[This article was first published on R-posts.com, and kindly contributed to R-bloggers]. Introduction To reason rigorously under uncertainty we need to invoke the language of …

## Uncovering Government Bias with Statistical Modelling

A data-driven analysis of the Australian ‘Sports Rorts’ scandal You can run, but you can’t hide (from statistics). The Australian Liberal party is about to find this out the hard way. In recent weeks, the Liberal party has been accused of using \$100M of sporting grants to win votes in the lead up to the …

## Convolutional Neural Network under the Hood

[This article was first published on R – Hi! I am Nagdev, and kindly contributed to R-bloggers]. Neural networks have really taken over for solving …

## Building A base dplyr With Primitives: Grouped Operations, Pipes and More!

[This article was first published on Random R Ramblings, and kindly contributed to R-bloggers]. In my last post we looked at how we can recreate …

## Student’s t-test in R and by hand: how to compare two groups under different scenarios

One of the most important tests within the branch of inferential statistics is the Student’s t-test. The Student’s t-test for two samples is used to test whether two groups (two populations) are different in terms of a quantitative variable, based on the comparison of two samples drawn from these two groups. In other words, a …
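As a rough illustration of "the comparison of two samples", one common form of the two-sample test statistic is sketched below. It assumes independent samples with possibly unequal variances (Welch's version), which is only one of the scenarios such a post might cover. The notation ($\bar{x}_i$ for the sample means, $s_i^2$ for the sample variances, $n_i$ for the sample sizes) is introduced here and is not taken from the post.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

A large absolute value of $t$ is evidence against the hypothesis that the two population means are equal.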

## Version 0.4.0 of nnetsauce, with fruits and breast cancer classification

English version / Version en français A new version of nnetsauce, version 0.4.0, is now available on Pypi and for R. As usual, you can install it on Python by using the following commands (command line): `pip install nnetsauce` And if you’re using R, it’s still (R console): `library(devtools)` `devtools::install_github("thierrymoudiki/nnetsauce/R-package")` `library(nnetsauce)` The R version may …

## Data Science in Manufacturing: An Overview

[This article was first published on R – Hi! I am Nagdev, and kindly contributed to R-bloggers]. Original article published in opendatascience.com In the last …

## Developing a complex R Shiny app – the good, the bad and the ugly

Together with Clara Bicalho (UC Berkeley) and Sisi Huang (WZB), I recently developed a web application that acts as a convenient interface to the DeclareDesign R package and its repository of research designs, DesignLibrary. This web application, which we called DeclareDesign Wizard, allows users to investigate and customize research designs in their web browser. We …

## MLOPS for R with Azure Machine Learning

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. The video recording of my RStudio::conf talk, MLOPS for R with Azure Machine …

## if … else and ifelse

[This article was first published on Rcrastinate, and kindly contributed to R-bloggers]. Let’s make this a quick and quite basic one. There is this incredibly …

## RStudio Package Manager 1.1.2 – Windows

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers]. RStudio Package Manager 1.1.2 introduces beta support for Windows package binaries. These binaries …

## chain of lynx and drove of hares

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. A paper (and an introduction to the paper) in Nature …

## A New Baby Boom Poster

[This article was first published on R on kieranhealy.org, and kindly contributed to R-bloggers]. I wanted to work through a few examples of more polished …

## Testing REST APIs with Newman

[This article was first published on r-bloggers | STATWORX, and kindly contributed to R-bloggers]. REST APIs have become a quasi-standard, be it to provide an …

## Including Function Factories in an R Package: Using Collate

[This article was first published on Random R Ramblings, and kindly contributed to R-bloggers]. This week I was working on a package which included a …

## New xgboost defaults

[This article was first published on Philipp Probst, and kindly contributed to R-bloggers]. xgboost is the most famous R package for gradient boosting and it …

## The p-direction: A Bayesian equivalent of the p-value?

[This article was first published on R on easystats, and kindly contributed to R-bloggers]. The Bayesian framework is powerful and allows for an incredible amount …

## 3 recommended books on learning R

I sometimes get asked how I got started learning R. I thought I would use this post to go through a few books I read along the way which have been highly useful. The Art of R Programming The Art of R Programming: A Tour of Statistical Software Design is one of the first R …

## R Robustreg Package Downloads

I built robustreg in 2006 and at the time the major stat packages did not have a robust regression available. Below are graphs of weekly and cumulative downloads from just the RStudio mirror. I would estimate total downloads at over 150,000. The median_rcpp() function is written in C++ and is multiple times faster than the …

## opentripplanner: Fast and Easy Multimodal Trip Planning in R with OpenTripPlanner

With services like Google Maps, finding the fastest route from A to B has become quick, cheap, and easy. Not just for driving but walking, cycling and public transport too. But in the field of transport studies, we often want not only a single route, but thousands or millions of routes. This is where we …

## RStudio 1.3 Preview: Integrated Tutorials

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers]. This blog post is part of a series on new features in …

## Book slides – Analyzing Financial and Economic Data with R

[This article was first published on R on msperlin, and kindly contributed to R-bloggers]. The slides for my newly released book Analyzing Financial and Economic …

## multiplying the bars

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. The latest Riddler makes the remark that the expression |-1|-2|-3| …

## Analyzing global literacy rates data in R

I was curious to see the countries whose average female literacy rate was greater than or equal to their average male literacy rate. Most of the countries on this graph are small countries belonging to the South American and African regions, which is really surprising. Countries with the highest discrepancy between male and female literacy rates …

## January 2020: “Top 40” New R Packages

[This article was first published on R Views, and kindly contributed to R-bloggers]. One hundred forty-seven new packages made it to CRAN in January. Here …

## Le Monde puzzle [#1132]

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. A vaguely arithmetic challenge as Le weekly Monde current mathematical …

## The significance of the sector on the salary in Sweden, a comparison between different occupational groups, part 2

In my last post, I examined the significance of the sector on the salary for different occupational groups using statistics from different regions. In previous posts I have shown a correlation between salary and experience and also between salary and education. In this post, I will examine the correlation between salary and sector using statistics …

## Synthetic micro-datasets: a promising middle ground between data privacy and data analysis

Intro: the need for microdata, and the risk of disclosure Survey and administrative data are essential for scientific research; however, accessing such datasets can be very tricky, or even impossible. In my previous job I was responsible for getting access to such “scientific micro-datasets” from institutions like Eurostat. In general, getting access to these micro datasets was only …

## digest 0.6.25: Spookyhash bugfix

[This article was first published on Thinking inside the box, and kindly contributed to R-bloggers]. And a new version of digest is getting onto …

## Nifty Upcoming Enhancements to unpack/to

[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers]. We have some really nifty upcoming enhancements to wrapr unpack/to. …

## Body Mass Index by @ellis2013nz

BMI has an expectations management problem Body Mass Index (BMI) is an attempt to give a quick back-of-envelope answer to the question “if someone weighs W kg, is that a lot or not very much?” Clearly the answer to that question has to take into account at a minimum the person’s height; in general, whatever …

## RcppSimdJson 0.0.2: First Update!

[This article was first published on Thinking inside the box, and kindly contributed to R-bloggers]. Following up on the initial RcppSimdJson release, a first …

## R is turning 20 years old next Saturday. Here is how much bigger, stronger and faster it got over the years

[This article was first published on Jozef’s Rblog, and kindly contributed to R-bloggers]. It is almost the 29th of February 2020! A day that is …

## relgam: Fitting reluctant generalized additive models

Introduction and motivation tl;dr: Reluctant generalized additive modeling (RGAM) produces highly interpretable sparse models which allow non-linear relationships between the response and each individual feature. However, non-linear relationships are only included if deemed important in improving prediction performance. RGAM works with quantitative, binary, count and survival responses and is computationally efficient. Consider the supervised learning …