Function basis and regression

In the first part of the course on linear models, we’ve seen how to construct a linear model when the vector of covariates $\boldsymbol{x}$ is given, so that $\mathbb{E}(Y|\boldsymbol{X}=\boldsymbol{x})$ is either simply $\boldsymbol{x}^\top\boldsymbol{\beta}$ (for standard linear models) or a functional of $\boldsymbol{x}^\top\boldsymbol{\beta}$ (in GLMs). But more generally, we can consider transformations of the covariates, so …
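As a rough sketch of what such a transformation can look like (the basis functions $b_j$ and the number of terms $k$ are illustrative notation assumed here, not taken from the post), the conditional mean can be written on transformed covariates instead of on $\boldsymbol{x}$ itself:

$$\mathbb{E}(Y|\boldsymbol{X}=\boldsymbol{x}) = \beta_0 + \sum_{j=1}^{k} \beta_j\, b_j(\boldsymbol{x})$$

For a single covariate $x$, taking $b_j(x) = x^j$ gives polynomial regression. The model remains linear in the coefficients $\beta_j$, so the usual least-squares machinery still applies.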

The probabilities implied by bookmaker odds: Introducing the ‘implied’ package

[This article was first published on R – opisthokonta.net, and kindly contributed to R-bloggers]. My package for converting bookmaker odds into probabilities is now on …

R tips and tricks – Paste a plot from R to a word file

[This article was first published on R – Eran Raviv, and kindly contributed to R-bloggers]. In this post you will learn how to properly paste …

Predicting the misclassification cost incurred in air pressure system failure in heavy vehicles

[This article was first published on Stories Data Speak, and kindly contributed to R-bloggers]. Abstract The Air Pressure System (APS) is a type of function …

SR2 Chapter 2 Hard

[This article was first published on Brian Callander, and kindly contributed to R-bloggers]. Here’s my solution to the hard exercises in chapter 2 of McElreath’s …

Source code chapter of ‘evidence-based software engineering’ reworked

[This article was first published on The Shape of Code » R, and kindly contributed to R-bloggers]. The Source code chapter of my evidence-based software …

Log transform or log link? And confounding variables. by @ellis2013nz

Last week I wrote about the relationship between weight and height in US adults, as seen in the US Centers for Disease Control and Prevention (CDC) Behavioral Risk Factor Surveillance System, an annual telephone survey of around 400,000 interviews per year. In particular, I tested the widely-circulated claim that Body Mass Index (BMI) exaggerates the …

The significance of the sector on the salary in Sweden, a comparison between different occupational groups, part 3

To complete the analysis of the significance of the sector on the salary for different occupational groups in Sweden, I will in this post examine the correlation between salary and sector using statistics for education. The F-value from the Anova table is used as the single value to discriminate how much the region and salary …

What to know before you adopt Hugo/blogdown

Fancy (re-)creating your website using Hugo, with or without blogdown? Feeling a bit anxious? This post is aimed at being the Hugo equivalent of “What to know before you adopt a pet”. We shall go through things that can/will break in the future, and what you can do to prevent future pain. I’m writing this post with R …

SR2 Chapter 2 Medium

[This article was first published on Brian Callander, and kindly contributed to R-bloggers]. Here are my solutions to the medium exercises in chapter 2 of McElreath’s …

Drawdowns by the data

[This article was first published on R on OSM, and kindly contributed to R-bloggers]. We’re taking a break from our series on portfolio construction for …

How to Acquire Large Satellite Image Datasets for Machine Learning Projects

Introduction Historically, only governments and large corporations have had access to quality satellite images. In recent years, satellite image datasets have become available to anyone with a computer and an internet connection. The quality, quantity, and precision of these datasets are continuously improving, and there are many free and commercial platforms at your disposal to …

Machine Learning with R: A Hands-on Introduction from Robert Muenchen at Machine Learning Week, Las Vegas

[This article was first published on R-posts.com, and kindly contributed to R-bloggers]. Join Robert Muenchen’s workshop about Machine Learning with R at Machine Learning Week …

XGBoostLSS – An extension of XGBoost to probabilistic forecasting

[This article was first published on R-posts.com, and kindly contributed to R-bloggers]. Introduction To reason rigorously under uncertainty we need to invoke the language of …

Uncovering Government Bias with Statistical Modelling

A data-driven analysis of the Australian ‘Sports Rorts’ scandal. You can run, but you can’t hide (from statistics). The Australian Liberal party is about to find this out the hard way. In recent weeks, the Liberal party has been accused of using $100M of sporting grants to win votes in the lead up to the …

Building A base dplyr With Primitives: Grouped Operations, Pipes and More!

[This article was first published on Random R Ramblings, and kindly contributed to R-bloggers]. In my last post we looked at how we can recreate …

Student’s t-test in R and by hand: how to compare two groups under different scenarios

One of the most important tests within the branch of inferential statistics is the Student’s t-test. The Student’s t-test for two samples is used to test whether two groups (two populations) are different in terms of a quantitative variable, based on the comparison of two samples drawn from these two groups. In other words, a …
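For orientation, here is the test statistic in one common scenario only: two independent samples with unknown but equal variances. The notation ($\bar{x}_i$, $s_i^2$ and $n_i$ for the mean, variance and size of sample $i$) is assumed here and is not taken from the post, which covers several scenarios:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Under the null hypothesis of equal population means, $t$ follows a Student distribution with $n_1 + n_2 - 2$ degrees of freedom.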

Version 0.4.0 of nnetsauce, with fruits and breast cancer classification

A new version of nnetsauce, version 0.4.0, is now available on PyPI and for R. As usual, you can install it on Python by using the following command (command line): pip install nnetsauce. And if you’re using R, it’s still (R console): library(devtools); devtools::install_github("thierrymoudiki/nnetsauce/R-package"); library(nnetsauce). The R version may …

Developing a complex R Shiny app – the good, the bad and the ugly

Together with Clara Bicalho (UC Berkeley) and Sisi Huang (WZB), I recently developed a web application that acts as a convenient interface to the DeclareDesign R package and its repository of research designs, DesignLibrary. This web application, which we called DeclareDesign Wizard, allows users to investigate and customize research designs in their web browser. We …

if … else and ifelse

[This article was first published on Rcrastinate, and kindly contributed to R-bloggers]. Let’s make this a quick and quite basic one. There is this incredibly …

RStudio Package Manager 1.1.2 – Windows

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers]. RStudio Package Manager 1.1.2 introduces beta support for Windows package binaries. These binaries …

R for Excel Users: Pivot Tables, VLOOKUPs in R

[This article was first published on business-science.io, and kindly contributed to R-bloggers]. New business and financial analysts are finding R every day. Most of these …

Including Function Factories in an R Package: Using Collate

[This article was first published on Random R Ramblings, and kindly contributed to R-bloggers]. This week I was working on a package which included a …

New xgboost defaults

[This article was first published on Philipp Probst, and kindly contributed to R-bloggers]. xgboost is the most famous R package for gradient boosting and it …

The p-direction: A Bayesian equivalent of the p-value?

[This article was first published on R on easystats, and kindly contributed to R-bloggers]. The Bayesian framework is powerful and allows for an incredible amount …

opentripplanner: Fast and Easy Multimodal Trip Planning in R with OpenTripPlanner

With services like Google Maps, finding the fastest route from A to B has become quick, cheap, and easy. Not just for driving but walking, cycling and public transport too. But in the field of transport studies, we often want not only a single route, but thousands or millions of routes. This is where we …

Book slides – Analyzing Financial and Economic Data with R

[This article was first published on R on msperlin, and kindly contributed to R-bloggers]. The slides for my newly released book Analyzing Financial and Economic …

multiplying the bars

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. The latest Riddler makes the remark that the expression |-1|-2|-3| …

Analyzing global literacy rates data in R

I was curious to see the countries whose average female literacy rate was greater than or equal to their average male literacy rate. Most of the countries on this graph are small countries belonging to the South American and African regions, which is really surprising. Countries with the highest discrepancy between male and female literacy rates …

Le Monde puzzle [#1132]

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. A vaguely arithmetic challenge as Le weekly Monde current mathematical …

The significance of the sector on the salary in Sweden, a comparison between different occupational groups, part 2

In my last post, I examined the significance of the sector on the salary for different occupational groups using statistics from different regions. In previous posts I have shown a correlation between salary and experience, and also between salary and education. In this post, I will examine the correlation between salary and sector using statistics …

Synthetic micro-datasets: a promising middle ground between data privacy and data analysis

Intro: the need for microdata, and the risk of disclosure. Survey and administrative data are essential for scientific research; however, accessing such datasets can be very tricky, or even impossible. In my previous job I was responsible for getting access to such “scientific micro-datasets” from institutions like Eurostat. In general, getting access to these micro datasets was only …

R is turning 20 years old next Saturday. Here is how much bigger, stronger and faster it got over the years

[This article was first published on Jozef’s Rblog, and kindly contributed to R-bloggers]. It is almost the 29th of February 2020! A day that is …

relgam: Fitting reluctant generalized additive models

Introduction and motivation tl;dr: Reluctant generalized additive modeling (RGAM) produces highly interpretable sparse models which allow non-linear relationships between the response and each individual feature. However, non-linear relationships are only included if deemed important in improving prediction performance. RGAM works with quantitative, binary, count and survival responses and is computationally efficient. Consider the supervised learning …