Chaining effect in clustering

In a previous blog post, I explained how we can leverage the k-means clustering algorithm to count the number of red baubles on a Christmas tree. This method fails, however, if we put Christmas tinsel on it. Let’s find a solution for this more difficult case. Filter red points: Let’s first proceed as we did … Read more
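
The chaining effect named in the title is usually associated with single-linkage hierarchical clustering, which merges groups through their nearest members. The following is only a minimal sketch under that assumption, not necessarily the method the post goes on to use: the simulated points, the three centers and the cut height of 3 are all made up for illustration.

```r
# Simulated "red points": three compact blobs standing in for baubles.
# All coordinates here are invented; the real post works on image pixels.
set.seed(1)
centers <- rbind(c(0, 0), c(10, 0), c(0, 10))
pts <- do.call(rbind, lapply(seq_len(nrow(centers)), function(i) {
  cbind(x = centers[i, 1] + runif(30, -0.5, 0.5),
        y = centers[i, 2] + runif(30, -0.5, 0.5))
}))

# Single linkage joins clusters via their closest pair of points.
tree <- hclust(dist(pts), method = "single")

# Cut the tree at a height between the blob diameter and the blob spacing.
groups <- cutree(tree, h = 3)
n_clusters <- length(unique(groups))
print(n_clusters)
```

A thin line of points connecting two blobs (tinsel, in the post's setting) would make single linkage merge them into one group, which is the chaining effect in question.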

Categories R Tags ExcerptFavorite

R Studio Conf 2019 – Easing your FOMO with R Resources

Image credit: Results from the 2018 RStudio Learning R Survey, shared by Carl Howe at RStudio Conf 2019. Y’all, if I had just one takeaway from R Studio Conference 2019, it’s that we are a friendly and tight-knit community! Virtually and in person, we look out for each other. This past week, the FOMO … Read more

hrbrthemes 0.6.0 on CRAN + Other In-Development Package News

Version 0.6.0 of the hrbrthemes package should be hitting a CRAN mirror near you soon. Apart from some general documentation and code cleanup this release includes the dark theme folks have been seeing in blog posts and tweets over the past few months. It’s called theme_ft_rc() since it is an homage to the wonderful new … Read more

A shiny Web App from LEGO— truck + trailer

How to Build a Shiny “Truck” part 2 — Let the LEGO “truck” app pull a trailer. An example of a modularized shiny app. In September 2018 I used an automotive metaphor explaining a large scale R shiny app. RViews published the article. I would summarize the article in one phrase. Upon building large applications (trucks) in R … Read more

Docker Images for R: r-base versus r-apt

I need to deploy a Plumber API in a Docker container. The API has some R package dependencies which need to be baked into the Docker image. There are a few options for the base image: The first option, r-base, would require building the dependencies from source, a somewhat time-consuming operation. The last option, … Read more

Spatial lag model trees

Economic growth models are recursively partitioned to assess heterogeneity in growth and convergence across EU regions while adjusting for spatial dependencies. Accompanied by R package lagsarlmtree, combining partykit::mob and spdep::lagsarlm. Citation Martin Wagner, Achim Zeileis (2019). “Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach.” German Economic Review, 20(1), 67-82. … Read more

Multiple Rstudio Server instances using a single R/LaTeX install with KVM

Introduction: RStudio Server Open Source Edition (OSE) is offered with some key limitations compared to the Pro Edition. A few of these limitations are easy to circumvent using basic Linux sysadmin skills (such as encrypting traffic by using a reverse proxy), but most of RStudio Server OSE’s limitations are not so easy to work with, … Read more

Intrinsic time for cryptocurrency data

This week, I attended a three-day hacking event of the Crypto Traders Berlin Meetup group. The aim was to find relationships between sentiment data from bitcointalk and Twitter and the price of cryptocurrencies. In practice, it turns out to be not that easy. For starters, it is not easy to determine if the social … Read more

My presentations on ‘Elements of Neural Networks & Deep Learning’ -Parts 6,7,8

This is the final set of presentations in my series ‘Elements of Neural Networks and Deep Learning’. This set follows the earlier 2 sets of presentations, namely: 1. My presentations on ‘Elements of Neural Networks & Deep Learning’ -Part1,2,3; 2. My presentations on ‘Elements of Neural Networks & Deep Learning’ -Parts 4,5. In this final set of … Read more

Fully General Record Transforms with cdata

One of the design goals of the cdata R package is that very powerful and arbitrary record transforms should be convenient and take only one or two steps. In fact it is the goal to take just about any record shape to any other in two steps: first convert to row-records, then re-block the data … Read more

Feeling the rstudio::conf ❤️

I am heading home from my third year of attending rstudio::conf! If you weren’t there, watch for the videos to be released so you can check out the talks; I know I will do the same so I can see the talks I was forced to miss by scheduling constraints. I love this conference, and … Read more

A Newbie’s Guide to Making A Pull Request (for an R package)

I had the wonderful opportunity to participate in the {tidyverse} Developer Day the day after rstudio::conf 2019 officially wrapped up. One of the objectives of the event was to encourage open-source contributor newbies (like me) to gain some experience, namely through submitting pull requests to address issues with {tidyverse} packages. Having only ever worked with my own packages/repos before, I found this was … Read more

GeoPAT2: Entropy calculations for local landscapes

GeoPAT 2 is an open-source software written in C and dedicated to pattern-based spatial and temporal analysis. Four main types of analysis available in GeoPAT 2 are (i) search, (ii) change detection, (iii) segmentation, and (iv) clustering. However, additional applications are also possible, including extracting information about spatial patterns. Global landscape diversity (based on Shannon entropy of … Read more

Create R Markdown reports and presentations even better with these 3 practical tips

Including R Markdown in the workflow for presenting and publishing analyses that use code in R or other languages is a great way to make presentations, dashboards or reports good looking, reproducible and version controllable. In this post, we will look at three simple ways to improve that workflow even further with methods that are … Read more

simmer 4.2.1

The 4.2.1 release of simmer, the Discrete-Event Simulator for R, is on CRAN with quite interesting new features and fixes. As discussed in the mailing list, there is a way to handle the specific case in which an arrival is rejected because a queue is full: library(simmer) reject <- trajectory() %>% log_("kicked off…") patient <- … Read more

Extracting colours from your images with Image Quantization

magick really does the “Magic!” I have been playing around a bit with the package “magick”, and I think I am now hooked… Although I haven’t been able to understand everything written in the vignette just yet. One function I got really excited about is image_quantize. This function will reduce the number of unique colours used in the … Read more

Summer Internships 2019

We are excited to announce the second formal summer internship program at RStudio. The goal of this program is to enable RStudio employees to collaborate with students to do work that will help both RStudio users and the broader R community, and help ensure that the community of R developers is as diverse as its … Read more

Window Aggregate operator in batch mode in SQL Server 2019

So this came as a surprise when working on calculating simple statistics on my dataset, in particular min, max and median. The first two are trivial. The last one was the one that caught my attention. While finding the fastest way of calculating the median (statistic: median) for a given dataset, I stumbled upon an interesting … Read more

Rcrastinate is moving.

Hi all, this is just an announcement. I am moving Rcrastinate to a blogdown-based solution and am therefore leaving blogger.com. If you’re interested in the new setup and how you could do the same yourself, please check out the all shiny and new Rcrastinate over at http://rcrastinate.rbind.io/ In my first post over there, I am … Read more

Factor Analysis in R with Psych Package: Measuring Consumer Involvement

The post Factor Analysis in R with Psych Package: Measuring Consumer Involvement appeared first on The Lucid Manager. The first step for anyone who wants to promote or sell something is to understand the psychology of potential customers. Getting into the minds of consumers is often problematic because measuring psychological traits is a complex task. … Read more

Are you parallelizing your raster operations? You should!

If you plan to do anything with the raster package you should definitely consider parallelizing all your processes, especially if you are working with very large image files. I couldn’t find any blog post describing how to parallelize with the raster package (it is well documented in the package documentation, though). So here are my notes. Load … Read more

RcppArmadillo 0.9.200.7.0

A new RcppArmadillo bugfix release arrived at CRAN today. The version 0.9.200.7.0 is another minor bugfix release, and based on the new Armadillo bugfix release 9.200.7 from earlier this week. I also just uploaded the Debian version, and Uwe’s systems have already created the CRAN Windows binary. Armadillo is a powerful and expressive C++ template … Read more

forecast 8.5

The latest minor release of the forecast package has now been approved on CRAN and should be available in the next day or so. Version 8.5 contains the following new features: updated tsCV() to handle exogenous regressors; reimplemented naive(), snaive(), rwf() for substantial speed improvements; added support for passing arguments to auto.arima() unit root tests. … Read more

Make Teaching R Quasi-Quotation Easier

To make teaching R quasi-quotation easier it would be nice if R string-interpolation and quasi-quotation both used the same notation. They are related concepts. So some commonality of notation would actually be clarifying, and help teach the concepts. We will define both of the above terms, and demonstrate the relation between the two concepts. String-interpolation … Read more
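
As a rough illustration of how the two concepts relate, here is a minimal base-R sketch. It uses sprintf() for string-interpolation and bquote() for quasi-quotation; these are stand-ins chosen for this example and are not necessarily the notations the post itself compares. The variable name "y" and the values are made up.

```r
# The name to substitute is held in an ordinary variable.
var_name <- "y"

# String-interpolation: substitute a value into a string.
text <- sprintf("%s + 1", var_name)
print(text)

# Quasi-quotation: substitute a value into unevaluated code.
# .() marks the part of the quoted expression to fill in.
expr <- bquote(.(as.name(var_name)) + 1)
print(expr)

# The result is code, not text, so it can be evaluated later.
result <- eval(expr, list(y = 2))
print(result)
```

Both steps fill a hole with the same value; the difference is that one produces a string and the other produces an expression.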

Automated Dashboard for Classification Neural Network in R

In this article, you learn how to make an Automated Dashboard for Classification Neural Network in R. First you need to install the `rmarkdown` package into your R library. Assuming that you installed `rmarkdown`, next you create a new `rmarkdown` script in R. After this … Read more

My course on Hyperparameter Tuning in R is now on Data Camp!

I am very happy to announce that (after many months) my interactive course on Hyperparameter Tuning in R has now been officially launched on Data Camp! Course Description For many machine learning problems, simply running a model out-of-the-box and getting a prediction is not enough; you want the best model with the most accurate prediction. … Read more

ROC Curves

I have been thinking about writing a short post on R resources for working with (ROC) curves, but first I thought it would be nice to review the basics. In contrast to the usual (usual for data scientists anyway) machine learning point of view, I’ll frame the topic closer to its historical origins as a … Read more
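
For readers who want the basics in code form, the following is a minimal base-R sketch of how an ROC curve and its area can be computed. The labels and scores are invented toy values, and this is not taken from the post itself.

```r
# Toy data: 1 = positive class, 0 = negative class (values are made up).
labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)

# Sweep thresholds from high to low; predict positive when score >= threshold.
thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
tpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 1) / sum(labels == 1))
fpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 0) / sum(labels == 0))

# Each (fpr, tpr) pair is one point on the ROC curve.
roc_points <- data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
print(roc_points)

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
print(auc)
```

The area equals the fraction of positive/negative pairs in which the positive case receives the higher score, which for this toy data is 8 out of 9.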

RStudio Connect 1.7.0

RStudio Connect is the publishing platform for everything you create in R. In conversations with our customers, R users were excited to have a central place to share all their data products, but were facing a tough problem. Their colleagues working in Python didn’t have the same option, leaving their work stranded on their desktops. Today, we are excited … Read more

AI, Machine Learning and Data Science Roundup: January 2019

A monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from Microsoft and elsewhere that I’ve noted over the past month or so. Open Source AI, ML & Data Science News Preview of Tensorflow 2.0 (the public preview … Read more

Using DataCamp reduces anxiety about learning R!

I used DataCamp’s excellent Introduction to R as Essential Prior Independent Study and found it made people a bit less worried about a term of R! I have a lot of fun teaching first year biology undergraduates but there are a few challenges in teaching data skills when they are not (perceived as) a student’s core discipline … Read more

Automated Dashboard for Credit Modelling with Decision trees and Random forests in R

In this article, you learn how to make an Automated Dashboard for Credit Modelling with Decision trees and Random forests in R. First you need to install the `rmarkdown` package into your R library. Assuming that you installed `rmarkdown`, next you create a new `rmarkdown` script … Read more

Lecture slides: Real-World Data Science (Fraud Detection, Customer Churn & Predictive Maintenance)

These are slides from a lecture I gave at the School of Applied Sciences in Münster. In this lecture, I talked about Real-World Data Science and showed examples on Fraud Detection, Customer Churn & Predictive Maintenance. The slides were created with xaringan. … Read more

Use foreach with HPC schedulers thanks to the future package

The future package is a powerful and elegant cross-platform framework for orchestrating asynchronous computations in R. It’s ideal for working with computations that take a long time to complete; that would benefit from using distributed, parallel frameworks to make them complete faster; and that you’d rather not have locking up your interactive R session. You can … Read more

Feature Selection using Genetic Algorithms in R

This is a post about feature selection using genetic algorithms in R, in which we will do a quick review of the following: What are genetic algorithms? GA in ML? What does a solution look like? GA process and its operators. The fitness function. Genetic Algorithms in R! Try it yourself. Relating concepts. Animation source: “Flexible Muscle-Based … Read more

Using clusterlab to benchmark clustering algorithms

Clusterlab is a CRAN package (https://cran.r-project.org/web/packages/clusterlab/index.html) for the routine testing of clustering algorithms. It can simulate positive (data-sets with >1 clusters) and negative controls (data-sets with 1 cluster). Why test clustering algorithms? Because they often fail in identifying the true K in practice, published algorithms are not always well tested, and we need to know … Read more

Selecting ‘special’ photos on your phone

At the beginning of the new year I always want to clean up my photos on my phone. It just never happens. So now (like so many others I think) I have a lot of photos on my phone from the last 3.5 years. The iPhone photos app helps you a bit to go through … Read more

Mango Solutions contributes to technology partners RStudio conference

As leading advanced analytics partner for RStudio, Mango Solutions are delighted to be contributing to the upcoming rstudio::conf programme with a workshop and a talk. Two of Mango’s senior consultants, Aimée Gott, Education Practice Lead and Mark Sellors, Head of Data Engineering will be sharing their R expertise with delegates. Aimée Gott will be delivering the Intermediate … Read more

Neural Text Modelling with R package ruimtehol

Last week the R package ruimtehol was released on CRAN (https://github.com/bnosac/ruimtehol) allowing R users to easily build and apply neural embedding models on text data. It wraps the ‘StarSpace’ library (https://github.com/facebookresearch/StarSpace), allowing users to calculate word, sentence, article, document, webpage, link and entity ‘embeddings’. By using the ‘embeddings’, you can perform text-based multi-label classification, … Read more

Understanding the Magic of Neural Networks

Everything “neural” is (again) the latest craze in machine learning and artificial intelligence. Now what is the magic here? Let us dive directly into a (supposedly a little silly) example: we have three protagonists in the fairy tale Little Red Riding Hood: the wolf, the grandmother and the woodcutter. They all have certain qualities and little … Read more

Scaling H2O analytics with AWS and p(f)urrr (Part 2)

This is the second installment in a three-part series on integrating H2O, AWS and p(f)urrr. In Part II, I will showcase how we can combine purrr and h2o to train and stack ML models. In the first post we looked at starting up an AMI on AWS which acts as the infrastructure upon which … Read more

My presentations on ‘Elements of Neural Networks & Deep Learning’ -Parts 4,5

This is the next set of presentations on “Elements of Neural Networks and Deep Learning”.  In the 4th presentation I discuss and derive the generalized equations for a multi-unit, multi-layer Deep Learning network.  The 5th presentation derives the equations for a Deep Learning network when performing multi-class classification along with the derivations for cross-entropy loss. The corresponding … Read more

splashr 0.6.0 Now Uses the CRAN-nascent stevedore Package for Docker Orchestration

The splashr package [srht|GL|GH] — an alternative to Selenium for javascript-enabled/browser-emulated web scraping — is now at version 0.6.0 (still in dev-mode but on its way to CRAN in the next 14 days). The major change from version 0.5.x (which never made it to CRAN) is a swap out of the reticulated docker package with … Read more

R Tip: Use Inline Operators For Legibility

R Tip: use inline operators for legibility. A Python feature I miss when working in R is the convenience of Python’s inline + operator. In Python, + does the right thing for some built-in data types: It concatenates lists: [1,2] + [3] is [1, 2, 3]. It concatenates strings: 'a' + 'b' is 'ab'. … Read more
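
R lets users define their own infix operators by naming a function between percent signs, which is one way to get similar inline convenience. The sketch below is only an illustration: the operator names `%+%` and `%c%` are hypothetical choices made here, not base R operators and not necessarily the ones the post recommends.

```r
# Hypothetical infix operator for string concatenation (not part of base R).
`%+%` <- function(lhs, rhs) paste0(lhs, rhs)

greeting <- "a" %+% "b"
print(greeting)

# Vectors concatenate with c(); a second hypothetical operator wraps it.
`%c%` <- function(lhs, rhs) c(lhs, rhs)

combined <- c(1, 2) %c% 3
print(combined)
```

Some packages define operators with the same names, so in a real project it is worth checking for clashes before picking one.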

ggeffects 0.8.0 now on CRAN: marginal effects for regression models #rstats

I’m happy to announce that version 0.8.0 of my ggeffects-package is on CRAN now. The update has fixed some bugs from the previous version and comes along with many new features or improvements. One major part that was addressed in the latest version is fixes and improvements for mixed models, especially zero-inflated mixed models (fitted … Read more

pcLasso: a new method for sparse regression

I’m excited to announce that my first package has been accepted to CRAN! The package pcLasso implements principal components lasso, a new method for sparse regression which I’ve developed with Rob Tibshirani and Jerry Friedman. In this post, I will give a brief overview of the method and some starter code. (For an in-depth description … Read more

rOpenSci’s new Code of Conduct

We are pleased to announce the release of our new Code of Conduct. rOpenSci’s community is our best asset and it’s important that we put strong mechanisms in place before we have to act on a report. As before, our Code applies equally to members of the rOpenSci team and to anyone from the community … Read more

R Coding Style Guide

Language is a tool that allows human beings to interact and communicate with each other. The clearer we express ourselves, the better the idea is transferred from our mind to the other. The same applies to programming languages: concise, clear and consistent code is easier to read and edit. It is especially important if you … Read more

colorspace: New Tools for Colors and Palettes

A major update (version 1.4.0) of the R package colorspace has been released to CRAN, enhancing many of the package’s capabilities, e.g., more refined palettes, named palettes, ggplot2 color scales, visualizations for assessing palettes, shiny and Tcl/Tk apps, color vision deficiency emulation, and much more. Overview The colorspace package provides a broad toolbox for selecting … Read more
