Getting your toes wet in R: Hydrology, meteorology, and more

Importance of Hydrology Given that liquid water is essential to life on Earth, water research cuts across numerous disciplines including hydrology, meteorology, geography, climate science, engineering, ecology, and more. Numerous R packages have emerged from this diversity of approaches, and we recently gathered many of them into a new rOpenSci task view which we broadly … Read more

Categories R Tags ExcerptFavorite

Musings on missing data

I’ve been meaning to share an analysis I recently did to estimate the strength of the relationship between a young child’s ability to recognize emotions in others (e.g. teachers and fellow students) and her longer term academic success. The study itself is quite interesting (hopefully it will be published sometime soon), but I really wanted to … Read more

A unified syntax for accessing models’ information

The richness and variety of packages for building and fitting statistical models in R is absolutely astonishing and contributes to the language’s popularity. However, this diversity makes it hard for developers who want to create tools that work with different types of models. Indeed, the way to access models’ internal information (such as parameter names, … Read more

Finding a Difference that Matters

Determining Differences ANOVA, or Analysis of Variance, measures the likelihood that the mean is the same across a series of groups within the data. It does this by measuring the “between group variance” and “within group variance”. Between Group Variance The between group variance sums: The mean of each group subtracted from the overall data … Read more
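The between-group and within-group quantities mentioned in this excerpt can be sketched in a few lines of base R. This is a minimal illustration of the textbook one-way ANOVA decomposition, not necessarily the exact steps of the original post; the data values and variable names are made up for the example.

```r
# Toy data: three groups of four observations each (values are illustrative).
values <- c(4, 5, 6, 5,  7, 8, 9, 8,  1, 2, 3, 2)
group  <- factor(rep(c("a", "b", "c"), each = 4))

grand_mean  <- mean(values)
group_means <- tapply(values, group, mean)
group_sizes <- tapply(values, group, length)

# Between-group sum of squares: squared distance of each group mean
# from the overall mean, weighted by group size.
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)

# Within-group sum of squares: squared distance of each observation
# from its own group mean.
ss_within <- sum((values - group_means[as.character(group)])^2)

df_between <- nlevels(group) - 1
df_within  <- length(values) - nlevels(group)

# The F statistic is the ratio of the two mean squares.
f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

# Cross-check against R's built-in one-way ANOVA.
fit <- summary(aov(values ~ group))[[1]]
```

A large F statistic (and a small p-value) indicates that the group means differ by more than the within-group spread would suggest.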

Data Science R&D at TD Ameritrade

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Sean Law, a Senior Applied Researcher and Data Scientist at TD Ameritrade. Hugo: Hi there, Sean, and welcome to DataFramed. Sean: Thanks for having me, Hugo. Hugo: It’s a real pleasure to have you on the show, and I’m really excited to have you … Read more

One week EARL London abstract extension – 8 April

Due to quite a few panicked email requests over the weekend, we have decided to extend the EARL London abstract deadline until 8 April. Thank you so much to all the people who have already submitted; we can’t wait to look through all of the brilliant submissions. Submit your abstract here. … Read more

RInside Help in Testing

A problem that arises when building R interfaces to C/C++ libraries involves testing: how to go about replicating the existing C/C++ tests in R without undue effort. If the C/C++ tests are simple and small enough, they can be manually translated. However, when there are many tests, and each test initializes its own large data structures, … Read more

Shiny Apps for Interactive Data Analysis

We are excited and happy to share a set of Shiny apps built for interactive data analysis and teaching at Rsquared Academy. The apps are part of our R packages and presently cover the following topics: Descriptive Statistics, Probability Distributions, Hypothesis Testing, Linear Regression, Logistic Regression, RFM Analysis, and Data Visualization. We would suggest that you explore … Read more

Easyalluvial 0.2.0 released

easyalluvial allows you to build exploratory alluvial plots (sankey diagrams) with a single line of code while automatically binning numerical variables. In version 0.2.0 marginal histograms improve the visibility of those numerical variables. Further a method has been added that creates model agnostic 4 dimensional partial dependence alluvial plots to visualise the response of statistical … Read more

Monotonic Binning with GBM

In addition to monotonic binning algorithms introduced in my previous post (https://statcompute.wordpress.com/2019/03/10/a-summary-of-my-home-brew-binning-algorithms-for-scorecard-development), two more functions based on Generalized Boosted Regression Models have been added to my GitHub repository, gbm_bin() and gbmcv_bin(). The function gbm_bin() estimates a GBM model without the cross validation and tends to generate a more granular binning outcome. The function gbmcv_bin() estimates … Read more

Tips for R to Python and Vice-Versa seamlessly

When we at TATVA AI visit our clients, both data scientists and higher management often ask us how we deal with both Python and R simultaneously for client requests, as there is no universal preference among clients. Though the solution is not straightforward, I suggest exploiting common libraries for quick deployments, such as dfply (python) … Read more

Matrix-style screensaver in R

This post shares a short code snippet to make your own screensaver in R, The Matrix-style. The code takes a few seconds to complete. nx = 100 ny = 80 kk <- 110 x = sample(x = 1:nx, size = kk, replace = TRUE) y = seq(-1, -ny, length = kk) codes <- matrix(0:127, 8, … Read more

Using R: plotting the genome on a line

Imagine you want to make a Manhattan-style plot or anything else where you want a series of intervals laid out on one axis after one another. If it’s actually a Manhattan plot you may have a friendly R package that does it for you, but here is how to cobble the plot together ourselves with … Read more

mapedit 0.5.0 and Leaflet.pm

In our last post, mapedit and leaflet.js >1.0, we discussed remaining tasks for the RConsortium funded project mapedit. mapedit 0.5.0 fixes a couple of lingering issues, but primarily focuses on bringing the power of Leaflet.pm as an alternate editor. Leaflet.draw, the original editor in mapedit provided by leaflet.extras, is a wonderful tool but struggles with snapping and those pesky holes that we commonly … Read more

See you at useR! Toulouse

Hey all, just a quick post to give you some details about my workshop at useR! 2019, in Toulouse! Hacking RStudio: Advanced Use of your Favorite IDE About Have you ever wanted to become more productive with RStudio? Then this workshop is made for you! You’ve been wandering the web for a while now, reading … Read more

Website with Australian federal election forecasts by @ellis2013nz

The election forecasts Building on my recent blog posts, I’ve put up a page dedicated to forecasts of the coming Australian federal election. It takes the state space model of two-party-preferred vote from my first blog on polls leading up to this election, and combines it with a more nuanced understanding of the seats actually … Read more

Analyzing performances of cricketers using cricketr template

The cricketr package has several functions that perform different analyses on both batsmen and bowlers. The package has functions that plot percentage frequency of runs or wickets, runs likelihood for a batsman, relative run/strike rates of batsmen, and relative performance/economy rate for bowlers. Other interesting functions include batting performance moving average, forecast and … Read more

Wrapping up the stars project

Summary This is the fourth blog on the stars project, and it completes the R-Consortium funded project for spatiotemporal tidy arrays with R. It reports on the current status of the project, and current development directions. Although this project ends, with the release of stars 0.3 on CRAN, the adoption, update, enthusiasm and participation in the development of the stars … Read more

Quick Control Charts for AFL

To leave a comment for the author, please follow the link and comment on their blog: Analysis of AFL. … Read more

no country for old liars

A puzzle from the Riddler about a group of five persons, A,..,E, where all and only people strictly older than L are liars, all making statements about others’ ages: A: B>20 and D>16 B: C>18 and E<20 C: D<22 and A=19 D: E≠20 and B=20 E: A>21 and C<18 The Riddler is asking for the … Read more

drat 0.1.5: New release

A new version of drat just arrived on CRAN. And like the last time in December 2017 it went through as an automatically processed upgrade directly from the CRAN prechecks. Being a simple package can have its upsides… And like the last time, this release once again draws largely upon contributed pull requests. Neal Fultz … Read more

Cross Platform Super Dark IDE Theme, R-Studio Server

A recent post over at r-bar.net demonstrated how to use a Windows system utility to achieve a super dark RStudio theme. Super Dark Theme Well, if you liked that post but are on Linux or OSX or use RStudio Server, then we have a cross platform solution for you! RStudio is a web browser, … Read more

Explore your Researcher Degrees of Freedom

I am an applied economist working in the area of accounting and corporate transparency. I work with observational data a lot, meaning with data that is already available and not under my control. Whenever I set sail to design a test, there are a lot of decisions to make: Which sample should I use? What … Read more

Le Monde puzzle [#1088]

A board (Ising!) Le Monde mathematical puzzle in the optimisation mode, again: On a 7×7 board, what is the maximal number of locations that one can occupy when imposing at least two empty neighbours? Which I tried to solve by brute force and simulated annealing (what else?!), first defining a target targ=function(tabz){ sum(tabz[-c(1,9),-c(1,9)]-1.2*(tabz[-c(1,9),-c(1,9)]*tabz[-c(8,9),-c(1,9)] +tabz[-c(1,9),-c(1,9)]*tabz[-c(1,2),-c(1,9)] … Read more

Using RStudio and LaTeX

This post will explain how to integrate RStudio and LaTeX, especially the inclusion of well-formatted tables and nice-looking graphs and figures produced in RStudio and imported to LaTeX. To follow along you will need RStudio, MS Excel and LaTeX. Using tikzdevice to insert R Graphs into LaTeX I am a very visual thinker. If I … Read more

#21: A Third and Final (?) Post on Stripping R Libraries

Welcome to the 21st post in the reasonably relevant R ramblings series, or R4 for short. Back in August of 2017, we wrote two posts, #9: Compacting your Shared Libraries and #10: Compacting your Shared Libraries, After The Build, about “stripping” shared libraries. This involves removing auxiliary information (such as debug symbols and more) from … Read more

Super Dark IDE Theme, R-Studio, Inverted Color

A dark IDE theme may increase visual comfort and productivity for those spending extended amounts of time coding, writing, and reading at a computer terminal. Why? If your 9 to 5 has you chained to a computer, you’ve likely experienced eye strain. Typical symptoms include soreness, irritation, and difficulty focusing your vision. All of which … Read more

A bit more understanding of Cronbach’s alpha

Cronbach’s alpha reliability coefficient is one of the most widely used indicators of the scale reliability. It is used often without concern for the data (this will be a different text) because it is simple to calculate and it requires only one implementation of a single scale. The aim of this article is to provide some more insight into the … Read more
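To illustrate why the coefficient is described as simple to calculate, here is a minimal base R sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The function name `cronbach_alpha` and the toy scores are assumptions made for this example and are not taken from the original article.

```r
# Standard Cronbach's alpha for a respondents-by-items table.
# `cronbach_alpha` is a hypothetical helper name used only in this sketch.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                      # number of items in the scale
  item_vars <- apply(items, 2, var)     # variance of each item
  total_var <- var(rowSums(items))      # variance of the summed score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Made-up answers of five respondents to a three-item scale.
scores <- data.frame(
  q1 = c(1, 2, 3, 4, 5),
  q2 = c(2, 2, 3, 5, 5),
  q3 = c(1, 3, 3, 4, 4)
)

alpha <- cronbach_alpha(scores)
```

When all items move together perfectly the coefficient equals 1; items that share little variance push it towards 0.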

Deployment of Binning Outcomes in Production

In my previous post (https://statcompute.wordpress.com/2019/03/10/a-summary-of-my-home-brew-binning-algorithms-for-scorecard-development), I’ve shown different monotonic binning algorithms that I developed over time. However, these binning functions are all useless without a deployment vehicle in production. During the weekend, I finally had time to draft an R function (https://github.com/statcompute/MonotonicBinning/blob/master/code/calc_woe.R) that can be used to deploy the binning outcome and to apply the WoE … Read more

Bio7 3.0 Released

27.03.2019 A new release of Bio7 is available which is built upon Eclipse 4.11 and the latest Java OpenJDK. This new version comes bundled with OpenJDK 12, supports the dynamic compilation of Java 11 and fixes several annoying bugs on MacOSX (e.g., shutdown crashes). The R interface has been improved and the R-Shell now updates … Read more

R Studio Shortcuts and Tips

How can you work faster in R Studio? Do you really want to know? In this article, I would like to share with you some of my favorite productivity features of R Studio along with their respective shortcuts. I will also provide information about some other tools and techniques that are useful. I also prepared … Read more

Rome Was Not Built In A Day But widgetcard Was!

I saw a second post on turning htmlwidgets into interactive Twitter Player cards and felt somewhat compelled to make creating said entities a bit easier, so posited the following: Wld this be useful packaged up, #rstats? https://t.co/sfqlWnEeJV https://t.co/troKzmzTNv (TLDR/V: Single function to turn an HTML widget into a deployable interactive Twitter card) pic.twitter.com/uahB52YfE2 - boB Rudis (@hrbrmstr) … Read more

Could you be the next graduate Mango?

At Mango, we firmly believe that any decision can be better made using analytics and data. We also know that a company’s success is increasingly dependent on becoming data-driven. That’s where we come in. Our mission is to empower organisations to make informed decisions using data science and advanced analytics to drive bigger gains, lower … Read more

Inverse Statistics – and how to create Gain-Loss Asymmetry plots in R

Asset returns have certain statistical properties, also called stylized facts. Important ones are: Absence of autocorrelation: basically the direction of the return of one day doesn’t tell you anything useful about the direction of the next day. Fat tails: returns are not normal, i.e. there are many more extreme events than there would be if … Read more

Koning Filip lijkt op …

Last call for the course on Text Mining with R, held next week in Leuven, Belgium on April 1-2. The course description and subscription form are available at https://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r Some things you’ll learn … is that King Filip of Belgium is similar to public expenses if we just look at open data … Read more

February 2019: “Top 40” New CRAN Packages

One hundred and fifty-one new packages arrived at CRAN in February. Here are my “Top 40” picks organized into eight categories: Bioinformatics, Data, Machine Learning, Medicine, Statistics, Time Series, Utilities and Visualization. Bioinformatics Cascade v1.7: Implements a modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. See Jung et al. (2014) for … Read more

Markov chain Monte Carlo doesn’t “explore the posterior”

First some background, then the bad news, and finally the good news. Spoiler alert: The bad news is that exploring the posterior is intractable; the good news is that we don’t need to explore all of it. Sampling to characterize the posterior There’s a misconception among Markov chain Monte Carlo (MCMC) practitioners that the purpose … Read more

What is the interpretation of the diagonal for a ROC curve

Last Friday, we discussed the use of ROC curves to describe the goodness of a classifier. I did say that I would post a brief paragraph on the interpretation of the diagonal. If you look around, some say that it describes the “strategy of randomly guessing a class“, that it is obtained with “a diagnostic … Read more
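One of the readings quoted in this excerpt, random guessing, can be checked with a small simulation. The sketch below is an illustration under stated assumptions (simulated labels, scores drawn independently of the labels, an arbitrary grid of thresholds) and is not code from the original post.

```r
set.seed(42)

n <- 20000
truth <- rbinom(n, size = 1, prob = 0.3)  # simulated true classes
score <- runif(n)                         # scores unrelated to the classes

# Sweep a grid of thresholds and record both rates at each one.
thresholds <- seq(0, 1, by = 0.01)
tpr <- sapply(thresholds, function(t) mean(score[truth == 1] >= t))
fpr <- sapply(thresholds, function(t) mean(score[truth == 0] >= t))

# For an uninformative score the two rates stay close at every threshold,
# so the ROC points (fpr, tpr) hug the diagonal.
max_gap <- max(abs(tpr - fpr))
```

Plotting `fpr` against `tpr` with `plot(fpr, tpr, type = "l")` and adding `abline(0, 1)` shows the curve lying almost on top of the diagonal.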

Operator Notation for Data Transforms

As of cdata version 1.0.8 cdata implements an operator notation for data transforms. The idea is simple, yet powerful. First let’s start with some data. d <- wrapr::build_frame( "id", "measure", "value" | 1 , "AUC" , 0.7 | 1 , "R2" , 0.4 | 2 , "AUC" , 0.8 | 2 , "R2" , 0.5 … Read more

Critical Thinking in Data Science

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Debbie Berebichez, a physicist, TV host and data scientist who is currently the Chief Data Scientist at Metis in NY. Hugo: Hi there, Debbie, and welcome to DataFramed. Debbie: Hi, Hugo. It’s a pleasure of mine to be here. Hugo: It is such a … Read more

quantmod_0.4-14 on CRAN

I just pushed a new release of quantmod to CRAN! I’m most excited about the update to getSymbols() so it doesn’t throw an error and stop processing if there’s a problem with one ticker symbol. Now getSymbols() will import all the data it can, and provide an informative error message for any ticker symbols it could not import. Another cool feature … Read more

Play with the cyphr package

The cyphr package seems to be a good choice for a small research group that shares sensitive data over the internet (e.g., Dropbox). I ran a simple experiment myself to make sure it can actually serve my purpose. I did my experiment on two computers (using openssl): I created the test data on my Linux workstation running … Read more

Summer Interns 2019

We received almost 400 applications for our 2019 internship program from students with very diverse backgrounds. After interviewing several dozen people and making some very difficult decisions, we are pleased to announce that these twelve interns have accepted positions with us for this summer: Therese Anders: Calibrated Peer Review. Prototype tools to conduct experiments to … Read more

nice student project

In all of my undergraduate classes, I require a term project, done in groups of 3-4 students. Though the topic is specified, it is largely open-ended, a level of “freedom” that many students are unaccustomed to. However, some adapt quite well. The topic this quarter was to choose a CRAN package that does not use … Read more

ShinyProxy 2.2.0

ShinyProxy is a novel, open source platform to deploy Shiny apps for the enterprise or larger organizations. Secured Embedding of Shiny Apps Since version 2.0.1 ShinyProxy provides a REST API to manage (launch, shut down) Shiny apps and consume the content programmatically inside broader web applications or portals. This makes it possible to cleanly separate the responsibility for … Read more
