Geocomputation with R – the afterword
I am extremely proud to announce that Geocomputation with R is complete. It took Robin, Jannes, and me almost 2 years of collaborative planning, writing, refinement, and deployment to make the … Read more
Reusable modeling pipelines are a practical idea that gets re-developed many times in many contexts. wrapr supplies a particularly powerful pipeline notation, and a pipe-stage re-use system (notes here). We … Read more
A new Le Monde mathematical puzzle in the digit category: Find the largest number such that each of its internal digits is strictly less than the average of its two … Read more
By Mark Niemann-Ross, an author, educator, and writer who teaches about R and Raspberry Pi at LinkedIn Learning. I spend a LOT of time at r-project.org, in particular the sections for documentation and … Read more
Photo credit: Lynda.com Conditional Random Fields, Sequence Prediction, Sequence Labelling Parsing XML is a process that is designed to read XML and create a way for programs to use XML. … Read more
Introduction As a data scientist, I often find myself looking for external data sources that could be relevant for my machine learning projects. The problem is that it is uncommon … Read more
In this post, I’m going to review some of the top concepts I learned that turned me from a technical data scientist into a good data scientist. Two months ago, … Read more
The AI future is uncertain, but generally, I think it will improve life. I was one of the 900+ futurists interviewed for The Pew Research study released yesterday, “Artificial Intelligence and … Read more
An explanation of Bayes’ theorem and linear discriminant analysis. Overview: Previously, logistic regression was introduced for classification. Unfortunately, like any model, it presents some flaws: … Read more
We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive … Read more
As a follow-up to my previous study of Australian politicians on Twitter, I’ve decided to build a more sophisticated, autonomous solution. The idea at a glance: Regularly collect tweets from Members … Read more
The Undertaking Recently, I was involved in developing a machine learning solution for one of the largest North American steel manufacturers. The company wanted to leverage the power of ML … Read more
Select the probability threshold based on the ROC curve BigQuery provides an incredibly convenient way to train machine learning models on large, structured datasets. In an earlier article, I showed you … Read more
Q: How does a data scientist manage projects and teams? How do you make duration and resource estimations? These are great questions, and I think people don’t ask them enough, … Read more
It’s easy to create a function in R, but what if you want to call that function from a different application, with the scale to support a large number of … Read more
A series of insights and battle scars from the world of medical device design With the upcoming launch of one of our AI products, there has been a repeating question that … Read more
Detecting multiple objects in images and tracking them in videos In my previous story, I went over how to train an image classifier in PyTorch, with your own images, and then … Read more
Key Points of My Work. Disclaimers: I will present only a portion of the code I wrote for this competition; my teammates are absolutely not responsible for my awful and buggy … Read more
We’ll be back with Season 2 early in 2019 and to keep you thinking, curious and data focused in between seasons, we’re having a DataFramed challenge. The winner will get … Read more
You’ve heard a lot about student debt, but what about technical debt? It’s Friday evening in the Bahamas. You’re relaxing under a striped red umbrella with a succulent glass of … Read more
On December 9 2008, very nearly ten years ago, the first post on Revolutions was published. Way back then, this blog was part of a young startup called Revolution Computing, … Read more
The post 5½ Reasons to Ditch Spreadsheets for Data Science: Code is Poetry appeared first on The Lucid Manager. When I studied civil engineering some decades ago, we solved all … Read more
I’ve recently been enjoying The Riddler: Fantastic Puzzles from FiveThirtyEight, a wonderful book from 538’s Oliver Roeder. Many of the probability puzzles can be productively solved … Read more
From silver bullet syndrome to silver linings In this article I plan to share with you our recent experience pitching AI to business folk, and what lessons we learned along the … Read more
Great post Yash! For those readers interested in getting data from the Fitbit API using R, I’ve documented the process here: https://towardsdatascience.com/the-gamification-of-fitbit-how-an-api-provided-the-next-level-of-training-eaf7b267af00 … Read more
Y’all it may have taken me a little time, but I did listen. Thank you for your emails. Because of you, I have now updated my ggmap tutorial to address … Read more
During my training classes, during or after the discussion of the common machine learning models, I usually bring up one topic: the usage of insights from these models or … Read more
Everyone who has ever used R has, at one time or another, wished for an increase in R’s speed. If you haven’t, you’re not using R hard enough! Recently, as … Read more
Reducing Costs By Improving Processes Our team was recently asked how data analytics and data science can be used to improve bottlenecks and patient flows in hospitals. Healthcare providers and … Read more
Imagine you have sent your alien friend (optimization algorithm) to a supermarket (hypothesis space) to buy your favorite cheese (solution). The only clue she has is the picture of … Read more
I will run an 8-hour course on R and Data Mining at Black Mountain, CSIRO, Australia on 10 & 13 December 2018. The course materials, incl. slides, R scripts and … Read more
New minor release of the R/exams package to CRAN, containing a range of smaller improvements and bug fixes. Notably scanning of written NOPS exams is enhanced and made more reliable … Read more
Questions related to this repository from a project I created almost three years ago are among the most numerous questions I receive. The repository itself is really nothing too special, … Read more
The challenge of finding the right view through data Edward Hugh Simpson, a statistician and former cryptanalyst at Bletchley Park, described the statistical phenomenon that takes his name in a technical … Read more
In the previous part (Part I) of the word representation series, I talked about fixed word representations that make no assumption about semantics (meaning) and similarity of words. In this … Read more
A few months ago I joined the R/Pharma conference in Cambridge, MA. As a takeaway, I thought about my project and how I could improve it with solutions others provided. … Read more
A walk-through of implementing AlphaZero using custom TensorFlow operations and a custom Python C module I describe here my implementation of the AlphaZero algorithm, available on Github, written in Python with … Read more
I taught my Data Visualization seminar in Philadelphia this past Friday and Saturday. It covers most of the content of my book, including a unit on making maps. The examples … Read more
Tensorflow is great. Really, I mean it. The problem is it’s great up to a point. Sometimes you want to do very simple things, but tensorflow is giving you a … Read more
Motivation: The dplyr functions select and mutate are now commonly applied to perform data.frame column operations, frequently combined with magrittr’s forward pipe %>%. While working well interactively, however, these methods … Read more
Dec 8, 2018. Graphs provide us with a very useful data structure. They can help us to find structure within our data. With the advent … Read more
Our Moneyball approach to the EPL Fantasy League My friend and I have been playing the Official Fantasy English Premier League game for many years, and despite our firm belief that … Read more
Previously, I wrote an article about estimating distributions using nonparametric estimators, where I discussed the various methods of estimating statistical properties of data generated from an unknown distribution. This article … Read more
What should I expect before a round of the popular drinking game? Recommended equipment for Ride The Bus College. It’s a time for things like exploring your personality, finding your values, … Read more
The intersection of chemistry and A.I. A.I. has seen unprecedented growth in the past couple of years. Although machine learning architectures like Neural Networks (NN) have been known for a long time … Read more
… this week that I made a first cameo in the debian/changelog for the Debian R package: r-base (0.63.1-1) unstable; urgency=low New upstream release Linked html directory to /usr/doc/r-base/doc/html (Dirk … Read more
This note is a comment on some of the timings shared in the dplyr-0.8.0 pre-release announcement. The original published timings were as follows: With performance metrics: measurements are marketing. So … Read more
Deep Learning: What are we talking about? A quick search on Google gives us the following definition of “deep learning”: “the ensemble of deep learning methods is a part of a … Read more