How to Think of Pandas Data Visualization If You’re Coming From Excel

Building a Mental Model for Data Visualization in Pandas. Having read a lot of tutorials on Pandas data visualization, I still can’t grasp the mechanics of it. Creating even a simple plot always requires me to look into the documentation. And even after running the code and getting the … Read more

Prediction Intervals in Linear Regression

This post covers how to calculate prediction intervals for Linear Regression. Normally when modeling, we get a single value from a regression model. But what if that value is used to plan or make important decisions? Then a single value may overstate our confidence when we’d like to know our uncertainty or error margin. Sure, … Read more

The Pandemic of Uncertainty

COVID-19 and the Hazy Progress of Science. If only science were easy to decipher, perhaps there would be more consensus on how to best handle the COVID-19 crisis. But alas, so many questions remain. For example: Are we past the peak? Did we flatten the curve? Is a second … Read more

Conv2d: Finally Understand What Happens in the Forward Pass

A visual and mathematical explanation of the 2D convolution layer and its arguments. Deep learning libraries and platforms such as TensorFlow, Keras, PyTorch, Caffe or Theano help us in our daily lives so that every day new applications make us think “Wow!”. We all have our favorite framework, but what they all have in common … Read more

Atom’s Hydrogen for writing data science projects using Python

This is, of course, a very easy step to do. Just head over to and download the latest version of Atom. Once downloaded, simply copy and paste the Application file to your, well, Applications folder. “Then just double click to open it, right?” — Well, not really. Many Mac users, including myself, found that … Read more

Lessons learned scaling an open source machine learning platform

Takeaways from a year of open source development In 2019, we open-sourced Cortex, our model serving platform. In the intervening months, Cortex has received thousands of GitHub stars and has been put into production by students, startups, and large corporations around the world. While open source has been fundamental to Cortex’s growth, building an open … Read more

Checking your Package for Compatibility with R 4.0.0

As a package maintainer you might be observing an increasing number of questions raised by people that have recently migrated to R 4.0.0 and are now trying to get your package to work. Yet, rhub::check_with_rrelease() currently still uses R 3.6.3 as test base. While migrating to a new R version is always tempting maybe you … Read more

How To Work From Home — Developers Edition

Popular blogs are packed with “game-changing” productivity techniques: Pomodoro, Getting Things Done, Pareto Principle to name just a few. I have tried a few of them, but the main issue is that when you do something creative like programming, they do not fully apply. I will share those techniques that work best for me, than … Read more

Bye-bye Python. Hello Julia!

Don’t get me wrong. Python’s popularity is still backed by a rock-solid community of computer scientists, data scientists and AI specialists. But if you’ve ever been at a dinner table with these people, you also know how much they rant about the weaknesses of Python. From being slow to requiring excessive testing, to producing runtime … Read more

AI and Our Complex Future

Change, Instability and Disruption By Dirk Knemeyer and Jonathan Follett How do we navigate a possible future where AI and emerging technologies remake the landscape of large-scale systems such as science, technology, society, and policy? We spoke with Sam Arbesman, Scientist-in-Residence at Lux Capital and the author of two award-winning books, most recently “Overcomplicated: Technology … Read more

Covid Data Science Launches Forecast Engine

Building a working forecast model for COVID-19. As Covid-19 continues to cause disruption and sickness throughout the world, we decided to create a tool that would combine population with recent cases to forecast the probability of getting infected. This tool is now available for members of the public and government agencies to use. The data … Read more

Amazon RDS for SQL Server now supports Windows authentication in more AWS Regions

Amazon RDS for SQL Server expands support for Windows authentication using the AWS Managed Microsoft AD service in more AWS Regions. You can now use Windows authentication in these additional Regions: US West (N. California), South America (São Paulo), Asia Pacific (Mumbai), EU (Paris), and Asia Pacific (Hong Kong).

Using Machine Learning to Improve Customer Retention

Online grocers have long marketed a message of convenience. Unfortunately this strength is also their biggest weakness — customers switch between platforms regularly, choosing the one that is most efficient. This means that building brand loyalty is difficult, but key to customer retention. Sources: Ads of the World, Amazon, Design Covert Loyalty in grocery shopping … Read more

Amazon CodeGuru Reviewer launches new, more cost-effective pricing model

Amazon CodeGuru Reviewer evaluates your code against best practices observed in popular open source code repositories and Amazon’s own code base. It identifies potential defects that are difficult for developers to find and recommends changes that improve the overall code quality. Currently Amazon CodeGuru Reviewer supports Java applications in GitHub and Amazon CodeCommit repositories. This … Read more

AWS DeepComposer announces real-time visualizations for in-console model training and improved interactivity in learning capsules

AWS DeepComposer gives developers a creative way to learn machine learning. With DeepComposer, developers can get hands-on, literally, with a musical keyboard and the latest machine learning techniques to expand their ML skills. Developers can build generative AI models without having to write a single line of code by using the included sample models and … Read more

Numpy Essentials for Data Science

This post covers the basics of NumPy that you should practice to make your life easier as a Data Scientist/Analyst. Topics to be covered: Installation and Import; NumPy array creation (1D or multidimensional); Indexing and Slicing; Attributes of a NumPy array; Basic mathematical operations; Array manipulation functions. Installation and Import: Before we get down to talking … Read more

What can a PhD add to your data science career?

(And what does it cost you?) There are many career paths towards data science. Even though the field was mostly populated by people with academic backgrounds at the beginning, this is definitely not the only valid entry point. The long-standing debate about whether or not you should have a PhD to be a data scientist … Read more

Understanding forwarding, peering, and private zones in Cloud DNS

The Domain Name System, or DNS, is one of the most foundational services of the Internet, turning human-friendly domain names into IP addresses. Often handled by specialized network engineers within an organization, DNS can feel like a black box to people who don’t deal with it often. For one, DNS terminology can be confusing, and … Read more

Learn 3 in-demand cloud skills in 30 days at no cost during the month of May

In April, we announced we were expanding our Google Cloud learning resources to support the growing number of people working and learning from home. Today, we are excited to announce that if you sign up by May 31, 2020, you can still enroll in Google Cloud training on both Pluralsight and Qwiklabs at no cost … Read more

Designing distributed systems using NALSD flashcards

There are many ways to design distributed systems. One way involves growing systems organically: components are rewritten or redesigned as the system handles more requests. Another method starts with a proof of concept. Once the system adds value to the business, a second version is designed from the ground up. At Google, we use a method … Read more

Experimenting With Hierarchical Reinforcement Learning for Planning

Utilizing Computational Techniques to Model How Humans Approach Cognitive Tasks What is HRL? Suppose Jane, sitting in her Boston apartment, gets a notification from her boss that there is an important business meeting happening in Paris next week. How will she go about planning this trip? Rather than go about goal achievement through a series … Read more

Enable remote work faster with new Windows Virtual Desktop capabilities

In the past few months, there has been a dramatic and rapid shift in the speed at which organizations of all sizes have enabled remote work amidst the global health crisis. Companies examining priorities and shifting resources with agility can help their employees stay connected from new locations and devices, allowing for business continuity essential … Read more

Combine values into a vector

[This article was first published on Quantargo Blog, and kindly contributed to R-bloggers]. Combine values into a vector Everything is a vector Get to know … Read more

Hyperparameter Tuning with Keras Tuner

Getting the most out of your models. Great data scientists do not settle for “okay”; they go beyond to achieve the extraordinary. In this article, we’ll review techniques data scientists use to create models that work great and win competitions. Getting the most out of our models means choosing … Read more

Detecting COVID-19 using Deep Learning

A practical approach to help the medical practitioners who are helping us in the battle against COVID-19. Coronavirus disease 2019 (COVID-19) is a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2. The disease originated in Wuhan, China in December 2019 and has since spread across the world, affecting more than 200 … Read more

Circular Queue or Ring Buffer

The Modulo Operator If we’re supposed to be increasing the head and the tail by 1 spot in the array how do we set it back to zero when we get to the end to ensure we keep looping through the array? We could do something like this: if (head + 1) = capacity head … Read more
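The excerpt above breaks off before showing the wrap-around itself, so here is a minimal sketch of the idea in Python, not the original post's code. The class name `RingBuffer`, its method names, and the `size` counter are all hypothetical choices made for this illustration. The point it shows is that `(index + 1) % capacity` sends the index back to zero at the end of the array, which replaces an explicit "if we reached capacity, reset to zero" check.

```python
class RingBuffer:
    """Fixed-capacity FIFO queue stored in a plain list (illustrative sketch)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items = [None] * capacity
        self.head = 0  # index of the oldest stored element
        self.tail = 0  # index where the next element will be written
        self.size = 0  # number of stored elements (assumed bookkeeping choice)

    def enqueue(self, value):
        if self.size == self.capacity:
            raise OverflowError("buffer is full")
        self.items[self.tail] = value
        # Modulo wraps the tail back to 0 once it passes the last slot.
        self.tail = (self.tail + 1) % self.capacity
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("buffer is empty")
        value = self.items[self.head]
        self.items[self.head] = None
        # The same wrap-around applies to the head.
        self.head = (self.head + 1) % self.capacity
        self.size -= 1
        return value
```

Tracking `size` separately is one of several ways to tell a full buffer from an empty one when `head == tail`; leaving one slot unused is a common alternative.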

DeepPavlov: “Keras” for Natural Language Processing answers COVID Questions

Training an Open Domain Question Answering model on a COVID papers dataset using Azure Machine Learning and the DeepPavlov library. In the field of image-related deep learning, the Keras library plays an important role, radically simplifying such tasks as transfer learning or using pre-trained models. If you switch to the area of NLP, … Read more

Sitcoms natural language comparison

A lot of the highest-scoring terms are minor character names or other names that were not removed, which is not surprising, but the method also picks up some terms that are genuinely characteristic of one show, such as ‘regional’, ‘sherry’, and ‘surgical’. The word ‘will’ appears for several shows because it is not considered a … Read more

Time sinks for data wrangling

[This article was first published on Bluecology blog, and kindly contributed to R-bloggers]. Estimating the time it will take to complete a project is an … Read more
