The New Data Engineering Stack

Data Engineering: Technologies for the Complete Data Engineer. Remember the time when the software development industry realized that a single person can take on multiple tightly coupled technologies and came up with the notion of a Full Stack Developer, someone who does data modelling, writes backend code and also does front … Read more

DeepMind Makes History Yet Again By Solving One of the Biggest Challenges in Biology

DeepMind’s AlphaFold can now predict a protein’s structure to the width of an atom. Photo by Photoholgic on Unsplash. You may have heard about “DeepMind” in the past, and if you haven’t, now you will. DeepMind has accumulated a number of achievements since it was founded, but it is most notable for … Read more

F-beta Score in Keras Part II

Creating a custom F-beta score for multi-class classification problems in Keras. Photo by Edgar Chaparro on Unsplash. In the previous article (part I), we explained stateless and stateful metrics in Keras, derived the formula for the F-beta score, and created both stateless and stateful custom F-beta metrics in Keras for binary classification problems. In this article (part … Read more

F-beta Score in Keras Part I

The big question now is which of precision and recall we should use as our evaluation metric. It depends on the context of our problem. Precision will be our metric of interest if a False Positive is more consequential than a False Negative, i.e. we want to avoid type I error more than … Read more
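As a rough illustration of the precision/recall trade-off described in this excerpt, the sketch below computes an F-beta score from raw confusion counts in plain Python. The helper name `fbeta_from_counts` and the example counts are assumptions made for illustration; this is not the Keras metric from the article. A beta below 1 weights precision more heavily, and a beta above 1 weights recall more heavily.

```python
def fbeta_from_counts(tp, fp, fn, beta=1.0):
    """F-beta from true positives, false positives and false negatives.

    Hypothetical helper for illustration only.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing was predicted correctly
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Assumed counts: 8 true positives, 2 false positives, 8 false negatives,
# which gives precision = 0.8 and recall = 0.5.
f_half = fbeta_from_counts(8, 2, 8, beta=0.5)  # leans toward precision
f_one = fbeta_from_counts(8, 2, 8, beta=1.0)   # balanced (F1)
f_two = fbeta_from_counts(8, 2, 8, beta=2.0)   # leans toward recall
print(f_half, f_one, f_two)
```

Because precision is higher than recall for these counts, the beta = 0.5 score comes out highest and the beta = 2 score lowest.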

React & D3: Rendering Circles On A Map

Image by author. In my last article, React & D3: Rendering A Map, I walked through the code to render a map using both the D3-within-React and D3(math)/React(DOM) approaches. It’s a great entry point for anyone with D3 expertise but relatively new to React. In this article I will perform one last … Read more

Analyzing Solar Power Energy (IoT Analysis)

[This article was first published on, and kindly contributed to R-bloggers]. … Read more

Categories R Tags ExcerptFavorite

Using RStudio with Github Classroom

On March 12th, GitHub launched the GitHub Classroom platform. TL;DR, you can continue. For the long story, click here. Classroom: For those who want to know more about the capabilities of GitHub Classroom, I recommend you start here. Using RStudio: Why do we need this tutorial? Well, GitHub Classroom already allows an auto-integration … Read more

Dataset of All ISL Results Season 1 and 2

The second ISL season wrapped up a couple of weeks ago, meaning that now we’re in the off-season. I loved watching the CBC’s coverage of the ISL. The races were exciting, the swimmers seemed like they were having a great time, they were absolutely swimming great times. My other sporting love is the NBA and … Read more

A NOTE on URL checks of your R package

[This article was first published on Posts on R-hub blog, and kindly contributed to R-bloggers]. Have you ever tried submitting your R package to CRAN … Read more

Selecting the Best Phylogenetic Evolutionary Model

In this blog post, I show how to use the mcbette R package in an informal way. A more formal introduction to mcbette can be found in the Journal of Open Source Software. After introducing a concrete problem, I will show how mcbette can be used to solve it. After discussing mcbette, I will … Read more

Advancing healthcare with the Healthcare Interoperability Readiness Program

Global Director, Healthcare Strategy and Solutions, Google Cloud; Vice President, Business Application Platform, Google Cloud

The 21st Century Cures Act, a United States law enacted at the end of 2016, mandates patient data interoperability for payers, providers, and healthcare organizations. As we approach rolling implementation deadlines, healthcare organizations are wrestling with how to liberate data from siloed systems—not only to give patients more granular control of their data, but also … Read more

Does Data Science Make You Happy?

© Yulia Lukashina. Career. How can you make sure that data science is your calling? If you feel good in your job, you are on the right path! I am totally convinced that everyone can be good (and earn good money!) only in the job they like doing. If you are bored by your … Read more

Testing AWS Lambdas Locally with a HTTPS Proxy

AWS Lambda is generally one of the easiest ways to deploy and execute code in the cloud, especially when deploying code with the sam CLI. The simplicity of serverless resource definitions coupled with the ability to package resources locally and ensure they run on AWS provides a beautiful development experience. Except sometimes this beautiful process … Read more

A Bayesian Model for Estimating the Effects of Covid-19 Vaccines

PyMC3, Bernoulli Distribution, Pfizer, Moderna, AstraZeneca. Because the trials are still ongoing, researchers caution against making head-to-head comparisons of vaccines based on incomplete data. But for the sake of learning, we will do it anyway, without drawing any meaningful conclusions. Recently, announcements went out that the potential effectiveness of SARS-CoV-2 vaccine candidates … Read more

Digital event: Explore how data and analytics will impact the future of your business

Planning strategic data and analytics initiatives is now critical for helping your organization build the agility and resilience needed to successfully navigate the future. Join us on December 3, 2020, from 9:00 AM to 10:30 AM Pacific Time (UTC-8), for the Shape Your Future with Azure Data and Analytics digital event to explore how data … Read more

Is Data Science worth it or overrated?

Opinion. And its true business meaning within digital transformation. Photo by National Cancer Institute on Unsplash. Businesses are increasingly reducing, offloading, and offshoring their data science capabilities. Once dubbed “The Sexiest Job of the 21st Century” by Harvard Business Review and other leading publications, what went wrong? A bit of background: I am a … Read more

3D Rotations: Intuitions And Limitations

Rotations In 2D: The most common types of rotations that everyone thinks about are rotations in the plane (2D rotations). 2D rotations are what we are used to when drawing diagrams on paper or seeing figures in high school textbooks. 2D rotations occur about a single axis, the “z-axis” perpendicular to the plane. We … Read more
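To make the idea of a planar rotation concrete, here is a minimal sketch that rotates a point counterclockwise about the origin (equivalently, about the z-axis perpendicular to the plane) using the standard 2D rotation matrix. The function name `rotate_2d` is an assumption made for illustration and does not come from the article.

```python
import math


def rotate_2d(x, y, theta):
    """Rotate the point (x, y) counterclockwise by theta radians about the origin.

    Applies the rotation matrix [[cos, -sin], [sin, cos]] to the column vector (x, y).
    Hypothetical helper for illustration only.
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


# A quarter turn takes a point on the positive x-axis onto the positive y-axis,
# up to floating-point rounding.
quarter_turn = rotate_2d(1.0, 0.0, math.pi / 2)
print(quarter_turn)
```

A single angle fully describes the rotation here, which is what makes the 2D case simpler than the 3D one.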

How to Get Yourself a Mentor

If not for the guidance of the handful of people I call mentors, I wouldn’t have grown to be a Senior Data Scientist. 90% of the success is all my hard work, but I attribute 10% of that growth to the crucial directions I was guided towards. After graduating, I didn’t know if I had … Read more

Moviegoer — 5 Reasons Why Cinema is the Perfect Dataset of Emotional and Anthropological Knowledge

Moviegoer. Movies demonstrate emotional communication atop valuable societal context. They are the perfect dataset for emotional AI models. What can movies teach AI models about how people communicate and express emotion? (image by author) Smart devices, digital assistants, and service chatbots are becoming ubiquitous, but they don’t yet have the emotional capacity they need to … Read more

Approaching a Machine Learning Problem

I have saved the dataset as heart.csv in the same directory as the notebook. Let’s load it using Pandas.
import pandas as pd
heart = pd.read_csv('heart.csv')
heart here is the pandas DataFrame that is created out of our csv file. Using this DataFrame, we can access several methods and properties of our data. Let us have … Read more

Impressive Medium Articles on AI/ML This Month (Nov)

Read Time: 33 mins. Target Audience: Machine Learning Engineers (Mobile focused). Review: This article is a transcription of a podcast; readers who prefer to listen rather than read can click here to view the video version. This article details the conversation held between Jeremie Harris and Matthew Stewart on subject matters concerning machine learning … Read more

How We Can Commoditize Data Integration Pipelines

Most engineers in their professional life will have to deal with data integrations. In the past few years, a few companies such as Fivetran and StitchData have emerged for batch-based integrations, and Segment for event-based ones. But none of these companies have solved the problem of data integrations, which becomes more and more complex with … Read more

Semi-supervised Anomaly Detection using Auto Encoders

A convolutional auto encoder based approach for semi-supervised anomaly detection in images. In this article, I’ll be discussing a paper [1] that proposes an AutoEncoder based approach for the task of semi-supervised anomaly detection. If you want to look at the GitHub repository link, results and conclusion directly, please scroll to the bottom of the … Read more

Data Preprocessing with Python Pandas — Part 3 Normalisation

Image by mohamed Hassan from Pixabay. This tutorial explains how to preprocess data using the Pandas library. Preprocessing is the process of doing a pre-analysis of data, in order to transform them into a standard and normalised format. Preprocessing involves the following aspects: missing values, data formatting, data normalisation, data standardisation, and data binning. In this … Read more

Explain by Example: Deep Learning (NN)

I was at an airport recently — a statement I didn’t think I would be able to make for another year or so. And I don’t know about you guys but I feel like I have been cursed by the airport gods because for some reason, I can never get past baggage check-in without overweight … Read more

Non-circular PCA and Neural Decoder

References: [1] N. Even-Chen, D.G. Muratore, S.D. Stavisky, et al. Power-saving design opportunities for wireless intracortical brain computer interfaces. (2020) Nat Biomed Eng. [2] X. Li, T. Adali, and M. Anderson. Noncircular principal component analysis and its application to model selection (2011). IEEE Transactions on Signal Processing, 59(10):4516–4528. [3] David A. Markowitz, Yan T. … Read more

Learn How Neural Networks Learn

What was the problem? Well, I trusted my sister more than my parents to give me the right answer. I assigned more weight to my sister’s explanation than my parents’ answer. Next time, I’ll know to listen more to my parents and less to my sister. This process of adjusting weights is called backpropagation. When … Read more

A Beginners Guide to Feature Engineering with QGIS

Install QGIS: QGIS can be painlessly installed from their website here. At the time of writing I’m using version 3.16.1 so expect things to look slightly different in later versions. Ground Yourself! When you first open QGIS you’ll be faced with a blindingly white background with tons of tiny buttons on the top and sides. … Read more

Simple Yet Effective Data Preprocessing Toolbox

Figure: Binned Average Monthly Hours (Left: fixed-width cut, Right: quartile cut). Occasionally there are many null values, which might cause errors during one-hot encoding or machine learning operations. Therefore, we might want to remove or impute the null values in the specified features. There are many strategies to impute values such as median, mean, … Read more

MARS: Multivariate Adaptive Regression Splines — How to Improve on Linear Regression?

Machine Learning. A visual explanation of the MARS algorithm with Python examples and comparison to linear regression. Model prediction comparison between MARS and Linear Regression. Image by author. Machine Learning is making huge leaps forward, with an increasing number of algorithms enabling us to solve complex real-world problems. This story is part of a deep … Read more

Five Views of AI Risk

Source: Photo by Loic Leray on Unsplash. Thirty years from now, will we look back at 2020 as the year when AI discriminated against minority groups, disinformation propagated by special interest groups and aided by AI-based personalization caused political instability, and deep fakes and other AI-supported security infringements rendered AI untrustworthy and propelled us into … Read more

Sorting a Dictionary in Python

Using the items() method: If we want to get a sorted copy of the entire dictionary, we need to use the dictionary items() method:
print(dictionary_of_names.items())
# dict_items([('beth', 37), ('jane', 32), ('john', 41), ('mike', 59)])
Notice how the items() method returns a dict_items object, which looks similar to a list of tuples. This dict_items object is an … Read more
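One way to turn the items() view into a sorted copy of the dictionary is sketched below. The dictionary contents are taken from the printed output in the excerpt; the variable names `by_name`, `by_age`, and `by_age_desc` are assumptions for illustration. The sketch relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
dictionary_of_names = {"beth": 37, "jane": 32, "john": 41, "mike": 59}

# sorted() accepts the (key, value) tuples from items(); by default it
# compares the tuples, so the result is ordered by key first.
by_name = dict(sorted(dictionary_of_names.items()))

# key= selects what to sort on: pair[1] is the value of each (key, value) tuple.
by_age = dict(sorted(dictionary_of_names.items(), key=lambda pair: pair[1]))

# reverse=True flips the order, giving the largest value first.
by_age_desc = dict(
    sorted(dictionary_of_names.items(), key=lambda pair: pair[1], reverse=True)
)

print(by_age)
```

Each call builds a new dictionary, so the original `dictionary_of_names` is left unchanged.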

How to Train and Deploy Custom AI-Generated Quotes using GPT2, FastAPI, and ReactJS

Generate quotes at will. Good quotes help make us stronger. What is truly inspiring about quotes is not their tone or contentedness but how those who share them reflect life experiences that really serve others. I didn’t write the above quote about quotes (Quote-ception; bad pun?), but an AI model I trained did. And it … Read more

The importance of probabilistic thinking when faced with uncertainty

Why probabilistic thinking can be an extraordinarily powerful and practical tool to approach the unknown and uncontrollable. Photo by Loic Leray on Unsplash. In a video published in November 2020, physician and philosopher of science Etienne Klein called for engineers to express their opinions and participate in a public debate. As a result, I decided … Read more

Introducing the Notion API Ruby Gem

The NotionAPI Gem is available for installation here, and the GitHub repository is here. To get started, all you need is to retrieve your token_v2 credentials (open a Notion session in the browser, navigate to cookies, and look for the token_v2 key). After that, you can begin a session with the following code:
require "notion_api"
@client … Read more