Towards Explainable Graph Neural Networks

Table of Contents: What are Graph Convolutional Networks (GNNs); Current Challenges with Explainability for GNNs; First Attempt: Visualizing Node Activations; Reusing Approaches from Convolutional Neural Networks; Model-Agnostic Approach: GNNExplainer; About Me; References. Foreword: This is a slightly more advanced tutorial that assumes a basic knowledge of Graph Neural Networks and a little bit of computational … Read more

Azure Files share snapshot management by Azure Backup is now generally available

Microsoft Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard Server Message Block (SMB) protocol. For users of Azure Files, share snapshots have offered a read-only version of file shares from a previous point in time. Share snapshots are also incremental in nature, making their storage usage … Read more

Movie Recommendation With Recommenderlab

[This article was first published on r-bloggers | STATWORX, and kindly contributed to R-bloggers]. Because You Are Interested In Data Science, You Are Interested In … Read more

Data science from scratch

Data Science, which is also known as the sexiest job of the century, has become a dream job for many of us. But for some, it looks like a challenging maze and they don’t know where to start. If you are one of them, then continue reading. In this post, I’ll discuss how you … Read more

The most important part of Artificial Intelligence Systems development

Nowadays, AI-related fields like Data Science and Machine Learning have become mainstream. Practically everyone can now become a data scientist. That’s, of course, a good thing, but unfortunately, people forget about what an AI system is. I prefer the notion by Patrick Winston from MIT: Artificial Intelligence (or Intelligent Agent) is a system based on … Read more

Acing the Verizon Media Data Science Interview

Verizon Media is the two-year-old subsidiary of the Verizon Communications conglomerate responsible for developing online, digital media and advertising services. The subsidiary emerged in the wake of Verizon’s well-chronicled acquisition of two companies that played a massive role in ushering in a host of widely-used digital services for online … Read more

Data Science, Machine Learning or AI: Where Should I Start?

Spoiler: the answer is often “none of the above”. One can barely look at the news these days without coming across a reference to data science, machine learning or artificial intelligence. And for good reason: the explosion in collected data via connected sensors and devices combined with breakthroughs in … Read more

Winning the Data Compression Game

How it’s possible to beat world-class compression algorithms in certain special cases. There are several powerful data compression programs in widespread use. Some famous examples are gzip, bzip2, and pkzip. Recently I started wondering if, given a specific input file to be compressed, I could create an algorithm that … Read more

4 for 4.0.0 – Four Useful New Features in R 4.0.0

[This article was first published on R – Detroit Data Lab, and kindly contributed to R-bloggers]. With the release of R 4.0.0 upon us, let’s … Read more

Language Modeling II: ULMFiT and ELMo

This is Part 2 of the 4-part series on language modeling. In the previous post, we understood the concept of language modeling and the way it differs from regular pre-trained embeddings like word2vec and GloVe. On our journey towards REALM (Retrieval-Augmented Language Model Pre-Training), we will … Read more

Beginners’ Guide to Image Classification: VGG-19, Resnet 50 and InceptionResnetV2 with TensorFlow

I still remember the first time I heard about deep learning and the thrill of hearing the term, thinking of it as a mysterious black box of all-mighty models. Image classification is a simple yet meaningful task for beginners to learn about deep learning and get used to its notation and concepts. In this project, I … Read more

Image Segmentation: Predicting Image Mask with Carvana Data

Data: The data set comes from an image masking challenge hosted on Kaggle by Carvana. Sample picture from the data set: the left picture is the original photo, the middle shows the mask, and the right shows the original with the background subtracted, keeping only the car. There are only two classes present: class 0: background; class 1: foreground, the car … Read more

What is Search-Based Analytics?

In addition to the general challenges presented by applying NLP to data to create search-based analytics, there are use-case specific challenges. If you survey how 100 companies are using business intelligence tools, you are likely to get 100 different answers. That’s because all industries work with different data. Even individual companies within the same industry … Read more

Testing for Covid-19 in the U.S.

For almost a month, on a daily basis, we have been working with colleagues (Romuald, Chi and Mathieu) on modeling the dynamics of the recent pandemic. I learn a lot of things discussing with them, but we keep struggling with the tests. Paul, in Montréal, helped me a little bit, but I think we will still … Read more

Workflow automation tools for package developers

[This article was first published on Posts on R-hub blog, and kindly contributed to R-bloggers]. As a package developer, there are quite a few things … Read more

shinyFeedback 0.2.0 CRAN Release

[This article was first published on Posts on Tychobra, and kindly contributed to R-bloggers]. I am excited to announce that shinyFeedback 0.2.0 is on its … Read more

New Orleans and Normalization

[This article was first published on R on kieranhealy.org, and kindly contributed to R-bloggers]. My post about Apple’s mobility data from a few days ago … Read more

lmSubsets: Exact variable-subset selection in linear regression

[This article was first published on Achim Zeileis, and kindly contributed to R-bloggers]. The R package lmSubsets for flexible and fast exact variable-subset selection is … Read more

Causal Models for Regression

Causal Model: Example. It is about time to introduce an example. Educational researchers are interested in the determinants of student achievement on standardized tests such as the SAT, ACT, GRE, PISA, and the like. The SAT test is assessed on a continuous scale ranging between 400 and 1600 points and is particularly amenable to regression analysis. A … Read more

Production machine learning isn’t hard anymore

Deploying machine learning microservices has gotten easier. Within many software engineering disciplines, production use cases are fairly standardized. Take web development, for example. To implement authentication in a web app, you wouldn’t invent a database, write your own hashing function, or design a new auth strategy. You’d use one of a few well-defined … Read more

Data Science for analytical minds | Introduction

Are you someone who: Has studied concepts of statistics, econometrics or mathematics? Loves reasoning, creating wild hypotheses and finding flaws in your concepts or theories? Is planning to start a career in data science* or looking for ways to accelerate your data science career? If yes, then you’ve landed on the right 5-part blog series … Read more

41 Questions to Test your Knowledge of Python Strings

1. How would you confirm that 2 strings have the same identity? The is operator returns True if 2 names point to the same location in memory. This is what we’re referring to when we talk about identity. Don’t confuse is with ==, the latter of which only tests equality.
animals = ['python', 'gopher']
more_animals = animals
print(animals == … Read more
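
The distinction between identity and equality described above can be illustrated with a short sketch. It uses lists, as the excerpt's own snippet does; the name `animals_copy` is introduced here purely for illustration. With strings the same operators apply, but whether two equal strings share one object depends on interpreter interning, so identity checks on strings should not be relied on.

```python
# Identity (is) vs. equality (==), illustrated with lists.
animals = ["python", "gopher"]
more_animals = animals          # a second name bound to the same list object
animals_copy = list(animals)    # hypothetical name: a new list with equal contents

same_object = animals is more_animals      # one object, two names
equal_copy = animals == animals_copy       # same contents
identical_copy = animals is animals_copy   # equal contents, different objects

print(same_object, equal_copy, identical_copy)  # True True False

# id() exposes the identity that `is` compares.
print(id(animals) == id(more_animals))  # True
```

Because `more_animals` is only another name for the same object, a mutation through either name is visible through both, which is not the case for the copy.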

AWS Config adds additional conformance packs

The AWS Control Tower Guardrails conformance pack template contains all of the guardrails that are based on AWS Config rules. Use this conformance pack to apply AWS Control Tower detective guardrails to your existing accounts before you enroll the accounts in AWS Control Tower. You can also use this conformance pack to manage resources in … Read more

K-Nearest Neighbors Classification from Scratch with NumPy

Welcome to another post on implementing machine learning algorithms from scratch with NumPy. In this post, I will implement K-nearest neighbors (KNN), a machine learning algorithm that can be used for both classification and regression. It falls under the category of supervised learning algorithms that predict target values for unseen observations. In … Read more
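
As a rough idea of what a from-scratch KNN classifier can look like, here is a minimal sketch. It is not the post's implementation: the function name `knn_predict`, the Euclidean distance metric, the tie-breaking rule, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query row by majority vote among its k nearest training rows."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_query = np.asarray(X_query, dtype=float)

    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Indices of the k closest training points for each query row.
    nearest = np.argsort(dists, axis=1)[:, :k]

    predictions = []
    for row in nearest:
        labels, counts = np.unique(y_train[row], return_counts=True)
        # argmax returns the first maximum, so ties go to the smallest label.
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)

# Hypothetical toy data: two well-separated clusters.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]]
y_train = [0, 0, 0, 1, 1, 1]
preds = knn_predict(X_train, y_train, [[0.05, 0.05], [5.0, 5.1]], k=3)
print(preds)
```

For regression, the majority vote would be replaced by an average of the neighbors' target values.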

Why The Central Limit Theorem in Data Science?

Today I’ll be discussing what the central limit theorem (or CLT) is and why it is important for every data science enthusiast to know. The central limit theorem states that for a given dataset with unknown distribution, the distribution of the sample means will approximate a normal distribution. In other words, the theorem states that as the size … Read more
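
The statement above can be checked with a small simulation. This is only a sketch under assumed settings: the exponential source distribution, the sample size of 50, and the 2000 repetitions are arbitrary choices, and `sample_means` is a hypothetical helper name.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(draw, sample_size, n_samples):
    """Draw n_samples samples of sample_size values each and return their means."""
    return [
        statistics.fmean([draw() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

# A clearly non-normal source: the exponential distribution with mean 1 and std 1.
means = sample_means(lambda: random.expovariate(1.0), sample_size=50, n_samples=2000)

mean_of_means = statistics.fmean(means)
std_of_means = statistics.stdev(means)

# Share of sample means within one standard deviation of their mean;
# for a normal distribution this is about 0.68.
within_one_std = sum(
    abs(m - mean_of_means) <= std_of_means for m in means
) / len(means)

print(round(mean_of_means, 3), round(std_of_means, 3), round(within_one_std, 3))
```

The mean of the sample means should land near 1, and their standard deviation near 1/sqrt(50), roughly 0.14, even though the individual draws are heavily skewed.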

Improving your security posture with centralized secrets management

Adopting centralized secrets management is an important step in improving your organization’s security posture. Centralized solutions provide unified auditing, access controls, and policy management, but many organizations struggle to install, configure, and drive internal adoption of these solutions due to lack of integrations, lack of experience, or organizational resistance. One of the biggest advantages of … Read more

Conda (+ pip) and Docker FTW!

A solution to the environment and package management problems that plague data science projects. Combining Conda envs and Docker containers can increase the reproducibility of your data science workflows. As this post focuses on incorporating Docker into an existing Conda (+ pip) workflow, I assume that you are already using Conda (+ pip) to … Read more

R package numbr 0.11.3 posted

[This article was first published on RStats – Tom Hopper, and kindly contributed to R-bloggers]. My simple package of useful numeric functions, numbr, has been … Read more

X is for scale_x

[This article was first published on Deeply Trivial, and kindly contributed to R-bloggers]. These next two posts will deal with formatting scales in ggplot2 – … Read more

Creating custom plotting functions with matplotlib

A short tutorial to learn how to create modular functions that involve plotting with matplotlib. TLDR: Define your own functions that involve plotting onto a specific axes with the following syntax:

    def custom_plot(x, y, ax=None, **plt_kwargs):
        if ax is None:
            ax = plt.gca()
        ax.plot(x, y, **plt_kwargs) ## example plot here
        return(ax)

    def multiple_custom_plots(x, y, ax=None, plt_kwargs={}, sct_kwargs={}):
        if ax is None:
            ax … Read more

Tips before migrating to a newer R version

This post is based on real events. Several times when I installed the latest version of R, and proceeded to install all the packages I had in the previous version, I encountered problems. It also applies when updating packages after a while. I decided to make this post after seeing the community reception to a … Read more

Microsoft Services is now a Kubernetes Certified Service Provider

Modern applications are increasingly built using containers, which are microservices packaged with their dependencies and configurations. For this reason, many companies are either containerizing their existing applications or creating new complex applications that are composed of multiple containers. As applications grow to span multiple containers deployed across multiple servers, operating them becomes more complex. To … Read more

ZeroR: The Simplest Possible Classifier… or: Why High Accuracy can be Misleading

In one of my most popular posts, “So, what is AI really?”, I showed that Artificial Intelligence (AI) basically boils down to autonomously learned rules, i.e. conditional statements or simply, conditionals. In this post, I create the simplest possible classifier, called ZeroR, to show that even this classifier can achieve surprisingly high values for accuracy … Read more
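
To make the point about misleading accuracy concrete, here is a minimal sketch. It assumes the usual definition of ZeroR, a classifier that ignores all features and always predicts the most frequent class. The original post works in R; the Python function names and the imbalanced toy labels below are hypothetical.

```python
from collections import Counter

def zero_r_fit(labels):
    """Return the most frequent class label; ZeroR ignores all features."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical imbalanced data: 95 negatives and 5 positives.
y = ["healthy"] * 95 + ["sick"] * 5

majority = zero_r_fit(y)
predictions = [majority] * len(y)
acc = accuracy(y, predictions)

# Recall on the minority class shows what accuracy hides.
sick_found = sum(t == "sick" and p == "sick" for t, p in zip(y, predictions))

print(majority, acc, sick_found)  # healthy 0.95 0
```

The classifier scores 95% accuracy without ever identifying a single positive case, which is why accuracy should be read against the majority-class baseline.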

Common Graph Theory Problems

This post aims to give an extensive yet intuitive set of problem statements and possible solutions using Graph Theory. Many of the problems we encounter every day can be rephrased as a graph problem or a closely related subproblem, so it helps to have some familiarity with different graph variations and their applications. If you … Read more

Neural Networks Overview

My intent is to walk with you through the main concepts of Neural Networks using analogies, math, code, plots, drawings, and mind maps. We focus on the building block of Neural Networks: Perceptrons. Throughout the article, we will confront the intimidating math and implement it using Python code with NumPy. We will also look at … Read more

Algorithm Bias In Artificial Intelligence Needs To Be Discussed (And Addressed)

A step forward would be the inclusion of diversity efforts at the early stages of any process or project within the AI industry. We can even go as far back as encouraging diversity in academic institutions. I have experienced the lack of diversity within the technology field. During my MSc in Machine learning and Computer … Read more

Einstein’s Gravity Theory and the Bending of Light by the Sun

The general theory of relativity (published in 1915), Albert Einstein’s geometric theory of gravity, is regarded by many scientists as probably the “most beautiful of all existing physical theories.” In this article, I will describe one of the classical tests of general relativity, arguably the most famous one, which is the bending of light by … Read more

Constructing, Manipulating, Classifying and Generating Audio with Digital Signal Processing and…

How is Sound Simulated with Digital Signal Processing? A digital signal, like an analog signal, is a representation of sound. There are two main aspects to a digital signal that allow a machine to render a representation. These two aspects can be thought of as X and Y; Time and Amplitude, respectively: Sample Rate and … Read more

Python List Methods

A summary of the list methods Python has to offer. In Python, a list is a collection of pieces of data. A list is surrounded by square brackets [ ] with each item separated by a comma ( , ), and can contain any number of items, from zero upward … Read more

What Does it Take to Break into Data Science with a Bachelor’s Degree?

Things I’ve learned from my experience completing a data science boot camp and landing a data science role with only an undergraduate degree. You’re a recent college grad or have a year or three of work experience after obtaining your non-data-science-related Bachelor’s degree, but you are starting to have … Read more

How Large Companies Evaluate AI Startups

As an AI Consultant, I often have to evaluate AI startups to determine if their solutions would make sense for us and if they can become a real business partner over the long term. Our process for evaluating startups is broadly similar to what VCs call due diligence. This need to evaluate ML … Read more

Apple Data Scientist Interview Questions and Answers!

Q: Describe the difference between L1 and L2 regularization, specifically with regard to the difference in their impact on the model training process. Both L1 and L2 regularization are methods used to reduce the overfitting of training data. Least Squares minimizes the sum of the squared residuals, which can result in low bias but high … Read more
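
One way to see the difference in training behavior is to compare how each penalty shrinks a weight vector. The sketch below is an illustration rather than the article's answer: all function names, the weights, and the step size are assumptions. It uses soft-thresholding as the L1 update and a proportional shrink as the L2 update.

```python
def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def l1_shrink(weights, step):
    """Soft-thresholding: move each weight toward zero by `step`, stopping at zero."""
    return [max(abs(w) - step, 0.0) * (1 if w > 0 else -1) for w in weights]

def l2_shrink(weights, step):
    """Proportional shrink: scale every weight by (1 - step)."""
    return [w * (1.0 - step) for w in weights]

# Hypothetical weights: one large, one medium, one tiny.
weights = [2.0, -0.5, 0.05]

l1_result = l1_shrink(weights, 0.1)  # the tiny weight is set exactly to zero
l2_result = l2_shrink(weights, 0.1)  # every weight shrinks but stays nonzero

print(l1_result)
print(l2_result)
print(l1_penalty(weights, 1.0), l2_penalty(weights, 1.0))
```

The L1 update removes a fixed amount from every weight, so small weights reach exactly zero and the model becomes sparse; the L2 update removes an amount proportional to each weight, so large weights are penalized most and none are eliminated outright.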

RcppArmadillo 0.9.870.2.0

[This article was first published on Thinking inside the box, and kindly contributed to R-bloggers]. Armadillo is a powerful and expressive C++ template library … Read more
