[Paper Summary] Distilling the Knowledge in a Neural Network

The authors open the paper with an analogy to explain that the requirements for training and inference can be very different. The analogy is that of a larva and its adult form, and the fact that the nourishment requirements of the two forms …

The Correct Way to Measure Inference Time of Deep Neural Networks

Network latency is one of the more crucial aspects of deploying a deep network into a production environment. Most real-world applications require very fast inference, anywhere from a few milliseconds to one second. But correctly and meaningfully measuring the inference time, or latency, of a neural network requires profound …
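As a rough illustration of what "measuring correctly" can involve, here is a minimal sketch that discards a few warm-up runs and then reports the mean and standard deviation over many timed runs. The `measure_latency` helper and the `dummy_inference` stand-in are hypothetical names introduced for this example, not code from the article.

```python
import statistics
import time


def measure_latency(fn, warmup=10, runs=100):
    """Return (mean, stdev) of fn() latency in milliseconds."""
    # Warm-up calls are not timed: the first calls often pay one-off costs
    # (caches, lazy initialization) that would skew the average.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()  # high-resolution monotonic clock
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


def dummy_inference():
    # Hypothetical stand-in for a model forward pass.
    return sum(i * i for i in range(10_000))


mean_ms, std_ms = measure_latency(dummy_inference)
print(f"mean {mean_ms:.3f} ms, stdev {std_ms:.3f} ms")
```

On a GPU, work is typically queued asynchronously, so a real measurement would also need to synchronize the device before reading the clock. This CPU-only sketch does not cover that.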

How to scrape ANY website with python and beautiful soup

Now you don’t need to know how HTML/CSS works (although it can be really helpful if you do). The only thing that’s important to know is that you can think of every HTML tag as an object. These HTML tags have attributes that you can query, and each one is different. Each line of code …

Measuring Agreement with Cohen’s Kappa Statistic

This lesser-known metric can help you better evaluate how models perform on imbalanced data. To me, many of the most intriguing use cases for classification involve identifying outliers. The outlier may be a spam message in your inbox, a diagnosis of an extremely rare disease, or an equity portfolio with extraordinary …
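To make the imbalanced-data point concrete, here is a small sketch of Cohen's kappa computed from scratch: observed agreement corrected by the agreement expected from the label frequencies alone. The `cohens_kappa` function and the spam/ham data are made up for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Fraction of items on which the two raters (or model and truth) agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each side's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Hypothetical imbalanced data: 5 spam messages among 100, and a model
# that labels everything "ham".
truth = ["ham"] * 95 + ["spam"] * 5
predicted = ["ham"] * 100

accuracy = sum(t == p for t, p in zip(truth, predicted)) / len(truth)
kappa = cohens_kappa(truth, predicted)
print(accuracy, kappa)
```

Accuracy looks excellent here because the majority class dominates, while kappa shows that the model agrees with the truth no more than chance would.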

Is Facial Recognition Technology Racist? State of the Art algorithms explained

Let’s break this function down one component at a time. The first component, face classification, simply penalizes the model for saying that there is a face at a location, while no face exists in the image. “Face box regression” is a fancy term for the distance between the bounding box coordinates of the predicted face and the …

A Complete Beginner’s Guide to Deal with NULL Values in SQL

We cannot use the comparison operators =, <, >, <> to test for NULL values. Instead, we have to use the IS NULL and IS NOT NULL predicates.
IS NULL: returns rows that contain NULL values.
Syntax: expression IS NULL
    SELECT ID, Student, Email1, Email2
    FROM tblSouthPark
    WHERE Email1 IS NULL AND Email2 IS NULL
    ORDER BY ID
The above query yields all records where both Email1 …
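The difference between = NULL and IS NULL is easy to check with Python's built-in sqlite3 module. In the sketch below, the table and column names follow the query above, while the three sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblSouthPark "
    "(ID INTEGER PRIMARY KEY, Student TEXT, Email1 TEXT, Email2 TEXT)"
)
# Hypothetical sample rows; Python's None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO tblSouthPark VALUES (?, ?, ?, ?)",
    [
        (1, "Stan", "stan@example.com", None),
        (2, "Kyle", None, None),
        (3, "Kenny", None, "kenny@example.com"),
    ],
)

# A comparison with NULL evaluates to NULL, never to true, so no row matches.
with_equals = conn.execute(
    "SELECT ID FROM tblSouthPark WHERE Email1 = NULL"
).fetchall()

# The IS NULL predicate is the correct way to test for missing values.
with_is_null = conn.execute(
    "SELECT ID, Student FROM tblSouthPark "
    "WHERE Email1 IS NULL AND Email2 IS NULL ORDER BY ID"
).fetchall()

print(with_equals)   # []
print(with_is_null)  # [(2, 'Kyle')]
```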

AI pseudoscience and scientific racism

Recent attempts to predict criminality from facial features recall a long tradition of unethical and racist pseudoscience. A recent paper about to be published by Harrisburg University caused quite a stir earlier this month. Titled “A Deep Neural Network Model to Predict Criminality Using Image Processing,” the paper promised: With 80 percent …

Coronavirus: Which country got it right?

Note from the editors: Towards Data Science is a Medium publication primarily based on the study of data science and machine learning. We are not health professionals or epidemiologists, and the opinions of this article should not be interpreted as professional advice. In this …

How (NOT) To Predict Stock Prices With LSTMs

Not so recently, a brilliant and ‘original’ idea suddenly struck me: what if I could predict stock prices using Machine Learning? After all, a time series can be easily modeled with an LSTM. I could see myself getting rich overnight! If this is so easy, why hasn’t anyone done it yet? Very excited at …

Why Building an AI Decentralized Autonomous Organization (AI DAO)

Beyond the already complex challenge of implementing AI, some companies have started analyzing the possible benefits of building AI Decentralized Autonomous Organizations (AI DAOs). During my latest mission, I had to help create new business models, identify the right AI approach, and create a roadmap for the creation of several AI DAOs proof of …

The Bechdel test and the X-Mansion with tidymodels and #TidyTuesday

[This article was first published on rstats | Julia Silge, and kindly contributed to R-bloggers.] Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, …

My Tableau dashboards sucked – until I started drawing them

Data visualization tools such as Tableau are loved and used because of how simple they make it to show correlations in large datasets. The exact reason they are used is also their biggest flaw. It’s too easy to simply click buttons until you find something that looks acceptable. Let’s look at some examples. I’ve recreated …

Why R? Webinar – JD Long – Helping drive data science adoption in organizations

[This article was first published on http://r-addict.com, and kindly contributed to R-bloggers.] July 2nd (8:00pm UTC+2) is the date of the last Webinar at Why …

one bridge further

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers.] Jackie Wong, Jon Forster (Warwick) and Peter Smith have just …

Learn How to Create Web Data Apps in Python

    import streamlit as st
    import pandas as pd
    import plotly.express as px
    import pydeck as pdk
    import numpy as np
    # Load and cache the data
    @st.cache(persist=True)
    def getmedata():
        url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv'
        df = pd.read_csv(url, delimiter=',', header='infer')
        df.rename(index=lambda x: df.at[x, 'Country/Region'], inplace=True)
        dft = df.loc[df['Province/State'].isnull()]
        dft = dft.transpose()
        dft = dft.drop(['Province/State', 'Country/Region', 'Lat', 'Long'])
        dft.index = pd.to_datetime(dft.index)
        return (dft, df)
    df1 = getmedata()[0]
    st.title('Building a Data Dashboard with Streamlit')
    st.subheader('while exploring COVID-19 data')
    ##### In Scope Countries
    countrylist …

Predicting Future Wars

Insights from Open Data and Machine Learning. I know what you are thinking: wars are rare and complicated events, and one can’t expect to take their entire complexity into account. And you are right: they spring from an intricate array of political, economic, and historical reasons, not to mention a thick coat of randomness, thus they should …

Industrialize Analytics — How do we get there?

This article is Part 1 of the series “Winning in Analytics!”. Let’s look at the key enablers for scaling your AI initiatives successfully. Dear AI Enthusiasts, we love to realize the full potential of our data! We would love to see our analytics proofs of concept become reality! But …

Neural Networks using Tensorflow via Keras in R – Video

[This article was first published on http://r-addict.com, and kindly contributed to R-bloggers.] On June 25th we had the pleasure of hosting the Why R? Webinar with …

Black-Scholes Option Pricing is Wrong

Theory, assumptions, problems, and solutions for practitioners. The equation offered by Black and Scholes (1973) is the standard theoretical pricing model for European options. The key word is “theoretical”: the Black-Scholes model makes some key assumptions that are immediately violated in practice. Key model assumptions: no transaction costs, no arbitrage …
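For reference, the textbook Black-Scholes formula for a European call on a non-dividend-paying asset can be sketched in a few lines using only the standard library. The function names and the input values below are illustrative assumptions, not taken from the article.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, years):
    """Black-Scholes price of a European call without dividends.

    rate and vol are annualized and continuously compounded; years is the
    time to expiry. Constant volatility and frictionless markets are assumed,
    which is exactly what tends to be violated in practice.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    return spot * norm_cdf(d1) - strike * exp(-rate * years) * norm_cdf(d2)


# Hypothetical inputs: at-the-money call, 5% rate, 20% volatility, one year.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, years=1.0)
print(f"{price:.4f}")
```

With these inputs the price comes out at roughly 10.45, which is a convenient sanity check when comparing against other implementations.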

How to Avoid Potential Machine Learning Pitfalls

This post is for all those data science aficionados out there who recently jumped onto the machine learning bandwagon. Whether you studied data science in college or are self-taught, most aspiring data scientists get a reality check when trying their hand at a machine learning project in a practical setting. I struggled with the …

A guide to setting up monitoring for object creation in Cloud Storage

Cloud Storage provides worldwide, highly durable object storage that scales to exabytes of data. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as websites, mobile applications, backup and restore, archive, and big data analytics. The objects are …

Machine Learning Explainability Introduction via eli5

Tree-based Feature Importance. A machine learning model such as a random forest is typically treated as a black box. Why? A forest consists of a large number of deep trees, where each tree is trained on bagged data using a random selection of features. Gaining a full understanding by examining each tree would be close to impossible. For …

How to accelerate and compress neural networks with quantization

Going from floats to integers. Neural networks are very resource-intensive algorithms. They not only incur significant computational costs but also consume a lot of memory. Even though commercially available computational resources increase day by day, optimizing the training and inference of deep neural networks is extremely important. If we run our …
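To give a feel for what "going from floats to integers" means, here is a minimal sketch of affine (scale and zero-point) quantization to 8-bit integers in plain Python. It is one common scheme and not necessarily the one the article uses; the function names and the sample weights are assumptions made for this example.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers in [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    # One integer step corresponds to `scale` in float space.
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for constant input
    # The integer that represents the float value 0.0.
    zero_point = round(qmin - lo / scale)
    quantized = [
        min(qmax, max(qmin, round(v / scale) + zero_point))  # clamp to range
        for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Approximate reconstruction of the original floats."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical weights.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, zero_point, max_error)
```

Each weight now fits in one byte instead of four, at the cost of a reconstruction error on the order of one quantization step.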

Factorization Machines for Item Recommendation with Implicit Feedback Data

Model Evaluation: Now that all the theory is out of the way, let’s see how these components come together to produce high-quality recommendations on a well-known real-world data set. We’ll train an implicit feedback FM model using the author’s new RankFM package, which implements the techniques described above, and compare its performance to the popular …

Five reasons to view this Azure Synapse Analytics virtual event

The virtual event Azure Synapse Analytics: How It Works is now available on demand. In demos and technical discussions, Microsoft customers explain how they’re using the newest Azure Synapse Analytics capabilities to deliver insights faster, bring together an entire analytics ecosystem in a central location, reduce costs, and transform decision-making. This post outlines five key …

Would You Rather be a Data Analyst or Data Scientist?

How does it feel to be in one of these roles? Find out here. Contents: Introduction, Data Analyst, Data Scientist, Summary, References. After working as both a professional data analyst and data scientist, I thought it would be insightful to highlight the experience of each position along with some key …

What Covid-related topics are being discussed in Spotify Podcasts?

To my excitement, Spotify already has a developer API that we can use to get data from Spotify or trigger certain actions for Spotify users. All we need to do is register on the site, create an app, and get the API token. Then, we can use the spotipy package in Python to retrieve …

Product Placement, Pricing and Promotion Strategies with Association Rule Learning

Using the Apriori algorithm to offer product recommendation, product placement, pricing and bundling strategies. Imagine if we could understand what our customers’ next purchase could be! Imagine if we could find patterns in purchase behaviour and use them to our advantage! The key to the future is in history! Market Basket Analysis helps retailers identify …
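Association rules are usually judged by support, confidence, and lift, and Apriori relies on support thresholds to prune candidate itemsets. The sketch below only computes these three metrics on a toy basket list and is not the Apriori algorithm itself; the helper names and transactions are invented for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """How much more often the items co-occur than if they were independent."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

s = support(baskets, {"bread", "butter"})
c = confidence(baskets, {"bread"}, {"butter"})
l = lift(baskets, {"bread"}, {"butter"})
print(s, c, l)
```

A lift above 1 for bread and butter suggests the two are bought together more often than chance would predict, which is the kind of pattern that can inform placement or bundling decisions.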

Coloring an Image using Crayola Colors (Python)

Creating an array of colors. Firstly, I picked the corresponding RGB values for a 120-crayon Crayola box and copied them into a list.
    colorsFile = open("colors.txt", "r")
    colors = []
    for line in colorsFile.readlines():
        colorset = line.strip().split(" ")
        rgbFormat = [int(x) for x in colorset[2].split(",")]
        colors.append(rgbFormat)
Secondly, I started by picking an image and resizing it to a smaller size. …

Machine Learning for Neonatal Intensive Care

The neonatal intensive care unit (NICU) is an environment in which life-changing decisions are made. Neonatologists use information from a variety of sources to build up a picture of a newborn’s condition to ensure they are receiving the right medical care. These highly trained specialists use their judgement in tandem with a constant stream of …

What Python package is best for getting data from Twitter? Comparing Tweepy and Twint.

What does that all mean? Practically, if the only thing you are looking to do is collect a large number of tweets, Twint is probably the better tool, whereas Tweepy is better suited for collecting a richer set of metadata and allows for flexibility, and potentially scalability, for those using the official API. That’s …

[Knowledge Distillation] FitNets : Hints For Thin Deep Nets

Note: there is also a YouTube video explaining this paper. The paper begins by making the case that wide and deep models often require a huge number of multiplications, which results in high memory and computing demands. Because of this, even if the network is a top-performing model …

6 Key Areas of Business Intelligence in the New Era

Back in 1958, Hans Peter Luhn, a researcher at IBM, introduced the concept of Business Intelligence (BI), using the definition from Webster’s Dictionary: to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal. Given its definition, Business Intelligence is indeed …

State-of-the-art python project setup

An opinionated setup guide for your next Python project. Python is one of the fastest-growing programming languages. Its tooling is evolving fast to catch up. I have been writing Python for over 10 years now, and sometimes it’s hard to keep up with all the new tooling out there. Recently, I had an opportunity …

Introducing Modeltime: Tidy Time Series Forecasting using Tidymodels

[This article was first published on business-science.io, and kindly contributed to R-bloggers.] I’m beyond excited to introduce modeltime, a new time series forecasting package designed …

The Three Stages of a Data Scientist

In this article, we’ll explore the differences between data scientists as Decision Support, Advisor and Integrated Partner. One of the best and worst parts of being a data scientist is the ambiguity that the role can often entail. Since data science is a relatively new function, the mandate and objectives aren’t always clear. This often …

Jean Alfonso-Decena: Leading Innovation in Conversational AI and Disrupting the Philippine FinTech…

WOMEN IN TECHNOLOGY SERIES. An Interview with the Head of Operations & Partnerships at Starbutter AI. Over the past few years, tech companies and researchers all over the world have been competing to advance the frontiers of artificial intelligence. With the broadening and fast-paced developments in the space of technology, it is clear that utilizing …

Modules and Packages in Python: Fundamentals for Data Scientists

Understand the basics with a concrete example! As your Python code grows in size, it will most probably become disorganised over time. Keeping all your code in the same file as it grows makes it difficult to maintain. At this point, Python modules and packages help you to organize and …

Build The World’s Simplest ETL (Extract, Transform, Load) Pipeline in Ruby With Kiba

We’re going to bundle this up in a tiny Ruby project.
Create our directory
    $ mkdir kiba-etl && cd kiba-etl/
Add the source CSV
Create a CSV file with touch phone.csv and paste in the following.
    id,number
    1,123.456.7891
    2,222
    3,303-030-3030
    4,444-444-4444
    5,900-000-00001
    6,#1000000000
    7,#9898989898
    8,800-000-00000
    9,999.999.9999
    10,1.1.1.1.1.1.1.1.1.1
    11,(112)233-4455
    12,(121)212-0000
In a real situation, you might use a service like Twilio to detect if they’re real phone numbers. …

Downloading Stock Data and Representing it Visually

Using YFinance and Plotly libraries for Stock Data Analysis. In this article, I will explain how you can use YFinance, a Python library that aims to solve the problem of downloading stock data by offering a reliable, threaded, and Pythonic way to download historical market data from Yahoo! …

Why doesn’t Python support i++ increment syntax

On an ordinary day, this line caught my attention when someone asked why ++ is not an operator in Python: “If you want to know the original reason, you’ll have to either wade through old Python mailing lists or ask somebody who was there (e.g. Guido)” (from Stack Overflow). And this enforces …

Introduction to Factors in R

Factors play a crucial role in data analysis. Learn how to create, subset, and compare them. A factor refers to a statistical data type used to store categorical variables. Categorical variables belong to a limited number of categories. Continuous variables, on the other hand, can correspond to an infinite number of values. It is important …

Cocalc vs. Colab — Which Is Better for a Hands-On Workshop?

I recently had the opportunity to deliver a hands-on workshop on training a Keras deep learning model. This workshop was a follow-on for a session I had done for a local meetup that reviewed the content in my upcoming book for Manning Publications, Deep Learning with Structured Data. After the introductory session …