6 Data Science Certificates To Level Up Your Career

Because of the appeal of data science and the promise of high incomes, more and more people decide to join the field every day. Some come from a technical background, while others join out of curiosity; regardless of the reason you decide to join the field, your No. 1 goal …

Scikeras Tutorial: A MIMO Wrapper for CapsNet Hyperparameter Tuning with Keras

Building on our discussion so far, the wrapper would need to override both BaseWrappers.feature_encoder() and BaseWrappers.target_encoder(). Depending on the type of transformation required, we could either write our own custom transformer or use one of the many transformers already offered in sklearn.preprocessing. For this tutorial, we will demonstrate both …

2020 NFL Postseason Predictions from Machine Learning Model — Conference

Why does V 5.0 favor the Bills? Last week I described how the top contributors to the random forest algorithm were mostly power rankings [eatdrinkandsleepfootball], but also efficiency metrics [numberFire] and Elo scores [FiveThirtyEight]. For the Bills and the Chiefs, the power rankings and Elo scores are quite similar, but the efficiency metrics differ a bit, …

Detection of DeepFakes and other facial image manipulations via AMTENnet

Accuracy and supremacy of AMTENnet

AMTENnet was tested against the following state-of-the-art baseline models: In addition, other state-of-the-art modules were used to replace the AMTEN module, thus generating several hybrid models. These modules were SRM filter kernels by [Zhou et al., 2018], Constrained-Conv by [Bayar and Stamm, 2018] and hand-crafted feature extractor by [Mo et …

Introducing OddFrames.jl: Data In One Dimension

With my approach to creating a great package to handle one-dimensional data, I wanted something that mixed a lot of concepts, but also really showed the power of the Dictionary datatype. There are many different advantages to using a basic datatype like a dictionary. One of the great things is that dispatch for the type …

How Precision and Recall Affect the Anti-COVID Measures

Now that we understand the confusion matrix, we can look at how to evaluate the results of the testing. For the sake of illustration, I’ll use the following scenario as an example. Instinctively, people think of accuracy as the number of right results divided by the total number of results. So we have an accuracy of (90+1)/(90+1+1+8)=91%. Wow, that is impressive! …
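The accuracy arithmetic above can be checked in a few lines of Python. This is only a sketch: the excerpt does not show which count belongs to which cell of the confusion matrix, so the assignment used here (90 true negatives, 1 true positive, 1 false positive, 8 false negatives) is an assumption made for illustration. Only the 91% accuracy figure comes from the text.

```python
# Counts taken from the excerpt's accuracy formula: (90 + 1) correct out of 100.
# ASSUMPTION: the cell assignment below is hypothetical. The excerpt only tells
# us that 90 and 1 are correct results and that 1 and 8 are incorrect ones.
true_negatives = 90
true_positives = 1
false_positives = 1
false_negatives = 8

total = true_negatives + true_positives + false_positives + false_negatives

# Accuracy: share of all results that are correct.
accuracy = (true_positives + true_negatives) / total
# Precision: share of positive predictions that are truly positive.
precision = true_positives / (true_positives + false_positives)
# Recall: share of truly positive cases that were detected.
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy:  {accuracy:.2%}")
print(f"precision: {precision:.2%}")
print(f"recall:    {recall:.2%}")
```

Under this assumed assignment, precision is 1/2 and recall is 1/9 even though accuracy is 91%, which is the kind of gap that makes accuracy alone a poor guide on imbalanced data.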

Machine Learning Model as a Serverless App using Google App Engine

Create a folder for the project and download the code files for this article from the repository here. Then navigate to this directory using the terminal (cd <path_to_dir>) and make sure that the virtual environment is active (conda activate <env_name>). Obviously, you can do the same …

Detecting Malaria with Deep Learning for Beginners

A beginner’s guide for Image Classification and Convolutional Neural Network (CNN)

In this project, we will go through a dataset provided by the US National Institutes of Health containing 27,558 different cell images from 150 patients infected with Plasmodium falciparum, a parasite that causes malaria, and mixed with …

A Zero-Maths Introduction to Bayesian Statistics

Decoding the crusades of the statistics world: Bayesian vs Frequentism

This one doesn’t need much introduction. Thousands of articles and papers have been written, and a few wars have been fought, on Bayesian vs Frequentism. In my experience, most folks start with the usual linear regression and work their way up to build more complex models …

These are the top 10 skills you need to master in 2021

A new report by Skillsoft reveals the most useful skills to have in 2021

Whether you are an employee, a job seeker, or a manager, you have many opportunities to refresh your skills or add a new skill to your list at the beginning of the year. …

Transformation of a simple movie dataset into a functional Recommender System

The recommender system presented in this article was realized in 4 major steps:
- Step 1: Calculation of the weighted average score of each movie in order to propose to the end user a catalog of the 100 most popular movies in cinema
- Step 2: Setting up the recommendation of 5 “popular” movies using a machine …

3 Tips to Succeed as a Data Scientist

Analytical Workflows

In this section, I want to describe some general principles for analytical work and how we can design workflows that are reproducible and more efficient. The tools and workflow I use will differ slightly depending on the task, but I will walk through an example of the type of problems I have worked …

Prediction 2021: The Year AI Became Normal

A clear pattern of growth has already emerged in AI: in 2018–19, the phase of experimentation became mature; in 2020, adoption began in earnest, and suddenly COVID-19 gave business leaders an opportunity and impetus to push automation and AI. In 2021, the fallout from a second wave of COVID-19 in the UK …

A Complete Project on Image Classification with Logistic Regression From Scratch in Python

Detailed layout of a logistic regression algorithm with a project

Logistic regression is very popular in machine learning and statistics. It works very well on both binary and multiclass classification. I have written tutorials on both binary and multiclass classification with logistic regression before. This article focuses on image classification with logistic regression. …

Python Beginner Breakthroughs (Functions)

The heart and soul of Python coding… Learning to make clean, simple, and easy-to-read functions within Python is priceless.

I think that one could argue that the function programming construct is probably one of the most important concepts in coding. The concept of a function is an …

How to Create a Beautify Combo Chart in Python Plotly

Nobody would deny that the line-and-bar combo chart is one of the most widely used combo charts. Excel has a built-in combo chart feature. It is also one of the most popular charts for analyzing financial data. In this tutorial, we are going to build a customized combo plot using plotly. …

4 Machine Learning Concepts I Wish I Knew When I Built My First Model

Feature importance refers to a set of techniques for assigning scores to input variables based on how good they are at predicting the target variable. The higher the score, the more important the feature is in the model. For example, if I wanted to predict the price of a car using …

Advanced Options with Hyperopt for Tuning Hyperparameters in Neural Networks

If you’re anything like me, you spent the first several months looking at applications of machine learning and wondering how to get better performance out of the model. I would spend hours, if not days, making minor tweaks to the model, hoping for better performance. Surely, I thought, there …

Financial Data from Yahoo Finance with Python

Retrieving company financials from Yahoo Finance

In this post, we are going to learn about a very easy-to-use Python package for retrieving financial data from Yahoo Finance. We will cover the main functionalities of the yfinance library. This will let us retrieve both company financial information (e.g. financial ratios), as well as …

Would Jack Realistically Have Died aboard the Titanic?

How machine learning answers the question: a walkthrough of Logistic Regression and Naive Bayes.

The year was 1912, and the mighty Titanic set sail on her maiden voyage. Jack, a “20 year old” “third class” “male” passenger, won a hand of poker and his ticket to the land of the free. In the …

A quick reflection on some ethical implications of creative AI

AI is increasingly being applied to more creative areas, raising concerns about the protection of intellectual property.

Disclaimer: I am not a lawyer, so this article should not be used as legal advice; take it as the personal opinion of an experienced observer of emerging technologies and …

Develop a Language Translator System in Python

Wondering how a language detection and translation system works? Use open-source Python libraries to develop one in a few lines of code.

Text Language Identification refers to the process of predicting the language of a given text, whereas Text Translation refers to the process of translating a given text …

Exploratory Factor Analysis vs Principal Components: from concept to application

How to reduce parameters with Exploratory Factor Analysis

In data science, we often want to measure variables such as socioeconomic status (SES). Some variables have a lot of parameters (or items); for example, SES can be measured based on income, education, etc. Then, to proceed with the analysis, it …

Tableau’s relationships are pretty cool

Unlike joining tables into flat files, relationships preserve the native granularity of data. Joins are performed only as needed. Last summer, Tableau introduced a new way of combining data. It is called relationships. The old way of combining data using joins is still available, and I imagine that many of us might stick with the …

Deep learning with containers. Part 2

For siamese training with triplet loss, every training instance should be a group of three samples (anchor, negative, and positive) called a triplet. We first pass these samples through the neural network to generate three embedding vectors. After that, we calculate the Euclidean distance between anchor and negative vectors (D_neg) and anchor …
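The distance computation described above can be sketched in plain Python. This is a minimal illustration rather than the article's implementation: the excerpt is cut off before the loss is defined, so the name D_pos, the margin value of 0.2, and the commonly used form max(D_pos - D_neg + margin, 0) are assumptions, and the toy embeddings are made up.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances.

    ASSUMPTION: the margin value and this exact form of the loss are not
    given in the excerpt; they follow the commonly used formulation.
    """
    d_pos = euclidean_distance(anchor, positive)  # anchor-positive distance
    d_neg = euclidean_distance(anchor, negative)  # anchor-negative distance (D_neg)
    return max(d_pos - d_neg + margin, 0.0)


# Hypothetical 2-D embeddings, chosen so the distances are easy to verify.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # distance 1 from the anchor
negative = [3.0, 4.0]  # distance 5 from the anchor

# Positive is already much closer than negative, so the loss is zero.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapping the roles makes the "positive" far away, so the loss is large.
loss_hard = triplet_loss(anchor, negative, positive)

print(loss_easy, loss_hard)
```

The loss only becomes zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes the network to separate the two.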

The Machine Learning Lifecycle in 2021

In reality, machine learning projects are not straightforward; they are a never-finished cycle that iterates between improving the data, the model, and the evaluation. This cycle is crucial in developing an ML model because it focuses on using model results and evaluation to refine your dataset. A high-quality dataset is the most surefire way …

Bayesian optimization or how I carved boats from wood. Examples and code.

As a kid, I spent my summers with my grandmother in a small village with no peers around and very little entertainment. While reading took up most of my time, I also used to carve small boats from wooden logs. My boats had three features to which I paid particular attention: …

Forecasting new COVID19 cases in Portugal using Gaussian Processes

Using Python and Bayesian Statistics to forecast 30 days of new cases

I would have preferred to do this analysis on a different subject and in a different context. Today, Portugal is the country with the biggest absolute number of new cases of COVID-19 per one million people [1]. I don’t want to make this a …

6 NLP Techniques Every Data Scientist Should Know

Towards more efficient natural language processing

Natural language processing is perhaps the most talked-about subfield of data science. It’s interesting, it’s promising, and it can transform the way we see technology today. It can transform not just technology but also the way we perceive human languages. Natural language …

Crack Data Science Interviews: Essential Statistics Concepts

Interview questions about missing data look deceptively easy but are challenging. You have to tailor your answers according to the data type and the context. A lot of us, me included, fail to recognize the nature of missing data and tailor our responses accordingly. I’ve done deep research on this topic and come up with the …

Texts, Fonts, and Annotations with Python’s Matplotlib

Fonts

It might look unimportant, and the default font of Matplotlib is not wrong by any means, so why would you need to change it? Well, one reason might be to conform with some other text of your report. It’s definitely not unusual to have a font family, size, and color pre-defined for a publication. …

Using Linear Programming to schedule Drivers.

To solve this problem, we need to create a decision variable for every single combination of route and driver. This sets up the decision variables to be binary, taking only a value of …

Audio Classification with PyTorch’s Ecosystem Tools

Introduction to torchaudio and Allegro Trains

Audio signals are all around us. As such, there is an increasing interest in audio classification for various scenarios, from fire alarm detection for hearing-impaired people, through engine sound analysis for maintenance purposes, to baby monitoring. Though audio signals …

How to Upload Ads Data to Google BigQuery

Learn how to compare and analyze advertising performance across all your marketing channels in a single, centralized data warehouse, and how to easily upload your ads data to Google BigQuery (GBQ).

If you’re like many marketers, you’re probably running campaigns using a variety of digital marketing channels, including not only Google Ads but also Facebook, …

The validity of psychological and educational tests

Evidence based on content

Collect data on how representative the items of a test are, investigating whether they sample the domain that they are intended to measure. The set of items is judged on its scope with respect to the proposed construct. In general, this is done based on the evaluation …

Introduction to DeepMind’s Graph-Nets

A short overview of the core components of Graph-Nets

In October 2018, a team of 27 researchers from DeepMind/Google, MIT, and the University of Edinburgh published a paper entitled “Relational inductive biases, deep learning, and graph networks”. The crux of the paper is that it’s partly a ‘position paper’ …

How to Connect Azure Data Factory to an Azure SQL Database Using a Private Endpoint

While still in Azure Data Factory, click to create a new linked service. When you select an integration runtime with virtual network configuration enabled, a managed private endpoint setting will appear in the account selection method section. The setting is read-only and will populate as you enter subscription and server details. If a …

“Exactly-once” semantics across multiple Kafka instances is possible

Tackle cross-cluster transaction problems in Kafka with code

“Exactly-once” semantics is a challenging problem in a distributed system. Notable protocols and algorithms for solving it include two-phase commit, Paxos, and Raft. The problem becomes even harder when it spans two instances of a distributed system. Apache Kafka has supported …

Customize your Jupyter Notebooks

Setting up the graphing style

Once you are using the themes, you will notice that graphs created with the Matplotlib library do not look their best. For example, this is simple code to create a line chart:

    import matplotlib.pyplot as plt
    import numpy as np  # needed for np.linspace and np.sin below
    %matplotlib inline
    bp_x = np.linspace(0, 2*np.pi, num=40, endpoint=True)
    bp_y = np.sin(bp_x)
    # Make the plot
    plt.plot(bp_x, bp_y, linewidth=3, …

How To Improve Programming Skills, For Data Scientists And Machine Learning Practitioners

Algorithms tend to scare a lot of ML practitioners away, including me. The field of machine learning arose as a method to eliminate the need to implement heuristic algorithms to detect patterns; we left feature detection to neural networks. Still, algorithms have their place in the software and computing domain, and certainly within the machine …

Why Does Reliability and Stability Matter in Data Science?

Faced with our tool’s rapid decline in health and usefulness, it became clear that we needed to define a path forward. This path would improve the code’s reliability, and thus its stability, to ensure that problems like this did not arise again and render the application useless. The main area we looked to improve …

Responsible AI at Facebook

Please find below the transcript: Jeremie (00:00): Hey, everyone. Jeremie here. Welcome back to the podcast. I’m really excited about today’s episode because we’ll be talking about how responsible AI is done at Facebook. Now, Facebook routinely deploys recommender systems and predictive models that affect the lives of literally billions of people every day. And …

The past, present and future of deep learning

Hebbian Learning Rule

In 1949, six years after the invention of MCP models, Donald O. Hebb strengthened the concept of neurons in his work titled “The Organization of Behavior”. For his substantial contributions towards deep learning, he’s also called the father of neural networks. Now let’s see what Hebbian …

How to learn the basics of data science and machine learning for free with Kaggle

A beginner’s guide to learning data science and machine learning with Kaggle.com

What is Kaggle

Kaggle is an online community of data scientists and machine learning users. The website focuses on sharing and analysing data sets, and offers data analysis and machine learning competitions with cash prizes. …

Implementing Gradient Descent — Example

A fundamental example of the implementation of gradient descent

We have derived an algorithm for implementing gradient descent here, but there are a lot of nuances when implementing this algorithm in code. If you try to tackle this algorithm with a huge project right out of the gate, you will find that it is …
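To make the idea concrete, here is a minimal sketch of gradient descent on a one-dimensional problem. It is not the article's derivation, which the excerpt does not include. The objective f(x) = (x - 3)^2, the starting point, the learning rate, and the step count are all assumptions chosen for illustration.

```python
def f(x):
    """Hypothetical objective: a parabola with its minimum at x = 3."""
    return (x - 3.0) ** 2


def grad_f(x):
    """Analytic derivative of f: d/dx (x - 3)^2 = 2 (x - 3)."""
    return 2.0 * (x - 3.0)


def gradient_descent(grad, x0, learning_rate=0.1, num_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(num_steps):
        x = x - learning_rate * grad(x)
    return x


# Assumed starting point and hyperparameters, picked so the run converges.
x_min = gradient_descent(grad_f, x0=0.0)
print(f"x_min = {x_min:.6f}, f(x_min) = {f(x_min):.3e}")
```

With these settings each step shrinks the distance to the minimum by a factor of 0.8, so the iterate settles very close to x = 3. A learning rate above 1.0 would make this particular run diverge, which is one of the nuances that only shows up in code.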

What makes for a good movie?

Considering the initial analysis above, let’s try and take some steps to determine the drivers of the differing behaviour of the audience and critics by exploring other aspects of the data set. When thinking about what might influence a viewer’s enjoyment of a movie, an attribute that springs to mind might be its genre. Let’s …

Collaborative Filtering Simplified: The Basic Science Behind Recommendation Systems

When making consumer decisions, it often seems that we are making conscious choices about the services we use and our preferred products. However, the companies competing for our business are constantly influencing our decisions in subtle ways. Companies often recommend specific products to increase the likelihood that we choose …