How to Label Audio for Deep Learning in 4 Simple Steps

To create the labels, first go to Edit > Labels > Add Label at Selection and check the keyboard shortcut (Ctrl+B in Windows) so that you don’t need to go back to the menu every time. Finding the option to add a label. Image by the author. Now that you know the keyboard shortcut, select a … Read more How to Label Audio for Deep Learning in 4 Simple Steps

Amazon Personalize enhances Recommendation Filters with filtering on item metadata

Today, we are pleased to announce enhancements to Recommendation Filters in Amazon Personalize, which give you greater control over the recommendations your users receive by allowing you to exclude or include items to recommend based on criteria that you define. For example, when recommending products for your e-retail store you can exclude unavailable items from recommendations; … Read more Amazon Personalize enhances Recommendation Filters with filtering on item metadata

PANDAS: Put Away Novice Data Analyst Status

How Pandas can make you a better data analyst. Learn about one-liners for different steps in the Data Analysis process. Photo by cheese yang on Unsplash. Pandas, or as I call it, Put Away Novice Data Analyst Status, is a powerful open-source data analysis and manipulation library. It can help you perform various operations on … Read more PANDAS: Put Away Novice Data Analyst Status

Pandas Time/Date Series Functionality

Extensive capabilities and features for time series analysis. Photo by Markus Winkler on Unsplash. Time series and date functionality play a major part in financial data analysis. pandas contains extensive capabilities and features for working with time series data for all domains. Using the NumPy datetime64 and timedelta64 dtypes, pandas has consolidated a large … Read more Pandas Time/Date Series Functionality

Manage access to AWS centrally for OneLogin users with AWS Single Sign-On

The interoperability of AWS SSO and OneLogin enables administrators to assign users and groups access centrally to their AWS Organizations accounts and AWS SSO integrated applications. This makes it easier for an AWS administrator to manage access to AWS and ensure OneLogin users have the right access to the right AWS accounts. Ongoing management is … Read more Manage access to AWS centrally for OneLogin users with AWS Single Sign-On

Custom PySpark Accumulators

Photo by Joshua Sortino on Unsplash. dict, list and set types of PySpark accumulators. By default, Spark provides int/float accumulators that support commutative and associative operations. Spark also provides a class, AccumulatorParam, to inherit from in order to support other accumulator types. One just needs to implement two methods, zero and addInPlace. … Read more Custom PySpark Accumulators
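The two-method contract mentioned in the excerpt can be sketched without a Spark installation. The class below is a stand-in, not the article's code: it does not import pyspark, the name DictAccumulatorParam is hypothetical, and the "partitions" are just a list of dicts merged on one machine. It is only meant to show what zero and addInPlace are expected to do for a dict-valued accumulator.

```python
from functools import reduce


class DictAccumulatorParam:
    """Stand-in mirroring the zero/addInPlace contract named in the excerpt.

    Assumption: in real PySpark this class would subclass
    pyspark.accumulators.AccumulatorParam; it is kept dependency-free here.
    """

    def zero(self, value):
        # Identity element for merging: an empty dict.
        return {}

    def addInPlace(self, value1, value2):
        # Merge value2 into value1, summing the counts of shared keys.
        # Summation is commutative and associative, as accumulators require.
        for key, count in value2.items():
            value1[key] = value1.get(key, 0) + count
        return value1


# Simulate partial results from three partitions being merged on the driver.
param = DictAccumulatorParam()
partials = [{"a": 1, "b": 2}, {"b": 3}, {"c": 4, "a": 1}]
merged = reduce(param.addInPlace, partials, param.zero({}))
print(merged)
```

With Spark available, the same two methods on a subclass of AccumulatorParam would typically be passed as the second argument to SparkContext.accumulator, along with an initial value.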

Amazon GuardDuty now available in AWS Africa (Cape Town) and Europe (Milan) Regions

Available globally, Amazon GuardDuty continuously monitors for malicious or unauthorized behavior to help protect your AWS resources, including your AWS accounts, access keys, and data stored in Amazon S3. GuardDuty identifies unusual or unauthorized activity, like crypto-currency mining, access to data stores in S3 from unusual locations, or infrastructure deployments in a region that has … Read more Amazon GuardDuty now available in AWS Africa (Cape Town) and Europe (Milan) Regions

Reinforcement Learning — Part 2

Markov Decision Processes Deep Learning at FAU. Image under CC BY 4.0 from the Deep Learning Lecture. These are the lecture notes for FAU’s YouTube Lecture “Deep Learning”. This is a full transcript of the lecture video & matching slides. We hope you enjoy this as much as the videos. Of course, this transcript was … Read more Reinforcement Learning — Part 2

NLP: Classification & Recommendation Project

There are various algorithms that can be used for text classification. I started by exploring these models: Logistic Regression, Naive Bayes, Linear SVC, and Random Forest. My approach was to run all the models in this section first and then choose the best one to optimize. Hence, I ran all the models with their default parameters to … Read more NLP: Classification & Recommendation Project

Word Embedding in NLP: One-Hot Encoding and Skip-Gram Neural Network

I’m a poet-turned-programmer who has just begun learning about the wonderful world of natural language processing. In this post, I’ll be sharing what I’ve come to understand about word embedding, with a focus on two embedding methods: one-hot encoding and the skip-gram neural network model. Last year, OpenAI released a (restricted) version of GPT-2, an AI … Read more Word Embedding in NLP: One-Hot Encoding and Skip-Gram Neural Network
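As a rough illustration of the two ideas named in this excerpt, the sketch below one-hot encodes a toy sentence and lists the (center, context) pairs a skip-gram model would be trained on. The function names, the example sentence, and the window size of 1 are assumptions made for illustration; the post's own code is not shown in the excerpt, and no neural network is trained here.

```python
def one_hot_encode(tokens):
    """Map each token to a vector with a single 1 at its vocabulary index."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    vectors = []
    for word in tokens:
        vec = [0] * len(vocab)
        vec[index[word]] = 1
        vectors.append(vec)
    return vocab, vectors


def skipgram_pairs(tokens, window=1):
    """List (center, context) pairs within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


tokens = "the cat sat on the mat".split()
vocab, vectors = one_hot_encode(tokens)
pairs = skipgram_pairs(["the", "cat", "sat"], window=1)
print(vocab)
print(vectors[0])
print(pairs)
```

Note that one-hot vectors carry no notion of similarity (every pair of distinct words is equally far apart), which is the gap that learned embeddings such as skip-gram are meant to close.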

The math behind Machine Learning Algorithms

How do different machine learning algorithms learn from the data and predict on the unseen data? Photo by Roman Mager on Unsplash The machine learning algorithms are designed in such a manner that they learn from experience and their performance improves as they feed on more and more data. Every algorithm has its own way … Read more The math behind Machine Learning Algorithms

fairmodels: let’s fight with biased Machine Learning models (part 1 — detection)

Author: Jakub Wiśniewski TL;DR The fairmodels R Package facilitates bias detection through model visualizations. It implements a few mitigation strategies that could reduce the bias. It enables easy-to-use checks for fairness metrics and comparison between different Machine Learning (ML) models. Longer version Fairness in ML … Read more fairmodels: let’s fight with biased Machine Learning models (part 1 — detection)

How to Calculate and Analyze Relative Strength Index (RSI) Using Python

The relative strength index is a momentum oscillator commonly used to predict when a stock is oversold or overbought. The calculation process is straightforward: Observe the last 14 closing prices of a stock. Determine whether the current day’s closing price is higher or lower than the previous day’s. Calculate the average gain and loss over … Read more How to Calculate and Analyze Relative Strength Index (RSI) Using Python
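The steps listed above can be sketched in plain Python. This is a minimal version under stated assumptions, not the article's implementation: it uses simple (unsmoothed) averages over the last 14 price changes and the commonly used formula RSI = 100 - 100 / (1 + RS), where RS is average gain divided by average loss. The function name rsi and the sample prices are made up for illustration.

```python
def rsi(closes, period=14):
    """Relative strength index from the most recent `period` price changes.

    Assumption: simple averages of gains and losses; many charting tools
    apply Wilder's smoothing instead, which gives different values.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    recent = closes[-(period + 1):]
    # Day-over-day changes: positive means the close was higher than the prior day.
    changes = [today - prior for prior, today in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losing days in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Hypothetical prices: alternating up and down days of equal size.
alternating = [10 + (i % 2) for i in range(15)]
rising = list(range(1, 16))
print(rsi(alternating), rsi(rising))
```

Readings above 70 are conventionally treated as overbought and below 30 as oversold, though those thresholds are a rule of thumb rather than part of the calculation.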

Bayesnote v0.0.1 release note

Bayesnote is a frictionless integrated notebook environment for data scientists and data engineers. It provides a user interface to build dashboards and deploy machine learning models right from a notebook. It also supports the operation of notebooks by a workflow system, Noteflow. … Read more Bayesnote v0.0.1 release note

The Three Questions about AI that Startups Need to Ask

Billion-dollar investments in AI are booming. What does this mean for startups looking to AI for their innovative and competitive edge? The strategy seems simple: take one of humanity’s perennial problems and fix it with machine learning. Google, Facebook, Netflix, and Uber did it. It can often seem like the obvious question is why not … Read more The Three Questions about AI that Startups Need to Ask

Predicting Sentiment of Employee Reviews

In my previous articles, we learned how to scrape, process, and analyze employee reviews from Indeed.com. Feel free to take a look and offer feedback. I would love to hear how you would improve the code. In particular, how to dynamically overcome changes to the website’s HTML. In this article, I would like to take … Read more Predicting Sentiment of Employee Reviews

Impressive Medium Articles on AI/ML This Month

A one-stop shop for information on the history, development, and potential of GPT-3. Julien Lauret’s article is a comprehensive summary of the journey taken so far to create GPT-3. Julien manages to condense years of development, along with the methods and techniques introduced to model language and solve natural language processing tasks, into several small, concise paragraphs. … Read more Impressive Medium Articles on AI/ML This Month

How to Create a GraphQL API using AWS AppSync

Nowadays, whenever we talk or think about creating or designing an API, what first comes to mind is REST. REST (REpresentational State Transfer) was the go-to standard for developing an API platform until recently. Even though REST became the standard, it has its own disadvantages. One of the main disadvantages is the inflexibility for … Read more How to Create a GraphQL API using AWS AppSync

What is Data Science?

Exploring the history of data science and understanding what it is now Image by Trist’n Joseph Data has become the driving force behind the world’s industries. Now, more than ever, businesses need individuals who can help them optimize their operations. Because of this, Data Science jobs have been ranked Glassdoor’s number one best job consecutively … Read more What is Data Science?

Classification Model from Scratch

Beginner’s guide to building a Naive Bayes classifier model (simple classification model) from scratch using Python. CAMERON FOXLY “BASIC programming into an old computer”. In machine learning, we can use probability to make predictions. Perhaps the most widely used example is called the Naive Bayes algorithm. Not only is it straightforward to understand, but it … Read more Classification Model from Scratch
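To make "using probability to make predictions" concrete, here is a small from-scratch sketch. It assumes categorical features and add-one (Laplace) smoothing; the article's own classifier may differ (for example, a Gaussian variant for numeric features). The toy weather rows, labels, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class, per-feature value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the label with the highest log posterior for `row`."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # Log prior plus the sum of per-feature log likelihoods; adding them
        # is the "naive" assumption that features are independent given the class.
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = value_counts[label][i][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes", "no", "yes"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("sunny", "hot")), predict(model, ("rainy", "cool")))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.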

Best Free Resources to Learn Programming, Software Engineering, Machine Learning, And More

All you need to learn… Source: Unsplash by 🇸🇮 Janko Ferlič. Did you know that you can take courses from MIT, Stanford, and Harvard for free? Many of their undergraduate and graduate-level course materials are free for students around the globe to use. I am going to talk about some of the … Read more Best Free Resources to Learn Programming, Software Engineering, Machine Learning, And More

The Sardinas-Patterson Algorithm in Simple Python

Checking for Unique Decodability in Variable-Length Codes. Image by S. Hermann & F. Richter from Pixabay. Two fields that often get left on the sidelines in conversations about data science are Information Theory, which studies the quantification, storage, and communication of information, and Coding Theory, which studies the properties of codes and their respective fitness for … Read more The Sardinas-Patterson Algorithm in Simple Python
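For readers who want to see the unique-decodability test named in the title, a compact sketch follows. It is written from the textbook description of the Sardinas-Patterson procedure (build sets of "dangling suffixes" and stop if one of them is a codeword), not from the article's code, and the helper names are assumptions. It also assumes the code contains no empty codeword.

```python
def left_quotient(prefixes, words):
    """Suffixes w such that p + w == y for some p in prefixes and y in words."""
    return {y[len(p):] for p in prefixes for y in words if y.startswith(p)}


def is_uniquely_decodable(code):
    """Sardinas-Patterson test for a finite set of non-empty codewords."""
    code = set(code)
    # First set: what is left over when one codeword is a proper prefix of another.
    current = left_quotient(code, code) - {""}
    seen = set()
    while current:
        if current & code:
            # A dangling suffix is itself a codeword: two parses exist.
            return False
        frozen = frozenset(current)
        if frozen in seen:
            # The sets have started to repeat without hitting a codeword.
            return True
        seen.add(frozen)
        current = left_quotient(code, current) | left_quotient(current, code)
    return True


print(is_uniquely_decodable({"0", "10", "110"}))  # prefix code
print(is_uniquely_decodable({"0", "01", "10"}))   # "010" parses two ways
```

The loop always terminates because every dangling suffix is a suffix of some codeword, so only finitely many distinct sets can occur.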

How to Draw Venn Diagrams on Jupyter

➡️ Introduction ➡️ A, B ➡️ TRUE, FALSE ➡️ A AND B, A NAND B ➡️ A OR B, A NOR B ➡️ A XOR B, A XNOR B ➡️ NOT A, NOT B ➡️ A NOT B, B NOT A ➡️ Implication, A → B, B → A ➡️ Mutually exclusive ➡️ Complement ➡️ Subset ➡️ Conclusion. In this article, you will find out how to draw … Read more How to Draw Venn Diagrams on Jupyter

Demand Forecasting using FB-Prophet

A seasonal decomposition of the time series is performed using the statsmodels.tsa.seasonal.seasonal_decompose function. The charts above show a linear growth in sales over time (across categories and states) along with seasonal effects. Linearity is particularly evident in the latter half of the time series, starting from the year 2014. A yearly seasonality is seen in all states … Read more Demand Forecasting using FB-Prophet

Spatial GLMM(s) using the INLA Approximation

Model setup We have a count outcome (deaths and births), in counties over time, and a set of time-constant covariates. We have several options in the GLM framework with which to model these data, for example: Binomial – \[y_{ij} \sim Bin(\pi_{ij}) \text{: } logit(\pi_{ij} ) = \beta_{0}+ x'\beta_k \] Poisson – \[y_{ij} \sim Pois(\lambda_{ij} E_{ij}) … Read more Spatial GLMM(s) using the INLA Approximation

I like to MVO it!

[This article was first published on R on OSM, and kindly contributed to R-bloggers]. In our last post, we ran through a bunch of weighting … Read more I like to MVO it!

Explainable ‘AI’ using Gradient Boosted randomized networks Pt2 (the Lasso)

This post is about LSBoost, an Explainable ‘AI’ algorithm which uses Gradient Boosted randomized networks for pattern recognition. As we discussed last week, LSBoost is a cousin of GFAGBM’s LS_Boost. More specifically, in LSBoost the so-called weak learners from LS_Boost are based on randomized neural networks’ components and variants of Least Squares regression … Read more Explainable ‘AI’ using Gradient Boosted randomized networks Pt2 (the Lasso)

rfm 0.2.2

We’re excited to announce the release of rfm 0.2.2 on CRAN! rfm provides tools for customer segmentation using Recency Frequency Monetary value analysis. It includes a Shiny app for interactive segmentation. You can install rfm with: install.packages("rfm") In this blog post, we will summarize the changes implemented in the current (0.2.2) and previous release (0.2.1). … Read more rfm 0.2.2

R Package Integration with Modern Reusable C++ Code Using Rcpp – Part 3

[This article was first published on R Views, and kindly contributed to R-bloggers]. Daniel Hanson is a full-time lecturer in the Computational Finance & Risk … Read more R Package Integration with Modern Reusable C++ Code Using Rcpp – Part 3

Feature Leakage, and identifying it with Exploratory data analysis and Machine Learning

library(tidyverse) # Loading some data loan_data <- structure(list(finalClass = c("Reject/Cancel", "Success", "Reject/Cancel", "Success", "Success", "Reject/Cancel", "Reject/Cancel", "Success", "Reject/Cancel", "Success", "Reject/Cancel", "Success", "Reject/Cancel", "Success", "Success", "Reject/Cancel", "Success", "Reject/Cancel", "Reject/Cancel", "Success", "Reject/Cancel", "Success", "Reject/Cancel", "Success", "Success", "Reject/Cancel", "Reject/Cancel", "Success", "Success", "Reject/Cancel", "Success", "Reject/Cancel", "Success", "Reject/Cancel", "Reject/Cancel", "Success", "Reject/Cancel", "Reject/Cancel", "Success"), balance_new_bracket = c("01. <= 10k", "01. <= … Read more Feature Leakage, and identifying it with Exploratory data analysis and Machine Learning

Decentralized Reinforcement Learning

Detailed overview of a new paradigm in Reinforcement Learning. Many organizations in the world, such as biological ecosystems, governments, and corporations, are physically decentralized yet unified in their functionality. For instance, a financial institution operates with a global policy of maximizing its profits, hence appearing as a single entity; however, … Read more Decentralized Reinforcement Learning

Random Forest on GPUs: 2000x Faster than Apache Spark

Lightning-fast model training with RAPIDS Photo by bady abbas on Unsplash Disclaimer: I’m a Senior Data Scientist at Saturn Cloud — we make enterprise data science fast and easy with Python, Dask, and RAPIDS. Prefer to watch? Check out a video walkthrough here. Random forest is a machine learning algorithm trusted by many data scientists … Read more Random Forest on GPUs: 2000x Faster than Apache Spark

An In-Depth Crash Course on Random Variables

Every random variable has an associated probability distribution function. A probability distribution function essentially gives the probabilities associated with obtaining each possible value or an interval of values. There are three types of probability distribution functions: the probability mass function (pmf), the probability density function (pdf), and the cumulative distribution function (cdf). Probability Mass Function (pmf) … Read more An In-Depth Crash Course on Random Variables
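A small worked example may help separate the three functions named above. The choice of random variable (the sum of two fair dice) and the helper name cdf are assumptions made for illustration; the continuous case uses the standard normal from Python's statistics module.

```python
from fractions import Fraction
from itertools import product
from statistics import NormalDist

# pmf of a discrete random variable: the sum of two fair six-sided dice.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)


def cdf(x):
    """P(X <= x): accumulate the pmf over all values up to x."""
    return sum(p for value, p in pmf.items() if value <= x)


# Continuous case: the pdf gives a density, not a probability; the cdf
# gives P(Z <= z) for a standard normal variable Z.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
density_at_zero = standard_normal.pdf(0.0)
prob_below_zero = standard_normal.cdf(0.0)

print(pmf[7], cdf(4), density_at_zero, prob_below_zero)
```

The pmf values sum to 1, while a pdf only integrates to 1; the density itself can exceed 1 for narrow distributions, which is why it should not be read as a probability.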

How to Verify the Distribution of Data using Q-Q Plots?

Suppose we are given an unknown distribution and need to verify whether or not it is a normal (Gaussian) distribution. For clarity, we will call this unknown distribution X and the known normal distribution Y. Generate unknown distribution X: X = np.random.normal(loc=50, scale=25, size=1000) Here we generate a normal distribution of 1000 values with mean=50 and standard deviation=25. … Read more How to Verify the Distribution of Data using Q-Q Plots?
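The idea behind a Q-Q plot can be sketched without any plotting library: sort the sample, pair each value with the matching quantile of a standard normal, and check how close the pairs lie to a straight line. The sketch below swaps in the standard library's random.gauss for the np.random.normal call in the excerpt, and the helper names and the (i - 0.5) / n plotting positions are assumptions, so treat it as an illustration rather than the article's method.

```python
import random
from statistics import NormalDist, mean


def qq_points(sample):
    """Pair sorted sample values with standard normal quantiles."""
    ordered = sorted(sample)
    n = len(ordered)
    standard = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [standard.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered


def pearson(a, b):
    """Pearson correlation; near 1 means the Q-Q points are almost on a line."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)
x = [rng.gauss(50, 25) for _ in range(1000)]          # normal sample
skewed = [rng.expovariate(1.0) for _ in range(1000)]  # clearly non-normal

r_normal = pearson(*qq_points(x))
r_skewed = pearson(*qq_points(skewed))
print(round(r_normal, 4), round(r_skewed, 4))
```

Plotting theoretical against ordered values gives the usual Q-Q picture; the correlation is only a numeric summary of how straight that picture is.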

3D Object Detection Using Lidar Data for Self Driving Cars

In this blog, we present our research work on real-time 3D Object Detection using lidar data. A novel neural network architecture is used to simultaneously detect and regress bounding boxes for all the objects present in the image. We used 2D Bird’s Eye View in place of 3D voxel grid data because it … Read more 3D Object Detection Using Lidar Data for Self Driving Cars

Structure from Motion

Stereo vision, Triangulation, Feature Correspondence, Visual SLAM. Structure from Motion (SFM) determines the spatial and geometric relationships of a target through the movement of the camera, and is a common method of 3D reconstruction. It only needs an ordinary RGB camera, so the cost is lower, and the environment is less restricted, and … Read more Structure from Motion

EC2 Hibernation feature is now available in the Africa (Cape Town) and Europe (Milan) AWS Regions

Hibernation requires an EC2 instance to be an encrypted Amazon EBS-backed instance. This ensures protection of sensitive contents in memory (RAM) as they get copied to EBS upon hibernation. You can now enable Amazon EBS Encryption by Default, to ensure all new EBS volumes created in your account are encrypted. Hibernation is available for On-Demand … Read more EC2 Hibernation feature is now available in the Africa (Cape Town) and Europe (Milan) AWS Regions

Preventing lateral movement in Google Compute Engine

When you do have to directly expose a VM with an external IP address, ensure that your firewall rules restrict network access to only the ports and IP addresses that your application needs. Do: Assign private IP addresses to your VMs; don’t give them public IP addresses at all. Use IAP TCP forwarding to … Read more Preventing lateral movement in Google Compute Engine

Devising a Federation broker for Hadoop Access Control

Understanding the data authorization in a federation method for better security. Image by Author Are you a fan of the digital overwatch? Think of an artificial intelligence system that can process huge amounts of data and stop crime in the city? Yes, you have guessed it right, POI or Person of Interest is a strong … Read more Devising a Federation broker for Hadoop Access Control

Twitter analysis of the current political situation in Belarus

Then, using the spaCy library, I separated the most recent tweets (July 18–26, 2020) related to Lukashenko and his main opponents: Babariko, Tsepkalo, and Tikhanovskaya. To use sentiment analysis and entity extraction libraries, I had to translate the tweets into English, for which I used the Google Translation API. Here is a simple way to do it: … Read more Twitter analysis of the current political situation in Belarus

Will Deep Learning Hit the Wall?

Better algorithms or more computing power? If you are interested in deep learning, you may have already heard about a recent paper published by researchers from US, Korean, and Brazilian universities and labs. Neil C. Thompson, MIT Computer Science and A.I. Lab, Kristjan Greenewald, MIT Initiative on the Digital Economy, Keeheon Lee, Underwood International College, Yonsei … Read more Will Deep Learning Hit the Wall?

Implementing SGD From Scratch

Custom Implementation of Stochastic Gradient Descent without SKlearn. Before implementing Stochastic Gradient Descent, let’s talk about what Gradient Descent is. Gradient Descent is an iterative algorithm used to solve optimization problems. In almost every Machine Learning and Deep Learning model, Gradient Descent is actively used to improve the learning of our algorithm. … Read more Implementing SGD From Scratch
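As a rough sketch of what "from scratch" can look like, the snippet below fits a one-variable linear model with stochastic gradient descent using only the standard library. The model, the squared-error loss, the learning rate, and the noise-free toy data are all assumptions chosen for illustration; the article's own dataset and implementation are not shown in the excerpt.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b by updating on one shuffled sample at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
w, b = sgd_linear_regression(xs, ys)
print(round(w, 3), round(b, 3))
```

The difference from batch Gradient Descent is the inner loop: each update uses the gradient from a single sample rather than the average over the whole dataset, which makes steps cheaper but noisier.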

Beginner’s Guide to PyThaiNLP

Utility: Add the following import declaration at the top of your Python file. I will be using Jupyter Notebook for this tutorial. import pythainlp.util PyThaiNLP provides us with quite a lot of built-in functions. For example, you can use the following to determine if an input text is Thai: pythainlp.util.isthai("สวัสดี")  # True Furthermore, you can even get … Read more Beginner’s Guide to PyThaiNLP