An invitation to innovate and transform your business on Azure SQL

The past several months have seen their share of hardship, but despite the challenges, organizations continue to see digital transformation in the cloud as a path to growth. The promises of accelerated innovation, efficient operations, and lower costs, among others, have helped many businesses remain viable, if not thrive, and they’re now looking at how they can … Read more

Poisson Process and Poisson Distribution in real-life: modeling peak times at an ice cream shop | by Carolina Bento

Several phenomena in the real world can be represented as counts of things. For example, the number of flights departing from an airport, the number of customers lining up at the store register, or the number of earthquakes occurring in a year in a specific region. Counting events is a relatively simple task, but if you want to … Read more

ggplot2 Extension: corrmorrant for Flexible Correlation Plots in R

[This article was first published on business-science.io, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t. This article is part of R-Tips Weekly, a weekly video tutorial that shows … Read more

Categories R Tags ExcerptFavorite

Write and train your own custom machine learning models using PyCaret

A step-by-step, beginner-friendly tutorial on how to write and train custom machine learning models in PyCaret Photo by Rob Lambert on Unsplash PyCaret is an open-source, low-code machine learning library and end-to-end model management tool built in Python for automating machine learning workflows. It is incredibly popular for its ease of use, simplicity, and ability to … Read more

A small step to understand Generative Adversarial Networks

Overview of the method The objective of GANs is to randomly generate artificial contents similar to some data. Put another way, they are aimed at sampling according to an unknown distribution \(P^\star\), based solely on i.i.d. observations \(X_1, \dots, X_n\) drawn according to \(P^\star\). Obviously, a naive approach would be to: Estimate the distribution \(P^\star\) … Read more

The Good, the Bad and the Ugly: how to visualize Machine Learning data

Visualization also greatly simplifies communication of our model and results to decision-makers or the public. Exploratory Data Analysis Exploratory Data Analysis (EDA) is the backbone of data analysis, including analyses that result in a machine learning model. EDA helps us to understand the data we are working with and put it into context, so that … Read more

Python Hockey Analytics Tutorial

The first thing we’re going to do is determine which teams had the most NHL forwards in 2018–2019 who made the jump from the AHL the prior year. (Going forward, I will be referring to the skaters who made this jump, playing at least 20 AHL games in 2017–2018 and 20 NHL games in 2018–2019, … Read more

Amazon RDS Now Supports PostgreSQL 13 in AWS GovCloud (US) Regions

Amazon RDS for PostgreSQL now supports major version 13 in AWS GovCloud (US) Regions. PostgreSQL 13 includes improved functionality and performance from new features and enhancements such as de-duplication of B-tree index entries, improved performance for queries that use partitioned tables, incremental sorting to accelerate data sorts, parallel processing of indexes with the VACUUM command, … Read more

Categories AWS ExcerptFavorite

LSTM Neural Network: The Basic Concept

The Cell State Photo Credit: Aleia Knight The cell state is sort of like a conveyor belt that moves the data along through the cell. While it is not technically a gate, it is crucial for carrying data through each individual cell as well as to other cells. The data flowing through it is altered … Read more

The Intuitive Explanation of Logistic Regression

Strong Fundamental Understanding is Key to Proper Use Image by geralt from Pixabay Logistic regression can be pretty difficult to understand! As such I’ve put together a very intuitive explanation of the why, what, and how of logistic regression. We’ll start with some building blocks that should lend well to clearer understanding so hang in … Read more

Code performance in R: Which part of the code is slow?

[This article was first published on INWT-Blog-RBloggers, and kindly contributed to R-bloggers]. This is the first part of our series about code performance in R. … Read more

Upgrade your infrastructure with the latest Dv5/Ev5 Azure VMs in preview

On April 6, Microsoft participated in the launch of Intel’s 3rd Gen Intel Xeon Scalable processors, where Microsoft announced the upcoming preview of Azure Confidential Computing and general-purpose Microsoft Azure Virtual Machines. Today we are delivering on that promise and announcing the preview of Dv5-series and Ev5-series Azure Virtual Machines for general-purpose and memory-intensive workloads. You can request … Read more

Timeseries analysis in R

[This article was first published on Methods – finnstats, and kindly contributed to R-bloggers]. Timeseries analysis in R, in statistics time series, is one of … Read more

Teaming up with Coursera on new Azure Specializations and Scholarships

Over the last year, there has been a significant uptake in digital skilling to fill the exponential rise in new tech jobs. In fact, the World Economic Forum is forecasting that nearly 150 million new tech jobs will be created in the next five years. Microsoft is committed to supporting learners interested in upskilling and … Read more

Fast, Accurate Anomaly Detection Based on Copulas (COPOD)

Copula-Based Outlier Detection (COPOD) is a new algorithm for anomaly detection. It has been implemented in the PyOD Python package. It has several key features that make it stand out from competing algorithms: Deterministic No hyperparameters! (This is important because it is difficult to tune hyperparameters for outlier detection tasks because the true labels … Read more

5 Deep Learning Trends Leading Artificial Intelligence to the Next Stage

“An artificial neural network is only a rough analogy of how the brain works.” (David Sussillo, Google Brain Group) In the 1950s, several important scientific breakthroughs laid the groundwork that gave birth to AI. Research in neurology found that the brain is composed of neural networks that “fire in all-or-nothing pulses.” This … Read more

Using RStudio to Amplify Digital Marketing Results

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers]. Photo by Myriam Jessier on Unsplash We recently teamed up with Extendo … Read more

Best Practices for Bucketing in Spark SQL

The ultimate guide to bucketing in Spark. Bucketing is a feature supported by Spark since version 2.0. It is a way to organize data in the filesystem and leverage that organization in subsequent queries. There are many resources that explain the basic idea of bucketing; in this article, we will go one step further … Read more

Five Numpy Functions You Should Understand

Numpy is arguably the most used library in data science, just in line with Pandas. It’s the first library you import in every data science project, and in my experience, it’s quite helpful to just know some functions to quickly modify your arrays with the least amount of Googling. import numpy as np Thus, spin … Read more

Designing data driven decision making; Kaggle ColeRidge

[This article was first published on rblog – Code X Value, and kindly contributed to R-bloggers]. There is an interesting challenge running on Kaggle at … Read more

Correlation Vs Causation

Comparing correlation with causation Photo by Benjamin Behre on Unsplash Correlation is also known as an association. It refers to a relation between two different entities or data points. When one thing goes up, another comes down, and vice versa, which means that they change together. Let’s take an example of an MNC which is studying … Read more

Regression analysis in R-Model Comparison

[This article was first published on Methods – finnstats, and kindly contributed to R-bloggers]. Regression analysis in R, just look at the Boston housing data … Read more

Plotting movement data in R using ggmap and ggplot

[This article was first published on JourneyR Blog, and kindly contributed to R-bloggers]. With ever increasing sources of movement data from GPS in phones, animal … Read more

MySQL vs. SQLite

Exploring the differences between two popular databases. Photo by Javardh on Unsplash As a change of pace, I decided to take a trip back to Databases. This time though, I wanted to do a little comparison instead of another tutorial-style post. Thinking about my time in college, I remember a particular class where we discussed … Read more

K-Means Clustering — A Comprehensive Guide to Its Successful Use in Python

K-Means is one of the most well-known and most commonly used algorithms due to its simplicity and speed. At the same time, it has some limitations (more on that later). It is designed to make use of a few straightforward steps that are repeated through multiple iterations. You can refer to the below gif for … Read more

How to Run 40 Regression Models with a Few Lines of Code

MACHINE LEARNING Learn how to run over 40 machine learning models using Lazy Predict for regression projects Image by Malte Helmhold. Source: Unsplash Let’s say you need to work on a regression machine learning project. You analyze your data, do some data cleaning, create a few dummy variables, and now it’s time to run a … Read more

Automated Interactive Package for EDA, Modeling, and Hyperparameter Tuning in a few lines of…

PyWedge — Interactive package to speed up data science modeling workflow Image by StockSnap from Pixabay A data scientist spends most of the time performing exploratory data analysis (EDA). There are various components of the data science modeling pipeline, including EDA, data processing, hyperparameter tuning, baseline modeling, and model deployment. There are various open-source Python … Read more

Interpretable Machine Learning: Advantages and Disadvantages

Are interpretable machine learning models always practical? Source: Photo by geralt from Pixabay There is increasing emphasis on interpretable machine learning in the world of data. Models have been growing ever more complex with the use of neural networks becoming more mainstream, along with the sheer size of data being analysed today. In many cases, … Read more

Version 0.11.0 of NIMBLE released

[This article was first published on R – NIMBLE, and kindly contributed to R-bloggers]. We’ve released the newest version of NIMBLE on CRAN and on … Read more

Handling missing values in R

[This article was first published on Methods – finnstats, and kindly contributed to R-bloggers]. Handling missing values in R, one of the common tasks in … Read more

A Primer on the EM Algorithm

Christian Zuniga, PhD Figure 1. Example EM model used for mixture models [1] The Expectation-Maximization (EM) algorithm is one of the main algorithms in machine learning for estimation of model parameters [2][3][4]. For example, it is used to estimate mixing coefficients, means, and covariances in mixture models as shown in Figure 1. Its objective is … Read more

Faster, smoother, smaller, more accurate and more robust face alignment models

Face alignment is a crucial component in most face analysis systems. It focuses on identifying the location of several key points of human faces in images or videos. Although several methods and models are available to developers in popular computer vision libraries such as OpenCV or Dlib, they still struggle with challenges such as … Read more

Introducing OpenHAC— an open source toolkit for digital biomarker analysis and machine learning

Through OpenHAC it is also possible to create, analyze, and extract new digital biomarker features with its machine learning classification tools. Combining behavioral characteristics with manual classifications, a user can create effective classifiers for behavioral manifestations such as pain, drowsiness, activity level, and atypical movement, among many others. OpenHAC uses the PyCaret library, powered … Read more

What is the Tukey loss function?

The Tukey loss function The Tukey loss function, also known as Tukey’s biweight function, is a loss function that is used in robust statistics. Tukey’s loss is similar to Huber loss in that it demonstrates quadratic behavior near the origin. However, it is even more insensitive to outliers because the loss incurred by large residuals … Read more
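
The behavior described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's own code: it uses the textbook form of Tukey's biweight loss, in which residuals beyond a cutoff c all incur the same constant loss, and the default c = 4.685 is a conventional tuning constant rather than a value taken from the excerpt. The function name tukey_loss is hypothetical.

```python
def tukey_loss(residual: float, c: float = 4.685) -> float:
    """Tukey biweight loss for a single residual.

    Assumption: the textbook definition with tuning constant c
    (4.685 is a common default, not a value from the excerpt).
    """
    if abs(residual) <= c:
        u = residual / c
        # Roughly residual**2 / 2 near the origin, like squared error or Huber loss.
        return (c * c / 6.0) * (1.0 - (1.0 - u * u) ** 3)
    # Beyond the cutoff the loss stays flat, so extreme outliers add no extra penalty.
    return c * c / 6.0


if __name__ == "__main__":
    for r in (0.0, 0.5, 2.0, 5.0, 50.0):
        print(f"residual={r:5.1f}  loss={tukey_loss(r):.4f}")
```

Lowering c makes the loss flatten out sooner, which discounts more observations as outliers; raising it moves the behavior closer to ordinary squared error.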

Image Captions with Deep Learning: State-of-the-Art Architectures

INTUITIVE IMAGE CAPTIONS SERIES A gentle guide to image feature encoders, sequence decoders, attention, and multi-modal architectures, in plain English Photo by Brett Jordan on Unsplash Image Captioning is a fascinating application of deep learning that has made tremendous progress in recent years. What makes it even more interesting is that it brings together both … Read more

Curious about Google Cloud Bare Metal Solution? Start here.
Developer Advocate, Google

How does it work? Bare Metal Solution provides purpose-built bare metal machines in regional extensions that are connected to Google Cloud by a managed, high-performance connection with a low-latency network fabric. Google Cloud provides and manages the core infrastructure, the network, the physical and network security, and hardware monitoring capabilities in an environment from which … Read more

10 Tips And Tricks For Data Scientists Vol.6

[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers]. We have started a series of articles on tips and … Read more

Data Wrangling Solutions — Working With Dates — Part 2

Convert columns from non-date to date datatypes Photo by Steinar Engeland on Unsplash In the last tutorial, we looked at the various approaches to read the data files containing the date-time variables. In this tutorial, we will discuss how to handle columns of an existing dataframe that contain date-time data but are still read by … Read more

15 Business Questions about Mobile Marketing Campaigns: ROAS (Return On Ad Spend)

Q1. Are there more Vungle or Facebook campaigns? From a total of 89 campaigns, 77 are running on Facebook and only 12 through the Vungle platform, corresponding to 87% and 13% respectively.

    # Side table
    absolut = df["media source"].value_counts().to_frame()
    percent = (df["media source"].value_counts(normalize=True)*100).to_frame().rename(columns={"media source":"percent"})
    out_bal = pd.concat([absolut,percent],axis=1).round(decimals=2)
    display(out_bal)
    # Pie graph
    absolut.plot(kind='pie', subplots=True, autopct='%1.2f%%', explode= (0.05, 0.05), startangle=80, legend=False, fontsize=12, figsize=(16,7))
    # … Read more

Run Shiny Apps Locally

[This article was first published on R – Hosting Data Apps, and kindly contributed to R-bloggers]. You might be eager to deploy your Shiny app … Read more

Changing dynamics of market crises: a review of country, sector and equity behaviours over the…

A mathematical study of evolutionary financial market behaviours In this paper, we study the behaviours and similarity profile of country financial indices, sector financial indices and equities over the past 20 years. First, we introduce a new methodology to determine the similarity of 20 countries and 11 sectors with respect to their time series’ structural … Read more

The Data Science Trilogy: NumPy, Pandas and Matplotlib basics

NumPy, Pandas and Matplotlib basics Img src: https://www.pexels.com/photo/business-charts-commerce-computer-265087/ So you are new to Python. Or perhaps you are already familiar with these libraries, but wanted to get a quick refresher. Whatever the case may be, Python has become without a doubt one of the most popular programming languages today, as shown by the following graph … Read more

Extracting Emotions from Faces with Face++ (and Microsoft Azure)

In the following, I first show the general flow of the main function with pseudo code. Pseudo code describes the steps of the algorithm in plain language. After that, the actual code is explained using the example of Face++. Accessing Azure works very similarly. The full R-Scripts for API calls to both Face++ and … Read more

ten computer codes that transformed science

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. In a “Feature” article of 21 January 2021, Nature goes … Read more
