Rethinking Continuous Integration for Data Science

Software Engineering for Data Science. A widely used practice in software engineering deserves its own flavor in our field. As data science and machine learning gain wider industry adoption, practitioners realize that deploying data products comes with a high (and often unexpected) maintenance cost. As Sculley and co-authors argue … Read more

10 Minutes to Building a Fully-Connected Binary Image Classifier in TensorFlow

How to build a binary image classifier using fully-connected layers in TensorFlow/Keras. This is a short introduction to computer vision, namely how to build a binary image classifier using only fully-connected layers in TensorFlow/Keras, geared mainly towards new users. This easy-to-follow tutorial is broken down into 3 sections: … Read more

Le Monde puzzle [#1149]

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. The weekly puzzle from Le Monde is a leaking variant … Read more

Categories: R

beta: Evidence-based Software Engineering – book

[This article was first published on The Shape of Code » R, and kindly contributed to R-bloggers]. My book, Evidence-based software engineering: based on the … Read more

Categories: R

K Nearest Neighbors by hand: A Computer Science exercise for the Data Scientist

Opening the “black box” and understanding the algorithm within Data scientists sometimes talk about a “black box” approach to data science. That is, when you understand the use cases for different machine learning algorithms and how to plug in the data without understanding how the algorithm works beneath the surface. But the algorithms are just … Read more

New IT Cost Assessment program: Unlock value to reinvest for growth

If you’re in IT, chances are you’re under pressure to prioritize investments and optimize costs in response to the current economic climate. According to a recent survey of our customers, that situation describes 84% of IT decision makers. Likewise, Forrester Research has said CIOs could face a minimum of 5% budget cuts in 2020, and … Read more

Measuring Financial Risk: A Step-by-Step Guide

To calculate our own VaR and ES, we’ll use data for the Wilshire 5000, a stock market index widely considered to be the broadest measure of U.S. stock prices. We can use quantmod to import our data from FRED, the Federal Reserve Economic Database. We’ll also use ggplot2 to visualize our data. Let’s load our … Read more

Serverless BERT with HuggingFace and AWS Lambda

A typical transformers model consists of a pytorch_model.bin, config.json, special_tokens_map.json, tokenizer_config.json, and vocab.txt. The pytorch_model.bin has already been extracted and uploaded to S3. We are going to add config.json, special_tokens_map.json, tokenizer_config.json, and vocab.txt directly into our Lambda function because they are only a few KB in size. Therefore we create a model directory in our … Read more

Machine Learning Basics: Multiple Linear Regression

Learn to implement Multiple Linear Regression in Python. In the previous story, I gave a brief overview of Linear Regression and showed how to perform Simple Linear Regression. In Simple Linear Regression, we had one dependent variable (y) and one independent variable (x). What if the marks of the student depended on two or … Read more

Deploying Python script to Docker container and connect to external SQL Server (in 10 minutes)

Finally we want to build and run the image.
#Build the image
docker build -t my-app .
#Run it
docker run my-app
#Find container name
docker ps --last 1
#Check logs
docker logs <container name>
If you want to explore the container and run the script manually, then modify the last line of the Dockerfile, build and run again:
#CMD ["python","-i","main.py"]
CMD tail -f … Read more

How Can AI Boost Call Center Morale?

What if we used these technologies to actually make call center agents’ lives better? I don’t mean coaching them to do a better job. “Feedback overload” is already a recognized problem in call centers. I mean helping them cope with the fact that their job is emotionally draining. Remember how frustrating it was the last … Read more

Anything2Vec: Mapping Reddit into Vector Spaces

Generalizing Word2Vec away from word embeddings. [Figure: “Subreddit Embedding” and the 100 closest subreddits to /r/nba] A common problem in ML, natural language processing (NLP), and AI at large surrounds representing objects in a way computers can process. And since computers understand numbers (which we have a common language for comparing, combining and manipulating) … Read more

Impressions from e-Rum2020

[This article was first published on Mirai Solutions, and kindly contributed to R-bloggers]. e-Rum2020 conference connects many hundreds of R enthusiasts in virtual space! With … Read more

Categories: R

New Azure Firewall features in Q2 CY2020

We are pleased to announce several new Azure Firewall features that allow your organization to improve security, have more customization, and manage rules more easily. These new capabilities were added based on your top feedback: Custom DNS support now in preview. DNS Proxy support now in preview. FQDN filtering in network rules now in preview. IP … Read more

The Correct Way to Measure Inference Time of Deep Neural Networks

The network latency is one of the more crucial aspects of deploying a deep network into a production environment. Most real-world applications require blazingly fast inference time, varying anywhere from a few milliseconds to one second. But the task of correctly and meaningfully measuring the inference time, or latency, of a neural network, requires profound … Read more

Is Facial Recognition Technology Racist? State of the Art algorithms explained

Let’s break this function down one by one. The first component, face classification, simply penalizes the model for saying that there is a face at a location, while no face exists in the image. “Face box regression” is a fancy term for the distance between the bounding box coordinates of the predicted face and the … Read more

A Complete Beginner’s Guide to Deal with NULL Values in SQL

We cannot use the comparison operators =, <, >, <> to test for NULL values. Instead, we have to use the IS NULL and IS NOT NULL predicates. IS NULL: returns rows that contain NULL values.
Syntax: expression IS NULL
SELECT ID, Student, Email1, Email2
FROM tblSouthPark
WHERE Email1 IS NULL AND Email2 IS NULL
ORDER BY ID
The above query yields all records where both Email1 … Read more
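The behavior described above can be checked with a small, self-contained sketch. It uses Python's built-in sqlite3 module and an in-memory table; the table and column names follow the query in the excerpt, while the sample rows are invented purely for illustration.

```python
import sqlite3

# In-memory database; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblSouthPark (ID INTEGER, Student TEXT, Email1 TEXT, Email2 TEXT)"
)
conn.executemany(
    "INSERT INTO tblSouthPark VALUES (?, ?, ?, ?)",
    [
        (1, "Stan", "stan@example.com", None),
        (2, "Kyle", None, None),
        (3, "Eric", None, "eric@example.com"),
    ],
)

# Comparing with "=" never matches NULL: the comparison evaluates to NULL, not true.
rows_equals = conn.execute(
    "SELECT ID FROM tblSouthPark WHERE Email1 = NULL"
).fetchall()

# IS NULL is the predicate that actually finds missing values.
rows_is_null = conn.execute(
    "SELECT ID, Student FROM tblSouthPark "
    "WHERE Email1 IS NULL AND Email2 IS NULL ORDER BY ID"
).fetchall()

print(rows_equals)   # []
print(rows_is_null)  # [(2, 'Kyle')]
```

Row 2 is the only one with both email columns missing, so it is the only row the IS NULL query returns, while the "= NULL" filter returns nothing even though NULL values are present.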

AI pseudoscience and scientific racism

Recent attempts to predict criminality from facial features recall a long tradition of unethical and racist pseudoscience. A recent paper about to be published by Harrisburg University caused quite a stir earlier this month. Titled “A Deep Neural Network Model to Predict Criminality Using Image Processing,” the paper promised: With 80 percent … Read more

Coronavirus: Which country got it right?

Note from the editors: Towards Data Science is a Medium publication primarily based on the study of data science and machine learning. We are not health professionals or epidemiologists, and the opinions of this article should not be interpreted as professional advice. To learn more about the coronavirus pandemic, you can click here. In this … Read more

Why Building an AI Decentralized Autonomous Organization (AI DAO)

Beyond the already complex challenge of implementing AI, some companies have started analyzing the possible benefits of building AI Decentralized Autonomous Organizations (AI DAOs). During my latest mission, I had to help create new business models, identify the right AI approach, and create a roadmap for the creation of several AI DAOs proof of … Read more

The Bechdel test and the X-Mansion with tidymodels and #TidyTuesday

[This article was first published on rstats | Julia Silge, and kindly contributed to R-bloggers]. Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, … Read more

Categories: R

Future-Proofing Your Data Science Team

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers]. This is a guest post from … Read more

Categories: R

Introducing Polished.tech

[This article was first published on Posts on Tychobra, and kindly contributed to R-bloggers]. Polished.tech is our new software service that makes it easier than … Read more

Categories: R

ANOVA for Regression

y = α + βx gives the predicted values, where β is the slope and α is the y-intercept; we calculate the values of α and β from the above formula. The goal of simple linear regression is to create a linear model that minimizes the sum of squares of the residuals (errors). An interesting fact about … Read more
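As a minimal sketch of the least-squares fit described above: the formula the excerpt refers to is not shown here, so the code assumes the standard closed-form estimators, β = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and α = ȳ − β·x̄. The function name and the sample data are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (alpha, beta) minimizing the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx                 # slope
    alpha = y_mean - beta * x_mean   # y-intercept
    return alpha, beta


# Illustrative data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

alpha, beta = fit_simple_linear_regression(xs, ys)
residual_ss = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
print(alpha, beta, residual_ss)
```

Because the sample points lie exactly on a line, the fitted intercept and slope recover 1 and 2 and the residual sum of squares is zero; with noisy data the same function returns the line with the smallest residual sum of squares.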

one bridge further

[This article was first published on R – Xi’an’s Og, and kindly contributed to R-bloggers]. Jackie Wong, Jon Forster (Warwick) and Peter Smith have just … Read more

Categories: R

Find your most expensive lines of code and improve code quality with Amazon CodeGuru – now generally available

Developers can use Amazon CodeGuru Profiler to identify the most expensive lines of code by helping them understand the runtime behavior of their applications, identify and remove code inefficiencies, improve performance, and significantly decrease compute costs. Amazon CodeGuru Profiler provides visualizations and recommendations on how to fix performance issues and the estimated cost of running … Read more

Categories: AWS

Amazon Virtual Private Cloud (VPC) customers can now use their own Prefix Lists to simplify the configuration of security groups and route tables

VPC security groups and route tables are used to control access and routing policies. Customers often have a common set of CIDR blocks for security group and route table configurations. Prefix Lists allow you to group multiple CIDR blocks into a single object, and use it as a reference in your security groups or route … Read more

Categories: AWS

Learn How to Create Web Data Apps in Python

import streamlit as st
import pandas as pd
import plotly.express as px
import pydeck as pdk
import numpy as np
#Load and Cache the data
@st.cache(persist=True)
def getmedata():
    url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv'
    df = pd.read_csv(url, delimiter=',', header='infer')
    df.rename(index=lambda x: df.at[x, 'Country/Region'], inplace=True)
    dft = df.loc[df['Province/State'].isnull()]
    dft = dft.transpose()
    dft = dft.drop(['Province/State', 'Country/Region', 'Lat', 'Long'])
    dft.index = pd.to_datetime(dft.index)
    return(dft, df)
df1 = getmedata()[0]
st.title('Building a Data Dashboard with Streamlit')
st.subheader('while exploring COVID-19 data')
#####In Scope Countries
countrylist … Read more

Predicting Future Wars

Insights from Open Data and Machine Learning. I know what you are thinking: Wars are rare and complicated events, one can’t expect to take into account their entire complexity. And you are right, they spring from an intricate array of political, economic, and historical reasons without forgetting the thick coat of randomness, thus they should … Read more

Neural Networks using Tensorflow via Keras in R – Video

[This article was first published on http://r-addict.com, and kindly contributed to R-bloggers]. On June 25th we had the pleasure of hosting a Why R? Webinar with … Read more

Categories: R

Black-Scholes Option Pricing is Wrong

Theory, assumptions, problems, and solutions for practitioners. The equation offered by Black and Scholes (1973) is the standard theoretical pricing model for European options. The keyword is theoretical, as the Black-Scholes model makes some key assumptions that are immediately violated in practice. Key model assumptions: no transaction costs; no arbitrage … Read more

Kernel Live Patching for Amazon Linux 2 is now generally available

Many AWS customers introduce security updates by rolling out patched machine images (AMI) or by in-place patching instances followed by rolling restarts. This process is usually time consuming and may result in disruptions to running applications. Kernel Live Patching in Amazon Linux provides a way to reduce disruption and accelerate a rollout by applying a … Read more

Categories: AWS

How to Avoid Potential Machine Learning Pitfalls

This post is for all those data science aficionados out there who recently jumped onto the machine learning bandwagon. Whether you studied data science in college or are self-taught, most aspiring data scientists get a reality check when trying their hand at a machine learning project in a practical setting. I struggled with the … Read more

A guide to setting up monitoring for object creation in Cloud Storage

Cloud Storage provides worldwide, highly durable object storage that scales to exabytes of data. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as websites, mobile applications, backup and restore, archive, and big data analytics. The objects are … Read more

How to become a data scientist in 30 days?

[This article was first published on R – Hi! I am Nagdev, and kindly contributed to R-bloggers]. On a late evening, I was scrolling through … Read more

Categories: R