Now it’s even easier to connect JetBrains IDEs to Amazon RDS or Redshift Databases

Customers can use database features with DataGrip and other premium JetBrains IDEs such as IntelliJ IDEA Ultimate, PyCharm Professional, WebStorm and Rider. The AWS Toolkit for JetBrains is an open-source plugin that lets you leverage the integrated development environment (IDE) for the creation, debugging, and deployment of software applications on Amazon Web Services. This new feature …

How to do more with less data ?— Active learning

It goes without saying that choosing an evaluation set is the most important step in any machine learning process. This becomes even more crucial when it comes to active learning since this will be our measure of how well our model performance improves during our iterative labelling process. Furthermore, it also helps us decide when …

AWS Copilot CLI launches v0.4 focused on autoscaling and operations

Today, the AWS Copilot CLI for Amazon Elastic Container Service (ECS) launched version 0.4.0. Starting with this release, you can enable autoscaling for services based on average CPU and memory utilization and provide a maximum and minimum number of tasks. AWS Copilot will also retain the service’s desired count after autoscaling occurred, so that if …

Become an Expert at the Technical Interview — Part I

If you’re reading this, then you’re most likely going through the grind of preparing for technical interviews (software engineers, data scientists, etc.). By now you should know that tech interviews are not like regular ‘old school’ interviews: we can’t just woo the hiring manager with our charm and talk out of our ass …

Logistic Regression for Binary Classification

Supervised Learning Methods in Machine Learning. In previous articles, I talked about deep learning and the functions used to predict results. In this article, we will use logistic regression to perform binary classification. Binary classification is named this way because it classifies the data into two results. Simply put, the result will …
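The excerpt stops before the model itself. As a rough sketch of the general idea (not the article's own code), logistic regression passes a weighted sum of the features through a sigmoid and thresholds the resulting probability to choose one of two classes. The weights, bias, and 0.5 threshold below are hypothetical values picked purely for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1 without overflowing."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(weights, bias, features, threshold=0.5):
    """Binary decision: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical, hand-picked parameters (a real model would learn these from data).
weights = [2.0, -1.0]
bias = -0.5

print(predict_label(weights, bias, [1.0, 0.5]))
print(predict_label(weights, bias, [0.0, 1.0]))
```

In practice the parameters would be fitted by minimizing log loss on labelled data; the sketch only shows the prediction step.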

Free workshop on Deep Learning with Keras and TensorFlow

[This article was first published on Shirin’s playgRound, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t. Workshop announcement: Because this year’s UseR 2020 in Munich couldn’t happen as …

3 Books for Successful Programmers in 2020 and beyond

Especially for recent graduates aspiring to become the next tech superstars. It is well known that leaders are readers, and this holds true in the tech industry just as it does anywhere else. However, this article is not about preaching to you about the necessity of reading for …

AI Platform Prediction goes GA with improved reliability & ML workflow integration | Staff Engineer, Senior Engineer

Machine learning (ML) is transforming businesses and lives alike. Whether it be finding rideshare partners, recommending products or playlists, identifying objects in images, or optimizing marketing campaigns, ML and prediction are at the heart of these experiences. To support businesses like yours that are revolutionizing the world using ML, AI Platform is committed to providing …

How Waze predicts carpools with Google Cloud’s AI Platform | Senior Data Scientist, Waze

Model vs. serving complexity: In the early stage of our product, we started with simple logistic regression models to estimate the likelihood of users sending/accepting offers. The models were trained offline using scikit-learn. The training set was obtained using a “log and learn” approach (logging signals exactly as they were during serving time) over …

Data Wrangling in Pandas : A Downloadable Cheatsheet

DATA ANALYSIS: Turn raw data into functional form. Following the acquisition of raw data, data wrangling is the most essential step in transforming it into a more functional form for data analysis, model building and data visualization. It involves preprocessing, restructuring and cleaning operations, and the end product is a dataset in a readily accessible format, …

Deploy Angular Apps to Firebase From GitHub

The directory .github/workflows, when pushed to GitHub, will be read by GitHub and used as the GitHub Actions folder, where we store YAML files for our workflows! But what is a YAML file? Well, according to our beloved Wikipedia: “YAML is a human-readable data-serialization language. It is commonly used for configuration files and …

Top 3 Insights from using Query-based ML to Automate Invoice Processing

ML in RPA 1) can be easy, 2) requires a different mindset and 3) has numerous use cases. In the Posti project, a software robot used machine learning to fill in the purchase invoice process forms. The RPA team of Posti, the Finnish logistics giant, started to use …

Why R? 2020 Conference Starts 2020-09-26

[This article was first published on http://r-addict.com, and kindly contributed to R-bloggers]. The fourth edition of the Why R? Conference (2020.whyr.pl) is starting on 2020-09-26. Below are …

Accelerating genomics workflows and data analysis on Azure

Genomics is foundational to the development of targeted therapeutics and precision medicine. Advances in DNA sequencing technologies have driven a revolution in genomics-based research and are helping facilitate a better understanding of human biology and disease conditions. This expanded knowledge is leading to the proliferation of personalized medicine strategies to prevent, diagnose, and treat diseases. The …

Free text in surveys – important issues in the 2017 New Zealand Election Study by @ellis2013nz

This is a quick post looking at using biterm topic modelling, a new technique for me, on the free text responses to a survey question. I’m interested in whether this type of topic modelling might be a shortcut to analysing free text, quicker than having a human read the answers and code them. The question …

Lessons learned from 500+ Data Science interviews

[This article was first published on http://r-addict.com, and kindly contributed to R-bloggers]. Dear R Enthusiasts! Today at the Why R? 2020 conference we host a Discussion …

An extensible, interactive visualization framework to measure gender bias in the news

Modularity and code readability: A Dash app can be conceptualized as being made of three main blocks of code. The first of these is the application layout, which contains the structure of the web page, including all the interactive widgets and core components (dropdowns, text inputs, etc.) required by the user to modify an app’s …

Exploring the Indian Premier League using PageRank

The eponymous PageRank algorithm, which is under the hood of the famed Google search, is surprisingly simple in its bare-bones avatar. Most algorithms, when they are made to work in practice, undergo a zillion minor changes to be effective and efficient; these modifications to PageRank are outside the scope of this article. …
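To make the "surprisingly simple" claim concrete, here is a minimal power-iteration sketch of bare-bones PageRank. It is an illustration, not the article's code: the three-team graph, the reading of an edge as "points credit to", the damping factor of 0.85, and the fixed iteration count are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Bare-bones PageRank by power iteration.

    links maps each node to the list of nodes it points to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every other node splits its rank equally among the nodes it points to.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: an edge A -> B means "A passes credit to B".
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for team in sorted(ranks, key=ranks.get, reverse=True):
    print(team, round(ranks[team], 3))
```

The ranks always sum to 1, so they can be read as the share of "credit" each node ends up holding.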

Based on your activity, you should like this: Instagram vs TikTok!

Evaluating how recommender systems learn about users by comparing Instagram to TikTok. “Based on your activity, we think that you might like this.” That phrase is one of the most common on almost every social media platform, and there is a good reason for it. The conversion rate on …

Generating Novel Content without Dataset

Rewriting the rules in GAN: Copy & paste features contextually. (Image: editing a StyleGAN model to add a helmet to the horse.) GAN architecture has been the standard for generating content through AI, but can it actually invent new content outside what’s available in the training dataset? Or is it just imitating the training data and mixing …

Is it Possible to Make Machine Learning Algorithms without Coding?

Orange is an open-source tool that allows us to perform a wide range of data-manipulation tasks such as data visualization, exploration, preprocessing and model creation without writing Python, R or any other code. It’s ideal if you’re taking your first steps on this long learning path. It’s also suitable for more …

10 Essential Jupyter Notebook Extensions for Data Scientists

So there you go, these were 10 essential Jupyter Notebook extensions for Data Scientists or anyone else who likes to code and develop in a Jupyter Notebook as well. I hope you learned some new tricks that can make your job as a Data Scientist easier. Keep in …

A Brief Introduction to Software Testing

Functional testing has many categories, and you can use different ones depending on your testing goals. To better explain the differences between them, let’s look at the 5 most used functional testing techniques. №1: Unit testing. Unit testing is usually performed by the developer writing different unit tests …

How to Analyze Emotions and Words of the Lyrics From your Favorite Music Artist

An interesting way of performing text and sentiment analysis on song lyrics using Python. Music is a powerful language for expressing our feelings and in many cases is used as therapy to deal with tough moments in our lives. The different sounds, rhythms, and effects used in music …

Building a Movie Recommender using Python

A content-based recommender is a recommendation model that returns a list of items similar to a specific item. Netflix, YouTube and Disney+ are good examples of such recommenders. For example, Netflix recommends shows similar to ones you have watched and liked. With this project, you will have a better understanding of how these …
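A minimal sketch of the "items similar to a specific item" idea, using cosine similarity over hand-made feature vectors. The catalog, the three genre features, and the function names are hypothetical; the article's own project may well use a different representation, such as text features from plot descriptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(title, catalog, top_n=2):
    """Return the top_n titles whose features are closest to the given title."""
    target = catalog[title]
    scored = [
        (other, cosine_similarity(target, features))
        for other, features in catalog.items()
        if other != title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_n]]


# Hypothetical catalog; each vector is [action, comedy, sci-fi].
catalog = {
    "Movie A": [1, 0, 1],
    "Movie B": [1, 0, 1],
    "Movie C": [0, 1, 0],
    "Movie D": [1, 1, 0],
}

print(recommend("Movie A", catalog))
```

Because the recommendation depends only on item features, it works even for a user with no rating history, which is the main appeal of the content-based approach.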

Rewiring Your Brain from Python to Java

Seven conceptual hurdles you might face when learning a new programming language. Confession: my personal experience is almost the complete opposite of the title of this article. I actually started with C++ in college, moved to Java to teach AP Computer Science A, and then entered Python territory to work with all of the snazzy …

5 Must-Read Data Science Papers (and How to Use Them)

#1: Hidden Technical Debt in Machine Learning Systems 🙈 The team at Google Research provides clear instructions on antipatterns to avoid when setting up your data science workflow. This paper borrows the metaphor of technical debt from software engineering and applies it to data science. As the next paper explores in greater …

Automatic Speech Recognition for the Indian Accent

After you request the dataset, IITM will give you access to their Google Drive links for seven days. Because I needed the data for an extended period, I transferred all the ZIP files to a Google Cloud Bucket. Each ZIP file will have a folder containing the .wav files and the corresponding metadata file named …

Statistical test for MCAR in python…

Have you wondered whether a missing “age” is related to the “Salary” of the respondent in a survey? Have you ever thought of analysing associations between various missing values in a dataset? How can you be sure that the data are missing without any definite pattern? The answer to these questions is fairly straightforward …

Profile and log your ML data | by Isaac Backus

Production logging approaches for AI and data pipelines. By Isaac Backus and Bernease Herman. It’s 2020 and most of us still don’t know when, where, why, or how our models go wrong in production. While we all know that “what can go wrong, will go wrong,” or that “the …

Running an R Script on a Schedule: Docker Containers on gitlab

In this tutorial/howto I show you how to run a Docker container on a schedule on GitLab. Docker containers are awesome because, once made, they run everywhere! It does not matter what type of computer you have (though I believe there is a problem with ARM-based vs. other CPUs). Once I build a container you …

RStudio Named Strong Performer in the Forrester Wave™: Notebook-Based Predictive Analytics and Machine Learning, Q3 2020

[This article was first published on RStudio Blog, and kindly contributed to R-bloggers]. As we’ve discussed in recent blog posts on Serious Data Science, Interoperability, …

Writing conundrums

[This article was first published on R on OSM, and kindly contributed to R-bloggers]. We’re taking a break from our portfolio series and million sample …

Amazon Aurora PostgreSQL Supports pglogical Extension

Starting today, Amazon Aurora PostgreSQL supports the pglogical extension. pglogical is an open source PostgreSQL extension that helps customers replicate data between independent Aurora PostgreSQL databases while maintaining consistent read-write access and a mix of private and common data in each database. Amazon Aurora pglogical uses logical replication to copy data changes between independent Aurora …

Gadfly.jl — The Pure Julia Plotting Library From Your Dreams

Gadfly.jl on the surface follows the simple methodology that most Julia users are familiar with. We can create a simple plot using the plot() function, with two keyword arguments, x and y: using Gadfly; data = Dict([:X => [5,10,32,31,51,43], :Y => [82, 33, 21, 26, 11, 22]]); plot(x=data[:X], y=data[:Y]) Here is where things get …

All together now: Fleet-wide monitoring for your Compute Engine VMs | Product Manager, Google Cloud

Troubleshoot issues with VM instances fleet-wide view: You’ve always been able to view and filter all your VM instances in Cloud Monitoring, and now you can do much more. The VM Instances dashboard now includes agent visibility and installation, and its new tabs let you see fleet-wide information across key metrics. View top VMs across …

Trust: Why AI Ethics and MLOps go hand-in-hand

The worst thing that can happen to your AI endeavor is that you end up damaging people’s lives through good intentions. Without proper AI ethics practices, your …

Outlier or Inlier: Types of Algorithms to Detect Anomalous Behavior

Anomaly detection is a critical problem that has been studied across diverse research areas and application disciplines. This article aims to give a structured and comprehensive overview of selected anomaly detection algorithms for data scientists, data analysts, and machine learning specialists. Concept of Anomaly Detection: An unexpected change that …

Gold-Mining Week 3 (2020)

[This article was first published on R – Fantasy Football Analytics, and kindly contributed to R-bloggers].

Sentiment Analysis: Idioms and their Importance

An investigation into the role of idioms in sentiment analysis. Note: The methodology behind the approach discussed in this post stems from my PhD thesis and can be found in this academic paper. Sentiment analysis (or opinion mining) aims to automatically extract and classify sentiments (the subjective part of an opinion) and/or emotions (the projections …

US Excess Mortality

[This article was first published on R on kieranhealy.org, and kindly contributed to R-bloggers]. The CDC recently released some new data on mortality counts by …

How Azure IoT enables business resilience

Businesses around the world have been harnessing the Internet of Things (IoT) to reduce operational expenses, power breakthrough customer experiences, and create new revenue streams. Now, faced with the unprecedented challenges of our times, businesses are also harnessing IoT to create safer workplaces as employees return to work, reduce expenses to weather downturns, reduce environmental …

shiny.semantic 0.4.0 Update Brings CSS Grid UI Functionality to Shiny Apps

tl;dr: shiny.semantic is an R package created by Appsilon that brings the Fomantic UI library to Shiny apps. With shiny.semantic, you can rapidly create great-looking Shiny apps with non-standard UI components. We have just released version 0.4.0, which includes standardized syntax, a useful grid functionality for easily positioning UI elements, and multiple quality of …