Ethics in AI: Decisions by Algorithms

Like anything, boundaries and frameworks need to be established, and artificial intelligence should be no different. Whether we have realized it or not, AI is changing the way we live. It’s present in the way social media feeds are organised; the way predictive searches show up on Google; and how music …

Ultimate Setup for Your Next Golang Project

Note: This was originally posted at martinheinz.dev. For me, the biggest struggle when starting a new project has always been trying to set the project up “perfectly”. I always try to use the best directory structure so everything is easy to find and imports work nicely, set up all commands so that I’m always one click/command away …

Kickstart Your First Clustering Project in San Francisco Neighborhoods

In this project, we will use the Foursquare API to explore neighborhoods in San Francisco, get the most common venue categories in each neighborhood, use the k-means clustering algorithm to find similar neighborhoods, and use the Folium library to visualize the neighborhoods in San Francisco and their emerging clusters. Project Flow: This is the clustering map …

Does news coverage boost support for presidential candidates in the Democratic primary?

[This article was first published on R on Jacob Long, and kindly contributed to R-bloggers]. Matt Grossmann noted the close relationship between the amount of news …

Announcing tidyUSDA: An R Package for Working with USDA Data

I’m proud to announce the release of an R package that has cured one of my own personal itches: pulling and working with USDA data, specifically Quick Stats data from NASS. tidyUSDA is a minimal package for doing just that. The following is cut out from the package vignette, which you can find here: https://github.com/bradlindblad/tidyUSDA …

Getting Stuff Done at Hackathons for Rookies

I thoroughly enjoyed my first hackathon (you can read about my experience with scope in a previous post). The opportunity arose through BetaNYC to participate in the Mobility for All Abilities Hackathon, part of the larger National Day of Civic Hacking of 2019. I was on the Reliable Access to Subways team, partnered with TransitCenter …

My First Data Science Project — Family-Friendly Neighborhoods in London

It’s great news to see that there are more family-friendly neighborhoods in London than there are neighborhoods to avoid. In fact, there are 136 neighborhoods to choose from. Here is a simple breakdown: So for any families like my own who are looking for the best family-friendly neighborhoods in London, England, I suggest you start …

P-value Explained Simply for Data Scientists

We create a sampling distribution of the mean of the WeightLoss samples, assuming our null hypothesis is true. Central Limit Theorem: The central limit theorem simply states that if you have a population with mean μ and standard deviation σ, and take random samples from the population, then the distribution of the sample means will …
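The excerpt above breaks off mid-statement, so the following is only a minimal sketch of what a sampling distribution of the mean looks like in practice, not the original post's code. The exponential population, the sample size of 50, and the helper name `sample_means` are all assumptions made for illustration. It checks the standard result that the sample means centre on μ with a spread close to σ/√n.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical, deliberately skewed population (exponential), so that any
# bell shape in the sample means comes from averaging, not from the data.
_rng = random.Random(42)
population = [_rng.expovariate(1.0) for _ in range(10_000)]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 50  # assumed sample size
means = sample_means(population, sample_size=n, n_samples=2_000)

print(f"population mean      : {mu:.3f}")
print(f"mean of sample means : {statistics.fmean(means):.3f}")
print(f"sigma / sqrt(n)      : {sigma / n ** 0.5:.3f}")
print(f"sd of sample means   : {statistics.stdev(means):.3f}")
```

The two pairs of printed numbers should be close to each other; increasing `n` should shrink the spread of the sample means further.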

Where should you go for college?

What your expected salary will be after graduating based on college degree and college region. Teenagers reach that point in their life where they need to pursue their goals in life. Some have ambitions that require college education. Some are still unsure about their goals or ambitions so they go to college to find them. …

Is There a Difference Between Open Data and Public Data?

There is a general consensus that when we talk about open data we are referring to any piece of data or content that is free to access, use, reuse, and redistribute. Due to the way most governments have rolled out their open data portals, however, it would be easy to assume that the data available …

A Non-Confusing Guide to Confusion Matrix

After reading all of that stuff about positives and negatives (a couple of times, preferably), you now have a basic idea and intuition about the confusion matrix, and you see that it’s not that confusing after all; it just needs to “sink in” properly. But is that all there is to the confusion matrix? I hope you’re kidding. …

Why Kaggle Is Not Inclusive and How to Improve It.

‘If you want to be good at swimming in pools, that is fine, go for Kaggle. If you want to be good on the open sea, go for Omdena’ (Leonardo Sanchez, Omdena challenge collaborator from Brazil). ‘What I learned in the past couple of months in Omdena’s AI challenge is much more than what …

Power BI as a Tool for Business Intelligence

Following my previous article on the Strata Data Science Conference, I started to ponder future developments in data science and business intelligence, namely, how these two simple terms will change the way we work, think, and live. To be honest, “data science” seems somewhat distant to me; however, the concept of “business intelligence” can …

The crux of word embedding layers - Part 1

In the last story, we discussed RASA NLU, an open-source conversational AI tool. We used the TensorFlow pipeline for intent classification. The pipeline has different components such as a tokenizer, featurizer, entity extractor, and intent classifier. Our intent classifier itself has sub-components such as the TensorFlow embedding. Now we are going to discuss …

Harry Potter and the Power of Bayesian Constrained Inference

[This article was first published on Fabian Dablander, and kindly contributed to R-bloggers]. If you are reading this, you are probably a Ravenclaw. Or a …

Introduction to Digital Images

This is the first post in the series “Digital Image Processing”. In this series, we will be discussing digital images and how to process them. Let’s discuss what an image is. If you come from a signal processing background, you might consider an image as a two-dimensional signal, i.e., a function of two variables f(x,y) …
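To make the idea of an image as a function f(x,y) concrete, here is a small sketch, not taken from the original post. It assumes a made-up continuous intensity function `f` (a left-to-right gradient) and a hypothetical helper `digitize` that samples it on a pixel grid and quantizes each value to an integer grey level, which is one common way to describe how a digital image arises.

```python
def f(x, y):
    """Hypothetical continuous intensity in [0, 1]: a left-to-right gradient."""
    return x


def digitize(intensity, width, height, levels=256):
    """Sample intensity(x, y) at pixel centres and quantize to integer grey levels."""
    image = []
    for row in range(height):
        y = (row + 0.5) / height          # pixel-centre coordinate in [0, 1]
        line = []
        for col in range(width):
            x = (col + 0.5) / width
            value = intensity(x, y)
            # Quantize to 0 .. levels-1, clamping the upper edge.
            line.append(min(levels - 1, int(value * levels)))
        image.append(line)
    return image


image = digitize(f, width=4, height=2)
for line in image:
    print(line)
```

With these assumed settings each of the two rows is `[32, 96, 160, 224]`: the grid size controls spatial resolution, and `levels` controls how finely intensity is quantized.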

When ‘Growth Hacks’ Meets ‘Growing Pain’

‘Growth Hacks’ vs ‘Growing Pain’: Some people accused him of fraud; I’m not so sure about that. This to me looks more like an honest mistake due to a lack of experience in scaling up and entering a field he is not familiar with. See, most of his more popular videos are entry-level tutorials with …

Making Data Science Interviews Better

What’s the actual objective of a business case interview? It’s to test the ability of a candidate to think both critically and creatively when faced with an open-ended problem. But as an interviewer, how do you assess these things? The thinking critically part is not as hard: if the person is stumbling through basic …

Using Keras and TensorFlow to Predict Dengue Fever Outbreaks

Machine learning prediction models using time-series weather data. Dengue, commonly called dengue fever, is a mosquito-borne disease that occurs in tropical and sub-tropical parts of the world. In mild cases, symptoms are similar to the flu: fever, rash, and muscle and joint pain. In severe cases, dengue can cause severe …

10,000 Ways That Won’t Work

Lesson 3 of “Practical Deep Learning for Coders” by fast.ai. “I have not failed. I’ve just found 10,000 ways that won’t work.” ~Thomas Edison I’m a math adjunct working my way through Lesson 3 of “Practical Deep Learning for Coders” by fast.ai, and this week has been a major pride-swallower for me. At the end …

Visualizing Tesla Superchargers in France

Learn visualization using Python and Folium, from scratch. Data visualization is not merely a science; it is an art. Because of the way the human brain works, it is easy to process information presented as a visualization. After almost 25 years of digital mapping and many companies using machine learning to collect mass amounts of data, …

New Quick Starts deploy JFrog Artifactory on AWS

Three new Quick Starts deploy JFrog Artifactory on the Amazon Web Services (AWS) Cloud in 30-45 minutes. The available options for deployment use your choice of Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), or Amazon Elastic Kubernetes Service (Amazon EKS).

Coding algorithms in R for models written in Stan

[This article was first published on R – Statisfaction, and kindly contributed to R-bloggers]. Stanislaw Ulam’s autobiography, “Adventures of a Mathematician”, originally published in 1976 …

AWS IoT Core Introduces Beta Feature To Simplify Device Certificate Registration

AWS IoT is announcing a new feature for AWS IoT Core called “Multi-Account Registration,” which is now available in beta. The new feature allows customers to quickly move devices between their AWS accounts by specifying the account information when the device connects to AWS IoT Core. Customers opting to use this feature will use Server …

Artificial Intelligence: Do stupid things faster with more energy!

Why we need a new breed of leader in the data-fueled era. Multiple choice time! What’s the best kind of worker? A) Reliable workers who carry out orders precisely, quickly, and efficiently. B) Unreliable workers who may or may not feel like doing what they’re told. If you think this is a no-brainer and reliable …

How to get the Python Environment of all Spark Cluster Nodes

For our analyses of anonymized mobile phone location data here at Invenium we use, amongst others, Apache Spark™. In our applications, we interface with it directly using the Java API as well as the Python API, pyspark. Recently we noticed an unusual performance drop when running our algorithms. After making sure that we hadn’t made …

Get Involved With SciPy!

SciPy wants your ideas to help it become more user-friendly. You’ve heard of SciPy. You’ve probably used it. You might have looked through some of the technical documentation and user guides. You might even have an opinion of the documentation… But have you given any thought to actually getting involved and letting SciPy know how …

A classic bedtime story: Cinderella of Neural Networks

Endgame for “AI Winter”: how a competition, ImageNet, along with a noisy algorithm, Stochastic Gradient Descent, changed the fate of AI. In the early 1980s, Winter was coming for Artificial Intelligence (AI), with a period of reduced funding and interest in AI research, which would later be called …

Mapping the Underlying Social Structure of Reddit

Reddit is a popular website for opinion sharing and news aggregation. The site consists of thousands of user-made forums, called subreddits, which cover a broad range of subjects, including politics, sports, technology, personal hobbies, and self-improvement. Given that most Reddit users contribute to multiple subreddits, one might think of Reddit as being organized into many …

Handling dates and times in R: a free online course

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. If you ever need to work with data involving dates, times or durations …

Amazon ECS supports Automated Draining for Spot Instances running ECS Services

Automated Spot Instance Draining will automatically place Spot instances in the “DRAINING” state upon receipt of a two-minute interruption notice. ECS tasks running on Spot instances will automatically be triggered for shutdown before the instance terminates, and replacement tasks will be scheduled elsewhere on the cluster. No new ECS service tasks will be started on …

Introduction to Principal Component Analysis (PCA) — with Python code

Variance as Information: In machine learning, we need features for the algorithm to figure out patterns that help differentiate classes of data. The more features there are, the more variance (variation in the data) there is, and hence the easier the model finds it to make ‘splits’ or ‘boundaries’. But not all features provide useful information. They can have noise …

101 Data Science Interview Questions, Answers, and Key Concepts

In October 2012, the Harvard Business Review described “Data Scientist” as the “sexiest” job of the 21st century. Well, as we approach 2020 the description still holds true! The world needs more data scientists than there are available for hire. All companies, from the smallest to the biggest, want to hire for a …

Artificial Intelligence: Explainable in every language

What is AI? It depends on who you ask. Since the term was coined in 1956, “Artificial Intelligence” has endured a lifetime of misunderstanding. Explainability is the missing link and the reason why it’s misunderstood. The problem lies in the interpretation of the word “intelligence.” In the words of legendary computer scientist Edsger Dijkstra: “The question …

Data warehouse migration challenges and how to meet them

Editor’s note: This is the second in a series on modernizing your data warehouse. Find part 1 here. In the last blog post, we discussed why legacy data warehouses are not cutting it anymore and why organizations are moving their data warehouses to the cloud. We often hear that customers feel that migration is an …

6 strategies for scaling your serverless applications

Let’s review the basic way in which serverless functions scale as you take a function from your laptop to the cloud. At a basic level, a function takes input and provides an output response. That function can be repeated with many inputs, providing many outputs. A serverless platform like Cloud Functions manages elastic, horizontal scaling …