Build and Compare 3 Models — NLP Sentiment Prediction

Finally, I used the Random Forest algorithm, which combines the predictions of a number of decision trees. In my example, I chose to use 300 trees, but I could change that number depending on the kind of accuracy I want from the model. Fitting Random Forest Classification to the Training set classifier … Read more Build and Compare 3 Models — NLP Sentiment Prediction
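To make "a combination of a number of decision trees" concrete, here is a minimal sketch of hard majority voting in plain Python. It is not the article's code: the three "trees" are hypothetical hand-written rules standing in for learned trees, and the feature names `good` and `bad` are assumptions for illustration. Library implementations may also combine trees differently, for example by averaging class probabilities.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the label that most trees vote for (hard majority voting)."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-in "trees": each maps word counts to a sentiment label.
# Real trees would be learned from bootstrap samples of the training set.
trees = [
    lambda s: "positive" if s["good"] > s["bad"] else "negative",
    lambda s: "positive" if s["good"] >= 1 else "negative",
    lambda s: "negative" if s["bad"] >= 2 else "positive",
]

sample = {"good": 2, "bad": 1}
prediction = forest_predict(trees, sample)
print(prediction)
```

Using an odd number of trees with two labels avoids ties; a forest of 300 trees works on the same principle with many more voters.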

An API for @racently

[This article was first published on R | datawookie, and kindly contributed to R-bloggers]. @racently is a side project that I have been nursing along … Read more An API for @racently

AWS Config Adds Support for AWS Key Management Service and Amazon Elasticsearch Service

AWS Config will automatically record the history of configuration changes for these resource types, if you have configured AWS Config to record all resource types in your account. You can use this information for operational troubleshooting, configuration audit, and change management. You can also create change-triggered AWS Config rules to help you verify whether these … Read more AWS Config Adds Support for AWS Key Management Service and Amazon Elasticsearch Service

Why I Donate All of My Book’s Proceeds to Girls Who Code

Grace Hopper, Ph.D. (Vassar Archives). Doing a small part to help close a gender gap. Few, if any, of my classmates shared my fascination with the Mark I Computer that was on display in our university’s Science Center. It is hard to blame them. Towering at 8 feet and filled with rotary switches, crystal diodes, … Read more Why I Donate All of My Book’s Proceeds to Girls Who Code

Fake Face Generator Using DCGAN Model

The implementation part is broken down into a series of tasks from loading data to defining and training adversarial networks. At the end of this section, you’ll be able to visualize the results of your trained Generator to see how it performs; your generated samples should look like fairly realistic faces with small amounts of … Read more Fake Face Generator Using DCGAN Model

Deploy A Text Generating API With Hugging Face’s DistilGPT-2

For the better part of a year, OpenAI’s GPT-2 has been one of the hottest topics in machine learning — and for good reason. The text generating model, which initially was dubbed “too dangerous” to be released in full, is capable of producing uncanny outputs. If you haven’t seen any examples, I recommend looking at … Read more Deploy A Text Generating API With Hugging Face’s DistilGPT-2

What can we really expect to learn from a pilot study?

[This article was first published on ouR data generation, and kindly contributed to R-bloggers]. I am involved with a very interesting project – the NIA … Read more What can we really expect to learn from a pilot study?

Community Call – Last Night, Testing Saved my Life

To the uninitiated, software testing may seem variously boring, daunting or bogged down in obscure terminology. However, it has the potential to be enormously useful for people developing software at any level of expertise, and can often be put into practice with relatively little effort. Our 1-hour Call will include two speakers and at least … Read more Community Call – Last Night, Testing Saved my Life

From Dev to Prod – All you need to know to get your Flask application running on AWS

Getting the right configurations, making sure it is secured, ensuring resource access through endpoints and having a pretty rendering, … all of them made easy thanks to AWS! As a machine-learning engineer, I never really faced the issue of putting my algorithms out there myself. Well, that was until recently, when I decided to start … Read more From Dev to Prod – All you need to know to get your Flask application running on AWS

Building a machine learning classifier model for diabetes

Based on medical diagnostic measurements. (Python code is available at https://github.com/JNYH/diabetes_classifier) The Pima Indians of Arizona and Mexico have the highest reported prevalence of diabetes of any population in the world. A small study has been conducted to analyse their medical records to assess whether it is possible to predict the onset of diabetes based on … Read more Building a machine learning classifier model for diabetes

AiPM

So what? The cost scales exponentially and unpredictably. The example we shared is just to manage one model, for one business line, and for one model cycle (a different issue may happen in the future). Now, imagine scaling this process to hundreds of models for multiple business units and functions. The bottom line: companies cannot … Read more AiPM

How Spotify Recommends Your New Favorite Artist

A story of data, taste, and a very effective recommender system. Just a few short days ago, I was discussing the impact of recommender systems with some students on a course I’m teaching. Netflix, Amazon, Facebook, and many other online services use our data to suggest other products we might like. Is this helpful, or … Read more How Spotify Recommends Your New Favorite Artist

Using R and H2O Isolation Forest For Data Quality

[This article was first published on R-Analytics, and kindly contributed to R-bloggers]. suppressWarnings( suppressMessages( library( h2o ) ) ) suppressWarnings( suppressMessages( library( dygraphs ) ) … Read more Using R and H2O Isolation Forest For Data Quality

Exploring the machine learning models behind Cloud IAM Recommender

To help you fine-tune your Google Cloud environment, we offer a family of ‘recommenders’ that suggest ways to optimize how you configure your infrastructure and security settings. But unlike many other recommendation engines, which use policy-based rules, some Google Cloud recommenders use machine learning (ML) to generate their suggestions. In this blog post, we’ll take … Read more Exploring the machine learning models behind Cloud IAM Recommender

What do campaign contributions tell us about the federal election?

With Canada’s 43rd Federal Election not too far in the rearview mirror, we at ThinkData Works were curious as to what we can learn about our most recent election by stepping back from the punditry and analyzing some data. After all, using government data is a great way to understand how our government works. There … Read more What do campaign contributions tell us about the federal election?

Quickly Build and Deploy an Application with Streamlit

With the launch of Streamlit, developing a dashboard for your machine learning solution has been made incredibly easy. Streamlit is an open source app framework specifically designed for ML engineers working with Python. It allows you to create a stunning looking application with only a few lines of code. I want to take this opportunity … Read more Quickly Build and Deploy an Application with Streamlit

The hardest question you’ve been asked in a data science interview

What’s the most difficult question you ever encountered in a data science interview? I’ll share mine: “How many years of experience do you have in language X?” This is really hard to answer: Do I count the years I used it in academia? Do I count the years I used it in my hobby projects? … Read more The hardest question you’ve been asked in a data science interview

Take your Machine Learning Models to Production with these 5 simple steps

I have created this impressive ML model, it gives 90% accuracy, but it takes around 10 seconds to fetch a prediction. Is that acceptable? For some use-cases maybe, but really no. In the past, there have been many Kaggle competitions whose winners ended up creating monster ensembles to take the top spots on the leaderboard. … Read more Take your Machine Learning Models to Production with these 5 simple steps

Announcing the general availability of the new Azure HPC Cache service

If data-access challenges have been keeping you from running high-performance computing (HPC) jobs in Azure, we’ve got great news to report! The now-available Microsoft Azure HPC Cache service lets you run your most demanding workloads in Azure without the time and cost of rewriting applications and while storing data where you want to—in Azure or … Read more Announcing the general availability of the new Azure HPC Cache service

Full Stack Development Tutorial: Serverless REST API running on AWS Lambda

Serverless computing is a cloud-computing execution model in which the cloud provider runs the server, and dynamically manages the allocation of machine resources. Pricing is based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity. — Wikipedia Photo by Anthony Cantin on Unsplash (This is the second … Read more Full Stack Development Tutorial: Serverless REST API running on AWS Lambda

How I Use AI Across One of My Favorite Hobbies — Photography

Neural Networks for labeling, compression, effects and more! You can read the article and follow along with the code in the repo: Poseyy/AI-Photography (github.com). An obvious application of AI to photography is … Read more How I Use AI Across One of My Favorite Hobbies — Photography

Online Marketing Measurement: Which Half?

Ads are a constant presence on today’s internet. They power Google and Facebook and follow us everywhere. As with all marketing spend, they’re an investment. As with most investments, it’s crucial to measure their return. What makes online marketing different is the unprecedented possibility of building accurate measurement tools. In this post I’ll describe a … Read more Online Marketing Measurement: Which Half?

How to Identify Hotel Deals — Using Machine Learning

Web Scraping. I used BeautifulSoup and Selenium in parallel to scrape 3 months of hotel listing information from Hotel.com. The information I scraped included the check-in and check-out dates, number of adults and children, distance to city and convention centers, hotel addresses, hotel reviews and ratings, TripAdvisor’s ratings and reviews, hotel amenities, and … Read more How to Identify Hotel Deals — Using Machine Learning

STL decomposition : How to do it from Scratch?

Figure out what STL decomposition is and how it works. This article will help you understand what STL decomposition is and how to do it from scratch. At the end, I will also use the statsmodels library to get the results in seconds. STL stands for Seasonal and Trend decomposition using Loess. This is a … Read more STL decomposition : How to do it from Scratch?

Managerial Analytics and Data Science

Previously, we learned about two general areas of machine learning: Supervised and Unsupervised learning. Here, we’ll investigate two special fields of machine learning: time series prediction and natural language processing. Time Series Forecasting. Time series forecasting refers to any type of supervised Machine Learning where time is an important feature. A good time series forecast … Read more Managerial Analytics and Data Science

Enter Analytics: From Boot Camp to working in Data Science

We covered a lot in a short amount of time… almost too many topics, actually. Just when you start getting comfortable and ready to do more advanced things, they change the topic. It is really up to you to decide what direction you want to take things outside of the classroom. For example, I am … Read more Enter Analytics: From Boot Camp to working in Data Science

Teaching A Computer To Land On The Moon

I spent a fair amount of time last year catching up on what’s happening in machine learning. The tools available now are really impressive — you can implement a complex neural net in just a few lines of code now with the libraries that are available. I’ve always been fascinated by the idea of machines … Read more Teaching A Computer To Land On The Moon

Few tips you can use while collecting data

1. Organize your scripts. I keep each script that executes a specific task separate, to keep my Jupyter Notebook clean and neat… It is important to stay organized while collecting data. That helps you easily find where your mistakes are. Writing one block of code wouldn’t help. I propose two habits to develop: keep the code commented … Read more Few tips you can use while collecting data

Beginner’s Guide to Encoding Data

As you can see, the Book_Table column has been encoded into numerical values of 0/1. The output of le.fit_transform(df[“Book_Table”]) is a NumPy array of integer codes, one per row (assuming le is scikit-learn’s LabelEncoder, which encodes one column at a time). Mostly binary columns (Book_Table from df) are encoded using Label Encoder. For multiclass columns it will give different values (0 to n_classes-1) for different classes, e.g. 0, 1, 2, 3, …, n-1; which are … Read more Beginner’s Guide to Encoding Data
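The mapping a label encoder performs can be sketched in a few lines of plain Python. This is an illustrative re-implementation under stated assumptions, not scikit-learn's code: the helper name `label_encode` and the sample `book_table` values are hypothetical, and classes are assigned codes in sorted order, which is how scikit-learn's LabelEncoder orders them.

```python
def label_encode(values):
    """Map each distinct label to an integer code from 0 to n_classes - 1."""
    classes = sorted(set(values))  # sorted order decides which label gets 0
    index = {label: code for code, label in enumerate(classes)}
    return [index[v] for v in values], classes


# Hypothetical binary column resembling Book_Table
book_table = ["Yes", "No", "No", "Yes"]
codes, classes = label_encode(book_table)
print(codes, classes)
```

The same function handles a multiclass column: three distinct labels would receive the codes 0, 1 and 2.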

How to graph a Bar Chart Race and realize I don’t need one?

I can totally read the chart with hindsight bias: campaign funding money doesn’t predict the performance of the candidate. But other than that, a cool animated bar graph doesn’t tell you that much going forward. After all, if we want to read a truly scientific and analytical piece, making some visualization is far from enough. … Read more How to graph a Bar Chart Race and realize I don’t need one?

Building a Pseudorandom Number Generator

This giant formula can be read like this: The advantage with which any probabilistic polynomial-time algorithm (the kind of algorithm that defines the class BPP) could distinguish the output of a PRNG from a sequence produced by a real random source tends to zero faster than the inverse of any polynomial as the length of the seed increases. Therefore, a PRNG is an algorithm that … Read more Building a Pseudorandom Number Generator
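The formula itself is not reproduced in this excerpt. One standard way to write the condition described is shown here as a sketch; the notation is an assumption and may differ from the original article's. G is the generator stretching an n-bit seed s to ℓ(n) > n bits, A is any probabilistic polynomial-time distinguisher, and p is any polynomial.

```latex
% For every probabilistic polynomial-time A, every polynomial p,
% and all sufficiently large seed lengths n:
\left|
  \Pr_{s \leftarrow \{0,1\}^{n}}\bigl[ A(G(s)) = 1 \bigr]
  -
  \Pr_{r \leftarrow \{0,1\}^{\ell(n)}}\bigl[ A(r) = 1 \bigr]
\right|
< \frac{1}{p(n)}
```

The left-hand side is the distinguisher's advantage; requiring it to fall below 1/p(n) for every polynomial p is what "tends to zero faster than the inverse of any polynomial" means.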

Preventing Data Leakage in Your Machine Learning Model

It goes without saying that knowledge about the dataset you are working with is necessary in order to be able to perform effective analyses and develop sound models. However, what is not often said, with regard to data leakage, is that you should refrain from studying the distributions or basic statistics of your dataset until after … Read more Preventing Data Leakage in Your Machine Learning Model

A comparison of methods for predicting clothing classes using the Fashion MNIST dataset in RStudio and Python (Part 1)

[This article was first published on R Views, and kindly contributed to R-bloggers]. Florianne Verkroost is a PhD candidate at Nuffield College at the University … Read more A comparison of methods for predicting clothing classes using the Fashion MNIST dataset in RStudio and Python (Part 1)

Statistical uncertainty with R and pdqr

General description. Statistical estimation usually has the following setup. There is a sample (observed, usually randomly chosen, set of values of measurable quantities) from some general population (whole set of values of the same measurable quantities). We need to make conclusions about the general population based on a sample. This is done by computing summary … Read more Statistical uncertainty with R and pdqr

Cleaning the Table

[This article was first published on R on kieranhealy.org, and kindly contributed to R-bloggers]. While I’m talking about getting data into R this weekend, here’s … Read more Cleaning the Table

A Hybrid Neural Machine Translation Model (Luong & Manning):

Luong & Manning recently published a paper entitled “Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models,” the contents of which I summarize below. For a quick summary of the current state of the art in Neural Machine Translation (NMT), you can take a look at my other post here. Currently the general word-based NMTs generate … Read more A Hybrid Neural Machine Translation Model (Luong & Manning):

Exoplanets III: Habitability and Conclusion

Two possibilities exist: either we are alone in the Universe or we are not. Both are equally terrifying. -Arthur C. Clarke Although the search for other planets is partly motivated by our efforts to understand their formation and to improve the understanding of our own solar system, the ultimate goal is to find extraterrestrial life. … Read more Exoplanets III: Habitability and Conclusion

Exoplanets II: Interpretation of Data

Now that we’ve seen and understood the historical background, the scientific value of the research, and the implications of the discoveries, we will look at the actual data found and compiled by space missions, so that we can relate them to actual physics. By plotting graphs of the data, we can visually see the correlations … Read more Exoplanets II: Interpretation of Data

Exoplanets I: Methods and Discoveries

Mankind has long speculated about planetary systems other than our own. Philosophers hypothesized centuries ago that our solar system was not unique; that there were in fact countless more that existed in the seemingly limitless ocean of stars. The possibility of life existing on a planet orbiting another star was not just a plausible … Read more Exoplanets I: Methods and Discoveries

How AI Will Redefine Economics

RESOLVING THE PROBLEM OF CAUSATION WITH BIG DATA For decades, economists have made their analyses of the economy based on data sets only as large as their research assistants could handle, hence severely limiting the scope and precision of their work. AI and machine learning will enable economists to dramatically enlarge these data sets and … Read more How AI Will Redefine Economics

Credit Card Fraud Detection using Self Organizing FeatureMaps

What are self-organising feature maps? A self-organizing map (SOM) is a type of artificial neural network (ANN) that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map, and is therefore a method to do dimensionality reduction. Self-organizing … Read more Credit Card Fraud Detection using Self Organizing FeatureMaps

Multi-Label Text Classification with XLNet

Click here for the Colab notebook accompanying this article. First, let’s install the necessary library, actually just transformers. Next, we import the necessary libraries. Check if the GPU is available. Mount your Google Drive to your Colab notebook. For our example, we will create a Data folder in our Google Drive and put the datasets … Read more Multi-Label Text Classification with XLNet

Nuances in the usage of Word Embeddings: Semantic and Syntactic Relationships

Note: Super short post ahead. Just food for thought I guess? 🙂 In the past weeks, I’ve been writing about Word Embeddings. How I created word embeddings from scratch for a colloquial language such as Singlish, and how I augmented it to handle misspellings or out-of-vocabulary words with translation vectors. In the latter article, I … Read more Nuances in the usage of Word Embeddings: Semantic and Syntactic Relationships