Best Use of Train/Val/Test Splits with Tips for Medical Data

This post addresses the appropriate way to split data into a training set, validation set, and test set, and how to use each of these sets to their maximum potential. It also discusses concepts specific to medical data with the motivation that the basic unit of medical data is the patient, not the example. If …
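As a rough illustration of why the patient, not the example, is the unit that matters, the sketch below assigns whole patients to train, validation, and test sets so that no patient's examples end up in more than one split. The function name `split_by_patient`, the split fractions, and the `(patient_id, example)` input format are assumptions made for this example; they are not taken from the post.

```python
import random
from collections import defaultdict


def split_by_patient(examples, val_frac=0.15, test_frac=0.15, seed=0):
    """Split (patient_id, example) pairs so each patient lands in exactly one set.

    Hypothetical helper for illustration; fractions apply to patients, not examples.
    """
    # Group every example under the patient it belongs to.
    by_patient = defaultdict(list)
    for patient_id, example in examples:
        by_patient[patient_id].append(example)

    # Shuffle patients (not examples) with a fixed seed for reproducibility.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_test = int(round(len(patient_ids) * test_frac))
    n_val = int(round(len(patient_ids) * val_frac))
    groups = {
        "test": patient_ids[:n_test],
        "val": patient_ids[n_test:n_test + n_val],
        "train": patient_ids[n_test + n_val:],
    }

    # Expand each group of patients back into its examples.
    return {
        name: [(pid, ex) for pid in ids for ex in by_patient[pid]]
        for name, ids in groups.items()
    }
```

Splitting at the example level instead would let scans from the same patient appear in both training and test data, which tends to inflate test performance.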

Lessons from doing Data Science for Eric Garcetti, Mayor of Los Angeles

My internship required me to put on different hats at work. When I focused on building a working solution, I had the inclination to put my engineer hat on to code, plug in the data and chug. Sometimes the tasks at hand are straightforward, for instance, producing visualizations and summary statistics of some dataset. But often …

Posts

[This article was first published on Random R Ramblings, and kindly contributed to R-bloggers].

Solving My Wife’s Problem ‘What Should I Wear Today?’ With AI

The next problem to solve is to trigger the capture only when the person is near the door area, which may indicate an intent to exit the house. The example footage below shows a false positive detection where a person is simply walking past from the front living room towards the dining room. False Positive Detection Looking …

The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization

How to take your model from unremarkable to amazing simply by cleaning and preprocessing your data. If you have a model that has acceptable results but isn’t amazing, take a look at your data! Taking the time to clean and preprocess your data the right way can make your model a star. Photo by Burst …

Neural Architecture Search — Limitations and Extensions

For the past couple of years, researchers and companies have been trying to make deep learning more accessible to non-experts by providing access to pre-trained computer vision or machine translation models. Using a pre-trained model for another task is known as transfer learning, but it still requires sufficient expertise to fine-tune the model on another …

Train a lines segmentation model using Pytorch

Now that we have the data ready and the network that we want to train defined, it is time to build a basic training pipeline. The first step is to define a torch Dataset and iterate through it using a DataLoader:

    from torch.utils.data import Dataset, DataLoader
    from torchvision import transforms, utils

    class FormsDataset(Dataset):
        def __init__(self, images, masks, num_classes: int, transforms=None):
            self.images …

Automating Machine Learning Models on AWS

Using AWS Lambda, S3 and EC2. Photo by Samuel Zeller on Unsplash. Working as a Research Assistant under Professor Gordon Gao, at the University of Maryland, I have had the opportunity to combine both my Data Engineering and Science interests to automate machine learning models in the cloud. Assisting one of Professor Gao’s PhD fellows, …

Integrative analysis of single-cell multi-omics data using deep learning

Image generated using Seurat. Single-cell RNA sequencing (scRNA-seq) has offered a comprehensive and unbiased approach to profile immune cells including T cells with a single-cell resolution using next‑generation sequencing. More recently, exciting technologies such as cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) have been developed to extend scRNA-seq by jointly measuring multiple molecular …

CountVectorizer|HashingTF

I came across these terms while matching company names with their corresponding business names. However, for confidentiality reasons I will walk you through them using a more common example. Before diving deep into CountVectorizer and HashingTF, let’s first broadly understand what they do. CountVectorizer and HashingTF estimators are used to generate term frequency vectors. …
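To make "term frequency vectors" concrete, here is a small pure-Python sketch of the two underlying ideas: counting terms against a learned vocabulary, and counting terms into a fixed number of hash buckets. It is not the Spark API. The function names `count_vectorize` and `hashing_tf`, the whitespace tokenization, and the use of `zlib.crc32` as the hash are all assumptions chosen for illustration.

```python
import zlib
from collections import Counter


def count_vectorize(docs):
    """Vocabulary-based term counts: one vector slot per distinct token."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)
        for tok, count in Counter(doc.split()).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


def hashing_tf(docs, num_features=16):
    """Hashing-trick term counts: tokens map to a fixed number of buckets.

    No vocabulary is stored, so different tokens may collide in one bucket.
    """
    vectors = []
    for doc in docs:
        vec = [0] * num_features
        for tok in doc.split():
            bucket = zlib.crc32(tok.encode("utf-8")) % num_features
            vec[bucket] += 1
        vectors.append(vec)
    return vectors
```

The trade-off this sketch shows is that the vocabulary approach keeps an interpretable mapping from slot to term but needs a pass over the data to build it, while the hashing approach needs no fitting but cannot be mapped back to terms.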

Crawlab — The Ultimate Live Dashboard For Web Crawler

Using my crawler as an example, the snapshot above shows the output of my crawler and, most importantly, you can download it in CSV format by just clicking the Download CSV button. The item_desc field shows undefined because my item_desc is in JSON format, but Crawlab does not support the output JSON field …

RSwitch 1.5.0 Release Now Also Corrals RStudio Server Connections

[This article was first published on R – rud.is, and kindly contributed to R-bloggers]. RSwitch is a macOS menubar application that works on macOS 10.14+ …

September 2019 Democratic Debates Added to {ggchicklet}

[This article was first published on R – rud.is, and kindly contributed to R-bloggers]. The latest round of the 2020 Democratic debates is over and …

Reinforcement Learning — TD(λ) Introduction(3)

Extend TD(λ) on Q function with Sarsa(λ). In the last posts, we learned the idea of TD(λ) with eligibility trace, which is a combination of n-step TD methods, and applied it to the random walk example. In this post, let’s extend the idea of lambda to more general use cases — instead of learning a …

[Theory of AI for Manufacturing] Part 1: Definition

This summer I decided to deepen my foundational knowledge of Artificial Intelligence (AI) by reading a book by Russell and Norvig, AI: A Modern Approach. I will attempt to distill some of the book’s points that might be relevant to the manufacturing industry in a series of articles. This first post is largely extracted from …

Rise and Fall of Symbolic AI

Philosophical presuppositions of AI. Symbolic AI: the early successful paradigm of AI, which was considered the road to general intelligence. Allen Newell, Herbert A. Simon — Pioneers in Symbolic AI. The work in AI started by projects like the General Problem Solver and other rule-based reasoning systems like Logic Theorist became the foundation for almost …

Implement Face Detection in Less Than 3 Minutes Using Python

Add a face detection feature to your app using this simple code. Face detection is one of the most common applications of Artificial Intelligence. From camera applications in smartphones to Facebook’s tag suggestions, the use of face detection in applications is increasing every single day. Face detection is the ability of a computer program to identify …

All you need to know about NLP based predictive modeling in one diagram

Uncluttering the NLP predictive modeling scene. The takeaway: 5 minutes from now you will become a pro at engineering features from natural language datasets to build a classic ML model, and at building embeddings that learn cool representations from complex natural language datasets to build a DNN model. NLP as a field is moving at warp …

Decoding Basics: An Ultimate Guide To Full Stack Developers

Full stack development is one of the most trending topics in the market today. People have put up a lot of queries on the social media platforms regarding full stack web development. But why do these questions arise and why do you actually need a full stack developer? This is something that you will find …

Pandas.Series : A Part of the backbone for Machine Learning in Python

One of the keys to understanding pandas is to understand the data model. At the core of pandas are three data structures: Series — 1D (can be understood as columns of a spreadsheet); DataFrame — 2D (can be understood as a single spreadsheet); Panel — 3D (can be understood as a group of spreadsheets). In …

Artificial Intelligence: More than Machine Learning

Artificial Intelligence. When people see these two words together, they commonly have two thoughts. The first is that computers and robots are going to take over the world. The second is usually another pair of words: Machine Learning. While the first is irrational and never going to happen (or will it?!) the second thought to …

Clustering FEC Quarterly Campaign Contributions

Cluster job roles in FEC data to create animated bar races visualizing contribution by occupation. Polls get plenty of attention from week to week, but the news generally lacks deep analysis of the FEC filings. The filings are rich with self-identifying donor information including job position, industry, and location. In the 2020 primary almost all …

Explain Your Model with the SHAP Values

Better Interpretability Leads to Better Adoption. Is your highly-trained model easy to understand? A sophisticated machine learning algorithm can usually produce accurate predictions, but its notorious “black box” nature does not help adoption at all. Think about this: If you ask me to swallow a black pill without telling me what’s in it, I certainly …

A single function to streamline image classification with Keras

We show, step-by-step, how to construct a single, generalized, utility function to pull images automatically from a directory and train a convolutional neural net model. Plenty has been written about deep learning frameworks such as Keras and PyTorch, and how powerful yet simple to use they are for constructing and playing with wonderful deep learning …

Scraping the Web: A fast and simple way to scrape Amazon

Photo by Nicolas Picard on Unsplash. As a data scientist or data enthusiast, one is always hungry for lots and lots of DATA. I can imagine the heart-eyes when you see lots of data in a website and your desire to grab all the data, perform all sorts of techniques you have learnt, apply statistics, …

AWS Storage Gateway adds Amazon CloudWatch logging and metrics for File Gateway

File Gateway provides on-premises applications with file-based, cached access to virtually unlimited cloud storage. With this launch, you can configure Amazon CloudWatch Logging to log configuration errors, such as insufficient bucket access privileges, or when applications use the gateway to access files that have transitioned to long-term Amazon S3 storage classes. You can subscribe to …

AWS Elemental MediaLive Now Supports HEVC and 4K/UHD Outputs for Live Channels

Encoding with HEVC offers a number of advantages. While UHD video requires an advanced codec beyond H.264 (AVC), high frame rate (HFR) or High Dynamic Range (HDR) content in HD also benefits from HEVC’s advancements. In addition, benefits can be achieved with HD and SD content even if HDR and HFR are not needed. HEVC …

AWS Elemental MediaStore Now Supports Stale Manifest Deletion

AWS Elemental MediaStore is a video origination and storage service that offers the high performance, predictable low latency, and immediate consistency required for live streaming media combined with the security and durability AWS offers across its services. MediaStore offers an inexpensive method for pass-through and low-latency segmented video content delivery, with pay-as-you-go pricing. The service …

Tell Me a Story: How to Generate Textual Explanations for Predictive Models

TL;DR: If you are going to explain predictions for a black box model you should combine statistical charts with natural language descriptions. This combination is more powerful than SHAP/LIME/PDP/Break Down charts alone. During this summer Adam Izdebski implemented this feature for explanations generated in R with the DALEX library. How did he do it? Find out here: …

How APIs help National Bank of Pakistan modernize the banking experience

Editor’s note: Today we hear from Zohaib Ali Khan, head of mobile financial services, and Nadir Ikram, technical lead at the National Bank of Pakistan (NBP), the country’s largest government-owned bank. Read on to learn more about how NBP uses APIs to help implement digital banking and reduce the burden of legacy manual processes. NBP, …

Customer Segmentation Using RFM in Apache Spark

RFM segmentation is a great method to divide customers into equal groups depending on three criteria (which compose the RFM acronym): Recency. How much time has elapsed since a customer’s last activity or transaction with the company? Frequency. How often has a customer transacted with the company during a particular period of time? Monetary. …
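The first two criteria above, plus total spend as a stand-in for the monetary value (an assumption, since the excerpt cuts off before defining it), can be computed per customer from a transaction log. The sketch below uses plain Python rather than Apache Spark, and the function name `rfm_table`, the `(customer_id, date, amount)` record layout, and the `as_of` reference date are assumptions for illustration. The step that bins customers into groups is left out.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Compute recency, frequency, and monetary values per customer.

    transactions: iterable of (customer_id, date, amount) tuples (hypothetical layout).
    as_of: reference date from which recency is measured.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)

    for customer_id, day, amount in transactions:
        # Recency depends only on the most recent transaction date.
        last_seen[customer_id] = max(day, last_seen.get(customer_id, day))
        frequency[customer_id] += 1
        monetary[customer_id] += amount

    return {
        customer_id: {
            "recency_days": (as_of - last_seen[customer_id]).days,
            "frequency": frequency[customer_id],
            "monetary": monetary[customer_id],
        }
        for customer_id in last_seen
    }
```

From a table like this, each of the three columns could then be ranked or bucketed to form the customer segments.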

Experian: From credit bureau to technology company with APIs

Editor’s note: Today we hear from Dang Nguyen, API Platform Product Owner at Experian, on how the company uses the Apigee API management platform to digitally transform from a traditional credit bureau to a true technology and software provider. Read on to learn how Experian uses APIs to help businesses make smarter decisions and individuals …

Quantum Metric gets answers from customer data at light speed

Editor’s note: Today we’re hearing from the founder of Quantum Metric, a digital intelligence platform that analyzes huge amounts of digital customer data to improve the customer experience, enhance sales, and increase loyalty. The company credits a huge leap in innovation—along with a 10-fold increase in business—to their decision to adopt Google Cloud. Here’s more …

5 Powerful Habits of the Most Productive Teams.

“Never give up. Never stop believing. Never stop fighting!” As a developer with a career spanning two decades, I have two glowing events permanently etched within my memory. · As part of the worst-performing team which was sacked (including me). · As part of the best performing team which won the president’s …

Curiosity + Data + Customer Segmentation = Goodies

TL;DR: I used a Kaggle database to show you how to separate your customers into distinct groups based on their purchase behavior. With this method, store managers can customize interactions with existing and potential customers to increase loyalty and eventually, all of the goodies that come with consistent purchases. For the R enthusiasts out …

Simple Football Data-set Exploration with Pandas

International Football Results From 1872 to 2019. This tutorial article details how the Python Pandas library can be used to explore a data-set efficiently. Specifically, this example will use the data-set, International football results from 1872 to 2019, which is available from the Kaggle website. Chiefly, this tutorial will explore simple visualizations and how they can be …

Getting on the meet-up bandwagon – our first meet up event

My company Draper and Dash have tasked me with organising a wider meet-up event for anyone who is interested in AI / ML in healthcare. This wider working group consists of people from different sectors, however they are interested in how we can apply AI / ML methods in their organisations. Why did we choose …

Data Science and Machine Learning Adoption in Middle East Countries | Clevered

Market Trends. Recently, the Middle East has seen a swarm of investments pooling in from domestic as well as international markets to boost their efforts in innovation and technological advancements. When Amazon acquired a regional online shopping site ‘Souq.com’ for around $580m, it became a milestone for the technology sector and digital economy in the …

Natural Language Processing: A Crash Course!

Next word prediction is one of the applications of NLP. Natural Language Processing (NLP) is an increasingly important sub-domain of Machine Learning that deals with deriving general understanding through developing various language models from various large corpora of text. In this article, we go over 5 common types of NLP problems and their relevant models: …

Building an ecosystem of partners to help broadcasters transform their business

The cloud has made it possible for audiences to find the content they love anywhere, on any device, and as a result many broadcasters are looking to the cloud to help them grow and meet customer needs. Broadcasters are using the cloud to streamline content management workflows, modernize their video delivery infrastructure, and develop deeper …

What Makes the News?

Before attempting any modeling, I pre-processed the data using NLTK’s WordNet Lemmatizer to reduce the words to their base forms. I then tokenized the data using scikit-learn’s CountVectorizer, which generates a basic count of words in the text in a bag-of-words approach. This approach created a sparse matrix of approximately 4,600 unigrams and bigrams. Some …

Azure Media Services’ new AI-powered innovation

Animated character recognition, multilingual speech transcription and more now available. At Microsoft, our mission is to empower every person and organization on the planet to achieve more. The media industry exemplifies this mission. We live in an age where more content is being created and consumed in more ways and on more devices than ever. …