Writing your first Neural Net in less than 30 lines of code with Keras.

(Image: https://unsplash.com/@tvick) Thinking back to when I first started my journey into AI, I remember all too well how daunting some of the concepts seemed. Reading a simple explanation of what a neural network is can quickly lead to a scientific paper where every second sentence is a formula with symbols you’ve never even seen before. …

Object Detection with Less Than 10 Lines of Code Using Python

Find out what objects are in the image. Want to know what objects are in the image? Or perhaps you want to count the number of apples in an image? In this post, I will show you how to create your own object detection program using Python in less than 10 lines of code. You …

4 Graph Algorithms on Steroids for Data Scientists with cuGraph

We all know how clustering works. You can think of Connected Components, in layman’s terms, as a sort of hard clustering algorithm that finds clusters/islands in related/connected data. As a concrete example: say you have data about roads joining any two cities in the world. And you need to find out all the …
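The idea of finding "islands" in connected data can be sketched in plain Python. This is only an illustration under stated assumptions: it uses a breadth-first search rather than the cuGraph API, and the `connected_components` helper and the road list are hypothetical names and data made up for the example.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Group nodes into connected components using breadth-first search."""
    # Build an undirected adjacency map from the edge list.
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Explore everything reachable from this unvisited node.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

# Hypothetical road data: each pair is a road joining two cities.
roads = [("Delhi", "Mumbai"), ("Mumbai", "Pune"), ("Paris", "Berlin")]
components = connected_components(roads)
```

Each returned set is one island of mutually reachable cities; a GPU library would compute the same grouping on far larger graphs.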

Ranking The Best UFC Fighters Using PageRank and Neo4j

UFC heavyweight Brock Lesnar: 100% All-American muscle (and “supplements”). MMA fans and pundits love to debate which fighters should be considered the all-time greats. However, these rankings are often a subjective mix of fighter records, individual biases, analyses from Reddit keyboard-warriors, and gut feelings. Is there a more data-driven way to assemble fighter …

Getting Started With Bounding Box Regression In TensorFlow

Bounding box regression could be your first wonderful step in the world of object detection. Wanna get started with object detection? Any ML learner would like to see nice bounding boxes around an object in an image (at least I would!). We’ll now learn a basic concept in object detection called Bounding Box …

Geocode with Python

How to convert physical addresses to geographic locations (latitude and longitude). Datasets are rarely complete and often require pre-processing. Imagine some datasets have only an address column without latitude and longitude columns to represent your data geographically. In that case, you need to convert your data into a geographic format. The process of converting …

Introduction to Natural Language Processing for Noobs

A general introduction to NLP and its basic pipeline using a Kaggle competition. This post is my attempt to give an overview of basic concepts which might help noobs. For better understanding, I will be following a completed Kaggle competition, Quora Insincere Questions Classification. Here, we are provided 1.31 million questions with labels and 376 thousand …

Beginner Guide to Machine Learning Pipeline Monitoring

A less discussed, yet important, topic in the development of machine learning pipelines. Many companies are using the power of machine learning to provide prediction, recommendation, or classification on both the front end and back end of their applications. A report from The Verge states that “Eventually, pretty much everything will have [machine learning] somewhere inside.” …

Introduction to Machine Learning in C# with ML.NET

Loading in and preprocessing a dataset in ML.NET is quite different from working with other machine learning packages/frameworks because it requires us to explicitly state the structure of our data. To do so, we create a file called ModelInput.cs inside a folder called DataModels. Inside this file, we will state all the columns of …

BERT for dummies —  Step by Step Tutorial

DIY practical guide on Transformers. Hands-on, proven PyTorch code for intent classification in NLU with fine-tuned BERT. This article introduces everything you need in order to take off with BERT. We provide a step-by-step guide on how to fine-tune Bidirectional Encoder Representations from Transformers (BERT) for Natural Language Understanding and benchmark it against an LSTM. Chatbots, …

Best Use of Train/Val/Test Splits with Tips for Medical Data

This post addresses the appropriate way to split data into a training set, validation set, and test set, and how to use each of these sets to their maximum potential. It also discusses concepts specific to medical data with the motivation that the basic unit of medical data is the patient, not the example. If …

Lessons from doing Data Science for Eric Garcetti, Mayor of Los Angeles

My internship required me to put on different hats at work. When I focused on building a working solution, I had the inclination to put my engineer hat on to code, plug the data and chug. Sometimes the tasks at hand are straightforward, for instance, producing visualizations and summary statistics of some dataset. But often …

Solving My Wife’s Problem ‘What Should I Wear Today?’ With AI

The next problem to solve is to trigger the capture only when the person is near the door area, which may indicate an intent to exit the house. The example footage below shows a false positive detection, where a person is simply walking past from the front living room towards the dining room. (Figure: False Positive Detection.) Looking …

The Ultimate Beginner’s Guide to Data Scraping, Cleaning, and Visualization

How to take your model from unremarkable to amazing simply by cleaning and preprocessing your data. If you have a model that has acceptable results but isn’t amazing, take a look at your data! Taking the time to clean and preprocess your data the right way can make your model a star. Photo by Burst …

Neural Architecture Search — Limitations and Extensions

For the past couple of years, researchers and companies have been trying to make deep learning more accessible to non-experts by providing access to pre-trained computer vision or machine translation models. Using a pre-trained model for another task is known as transfer learning, but it still requires sufficient expertise to fine-tune the model on another …

Train a lines segmentation model using PyTorch

Now that we have the data ready and the network that we want to train defined, it is time to build a basic training pipeline. The first step is to define a torch Dataset and iterate through it using a DataLoader:

    from torch.utils.data import Dataset, DataLoader
    from torchvision import transforms, utils

    class FormsDataset(Dataset):
        def __init__(self, images, masks, num_classes: int, transforms=None):
            self.images …

Automating Machine Learning Models on AWS

Using AWS Lambda, S3 and EC2. (Photo by Samuel Zeller on Unsplash.) Working as a Research Assistant under Professor Gordon Gao at the University of Maryland, I have had the opportunity to combine my Data Engineering and Data Science interests to automate machine learning models in the cloud. Assisting one of Professor Gao’s PhD fellows, …

Integrative analysis of single-cell multi-omics data using deep learning

(Image generated using Seurat.) Single-cell RNA sequencing (scRNA-seq) has offered a comprehensive and unbiased approach to profile immune cells, including T cells, at single-cell resolution using next-generation sequencing. More recently, exciting technologies such as cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) have been developed to extend scRNA-seq by jointly measuring multiple molecular …

CountVectorizer|HashingTF

I came across these terms while matching company names with their corresponding business names. However, for confidentiality reasons, I will walk you through them using a more common example. Before diving deep into CountVectorizer and HashingTF, let’s first broadly understand what they do. CountVectorizer and HashingTF estimators are used to generate term frequency vectors. …
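To make "term frequency vectors" concrete, here is a minimal pure-Python sketch of the two indexing strategies. It is not the Spark API: `count_vector` and `hashed_vector` are hypothetical helpers, the company-name tokens are invented, and `zlib.crc32` merely stands in for whatever hash function a real implementation uses.

```python
from collections import Counter
import zlib

def count_vector(tokens, vocabulary):
    """Term frequencies indexed by a fitted vocabulary (CountVectorizer-style)."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

def hashed_vector(tokens, num_features):
    """Term frequencies indexed by a hash of each term (HashingTF-style)."""
    vector = [0] * num_features
    for token in tokens:
        # Assumption: crc32 is used only because it is deterministic and built in.
        index = zlib.crc32(token.encode("utf-8")) % num_features
        vector[index] += 1
    return vector

# Invented example documents, already tokenized.
docs = [["acme", "corp", "acme"], ["globex", "corp"]]

# The vocabulary approach needs a pass over the data to learn the terms.
vocabulary = sorted({token for doc in docs for token in doc})
count_vectors = [count_vector(doc, vocabulary) for doc in docs]

# The hashing approach needs no vocabulary, but distinct terms may collide.
hashed_vectors = [hashed_vector(doc, 8) for doc in docs]
```

The trade-off the sketch illustrates: a vocabulary keeps every index interpretable at the cost of a fitting pass, while hashing skips that pass and fixes the vector size up front, accepting possible collisions.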

Crawlab — The Ultimate Live Dashboard For Web Crawler

Using my crawler as an example, the snapshot above shows the crawler’s output and, most importantly, you can download it in CSV format just by clicking the Download CSV button. The item_desc field shows undefined because my item_desc is in JSON format, but Crawlab is not supporting the output JSON field …

Reinforcement Learning — TD(λ) Introduction(3)

Extend TD(λ) to the Q function with Sarsa(λ). In the last posts, we learnt the idea of TD(λ) with eligibility traces, which is a combination of n-step TD methods, and applied it to a random walk example. In this post, let’s extend the idea of lambda to more general use cases: instead of learning a …

Rise and Fall of Symbolic AI

Philosophical presuppositions of AI. Symbolic AI: the early successful paradigm of AI, which was considered the road to general intelligence. Allen Newell and Herbert A. Simon, pioneers in symbolic AI. The work in AI started by projects like the General Problem Solver and other rule-based reasoning systems like Logic Theorist became the foundation for almost …

Implement Face Detection in Less Than 3 Minutes Using Python

Add a face detection feature to your app using this simple code. Face detection is one of the most common applications of Artificial Intelligence. From camera applications in smartphones to Facebook’s tag suggestions, the use of face detection in applications is increasing every single day. Face detection is the ability of a computer program to identify …

All you need to know about NLP based predictive modeling in one diagram

Uncluttering the NLP predictive modeling scene. The takeaway: 5 minutes from now you will become a pro at engineering features from natural language datasets to build a classic ML model, and at building embeddings that learn cool representations from complex natural language datasets to build a DNN model. NLP as a field is moving at warp …

Decoding Basics: An Ultimate Guide To Full Stack Developers

Full stack development is one of the top trending topics in the market today. People have posted a lot of questions on social media platforms regarding full stack web development. But why do these questions arise and why do you actually need a full stack developer? This is something that you will find …

Pandas.Series : A Part of the backbone for Machine Learning in Python

One of the keys to understanding pandas is to understand the data model. At the core of pandas are three data structures: Series: 1D (can be understood as a column of a spreadsheet); DataFrame: 2D (can be understood as a single spreadsheet); Panel: 3D (can be understood as a group of spreadsheets; since removed from pandas). In …

Clustering FEC Quarterly Campaign Contributions

Cluster job roles in FEC data to create animated bar races visualizing contribution by occupation. Polls get plenty of attention from week to week, but the news generally lacks deep analysis of the FEC filings. The filings are rich with self-identifying donor information including job position, industry, and location. In the 2020 primary almost all …

Explain Your Model with the SHAP Values

Better interpretability leads to better adoption. Is your highly trained model easy to understand? A sophisticated machine learning algorithm can usually produce accurate predictions, but its notorious “black box” nature does not help adoption at all. Think about this: If you ask me to swallow a black pill without telling me what’s in it, I certainly …

A single function to streamline image classification with Keras

We show, step by step, how to construct a single, generalized utility function to pull images automatically from a directory and train a convolutional neural net model. Plenty has been written about deep learning frameworks such as Keras and PyTorch, and how powerful yet simple to use they are for constructing and playing with wonderful deep learning …

Scraping the Web: A fast and simple way to scrape Amazon

(Photo by Nicolas Picard on Unsplash.) As a data scientist or data enthusiast, one is always hungry for lots and lots of DATA. I can imagine the heart-eyes when you see lots of data on a website and your desire to grab all the data, perform all sorts of techniques you have learnt, apply statistics, …

Customer Segmentation Using RFM in Apache Spark

RFM segmentation is a great method to divide customers into equal groups depending on three criteria (which compose the RFM acronym): Recency. How much time has elapsed since a customer’s last activity or transaction with the company? Frequency. How often has a customer transacted with the company during a particular period of time? Monetary. …
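The recency and frequency questions above translate directly into per-customer aggregates. The sketch below is plain Python rather than Apache Spark, and it rests on assumptions: the `rfm` helper and the transaction records are hypothetical, and because the excerpt's definition of the monetary criterion is cut off, the code assumes it means total spend.

```python
from datetime import date

def rfm(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer, day, amount in transactions:
        # Track the latest purchase date, the purchase count, and total spend.
        last, freq, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), freq + 1, total + amount)
    return {
        customer: ((today - last).days, freq, total)
        for customer, (last, freq, total) in summary.items()
    }

# Invented records: (customer, purchase date, amount spent).
transactions = [
    ("alice", date(2020, 1, 5), 20.0),
    ("alice", date(2020, 1, 20), 35.0),
    ("bob", date(2019, 12, 1), 10.0),
]
scores = rfm(transactions, today=date(2020, 2, 1))
# scores["alice"] is (12, 2, 55.0): last purchase 12 days ago, 2 purchases, 55.0 spent.
```

Splitting customers into groups would then be a matter of ranking each of the three values, for example by quantile.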

5 Powerful Habits of the Most Productive Teams.

“Never give up. Never stop believing. Never stop fighting!” As a developer with a career spanning two decades, I have two glowing events permanently etched in my memory:
· As part of the worst-performing team, which was sacked (including me).
· As part of the best-performing team, which won the president’s …

Simple Football Data-set Exploration with Pandas

International Football Results From 1872 to 2019. This tutorial article details how the Python Pandas library can be used to explore a data-set efficiently. Specifically, this example will use the data-set, International football results from 1872 to 2019, which is available from the Kaggle website. Chiefly, this tutorial will explore simple visualizations and how they can be …

Data Science and Machine Learning Adoption in Middle East Countries | Clevered

Market trends. Recently, the Middle East has seen a wave of investments pouring in from domestic as well as international markets to boost its efforts in innovation and technological advancement. When Amazon acquired the regional online shopping site Souq.com for around $580m, it became a milestone for the technology sector and digital economy in the …

Natural Language Processing: A Crash Course!

Next word prediction is one of the applications of NLP. Natural Language Processing (NLP) is an increasingly important sub-domain of Machine Learning that deals with deriving general understanding through developing various language models from various large corpora of text. In this article, we go over 5 common types of NLP problems and their relevant models: …

What Makes the News?

Before attempting any modeling, I pre-processed the data using NLTK’s WordNet Lemmatizer to reduce the words to their base forms. I then tokenized the data using scikit-learn’s CountVectorizer, which generates a basic count of words in the text in a bag-of-words approach. This approach created a sparse matrix of approximately 4,600 unigrams and bigrams. Some …
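A bag of unigrams and bigrams can be illustrated without any libraries. This is a rough sketch of the counting idea only, not the NLTK or scikit-learn pipeline described above: `ngram_counts` is a hypothetical helper, the sentence is invented, and lemmatization is skipped entirely.

```python
from collections import Counter

def ngram_counts(tokens, max_n=2):
    """Count every n-gram of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the tokens.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Invented example text, tokenized by whitespace.
tokens = "the mayor said the mayor agreed".split()
bag = ngram_counts(tokens)
# bag["the mayor"] is 2 and bag["said"] is 1.
```

Stacking one such count row per document, with one column per distinct n-gram, gives the kind of sparse matrix the passage refers to.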

Industry 4.0: Evolution to knowledge-based enterprises

Focus Area I: Defining the purpose of knowledge for the enterprise. Challenge: more information, less understanding. Possible solution: mission-performance-value knowledge assessment. We are living in a highly information-intense environment today. However, for enterprises, not all forms of information and knowledge are relevant. At the highest level, the purpose of knowledge should be derived (or understood) …

A Minimalist End-to-End Scrapy Tutorial (Part IV)

Systematic web scraping for beginners. (Photo by Paweł Czerwiński on Unsplash.) In the previous three parts, you have developed a spider that extracts quote information from http://quotes.toscrape.com and stores the data in a local SQLite database. In this part, I will show you how to deploy the …

The What, Why, and How of Sankey Diagrams

Step One: Design. First, solidify your purpose and the most important take-away for your audience. To avoid wasting time rebuilding your diagram or building an ineffective Sankey diagram, here are some questions I would recommend asking yourself before you start: Are you using this Sankey for exploratory data analysis? Are you using it to tell …

Train and deploy the Mighty Transformer NLP models using FastBert and AWS SageMaker

In my earlier introduction to FastBert, I described it as a library that will allow developers and data scientists to train and deploy BERT-based models for NLP tasks, beginning with text classification. The scope of BERT-based (read: Transformer-based) models has widened a bit since I wrote my earlier blog post and now includes BERT, XLNet, …

Throwing dice with maximum entropy principle

“nobody knows what entropy really is, so in any discussion you will always have an advantage” (von Neumann). Sounds like a good reason to dive into the meaning of entropy. This post is all about dice and maximum entropy. (Photo by Jonathan Petersson on Unsplash.) The post has four parts. In the first part, I …