Train a TensorFlow Model in Amazon SageMaker

Example traffic signs from the dataset. Amazon SageMaker is a cloud machine-learning platform that enables developers to create, train, and deploy machine-learning models in the cloud. I previously used TensorFlow 2 to classify traffic signs with my onboard CPU. Today, I am going to do it in Amazon SageMaker. SageMaker has several advantages: it offers … Read more Train a TensorFlow Model in Amazon SageMaker

Topic Modeling with NLP on Amazon Reviews: An Application of Latent Dirichlet Allocation (LDA)

There are many ways to build topic models. By filtering the text for certain criteria, we can obtain different topic groups. In this study, we will build topic models using: (1) all the text data, (2) only nouns from the text, and (3) only nouns and adjectives. Topic Modeling: Attempt 1 (with all the review data). As … Read more Topic Modeling with NLP on Amazon Reviews: An Application of Latent Dirichlet Allocation (LDA)

How did I learn Data Science in 2 months?

Here’s the complete road-map to learn Data Science from “Zero to Hero” with source links provided. (Photo by Campaign Creators on Unsplash.) Contents: Introduction; Soft skills required for Data Science; Python basics; Python libraries (Numpy, Pandas and Matplotlib); SQL; Tableau/Power BI; Data analysis using Python; Statistics and Probability; Machine learning Algorithms; Deep learning (Neural … Read more How did I learn Data Science in 2 months?

Gradient Descent animation: 1. Simple linear Regression

This is the first part of a series of articles on how to create animated plots visualizing gradient descent. The Gradient Descent method is one of the most widely used parameter optimization algorithms in machine learning today. Python’s celluloid-module enables us to create vivid animations of model parameters and costs during gradient descent. In this … Read more Gradient Descent animation: 1. Simple linear Regression

K Nearest Neighbors by hand: A Computer Science exercise for the Data Scientist

Opening the “black box” and understanding the algorithm within. Data scientists sometimes talk about a “black box” approach to data science: you understand the use cases for different machine learning algorithms and how to plug in the data, without understanding how the algorithm works beneath the surface. But the algorithms are just … Read more K Nearest Neighbors by hand: A Computer Science exercise for the Data Scientist
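To make the "by hand" idea concrete, here is a minimal sketch of k-nearest-neighbours classification in plain Python. It is not taken from the article: the function name `knn_predict`, the toy points, and the choice of Euclidean distance with majority voting are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.2, 1.4), k=3)
print(prediction)
```

Sorting every distance is the simplest approach and fine for small data; library implementations typically use spatial indexes instead.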

How to create Latex tables directly from Python code

Copying tables of results from the console into a Latex report can be tedious and error fraught — so why not automate it? Making tables should be simple and elegant (Photo by Roman Bozhko on Unsplash). Creating tables of results plays a major part in communicating the outcomes of experiments in data science. Various solutions … Read more How to create Latex tables directly from Python code
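As a rough illustration of automating this step, the sketch below builds a LaTeX `tabular` environment from Python data using only the standard library. It is not the solution the article goes on to describe: the helper name `to_latex_table`, the column layout, and the example numbers are hypothetical.

```python
def to_latex_table(header, rows, float_format="{:.3f}"):
    """Render a header and rows as a LaTeX tabular environment (a string)."""

    def fmt(cell):
        # Format floats with a fixed number of decimals, everything else via str()
        return float_format.format(cell) if isinstance(cell, float) else str(cell)

    # First column left-aligned, remaining columns right-aligned
    column_spec = "l" + "r" * (len(header) - 1)
    lines = [
        r"\begin{tabular}{" + column_spec + "}",
        r"\hline",
        " & ".join(header) + r" \\",
        r"\hline",
    ]
    for row in rows:
        lines.append(" & ".join(fmt(cell) for cell in row) + r" \\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


# Made-up results, for illustration only
results = [("baseline", 0.8123, 0.7431), ("tuned", 0.8567, 0.8012)]
table = to_latex_table(["model", "accuracy", "f1"], results)
print(table)
```

Note that this sketch does not escape LaTeX special characters such as `_` or `%`, so cell contents would need cleaning before real use.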

Deploying Python script to Docker container and connect to external SQL Server(in 10 minutes)

Finally we want to build and run the image.

    # Build the image
    docker build -t my-app .
    # Run it
    docker run my-app
    # Find container name
    docker ps --last 1
    # Check logs
    docker logs <container name>

If you want to explore the container and run the script manually, then modify the last line of the Dockerfile, build and run again:

    #CMD ["python","-i","main.py"]
    CMD tail -f …

Read more Deploying Python script to Docker container and connect to external SQL Server(in 10 minutes)

Build your own deep learning classification model in Keras

Step #6: Create our model. In this task we will build a convolutional neural network classifier from scratch and train it to recognize the 20 target classes in the Pascal VOC dataset. Our model architecture will be based on the popular VGG-16 architecture. This is a CNN with a total of 13 convolutional layers (cfr. … Read more Build your own deep learning classification model in Keras

How to scrape ANY website with python and beautiful soup

Now you don’t need to know how HTML/CSS works (although, it can be really helpful if you do). The only thing that’s important to know is that you can think of every HTML tag as an object. These HTML tags have attributes that you can query, and each one is different. Each line of code … Read more How to scrape ANY website with python and beautiful soup

Measuring Agreement with Cohen’s Kappa Statistic

This lesser-known metric can help you better evaluate how models perform on imbalanced data A lot of the most intriguing — to me — use cases for classifications are to identify outliers. The outlier may be a spam message in your inbox, a diagnosis of an extremely rare disease, or an equity portfolio with extraordinary … Read more Measuring Agreement with Cohen’s Kappa Statistic
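To show why this metric matters on imbalanced data, here is a small sketch of Cohen's kappa computed by hand, using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohen_kappa` and the example labels are assumptions made for illustration, not code from the article.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences."""
    n = len(labels_a)
    # Observed agreement: fraction of positions where both raters agree
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two raters' marginal frequencies per class
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    classes = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[c] * counts_b[c] for c in classes) / n**2
    # Undefined (division by zero) when p_e == 1, i.e. both raters use one class only
    return (p_o - p_e) / (1 - p_e)


# Imbalanced toy example: 5% positives, and a "model" that always predicts 0
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
print(accuracy, kappa)
```

The always-negative predictor reaches 95% accuracy, yet its kappa is zero because that level of agreement is exactly what chance would produce.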

How (NOT) To Predict Stock Prices With LSTMs

Not so recently, a brilliant and ‘original’ idea suddenly struck me — what if I could predict stock prices using Machine Learning. After all, a time series can be easily modeled with an LSTM. I could see myself getting rich overnight! If this is so easy, why hasn’t anyone done it yet? Very excited at … Read more How (NOT) To Predict Stock Prices With LSTMs

What Covid-related topics are being discussed in Spotify Podcasts?

To my excitement, Spotify already has a developer API that we can use to get data from Spotify or trigger certain actions for Spotify users. All we need to do is register on the site, create an app, and get the API token. Then, we can use the spotipy package in Python to retrieve … Read more What Covid-related topics are being discussed in Spotify Podcasts?

Coloring an Image using Crayola Colors (Python)

Creating an array of colors. Firstly, I picked the corresponding RGB values for a 120-crayon Crayola box and copied them into a list.

    colorsFile = open("colors.txt","r")
    colors = []
    for line in colorsFile.readlines():
        colorset = line.strip().split(" ")
        rgbFormat = [int(x) for x in colorset[2].split(",")]
        colors.append(rgbFormat)

Secondly, I started by picking an image and resizing it to a smaller size. … Read more Coloring an Image using Crayola Colors (Python)

What Python package is best for getting data from Twitter? Comparing Tweepy and Twint.

What does that all mean? Practically, if the only thing you are looking to do is collect a large number of tweets, Twint is probably the better tool, whereas Tweepy is better suited to collecting a richer set of metadata and, for those using the official API, offers flexibility and potentially scalability as well. That’s … Read more What Python package is best for getting data from Twitter? Comparing Tweepy and Twint.

State-of-the-art python project setup

An opinionated setup guide for your next python project. Python is one of the fastest growing programming languages. Its tooling is evolving fast to catch up. I have been writing python for over 10 years now and sometimes it’s hard to keep up with all the new tooling out there. Recently, I had an opportunity … Read more State-of-the-art python project setup

Modules and Packages in Python: Fundamentals for Data Scientists

Understand the basics with a concrete example! Photo by Matthew Fournier on Unsplash When your Python code grows in size, most probably it becomes unorganised over time. Keeping your code in the same file as it grows makes your code difficult to maintain. At this point, Python modules and packages help you to organize and … Read more Modules and Packages in Python: Fundamentals for Data Scientists

Why doesn’t Python support i++ increment syntax

On a normal day, this line drew my attention when someone asked why ++ is not an operator in Python: “If you want to know the original reason, you’ll have to either wade through old Python mailing lists or ask somebody who was there (e.g. Guido)” (from Stack Overflow). And this enforces … Read more Why doesn’t Python support i++ increment syntax
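For readers who have not tried it, the short sketch below shows what actually happens with `++` in Python. The variable names are arbitrary; the behaviour itself follows from the language grammar, where `+` is a unary operator and there is no postfix increment.

```python
i = 5

# Prefix form: parsed as +(+i), two unary plus operators, so i is not changed
same = ++i

# Postfix form: not valid syntax at all; compile() lets us check without crashing
try:
    compile("i++", "<example>", "exec")
    postfix_is_valid = True
except SyntaxError:
    postfix_is_valid = False

# The idiomatic increment is augmented assignment
i += 1

print(same, postfix_is_valid, i)
```

The silent no-op of `++i` is the case worth remembering, since it runs without any error.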

Single Line of Code to Interchange Between Python Objects and JSON

Python Programming Tips. The easiest way to serialise/deserialise between Python objects and JSON: Attr and Cattr. In one of my previous articles, I introduced what is probably the best practice for Object-Oriented Programming (OOP) in Python: using the library “Attrs”. Probably the Best Practice of Object-Oriented Python: Attr Makes Python Object-Oriented Programming … Read more Single Line of Code to Interchange Between Python Objects and JSON

Figuring Out What’s “On Brand” for Corporate Instagram Posts Through Unsupervised Machine Learning

Data Science. Using web-scraping, dimensionality reduction, and unsupervised machine learning on 15,000+ Instagram posts from the 500 largest companies in the US to try to establish what constitutes the style of a normal corporate post. Back when I used to help plan and manage college events, the phrase “Style Guide” was thrown around every day. It’s the … Read more Figuring Out What’s “On Brand” for Corporate Instagram Posts Through Unsupervised Machine Learning

Statistical Thinking in Python (Part 2)

Every year for the past 40-plus years, Peter and Rosemary Grant have gone to the Galápagos island of Daphne Major and collected data on Darwin’s finches. Using your skills in statistical inference, you will spend this chapter with their data, and witness first hand, through data, evolution in action. It’s an exhilarating way to end … Read more Statistical Thinking in Python (Part 2)

Understanding Overfitting and Underfitting

For a supervised machine learning task, whether classification or regression, we want our model to do well on the test data. Doing well on test data is known in machine learning terms as generalization. So the better a model generalizes on test data, the … Read more Understanding Overfitting and Underfitting

Dynamic Cryptocurrency Trading Backtesting Platform — Python

allthestock.com I’ve recently been very interested in cryptocurrency day trading. While a quick internet search returns a slew of various indicator ideas and trading strategies, such as EMA/SMA crossings, RSI strategies, On Balance Volume, etc., I’ve found myself never able to find a resource that allows me to answer the question: “Well, does this actually … Read more Dynamic Cryptocurrency Trading Backtesting Platform — Python

How to Build a DCGAN with PyTorch

In this tutorial, we’ll be building a simple DCGAN in PyTorch and training it to generate handwritten digits. As part of this tutorial we’ll be discussing the PyTorch DataLoader and how to use it to feed real image data into a PyTorch neural network for training. PyTorch is the focus of this tutorial, so I’ll … Read more How to Build a DCGAN with PyTorch

A gentle introduction to iterators in C++ and Python

Part 2: A look at iterators in Python, using itertools Source: Ciaran Cooney (drawn in Powerpoint). In my previous post (here), I went into a general discussion on the virtues of using iterators in your code and ran through some beginner-level examples in C++. Here, I am going to extend this introduction to iterators by … Read more A gentle introduction to iterators in C++ and Python
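As a small taste of what `itertools` offers, the sketch below uses a few of its standard functions. The specific examples are chosen here for illustration and are not taken from the article.

```python
import itertools

# count() is an infinite iterator; the generator over it is evaluated lazily
squares = (n * n for n in itertools.count(1))

# islice() takes a finite slice of an iterator without materialising the rest
first_five = list(itertools.islice(squares, 5))

# chain() walks several iterables one after another as a single stream
chained = list(itertools.chain("ab", [1, 2]))

# combinations() yields every unordered pair, in input order
pairs = list(itertools.combinations([1, 2, 3], 2))

print(first_five)  # [1, 4, 9, 16, 25]
print(chained)     # ['a', 'b', 1, 2]
print(pairs)       # [(1, 2), (1, 3), (2, 3)]
```

Because everything is lazy, only the five squares that were requested are ever computed.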

Abstract Base Classes in Python: Fundamentals for Data Scientists

Understand the basics with a concrete example! Photo by Joshua Aragon on Unsplash In Python, abstract base classes provide a blueprint for concrete classes. They don’t contain implementation. Instead, they provide an interface and make sure that derived concrete classes are properly implemented. Abstract base classes cannot be instantiated. Instead, they are inherited and extended … Read more Abstract Base Classes in Python: Fundamentals for Data Scientists
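The behaviour described above can be sketched with the standard `abc` module. The class names `Shape` and `Rectangle` are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract base class: declares the interface that concrete shapes must provide."""

    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Rectangle(Shape):
    """Concrete class: inherits from Shape and implements area()."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


# An abstract base class cannot be instantiated directly
try:
    Shape()
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

rectangle_area = Rectangle(3, 4).area()
print(abstract_instantiated, rectangle_area)
```

A subclass that forgets to implement `area()` would fail with the same `TypeError` at instantiation time, which is how the interface is enforced.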

Tips for Writing Authentic Loops In Python

Programming Guide. Detailed recommendations to enhance the experience with Python loops in different scenarios. Photo by Tine Ivanič on Unsplash. The loop is a commonly used program control structure. We often say that one of the greatest advantages of machines over humans is that machines can do something repeatedly without sleep, but humans cannot. “Circulation” is … Read more Tips for Writing Authentic Loops In Python

6 Lesser-Known Yet Awesome Tricks in Pandas

Tricks I wish I knew sooner to get more value out of Pandas As the most popular Python library for analytics, Pandas is a big project that offers various data manipulation and processing capabilities. It is probably no exaggeration to say that data scientists, myself included, use Pandas on a day-to-day basis in our work. … Read more 6 Lesser-Known Yet Awesome Tricks in Pandas

Improve your map by scaling the area to visually convey underlying data

How to scale states or counties by population — or any other value — for your choropleth data visualization Many map visualizations that use color as a dimension are misleading. Social data is often put on a map (a choropleth), and colored to show some activity or behavior. The problem is that the data is … Read more Improve your map by scaling the area to visually convey underlying data

The potato train — using Python with extremely large numbers and arbitrary precision for binomial…

Math is hard, let’s go shopping — for tutorials, that is. I definitely wish I had read this tutorial before trying some things in Python that involve extremely large numbers (binomial probability for large values of n) and my code started to crash. But wait, I hear you saying, Python can handle arbitrarily large numbers, … Read more The potato train — using Python with extremely large numbers and arbitrary precision for binomial…
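One way this kind of crash can arise is sketched below: Python integers are arbitrary precision, but converting a huge integer to a float is not. The values of `n` and `k` are made up, and using `fractions.Fraction` is just one possible workaround, not necessarily the one the tutorial recommends.

```python
import math
from fractions import Fraction

# Probability of exactly k heads in n fair coin flips: C(n, k) / 2**n
n, k = 5000, 2500

# Exact integer arithmetic: Python ints have arbitrary precision
ways = math.comb(n, k)

# Naive float arithmetic fails: the count is far too large for a float
try:
    naive = float(ways) * 0.5**n
except OverflowError:
    naive = None

# Keeping the ratio exact and converting once at the end works
exact = Fraction(ways, 2**n)
probability = float(exact)

print(naive, probability)
```

The final probability is an ordinary small number; only the intermediate values are out of floating-point range.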

Get the Most out of scikit-learn with Object-Oriented Programming

Walk-through a data science example and enhance your Python coding chops! As data scientists, we are all familiar with scikit-learn, one of the most used machine learning and data analysis libraries available in Python. I’ve personally used it, along with pandas, for the majority of my professional projects. It is only recently, however, that I … Read more Get the Most out of scikit-learn with Object-Oriented Programming

Step-up Your RegEx Game in Python

These are some of the most incredibly useful regex methods I have ever used. Near the top of this page is my name and, in the HTML, my username. It looks like this: <a href="/@jamescalam?source=post_page-----22e4e63463af----------------------" class="cg ch au av aw ax ay az ba bb it be ck cl" rel="noopener">James Briggs</a> Say what we’re … Read more Step-up Your RegEx Game in Python
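As an illustration of the kind of extraction this excerpt is leading up to, the sketch below pulls a username and display name out of an anchor tag with `re.search` and named groups. The HTML string is a simplified stand-in for the tag above, and the pattern is an assumption, not the author's regex.

```python
import re

# Simplified version of the anchor tag (query string and class list shortened)
html = '<a href="/@jamescalam?source=post_page" rel="noopener">James Briggs</a>'

# Named groups: the username follows "/@" in href, the name is the link text
pattern = r'<a href="/@(?P<username>\w+)[^"]*"[^>]*>(?P<name>[^<]+)</a>'
match = re.search(pattern, html)

username = match.group("username")
name = match.group("name")
print(username, name)
```

For anything beyond a quick one-off like this, an HTML parser is usually more robust than a regular expression.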

Reinforcement Learning With Python | Part 1 | Creating The Environment

This class facilitates the communication between the environment and the agent; it is designed to work with an RL agent or with a human player. Main Components Needed by the RL Agent: 1- ENVIRONMENT_SHAPE attribute: used by the DQN to set the shape of the input layer. 2- ACTION_SPACE attribute: used by the DQN to set … Read more Reinforcement Learning With Python | Part 1 | Creating The Environment

The Pursuit of Happiness for The Confined Artificial Neural Network

“I have been years seeking the ideal place. And I have come to the conclusion that the only way I can possibly find it is to be it.” — Alan Watts More often than not, to feel happy is a choice one makes. When we are neither happy nor sad, it falls upon us to … Read more The Pursuit of Happiness for The Confined Artificial Neural Network

Dive into Queue Module in Python — It’s more than FIFO

Thread Programming. There are many applications of the queue data structure. A well-known use case is the message queue, in which we have a producer and a consumer. The producer and consumer usually sit in different threads. Luckily, the Queue() class has a thread-safe implementation with all the required locking mechanisms. So producer and consumer from … Read more Dive into Queue Module in Python — It’s more than FIFO
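The producer and consumer pattern mentioned above can be sketched with `queue.Queue` and `threading`. The function names, the doubling "work", and the use of `None` as a stop signal are assumptions made for illustration.

```python
import queue
import threading

task_queue = queue.Queue()  # thread-safe FIFO queue
results = []
SENTINEL = None  # hypothetical marker telling the consumer to stop


def producer():
    # Put five work items on the queue, then signal completion
    for item in range(5):
        task_queue.put(item)
    task_queue.put(SENTINEL)


def consumer():
    # get() blocks until an item is available, so no manual locking is needed
    while True:
        item = task_queue.get()
        if item is SENTINEL:
            break
        results.append(item * 2)


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results)  # [0, 2, 4, 6, 8]
```

With a single consumer, items come out in the order they were put in. The same module also provides `LifoQueue` and `PriorityQueue` for other orderings.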

The Computational Cost of Writing Clean Code

Especially in a language like Python, where there are at least ten ways to write anything, developers will often cram several operations into one line. Multiple variable assignment allows the reader to take in the functions applied in smaller batches. Additionally, it makes it easier to pick through the layers of parentheses present when more … Read more The Computational Cost of Writing Clean Code

Ranking algorithms — know your multi-criteria decision solving techniques!

Suppose you have a decision to make — like buying a house, or a car, or even a guitar. You don’t want to choose randomly or get biased by someone’s suggestion, but want to make an educated decision. For this, you gathered some information about the entity you want to buy (let’s say it’s a … Read more Ranking algorithms — know your multi-criteria decision solving techniques!

Simple Movie Recommendation System using Python

Photo by Noom Peerapong on Unsplash Recommendation systems are quite popular now. They can be used in either movie recommendation or shopping or e-commerce websites or on social media websites. It’s used intensively. There are three types of recommendation systems — Content-based, Popularity based and Collaborative based. Popularity based is simple and recommends on the … Read more Simple Movie Recommendation System using Python

Streamlit + Heroku = Magic?

Note: I’m assuming you already have streamlit installed and already know a little bit on how to work with it. There are four essential components required to launch your streamlit application on Heroku: setup.sh, requirements.txt, Procfile, and your_application.py. setup.sh (no credentials needed). Note: You do not need to name this file exactly setup.sh, it can … Read more Streamlit + Heroku = Magic?

Unifying remote and local AzureML environments

Microsoft and Python Machine Learning: a modern love story, Part 2 of 2 Microsoft Azure is conquering our hearts as AI practitioners and wooing us with support for open-source frameworks such as PyTorch, Tensorflow and Scikit-learn on AzureML. Here we build a workflow around the tools that MS gives us and it is up to … Read more Unifying remote and local AzureML environments

A Bayesian Approach to Linear Mixed Models (LMM) in R/Python

Implementing these can be simpler than you think There seems to be a general misconception that Bayesian methods are harder to implement than Frequentist ones. Sometimes this is true, but more often existing R and Python libraries can help simplify the process. Simpler to implement ≠ throw in some data and see what sticks. (We … Read more A Bayesian Approach to Linear Mixed Models (LMM) in R/Python

The Best Way to Invest in the Market

Empirical analysis comparing lump sum investment vs. dollar cost averaging using Python Working in finance, I inevitably get asked “What stocks should I buy?”. Since I’m well aware that I have no chance of consistently beating the market, I respond with the same three words every time, “Buy the SPY.” For those unfamiliar, the SPY … Read more The Best Way to Invest in the Market

Exploring Line Charts for Data Visualization

Exploratory Data Visualization A Practical Guide to Understand, Visualize, and Interpret Data. Data Visualization is a discipline that focuses on the visual representation of data. We as humans possess powerful visual processing capabilities. We tend to find patterns quickly. Unfortunately, when represented in a textual or tabular form we are unable to take advantage of … Read more Exploring Line Charts for Data Visualization

Twitter Analytics: “WeRateDogs”

Most of the focus of this project was on data wrangling. So what exactly is data wrangling? Data wrangling refers to the process of cleaning, restructuring and enriching the raw data available into a more usable format. I have used various python libraries in this project; below are the ones I got started with. import … Read more Twitter Analytics: “WeRateDogs”

Efficient Web Scraping with Scrapy

https://unsplash.com/@markusspiske New features of Scrapy to make your scraping efficient Scrapy as a framework for web scraping is powerful and scalable. It has an active user base and new features coming out with each update. Here in this article we will run through some of those features to get the most out of your scraping … Read more Efficient Web Scraping with Scrapy

Grabbing Geodata From Your Photos Library Using Python

Katy Mould www.katymould.com Use python to automate sorting through your photo collection So one of the many talents my girlfriend has is photography. However she has over 100,000 photos which require sorting out primarily into year and city. She primarily does location based photography and finds writing essays about certain areas of the world easier … Read more Grabbing Geodata From Your Photos Library Using Python