Breaking Down Richard Sutton’s Policy Gradient With PyTorch And Lunar Lander

Theory Behind The Policy Gradient Algorithm: Before we can implement the policy gradient algorithm, we should go over the math involved. The math is straightforward and easy to follow and, for the most part, is reinterpreted from the OpenAI resource mentioned above. First, we define tau to be a trajectory … Read more

How to Find a Descent Learning Rate using Tensorflow 2

Taken from http://www.merzpraxis.de/index.php/2016/06/13/der-suchende/ When it comes to building and training neural networks, you need to set a large number of hyperparameters. Setting those parameters right has a tremendous influence on the success of your net and also on the time you spend heating the air, aka training your model. One of those parameters that you … Read more

Kepler.GL & Jupyter Notebooks: Geospatial Data Visualization with Uber’s opensource Kepler.GL

Plot geospatial data inside a Jupyter notebook and easily interact with Kepler’s user interface to tweak the visualisation. kepler.gl for Jupyter is an excellent tool for big geospatial data visualisation. It combines a world-class visualisation tool, an easy-to-use user interface (UI), and the flexibility of Python and Jupyter notebooks (3D Visualization GIF below, more in the article). 3D … Read more

Fear Tells Us What We Have To Do

My deep learning self-study for 09/30/19–10/07/19 I’m a math lecturer and aspiring data scientist hoping to participate in artificial general intelligence research, and this week I decided to start keeping a weekly blog of what I’ve been doing, both for my own reference and potentially to help others on a similar path, following the advice … Read more

Models as Serverless Functions

Source: Wikimedia Chapter 3 of “Data Science in Production” I recently published Chapter 3 of my book-in-progress on leanpub. The goal with this chapter is to empower data scientists to leverage managed services to deploy models to production and own more of DevOps. Data Science in Production Building Scalable Model Pipelines with Python towardsdatascience.com Serverless … Read more

Data Science with SQL in Python

Python Application in SQL. Ever hear about the database query language SQL (often pronounced “Sequel”)? How can we use Python code to harness the power of SQL databases and retrieve, manipulate, and delete the information stored in them? In this article, I plan on giving a thorough beginner’s tutorial on Sequel … Read more

Twitter — Or where my bot talks to your bot

Photo by Safar Safarov on Unsplash My activities on Twitter were mind-numbingly repetitive. From what Kirk was doing, it also didn’t exactly seem like he was reading everything that he was posting about. And whenever something is done over and over again, it’s typically a prime candidate for automation. I found tweepy, a Python library … Read more

An Easier Way to Encode Categorical Features

Photo by Ash Edmonds on Unsplash Using the python category encoder library to handle high cardinality variables in machine learning I have recently been working on a machine learning project which had several categorical features. Many of these features were high cardinality, or in other words, had a high number of unique values. The simplest … Read more

Playing with object detection

I’ll follow my jupyter notebook to make things easier to show. Feel free to either simply run it or implement the code on your own. Keep in mind that some code snippets use functions implemented in previous snippets, therefore the order of occurrence matters. All mentioned files in this post are available in my GitHub. … Read more

A Guide to Integrating Text Analytics into Tableau

Credit: Freddie Marriage Data is often dirty and messy. Sometimes, it doesn’t even come in the right form for quick analysis and visualization. While Tableau (and Prep) had several tools to deal with numeric, categorical, and even spatial data, one consistent missing piece was handling unstructured text data. Not anymore. In the latest edition of … Read more

Using Python To Get SalesForce Data

Photo by Denys Nevozhai on Unsplash I work at a startup that heavily uses Salesforce. When I first started, we would have to log in through the Salesforce site, go to the reports tab, create a report with the necessary fields, download a comma-separated values (CSV) spreadsheet, and do some data cleaning here and there. Mostly … Read more

PyTorch v1.3 — What’s new?

Support for Android and iOS, Named Tensor, TPU Support, Quantization and more. Facebook just released PyTorch v1.3 and it is packed with some of the most awaited features. The three most attractive ones are: Named Tensor — Something that would make the life of machine learning practitioners much easier. Quantization — For performance critical systems … Read more

Line Detection: Make an Autonomous Car see Road Lines

Step by step, you can turn a video stream into a line detector via computer vision. Fully self-driving passenger cars are not “just around the corner”. Elon Musk claims that Teslas will have a “full self-driving” capability by the end of 2020. In particular, he says that Tesla’s hardware is already ready for autonomous driving, and … Read more

Proper Balancing for Cross Validation

Importing & splitting of the data:

    import pandas as pd
    import numpy as np
    from sklearn import datasets
    from sklearn.model_selection import cross_validate
    from sklearn.metrics import accuracy_score, precision_score
    from sklearn.linear_model import LogisticRegression
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split, StratifiedKFold
    from imblearn.over_sampling import SMOTE
    from imblearn.under_sampling import RandomUnderSampler

    df = pd.read_csv('creditcard.csv').sample(50000, random_state=0)
    train, test = train_test_split(df, test_size=0.3, random_state=0, shuffle=True)
    y_train = np.array(train["Class"])
    y_test = np.array(test["Class"])
    del train["Class"]
    del test["Class"]
    train = train.reset_index(drop=True)
    test … Read more

TDD shouldn’t be TDDious

I still come across the age old “how to test” debate, but can we make it fun to test things? I’ve been working as an engineer for over a decade now, and still come across the age old “how to test” debate. I’m a Lead Engineer and that means working with my team on how … Read more

Cheat sheet for Python dataframe ↔ R dataframe syntax conversions

A mini-guide for those who’re familiar with data analysis using either Python or R and want to quickly learn the basics of the other language Photo by Mad Fish Digital on Unsplash In this guide, for Python, all the following commands are based on the ‘pandas’ package. For R, the ‘dplyr’ and ‘tidyr’ packages are … Read more

Training Yolo for Object Detection in PyTorch with Your Custom Dataset — The Simple Way

In a previous story, I showed how to do object detection and tracking using the pre-trained Yolo network. Now I want to show you how to re-train Yolo with a custom dataset made of your own images. For this story, I’ll use my own example of training an object detector for the DARPA SubT Challenge. … Read more

Deploying a React App on Heroku: the Python perspective

How to deploy a React frontend paired with a Flask backend Coming from a Python background, Heroku is a fantastic place to deploy. I’ve got a variety of static and Flask-based websites which have been trivial to configure and easy to integrate with Github for smooth continuous deployment. This post is not about deploying with … Read more

Decomposing Signal Using Empirical Mode Decomposition — Algorithm Explanation for Dummy

What kind of ‘beast’ is Empirical Mode Decomposition (EMD)? It’s an algorithm to decompose signals. And when I say signal, what I mean is time-series data. We input a signal to the EMD and get back some decomposed signals, a.k.a. the ‘basic ingredients’ of our input signal. It’s similar to the Fast Fourier … Read more

Keras data generators and how to use them

You probably encountered a situation where you try to load a dataset but there is not enough memory in your machine. As the field of machine learning progresses, this problem becomes more and more common. Today this is already one of the challenges in the field of vision where large datasets of images and video … Read more
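The memory problem described in this excerpt is usually handled by loading data one batch at a time. Below is a minimal sketch of that idea in plain Python; it is not the Keras API and not the article's code. The names `batch_generator` and `load_fn` are hypothetical, and `load_fn` stands in for whatever reads a single sample from disk.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def batch_generator(
    paths: Sequence[str],
    labels: Sequence[int],
    batch_size: int,
    load_fn: Callable[[str], object],
) -> Iterator[Tuple[List[object], List[int]]]:
    """Yield (samples, labels) batches, loading samples lazily.

    `load_fn` is a hypothetical loader (for example, one that reads an
    image file); only one batch of loaded samples is held at a time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if len(paths) != len(labels):
        raise ValueError("paths and labels must have the same length")

    for start in range(0, len(paths), batch_size):
        end = start + batch_size
        # Load only the samples that belong to the current batch.
        batch_samples = [load_fn(p) for p in paths[start:end]]
        batch_labels = list(labels[start:end])
        yield batch_samples, batch_labels


# Usage with a stand-in loader that just upper-cases the "path".
batches = list(batch_generator(["a", "b", "c"], [0, 1, 0], 2, load_fn=str.upper))
```

Because the function is a generator, nothing is loaded until a batch is requested, so peak memory depends on `batch_size` rather than on the size of the dataset.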

How to use Selenium as life-saver when dealing with boring tasks?

Automate never-ending repetitive tasks the Selenium way photo by elmnet If you are a developer, you probably do not need an intro to Selenium. Selenium is a powerful tool built to interact with the web server for processing requests in a programmatic way. It is used in automating a wide variety of tasks involving … Read more

8 Useful Pandas Features for Data-Set Handling

This article presents 8 simple but useful Pandas operations that showcase how Python’s Pandas library can be used for data-set exploration. The data-set I will use for this tutorial is entitled ‘International football results from 1872 to 2019’ and can be sourced here, in case any of the code snippet examples presented … Read more

FastText sentiment analysis for tweets: A straightforward guide.

FastText is an open-source NLP library developed by Facebook AI and initially released in 2016. Its goal is to provide word embedding and text classification efficiently. According to its authors, it is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. [1] This … Read more

How to get started with Data Science : A brief tutorial on using Anaconda, Python, Jupyter…

In this article, I wanted to write about my experience of overcoming the initial hurdle and getting started with learning Data Science. Learning data science is a journey and you will keep learning once you get started. In this article we will go through the following 5 starting steps for getting into the field of learning … Read more

Python Tips and Tricks, You Haven’t Already Seen, Part 2

Note: This was originally posted at martinheinz.dev A few weeks ago I posted an article (here) about some not so commonly known Python features and quite a few people seemed to like it, so here comes another round of Python features that you hopefully haven’t seen yet. Using lots of hardcoded index values can quickly become … Read more

Pedestrian detection using Non Maximum Suppression

A complete pipeline for detecting pedestrians on the road. Pedestrian detection is still an unsolved problem in computer science. While many object detection algorithms like YOLO, SSD, RCNN, Fast R-CNN and Faster R-CNN have been researched extensively and with great success, pedestrian detection in crowded scenes remains an open challenge. In recent years, … Read more
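For readers unfamiliar with the technique named in the title, here is a minimal sketch of greedy non-maximum suppression in plain Python. It is an illustration under stated assumptions, not the article's pipeline: boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates, and the function names and the default threshold of 0.5 are hypothetical choices.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,
) -> List[int]:
    """Return indices of the boxes to keep, highest score first.

    A box is kept only if its IoU with every already-kept box is at
    most `iou_threshold`; otherwise it is treated as a duplicate.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections and one separate detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes are kept. In crowded scenes the threshold matters a great deal, since a value that is too low can suppress genuinely distinct pedestrians standing close together.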

Avengers, resemble!

Finding the ideal costume through facial recognition One of the more interesting and specialized uses of computer vision is for facial detection and recognition. Humans are incredibly adept at recognizing faces, but it is a fairly recent trend that we have been able to train computers to do a close enough job to warrant using … Read more

6 of the Best Niche Platforms to Learn SQL and Python

w3schools is a simple, no-frills tool for learning web development skills, including SQL and Python. Depending on your preferences, you will probably either love or hate w3schools’ approach to learning. w3schools claims to be the world’s largest web developer site, so their methods clearly work for many people. Essentially, the method of teaching here is … Read more

4 Tips to Get the Best Out of PyCharm

The choice of editor usually does not matter much when you are simply experimenting with Machine Learning or coding for short projects that do not require complex folder structures or scripts organized in modules. The problems and preferences usually come up when projects become larger, with several scripts, modules, tests and programmers collaborating on the … Read more