Productionalize Your Machine Learning Model Using Flask And Google App Engine

This small tutorial will help you understand how a trained machine learning model is used in production. Nowadays you can find lots of tutorials, MOOCs and videos for learning Data Science and Machine Learning. But none of them explain what happens to your machine learning model after you train and optimize one at your local …

Web Scraping of 10 Online Shops in 30 Minutes with Python and Scrapy

Scrapy Spiders are classes which define how a certain site (or a group of sites) will be scraped, including how to perform the crawl (i.e. follow links) and how to extract structured data from their pages (i.e. scraping items). In other words, Spiders are the place where you define …

Jupyter notebooks tips and tricks

There are many great extensions in jupyter_contrib_nbextensions. You should be using JupyterLab instead, though. First you need to install jupyter_contrib_nbextensions and then you can install various useful extensions: run pip install jupyter_contrib_nbextensions, then jupyter contrib nbextension install --user. These are the ones I love: code_prettify backed by autopep8 is great for reformatting code in notebook code cells …

Time Series Analysis From Scratch in Python: Part 2

In a nutshell, resampling is a way of aggregating the data. Here I’m using ‘M’ as a resampling rule, which stands for month, and I’m using mean as an aggregation function. That will do the following: fetch all prices for a given month, then calculate the mean by dividing the sum of all prices …
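The monthly aggregation described above can be sketched in plain Python. This is only an illustration of the idea (group by month, then divide each month's sum by its count), not the post's pandas code; the `monthly_mean` helper and the sample prices are hypothetical.

```python
from collections import defaultdict
from datetime import date


def monthly_mean(prices):
    """Group (date, price) pairs by calendar month and average each group."""
    buckets = defaultdict(list)
    for day, price in prices:
        buckets[(day.year, day.month)].append(price)
    # Mean = sum of the month's prices divided by how many prices there are.
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


# Hypothetical sample data, not taken from the post.
sample = [
    (date(2020, 1, 2), 10.0),
    (date(2020, 1, 15), 14.0),
    (date(2020, 2, 3), 20.0),
]
result = monthly_mean(sample)
print(result)
```

In pandas, a resample with the month rule followed by a mean performs the same grouping on a datetime-indexed series.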

Scraping Web Articles Using NewsAPI in Python

This post is meant to provide a gentle introduction to scraping news articles with the popular news web scraping tool NewsAPI. When scraping relevant news articles, there are a variety of options to choose from. Bing News Search, Bloomberg, and the New York Times all have very useful API programs. However, NewsAPI is the jack-of …

Why AutoML is An Essential New Tool For Data Scientists

Machine learning (ML) is the current paradigm for modeling statistical phenomena by harnessing algorithms that exploit computer intelligence. It is commonplace to build ML models that predict housing prices, aggregate users by their potential marketing interests, and use image recognition techniques to identify brain tumors. However, up until now these models have required scrupulous …

Streamlit 101: An in-depth introduction

Deep dive into Streamlit with Airbnb NYC data. Streamlit is an awesome new tool that allows engineers to quickly build highly interactive web applications around their data, machine learning models, and pretty much anything. The best thing about Streamlit is it doesn’t require any knowledge of web development. If you know Python, you’re good to …

How to fine-tune BERT with pytorch-lightning

First, we will take a look at the task that we are tackling today, the Microsoft Research Paraphrase Corpus, a task in which, given two documents, models are asked to predict whether they have the same meaning. For instance, two sentences like the following, “It is an excellent day for a picnic!” and “In a day like this, …

A python flask app that predicts the personality type on the basis of user entries using text…

The painting above (called Vertical Flow) by Irene Rice Pereira is one that I find most interesting. To me, it describes the heterogeneity in colors and also simultaneously how these different colors originate from the same essence. Colors at the end of the day operate on a spectrum rather than as individual points. One could …

My first small project in Python for browsing Reddit in office safely (Part 2)

Newly added features: 1. lcop (Leave comment on a post), param: submission_id. lcop allows you to leave a comment on a submission. All you need to provide is the submission id of the post. Since login is required for leaving a comment, before proceeding with any action, an automatic check of whether a user is logged …

Top Python Libraries: Numpy & Pandas

In this tutorial, I’ll try to give a brief description of two of the most important libraries in Python: Numpy and Pandas. Without further delay, let’s go through Numpy first. Numpy: numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. …

Easy Text-to-Speech with Python

Text-to-speech (TTS) technology reads digital text aloud. It can take words on computers, smartphones, and tablets and convert them into audio. Also, all kinds of text files can be read aloud, including Word and Pages documents and online web pages. TTS can help kids who struggle with reading. Many …

Machine Learning and Data Analysis — Inha University (Part-4)

In this part of the series Machine learning and Data analysis offered by Inha University, Rep. of Korea, I’ll try to explain Built-in and User-defined Functions and Modules in Python. From my point of view, it’ll help new Python learners understand them clearly. If you like to start from the …

Walkthrough: Mapping Basics with bokeh and GeoPandas in Python

Goal: Create a map of the contiguous US that shows the state population. Within each state, show where lead was found in 2018. Creating the contiguous USA: In order to create a map, you will need a shapefile (.shp). In this case, I downloaded a shapefile from the US Census Bureau here. The file tells …

Global Warming and Malaria in Developing Regions: An Analysis in Python

The data set only includes 4 points per region and there are 127 regions. There isn’t much data per region so any analysis should be taken with a grain of salt. Knowing that developing regions are more vulnerable to the risks that climate change poses, it would be useful to narrow our scope. Time Magazine …

Using Airflow and Spark To Crunch US Immigration Data

For example, prior to any triggering of Spark Jobs, the data-sets needed to be downloaded from s3 and unzipped. Using Airflow documents this dependency, and if an upstream task fails (syncing of files from s3) then the dependent downstream task won’t be invoked. This improved the stability of the pipeline and prevented runaway code from …

Email Automation with Python

Automate emails with attachments in Python. https://www.helpsystems.com/resources/guides/automated-operations-5-benefits-your-organization When I first started using Python I saw it as an upgrade to Excel, a tool I could use to improve my work in data analysis. The better I got at Python the more streamlined my analysis became, and I started to realize Python was more than a …

Difference Between NFD, NFC, NFKD, and NFKC Explained with Python Code

The difference between Unicode normalization forms. Recently I have been working on an NLP task in Japanese, and one problem is converting special characters to a normalized form. So I did a little research and wrote this post for anyone who has the same need. Japanese contains different forms …
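As a small illustration of how the four forms differ, the sketch below runs Python's standard `unicodedata.normalize` over three sample characters. The samples are my own choice for illustration, not necessarily the ones used in the post.

```python
import unicodedata

# Sample characters chosen for illustration (an assumption, not from the post).
samples = {
    "e_acute": "\u00e9",       # precomposed "é"
    "fi_ligature": "\ufb01",   # "ﬁ" ligature, a compatibility character
    "halfwidth_ka": "\uff76",  # half-width katakana "ｶ"
}
forms = ("NFD", "NFC", "NFKD", "NFKC")

# For each sample, compute all four normalization forms.
table = {
    name: {form: unicodedata.normalize(form, text) for form in forms}
    for name, text in samples.items()
}

for name, row in table.items():
    # Show code points so that visually identical results can be told apart.
    print(name, {form: [hex(ord(ch)) for ch in out] for form, out in row.items()})
```

The canonical forms (NFD, NFC) only split or recombine accents, while the compatibility forms (NFKD, NFKC) also replace characters such as ligatures and half-width katakana with their standard equivalents.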

How to add machine learning-powered text summarization to any project

Have you ever wondered how media organizations are able to produce the raw volume of content they output? How is it that the Associated Press, in addition to all of their other coverage, is able to cover 4,400 quarterly earnings reports each year? How does The Washington Post run such hyperlocal coverage, like covering every …

Python Input, Output and Import

In this tutorial let us understand the input and output built-in functions used in Python; we will also learn how to import libraries and use them in our programs. Before getting started, let us understand what built-in functions are. Any function that is provided as part of a high-level language and can be …

And The Star of the Show is — PYTHON

The overall contributions to the open-source projects are seen from all the continents and Asia is on the top with most of its contributions coming from China. The below graph shows us contributions from different continents. The top 50 packages in each language ecosystem have a massive amount of dependent projects. The top npm packages …

How To Compute Satellite Image Statistics And use It In Pandas

The Sentinel 2 image of the area (only Band 3) is shown below. Let us also read the buildings table which we will use to store the statistical summaries derived from the satellite image. Note that you can use other polygons, such as districts or rectangular grids, instead of the building polygons for this example. We …

Google vision API for image analysis with python

Google Vision API detects objects, faces, printed and handwritten text from images using pre-trained machine learning models. You can upload each image to the tool and get its contents. But if you have a large set of images on your local desktop, then using Python to send requests to the API is much more feasible. This …

What I Discovered About Opportunity Zones From Analyzing Half a Million Data Points

There has been a lot of buzz about Opportunity Zones recently and understandably so; it is the newest federal effort to create long-term investments in low-income urban and rural census tract areas. Once designated as a qualified Opportunity Zone, these places are able to receive investments through Opportunity Funds, which are created specifically to invest …

Teaching A Computer To Land On The Moon

I spent a fair amount of time last year catching up on what’s happening in machine learning. The tools available now are really impressive: you can implement a complex neural net in just a few lines of code with the libraries that are available. I’ve always been fascinated by the idea of machines …

Beginner’s Guide to Encoding Data

As you can see, the Book_Table column has been encoded into numerical values of 0/1. The output of le.fit_transform(df["Book_Table"]) is a Dataframe/Series depending on the number of columns encoded. Mostly binary columns (Book_Table from df) are encoded using Label Encoder. For multiclass it will give different (0 to n_classes-1) values for different classes, e.g. 0, 1, 2, 3, …, n-1; which are …
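To make the 0 to n_classes-1 mapping concrete, here is a minimal plain-Python sketch of label encoding. It is not scikit-learn's implementation; the `label_encode` helper and the sample values are hypothetical, and sorting the classes before numbering them is an assumption of this sketch.

```python
def label_encode(values):
    """Map each distinct class to an integer in 0..n_classes-1."""
    classes = sorted(set(values))  # assumption: classes are numbered in sorted order
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values], classes


# Hypothetical values for a binary column such as Book_Table.
book_table = ["Yes", "No", "No", "Yes"]
codes, classes = label_encode(book_table)
print(classes, codes)

# A hypothetical multiclass column gets codes from 0 up to n_classes-1.
multi_codes, multi_classes = label_encode(["red", "green", "blue", "green"])
print(multi_classes, multi_codes)
```

Note that the integers carry no real ordering between classes; they are just identifiers.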

Building a Pseudorandom Number Generator

This giant formula can be read like this: The probability that an algorithm in the class of probabilistic polynomial time problems (BPP) could distinguish a sequence between a real random source and a PRNG tends to zero faster than any polynomial as the length of the seed increases. Therefore, a PRNG is an algorithm that …

How to visualize data on top of a map in python using the geoviews library

For the purposes of this tutorial, we are going to make a plot to visualize the passenger volume for the busiest airports in my country, Greece, and the neighboring country, Turkey, for comparison. First, we need to import the libraries and the methods we are about to use. import pandas as pd import numpy as …

A brief intro to the Central Limit Theorem

According to Wikipedia, in probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a “bell curve”) even if the original variables themselves are not normally distributed. Translation: If you take enough samples from a population, the …
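A quick simulation can make the theorem tangible. The sketch below is my own illustration, not the post's code: it draws many samples from a uniform distribution (which is flat, not bell-shaped) and looks at how the sample means behave. The sample size, number of samples, and seed are arbitrary assumptions.

```python
import random
import statistics


def sample_means(n_samples, sample_size, seed=0):
    """Return the mean of each of n_samples samples drawn from Uniform(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


means = sample_means(n_samples=2000, sample_size=30)
center = statistics.fmean(means)
spread = statistics.stdev(means)

# Theory for Uniform(0, 1): the means cluster around 0.5 with a standard
# deviation of about sqrt(1/12) / sqrt(30), roughly 0.053.
print(round(center, 3), round(spread, 3))
```

Plotting a histogram of `means` would show the familiar bell shape even though the underlying draws are uniform.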

The Easy Way to Extend Pandas API

In this article, you’ll learn how to tailor the pandas API to your business, research, or personal workflow by using pandas_flavor. Pandas-flavor is a library that introduces an API for extending Pandas. This API handles the boilerplate code for registering custom accessors onto Pandas objects. There are plenty of examples of extensions in the wild including: …

Cleaning Web-Scraped Data with Pandas (Part II)

As I mentioned in my previous post, cleaning data is a prerequisite to machine learning. Measuring the sanity of your data can also give you a good indication of how precise or accurate your model would be. When it comes to web-scraped data, you would often lose a lot of information in the process of …

How to understand Numpy documentation

When we start to learn Data Science, Machine Learning, Deep Learning or any exciting field that uses Python as its programming language, most of us will probably be using numpy as well. In this post, I will cover numpy basics and how to read the documentation properly, based on my experience of using …

Web Scrape Twitter by Python Selenium (Part 1)

Beginning of the tutorial. PS: If you are a beginner, I would suggest you work in Jupyter Notebook first because you will face more errors than ever before. By using Jupyter Notebook you can run the script step by step so that you know where the problem is. Access to the Twitter front page: The first step is to …

Reduce Memory Usage and Make Your Python Code Faster Using Generators

A hands-on guide to creating iterators in a very pythonic manner. When I started learning about Python generators, I had no idea how important they would turn out to be. They have helped me immensely while writing custom functions throughout my machine learning journey. Generator functions allow you to …
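A minimal sketch of the memory difference, assuming a simple squares example that is mine rather than the post's: the list version materializes every value up front, while the generator version yields one value at a time.

```python
import sys


def squares_list(n):
    """Build and return the full list of squares in memory."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield squares one at a time; nothing is stored between values."""
    for i in range(n):
        yield i * i


total_from_list = sum(squares_list(1000))
total_from_gen = sum(squares_gen(1000))

# getsizeof reports the container object only, but the contrast is still clear:
# the list grows with n, the generator object stays small.
list_bytes = sys.getsizeof(squares_list(1000))
gen_bytes = sys.getsizeof(squares_gen(1000))
print(total_from_list == total_from_gen, list_bytes, gen_bytes)
```

Both versions produce the same total; the trade-off is that a generator can only be iterated once.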

5 Minute Guide to Detecting Holidays in Python

With Pandas, it’s fairly straightforward to construct a list of dates, let’s say for the whole year of 2019: Great. Now we can construct a DataFrame object from those dates — let’s put them into Dates column: Now here comes a slight problem. The dates look to be stored in a string format, just like …

A Quick Primer on Databricks Koalas

Interact with Spark Dataframes with Pandas vocabulary. In a project of mine, I extensively used Spark to manage working with some large data files. Though it is often known for its many benefits with large distributed systems, it works equally well locally for projects working with large …

Let’s calculate Z-scores for Airbnb prices in New York

The Z-score is also called the standard score. According to Wikipedia, in statistics, the standard score is the signed fractional number of standard deviations by which the value of an observation or data point is above the mean value of what is being observed or measured. Translation: a measure of how far a value is from its population …
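The definition above translates into one line of arithmetic: subtract the mean, then divide by the standard deviation. The sketch below is an illustration with made-up prices, not the post's Airbnb data; using the population standard deviation (`pstdev`) is an assumption of this sketch.

```python
import statistics


def z_scores(values):
    """Return (value - mean) / standard deviation for every value."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return [(v - mean) / std for v in values]


# Hypothetical nightly prices, not taken from the post.
prices = [50.0, 100.0, 150.0]
scores = z_scores(prices)
print([round(s, 3) for s in scores])
```

A score of 0 means the price equals the mean; positive and negative scores count standard deviations above and below it.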

Integrating Python & Tableau

Bring your analyses to life with engaging data visualizations. When performing in-depth analyses on large and unstructured datasets, the power of Python and relevant machine learning libraries cannot be overstated. Matplotlib serves as a great tool to help us visualize results, but its stylization options are not always optimal for use in presentations and dashboards. …

Managing virtual environment with pyenv

Most Python developers and data scientists have already heard of virtual environments. However, managing tens of environments created for different projects can be daunting. pyenv will help you to streamline the creation, management and activation of virtual environments. In the old days, before virtualenv became popular, I would keep a single global workspace for all …

Machine Learning and Data Analysis — Inha University (Part-2)

Welcome to the second part of the Machine learning and data analysis series based on a graduate course offered by Inha University, Rep. of Korea. In this part, we will discuss data structures in Python. However, if you are viewing this for the first time then we encourage you to read the first part first where …

How to Write Python Command-Line Interfaces like a Pro

We as Data Scientists face doing many repetitive and similar tasks. That includes creating weekly reports, executing extract, transform, load (ETL) jobs, or training models using different parameter sets. Often, we end up having a bunch of Python scripts, where we change parameters in code every time we run …

Let’s build an Intelligent chatbot

Modern chatbots do not rely solely on text, and will often show useful cards, images, links, and forms, providing an app-like experience. Depending on the way bots are programmed, we can categorize them into two variants of chatbots: Rule-Based (dumb bots) & Self Learning (smart bots). Rule-Based Chatbots: This variety of bot answers questions based on …

Custom Transformers in Python — Part II

Data Cleaning is the most important part of any Machine Learning project. The fact that your data may be in multiple formats and spread across different systems makes it imperative that the data is properly massaged before it’s fed to an ML Model. Data preparation is one of the most tedious and time-consuming steps in …