What Happened When I Tried Market Prediction With Machine Learning

Image by Pexels on Pixabay. Right around the time you get your first basic regression or classification model going, it will at least cross your mind. The vast piles of time series data, coupled with the possibility of retiring young, have the irresistible pull of finding an old treasure map in your grandfather’s attic. How … Read more

The Cardio of Audio

Structured data usually lives in an RDBMS or another database where you can easily search records, see the numbers, and compare them. For example, a record can have a name, ID, date of birth, salary, address, etc. The data is arranged in a structured, table-like format, which makes it simple to work with. Unstructured … Read more

Build your own pandas (like) library

We can now use conda activate pandas_cub and conda deactivate to activate and deactivate our environment.
Step 4: Checking tests
All the tests are included in a file called test_dataframe.py located in the tests directory.
Run all the tests:
$ pytest tests/test_dataframe.py
Run a particular class of tests:
$ pytest tests/test_dataframe.py::ClassName
Run a particular function … Read more

Ordinal, Nominal, …Who Cares?

My cousin is a senior in high school, and when I saw her recently I inquired, “How are your classes going?” She rolled her teenage eyes and expressed her displeasure at all the vocabulary she had to learn for her statistics class. “Ordinal, nominal, … who cares?” she lamented. Well, Anna, I’ll tell you. Data … Read more

Predict electricity consumption using Time Series analysis

A step by step approach to forecasting/prediction using the ARIMA model. Pic Credit: https://gfycat.com/frailofficialdegus “Time series models are used to forecast future events based on previous events that have been observed (and data collected) at regular time intervals.” We will take a small forecasting problem and try to solve it through to the end, learning … Read more

A Better Excel Goal Seek using Python

Solving has been artificially slowed down for visualization. Microsoft Excel has a great feature, “Goal Seek”, under the Data / What If… menu. Given a formula, it will calculate what the input needs to be in order for that formula to equal a specific result. It’s a very powerful but simple tool. The standard “Goal … Read more
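To make the idea concrete, here is a minimal sketch of a goal-seek routine in Python. It uses plain bisection, which is an assumption chosen for simplicity and not necessarily the method Excel uses. The names goal_seek and future_value, and the numbers in the example formula, are hypothetical.

```python
def goal_seek(func, target, low, high, tol=1e-9, max_iter=200):
    """Find x in [low, high] where func(x) is close to target, by bisection.

    Assumes func is continuous and that target lies between
    func(low) and func(high).
    """
    f_low = func(low) - target
    f_high = func(high) - target
    if f_low == 0:
        return low
    if f_high == 0:
        return high
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by func(low) and func(high)")

    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = func(mid) - target
        # Stop when the result is close enough or the interval is tiny
        if abs(f_mid) < tol or (high - low) / 2 < tol:
            return mid
        # Keep the half of the interval that still brackets the target
        if f_low * f_mid < 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2


def future_value(rate):
    # Hypothetical formula: 1000 invested for 10 years at a yearly rate
    return 1000 * (1 + rate) ** 10


# Which yearly rate doubles the investment in 10 years?
rate = goal_seek(future_value, target=2000, low=0.0, high=1.0)
print(f"rate = {rate:.6f}, value = {future_value(rate):.2f}")
```

Bisection needs a starting interval that brackets the answer, which is the price paid for its predictability.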

OHLC Charts With Python Libraries

There are several graphing libraries available to Python coders. Two of the most popular ones are Matplotlib and Plotly. Picking the right library depends on your situation. I recreated the OHLC (open-high-low-close) chart in both libraries to highlight these differences and included the code below. Matplotlib is the first graphing and plotting library I learned … Read more

Use AI to Write Captions for Images with Cloudsight + Python

A Pythonic API lets you automatically write human-readable captions for your images. How would you caption this? Credit: Gado Images. There are lots of solutions out there on the market today for using Artificial Intelligence and Machine Learning to tag images. Solutions from IBM Watson, Imagga, Cloudsight, Google Vision and Microsoft Azure all perform well, … Read more

Scraping Reddit for Healthcare Information

In this post, we will be using a Python package called PRAW to scrape Reddit for healthcare information. A basic knowledge of Python syntax is required. First, install PRAW: pip install praw Next, you need to register an application of the appropriate type here: Once you redirect to the registration page, sign up by … Read more

Everything After Training Your Deep Learning Model to a web app.

Most folks know about this step and often call it serializing. After training a model for some problem-solving use case, everyone usually saves either the entire model (architecture + weights + optimizer state) or just the weights, using the Python or API library (Pickle, Keras) of their choice, to avoid training the model again. One … Read more
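As a rough illustration of the serializing step, the sketch below round-trips a stand-in object through Python's pickle module. The dictionary and its values are hypothetical placeholders for a trained model's state. A real Keras model would more likely be saved with the library's own save routines.

```python
import pickle

# Hypothetical stand-in for a trained model: weights plus some metadata
model_state = {
    "weights": [0.12, -0.4, 1.5],
    "bias": 0.03,
    "epochs_trained": 20,
}

# Serialize to bytes; pickle.dump(obj, file) writes to an open file instead
blob = pickle.dumps(model_state)

# Later, restore the object without training again
restored = pickle.loads(blob)

print(restored == model_state)
```

Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.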

RSS Feed Parser in Python

Finally let’s build the script connecting all the pieces together…
1. Imports: The following lines will import the required modules and objects
from data_parser import get_soup, parse_record, store_tags
from db_connect import get_connection, get_max_records, execute_query
from apscheduler.schedulers.blocking import BlockingScheduler
2. We’ll define some global variables also (I know not a recommended practice, but given the time constraint, it was a … Read more

How to Build Slim Docker Images Fast

Do you remember those days, when you wrote awesome software but you couldn’t install it on someone else’s machine or it crashed there? Though this is never a nice experience, we could always say Nowadays, that’s not an excuse any more due to containerization. Very briefly, with containerization, you pack your application and all necessary … Read more

Productionalize Your Machine Learning Model Using Flask And Google App Engine

This small tutorial will help you understand how a trained machine learning model is used in production. Nowadays you can find lots of tutorials, MOOCs and videos for learning Data Science and Machine Learning. But none of them explain what happens to your machine learning model after you train and optimize one at your local … Read more

Web Scraping of 10 Online Shops in 30 Minutes with Python and Scrapy

Photo by Nguyen Bui on Unsplash. Scrapy Spiders are classes which define how a certain site (or a group of sites) will be scraped, including how to perform the crawl (i.e. follow links) and how to extract structured data from their pages (i.e. scraping items). In other words, Spiders are the place where you define … Read more

Jupyter notebooks tips and tricks

There are many great extensions in jupyter_contrib_nbextensions. You should be using JupyterLab instead, though. First you need to install jupyter_contrib_nbextensions and then you can install various useful extensions.
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
These are the ones I love: code_prettify backed by autopep8 is great for reformatting code in notebook code cells … Read more

Why AutoML is An Essential New Tool For Data Scientists

Machine learning (ML) is the current paradigm for modeling statistical phenomena by harnessing algorithms that exploit computer intelligence. It is commonplace to build ML models that predict housing prices, aggregate users by their potential marketing interests, and use image recognition techniques to identify brain tumors. However, up until now these models have required scrupulous … Read more

Streamlit 101: An in-depth introduction

Deep dive into Streamlit with Airbnb NYC data. Streamlit is an awesome new tool that allows engineers to quickly build highly interactive web applications around their data, machine learning models, and pretty much anything. The best thing about Streamlit is it doesn’t require any knowledge of web development. If you know Python, you’re good to … Read more

A python flask app that predicts the personality type on the basis of user entries using text…

The painting above (called Vertical Flow) by Irene Rice Pereira is one that I find most interesting. To me, it describes the heterogeneity in colors and also simultaneously how these different colors originate from the same essence. Colors at the end of the day operate on a spectrum rather than as individual points. One could … Read more

My first small project in Python for browsing Reddit in office safely (Part 2)

Newly added features: 1. lcop (Leave comment on a post) param: submission_id lcop allows you to leave a comment on a submission. All you need to provide is the submission id of the post. Since login is required for leaving a comment, before proceeding with any action, an automatic check of whether a user is logged … Read more

Top Python Libraries: Numpy & Pandas

In this tutorial, I’ll try to briefly describe two of the most important libraries in Python: Numpy and Pandas. Without further delay, let’s go through Numpy first. Numpy: numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. … Read more

Easy Text-to-Speech with Python

Text to Speech. Source: thenextweb.com. Text-to-speech (TTS) technology reads digital text aloud. It can take words on computers, smartphones, and tablets and convert them into audio. All kinds of text files can also be read aloud, including Word and Pages documents and online web pages. TTS can help kids who struggle with reading. Many … Read more

Using Airflow and Spark To Crunch US Immigration Data

For example, prior to any triggering of Spark jobs, the datasets needed to be downloaded from S3 and unzipped. Using Airflow documents this dependency, and if an upstream task fails (syncing of files from S3) then the dependent downstream task won’t be invoked. This improved the stability of the pipeline and prevented runaway code from … Read more

Email Automation with Python

Automate emails with attachments in Python. https://www.helpsystems.com/resources/guides/automated-operations-5-benefits-your-organization When I first started using Python I saw it as an upgrade to Excel. A tool I could use to improve my work in data analysis. The better I got at Python the more streamlined my analysis became, and I started to realize Python was more than a … Read more

Difference Between NFD, NFC, NFKD, and NFKC Explained with Python Code

The difference between Unicode normalization forms. Photo by Joel Filipe on Unsplash. Recently I have been working on an NLP task in Japanese, and one problem is converting special characters to a normalized form. So I did a little research and wrote this post for anyone who has the same need. Japanese contains different forms … Read more
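For a quick feel of how the four forms differ, here is a small sketch using Python's standard unicodedata module. The helper names normalize_all and code_points and the three sample strings are my own choices for illustration, not taken from the post.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_all(text):
    """Return {form: normalized string} for the four normalization forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


def code_points(text):
    """Show a string as a sequence of code points, e.g. 'U+0065 U+0301'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


samples = [
    "\u00e9",        # e with acute accent as one composed character
    "\uff21",        # fullwidth Latin capital A
    "\uff76\uff9e",  # half-width katakana KA followed by a voiced sound mark
]

for sample in samples:
    for form, result in normalize_all(sample).items():
        print(f"{code_points(sample):>14}  {form:<4} -> {code_points(result)}")
```

The canonical forms (NFC, NFD) only compose or decompose characters, while the compatibility forms (NFKC, NFKD) also fold variants such as fullwidth letters and half-width katakana into their standard counterparts.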

How to add machine learning-powered text summarization to any project

Have you ever wondered how media organizations are able to produce the raw volume of content they output? How is it that the Associated Press, in addition to all of their other coverage, is able to cover 4,400 quarterly earnings reports each year? How does The Washington Post run such hyperlocal coverage, like covering every … Read more

Writing scrapers the good way

If you start with from bs4 import BeautifulSoup, 95% of the time you are doing it wrong. (Burp Community Edition is enough to follow this tutorial.) (Not) important disclaimer: always check the policy of the website you are trying to scrape; this is only for educational purposes. This can be skipped like I always … Read more

Python Input, Output and Import

In this tutorial, let us understand the input and output built-in functions used in Python; we will also learn how to import libraries and use them in our programs. Image Credits: Data Flair. Before getting started, let us understand what built-in functions are. Any function that is provided as part of a high-level language and can be … Read more

And The Star of the Show is — PYTHON

Contributions to open-source projects come from every continent, with Asia on top and most of its contributions coming from China. The graph below shows contributions from different continents. The top 50 packages in each language ecosystem have a massive number of dependent projects. The top npm packages … Read more

Pandas’ groupby explained in detail

To demonstrate some advanced grouping functionalities, we will use the simplest version of the apply step (and count the rows in each group) via the size method. We do this so that we can focus on the groupby operations. We will go into much more detail regarding the apply methods in section 2 of the … Read more

What I Discovered About Opportunity Zones From Analyzing Half a Million Data Points

There has been a lot of buzz about Opportunity Zones recently and understandably so; it is the newest federal effort to create long-term investments in low-income urban and rural census tract areas. Once designated as a qualified Opportunity Zone, these places are able to receive investments through Opportunity Funds, which are created specifically to invest … Read more

Teaching A Computer To Land On The Moon

I spent a fair amount of time last year catching up on what’s happening in machine learning. The tools available now are really impressive — you can implement a complex neural net in just a few lines of code now with the libraries that are available. I’ve always been fascinated by the idea of machines … Read more

Beginner’s Guide to Encoding Data

As you can see, the Book_Table column has been encoded into numerical values of 0/1. The output of le.fit_transform(df["Book_Table"]) is a NumPy array with one integer code per row. Binary columns (such as Book_Table from df) are mostly encoded using Label Encoder. For multiclass columns it gives a different value (0 to n_classes-1) to each class, e.g. 0,1,2,3,…,n-1; which are … Read more
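To show what label encoding does under the hood, here is a dependency-free sketch in plain Python. It is not scikit-learn's implementation. It only mirrors the behaviour of assigning codes 0 to n_classes-1 in sorted class order, and the function name label_encode and the sample values are hypothetical.

```python
def label_encode(values):
    """Map each distinct label to an integer code from 0 to n_classes - 1.

    Classes are sorted first, so the mapping is deterministic.
    Returns the list of codes and the sorted list of classes.
    """
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[value] for value in values], classes


# Binary column, similar in spirit to Book_Table
book_table = ["Yes", "No", "No", "Yes"]
codes, classes = label_encode(book_table)
print(codes, classes)

# Multiclass column: one code per class
sizes = ["low", "high", "medium", "low"]
size_codes, size_classes = label_encode(sizes)
print(size_codes, size_classes)
```

Note that the integer codes imply an ordering (here alphabetical) that may mean nothing for the underlying categories.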

Building a Pseudorandom Number Generator

This giant formula can be read like this: The probability that an algorithm in the class of probabilistic polynomial time problems (BPP) could distinguish between a sequence from a truly random source and a sequence from a PRNG tends to zero faster than any polynomial as the length of the seed increases. Therefore, a PRNG is an algorithm that … Read more
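For contrast with the formal definition, here is a toy linear congruential generator (LCG) in Python. It shows the basic shape of a PRNG: a seed goes in and a deterministic sequence comes out. An LCG is easy to distinguish from true randomness, so it does not meet the indistinguishability requirement described above. The class name and the choice of the widely cited constants a=1664525, c=1013904223, m=2**32 are assumptions for illustration.

```python
class LCG:
    """Toy linear congruential generator: state = (a * state + c) mod m.

    Illustrative only. It is predictable and not cryptographically secure.
    """

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        self.a = a
        self.c = c
        self.m = m
        self.state = seed % m

    def next_int(self):
        # Advance the state and return it as an integer in [0, m)
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def random(self):
        # Scale the next integer to a float in [0, 1)
        return self.next_int() / self.m


gen = LCG(seed=42)
print([round(gen.random(), 4) for _ in range(5)])

# The same seed always reproduces the same sequence
first = LCG(seed=7)
second = LCG(seed=7)
print([first.next_int() for _ in range(3)] == [second.next_int() for _ in range(3)])
```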

How to visualize data on top of a map in python using the geoviews library

For the purposes of this tutorial, we are going to make a plot to visualize the passenger volume for the busiest airports in my country, Greece, and the neighboring country, Turkey, for comparison. First, we need to import the libraries and the methods we are about to use.
import pandas as pd
import numpy as … Read more