Ten Tricks To Speed Up Your Python Codes

1. Get familiar with built-in functions. [Figure 1: Built-in Functions in Python 3] Python comes with many built-in functions implemented in C, which are very fast and well maintained (Figure 1). We should at least be familiar with these function names and know where to find them (some commonly used computation-related functions are abs(), len(), max(), …
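
As a rough illustration of the point in this excerpt, the sketch below compares a hand-written loop with the built-in sum() and calls a few of the built-ins it names. The sample list and the helper manual_sum are made up for this example, and the timings it prints will vary by machine.

```python
import timeit

# Hypothetical sample data for the comparison.
numbers = list(range(1, 1001))


def manual_sum(values):
    """Add the values one by one in a Python-level loop."""
    total = 0
    for value in values:
        total += value
    return total


# The built-ins give the same answers as hand-written code.
builtin_total = sum(numbers)
manual_total = manual_sum(numbers)
largest = max(numbers)
count = len(numbers)
magnitude = abs(-7)

# Time both approaches; the exact numbers depend on the machine.
loop_seconds = timeit.timeit(lambda: manual_sum(numbers), number=200)
builtin_seconds = timeit.timeit(lambda: sum(numbers), number=200)
print(f"loop: {loop_seconds:.4f}s, built-in sum: {builtin_seconds:.4f}s")
```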

Real or Spurious Correlations: Attractive People You Date Are Nastier

But how do we know if the negative correlation between attractiveness and personality is real? Data scientists deal with correlations regularly, and a good way to gain more intuition about the data and learn analysis methods is via simulation. So let’s simulate some data to test our intuition. First, let’s import the common data science modules: …

Scrapy: This is how to successfully login with ease

[Photo by Sheng Li on Unsplash] Demystifying the process of logging in with Scrapy. Once you understand the basics of Scrapy, one of the first complications is having to deal with logins. To do this, it’s useful to understand how logging in works and how you can observe that process in your browser. We …

Getting Started With Jupyter Notebooks in Visual Studio Code

Insert and Delete Cells: To insert a cell, click the plus sign in the toolbar or the one on the left side of the cell. To delete a cell, click the delete sign (i.e., the trash symbol) on the cell’s right side. Switch Cell Content Type and State: To switch the …

A Beginner’s Guide to Simulating Dynamical Systems with Python

Numerically Integrate ODEs in Python. [Photo by Dan Meyers on Unsplash] Consider the simple pendulum. We’ve just got a mass m hanging from a string of length L that is swinging back and forth. It’s about as simple a system as we can work with. Don’t let this simplicity fool you though, it …
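
The excerpt stops before showing any integration code, so here is a minimal sketch of what numerically integrating the pendulum could look like. It assumes the standard frictionless pendulum equation theta'' = -(g/L) sin(theta), in which the mass m cancels out, and uses a hand-written fourth-order Runge-Kutta step rather than whatever integrator the article goes on to use. The constants and function names are illustrative.

```python
import math

G = 9.81      # gravitational acceleration in m/s^2 (assumed value)
LENGTH = 1.0  # string length L in metres (assumed value)


def derivatives(theta, omega):
    """Return (d theta/dt, d omega/dt) for theta'' = -(g / L) * sin(theta)."""
    return omega, -(G / LENGTH) * math.sin(theta)


def rk4_step(theta, omega, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    k1_t, k1_w = derivatives(theta, omega)
    k2_t, k2_w = derivatives(theta + 0.5 * dt * k1_t, omega + 0.5 * dt * k1_w)
    k3_t, k3_w = derivatives(theta + 0.5 * dt * k2_t, omega + 0.5 * dt * k2_w)
    k4_t, k4_w = derivatives(theta + dt * k3_t, omega + dt * k3_w)
    theta += dt * (k1_t + 2 * k2_t + 2 * k3_t + k4_t) / 6
    omega += dt * (k1_w + 2 * k2_w + 2 * k3_w + k4_w) / 6
    return theta, omega


def energy_per_unit_mass(theta, omega):
    """Kinetic plus potential energy divided by m; constant without friction."""
    return 0.5 * (LENGTH * omega) ** 2 + G * LENGTH * (1 - math.cos(theta))


def simulate(theta0, omega0, dt=0.001, steps=10_000):
    """Integrate from the initial state and return (time, theta, omega) tuples."""
    theta, omega = theta0, omega0
    trajectory = [(0.0, theta, omega)]
    for step in range(1, steps + 1):
        theta, omega = rk4_step(theta, omega, dt)
        trajectory.append((step * dt, theta, omega))
    return trajectory


# Release the pendulum from rest at 30 degrees and simulate 10 seconds.
trajectory = simulate(theta0=math.radians(30), omega0=0.0)
```

Checking that the energy stays nearly constant along the trajectory is a quick sanity test for any integrator applied to this system.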

Using ColumnTransformer to combine data processing steps

Create cohesive pipelines for processing data where different columns require different techniques. This scikit-learn tool comes in extremely handy, but also has some quirks of its own. Today we’ll be using it to transform data on ferry wait times for the Edmonds-Kingston route of the Washington State Ferries. (Thank you, WSF, for the data!) Full …

Find and play with ‘molecule’ datasets

Machine learning has become popular in all fields. It has helped make industrial processes more efficient in all industries, be it logistics or defense. Although pharmaceutical companies have been slow to adopt machine learning, they are surely catching up. In this article, I talk about some of the standard datasets which can …

Improve Your Data Wrangling With Object Oriented Programming

One of the dirty secrets of data science goes as follows: “Far from spending hours discovering glorious new algorithms and developing cutting-edge neural networks, you will in fact spend most of your time cleaning, munging, and manipulating data.” This is the result of a simple, inescapable truth: data in the real world doesn’t …

Creating An Interactive Data Visualisation for ‘The Office’, Using D3.js (Part 1)

This first post focuses on the data pre-processing I did prior to creating the actual visualisation. All of the pre-processing was done using Python and Pandas in Jupyter Notebooks, along with some other libraries for specific tasks such as sentiment analysis. The dataset I started with contained information about …

Visualize and communicate uncertainties with Python and Plotly

Fantasy sports examples to visualise, understand and communicate uncertainties, probabilities and ranges of outcomes (code & data included). [Original photo by Riho Kroll on Unsplash] In this article, I want to show how easy it is to effectively visualise and communicate uncertainties and ranges. More importantly, I would like to demonstrate how helpful doing so …

Using Python and Robinhood to Build An Iron Condor Options Trading Bot

[Photo by Jean Vella on Unsplash] I recently discovered the revenue potential of options trading. My mind works very technically, and I noticed that trading strategies are nothing more than conditional statements. So I decided to create a simple iron condor trading strategy. According to OptionAlpha, the iron condor options trading strategy is the most …

How to use NLP in Python: a Practical Step-by-Step Example

Tokenizing the Job Descriptions: Tokenization is the process of splitting a text string into sections (tokens). It is necessary because computer programs handle tokenized text better. We must explicitly split the job description text string into different tokens (words) with delimiters such as space (“ ”). We use the word_tokenize function to …
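
To make the idea of delimiter-based tokenization concrete, here is a small sketch that uses only the standard library. It is not the word_tokenize function the excerpt refers to (NLTK provides a function by that name, which handles many more cases). The sample job description and the helper simple_tokenize are invented for illustration.

```python
import re

# Hypothetical job description text.
job_description = "Data Scientist needed: Python, SQL & machine-learning skills."

# Splitting on whitespace alone leaves punctuation attached to the words.
whitespace_tokens = job_description.split()


def simple_tokenize(text):
    """Lowercase the text and return runs of letters, digits, or underscores."""
    return re.findall(r"\w+", text.lower())


word_tokens = simple_tokenize(job_description)

print(whitespace_tokens)
print(word_tokens)
```

Note that this simple pattern also splits hyphenated words such as "machine-learning" into two tokens, which may or may not be what an analysis needs.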

Plotnine plot deconstruction: visualizing the billboard hot 100

Smart data selection: In the first part of the code (lines 4–10) I preprocess the data for the plot. Here I make two important choices: (1) I use only the data for the top 15 of the top 100; (2) I discard songs that appear for only 4 weeks or less in the data. By removing less important …

How I used machine learning to strategize my GRE preparation.

[Figure: DataFrame containing words and synonyms] Now let’s join the 2 data frames (meanings and synonyms):
    result = pd.merge(df.word, data, on='word')
    result.fillna('', inplace=True)
    print(result)
We can see the data needs some cleaning since it contains stop-words like “and”, “or”, “the” and other elements like punctuation marks. Also, we must take care of contractions like can’t, won’t, don’t …

Projecting and Visualizing Infrastructure Growth Trends with Python

How Python can aggregate data consumption on a month-to-month basis and prevent hours of manual calculations. [Image: unsplash.com] When it comes to project planning, knowing if your database or dedicated network share is capable of handling increased data load is essential to avoiding performance failures. This is the type of query that comes up naturally and …

Visualizing Indonesian School Participation Rates with Choropleth Map in Python

A step-by-step look at Indonesia’s school participation rates for different age groups, using choropleth maps with Python and GeoPandas. In this article, we will look at Indonesian school participation rates in 2019 for different age groups using geospatial data. This data will be visualized with a choropleth map that we will break …

Voice Classification with Neural Networks

Summary of the project: The first step was to learn how to manipulate audio data and build models to classify sounds. I found a great competition called the Urban Sound Classification at https://datahack.analyticsvidhya.com/contest/practice-problem-urban-sound-classification/. The problem was to classify 10 different urban sounds like children playing, street music, car engine, etc. I did a lot of …

Optimize Python Code in Jupyter Notebook

Bored waiting for a slow Python job to finish? Let’s find the bottleneck and optimize it. [Video by PIRO4D on Pixabay] Want to make slow Python code fast in Jupyter Notebook? Using a sum aggregate as an example, we will measure the runtime with %%time and find the bottleneck using the %lprun line profiler. Assume …

Does public sentiment affect the stock market?

[Image: stocktwits.com] It’s long been known that the stock market’s performance has an effect on the public, but what if the reverse is true? With the rise of social media, people have been posting vast amounts of information on a variety of topics, and researchers have become increasingly interested in finding out what role finance-related social …

Building a Coronavirus Outbreak SMS Alert System!

[Photo by @thedotter] Using Python to stay on top of the outbreak. Due to the recent Coronavirus outbreak in China, as well as other countries in Asia, people in the rest of the world are growing concerned that the virus will continue to spread and eventually reach their country. As of the time of writing …

Visualizing Map of Crime against women in India using Geopandas

This article shows a simple way to plot state-wise or district-wise statistical data (here, crime against women in India) on a map of India (or of any country you choose) using GeoPandas, a Python library. [Image courtesy: Reyna Zamora, Pinterest Post. Retrieved Feb 18, 2020, from https://in.pinterest.com/pin/626563366894685267/] Introduction: Hello …

Ten Tips to Save you Time and Frustration When Programming ⏳

This is such a big topic, and gets so little attention, that I debated writing a whole article on it. Instead, here’s a bonus section: a 5-step plan to resolve errors faster. Translation: you called shape as a method, but no such method exists; you wanted to get an attribute. Every programmer sees lots of …
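
The error being "translated" in this excerpt is not shown, so the following is only a guess at the situation it describes: calling shape with parentheses on an object where shape is a plain attribute. The Table class is a hypothetical stand-in with no dependencies, not the object from the original article.

```python
class Table:
    """Hypothetical stand-in for an object that exposes `shape` as an attribute."""

    def __init__(self, rows, columns):
        self.shape = (rows, columns)


table = Table(3, 2)

# Calling the attribute as if it were a method fails, because a tuple
# cannot be called.
try:
    table.shape()
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# Reading the attribute without parentheses is what was intended.
dimensions = table.shape
print(dimensions)
```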

DevOps for Data Science with GCP

[Image source: https://pixabay.com/photos/dock-ship-container-port-boat-1277744/] Deploying Production-Grade Containers for Model Serving. One of the functions of data science teams is building machine learning (ML) models that provide predictive signals for products and personalization. While DevOps has not always been considered a core responsibility of data science teams, it is becoming increasingly important as these teams start to take …

Create effective data visualizations of proportions

Best ways to see individual contributions to a whole and changes over time, at various dataset sizes (includes simple, visual demonstrations, code & data). [Figure: Various visualisations of proportions] Plotting proportions of a whole might be one of the most common tasks in data visualisation. Examples include regional differences in happiness, economic indicators or crime, …

Exploring Moving Averages to Build Trend Following Strategies in Python

[Figure: Generated in Python using Plotly] How moving averages can be used to improve portfolio performance over the benchmark. “Buy low, sell high” is a common goal everyone in finance wants to achieve. This, however, is more difficult than it appears, since it is almost impossible to predict which direction the market is going. Many investors …

An Overview Of Importing Data In Python

Python built-in functions (read(), readline(), and readlines()): In general, a text file (.txt) is the most common file we will deal with. Text files are structured as a sequence of lines, where each line includes a sequence of characters. Let’s assume we need to import the following text file (sample.txt) into Python:
    Country/Region
    Mainland China
    Japan
    Singapore
    Hong Kong
    Japan
    Thailand
    South …
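
The three methods named in this excerpt differ in how much of the file they return per call. The sketch below shows that difference, assuming a short stand-in for sample.txt that is written to a temporary directory first, since the full file contents are not shown in the excerpt.

```python
import os
import tempfile

# Write a short stand-in for sample.txt (only the first few lines are assumed).
sample_lines = ["Country/Region", "Mainland China", "Japan", "Singapore"]
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(sample_lines) + "\n")

# read() returns the whole file as a single string.
with open(path, encoding="utf-8") as f:
    whole_text = f.read()

# readline() returns one line per call, including its trailing newline.
with open(path, encoding="utf-8") as f:
    first_line = f.readline()
    second_line = f.readline()

# readlines() returns a list with one string per line.
with open(path, encoding="utf-8") as f:
    all_lines = f.readlines()

print(repr(whole_text))
print(repr(first_line), repr(second_line))
print(all_lines)
```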

Python ETL Tools: Best 8 Options

ETL is the process of fetching data from one or many systems and loading it into a target data warehouse after doing some intermediate transformations. The market has various ETL tools that can carry out this process. Some tools offer a complete end-to-end ETL implementation out of the box and some tools help you to …

Build a custom-trained object detection model with 5 lines of code

Ideally, you’ll want at least 100 images of each class. The good thing is that you can have multiple objects in each image, so you could theoretically get away with 100 total images if each image contains every class of object you want to detect. Also, if you have video footage, Detecto makes it easy …

A better way for asynchronous programming: asyncio over multi-threading

A brief introduction to asyncio:
    import asyncio
    from aiohttp import ClientSession
    async def fetch(url):
        async with ClientSession() as session:
            async with session.get(url) as response:
                return await response.read()
This is basically the asyncio version of fetch_url. I use aiohttp because it provides an excellent client session where we can make HTTP requests asynchronously. Besides aiohttp.ClientSession, the code probably looks strange with async …
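
For readers who want to see the async/await pattern run without a network or the aiohttp dependency, here is a minimal sketch in which asyncio.sleep stands in for the HTTP request. The fake_fetch coroutine and the example.com URLs are hypothetical and are not part of the original article.

```python
import asyncio


async def fake_fetch(url, delay):
    """Pretend to download `url`; the sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return f"response from {url}"


async def main():
    # Hypothetical URLs; nothing is actually requested.
    urls = [
        "https://example.com/a",
        "https://example.com/b",
        "https://example.com/c",
    ]
    # Schedule all coroutines together; while one is waiting, the event loop
    # runs the others. gather() returns results in the order of its arguments.
    return await asyncio.gather(*(fake_fetch(url, 0.05) for url in urls))


results = asyncio.run(main())
print(results)
```

Because the three waits overlap, the whole run takes roughly as long as a single simulated request rather than three in sequence.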

The easiest way to download YouTube videos using Python

[Image source: Unsplash] And how to use a custom class to extract frames as images. In one of my first articles on Medium, I showed how to train a Convolutional Neural Network to classify images coming from old GameBoy games: Mario and Wario. After more than a year, I wanted to revisit one aspect of the …

Building an Incremental Recommender System

A recommender system should ideally adapt to changes as they happen. Although I will try to keep the math jargon to a minimum, this story expects that the reader is familiar with concepts like user-item interaction matrix, matrix factorization, embedding spaces, as well as basic machine learning terminology. This story is not an introduction to …

Using Data to Find the Most Brutal Cannibal Corpse Song

According to The Free Dictionary, ‘brutality’ is defined as: “The state or quality of being ruthless, cruel, harsh, or unrelenting.” For me there are two ways to express this in a song: by sonic features (like the energy and valence of the music itself) and by the lyrics (a vocabulary full of ‘harsh and cruel’ words) …

Visualize Stock Market Data using Bokeh

The beginner’s guide to creating an interactive stock price dashboard. [Photo by Chris Liverani on Unsplash] To build your trading platform, you also need a way to visualize your stock market data. One of the easiest options is TradingView, which is an excellent way to analyze financial market data. However, you …

Creating Beautiful Sankey Diagrams with floWeaver

Stepping up your Sankey game in Python. A while back, I wrote “The What, Why, and How of Sankey Diagrams.” In the process of exploring Matplotlib’s Sankey library, I found a lovely alternative: floWeaver. If you’re looking to document flows with a Sankey-style diagram in a different style, this blog is for you. Sankey Diagram …

Magic Methods in Python, by example

[Image courtesy of www.istockphoto.com] Changing Built-in Behavior using Magic Methods. Magic methods are special methods that you can define to add ‘magic’ to your classes. They are always surrounded by double underscores, for example, the __init__ and __str__ magic methods. Magic methods can enrich our class design by giving us access to Python’s built-in syntax …
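
A short sketch of the two magic methods named in this excerpt, plus __add__ as one example of hooking into built-in syntax. The Money class is a made-up example, not taken from the original article.

```python
class Money:
    """Hypothetical example class showing a few magic methods."""

    def __init__(self, amount, currency="USD"):
        # Called when an instance is created: Money(2.5).
        self.amount = amount
        self.currency = currency

    def __str__(self):
        # Called by str() and print().
        return f"{self.amount:.2f} {self.currency}"

    def __add__(self, other):
        # Called for the + operator: money_a + money_b.
        if not isinstance(other, Money) or other.currency != self.currency:
            return NotImplemented
        return Money(self.amount + other.amount, self.currency)


total = Money(2.5) + Money(4)
print(total)
```

Returning NotImplemented for unsupported operands lets Python raise its usual TypeError instead of failing somewhere inside the method.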

How to quickly find the best bin width for your histogram

In this graphic you can see the end result. If we change the bin width through a slider, the plotly graph adjusts automatically. To implement this behavior, we combine plotly.graph_objs (which creates the plotly graph) with an ipywidgets.FloatSlider. This is the code for creating the rebinnable histogram. Let’s go through it line by line. …

Understanding the Coronavirus Epidemic Data

A Python Toolbox to analyze the current coronavirus outbreak. [Graphical illustration from 前瞻网] The novel coronavirus (previously named 2019-nCoV, and later changed to COVID-19¹) is currently sweeping through China. It triggered a global health emergency², and claimed more than a thousand lives as of Feb 13, 2020³. People around the world are constantly bombarded with …

Elevate Your Webscraping With Splinter

In a previous blog, I explored the basics of webscraping using a combination of two packages: Requests, which fetches a website’s HTML, and BeautifulSoup4, which makes sense of that HTML. [Link: Soup of the Day | Webscraping With Beautiful Soup: A Beginner’s Guide | towardsdatascience.com] These packages are a great introduction to webscraping, but Requests has limitations, …

Dictionary as an Alternative to If-Else

Use Dictionaries to Create Cleaner Code Than If-Else Statements. You may work with Python’s dictionaries frequently, but have you unlocked the full capacity of the dictionary to write more efficient code? If you didn’t know how to create an ordered dictionary, group multiple dictionaries into a single mapping, create a read-only …
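
The title's idea can be sketched as follows: a chain of if/elif branches that only maps keys to values can be replaced by a dictionary lookup with a default. The status codes and function names below are illustrative choices, not taken from the article.

```python
def describe_with_if_else(status_code):
    """Map a status code to a message with an if/elif chain."""
    if status_code == 200:
        return "OK"
    elif status_code == 404:
        return "Not Found"
    elif status_code == 500:
        return "Internal Server Error"
    else:
        return "Unknown"


# The same mapping expressed as data instead of control flow.
STATUS_MESSAGES = {
    200: "OK",
    404: "Not Found",
    500: "Internal Server Error",
}


def describe_with_dict(status_code):
    """Look the code up in the dictionary, falling back to a default."""
    return STATUS_MESSAGES.get(status_code, "Unknown")


for code in (200, 404, 500, 418):
    print(code, describe_with_dict(code))
```

Adding a new case then means adding one dictionary entry rather than another branch.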

Getting Familiar with Keras

Neural networks are computing systems loosely inspired by brain connectivity. In short, neural networks make a series of transformations to input, the results of which are used as features during learning. Keras is an open-source Python library that enables easy experimentation with neural networks. Keras provides many of the building blocks of neural networks …

Entity Level Evaluation for NER Task

When we evaluate the NER (Named Entity Recognition) task, there are two kinds of methods: the token-level method and the entity-level method. For example, take the predicted sentence below: “Foreign Ministry spokesman Shen Guofang told Reuters”. If we use the token-level evaluation, the token “Shen” is correct, and the token “Guofang” is wrong. But …
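
The difference between the two evaluation methods can be sketched on the example sentence. The BIO tags below are assumptions made for illustration (the excerpt does not give the gold or predicted labels, only that “Shen” is right and “Guofang” is wrong), and extract_entities is a hypothetical helper, not the article's code.

```python
tokens = ["Foreign", "Ministry", "spokesman", "Shen", "Guofang", "told", "Reuters"]
# Assumed gold labels and an assumed prediction that misses "Guofang".
gold = ["B-ORG", "I-ORG", "O", "B-PER", "I-PER", "O", "B-ORG"]
pred = ["B-ORG", "I-ORG", "O", "B-PER", "O", "O", "B-ORG"]


def extract_entities(tags):
    """Return a set of (label, start, end) spans from a BIO tag sequence."""
    entities = set()
    start, label = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        ends_here = tag == "O" or tag.startswith("B-") or tag[2:] != label
        if start is not None and ends_here:
            entities.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return entities


# Token level: each token is scored on its own.
token_accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Entity level: a predicted entity counts only if its whole span and label match.
gold_entities = extract_entities(gold)
pred_entities = extract_entities(pred)
correct = len(gold_entities & pred_entities)
precision = correct / len(pred_entities)
recall = correct / len(gold_entities)

print(f"token accuracy: {token_accuracy:.3f}")
print(f"entity precision: {precision:.3f}, recall: {recall:.3f}")
```

Under these assumed tags, the partially recognized person name counts as one fully wrong entity at the entity level, even though one of its two tokens was tagged correctly.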

How to Prepare for a Data Science Interview

Never put off until tomorrow what you can do today. [Photo by Thomas Park on Unsplash] In 2012, Harvard Business Review announced that data science would be the sexiest job of the 21st century. Since then, the hype around data science has only grown. Recent reports have shown that demand for data scientists far exceeds …

SVMs, Random Forests and Unbalanced Datasets: Predicting Hotel Cancellations

When attempting to build a classification algorithm, one must often contend with the issue of an unbalanced dataset. An unbalanced dataset is one where the sample sizes differ between classes, which can introduce significant bias into the predictions of the classifier in question. [Image source: pixabay.com] In this particular example (available from the references section …