Handling Big Volume of Well Log Data with a Boosted Time-Efficiency with Python

How Python has revolutionized the way we interact with big data in the petroleum industry, and why we should know about it. A slice of the earth. Figure by Ani-Mate/shutterstock.com. This is the story of a geophysicist who had grown tired of handling big volumes of well log data with manual input …

Hidden Markov Model — Implemented from scratch

The Internet is full of good articles that explain the theory behind the Hidden Markov Model (HMM) well (e.g. 1, 2, 3 and 4). However, many of these works contain a fair amount of rather advanced mathematical equations. While equations are necessary if one wants to explain the theory, we decided to take it to …

Implementing Face Recognition in 2 minutes

A detailed guide to Face Recognition using Python. Were you surprised when you found out that Facebook automatically tags your friends in your pictures? It was then we realized that machines have become much smarter nowadays. Our face conveys a lot of information, including our emotional state. It has also shown its …

Winning Solution of an Online Data Science Hackathon

Full code of my winning solution at the Analytics Vidhya Data Science Hackathon. Photo by Jefferson Santos on Unsplash. A Data Science Hackathon is the ultimate battleground for aspiring data scientists. The aim of any hackathon is to let your imagination guide you to push the boundaries of your knowledge and skills in this field. After …

How to Use Decorators in Python, by example

Image courtesy of Debby Hudson via Unsplash. A decorator in Python is a function that takes another function as its argument and returns yet another function. Decorators can be extremely useful because they allow an existing function to be extended without any modification to the original function's source code. Consider the following example: 1. add_together …
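The definition above can be illustrated with a minimal sketch. The names `announce` and `add` are hypothetical and are not the article's own example, which is cut off in this excerpt; the point is only to show a function that takes a function and returns a new one.

```python
import functools


def announce(func):
    """Decorator (hypothetical name) that reports each call before running `func`."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__} with {args} {kwargs}")
        return func(*args, **kwargs)
    return wrapper


@announce  # equivalent to: add = announce(add)
def add(a, b):
    """Return the sum of two numbers."""
    return a + b


result = add(2, 3)  # prints the call details, then returns the sum
```

The body of `add` is untouched; the extra behaviour lives entirely in `wrapper`, which is what makes decorators convenient for logging, timing or caching.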

Machine learning for streaming data with creme

Online machine learning can change the way you think about data science in production. Photo by Joao Branco on Unsplash. Deploying machine learning models into a production environment is a difficult task. Currently, the common practice is to have an offline phase where the model is trained on a dataset. The model is afterwards deployed …

Wikipedia API for Python

Summary. To get the summary of an article, use the “summary” method as shown below: print(wikipedia.summary("Coronavirus")) Executing this line of code returns the summary of the article you are looking for. The return type of this method is a string (‘str’): ‘Coronaviruses are a group of related viruses that cause diseases in …

Building an Image-Taking Interface Application for Your Image Recognition Model

First, import the library.

    import cv2

Next, we must create a VideoCapture instance. You can test whether the instance was able to connect with your camera (if it wasn’t, check your settings to make sure apps can access it).

    cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        print("Video device not connected.")

Finally, it’s time to take the …

How to Turn a Dinosaur Dataset into a Circle Dataset with the Same Statistics?

They have the Same Median and Standard Deviation, but they are Two Clearly Distinct Datasets! What is the role of data visualization in data science? Showing beautiful graphs? If that is your view of data visualization, you may want to change it after seeing the gif below. Dinosaur to Circle. The gif above captures the statistics …

Automate These 3 (Boring!!) Excel Tasks with Python!

Let’s apply the same formula across multiple workbooks! Source: Nik Piepenbreier. Let’s take a look at one final example! In each of the Excel workbooks, we have totals across the rows, but not a grand total for sales. Again, we could open each workbook and add in a formula, or we can use Python to …

Check the quality of your code with Pylint

10 rules of PEP 8 guidelines that will make your code easier to read. Starry Night Over the Rhône by Vincent van Gogh. Pylint is a quality checker for the Python programming language that follows the style recommended by PEP 8. PEP 8 provides guidelines for writing clear code in Python with the main goal of …

How to Pull Data from an API using Python Requests

Technical API how-to without the headache. Automatically retrieving data from APIs is critical. The thing that I’m asked to do over and over again is automate pulling data from an API. Despite holding the title “Data Scientist”, I’m on a small team, so I’m not only responsible for building models, but also pulling data, cleaning …

Computer Vision 101: Working with Color Images in Python

Source: pexels.com. Learn the basics of working with RGB and Lab images to boost your computer vision projects! Every computer vision project, be it a cat/dog classifier or bringing colors to old images/movies, involves working with images. And in the end, the model can only be as good as the underlying data …

Visualising Assembly Graphs

As shown in Figure 2, sequences are denoted by lines starting with “S”, and overlaps (or links) between sequences are denoted by lines starting with “L”. The plus (+) and minus (-) signs denote whether the original sequence or its reverse complement is considered in the overlap. The value denoted with the letter “M” in a link refers to …
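The record layout described above can be read with a few lines of Python. This is only a sketch: it assumes a tab-separated GFA-style file in which an “S” line holds a name and a sequence, and an “L” line holds the two sequence names, their orientations and the overlap field. The toy records are invented for illustration and are not the ones in the article's Figure 2.

```python
def parse_gfa(lines):
    """Split GFA-style lines into segments (S records) and links (L records)."""
    segments = {}  # segment name -> sequence
    links = []     # (from_name, from_orient, to_name, to_orient, overlap)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            links.append(tuple(fields[1:6]))
        # any other record type is ignored in this sketch
    return segments, links


# Hypothetical toy graph: two sequences joined by one link.
example = [
    "S\t1\tACGTAC",
    "S\t2\tTACGGA",
    "L\t1\t+\t2\t-\t3M",
]
segments, links = parse_gfa(example)
```

Here the single link says that sequence 1 in its original orientation (+) overlaps the reverse complement (-) of sequence 2.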

Understand the machine learning Blackbox with ML interpreter

A web app for auto-interpreting the decisions of algorithms like XGBoost. There are dangers in having models running the world and making decisions from hiring to criminal justice. While it’s ideal to have models that are both interpretable and accurate, many of the popular and powerful algorithms are still black boxes. Among them are highly performant …

Deploying a Text Classification Model

Learn to deploy a machine learning-based application with Dash and Heroku. This article is the last of a series in which I cover the whole process of developing a machine learning project. If you have not read the previous two articles, I strongly encourage you to do so here and here. The project involves …

Stylistic differences between R and Python in modelling data through the Naïve Bayes classifier

How to use R and Python to predict the probability of an event, based on prior knowledge of conditions that relate to it. By author, combining images from source 1, source 2 and source 3. Since the most decisive aspect of a data science methodology is to model data to produce estimations and powerful predictions, …

Foundations for the Statistical Analysis of Climate Change

Stern view of NOAA Ship RAINIER with survey launches alongside. Credit: Personnel of NOAA Ship RAINIER. In order to explore the data quickly and easily, I’ve scaled it down to two columns: Date (year and month) and Monthly Anomaly in Celsius. The data points range from January 1850 to December 2014 and contain a combination of …

Web Applications in Python

Django is a Python-based, open-source web framework that enables easy creation of database-driven websites. Some examples of sites that use Django include Instagram, Mozilla, and Bitbucket. In this post, we will walk through the steps of building a simple web application with Django. Documentation for Django can be found here. Let’s get started! The …

Machine Learning Finds Just How Contagious (R-Naught) the Coronavirus Is

To fit the model, we will implement a very simple yet effective gradient descent algorithm. For those unaware, a gradient descent algorithm repeatedly adjusts the parameters in the direction that reduces the error, moving toward a minimum. The process is: Initialize a and b to 1 and 30, respectively. Initialize lr1 and lr2 to 0.00005 each. …
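The update rule behind these steps can be sketched as follows. The excerpt does not show the article's actual model, so this sketch assumes a hypothetical straight-line model y = a·x + b with a mean-squared-error loss and made-up data; it also assumes that lr1 and lr2 are the learning rates for a and b. Only the starting values come from the text.

```python
def mse(a, b, xs, ys):
    """Mean squared error of the assumed line a*x + b against the data."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, a=1.0, b=30.0, lr1=0.00005, lr2=0.00005, steps=1000):
    """Plain gradient descent with one learning rate per parameter."""
    n = len(xs)
    for _ in range(steps):
        errors = [a * x + b - y for x, y in zip(xs, ys)]
        grad_a = 2 * sum(e * x for e, x in zip(errors, xs)) / n  # dLoss/da
        grad_b = 2 * sum(errors) / n                             # dLoss/db
        a -= lr1 * grad_a  # step against the gradient
        b -= lr2 * grad_b
    return a, b


# Made-up data for illustration only.
xs = list(range(10))
ys = [3 * x + 40 for x in xs]

loss_before = mse(1.0, 30.0, xs, ys)
a, b = fit(xs, ys)
loss_after = mse(a, b, xs, ys)
```

With learning rates this small, each step changes the parameters only slightly, so many iterations are needed; comparing `loss_before` and `loss_after` is a quick check that the updates are heading downhill.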

Exploring Netflix Data in Python

Next, it would be useful to generate summary statistics from numerical columns like ‘duration’. Let’s define a function that takes a data frame, a categorical column, and a numerical column. The mean and standard deviation of the numerical column for each category are stored in a data frame, and the data frame is sorted in …
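The idea of that function can be shown with a dependency-free sketch. It uses plain lists of dictionaries from the standard library instead of a data frame, the sample rows are invented, and the sort order (descending by mean) is an assumption because the excerpt is cut off before stating it. In pandas the same grouping is usually written with `groupby` followed by `agg`.

```python
from collections import defaultdict
from statistics import mean, stdev


def summary_stats(rows, categorical_column, numerical_column):
    """Mean and standard deviation of a numerical column per category."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[categorical_column]].append(row[numerical_column])
    stats = [
        {
            categorical_column: category,
            "mean": mean(values),
            # a sample standard deviation needs at least two values
            "std": stdev(values) if len(values) > 1 else 0.0,
        }
        for category, values in groups.items()
    ]
    # Assumed ordering: largest mean first.
    return sorted(stats, key=lambda s: s["mean"], reverse=True)


# Hypothetical rows standing in for the real dataset.
rows = [
    {"type": "Movie", "duration": 90},
    {"type": "Movie", "duration": 110},
    {"type": "TV Show", "duration": 2},
    {"type": "TV Show", "duration": 4},
]
table = summary_stats(rows, "type", "duration")
```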

Track Coronavirus in Your Country by Displaying Notifications Using Python

    search = bs.select("div tbody tr td")

Then we use the get_text() method to get the text (the country name) inside each tag, and the string method find(), which returns -1 when the country name is absent, to locate the very first element that satisfies our restriction.

    start = -1
    for i in range(len(search)):
        if search[i].get_text().find(country) != -1:
            start = i
            break

Then we …

Notes on the number of components in PCA: R², Q² & eigenvectors similarity

If you search for the question “how many PCA components should I analyse?” you will most likely find the recommendation of using a threshold for cumulative variance explained. The threshold is often arbitrary and usually set to 75% or 80%. Well, there are some better ways. This article briefly introduces the concepts of R² and Q², …

Create Virtual Environment using “virtualenv” and add it to Jupyter Notebook

Here is the article outline: (1) Why do we need a virtual environment? (2) What is the difference between virtualenv, virtualenvwrapper, penv and venv? (3) Create a virtual environment using virtualenv. (4) Add the virtual environment to Jupyter Notebook. Like other programming languages, Python has its own way of downloading, storing and resolving packages (or libraries). By default, every Python project on your …

How to Export Pandas DataFrame to CSV

A pandas DataFrame is an Excel-like data structure with labeled axes (rows and columns). To define a DataFrame, you need at least the rows of data and the column names (header). Here is an example of a pandas DataFrame: Pandas DataFrame is Excel-like Data. Code to generate DataFrame: The pandas DataFrame to_csv() function exports the DataFrame to CSV format. …

Altering & Merging Photos with Python Bitwise Operators

There are three methods we can use to merge photos: AND, OR, and XOR. Let’s try them out! (First, I reshaped the first image into 500×500 pixel dimensions using the PIL library, because bitwise operations must be performed on data of the same length. Additionally, know that data holds the values for the first image …
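The three operations can be tried on raw pixel values without any image library. The sketch below is an assumption-laden simplification: it treats each image as a flat sequence of 0 to 255 values and uses tiny made-up inputs, whereas the article works on real 500×500 photos loaded with PIL. The helper name `merge` is hypothetical.

```python
def merge(pixels_a, pixels_b, op):
    """Combine two equal-length pixel sequences with a bitwise operator."""
    if len(pixels_a) != len(pixels_b):
        # the same-length requirement mentioned in the text
        raise ValueError("both images must contain the same number of values")
    ops = {
        "and": lambda x, y: x & y,  # keeps bits set in both images
        "or": lambda x, y: x | y,   # keeps bits set in either image
        "xor": lambda x, y: x ^ y,  # keeps bits that differ
    }
    return bytes(ops[op](x, y) for x, y in zip(pixels_a, pixels_b))


# Made-up pixel values standing in for two images.
first = bytes([0b11110000, 255, 0, 100])
second = bytes([0b10101010, 15, 0, 100])

merged_and = merge(first, second, "and")
merged_or = merge(first, second, "or")
merged_xor = merge(first, second, "xor")
```

Note that XOR of two identical values is 0, so regions where both photos match turn black under XOR.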

Create Music Recommendation System Using Python

Recommendation System. Get familiar with Tableau and create a Collaborative-Filtering Recommendation System. Hello folks! Recommendation Systems are everywhere and pretty standard all over the web. Amazon, Netflix, and many such companies are using Recommendation Systems. The one we are going to build is pretty similar to what Spotify or YouTube Music uses but much more …

Exploring COVID-19 Research Publications

The next thing we could look at is the sentiment scores of the abstracts. Maybe we can assign a sentiment score to each abstract and see if the scores are associated with positive results. In order to get sentiment scores, we need to import a Python package called textblob. The documentation for textblob can be found …

Learn How to (easily!!) do 3 MORE Advanced Excel Tasks in Python

In another post, we Learned How to (easily!!) do 3 Advanced Excel Tasks in Python. Here we’ll continue that mission and learn another 3 advanced things you can easily take on! I hope that the post made Python a little more approachable while providing some fun comparisons between the two great tools. We’ll begin by …

TensorFlow: Sarcasm Detection in 20 mins

Build your first NLP Model in TensorFlow from scratch. Photo by Joshua Hoehne on Unsplash. This is a quick Natural Language Processing (NLP) tutorial in TensorFlow. NLP is a sub-field of artificial intelligence concerned with understanding, interpreting and manipulating human language by computers. TensorFlow is an open source machine learning library by Google. This tutorial …

A Mathematical Primer of Compression

How to reduce storage size without losing information. Photo by Alexey Ruban on Unsplash. Since I’m into music, it often comes up that there is a growing trend: music is sold digitally and as vinyl. Sometimes I’ll hear people mistakenly call the vinyl trend “retro” or “trendy” or “hip” or whatever. But if you actually ask …

Visualise COVID-19 case data using Python, Dash and Plotly

First, download the latest stable release of Python from https://www.python.org/downloads/, start the installer and go through it. There are plenty of options for the code editor; Visual Studio Code is a valid one. To install it, download the installer from https://code.visualstudio.com/download, run it and complete the installation. To install the packages we need, …

Machine Learning: Model Selection and Hyperparameter Tuning

Grid Search as a clean and simple means of discovering better hyperparameters. Image by andreas160578 from Pixabay. In my day-to-day research, a problem I face quite often is selecting a proper statistical model that fits my data. Mostly, I use statistical models for smoothing out erroneous signals from DNA data and I …

Fine-Tuning the Strategy Using a Particle Swarm Optimization

Once we have the description of the strategy, we can start writing the code. Backtrader is used as the backtesting system. It is an open-source, feature-rich framework for backtesting and trading. Yes, in Python. I separated this project into 5 files: 4 files each consisting of a class, and one file that is a script …

A Quick, Easy Way to Unpivot Data in Python

Let’s make downloaded data more usable! Source: Nik Piepenbreier. Python and Excel make it incredibly easy to analyze data and to present that analysis in a cross-tab style format. But what if you receive data only in that format? To take on meaningful analysis beyond that, it’s often necessary to spend a good chunk of …

Simulate cultural interactions using Go and Python

Model and simulate how culture disseminates using Go and Jupyter Notebook. Nothing shouts culture more than visiting relatives during Chinese New Year. From celebrations and customs to decorations and food (it’s always about food), everything is red and loud in your face. When I was young, I thought this was what being Chinese is all …

Dictionaries in Python

In Python, a dictionary is a collection of data values where each element is a key/value pair; since Python 3.7, dictionaries preserve insertion order. Dictionaries have a wide variety of applications, including mapping objects in video games, database settings, unit testing and much more. In this post, we will discuss how to define and perform simple operations on dictionaries in Python. …
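A few of the simple operations mentioned above are sketched here. The `settings` dictionary and its keys are made up for illustration, loosely following the database-settings use case named in the text.

```python
# Define a dictionary of key/value pairs (hypothetical database settings).
settings = {"host": "localhost", "port": 5432}

settings["user"] = "admin"             # add a new key
settings["port"] = 5433                # update an existing key

port = settings["port"]                # look up a key that exists
timeout = settings.get("timeout", 30)  # fall back to a default if missing

removed = settings.pop("user")         # remove a key and return its value
keys = list(settings)                  # keys come back in insertion order

for key, value in settings.items():    # iterate over key/value pairs
    print(key, value)
```

Using `get` with a default avoids the `KeyError` that plain indexing raises for a missing key.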

Building a Content Based Book Recommendation Engine

Details of 3592 books in total are available in our dataset. It has six columns: title -> Book Name; Rating -> Book rating given by the user; Genre -> Category (type of book), where I have taken only three genres (business, non-fiction and cooking) for this problem; Author -> Book Author; Desc -> Book description; url -> Book cover …

How to analyse a quantitative finance trading hypothesis in Python

An SMA200 (200-period simple moving average) quantitative analysis using Python and Polygon. Image licensed from AdobeStock. In this article, I will analyse the 200-period moving average on US equities using Python, Jupyter Notebook and data from Polygon.io. The reason for selecting this strategy is that if there is any indicator that can generate consensus among the …
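For readers unfamiliar with the indicator, a simple moving average is just the mean of the most recent N closing prices, recomputed at every bar. The sketch below is a plain-Python illustration with made-up prices and a short window; it does not use Polygon.io data and is not the article's code. The function name `simple_moving_average` is hypothetical.

```python
def simple_moving_average(prices, window=200):
    """Return the SMA for every position that has a full window of prices."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_total = 0.0
    for i, price in enumerate(prices):
        running_total += price
        if i >= window:
            running_total -= prices[i - window]  # drop the price leaving the window
        if i >= window - 1:
            averages.append(running_total / window)
    return averages


# Made-up closing prices; a 3-period window keeps the example readable.
closes = [10, 11, 12, 13, 14]
sma3 = simple_moving_average(closes, window=3)
```

With the default window of 200, no value is produced until 200 prices are available, which is why a long price history is needed before the indicator can be analysed.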