How short-term forecasting with neural nets can inform long-term decisions.

For electric utilities, reducing the monthly demand charge can be hugely profitable. Implementing a peak shaving strategy every day, however, could be costly. If a utility is using direct load control (paying customers to turn off air conditioners, water heaters, etc.), it may frustrate customers if it does so too frequently. If a utility uses storage, …

AI for medical imaging — now?

QuantX (Quantitative Insights): This computer-aided diagnosis (CADx) software assists radiologists in the assessment and characterisation of breast abnormalities using magnetic resonance imaging (MRI). As well as highlighting areas of interest using image segmentation, a proprietary metric (‘QI Score’) is calculated that is correlated to the likelihood of malignancy. Class II medical device, with special controls …

Building a Zero Curve with Forward Rate Agreements Using Pandas

In the finance world, if you want to price an instrument and figure out its future value at t(n) from t0 (now), you need to use the spot yield curve. Among professional traders, the spot yield curve is called the zero curve. If you have $1000 now to …
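As a rough illustration of the idea in this excerpt, the sketch below compounds an amount held today forward to a future date using a zero (spot) rate. The function names, the 3% rate, and the annual-compounding convention are assumptions made for illustration; they are not taken from the article.

```python
def future_value(present_value: float, zero_rate: float, years: float) -> float:
    """Compound present_value forward at an annually compounded zero rate (assumed convention)."""
    return present_value * (1.0 + zero_rate) ** years


def discount_factor(zero_rate: float, years: float) -> float:
    """Value today of one unit of currency paid in `years` years."""
    return 1.0 / (1.0 + zero_rate) ** years


# Hypothetical inputs: $1000 today, a 3% two-year zero rate.
fv = future_value(1000.0, 0.03, 2)
print(round(fv, 2))
```

Discounting the result back with `discount_factor(0.03, 2)` recovers the original $1000, which is the consistency a zero curve is built to provide.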

Artificial Intelligence: Digging for Value

There are three types of Machine Learning algorithms which inform our A.I. use cases. They are:
● Supervised Learning: Teaching a machine to “learn” by showing it lots of examples of inputs and responses.
● Unsupervised Learning: Feeding in data and letting the machine recognize patterns.
● Reinforcement Learning: Not giving the machine any data …

How I improved my text classification model with feature engineering

In this article I will show you how to conduct a textual analysis to improve the accuracy of your model and discover several facts in your data. For that I will use a data set available on Kaggle. This dataset is composed of comments from ‘Wikipedia’s talk page edits’; it is provided with the identifiers …

Spatial regression in R part 1: spaMM vs glmmTMB

Many datasets these days are collected at different locations over space, which may generate spatial dependence. Spatial dependence (observations close together are more correlated than those further apart) violates the assumption of independence of the residuals in regression models and requires the use of a special class of models to draw the valid …

4 Alternative Data Sources for Model Driven Investing

Looking Beyond Stock Prices in Machine Learning. The unprecedented rise of machine learning has seen its application to many industries spanning far and wide, from air travel to healthcare, e-commerce to logistics, insurance to meteorology and even agriculture. The field of investing is no exception as the environment investors operate in is rich with data. …

Starting Data Visualization with Julia and JuliaBox

With Julia and JuliaBox you can make impressive data visualizations with almost no programming knowledge and no need to install anything. Julia is a relatively new language for data analysis. It has a high-level syntax and is designed to be easy to use and understand. Some have called it the new Python. Unlike Python, though, it …

Why we do datathons for dementia

There’s more reason to join a datathon than unlimited* pizza with your data buddies (even if that’s a really great highlight!). Datathons join up data scientists with subject experts, and when you’re dealing with complex real-world health data, that’s really crucial. Dementia is one of our society’s most urgent health problems and …

September Edition: Deploying Machine Learning Models

The applications of machine learning are seemingly endless (as Juan De Dios Santos demonstrates, building a Pikachu detection app). But whilst an increasing number of data scientists are familiarising themselves with the technology, finding successful use cases that are scaled and robust is still difficult. This is, at least in part, due to the fact …

Automatically finding the best Neural Network for your GAN

Generative Adversarial Networks (GANs) have been a hot topic in Deep Learning ever since their initial invention and publication at NIPS 2014. There’s a good reason for it all: GANs are able to create totally new content based on only a small bit of guidance. It’s that sort of creativity which makes them so powerful. …

Top 7 Machine Learning Methods that Every Data Scientist Must Know

In this digital era, most manual tasks are being automated. Machine learning algorithms are helping computers perform surgeries and play chess, and they are getting smarter and more personal. We are living in a world of constant technological progress, with computing advancing day after day. We …

Delivering Data Science Without Delivering Software

Do you always need to deliver complete software? From time to time debates such as ‘R vs Python’ or ‘Software Skills vs Statistics Skills’ rear their heads in the Data Science world. These debates sometimes appear to have the hidden assumption that the only possible deliverable for a data scientist …

Hypothesis tests with Python

In my previous article, I talked about statistical hypothesis tests. Those are pivotal in Statistics and Data Science, since we are always asked to ‘summarize’ the huge amount of data we want to analyze in samples. Once provided with samples, which can be arranged with different techniques, like Bootstrap sampling, the general purpose is …

Writing Covered Calls To Protect Your Stock Portfolio

Now let’s see what happens when we add call options to the mix. First let me quickly cover the nitty gritty of how options work: An option is a financial instrument that grants you the right, but not the obligation, to either buy or sell 100 shares of a financial asset (such as Apple stock) …
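To make the mechanics concrete, here is a minimal sketch of the profit at expiry from holding 100 shares and writing one call against them. The function name and every price in it are hypothetical, and the sketch ignores commissions, dividends, and early assignment.

```python
SHARES_PER_CONTRACT = 100  # one option contract covers 100 shares


def covered_call_profit(purchase_price: float, strike: float,
                        premium: float, price_at_expiry: float) -> float:
    """Profit at expiry for 100 shares plus one written (short) call."""
    # Gain or loss per share on the stock itself.
    stock_pnl = price_at_expiry - purchase_price
    # The writer keeps the premium but pays out any amount above the strike.
    call_pnl = premium - max(price_at_expiry - strike, 0.0)
    return SHARES_PER_CONTRACT * (stock_pnl + call_pnl)


# Hypothetical numbers: shares bought at 100, call written at strike 110 for a 2.00 premium.
for price in (90.0, 110.0, 120.0):
    print(price, covered_call_profit(100.0, 110.0, 2.0, price))
```

With these inputs the profit is capped once the stock closes at or above the strike, while the premium cushions part of any loss below the purchase price.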

AI Search Algorithms Implementations

Popular search algorithms in AI explained and implemented. On the planet Naboo, the droid R2-D2 is serving Queen Amidala and succeeds in stealing some important documents containing secrets of the castle of the Dark Lord Darth Vader, located on the volcanic planet of Mustafar. As soon as the Dark Lord finds out about this, he …

colorspace @ useR! 2019

[This article was first published on Achim Zeileis, and kindly contributed to R-bloggers]. Conference presentation about the colorspace toolbox for manipulating and assessing color palettes …

Real-time Twitter Sentiment Analysis for Brand Improvement and Topic Tracking (Chapter 2/3)

Loading & Preparing Twitter data. First, we need to extract the Twitter data from the data source. You can directly read from sample_data.csv for simplicity: df = pd.read_csv("sample_data.csv"). Or, in a more formal way, extract data from our MySQL database, which has been set up and filled with real-time Twitter data in Chapter 1. db_connection …

Recession Prediction using Machine Learning

“Subprime is contained” — Bernanke (March 2007) This project presents a machine learning approach for predicting U.S. recession occurrence for 6-month, 12-month, and 24-month time frames. The predictive model uses a handful of employment, inflation, interest rate, and market indicators. Simply put: Can one predict the timing of future U.S. recessions by looking for clues …

Analysis of car accidents in Barcelona using Pandas, Matplotlib, and Folium

Open Data Barcelona is Barcelona’s data service, which contains around 400 datasets covering a wide range of topics such as population, business, or housing. This project was born in 2010 with the main objective of maximizing available public resources, allowing companies, citizens, researchers, and other public institutions to make use of the data generated. In …

How Artificial Intelligence is changing the world

The technology of AI has been improving every year for the past 20 years, and today it is a very mature technology. Many companies and organizations are actively employing AI in different ways. AI is also developing into the next generation of computing, where big ideas can come from and many people can become an …

Feature Selection in Python — Recursive Feature Elimination

Now the fun part can finally begin. You will need to declare two variables, X and target, where the first represents all the features and the second represents the target variable. Then you’ll make an instance of the machine learning algorithm (I’m using RandomForests). In it, you can optionally pass a random state seed for …

A story of my first gold medal in one Kaggle competition: things done and lessons learned

Or a story about a great team. Taking part in Kaggle competitions is a serious challenge. You need to spend a lot of time and effort, study new things and try many tricks to get a high score. And often this isn’t enough because there are a lot of great people, who have more experience, …

#HKPROTEST: Visualizing State Troll Tweets From China’s Disinformation Campaign

A deeper dive, using various visualization tools, into the rhetoric behind Beijing’s social media campaign against the protests in HK. A Scattertext plot of key words in tweets and retweets by Chinese state trolls targeting the protest movement in Hong Kong. In late August, as tension grew in Hong Kong following months of street protests …

Getting Started with Geospatial Works

Start your journey into the geospatial world with this guide. “The application of GIS is limited only by the imagination of those who use it” — Jack Dangermond. The term geospatial is relatively new and has been gaining popularity since the 1980s. Data that contains geographic content is classified as geospatial data. This …

Multivariate Differential Calculus and Optimization-Part 2

In my previous article, I introduced some concepts which are necessary if we want to set up an optimization problem in a multivariate environment. Here, we will first dwell on how to check the smoothness of a surface (which is the main assumption needed to carry out an optimization task), then we will see how to look for …

Multivariate Differential Calculus and Optimization-Part 1

Differential calculus is a powerful tool to find the optimal solution to a given task. When I say ‘optimal solution’, I’m referring to the result of the optimization of a given function, called the objective function. This result might be either a maximum (for instance, if your objective function describes your revenues) or a minimum (for instance, if …

Traffic Sign Detection using Convolutional Neural Network

We will be building a CNN model in order to detect traffic signs. CNN Model: Convolutional neural networks (ConvNets or CNNs) are very important to learn if you want to pursue a career in the computer vision field. CNNs help in running neural networks directly on images and are more efficient and accurate than …

Models which makes money & makes your Customers happy

Predictive model 1: Behavioural clustering. Behavioural clustering informs you how people behave while purchasing: do they use the web site or the call center? Are they discount addicts? How frequently do they buy? How much do they spend? How much time will go by before they purchase again? This algorithm helps set the right tone …

Artificial Intelligence and Nonprofits

How can nonprofits use AI for good? Could part of AI Safety be ensuring distribution or work towards equality? I have written before about fairness in AI; the importance of data quality; and equality relating to gender. Yet the most challenging article to write was Inequalities and AI. Is artificial intelligence truly safe if it …

Models as Web Endpoints

An excerpt from Data Science in Production. In the second chapter of Data Science in Production, I discuss how to set up predictive models as web endpoints. This is a useful skill, because it enables data scientists to shift from batch model application, such as outputting CSV files, to hosting models that other …

Use ExPanD to Create a Notebook for Your EDA

The ‘ExPanDaR’ package offers a toolbox for interactive exploratory data analysis (EDA). You can read more about it here. The ‘ExPanD’ shiny app allows you to customize your analysis to some extent but often you might want to continue and extend your analysis with additional models and visualizations that are not part of the ‘ExPanDaR’ …

Estimating variance: should I use n or n – 1? The answer is not what you think

Estimates of population parameters based on samples are not exact: there is always some error involved. In principle, one can estimate a population parameter with any estimator, but some will be better than others. There is one particular case which was always very confusing to me (because of the multiple alternatives) and that is the …
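For readers who want to see the two divisors from the title side by side, the sketch below computes a sample's variance with n and with n - 1. The helper name `variance_with_divisor` and the sample values are made up for illustration, and the sketch does not take a position on which estimator the article recommends.

```python
import statistics


def variance_with_divisor(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(sample)
    mean = sum(sample) / n
    squared_deviations = sum((x - mean) ** 2 for x in sample)
    return squared_deviations / (n - ddof)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data

divide_by_n = variance_with_divisor(sample, ddof=0)          # divisor n
divide_by_n_minus_1 = variance_with_divisor(sample, ddof=1)  # divisor n - 1

# The standard library exposes the same two conventions.
print(divide_by_n, statistics.pvariance(sample))
print(divide_by_n_minus_1, statistics.variance(sample))
```

The n - 1 version is always the larger of the two for the same data, and the gap shrinks as the sample grows.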