[NLP] Basics: Understanding Regular Expressions

The only guide you’ll ever need. When I started learning natural language processing, regular expressions truly felt like a foreign language. I struggled to understand the syntax, and it would take me hours to write a regular expression that would return the input I was looking for. Naturally, I tried …

Understanding the Infinite Monkey Theorem

Absurdities of Probability Theory and why you cannot trust your gut instinct when guessing probabilities. Imagine you have an infinite number of monkeys. Now you give each of these monkeys a laptop and let them type randomly for an infinite amount of time. What are the chances that at some point, this story will …
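To make the question above concrete, here is a minimal sketch of the usual back-of-the-envelope calculation. It assumes a simplified model that the excerpt does not spell out: every keystroke is uniformly random, and the monkeys' output is split into independent blocks that are each compared to one fixed target text. The function name and the sample numbers are hypothetical.

```python
import math

def prob_at_least_once(text_length, alphabet_size, attempts):
    """Chance that at least one of `attempts` independent random blocks
    of `text_length` keystrokes equals one fixed target text.

    Assumes alphabet_size >= 2, so a single block never matches with certainty.
    """
    # Each keystroke is uniform over the keys, so one block matches
    # with probability (1 / alphabet_size) ** text_length.
    p_single = (1.0 / alphabet_size) ** text_length
    # 1 - (1 - p)^n, computed via log1p/expm1 so a tiny p is not rounded away.
    return -math.expm1(attempts * math.log1p(-p_single))

# Hypothetical setup: a 6-letter word typed on a 26-key keyboard.
for attempts in (10**6, 10**9, 10**12):
    print(attempts, prob_at_least_once(6, 26, attempts))
```

Under these assumptions the probability climbs toward 1 as the number of attempts grows, which is the intuition behind the theorem: any fixed positive chance per attempt eventually succeeds given unlimited attempts.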

The Hidden Peculiarities of Realtime Data Streaming Applications

With the increasing number of open-source frameworks such as Apache Flink, Apache Spark, Apache Storm, and cloud frameworks such as Google Dataflow, creating realtime data-processing jobs has become quite easy. The APIs are well defined, and standard concepts such as Map-Reduce follow very similar semantics across all frameworks. However, even today, a developer starting …

4 Common Types of Hackathons

As mentioned in the previous article, there is more than one kind of competition called a hackathon. In recent years, hackathons are no longer only for tech-savvy people; they require collaboration between techies, designers, and businesspeople. Based on my own experience, I will classify hackathons into four main categories. I will start with the more technical one …

How the 4 Most Popular Intelligent Assistants Stack Up

A Brief Comparison of the Pros and Cons That Each Virtual Assistant Offers. Siri: Apple’s voice-based intelligent assistant; released in 2011; Apple announced SiriKit (SiriSDKs) in mid-2016. Features: available on all iDevices; talks back to the user and proactively recommends actions to take; remembers context and understands relationships. Pros …

Build pipelines with Pandas using “pdpipe”

The second method looks for the string drop in the Price_tag column and drops those rows that match. And finally, the third method removes the Price_tag column, cleaning up the DataFrame. After all, this Price_tag column was only needed temporarily, to tag specific rows, and should be removed after it served its purpose. All of …

Will NumPy become Python?

Well, without NumPy, how can we perform mathematical operations between arrays? How does Python stack up against the other statistical languages of our period? Python’s array iteration is awesome, actually. The zip() function makes it possible to iterate through two lists at the same time.
    array = []
    for f, b in zip(array1, array2):
        res = …
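As a minimal sketch of the zip() idea described above, the following pairs two plain Python lists element by element without NumPy. The sample values and the names `sums` and `products` are hypothetical, since the excerpt is truncated before it shows the original loop body.

```python
# Hypothetical sample data; the excerpt does not show the original arrays.
array1 = [1, 2, 3]
array2 = [10, 20, 30]

# Element-wise sum in plain Python: zip() pairs items by position.
sums = []
for f, b in zip(array1, array2):
    sums.append(f + b)

# The same pattern written as a list comprehension, here for products.
products = [f * b for f, b in zip(array1, array2)]

print(sums)      # [11, 22, 33]
print(products)  # [10, 40, 90]
```

Note that zip() stops at the shorter of the two lists, so inputs of unequal length are silently truncated rather than raising an error.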

Machine Learning and the Future of Music: An era of ML artists

Artificial Intelligence has already taken over our lives and transformed them for the better. The days are over when you could still debate whether AI will impact a certain industry and transform it like others. Wondering why? Because artificial intelligence has already penetrated every other industry that we know and continues to impact several others. …

Can Humans Fall Head Over Heels for AI?

With each passing day, we are using artificial intelligence for a variety of purposes and jobs. It has penetrated almost every industry and is helping them become innovative, develop authentic tools and build strategies towards a sustainable future. Researchers are eagerly exploring new use cases of artificial intelligence that have the power to radically transform …

12 Steps to Production-Quality Data Science Code

There’s a Dilbert comic in which Dilbert tells his boss that he can’t take over a co-worker’s software project until he spends a week bad-mouthing the co-worker’s existing code. If you’ve ever taken over maintaining someone else’s code, you’ll immediately see the truth in this. No one likes taking over maintaining or working on …

How You Measure Months Matters — A Lot. A Look At Two Implementations of KDA

This post will detail a rather important finding I made while implementing a generalized framework for momentum asset allocation backtests. Namely, when computing momentum (and other financial measures for use in asset allocation, such as volatility and correlations), measuring formal months, from start to end, has a large effect on strategy performance. So, first …

How to Build a Restaurant Recommendation System Using Latent Factor Collaborative Filtering

I usually watch YouTube when I am taking a break from my work. I promise myself to watch YouTube for only 5 to 10 minutes to rest my mind. Here is what usually happens: after I finish watching one video, the next video pops up from YouTube’s recommendations and I …

Julia Box: Google Colab for Julia

Julia is a great language that is up and coming in the statistical computing space. Julia is actually very commonly used by biologists, medical scientists, and chemists; however, Julia for data science, while not quite used on a large scale yet, is an idea that becomes more and more feasible every day. Julia certainly has advantages to …

5 Actionable advice for Data Science beginners

Here are 5 tips for everyone getting into data science. “Learn from the experts; you will not live long enough to figure it all out by yourself.” (Brian Tracy) There are myriad ways to learn data science. You can read articles, watch videos, enroll in online …

The Russell Westbrook Effect

For 3 straight seasons, he averaged a triple-double in Oklahoma City. While the regular season numbers were stupendous, they didn’t translate to much postseason success. Reunited with his old Thunder buddy James Harden, what will Westbrook’s Effect be in Houston? Oscar Robertson was the first player to average a triple-double for a whole …

Setting up your data strategy when you are a startup

How to realise your vision through data. You have probably heard everywhere that you need a data strategy. Buzzwords like big data, digital transformation and artificial intelligence have infiltrated your company, and it seems like all your competitors know what they need to do except you. As someone who has worked on the other …

Advantages and Disadvantages of Artificial Intelligence

Artificial Intelligence is one of the emerging technologies which tries to simulate human reasoning in AI systems. John McCarthy coined the term Artificial Intelligence in 1955. He said, ‘Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate …

5 Ways Data is Transforming the Insurance Industry

St. John’s Tobin School of Risk Conference on Transforming Insurance. I recently attended a conference on the use of data in the insurance industry at the St. John’s School of Risk in New York City, and here are 5 key things I took away: All insurance companies aspire to use more data but few are …

A.I. Will Reinstate Direct Democracy

THE BRIGHT SIDE OF AI: The case for optimism in the age of artificial intelligence. Most debates regarding the advent of artificial intelligence (AI) seem to focus only on the negative sides. AI-powered monopolies, the tyranny of the minority, jobless future, collapse of the democratic system and capitalism, global inequality, digital dictatorships …

GRNN with Small Samples

After a bank launches a new product or acquires a new portfolio, the risk modeling team often faces the challenge of estimating the corresponding performance, e.g. risk or loss, with a limited number of data points conditional on business drivers or macro-economic indicators. For instance, it is required to project …

How to Double the Productivity of Your R&D Team

In 1919 Britain faced mass starvation. Although victorious in war, it was broke. Britain had been a food importer for decades, and now could neither grow enough food itself nor buy it from others. Deeply in debt, its economy disrupted, politicians hoped scientists could clean up the mess. They turned to Rothamsted, an agricultural research …

Have you taught your machine yet?

How Google’s Teachable Machine transfer-learns right inside your little browser. Since 2015, when a ResNet first surpassed the human accuracy threshold for classifying images, deep learning has taken the world by storm. Reading research detailing such achievements usually gives one the impression that …

Data Science Dream is not so far with R

R is a programming language for statistical computing and data analysis. It is the industry standard for analysis tools. It was created by Ross Ihaka and Robert Gentleman in the year 1992 at the University of Auckland. It is open-source and is completely free to use. With more than 15,000 packages available online, there is …

Stop Using Word Clouds without the Context

Word clouds don’t show the relations between the words, losing the context. Text network visualization resolves this problem. [Figure: Obama’s 2013 inauguration address word cloud generated with Wordle] What you see above is a word cloud of Barack Obama’s 2013 inauguration address. I don’t know whether word clouds are supposed to be informative, hopefully not, because …

Taking a Step Back: Here’s What AI Needs to Learn from a Child

Artificial intelligence is taking the world by storm. It is already manifesting in a plethora of industries and organizations more reluctant to adopt it than ever. Several niches of artificial intelligence like machine learning are being religiously used by practitioners to form better strategies, predict industry trends and bring innovative products to the market. But, …

A Quick Guide to Create Astonishing Data Science Projects

Set the team culture at your first meeting: Don’t start talking about the challenge or problem, technology, etc. Instead, bond, and talk about strengths and what you would like to lead or learn. Without going into details, talk about ideas of how to solve the problem. Feel comfortable with each other and build trust. Select a …

Emerging Technology Trends for Banking Industry in 2020 & Beyond

Banks around the world are taking advantage of new technologies to streamline their operations and provide a better experience to their customers. Find out the latest trends that will disrupt the banking industry in the future! Today, we live in the digital era where technology is driving change in almost every industry, whether it is the …

Build a Python Crawler to Get Activity Stream with GitHub API

I want to get activities like the ones below:
ShusenTang starred lyprince/sdtw_pytorch
chizhu starred markus-eberts/spert
Hexagram-King starred BrambleXu/knowledge-graph-learning
Yevgnen starred BrambleXu/knowledge-graph-learning
……
2.1 GitHub API
First, we take a look at the GitHub API documentation. If you have not enabled two-factor authentication, you can run the command below to test the API. After entering your password, you should see the response. …

11 Evaluation Metrics Data Scientists should be familiar with— Lessons from A High-rank Kagglers’…

The evaluation metric, the theme of this post, is a concept that ML beginners easily confuse with a related but separate concept, the loss function. They are similar in the sense that they can be the same when we are lucky enough, but that will not happen every time. The evaluation metric is a metric “we want” to minimize …

The Top 10 Books on AI recommended by Elon Musk and Bill Gates

I used to read every book I could get my hands on. Once I started to get more into reading non-fiction books to get actual answers on career and business-related topics, I realised that that’s a terrible approach. There are just too many books out there and filtering them by looking at Amazon reviews or …

Host a dynamic website on Google Firebase for free using Node.js and Cloud Firestore DB

Requirements
1. Google Account: If you don’t have a Google account, you need to sign up for one. You can do so by going to https://accounts.google.com/SignUp.
2. Node.js and npm
Mac/Windows: You can download the installer from https://nodejs.org/en/download/.
Linux: Follow the steps below to install Node.js:
1. Open a terminal
2. Run the following commands:
sudo apt-get install curl
curl -sL …

Kafka Gotchas

Great, but not perfect. I’ve assisted several large clients in building a microservices-style architecture using Kafka as a messaging backbone, and have a reasonably good understanding of its abilities and the use cases that really bring them out. But I’m not a Kafka apologist by any stretch; any technology that has gone through such a rapid …

The Big Data Handbook

Why are there so many components? In the Hadoop ecosystem, there are many different layers that take care of different components, including data storage, integration, access, resource management, execution engines and operations & management. Before I lose my readers beyond this paragraph, let me provide a high-level description of what the stack is trying …

Updates on RSpectra: new “center” and “scale” parameters for svds()

[This article was first published on R on Yixuan’s Homepage, and kindly contributed to R-bloggers]. Per the suggestion by @robmaz, RSpectra::svds() now has two new …

Knowledge Data Science with Semantics Technologies.

The way we represent things in our mathematical world is fundamental. Most of the theoretical advances in AI, and particularly in machine and deep learning, come from better ways to represent systems and data and from new and useful techniques to analyze them. Almost all the algorithms we have to do such tasks rely on …

Increasing Kaggle Revenue: Analyzing user data to recommend the best new product

In this project, we will create recommendations for increasing revenue at Kaggle, an online community for data science professionals. We will analyze a Kaggle customer survey, attempting to learn if there are any indicators of potential revenue growth for the company. To make our recommendations, we will try to learn: Is there market potential for …

Top-K Off-Policy Correction for a REINFORCE Recommender System

The problem is the following: we have multiple other policies. Let’s take DDPG and TD3 trained actors from my library. Given these policies, we want to learn a new, unbiased one in an off-policy manner. As the authors put it: Off-Policy Candidate Generation: We apply off-policy correction to learn from logged feedback, collected from an ensemble …

Hacking Google Coral Edge TPU: motion blur and Lanczos resize

Google’s Coral project has recently gone out of beta. According to the benchmarks, Coral devices provide excellent neural network inference acceleration for DIY makers. Those devices are built on the specialized Tensor Processing Unit ASIC (Edge TPU), which proved to be somewhat tricky to work with, but the enforced limitations and quirks are rewarding. I was …

How different factors have an influence on your life expectancy?

How do attributes associated with your country of origin define your life expectancy? Everyone has their expiration date on this planet. After this day, they are buried six feet deep under the earth to decay. Humans die as a result of multiple causes such as accidents, diseases, war and other forms of death. One interesting trend …