What would the sky look like if Mars, Jupiter, Saturn, or Neptune were as close to us as the Moon is now? Well, other than the global calamity caused by extreme tides and general astrophysical disruption, it would look quite pretty. If Planets were as close as the Moon pic.twitter.com/bBwIPtRQ1J — Physics & Astronomy Zone (@ZonePhysics) … Read more: Because it’s Friday: If planets were as close as the moon
So it turns out that the 1990 Michael Crichton novel Jurassic Park is, indeed, a work of fiction. (Personal note: despite the snark to follow, the book is one of my all-time favorites; I clearly remember devouring it in 24 hours straight while ill in a hostel in France.) If the monsters and melodrama … Read more: Simulating dinosaur populations, with R
Image File Format Images are stored in memory; a raster image contains pixel values arranged in a regular matrix. Conversely, a vector image represents geometric objects using continuous coordinates. If you scale up a raster image, its resolution is lost, but this does not happen with a vector image. Raster image and Vector image Tagged Image File Format … Read more: Image Processing Class #0.2 — Digital Image
Nov 29, 2018 The rapid rise in AI applications and decreasing processor and memory costs have allowed the last decade to show incredible progress with Recommender Systems. Given their rising importance in the retail industry, they are undoubtedly one of the more popular topics in Artificial Intelligence. However, creating a full-fledged, ready-for-production recommender system can … Read more: Full-Fledged Recommender System
Dada, Instagram, and the future of AI Collage is the language of the moment, but has been for over 100 years. Let’s walk through where it came from (Dada), what it’s up to now (Instagram), and why it’s integral to the future of AI (Deep Fakes, GANs, and the ingrained copy). Yesterday: Dada While the technique … Read more: We are Collage
Nov 29, 2018 TL;DR: In this article you’ll learn how to implement sequence-to-sequence models with and without attention on a simple case: inverting a randomly generated sequence. You might already have come across thousands of articles explaining sequence-to-sequence models and attention mechanisms, but few are illustrated with code snippets. Below is a non-exhaustive list of articles … Read more: Attention Seq2Seq with PyTorch: learning to invert a sequence
Gentle introduction to Neural Networks Nov 29, 2018 This series of posts on Neural Networks is part of a collection of notes taken during the Facebook PyTorch Challenge, prior to the Deep Learning Nanodegree Program at Udacity. Contents Introduction Forward Pass Backward Propagation Learning Testing Conclusion 1. Introduction In the next illustration, an Artificial Neural Network is … Read more: Neural Networks II: First Contact
Nov 27, 2018 In this series of articles I will explain the inner workings of a neural network. I will lay the foundation for the theory behind it as well as show how a competent neural network can be written in a few easy-to-understand lines of Java code. This is the first part … Read more: Part 1: A neural network from scratch — Foundation
How art and wanting to change the world led me to data science Nov 27, 2018 Data plays a crucial role in understanding the world around us. I’ve been working with data in one way or another since before I could appreciate its value. Now I’m in an immersive data science program. Here’s a little bit … Read more: What better time than now?
Nov 27, 2018 I recently leveled up to fatherhood. That’s why I am currently on 5 months of parental leave (thanks to the awesome team @store2be for going along with this!). Every morning at around 5am, I leave the bedroom with my son for the kitchen so his mom can have two real hours of … Read more: Map the solar system to a place near you – A NatGeo’s MARS inspired Shiny web app
This is the next article in my series on AzureR, a family of packages for working with Azure in R. I’ll give a short introduction on how to use AzureVM to manage Azure virtual machines, and in particular Data Science Virtual Machines (DSVMs). Creating a VM Creating a VM is as simple as using the … Read more: AzureVM: managing virtual machines in Azure
What kind of data is there? Is it only numerical? Are there categorical features which could be incorporated into the model? Heads up, categorical features can be considered any type of data which isn’t immediately available in numerical form. In the problem of trying to predict housing prices, you might have number of bathrooms as … Read more: Being a Machine Learning Engineer: 7-months in
Reflections on how data (or lack thereof) helps (or fails) policy makers in developing countries Foreword When I stood up to speak last Friday at the Steering Committee meeting between the Ministry of Education of Ivory Coast and TRECC (a partnership for transforming Education in cocoa producing regions, led by the Jacobs Foundation), it had … Read more: The Power of Data
Gentle introduction to Neural Networks Nov 25, 2018 This series of posts on Neural Networks is part of a collection of notes taken during the Facebook PyTorch Challenge, prior to the Deep Learning Nanodegree Program at Udacity. Contents Neurons Connections Layers — Neurons vs Connections 3.1 Layers of Neurons 3.2. Layers of Connections — PyTorch Example 4. Notation ambiguity: Y = … Read more: Neural Networks I: Notation and building blocks
A hands-on guide for beginners on EDA and Data Science competitions Exploratory Data Analysis (EDA) is an approach to analysing data sets to summarize their main characteristics, often with visual methods. The steps involved in EDA are: Data Collection, Data Cleaning, Data Preprocessing, Data Visualisation. Data Collection Data collection is the process … Read more: Exploratory Data Analysis (EDA) techniques for kaggle competition beginners
The popularization of blockchain will not depend on the users understanding its operation but on the existence of friendly and effective applications that solve real problems. Nov 22, 2018 Historically, each paradigm of the internet has had its killer application: before the web, it was email, with the original web it was Google and with … Read more: Blockchain can be the new paradigm of the net
I really love blogging with Hugo+Blogdown, but unfortunately Blogdown is still mostly restricted to R (although Python is now also possible using the reticulate package). Jupyter offers a great literate programming environment for multiple languages and so being able to publish Jupyter notebooks as Hugo blogposts would be a huge plus. I have been looking … Read more: Blogging with Hugo and Jupyter
Nov 19, 2018 Photo by Randy Colas on Unsplash In the world of fake news and ideology-driven subjective media coverage, it is questionable which sources of journalism can be considered “reliable”. It happens many times that two different news outlets share two completely different takes on the same story. “Experts” point out different consequences of events … Read more: Combating media bias with AWS Comprehend
(Photo by rawpixel on Unsplash) Nov 18, 2018 It’s A Career Change. Starting out as a data scientist may be the modern version of becoming a rock star but no-one really seems to be talking about what happens a few years further into your career. Analysing big data sets. Building models. Connecting data pipelines. The challenges … Read more: Becoming An Analytics Manager Isn’t A Promotion.
Businesses from every sector are investing in data science education programmes. Working at tech education company Decoded, I have found it fascinating to see the immense value data skills can bring to every sector, from banks and retailers to charities and government. When embarking on such an initiative, there are plenty of strategic decisions for … Read more: Training your staff in data science? Here’s how to pick the right programming language
Nov 13, 2018 A few weeks ago, the TGS Salt Identification Challenge finished on Kaggle, a popular platform for data science competitions. The task was to accurately identify whether a subsurface target is salt or not on seismic images. Our team: Insaf Ashrapov, Mikhail Karchevskiy, Leonid Kozinkin We finished 28th (top 1%) and would … Read more: Kaggle: TGS Salt Identification Challenge
The experiment was simple: could a machine learning (ML) model produce Golden Retriever images that people would mistake for being real? The reason for choosing dogs… was because dogs are awesome! In our current climate, we often hear the term ‘fake news’, and with ML models becoming more advanced, their ability to create non-human content … Read more: DOGNET: can an AI model fool a human?
If you missed the 1st installment of this series, Humans vs Robots is here. Prompted by advances in Generative Adversarial Networks (GAN), a year ago I tweeted a thread about telling apart pictures taken with a camera from generated pictures. Here is the updated version of that thread. A few of my tips are still … Read more: Telling Apart AI and Humans: #2 Photo VS GAN generated image
The future of data storage What is Data? How is it stored, processed, transferred? What is the cloud? Will we eventually run out of space?! These are the questions that populated my fatigued mind as I tried to relax after a long day at the Flatiron School. [Disclaimer: an immersive program will do that to you]. As … Read more: Data Apocalypse!
Quantum computing is becoming visible in the tech world. There are over a dozen hardware companies, each trying to build their own quantum computer, from small startups like Xanadu through medium-sized ones like D-Wave or Rigetti to large enterprises like Google, Microsoft or IBM. On top of that there are a couple of dozen software … Read more: Quantum advantage
Image Source: www.mapr.com/products/apache-hadoop/ There are many links on the web about installing Hadoop 3. Many of them do not work well or need improvements. This article draws on the official documentation and other articles, in addition to many answers from Stackoverflow.com. Note: All prerequisites must be applied on the name node and data nodes. First, … Read more: Installing Hadoop 3.1.0 multi-node cluster on Ubuntu 16.04 Step by Step
Nov 5, 2018 What is PyTorch? It’s a Python-based package that serves as a replacement for NumPy arrays and provides a flexible library for deep learning development. As for why I prefer PyTorch over TensorFlow, see this Fast AI blog post on the reasons to switch to PyTorch. Or simply put, … Read more: PyTorch 101 for Dummies like Me
Austrian Quant The Austrian Quant is named after the Austrian School of Economics, which serves as the inspiration for how I structured the portfolio. I designed a trading strategy composed of 3 different investment funds to gain a better understanding of investments, machine learning and programming and how they all combine in the world … Read more: The Austrian Quant: My Machine Learning Trading Algorithm Outperformed the SP500 For 10 Years