Special Feature: Our Expanding Global Village
TDS Special Feature Expanding cities and a shrinking countryside Image from Unsplash, created by Louis Reed Are you a city dweller? You’re not alone! Our global urban population has reached … Read more
Deterministic trends vs stochastic trends, and how to deal with them Photo by Ali Abdul Rahman on Unsplash Detecting and dealing with the trend is a key step in the … Read more
How to “boost” your cyclical sales data forecast with LightGBM and Python Hierarchical time series forecasting (Image drawn by the author) Welcome to another edition of “The Kaggle Blueprints,” where … Read more
An overview of the research landscape combining structured and unstructured knowledge in NLP Photo by Billy Huynh on Unsplash This post is based on our AACL-IJCNLP 2022 paper “A Decade … Read more
If you don’t have real data, you got to fake it! Photo by Leif Christoph Gottwald on Unsplash In the world of data analytics, getting your hands on a good … Read more
From a digital elevation model to a 3D mesh Photo by Planet Volumes on Unsplash 3D meshes can be used to represent geographic data such as terrain, buildings and other … Read more
Mastering Time Series Forecasting with Machine Learning How your choice of loss function can make or break your time series forecasts Photo by Dan Asaki on Unsplash In this post … Read more
How to reduce the offline-online evaluation gap A recommender system jumping through the offline-online evaluation gap (AI-generated image) In this tutorial, I will give some rationales about why one should … Read more
A method for well-distributed approximation of the Pareto front Image generated by author with DALL-E by OpenAI Multi-objective optimization is a critical area of research that has wide-ranging applications in … Read more
Infrastructure as Code With Terraform Image from Unsplash by Krishna Pandey Infrastructure as Code (IaC) is an essential concept to optimize and take your resources and infrastructure to production. IaC … Read more
Data Science Get the Most Out of Your Data with These Simple and Effective Pandas Visualization Techniques Photo by Choong Deng Xiang on Unsplash Data visualization is an important aspect … Read more
Test Effectiveness of any Treatment over Time with PCA Photo by Raul Petri on Unsplash The standard, presented in the literature and adopted at large scale by companies, to study … Read more
A step-by-step guide to develop a map-based application (Part IV) Photo by Caspar Camille Rubin on Unsplash Maps are a powerful tool for visualizing and understanding geographic data but they … Read more
Creating the artifacts and deploying the model on the cluster In part 1, we learned how to use Terraform to set up and manage our infrastructure conveniently. In this part, … Read more
The need for proper validation, for good datasets that are objective and balanced, and for predictions that are useful in realistic scenarios. Photo by Tingey Injury Law Firm on Unsplash Recent research in … Read more
Why causing your model to fail is the best thing you can do Photo by Michael Dziedzic on Unsplash When developing a new model or algorithm, it is tempting to … Read more
How to convert infinite variables to a discrete space using tile coding and Python Photo by Ehud Neuhaus on Unsplash This article is a continuation of the Reinforcement Learning series. … Read more
Learn How To Efficiently Write Data To Parquet Format Using Pandas, FastParquet, PyArrow or PySpark. Photo by Dominika Roseclay In today’s data-driven world, efficient storage and processing of large datasets … Read more
Explore changes in rank over time using only Matplotlib Photo by blueberry Maki on Unsplash There is nothing so stable as change. Bob Dylan When I was a teenager I … Read more
We use table variables in DAX all the time. But what when we need to calculate intermediary results and reuse them later in a DAX Measure? This challenge sounds easy, … Read more
NLP | MULTIMODALITY | CHAIN OF THOUGHTS | The world is not only text: How to extend the chain of thoughts to image and text? photo by Giulio Magnifico on … Read more
Boost your Python skills with this comprehensive class reference Image by Lukas Bieri on Pixabay When it comes to classes, many Python developers struggle, for a lot of reasons. Firstly … Read more
How minor modifications can make a major difference to the communicative power of your visualizations Photo by Mohamed Hassan on Pixabay Look at that poor guy above. He’s been grinding … Read more
Start storing models, parameters, pipelines, data and plots changing only one parameter Photo by Hans Reniers on Unsplash Introduction The MLflow Tracking component is an API and UI for logging … Read more
Photo by Kampus Production from Pexels: https://www.pexels.com/photo/a-judoka-throwing-an-opponent-to-the-ground-6765024/ Using pose estimation with mediapipe to track Jiu Jitsu movements Brazilian Jiu-Jitsu is a martial art that has been getting a lot of … Read more
Scikit Learn or Pandas? One hot encoding is a popular method to represent categorical data (All images by author) Both sklearn.preprocessing.OneHotEncoder and pandas.get_dummies are popular choices (well, practically the only … Read more
Decorators provide a new and convenient way for everything from caching to sending notifications Photo by Elena Mozhvilo on Unsplash At first, every developer’s goal is to get things working. … Read more
Image generated using Stable Diffusion Part 2 of the tutorial on how to run and evaluate experiments without leaving your IDE In the previous article of this series, I demonstrated … Read more
Most common hyperparameter optimization methodologies to boost machine learning outcomes Photo by Jonas Jaeken on Unsplash Usually the first solution that comes to mind when trying to improve a machine … Read more
Use natural language to test the behavior of your ML models Imagine you create an ML model to predict customer sentiment based on reviews. Upon deploying it, you realize that … Read more
A guide on how to make a customized commuter’s map of your home city using data science and Google Maps API Last year I got a job in Massachusetts at … Read more
Development of a machine learning model for materials planning in the supply chain In today’s rapidly changing environment, one of the most critical challenges facing companies is the ability to … Read more
Thanks to PCA’s sensitivity, it can be used to detect outliers in multivariate datasets. Photo by Andrew Ridley on Unsplash Principal Component Analysis (PCA) is a widely used technique for … Read more
Data analysis and visualization with Python, SQLAlchemy, and Bokeh Smart EQ Car, Image Source https://en.wikipedia.org/wiki/Smart_electric_drive When was the first electric car registered? (Spoiler: it was much earlier than most people … Read more
How to adjust CATE to consider costs associated with your treatments Photo by Remy Gieling on Unsplash Getting customers to come back to your business is hard. In the age … Read more
What happens and what you can/can’t do when merging evolutive DataFrames Photo by McDobbie Hu on Unsplash In the last post, we covered the transaction log and how to keep … Read more
Raw insights from IBM’s biggest transformation projects Legacy Waterfall and V-Models of software development rely on heavy upfront time investments to specify solutions, integrations or features, usually leading to extensive … Read more
Introduction to Predicting Treatment Effects on Recipients of Marketing Campaigns Image by geralt @ pixabay.com Imagine that you are responsible for selecting the customers in your company who should receive … Read more
Opinion The contribution of the AI Act and its worldwide impact Photo by DeepMind on Unsplash In mid-2021, the European Commission proposed a new EU regulatory framework on Artificial Intelligence … Read more
The basic structure of a Flask application to allow for modular development Photo by KOBU Agency on Unsplash A Data Science workflow often involves the use of notebooks and Python … Read more
How to train and validate ML models on multioutput datasets with minimal coding effort Photo by Victor Barrios on Unsplash Introduction The standard machine learning tasks everyone is familiar with … Read more
How to efficiently make your data ready-to-use Photo by Kelly Sikkema on Unsplash Those who have worked on real-life tasks with real-life data know that the bigger slice of pie … Read more
Opinion Having a hammer and seeing everything as a nail Photo by Sergei Chuyko on Canva Recently, I had an opportunity to meet with a data science team from a … Read more
Text everywhere! Since the Internet spread around the world, the amount of textual data we generate every day has become ginormous. For text messages sent every day alone, it is estimated that there … Read more
100+ new metrics since 2010 Image from Pixabay An evaluation with automatic metrics has the advantage of being faster, more reproducible, and cheaper than an evaluation conducted by humans. This … Read more
Use pandas aggregate to speed up your data analysis Photo by Dan Hadar on Unsplash Have you ever got a cramp typing same-same-but-different code to calculate various statistics on the … Read more
Prevent staying stuck at the junior level of Python programming Photo by Austin Distel on Unsplash Disclaimer: I’m not affiliated with any of the courses or people mentioned in this … Read more
Data curation in computer vision lacks standardization, leaving many practitioners unsure how to do it correctly. We summarized some of the most common approaches. Where to start with data curation? … Read more
Opinion Qualitative methods, data manipulation, and media sources — as well as a detailed look into how numbers can lie Photo by Laurenz Kleinheider on Unsplash Much of the craze … Read more
TUTORIAL Explore the mysteries of quantum computing! Source: Stable Diffusion. Whenever I learn a new programming language or framework, I often find myself trying to program a game as a … Read more