Feature Engineering: A beginner’s guide
Turn messy real-world data into powerful machine learning features. This post shows step-by-step feature engineering: cleaning raw CSVs, crafting meaningful variables, and fixing weak inputs so your data science models actually perform.
How do you differentiate code?
This post shows how to answer “what if” questions from stakeholders using model sensitivity, scenario analysis, and data science, going beyond theory to make machine learning results clear and business-friendly.
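As a rough illustration of the "what if" idea, here is a minimal sketch that perturbs one input of a model and measures how much the output moves. The `weekly_sales` model, its coefficients, and the input names are all hypothetical, invented for this example, and are not taken from the post.

```python
def sensitivity(model, inputs, name, step=1e-6):
    """Central finite-difference estimate of how the output changes per unit of inputs[name]."""
    up = dict(inputs, **{name: inputs[name] + step})
    down = dict(inputs, **{name: inputs[name] - step})
    return (model(up) - model(down)) / (2 * step)


def weekly_sales(x):
    """Hypothetical model: made-up coefficients, purely for illustration."""
    return 200 + 3.0 * x["ad_spend"] - 12.0 * x["price"]


baseline = {"ad_spend": 50.0, "price": 9.99}

# Sensitivity: how many units of sales change per unit change in price?
price_effect = sensitivity(weekly_sales, baseline, "price")

# Scenario analysis: what if we cut the price by 1.00?
scenario = dict(baseline, price=8.99)
scenario_change = weekly_sales(scenario) - weekly_sales(baseline)

print(f"Sales change per unit of price: {price_effect:.2f}")
print(f"Sales change if price drops to 8.99: {scenario_change:.2f}")
```

The same `sensitivity` helper works with any function that takes a dictionary of inputs, so a trained model's predict step could stand in for `weekly_sales`.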
Live Data Science Walkthrough: Making Google Trends Data Usable
Google Trends looks perfect for granular time-series analysis until you download it. This post reveals the hidden aggregation problem in Google Trends data and shows how to rebuild higher-frequency signals for machine learning and forecasting.
Domain Knowledge: The Machine Learning Unlock
Discover why predicting love with data science is a nightmare. This blog dives into a failed Valentine’s linear regression experiment, exposing bias, messy variables, and why domain knowledge matters more than ever in real-world machine learning and AI.
Dijkstra’s Algorithm Tutorial with The Simpsons
Ever wondered how route planning algorithms power Google Maps, LinkedIn suggestions, and Amazon drone delivery? This post explains graph algorithms and optimization through Bart’s trick-or-treating route in Springfield, making complex data science easy to understand.
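For a flavour of the algorithm in the title, here is a minimal sketch of Dijkstra's shortest-path search using Python's `heapq`. The stops and walking times are made up for this example and are not the data used in the post.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]  # (distance so far, node)
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path to this node was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return dist


# Hypothetical walking times in minutes between made-up stops.
springfield = {
    "home": {"flanders": 1, "school": 7},
    "flanders": {"home": 1, "school": 4, "kwik_e_mart": 9},
    "school": {"home": 7, "flanders": 4, "kwik_e_mart": 3},
    "kwik_e_mart": {"flanders": 9, "school": 3},
}

distances = dijkstra(springfield, "home")
for stop, minutes in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{stop}: {minutes} min")
```

Note that this version assumes non-negative edge weights, which is the setting Dijkstra's algorithm is designed for.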
What Is Incremental Computing? The Data Science Game Changer
Incremental computing is transforming data science. Instead of rerunning massive models from scratch, this breakthrough lets data scientists update results instantly, saving time, cutting costs, and reducing waste. Discover how smarter, sustainable computation is reshaping analytics.
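One very small illustration of the idea, and only an illustration: keeping a running mean up to date as new values arrive instead of recomputing it over the full history. The `RunningMean` class and the sample values are hypothetical. Real incremental computing systems apply the same principle to far more complex pipelines.

```python
class RunningMean:
    """Maintain a mean that is updated in O(1) per new value."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Shift the old mean towards the new value by 1/count of the gap,
        # so the full history never has to be re-read.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


values = [2, 4, 6, 8]  # made-up data
running = RunningMean()
for v in values:
    running.update(v)

from_scratch = sum(values) / len(values)  # the "rerun everything" approach
print(running.mean, from_scratch)
```

Both approaches give the same answer here. The difference is that the incremental version does constant work per new value, whereas recomputing from scratch grows with the size of the history.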
How to Actually Use ChatGPT
Get a crash course in Large Language Models (LLMs) from a data scientist’s perspective. This blog breaks down how LLMs like ChatGPT actually work, cutting through jargon to explain AI, machine learning, and natural language processing in simple, practical terms.
WTF is a data scientist?
What do data scientists really do? This no-fluff guide explores the real world of data science, AI, and analytics, from cleaning messy data to building models and explaining insights. Discover how coding, math, and storytelling power AI, Netflix, Spotify, and everyday decisions.
The 12 Days of Data Science
Discover our “12 Days of Data Science” series! Using the Twelve Days of Christmas theme, we explain core data science concepts like machine learning, network analysis, fraud detection, and forecasting in a fun, accessible way.
Can Data Science Create the Next Christmas Hit?
Can data science and machine learning craft the perfect Christmas song? Using Spotify data, music analytics, TF-IDF, and Elastic Net regression, we reveal the secrets behind hit festive tracks, exploring lyrics, sentiment, BPM, and danceability to create a data-driven Christmas classic.
Is Die Hard A Christmas Movie?
You’ve heard this argument a million times. Some people are like, “It’s set at Christmas, there are carols, of course it’s a Christmas movie.” Other people: “It’s just an action film with tinsel, calm down.” So instead of screaming on the internet for the rest of my life, I did the only sensible thing: I built a model.
In this video I’m going to show you how I trained a simple Christmas-movie classifier using soundtrack data and basic movie metadata, and then used it to settle the age-old internet debate so we never have to talk about it again. We’re literally going to turn the Die Hard argument into machine learning.
So, what was the actual goal here? I wasn’t trying to build the biggest, fanciest AI in the world. The question was much pettier than that: can I take the stuff people always bring up in the Die Hard argument and turn it into numbers, train a model on a bunch of movies, and ask it, “Okay, based on everything you’ve seen… what do you think about Die Hard?”
I Saved The Simpsons’ Christmas Using the Traveling Salesman Problem
Santa’s running out of time in Springfield. We test Brute Force, Held-Karp, and Greedy algorithms to solve the Traveling Salesman Problem and save Christmas.
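To make the comparison concrete, here is a minimal sketch of the three approaches named above on a tiny, made-up distance matrix. The distances are hypothetical and the greedy variant shown is the nearest-neighbour heuristic, which is an assumption about what the post means by "Greedy".

```python
from itertools import combinations, permutations


def tour_length(dist, order):
    """Length of the closed tour that visits `order` and returns to the start."""
    return sum(dist[a][b] for a, b in zip(order, order[1:] + order[:1]))


def brute_force(dist):
    """Try every ordering of the stops after stop 0: exact, but O(n!)."""
    n = len(dist)
    return min(tour_length(dist, [0, *p]) for p in permutations(range(1, n)))


def held_karp(dist):
    """Dynamic programming over subsets of stops: exact, O(n^2 * 2^n)."""
    n = len(dist)
    # cost[(subset, j)]: cheapest path from stop 0 through all of subset, ending at j
    cost = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in map(frozenset, combinations(range(1, n), size)):
            for j in subset:
                rest = subset - {j}
                cost[(subset, j)] = min(cost[(rest, k)] + dist[k][j] for k in rest)
    everything = frozenset(range(1, n))
    return min(cost[(everything, j)] + dist[j][0] for j in range(1, n))


def greedy(dist):
    """Nearest-neighbour heuristic: fast, but not guaranteed to be optimal."""
    order, unvisited = [0], set(range(1, len(dist)))
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist[order[-1]][j])
        order.append(nxt)
        unvisited.remove(nxt)
    return tour_length(dist, order)


# Hypothetical symmetric distances between four stops (stop 0 is the start).
distances = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

print("Brute force:", brute_force(distances))
print("Held-Karp:  ", held_karp(distances))
print("Greedy:     ", greedy(distances))
```

The two exact methods always agree with each other. The greedy result can only match or exceed them, and the gap tends to show up as the number of stops grows.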
Is Black Friday back with a new Switch 2?
Black Friday is back. But are we still that into it, or is that just leftover lockdown energy? Remember 2020, when Switches disappeared in minutes and we were flexing our Animal Crossing islands (and yes, the fruit absolutely mattered)? With a new Switch on the horizon, is Black Friday still the best time to buy, or just a myth we love to repeat? I’m pulling some Google Trends data to see if 2025 can beat peak lockdown.
The Magic of Data Science: How I Turned Art Into Real Data
As data scientists, people think we’re magicians. When they have a problem to solve, they’ll just bring it to us and we’ll wave our magic wand and suddenly they’ll have the perfect model for what they need. Oh, what’s that? You need data to be able to do that? Wave that magic wand again and I’m sure it’ll appear.
I Stole a Wall Street Trick for Data Science
Ever wondered why comparing Google Trends across countries feels impossible? I stumbled across this problem when I wanted to use Google Trends data to understand what it is that drives people.
I thought it’d make an interesting video on multilinear regression: I’d grab the Google Trends data for motivation and some other clever terms, and off we’d go. Boy, was I wrong.
I Used Maths to Prove Google Trends SUCKS!
What motivates people? Is it passion? Purpose? A good cup of tea?
Having something to queue for? The promise of a Sunday roast? The fear of disappointing your Nan?
I asked my colleague Caroline what motivates people in Brazil. Her answer? Football. Alright, but I'm English, I've been singing "It's Coming Home" since birth. She then suggested beaches and carnivals. That's the differentiating factor I'm looking for. We don't have the weather for those.
It's not all cultural stereotypes. I made a cup of tea as a prop for the video, and I don't drink it but SHE does. (Don't let the King find out, or we're both in trouble.) So if what motivates people depends a lot on where you're from, there's something interesting there. Maybe this basket will help?
Using Heart Rate Data to predict the best jump scare
It's spooky season. That gave me a terrible, wonderful idea: let’s make my colleagues watch jump-scare scenes while we track their heartbeats. For research. And maybe for laughs. We’ve got sensors, timestamps, and a playlist designed to ruin office friendships. Let’s find out who screams first and what the data says.
AI horror is already here!
Artificial intelligence is among us whether we choose it or not: it’s in our homes, on our phones and in our daily lives. But in films, AI is less Siri and more serial killer. From murderous robots to Big Brother-style overlords, horror loves to imagine what could go wrong with this type of technology, sparking a new, or perhaps an existing, fear in all of us. But how much of this is just popcorn fiction, and how much could actually happen?
Saving Halloween with Data Science (Starring Bart Simpson)
It's spoooooky season, which means warm jumpers, pumpkin spice and the terrifying treachery of The Simpsons' Treehouse of Horror! And Bart's pretty good at coming up with spooky stories at this point, but is he any good at trick-or-treating? Let's use data science to find out.
Creating the PERFECT horror movie!
What actually makes a horror movie good? Is it how many times you jump out of your chair or peek through your fingers? The plot? That creeping sound that grabs you from behind?
Is it what the Academy says… or what we feel in the dark? This way, you’ll know exactly why The Ring chills differently than Jaws, and what levers directors pull to make you want to look away.
Let’s test it with data science.
We’ll build a dataset of jump-scare counts, pacing, runtime, jump-scare intensity, and critic and audience scores, then turn it into a clear, testable recipe for the perfect horror movie.
Less guessing. More evidence.
Let’s do some Evil Work.