To Data & Beyond


Why Your Machine Learning Projects Won’t Land You a Job

The 5 levels of ML projects

Sep 09, 2025


This article was written by Marina Wyss. If you like her writing, consider subscribing to her newsletter.

Want to know what separates entry-level machine learning projects from the systems powering companies like Google and Amazon? The gap might seem impossibly wide, but there’s actually a clear progression most ML practitioners follow.

Today, I’m mapping out the five levels of machine learning projects that separate complete beginners from industry leaders. By the end of this post, you’ll understand exactly where you are on this journey and what specific skills you need to reach the next level.

Many aspiring ML Engineers get stuck building the wrong types of projects that never actually land them jobs. I’ll show you exactly what level of project you need for different roles — from entry-level positions to research teams at top AI companies.



LEVEL 1: ENTRY-LEVEL DATA ANALYSIS


Let’s start at the beginning. Level 1 is where every journey begins — working with clean, structured datasets in a Jupyter notebook on your laptop.

At this level, you’re downloading pre-cleaned datasets from sources like Kaggle. You’ll import libraries such as pandas for data manipulation, use matplotlib or seaborn — and maybe even Plotly for interactive visualizations — and experiment with scikit-learn to train basic models like linear regression or logistic regression.

A typical project might look like this:

  • Load a CSV file into a DataFrame.

  • Spend time on exploratory data analysis (EDA) with simple visualizations.

  • Handle missing values by dropping them or filling them with means.

  • Encode categorical features using one-hot encoding.

  • Train a model using default parameters.

  • Evaluate with basic metrics like accuracy.

All of this happens in notebooks where you mix code, comments, and visualizations — which is perfect for learning and getting immediate feedback.
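The steps listed above can be sketched in a few lines. This is only a minimal illustration, assuming pandas and scikit-learn are installed; the tiny inline dataset, its column names (`age`, `plan`, `monthly_spend`, `churned`), and the split settings are made up for the example and stand in for whatever CSV you actually download.

```python
from io import StringIO

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Made-up rows standing in for a downloaded CSV file (hypothetical columns).
CSV_TEXT = """age,plan,monthly_spend,churned
25,basic,20.0,1
47,premium,80.5,0
31,basic,,1
52,premium,95.0,0
23,basic,18.5,1
,standard,45.0,0
38,standard,50.0,0
29,basic,22.0,1
61,premium,99.0,0
35,standard,,0
27,basic,19.0,1
44,premium,85.0,0
33,standard,40.0,1
58,premium,90.0,0
26,basic,21.0,1
49,standard,55.0,0
"""

# 1. Load the CSV into a DataFrame.
df = pd.read_csv(StringIO(CSV_TEXT))

# 2. Quick EDA: summary statistics and missing-value counts per column.
print(df.describe())
print(df.isna().sum())

# 3. Handle missing numeric values by filling them with column means.
df = df.fillna(df.mean(numeric_only=True))

# 4. One-hot encode the categorical feature.
features = pd.get_dummies(df.drop(columns="churned"), columns=["plan"], dtype=float)
target = df["churned"]

# 5. Hold out a test set and train a model with default parameters.
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.25, random_state=0, stratify=target
)
model = LogisticRegression()
model.fit(X_train, y_train)

# 6. Evaluate with a basic metric.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"accuracy: {accuracy:.2f}")
```

Note that the means in step 3 are computed on the full dataset before the split, so the test rows influence the imputation. That shortcut is typical of Level 1 work and is a mild case of the data leakage that later levels learn to avoid.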


But, as we all know, these little projects are a far cry from real-world ML applications. Your pristine Kaggle datasets rarely have the messy issues of real data, and you’re not yet thinking about data leakage, sophisticated data imputation, scalability, or literally dozens of other considerations.

When you start feeling limited by these boundaries, it’s time to move on to Level 2.


LEVEL 2: STRUCTURED ML PROJECTS


At Level 2, things get more interesting, and a little more challenging. You’re now working with messier, more realistic data and structuring your projects the way a professional data scientist would, rather than leaving them as messy experiments in notebooks.

Your tools and workflow have evolved in the following ways:

  • You’re moving from a single notebook to a well-organized Python project with separate modules for data processing, feature engineering, model training, and evaluation.

  • You use Git for version control, and you’re creating configuration files to keep experiments reproducible.

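  • Instead of defaulting to a single random split, you’re using train/validation/test splits that match the data, such as walk-forward validation for time-series data, where each model is evaluated only on observations that come after its training window.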

  • You’re tackling issues like class imbalance with techniques such as SMOTE or class weighting, and you’re applying modern feature engineering tools.

  • You might be using more interesting models like LightGBM, simple neural networks, or even AI APIs.

  • You’re thinking about hyperparameter tuning and maybe even experimenting with more advanced options like Bayesian optimization.

  • You might also be orchestrating a simple pipeline with a tool like Prefect.
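Two of the ideas in this list, walk-forward validation and class weighting, are easy to sketch without any library. The helper names below (`walk_forward_splits`, `balanced_class_weights`) are hypothetical and written only for illustration. The weighting formula, n_samples / (n_classes × class_count), is the common "balanced" heuristic, and the expanding-window split is one of several possible walk-forward schemes.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterator, Sequence


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges for time-ordered data.

    The training window expands over time, and every test window
    starts right after the training window ends, so the model is
    never evaluated on data older than what it was trained on.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = min_train + fold * fold_size
        yield range(0, train_end), range(train_end, train_end + fold_size)


def balanced_class_weights(labels: Sequence[int]) -> dict[int, float]:
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Ten time-ordered rows, three folds, at least four rows of history.
splits = [(list(train), list(test)) for train, test in walk_forward_splits(10, 3, 4)]
for train_idx, test_idx in splits:
    print(f"train on rows 0..{train_idx[-1]}, test on rows {test_idx}")

# Eight negatives and two positives: the rare class gets the larger weight.
weights = balanced_class_weights([0] * 8 + [1] * 2)
print(weights)
```

In practice you would probably reach for an existing implementation (scikit-learn ships a time-series splitter and a balanced class-weight option), but writing the logic out once makes it clear what those tools are doing.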

Imagine a typical Level 2 project. This could be something like:

  • Building a customer churn prediction model using data from multiple sources like transaction records, support interactions, and usage logs.

  • Handling imbalanced classes and performing feature selection to identify the most predictive variables.

  • And evaluating your model using precision-recall curves, ROC curves, and business-specific metrics.
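To make the evaluation step concrete, here is a small sketch of how precision, recall, and a business-specific metric can be compared across decision thresholds for a churn model. Everything in it is an assumption for illustration: the labels and scores are invented, the function names are hypothetical, and the cost model (a fixed cost per retention offer, a fixed loss per missed churner, and offers that always work) is deliberately simplistic.

```python
def confusion_counts(y_true, scores, threshold):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_churn = score >= threshold
        if predicted_churn and label == 1:
            tp += 1
        elif predicted_churn and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, scores, threshold):
    """Return (precision, recall); precision is 1.0 when nothing is flagged."""
    tp, fp, fn, _ = confusion_counts(y_true, scores, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def retention_cost(y_true, scores, threshold, offer_cost=10.0, missed_churn_cost=100.0):
    """Hypothetical business metric: offers sent plus churners missed."""
    tp, fp, fn, _ = confusion_counts(y_true, scores, threshold)
    return offer_cost * (tp + fp) + missed_churn_cost * fn


# Invented labels (1 = churned) and model scores for eight customers.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.35, 0.2, 0.1]

thresholds = [0.3, 0.5, 0.7]
for t in thresholds:
    p, r = precision_recall(y_true, scores, t)
    cost = retention_cost(y_true, scores, t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f} cost={cost:.0f}")

best_threshold = min(thresholds, key=lambda t: retention_cost(y_true, scores, t))
print("lowest-cost threshold:", best_threshold)
```

With these made-up costs, the lowest threshold wins even though its precision is the worst of the three, because missing a churner is ten times as expensive as sending an unnecessary offer. That is the kind of trade-off a business-specific metric surfaces and plain accuracy hides.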


This is the stage where your work becomes structured and robust. But when your manager or client says, “Great model! When can we use this?” you quickly realize there’s a whole world of production challenges waiting for you. That’s when it’s time for Level 3.


LEVEL 3: PRODUCTION-READY ML

Level 3 is the transformation from pure data science to the world of machine learning engineering — where your models have to work in production, serve real users, and drive business outcomes.


© 2026 Youssef Hosni