Learning ML Through 30-Day Projects

A slight recalibration that has led to three potential projects and a 30-day countdown!

Jan 13, 2025

I had to recalibrate my first week.

Jeffrey Codes, a YouTuber, suggested that the amount of work required to do the math for ML is not quite as realistic as it would be if you were actually working in AI. He also highly recommended MathAcademy, because it makes you feel like you are accomplishing something.

But what really kicked me into gear was his recommendation to turn learning ML into an incremental 30-day project, so that I could gain knowledge from real-world activity and have something to show for what I learned over 30 days.

So, below is my redundant prompt to ChatGPT and the results. And here’s a quick hint: I’ve started on Project 2, which felt more challenging. Next week, I’ll share some insights on what I’ve accomplished.

Prompt

I realized that the approach as outlined in the roadmap for the next twelve months is designed to educate and improve my skills in ML -- but it pigeonholes my work into separate silos. And it requires context switching on a large scale. Instead - can you create three potential and cleverly unique projects I can select from, that I must complete within 30 days - that fit in the scope of the monthly roadmap outline you originally provided? These projects should define a clear process that will require me to build something, post it in GitHub, and learn from the various silos in the process. This would provide an improved learning experience that would be engaging and much more productive.

Response

Here are three potential, cleverly unique 30-day projects based on the outlined roadmap for Months 1-2. These projects are designed to integrate multiple silos—research, math, Python, and ML workflows—into cohesive learning experiences. Each project requires building something, posting it on GitHub, and reflecting on the process.

Project 1: Data Storytelling with Visual Insights

Objective: Analyze a real-world dataset to uncover patterns and trends, then create a visually compelling report that explains your findings.

What You’ll Learn:

Data manipulation with NumPy and Pandas.
Data visualization using Matplotlib and Seaborn.
Basic statistics and probability for meaningful insights.
Communication and storytelling with data.

Step-by-Step Plan:

Choose a Dataset:
- Use public datasets from Kaggle or UCI ML Repository (e.g., COVID trends, global happiness, or housing prices).
Data Exploration and Cleaning:
- Load the dataset into Python using Pandas.
- Clean the data: handle missing values, detect outliers, and standardize formats.
Analyze and Summarize:
- Calculate basic descriptive statistics: mean, median, mode, variance, and correlation.
- Identify patterns (e.g., time trends, group differences).
Create Visualizations:
- Build at least three compelling visualizations using Matplotlib or Seaborn.
- Examples: histograms, scatter plots, line charts, heatmaps.
Deliverables:
- Post your cleaned dataset, Python notebook, and visualizations on GitHub.
- Write a short README explaining your findings and learning process.
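
The cleaning and summary steps above could be sketched roughly as follows. This is a minimal illustration, not part of the original plan: the tiny housing table, its column names (`city`, `sqft`, `price`), and the 1.5 × IQR outlier rule are all assumptions chosen for the example. With a real Kaggle or UCI file you would load it with `pd.read_csv` instead.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a real downloaded CSV.
raw = pd.DataFrame({
    "city": ["Austin", "austin ", "Boston", "Boston", "Denver", "Denver"],
    "sqft": [1500, 1700, 900, np.nan, 2000, 2100],
    "price": [300_000, 340_000, 450_000, 480_000, 410_000, 5_000_000],
})


def clean(df):
    """Return a cleaned copy: tidy text, filled gaps, and an outlier flag."""
    out = df.copy()
    # Standardize formats: trim whitespace and unify capitalization.
    out["city"] = out["city"].str.strip().str.title()
    # Handle missing values: fill numeric gaps with the column median.
    out["sqft"] = out["sqft"].fillna(out["sqft"].median())
    # Detect outliers with the 1.5 * IQR rule (one common convention).
    q1, q3 = out["price"].quantile([0.25, 0.75])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    out["price_outlier"] = (out["price"] < low) | (out["price"] > high)
    return out


cleaned = clean(raw)

# Summarize only the rows that were not flagged as outliers.
typical = cleaned[~cleaned["price_outlier"]]
summary = typical["price"].agg(["mean", "median", "var"])
correlation = typical["sqft"].corr(typical["price"])
by_city = typical.groupby("city")["price"].mean()

print(summary)
print("sqft/price correlation:", correlation)
print(by_city)
```

From here, a call such as `typical.plot.scatter(x="sqft", y="price")` (which needs Matplotlib installed) would be one way to start on the visualization step.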

Project 2: Build a Simple Linear Regression Model

Objective: Create a linear regression model from scratch to predict a numerical outcome based on real-world data.

What You’ll Learn:

Linear algebra concepts (e.g., matrix operations).
Python implementation of regression models (using both libraries and manual computation).
Model evaluation techniques (e.g., RMSE, R²).

Step-by-Step Plan:

Choose a Dataset:
- Pick a dataset with numerical predictors and a target variable (e.g., Boston housing prices, car fuel efficiency).
Understand the Math:
- Brush up on the concept of linear regression, including:
  - The formula: y = mx + b
  - Loss functions and gradient descent (optional for depth).
Implement Linear Regression:
- Manually calculate coefficients using Python and NumPy (no libraries like Scikit-learn for now).
- Compare your implementation against Scikit-learn's LinearRegression.
Evaluate the Model:
- Split the data into training and testing sets.
- Evaluate model performance using RMSE, R², or MAE.
Deliverables:
- Post your Python script/notebook on GitHub.
- Include a README explaining:
  - The dataset.
  - How you implemented linear regression.
  - Lessons learned.
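
As a rough sketch of the plan above, the line y = mx + b can be fitted by hand with NumPy in two ways: the closed-form least-squares solution and plain gradient descent on mean squared error. Everything specific here is an assumption for illustration: the synthetic data (a true line of y = 3x + 2 plus noise), the 80/20 split, and the learning rate and step count. `np.polyfit` is used as a library cross-check; comparing against Scikit-learn's `LinearRegression` would follow the same idea.

```python
import numpy as np


def fit_line(x, y):
    """Closed-form least squares for y = m*x + b."""
    x_mean, y_mean = x.mean(), y.mean()
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b = y_mean - m * x_mean
    return m, b


def fit_line_gd(x, y, lr=0.01, steps=5000):
    """Same model, fitted by gradient descent on mean squared error."""
    m, b = 0.0, 0.0
    for _ in range(steps):
        error = (m * x + b) - y
        # Gradients of mean(error**2) with respect to m and b.
        m -= lr * 2 * np.mean(error * x)
        b -= lr * 2 * np.mean(error)
    return m, b


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)


# Synthetic data (an assumption for this sketch): y = 3x + 2 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=200)

# Shuffle the indices, then hold out the last 20% for testing.
idx = rng.permutation(len(x))
train_idx, test_idx = idx[:160], idx[160:]

m, b = fit_line(x[train_idx], y[train_idx])
m_gd, b_gd = fit_line_gd(x[train_idx], y[train_idx])
m_np, b_np = np.polyfit(x[train_idx], y[train_idx], 1)  # library cross-check

pred = m * x[test_idx] + b
print("closed form:     ", m, b)
print("gradient descent:", m_gd, b_gd)
print("np.polyfit:      ", m_np, b_np)
print("test RMSE:", rmse(y[test_idx], pred))
print("test R^2: ", r_squared(y[test_idx], pred))
```

The gradient descent version is slower and needs a learning rate small enough for the scale of `x`, but it is the one that generalizes to models without a closed-form solution.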

Project 3: Build a Simple "Common Sense AI" Chatbot

Objective: Create a simple chatbot that uses rules and basic probability to respond to user inputs (no ML models yet).

What You’ll Learn:

Python basics for handling input/output.
Probability and statistics for generating responses.
Data structures like dictionaries and arrays.

Step-by-Step Plan:

Set a Theme:
- Decide on the chatbot’s purpose (e.g., a weather bot, movie recommendation bot, or trivia bot).
Write a Rule-Based Engine:
- Use dictionaries to map user inputs to responses.
- Example: Input "weather today?" → Output "Sunny with a high of 70°F."
Incorporate Probability:
- For open-ended inputs, write a function that selects responses probabilistically.
- Example: "How are you?" → Responses: "I'm fine," "Doing great!" (weighted probabilities).
Extend the Bot:
- Allow the bot to learn one or two new responses during the session and save them for the next session (basic file handling).
Deliverables:
- Post the chatbot’s Python script on GitHub.
- Include a README explaining how it works and possible future improvements.
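
The three pieces of the plan above (a dictionary of rules, weighted random responses, and saving learned replies to a file) might look something like this. It is only a sketch under stated assumptions: the 0.7/0.3 weights, the JSON file format, the `learned.json` file name, and the helper names (`normalize`, `reply`, `load_learned`, `save_learned`) are all made up for illustration.

```python
import json
import random
import tempfile
from pathlib import Path

# Rule-based engine: normalized input -> one fixed response.
RULES = {
    "weather today?": "Sunny with a high of 70°F.",
}

# Open-ended inputs: (response, weight) pairs. The weights are arbitrary.
WEIGHTED = {
    "how are you?": [("I'm fine.", 0.7), ("Doing great!", 0.3)],
}


def normalize(text):
    """Lowercase and trim so 'Weather today? ' matches 'weather today?'."""
    return text.strip().lower()


def load_learned(path):
    """Load responses saved by an earlier session, or start empty."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def save_learned(path, learned):
    """Write learned responses to disk for the next session."""
    Path(path).write_text(json.dumps(learned, indent=2), encoding="utf-8")


def reply(text, learned, rng=random):
    """Return a response, or None if the bot has nothing to say."""
    key = normalize(text)
    if key in RULES:
        return RULES[key]
    if key in WEIGHTED:
        responses, weights = zip(*WEIGHTED[key])
        # random.choices draws according to the given weights.
        return rng.choices(responses, weights=weights, k=1)[0]
    return learned.get(key)


# Demo: teach the bot one answer, save it, and reload it as a "new session".
with tempfile.TemporaryDirectory() as tmp:
    memory_file = Path(tmp) / "learned.json"
    learned = load_learned(memory_file)  # first session: nothing saved yet
    learned[normalize("Capital of France?")] = "Paris."
    save_learned(memory_file, learned)
    reloaded = load_learned(memory_file)

print(reply("Weather today?", reloaded))
print(reply("How are you?", reloaded))
print(reply("capital of france?", reloaded))
```

In a real script the demo would be replaced by a loop around `input()` that asks the user for an answer whenever `reply` returns `None`, and the memory file would live next to the script rather than in a temporary directory.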

How to Choose the Right Project

Choose Project 1 (Data Storytelling) if you enjoy working with data and want to build foundational data analysis skills.
Choose Project 2 (Linear Regression) if you’re eager to dive into math and core ML concepts.
Choose Project 3 (Chatbot) if you prefer creative programming and want an engaging introduction to how AI systems interact with users.

Each project has clear deliverables and integrates key components of your learning roadmap into a practical, engaging task. Let me know which one resonates most with you or if you'd like additional details!

The Final Algorithm
