MachineLearningMastery.com

Making developers awesome at machine learning

Training a Model on Multiple GPUs with Data Parallelism

This article is divided into two parts: Data Parallelism and Distributed Data Parallelism. If you have multiple GPUs, you can combine them...
Posted on 26 December 2025 | 6:44 am

Train a Model Faster with torch.compile and Gradient Accumulation

This article is divided into two parts; they are: • Using `torch.
Posted on 25 December 2025 | 4:44 pm

Training a Model with Limited Memory using Mixed Precision and Gradient Checkpointing

This article is divided into three parts: Floating-point Numbers, Automatic Mixed Precision Training, and Gradient Checkpointing. Let's ge...
Posted on 24 December 2025 | 5:43 pm

Practical Agentic Coding with Google Jules

If you have an interest in agentic coding, there's a pretty good chance you've heard of
Posted on 24 December 2025 | 3:13 pm

Evaluating Perplexity on Language Models

This article is divided into two parts: What Is Perplexity and How to Compute It, and Evaluate the Perplexity of a Language Model with Hel...
Posted on 23 December 2025 | 4:44 pm

3 Smart Ways to Encode Categorical Features for Machine Learning

If you spend any time working with real-world data, you quickly realize that not everything comes in neat, clean numbers.
Posted on 22 December 2025 | 3:59 pm

Pretraining a Llama Model on Your Local GPU

This article is divided into three parts: Training a Tokenizer with Special Tokens, Preparing the Training Data, and Running the Pretrain...
Posted on 22 December 2025 | 4:27 am

Rotary Position Embeddings for Long Context Length

This article is divided into two parts: Simple RoPE and RoPE for Long Context Length. Compared to the sinusoidal position embeddings in th...
Posted on 20 December 2025 | 3:51 pm

How to Fine-Tune a Local Mistral or Llama 3 Model on Your Own Dataset

Large language models (LLMs) like Mistral 7B and Llama 3 8B have shaken the AI field, but their broad nature limits their application to specialize...
Posted on 19 December 2025 | 9:00 am

5 Agentic Coding Tips & Tricks

Agentic coding only feels "smart" when it ships correct diffs, passes tests, and leaves a paper trail you can trust.
Posted on 18 December 2025 | 3:40 pm