Michal Pitr

@mptr

I build code from scratch and write engineering blogs | k8s @ Google | University of Edinburgh https://open.substack.com/pub/michalpitr

22 followers · 12 following · 26 posts · joined 14.11.2024

Latest posts by Michal Pitr @mptr

PMPP is a nicely structured introduction to general purpose GPU programming.

GPUs are fundamentally different from CPUs. This impacts the type of workloads that benefit from them.

If you do pick this up, I recommend working through at least some of the exercises.

20.05.2025 13:31 👍 0 🔁 0 💬 0 📌 0

* My matmul optimization article

I spend more time on building the intuition for some of the optimizations.
lnkd.in/dRPZmZyM

13.05.2025 11:04 👍 0 🔁 0 💬 0 📌 0

These are some great resources to learn more:

* How to optimize GEMM

Nice step-by-step introduction by BLIS contributors.
lnkd.in/df6FdX8S

* BLISlab

Goes into much more depth than most introductory sources. Source of the illustration below.
lnkd.in/dHG6akFN

13.05.2025 11:04 👍 0 🔁 0 💬 1 📌 0

📦 Packing

Packing arranges sub-matrix data contiguously. This helps performance and can reduce cache conflicts with large matrices. I cover this more in my article.
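
A minimal sketch of the idea in C, assuming a row-major source matrix. The name `pack_block` and its parameters are illustrative, not taken from the article, and real BLAS-style libraries pack into micro-panel layouts that are more involved than this plain row-by-row copy.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a rows x cols block into dst so that it is contiguous in memory.
 * src points at the block's top-left element inside a larger row-major
 * matrix whose rows are ld elements apart (hypothetical helper). */
void pack_block(const float *src, size_t ld, size_t rows, size_t cols,
                float *dst)
{
    assert(src != NULL && dst != NULL);
    assert(cols <= ld); /* the block cannot be wider than a source row */

    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            dst[r * cols + c] = src[r * ld + c];
}
```

After packing, the inner loops walk `dst` with unit stride instead of jumping `ld` elements between rows.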

13.05.2025 11:04 👍 0 🔁 0 💬 1 📌 0

🧱 Tiling

A fast micro-kernel doesn't help much if it needs to wait for data to be loaded from RAM. Tiling breaks down large matrices into smaller, cache-friendly blocks. The goal is to maximize data reuse from caches.
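
A rough sketch of loop tiling in C, assuming square row-major matrices. The function name `matmul_tiled` and the tile size of 32 are assumptions made for illustration; a good tile size depends on the cache sizes of the target CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Tile edge length. Assumed value: tune it so the working set of
 * three tiles fits in the cache level you are targeting. */
#define TILE 32

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* C += A * B for row-major n x n matrices, processed tile by tile. */
void matmul_tiled(size_t n, const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE) {
                /* Clamp tile bounds so n need not be a multiple of TILE. */
                size_t i_end = min_size(ii + TILE, n);
                size_t k_end = min_size(kk + TILE, n);
                size_t j_end = min_size(jj + TILE, n);

                /* Multiply one tile of A with one tile of B. */
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        float a = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```

Each tile of A and B is reused across a whole tile of C before the loops move on, which is where the cache reuse comes from.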

13.05.2025 11:04 👍 0 🔁 0 💬 1 📌 0

⚙️ Optimized micro-kernel

BLAS libs often use small kernels (e.g., 8x4) optimized for specific CPUs using SIMD intrinsics and software prefetching. Many use handwritten assembly for optimal register allocation.
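
To show the shape of such a kernel, here is a sketch in portable C. Plain loops stand in for the SIMD intrinsics or assembly a real library would use, and the panel layout (`MR` values of A and `NR` values of B stored per step of the shared dimension) as well as the name `micro_kernel` are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MR 8 /* rows of C updated per call */
#define NR 4 /* columns of C updated per call */

/* C[MR x NR] += A_panel * B_panel.
 * a holds MR values for each of the k steps, b holds NR values for each
 * step, both packed contiguously. c is row-major with row stride ldc. */
void micro_kernel(size_t k, const float *a, const float *b,
                  float *c, size_t ldc)
{
    assert(a != NULL && b != NULL && c != NULL);
    assert(ldc >= NR);

    /* Small accumulator block that a compiler can keep in registers. */
    float acc[MR][NR] = {{0}};

    for (size_t p = 0; p < k; p++)
        for (size_t i = 0; i < MR; i++)
            for (size_t j = 0; j < NR; j++)
                acc[i][j] += a[p * MR + i] * b[p * NR + j];

    /* Write the accumulated block back to C once, at the end. */
    for (size_t i = 0; i < MR; i++)
        for (size_t j = 0; j < NR; j++)
            c[i * ldc + j] += acc[i][j];
}
```

The point of the structure is that C is read and written only once per call, while all the multiply-adds hit the small accumulator block.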

13.05.2025 11:04 👍 0 🔁 0 💬 1 📌 0

Matrix-matrix multiplication is a fundamental building block of many scientific and ML applications. What does it take to write an efficient one?
🧵

13.05.2025 11:04 👍 0 🔁 0 💬 1 📌 0

I've published a new article on CPU-based matrix multiplication optimizations in C.

We'll learn a few things about compilers, read some assembly, and learn about the underlying hardware.

michalpitr.substack.com/p/optimizing...

15.02.2025 20:12 👍 1 🔁 0 💬 0 📌 0
GitHub - kelseyhightower/kubernetes-the-hard-way: Bootstrap Kubernetes the hard way. No scripts. Contribute to kelseyhightower/kubernetes-the-hard-way development by creating an account on GitHub.

Consider checking it out: github.com/kelseyhighto...

It's always fun to see how people approach explaining technical topics. I really like Kelsey's concise style, but you might have to Google around for the initial infra setup.

05.01.2025 16:39 👍 0 🔁 0 💬 0 📌 0

📚 I'd probably skip KTHW until you are pretty comfortable with using K8s. But once you are confident and want to understand the cluster bootstrap process, this is a great place to start.

05.01.2025 16:39 👍 1 🔁 0 💬 1 📌 0

🔧 Things can appear to work but be broken. I made a mistake setting up the pod subnet for one worker node. This meant that pods on node-0 were reachable, but pods on node-1 weren't.

05.01.2025 16:39 👍 1 🔁 0 💬 1 📌 0

💻 It's easy to set up a few QEMU VMs on a local machine, connected by a virtual bridge. Each VM gets a static IP. This nicely mimics an on-prem setup.

💡 Linux networking is probably the toughest part, but it can be pretty rewarding to debug and understand.

05.01.2025 16:39 👍 1 🔁 0 💬 1 📌 0

Worked my way through Kubernetes The Hard Way yesterday.

Here are a few impressions from someone familiar with K8s internals but not so much with cluster administration:

05.01.2025 16:39 👍 0 🔁 0 💬 1 📌 0

Dev containers are real nice for C++ CUDA projects!

No chance of a mismatched GPU driver and CUDA toolkit, all dependencies auto-installed, a clean environment that's isolated from the host, and super easy to set up on a new machine.

Gonna be a mainstay for all of my future C++/CUDA projects.

03.01.2025 13:23 👍 0 🔁 0 💬 0 📌 0
From Scratch | Michal Pitr | Substack I build software from scratch and share my learnings! I especially enjoy topics around compilers, distributed systems, GPU programming, and Linux. Click to read From Scratch, by Michal Pitr, a Subst...

I love figuring out how things work under the hood. I build something like Docker or MapReduce from scratch, then write a post about it.

open.substack.com/pub/michalpitr

#promosky #promotionsky

14.12.2024 13:41 👍 1 🔁 0 💬 0 📌 0

Oh wow, got my first pledged subscription to my Substack, From Scratch.

Not planning to enable payments anytime soon, but it's great to see that folks enjoy my pragmatic engineering deep dives.

#substack #softwareengineering #programming

14.12.2024 13:37 👍 0 🔁 0 💬 1 📌 0
Linux container from scratch Let's build a minimal container step-by-step in a terminal

Feel free to check it out: open.substack.com/pub/michalpi...

09.12.2024 15:30 👍 0 🔁 0 💬 0 📌 0

My recent article on creating a container from scratch in a Linux terminal popped off and doubled my subscriber count!

09.12.2024 15:28 👍 1 🔁 0 💬 1 📌 0
How does SQLite store data? https://michalpitr.substack.com/p/how-does-sqlite-store-data

09.04.2024 16:07 👍 1 🔁 1 💬 0 📌 0
Linux container from scratch Let's build a minimal container step-by-step in a terminal

Do you understand how Linux containers work? Are cgroups and namespaces just magical black boxes? Are containers just "lightweight VMs"?

In my latest post, I create a Linux container, step-by-step, using just terminal commands!

#softwareengineering #linux #cloud

open.substack.com/pub/michalpi...

07.12.2024 18:17 👍 0 🔁 0 💬 0 📌 0

Sharing my code since talk is cheap: github.com/MichalPitr/a...

04.12.2024 22:35 👍 0 🔁 0 💬 0 📌 0
Day 3 - Advent of Code 2024

Really liked the day 3 #AdventOfCode problem! I approached it by writing a simple tokenizer that converts the input file into tokens (identifier, left paren, comma, right paren, number, ...), then looked for legal sequences. Good fun!
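
One way the tokenizer approach could look in C. This is a sketch, not the solution linked in the thread: the names (`Token`, `next_token`, `sum_mul`) are made up here, and it assumes a legal sequence is an identifier ending in `mul`, then `(`, a 1-3 digit number, `,`, a 1-3 digit number, and `)`, with nothing in between.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOK_IDENT, TOK_LPAREN, TOK_COMMA, TOK_RPAREN,
               TOK_NUMBER, TOK_OTHER, TOK_END } TokKind;

typedef struct { TokKind kind; const char *start; size_t len; } Token;

/* Read one token at *p and advance *p past it. */
static Token next_token(const char **p)
{
    const char *s = *p;
    Token t = { TOK_END, s, 0 };
    if (*s == '\0') return t;
    if (isalpha((unsigned char)*s)) {
        t.kind = TOK_IDENT;
        while (isalpha((unsigned char)s[t.len])) t.len++;
    } else if (isdigit((unsigned char)*s)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)s[t.len])) t.len++;
    } else {
        t.len = 1;
        t.kind = *s == '(' ? TOK_LPAREN : *s == ',' ? TOK_COMMA
               : *s == ')' ? TOK_RPAREN : TOK_OTHER;
    }
    *p = s + t.len;
    return t;
}

/* "xmul" counts too: only the letters right before '(' matter. */
static int ends_with_mul(Token t)
{
    return t.kind == TOK_IDENT && t.len >= 3 &&
           memcmp(t.start + t.len - 3, "mul", 3) == 0;
}

static int is_operand(Token t) { return t.kind == TOK_NUMBER && t.len <= 3; }

static long to_long(Token t)
{
    long v = 0;
    for (size_t i = 0; i < t.len; i++) v = v * 10 + (t.start[i] - '0');
    return v;
}

/* Sum X*Y over every legal "mul(X,Y)" token sequence in input. */
long sum_mul(const char *input)
{
    assert(input != NULL);
    Token w[6] = {{0}}; /* sliding window over the six latest tokens */
    size_t count = 0;
    long total = 0;

    for (;;) {
        Token t = next_token(&input);
        if (t.kind == TOK_END) break;
        memmove(w, w + 1, 5 * sizeof w[0]);
        w[5] = t;
        if (++count < 6) continue;
        if (ends_with_mul(w[0]) && w[1].kind == TOK_LPAREN &&
            is_operand(w[2]) && w[3].kind == TOK_COMMA &&
            is_operand(w[4]) && w[5].kind == TOK_RPAREN)
            total += to_long(w[2]) * to_long(w[4]);
    }
    return total;
}
```

Because whitespace and stray symbols become `TOK_OTHER` tokens, something like `mul ( 2 , 4 )` breaks the window and is rejected without any special casing.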

adventofcode.com/2024/day/3

03.12.2024 22:19 👍 2 🔁 0 💬 1 📌 0
GPU Programming Writing code for massively parallel processors

Ever wondered how to execute code on a GPU?

Wrote a short, hands-on post on how to write a simple function that executes on a GPU in CUDA C.

open.substack.com/pub/michalpi...

#softwareengineering #machinelearning #ai #gpgpu

28.11.2024 09:25 👍 1 🔁 0 💬 0 📌 0

Today the GKE team celebrated the launch of 65k node clusters!

My favorite bit is that the team had to replace etcd with Spanner to overcome scaling issues. ⚡

Curious to see these massive TPU/GPU training clusters in action!

26.11.2024 20:53 👍 1 🔁 0 💬 0 📌 0
Build Your Own Inference Engine: From Scratch to "7" Building a C++ Inference Engine from scratch

Crossed 300 subs on Substack!

My publication focuses on intuitive hands-on explanations of complex software topics - usually after I build something like an ML inference engine from scratch.

Why not check it out?

open.substack.com/pub/michalpi...

19.11.2024 20:37 👍 0 🔁 0 💬 0 📌 0
Primer on Linux container filesystems Building a container filesystem by hand

Wrote a practical article summarizing what I learned about Linux container filesystems by building a Docker clone from scratch!

We'll reverse engineer how Docker handles it, then discuss overlayfs, and finally use it to set up an Alpine-based container filesystem.

open.substack.com/pub/michalpi...

16.11.2024 17:42 👍 2 🔁 0 💬 0 📌 0
How does SQLite store data? What I learned by implementing (parts of) SQLite from scratch.

Ever wondered how SQLite stores data?

I wrote a hands-on post where I explain how SQLite represents data on disk. I manually navigate the B-Tree with a hex editor to find a specific entry.
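
For a taste of what that navigation involves, here is a small C sketch that reads two things from the start of a database file: the page size from the 100-byte file header, and the B-tree page header that follows it on page 1. The function and struct names are invented for this example, and it works on bytes already loaded into memory rather than opening a file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SQLite stores multi-byte integers big-endian. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Page size in bytes, or 0 if buf does not start with the SQLite
 * magic string. buf must hold at least the 100-byte file header. */
uint32_t sqlite_page_size(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    /* The magic string is 16 bytes including its terminating NUL. */
    if (len < 100 || memcmp(buf, "SQLite format 3", 16) != 0)
        return 0;
    uint16_t raw = be16(buf + 16);  /* page size lives at offset 16 */
    return raw == 1 ? 65536u : raw; /* the value 1 encodes 65536 */
}

typedef struct {
    uint8_t type;        /* 0x0d = table leaf, 0x05 = table interior */
    uint16_t cell_count; /* number of cells stored on the page */
} BtreeHeader;

/* On page 1 the B-tree page header starts right after the 100-byte
 * file header. page1 must hold at least 105 bytes. */
BtreeHeader first_page_btree_header(const uint8_t *page1)
{
    assert(page1 != NULL);
    BtreeHeader h;
    h.type = page1[100];
    h.cell_count = be16(page1 + 103);
    return h;
}
```

These are the same bytes you would look at first in a hex editor: offset 16 for the page size, then offset 100 to see what kind of B-tree page the root is.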

Come along!

open.substack.com/pub/michalpi...

14.11.2024 18:53 👍 0 🔁 0 💬 0 📌 0