I just spent way too much money on books.
In a field that moves as fast as AI and machine learning, it's really tempting to think you can learn everything from blog posts, YouTube videos, and whatever the algorithm serves you. For staying current on news, that's fine. But for actually understanding things deeply, nothing beats sitting down with a well-structured book and working through it.
So I went through my to-read list and bought the 10 books that would actually address specific holes in my skills. Today I'm sharing them with you, including why I picked each one.
Let's get into it.
The first category is what I'm calling "fill in the gaps" books. These are foundational topics I either skipped over or have learned piecemeal on the job.
Book 1 — Computer Systems: A Programmer's Perspective
I have largely self-taught CS skills. I didn't do a traditional CS degree, and while that hasn't held me back career-wise, there are foundational concepts I'm shakier on than I'd like to be.
This book — Computer Systems: A Programmer's Perspective — is basically THE undergraduate systems textbook. It's used at Carnegie Mellon, Stanford, and dozens of other CS programs. It covers everything from data representation and machine-level code to memory hierarchy, linking, and virtual memory.
What I'm hoping to get from it is a more detailed understanding of performance. Right now when I optimize code, I'm often doing things I only have a high-level understanding of. I want to actually understand what's happening at the hardware level so I can reason about it properly instead of just trying a bunch of things until something works.
But understanding the hardware is only part of the picture. The other foundational piece I've been patching together on the job is data engineering.
Book 2 — Fundamentals of Data Engineering
Data is foundational to every AI/ML system. You can have the most sophisticated model architecture in the world, but if your data pipelines suck, so will your results.
I learned most of what I know about data engineering from watching the data engineers on my team, getting feedback when I messed things up, and debugging pipeline failures. It's been largely reactive and ad hoc. I can build a pipeline that works and has guardrails, but I'm sure I could be designing them to be more robust and efficient.
This book (Fundamentals of Data Engineering) covers the full data engineering lifecycle — generation, storage, ingestion, transformation, and serving — and importantly, it covers how to think about these systems, not just how to implement them. I want to be more strategic about data architecture decisions upfront rather than just fixing problems as they emerge.
Of course, well-designed data pipelines are only as good as the code that implements them. Which brings me to the next book.
Book 3 — High Performance Python
Here's a sad fact: like many of us, in the era of AI I'm getting worse at actually writing code. I rely on Claude to generate much of what I write, and while the output is often correct, I'm getting lazy about making sure it's actually as well-written as it could be.
High Performance Python focuses on (wait for it) Python performance: profiling, understanding the GIL, using multiprocessing and multithreading effectively, Cython, and all the ways you can make Python not be slow. It also covers high-data-volume programs specifically, which is obviously relevant for ML work.
I'm hoping this helps me not only avoid forgetting how to code, but actually level up.
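To make the profiling idea concrete, here's a minimal sketch of the "measure before you optimize" habit using the standard library's cProfile and pstats modules. The function names (slow_sum_of_squares, generator_sum_of_squares, profile_call) are made up for illustration and aren't taken from the book.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Naive version: builds an intermediate list, then sums it."""
    squares = []
    for i in range(n):
        squares.append(i * i)
    return sum(squares)


def generator_sum_of_squares(n):
    """Same result using a generator, with no intermediate list."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report).

    The report is sorted by cumulative time and limited to the top 5 rows.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    for func in (slow_sum_of_squares, generator_sum_of_squares):
        result, report = profile_call(func, 200_000)
        print(func.__name__, result)
        print(report)
```

The point is to compare the two reports rather than guess which version is faster. The numbers will vary by machine and Python version.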
Book 4 — Generative AI Design Patterns
I loved Chip Huyen's AI Engineering book, but it's fairly high-level and things have moved fast since it came out. I wanted something more tactical — specific patterns I can apply when I hit specific problems.
Generative AI Design Patterns covers 32 design patterns for production GenAI systems. Each pattern addresses a specific challenge: hallucinations, nondeterministic outputs, knowledge cutoffs, building reliable agents, optimizing for latency and cost. And each one includes code examples and discusses the tradeoffs involved.
I know how I build AI systems, and I read papers and blog posts from other companies, but everyone's approach is slightly different. So I think seeing some formalized design patterns will be really helpful for my day-to-day.
That's the software side of building AI systems. But what about the hardware they actually run on?
Book 5 — Hands-On GPU Programming with Python and CUDA
I use GPUs constantly in my work. But if I'm being honest, I treat them like black boxes. I know PyTorch abstracts away the GPU stuff, and I know roughly that parallelization is happening, but I don't really understand what's going on at the CUDA level.
This book — Hands-On GPU Programming with Python and CUDA — covers PyCUDA, scikit-cuda, profiling with Nsight, and the actual CUDA libraries like cuBLAS and cuFFT. It goes all the way down to writing GPU kernels and device functions in CUDA C, then back up to applying this to data science problems.
What I'm hoping to get is the ability to actually debug and optimize GPU code when something's slow, rather than just hoping the framework handles it. As models get bigger and compute costs matter more, this kind of low-level understanding becomes increasingly valuable — especially in my role as an Applied Scientist.
That book is about optimizing for the GPUs you have. This next one is about what happens when you don't have much compute at all.
Book 6 — Practical Deep Learning for Cloud, Mobile, and Edge
I've built plenty of production deep learning systems, but they've all had essentially unlimited compute available. I've never had to think that hard about model size, inference latency on constrained devices, or deployment to mobile or edge environments.
This book (Practical Deep Learning for Cloud, Mobile, and Edge) covers real-world deployment across the full spectrum: cloud, mobile, browsers, and edge devices like Raspberry Pi and NVIDIA Jetson.
What drew me to this specifically is that mobile and edge deployment is becoming more important as AI moves to the device level. I don't want this to be a blind spot in my career.
Once you're deploying models to production, no matter where that is, there's another consideration that I think a lot of AI/ML people underestimate: security.
Book 7 — The Developer's Playbook for Large Language Model Security
This one is about not being the person who causes a major security incident at work.
I only know about LLM security from scattered mentions in other resources and my own ML security intuition. But prompt injection, data poisoning, and supply chain attacks are becoming real problems as these systems get deployed more widely. I'd rather learn this stuff proactively than after something goes wrong.
This book was written by Steve Wilson, who led the OWASP Top 10 for LLM Applications project — a comprehensive security vulnerability list built by over 400 industry experts. It covers the actual attack vectors, explains the mechanisms behind them, and provides defensive strategies. What I like is that it's specifically about LLMs, not generic AI security. The threat model is different, and this book is focused on exactly the systems I'm building.
Speaking of models, we're taking a detour with this next book.
Book 8 — Models of the Mind: How Physics, Engineering and Mathematics Have Shaped Our Understanding of the Brain
This book (Models of the Mind) is different from the others. It's not about building things; it's about understanding where the ideas behind neural networks actually came from.
Grace Lindsay is a computational neuroscientist at NYU, and she traces how mathematical models have helped us understand the brain — from individual neurons up through memory, perception, movement, and decision-making. The book covers information theory, network theory, Bayesian inference, and builds up to the artificial neural networks that underpin modern AI.
What I find fascinating is that AI and neuroscience have influenced each other in both directions. Convolutional neural networks were inspired by the visual cortex, but then scaled-up CNNs started modeling the visual cortex better than we expected. I don't expect this one to have a specific impact on my career. It's just neat.
Alright, back to books with more direct career implications.
Book 9 — The Staff Engineer's Path
I'm currently a Senior Applied Scientist (L6) at Twitch, and I'm ambitious about where I want to go next in my career. The challenge is that there aren't many Principal-level (L7) Applied Scientists at my company to learn from directly. The path is less well-defined than it was getting to Senior.
The Staff Engineer's Path by Tanya Reilly covers how to navigate growth as a senior individual contributor — thinking strategically, leading projects without formal authority, building technical vision, and dealing with the ambiguity that comes at higher levels. It's specifically written for ICs who want to grow without moving into management. So, perfect for me!
And finally, what's the point of reading all these books if I don't actually remember anything?
Book 10 — Make It Stick: The Science of Successful Learning
This last book, Make It Stick, synthesizes cognitive psychology research on how learning actually works. The core finding is that most common study habits (e.g., highlighting, rereading, cramming) create the illusion of mastery but don't produce durable learning. What actually works is self-testing, spaced repetition, interleaving different topics, and embracing difficulty rather than avoiding it.
I'm including this because I spend a huge amount of time learning, and also teaching others how to learn effectively. I want to be more intentional about applying evidence-based learning techniques to everything else on this list.
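If you're curious what spaced repetition plus self-testing might look like in practice, here's a rough sketch of a tiny flashcard scheduler. The doubling rule (double the interval when you recall a card, reset to one day when you don't) is my own simplifying assumption for illustration, not a schedule the book prescribes, and the Card, review, and due_cards names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    prompt: str
    due: date
    interval_days: int = 1


def review(card, recalled, today):
    """Return an updated card after one self-test.

    Assumed rule: double the interval on success, reset to 1 day on failure.
    """
    interval = card.interval_days * 2 if recalled else 1
    return Card(card.prompt, today + timedelta(days=interval), interval)


def due_cards(cards, today):
    """Return the cards whose due date is today or earlier."""
    return [card for card in cards if card.due <= today]


if __name__ == "__main__":
    today = date(2025, 1, 1)
    card = Card("What does the GIL prevent?", due=today)
    card = review(card, recalled=True, today=today)
    print(card)
```

Mixing cards from several of the books into one deck would also cover the interleaving part.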
Reality Check
Am I really going to read 10 books while working full-time and creating content?
Yes, I think so. Some of these might end up being reference books rather than cover-to-cover reads, but that's fine as long as I'm learning. If you want more on how I think about time management and productivity, I cover it in my newsletter.
And if you want a breakdown of the best AI and ML books for beginners, let me know in the comments and I'll put that together.
— — —
If you're feeling like you need some support with your AI/ML career, here are some ways I can help:
- If you'd like to chat 1:1, you can book a call with me here.
- Subscribe to my YouTube channel for weekly videos on technical topics, interviewing strategies, and more.
- Subscribe to my newsletter for a weekly post on a mix of technical topics and mindset/motivation for challenging fields.