Really nice introduction. Two things stood out to me that I think set this apart from the dozens of "intro to PyTorch" posts out there:
1. The histogram visualization of the different tensor initialization functions is a great idea. I've seen so many beginners confused about rand vs randn vs empty, and seeing the distributions side by side makes the differences immediately obvious. More tutorials should lead with "the best way to understand is to see it." (A rough sketch of what I mean follows after point 2.)
2. I appreciate that the article is honest about its own results. A lot of intro tutorials quietly pick a dataset where their simple model gets impressive numbers. Here the model gets 18.6% MAPE and only 37% of predictions within 10% — and instead of hand-waving, the author correctly diagnoses the issue: the features don't capture location granularity, and no amount of architecture tuning will fix missing information. That's arguably the most important ML lesson in the whole piece, and it's buried at the end almost as an afterthought. "Great models can't compensate for missing information" is something I wish more practitioners internalized early.
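On point 1, for anyone curious, a minimal sketch of the kind of side-by-side I mean (my own toy code, not the article's; note that torch.empty returns uninitialized memory, so I drop non-finite values before plotting):

  import torch
  import matplotlib.pyplot as plt

  fns = {"rand": torch.rand, "randn": torch.randn, "empty": torch.empty}
  fig, axes = plt.subplots(1, 3, figsize=(12, 3))
  for ax, (name, fn) in zip(axes, fns.items()):
      x = fn(10_000)
      x = x[torch.isfinite(x)]       # empty() may hold inf/nan garbage
      ax.hist(x.numpy(), bins=50)    # rand: uniform [0, 1); randn: standard normal
      ax.set_title(f"torch.{name}")
  plt.show()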
The suggestion to reach for XGBoost/LightGBM for tabular data is also good advice that too many deep learning tutorials omit. Would love to see a follow-up comparing the two approaches on this same dataset.
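Even an out-of-the-box baseline is only a few lines. A hedged sketch on synthetic data, since I obviously don't have the article's dataset (make_regression is just a stand-in):

  from xgboost import XGBRegressor
  from sklearn.datasets import make_regression
  from sklearn.model_selection import train_test_split

  # Synthetic stand-in for a tabular regression problem
  X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

  model = XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.1)
  model.fit(X_tr, y_tr)
  print(model.score(X_te, y_te))  # R^2 on the held-out split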
0bytematt 23 hours ago [-]
Thank you so much. Really appreciate the thoughtful feedback!
I've watched many intros. Somehow they always end with 90%+ accuracy, and that was just not my experience while learning on datasets I picked myself. I remember spending hours tuning different parameters and not quite understanding why I was getting much worse accuracy. I showed this intentionally, and I'm glad you commented on it!

The XGBoost comparison is a great idea.
Interesting article. It would be really useful if you added the full article title to the page metadata, so the page gets bookmarked with a title. I assume one does not need a GPU to try out the simple examples provided?
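My assumption is that the usual device-selection pattern simply falls back to CPU, something like:

  import torch

  device = "cuda" if torch.cuda.is_available() else "cpu"  # CPU fallback when no GPU
  x = torch.randn(3, 3, device=device)
  print(x.device)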
slashtom 7 hours ago [-]
This does an honest, good job of walking through the beginnings. I would still say understanding/decomposing a decision tree is worth covering: going through the details and the choices/trade-offs one makes in how the tree is prepared, like binary splits or discretizing/binning continuous data, what reducing entropy means, etc. Maybe even start with parametric versus nonparametric modeling pros/cons. You really get to see how probability and statistics are applied in the formulas that eventually get thrown into a dot function in Python.
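For example, "reducing entropy" gets concrete with a few lines; a toy sketch of the information gain of one binary split:

  import numpy as np

  def entropy(y):
      # Shannon entropy of a label array
      _, counts = np.unique(y, return_counts=True)
      p = counts / counts.sum()
      return -(p * np.log2(p)).sum()

  y = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # labels at a node
  left, right = y[:3], y[3:]               # one candidate binary split
  gain = entropy(y) - (len(left) / len(y)) * entropy(left) \
                    - (len(right) / len(y)) * entropy(right)
  print(gain)  # how much this split reduces entropy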
There is a lot of content on PyTorch, which is great and makes a ton of sense since it's used so heavily, but where the industry really needs help/support is the fundamentals. Nonetheless, a great contribution!
pjmlp 15 hours ago [-]
Very nice overview. However, just like 30 years ago, neural network and deep learning stuff is not for me, regardless of the tutorials.
Yet, 2D and 3D graphics feel relatively natural, maybe because at least I can visualize that kind of math.
KeplerBoy 12 hours ago [-]
Are the gradient visualizations not doing it for you?
Of course it kind of breaks down once the gradient can no longer be visualized as an arrow in 2D or 3D space, and not all concepts transfer to higher dimensions as easily as one would hope, but some do.
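Even the 2D case can be made literal. A toy sketch of what I mean (mine, not the article's code):

  import torch
  import matplotlib.pyplot as plt

  # Sample a grid over a simple loss surface f(x, y) = x^2 + y^2
  xs, ys = torch.meshgrid(torch.linspace(-2, 2, 15),
                          torch.linspace(-2, 2, 15), indexing="ij")
  x = xs.clone().requires_grad_(True)
  y = ys.clone().requires_grad_(True)
  (x**2 + y**2).sum().backward()   # autograd fills x.grad and y.grad

  # Arrows point downhill, along the negative gradient
  plt.quiver(xs, ys, -x.grad, -y.grad)
  plt.show()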
pjmlp 9 hours ago [-]
It is quite different, because it's one thing to look at a math expression like an SDF and understand the 3D shape that comes out of it, the math behind a demoscene plasma effect, or a ray-traced shape.
It's another to make heads or tails of what a neural network with backpropagation means.
noisy_boy 7 hours ago [-]
This was quite accessible. If I had to pick one point, I wish there was more "handholding" from gradient to gradient descent, i.e. in the style of the math-focused introduction that was done for the function with one parameter, two parameters, etc. It felt like a bit of a sudden jump from the math to the code. I think the gentle introduction to the math is very valuable here.
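The bridge I was missing is only a few lines. Something like this, as I understand the idea (my sketch, not the article's code):

  import torch

  w = torch.tensor(5.0, requires_grad=True)   # a single parameter
  lr = 0.1
  for _ in range(20):
      loss = (w - 3.0) ** 2     # minimum at w = 3
      loss.backward()           # autograd computes d(loss)/dw = 2 * (w - 3)
      with torch.no_grad():
          w -= lr * w.grad      # the "descent" step from the math section
      w.grad.zero_()
  print(w.item())               # converges toward 3.0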
tl2do 1 day ago [-]
The PyTorch3D section was genuinely useful for me. I've been doing 2D ML work for a while but hadn't explored 3D deep learning — didn't even know PyTorch3D existed until this tutorial.
What worked well was the progressive complexity. Starting with basic mesh rendering before jumping into differentiable rendering made the concepts click. The voxel-to-mesh conversion examples were particularly clear.
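For anyone skimming, the voxel-to-mesh step is surprisingly compact. If I remember the API right, it's roughly this (hedged sketch with a toy random grid, not real data):

  import torch
  from pytorch3d.ops import cubify

  voxels = (torch.rand(1, 16, 16, 16) > 0.5).float()  # toy occupancy grid
  mesh = cubify(voxels, thresh=0.5)                   # returns a Meshes object
  print(mesh.verts_packed().shape, mesh.faces_packed().shape)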
If anything, I'd love to see a follow-up covering point cloud handling, since that seems to be a major use case based on the docs I'm now digging through.
Thanks for writing this — triggered a weekend deep-dive I probably wouldn't have started otherwise.
lappa 24 hours ago [-]
Good post. I think you mixed up torch.eye with torch.full, though.
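For anyone else reading along, the difference:

  import torch

  print(torch.eye(3))             # 3x3 identity matrix, ones on the diagonal
  print(torch.full((3, 3), 7.0))  # 3x3 tensor filled with a constant value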
0bytematt 23 hours ago [-]
You're right! It's wrongly labelled on the image. Thank you for letting me know. Will fix it.
jcattle 14 hours ago [-]
Are there other tutorials similar to this going into the fundamentals of model architectures? Something like https://poloclub.github.io/cnn-explainer/, for example.
rwarren63 8 hours ago [-]
Really awesome resource, thanks for posting.
gukov 15 hours ago [-]
Did not expect to see a 3D model of Quake 2's grenade launcher in a PyTorch tutorial today.
0bytematt 11 hours ago [-]
Yes! Love that you recognised it! Way more fun than yet another 3D model of a chair or a teapot.
alkh 1 day ago [-]
Thank you, this seems like a very good intro for newcomers! It would be cool if you could continue this series with a few more advanced lessons as well.
0bytematt 23 hours ago [-]
Thank you! That's the plan. I was thinking of writing a 3D mesh classifier explainer next that'll build on these concepts.
trcf23 1 day ago [-]
Very nice, thanks! It’s great to be able to play with viz!
For a deeper tutorial, I highly recommend the PyTorch for Deep Learning Professional Certificate on deeplearning.ai, probably one of the best MOOCs I've seen so far:
https://www.deeplearning.ai/courses/pytorch-for-deep-learnin...
https://0byte.io/articles/neuron.html
https://0byte.io/articles/helloml.html
He also publishes to YouTube where he has clear explanations and high production values that deserve more views.
https://www.youtube.com/watch?v=dES5Cen0q-Y (part 2 https://www.youtube.com/watch?v=-HhE-8JChHA) is the video to accompany https://0byte.io/articles/helloml.html
Free book: https://zekcrates.quarto.pub/deep-learning-library/
ML by hand: https://github.com/workofart/ml-by-hand
Micrograd: https://github.com/karpathy/micrograd