Getting ROCm working was... an adventure. We documented the entire (painful) journey in a detailed blog post because honestly, nothing went according to plan. If you've ever wrestled with ROCm setup for ML, you'll probably relate to our struggles.
The good news? Everything works smoothly now! We'd love for you to try it out and see what you think.
latchkey · 16h ago
Reading your post now, half the article feels like it is just installing PyTorch. Next time, just use the pre-built docker containers. It is the recommended way and much easier:
https://rocm.docs.amd.com/projects/install-on-linux/en/lates...
Additionally, our MI300x VMs/machines come with ROCm installed and configured already. We also apply all the default recommended BIOS settings as well.
anonymous_llama · 16h ago
Hey,
I think the major issue was getting torch 2.7 working on a bare-metal installation, since the app needed that and the pre-built docker containers for torch 2.7 aren't out yet. Transformer Lab is also much more than just the PyTorch setup, as it provides several plugins which can be used for training, evaluation, dataset generation (and much more!).
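For anyone hitting the same wall, a quick sanity check along these lines (just a rough sketch, assuming a ROCm build of torch) tells you whether a bare-metal install can actually see the GPU:

    # Rough sketch: sanity-check a ROCm build of PyTorch.
    # torch.version.hip is only set on ROCm builds; ROCm GPUs are exposed
    # through the regular torch.cuda API.
    import torch

    print(torch.__version__)           # e.g. 2.7.0+rocmX.Y for a ROCm wheel
    print(torch.version.hip)           # None on CPU/CUDA builds
    print(torch.cuda.is_available())
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))  # should show the AMD GPU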
Also, the pre-built docker containers unfortunately do not work on WSL which caused the majority of the issues.
I'd love to hear if you had a different experience or if I'm mistaken in any of this!
latchkey · 15h ago
No, this is great feedback. I don't think you're mistaken. It's actually curious to me why PyTorch 2.7 isn't available yet; it should be! I'll pass that feedback up to AMD.
As for WSL, that kind of makes sense, since they just added Windows support to ROCm and that is probably a work in progress.
https://videocardz.com/newz/amd-pledges-rocm-support-for-win...
No need to build your own box: we've got 1x MI300x VMs, for FREE (thanks to AMD), for development exactly like this. Reach out and we can get you set up.
Someone left a comment accusing me of advertising my business, then deleted it. If that's how it came across, I apologize, but my intention was to offer something genuinely useful, for free, and directly relevant to helping the OP. Those credits weren't easy to get; it required going all the way up to Lisa. I'm committed to making supercompute accessible to developers. Yes, it's free like GitHub is free. But this isn't a sales pitch.
anonymous_llama · 16h ago
Thanks for offering the GPU and we'll definitely be in touch about this one.
What is the best way to reach out to you?