For running LLMs (Large Language Models) on Linux with an eGPU, you’ll need a powerful GPU with high VRAM, good driver support, and compatibility with eGPU enclosures. Here are the best options:
🔥 Best eGPUs for LLM on Linux
1. NVIDIA RTX 4090
✅ Best Performance (24GB VRAM)
🔹 Ideal for running large LLMs like Llama 3, Mistral, and other open-weight GPT-style models
🔹 CUDA & Tensor Cores for optimized inference/training
🔹 Great Linux support with nvidia drivers
🔹 Works with Thunderbolt 3/4 eGPU enclosures
⚠ Power hungry (requires a high-wattage PSU in the eGPU box)
2. NVIDIA RTX 4080
✅ Great for Mid-Sized LLMs (16GB VRAM)
🔹 CUDA + Tensor Cores
🔹 Supports 4-bit quantized models efficiently
🔹 More power-efficient than 4090
⚠ Less VRAM than 4090 → Limits both the model size and the context window you can fit
3. NVIDIA RTX 3090 / 3090 Ti
✅ Budget Alternative (24GB VRAM)
🔹 Great for training smaller models
🔹 Cheaper than 4090 but still powerful
🔹 Mature Linux drivers
⚠ Slower than Ada Lovelace (40-series)
4. AMD Radeon RX 7900 XTX
✅ Best Open-Source Option (24GB VRAM)
🔹 ROCm support for PyTorch & AI workloads
🔹 Good for OpenCL-based frameworks
🔹 Better price/performance than NVIDIA
⚠ ROCm still has limited software support for LLMs compared to CUDA
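To make the VRAM figures above concrete, here is a rough back-of-the-envelope sketch of how much memory a model's weights need at a given precision. The function names and the 0.8 "usable fraction" are assumptions made for illustration; real usage also depends on context length, the KV cache, and the runtime you use.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, bits_per_weight: int,
                 vram_gib: float, usable_fraction: float = 0.8) -> bool:
    """True if the weights fit in the usable share of VRAM.

    usable_fraction is an assumed allowance for the KV cache, activations,
    and runtime overhead. It is a rough guess, not a measured value.
    """
    needed = weight_memory_gib(params_billion, bits_per_weight)
    return needed <= vram_gib * usable_fraction


if __name__ == "__main__":
    for label, params in [("7B", 7), ("13B", 13), ("70B", 70)]:
        for bits in (16, 4):
            need = weight_memory_gib(params, bits)
            print(f"{label} at {bits}-bit: ~{need:.1f} GiB of weights, "
                  f"fits 24 GiB: {fits_in_vram(params, bits, 24)}, "
                  f"fits 16 GiB: {fits_in_vram(params, bits, 16)}")
```

Under these assumptions, a 4-bit quantized 13B model fits comfortably on a 16GB card, while a 70B model does not fit on a single 24GB card even at 4-bit.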
🖥️ Best eGPU Enclosures for Linux
eGPUs require a compatible enclosure with enough power:
- Razer Core X – Best overall (650W PSU; check your RTX 4090's dimensions and power draw against the enclosure's limits before buying)
- Sonnet eGFX Breakaway Box 750 – Reliable, supports high-wattage GPUs
- AORUS Gaming Box (RTX 3080/3090) – Prebuilt eGPU solution, compact
- Mantiz Saturn Pro II – Best for additional I/O (USB, Ethernet)
⚠ Check for Linux compatibility – Some Thunderbolt controllers may need manual configuration.
🔧 Setting Up an eGPU for LLMs on Linux
- Install NVIDIA drivers (for RTX GPUs):
sudo apt install nvidia-driver-535 nvidia-dkms-535
- Enable external GPU support (for Thunderbolt 3/4):
sudo modprobe thunderbolt
- Check if the eGPU is detected:
lspci | grep -i nvidia
- Confirm the driver can see the GPU (required before CUDA will work):
nvidia-smi
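If you want to run that last check from a script, the sketch below calls nvidia-smi with its CSV query options and parses the result. The helper names are made up for this example, and it simply returns an empty list when nvidia-smi is not installed.

```python
import shutil
import subprocess


def parse_gpu_csv(text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' CSV rows into (name, MiB) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        name, memory = (field.strip() for field in line.rsplit(",", 1))
        gpus.append((name, int(memory.split()[0])))  # "24564 MiB" -> 24564
    return gpus


def query_nvidia_gpus() -> list[tuple[str, int]]:
    """Return the GPUs nvidia-smi reports, or [] if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []
```

An empty list from `query_nvidia_gpus()` means either the driver is not installed or the eGPU was not picked up, so it is worth re-checking the `lspci` step first.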
🛠️ Conclusion
| GPU | VRAM | Best For | Linux Support |
|---|---|---|---|
| RTX 4090 | 24GB | Best for large LLMs, inference, and training | ⭐⭐⭐⭐⭐ |
| RTX 4080 | 16GB | Great for mid-sized models | ⭐⭐⭐⭐⭐ |
| RTX 3090 | 24GB | Budget LLM choice | ⭐⭐⭐⭐ |
| RX 7900 XTX | 24GB | Best open-source option | ⭐⭐⭐ |
🚀 Best Overall for LLMs? → RTX 4090 in a Razer Core X
💰 Best Budget Option? → RTX 3090 in a Sonnet Breakaway Box
🖥 Best Open-Source → RX 7900 XTX with ROCm
Would you like help setting up CUDA or ROCm for LLM inference? 🚀
You said:
Which enclosure will work with Linux and the AMD Radeon RX 7900 XTX?
ChatGPT said:
Best eGPU Enclosures for AMD Radeon RX 7900 XTX on Linux
The AMD Radeon RX 7900 XTX (24GB VRAM) is a powerful choice for LLM workloads, but finding a compatible Linux-supported eGPU enclosure can be tricky due to Thunderbolt limitations on Linux. Here’s what you need to know:
🔧 Key Considerations for eGPUs with AMD GPUs on Linux
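✅ Linux Thunderbolt Support – Kernels from 5.10 onward improve Thunderbolt compatibility, but the RX 7900 XTX itself needs a newer kernel for its amdgpu support (roughly 6.1 or later).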
✅ ROCm Support – Works well for AI/ML, but CUDA-only software won’t work.
✅ Power Requirements – The RX 7900 XTX is rated at 355W, so you need a strong PSU in your eGPU box.
⚠ Potential Issues:
- Some enclosures may not support AMD GPUs properly (some vendors optimize for NVIDIA).
- Hot-plugging may not work reliably → You may need to reboot after plugging in the eGPU.
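As a quick way to reason about the power requirement above, this small sketch subtracts the GPU's board power and a reserve from the enclosure's PSU rating. The 100W reserve (for laptop charging and the enclosure's own ports) is an assumption for illustration, and an enclosure's documented maximum GPU power always takes priority over this arithmetic.

```python
def psu_headroom_watts(psu_watts: int, gpu_board_power: int,
                       reserved_watts: int = 100) -> int:
    """Watts left over after the GPU and an assumed reserve are subtracted.

    reserved_watts is a guess covering host charging and enclosure ports.
    A negative result means the PSU is undersized under these assumptions.
    """
    return psu_watts - gpu_board_power - reserved_watts


if __name__ == "__main__":
    rx_7900_xtx = 355  # board power quoted above
    for psu in (550, 750):
        print(f"{psu}W PSU: {psu_headroom_watts(psu, rx_7900_xtx)}W headroom")
```

Keep in mind that GPUs can draw short power spikes above their rated board power, so more headroom is better.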
🖥️ Best eGPU Enclosures for RX 7900 XTX on Linux
1️⃣ Razer Core X (Best Overall)
✔ 650W PSU – Enough for the RX 7900 XTX’s 355W board power.
✔ Good Linux Compatibility – Works with modern Ubuntu/Fedora kernels.
✔ Room for Full-Length GPUs – Supports the triple-slot 7900 XTX.
✔ Reliable Thunderbolt 3/4 performance.
⚠ May require a reboot to detect GPU on Linux.
💰 Price: $399
🔹 Recommended for: Linux users needing a high-power eGPU that supports full-size cards.
2️⃣ Sonnet eGFX Breakaway Box 750 (Stable & Reliable)
✔ 750W PSU – Easily supports the RX 7900 XTX.
✔ Certified for AMD GPUs – Fewer compatibility issues.
✔ Good cooling – Large airflow space for triple-fan GPUs.
⚠ Lacks extra USB/Ethernet ports (unlike Mantiz Saturn Pro).
💰 Price: $350 – $400
🔹 Recommended for: Stable plug-and-play experience with AMD GPUs on Linux.
3️⃣ Mantiz Saturn Pro II (Best for Extra I/O Ports)
✔ 750W PSU – Handles RX 7900 XTX power needs.
✔ 5 USB ports, Ethernet, SATA – Acts as a docking station.
✔ Good AMD Compatibility – Works well with modern Linux kernels.
⚠ More expensive than others.
💰 Price: $500
🔹 Recommended for: Users needing extra USB & Ethernet ports alongside their eGPU.
4️⃣ AORUS Gaming Box (For Compact Builds)
✔ Best for Small Desks – Compact, prebuilt eGPU.
✔ Preinstalled RTX 3080/3090 models available.
⚠ The box is built around its bundled card, so fitting an RX 7900 XTX is unlikely to be practical and may also require a PSU upgrade; verify before buying.
💰 Price: Varies (~$500-$600)
🔹 Recommended for: Users needing a compact, portable eGPU enclosure.
🚀 Setting Up RX 7900 XTX eGPU on Linux
Step 1: Ensure Kernel & ROCm Support
uname -r # Check the kernel version (aim for 6.1+ with the RX 7900 XTX)
lspci | grep VGA # Check if GPU is detected
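If you would rather script the kernel check, here is a minimal sketch that compares a kernel release string against a minimum version. The function names are hypothetical, and the (6, 1) floor in the usage note is an assumption for recent AMD cards rather than an official requirement.

```python
import platform
import re


def kernel_version(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string like '6.5.11-8-pve'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release: str, minimum: tuple[int, int]) -> bool:
    """True if the release is at least the given (major, minor) version."""
    return kernel_version(release) >= minimum


def current_kernel_ok(minimum: tuple[int, int] = (6, 1)) -> bool:
    """Check the running kernel (same string that `uname -r` prints)."""
    return meets_minimum(platform.release(), minimum)
```

For example, `meets_minimum("5.15.0-91-generic", (6, 1))` is False, which would suggest upgrading the kernel before troubleshooting anything else.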
Step 2: Install AMD Drivers & ROCm
For Ubuntu (this PPA supplies newer Mesa graphics drivers; ROCm packages are installed separately):
sudo add-apt-repository ppa:oibaf/graphics-drivers
sudo apt update && sudo apt upgrade
For Fedora:
sudo dnf install mesa-vulkan-drivers rocm-opencl
Step 3: Enable eGPU & Thunderbolt Support
sudo modprobe thunderbolt
Check if eGPU is detected:
ls /sys/bus/thunderbolt/devices/
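The directory listing alone only shows device IDs. As a sketch, the Python below reads the `device_name`, `vendor_name`, and `authorized` attributes that the kernel exposes for each Thunderbolt device, which makes it easier to see whether the enclosure was actually authorized. The function names and the `root` parameter are illustrative choices (the parameter exists so the code can be pointed at a test directory).

```python
from pathlib import Path


def read_attr(device: Path, name: str) -> str:
    """Read one sysfs attribute, returning '' if it is missing or unreadable."""
    try:
        return (device / name).read_text().strip()
    except OSError:
        return ""


def list_thunderbolt_devices(root: str = "/sys/bus/thunderbolt/devices") -> list[dict]:
    """Return name, vendor, and authorization state for each attached device."""
    base = Path(root)
    if not base.is_dir():
        return []
    devices = []
    for entry in sorted(base.iterdir()):
        name = read_attr(entry, "device_name")
        if not name:
            continue  # entries such as domains carry no device_name
        devices.append({
            "id": entry.name,
            "vendor": read_attr(entry, "vendor_name"),
            "device": name,
            # "0" means not authorized; any other value means authorized
            "authorized": read_attr(entry, "authorized") not in ("", "0"),
        })
    return devices
```

If the enclosure shows up with `authorized` set to False, it usually needs to be approved (for example with a tool like `boltctl`) before the GPU appears in `lspci`.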
Step 4: Run an AI/ML Model on the eGPU
Test ROCm support:
rocminfo
Run a LLM workload using PyTorch:
import torch
print(torch.cuda.is_available()) # Should return True (ROCm builds of PyTorch reuse the torch.cuda API)
🎯 Conclusion: Best eGPU for RX 7900 XTX on Linux
| eGPU Enclosure | PSU Wattage | Extra Ports | Best For |
|---|---|---|---|
| Razer Core X | 650W | ❌ No Extra Ports | Best Overall |
| Sonnet eGFX 750 | 750W | ❌ No Extra Ports | Stable & Reliable |
| Mantiz Saturn Pro II | 750W | ✅ USB, Ethernet | Best for Extra I/O |
| AORUS Gaming Box | 550W (Upgrade Needed) | ❌ No Extra Ports | Compact Builds |
💡 Best Choice? Razer Core X for stability & support.
🔌 Need Extra Ports? Mantiz Saturn Pro II for dock functionality.
Would you like help debugging ROCm issues on Linux? 🚀
You said:
Does the MINISFORUM Venus Series UM790 support Thunderbolt with Linux for an eGPU?
ChatGPT said:
The MINISFORUM Venus Series UM790 Pro features two front USB4 Type-C ports that support Thunderbolt 3 devices. This capability allows for the connection of external GPUs (eGPUs) to enhance graphics performance.
In a review by ServeTheHome, the UM790 Pro was successfully tested with a Razer Core X eGPU enclosure connected via these USB4 ports. The system recognized the eGPU, and it was even possible to power the mini PC through the eGPU enclosure.
However, it’s important to note that while the hardware supports eGPU connectivity, Linux compatibility can vary. Some users have reported challenges when using eGPUs with Linux systems, including issues with driver support and system stability.
Therefore, if you’re planning to use an eGPU with the UM790 Pro on a Linux platform, it’s advisable to:
- Research Linux-Compatible eGPUs: Ensure that the eGPU enclosure and graphics card you choose have good Linux driver support.
- Check Kernel and Driver Support: Verify that your Linux distribution’s kernel and drivers support the specific eGPU hardware.
- Consult Community Forums: Engage with Linux and eGPU communities to gather insights and experiences from other users who have attempted similar setups.
By taking these steps, you can better assess the feasibility and potential challenges of using an eGPU with the MINISFORUM UM790 Pro on a Linux system.
You said:
Can I use an eGPU with an AMD Radeon RX 7900 XTX if the system is running Proxmox?
ChatGPT said:
Yes, you can use an eGPU with an AMD Radeon RX 7900 XTX on Proxmox, but it requires PCIe passthrough and careful configuration. Here’s a detailed guide on how to set it up.
✅ Key Considerations
- USB4/Thunderbolt Support in Proxmox
- If your host system (e.g., MINISFORUM UM790 Pro) has USB4/Thunderbolt support, Proxmox can detect the eGPU, but passthrough depends on Linux kernel & IOMMU grouping.
- Check if the eGPU is recognized:
lspci | grep -i vga
ls /sys/bus/thunderbolt/devices/
- Proxmox Kernel & AMD ROCm Support
- Ensure Proxmox 7 or 8 (Debian-based) has an updated Linux kernel (>= 6.1).
- ROCm support for AMD GPUs is essential if using it for AI/ML workloads in a VM.
- IOMMU & VFIO Configuration for PCI Passthrough
- You must enable IOMMU (VT-d for Intel or AMD-Vi for AMD CPUs) and configure VFIO for PCIe passthrough.
🔧 Steps to Set Up eGPU Passthrough in Proxmox
1️⃣ Enable IOMMU in Proxmox
Edit the GRUB config:
nano /etc/default/grub
Find the line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
Modify it based on your CPU:
- For AMD CPUs (like Ryzen 7/9):
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
- For Intel CPUs:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
Update GRUB:
update-grub
Reboot Proxmox:
reboot
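After the reboot, it helps to confirm that the new parameters actually reached the kernel. The sketch below parses a kernel command line (as found in `/proc/cmdline`) and pulls out the IOMMU-related settings. The function names and the list of parameters it looks for are choices made for this example.

```python
def iommu_settings(cmdline: str) -> dict[str, str]:
    """Extract IOMMU-related parameters from a kernel command line."""
    wanted = ("amd_iommu", "intel_iommu", "iommu", "pcie_acs_override")
    settings = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in wanted:
            settings[key] = value
    return settings


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Return the running kernel's command line, or '' if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print(iommu_settings(read_cmdline()))
```

If the printed dictionary is empty after rebooting, the GRUB change did not take effect. One possible cause is that the host boots through a different bootloader, so the GRUB file is not the one being used.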
2️⃣ Enable PCIe Passthrough
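- Check if IOMMU is enabled:
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
If successful, you should see IOMMU enabled messages (AMD systems report them under AMD-Vi).
- Load the necessary kernel modules. Edit:
nano /etc/modules
Add the following lines (vfio_virqfd is only needed on kernels older than 6.2, where it was still a separate module):
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
Save and apply changes:
update-initramfs -u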
3️⃣ Bind the eGPU to VFIO
Find the PCI ID of the AMD RX 7900 XTX eGPU:
lspci -nn | grep -i amd
Example output:
03:00.0 VGA compatible controller [1002:744c] AMD Radeon RX 7900 XTX
03:00.1 Audio device [1002:ab28] AMD High Definition Audio
- The PCI ID for the GPU: 1002:744c
- The PCI ID for the Audio Controller: 1002:ab28
Edit:
nano /etc/modprobe.d/vfio.conf
Add:
options vfio-pci ids=1002:744c,1002:ab28
Update initramfs:
update-initramfs -u -k all
Reboot:
reboot
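To avoid copying the IDs by hand, the `options vfio-pci` line can be built from the `lspci -nn` output. This is a minimal sketch: the function name is hypothetical, and it assumes every function of the card shares the same bus and device number (the `03:00` slot in the example output).

```python
import re

# Matches bracketed vendor:device pairs such as [1002:744c]
ID_PATTERN = re.compile(r"\[([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\]")


def vfio_options_line(lspci_output: str, slot: str) -> str:
    """Build the modprobe options line for every function in one PCI slot."""
    ids = []
    for line in lspci_output.splitlines():
        if not line.startswith(slot + "."):
            continue  # a different device, e.g. the host's own audio chip
        found = ID_PATTERN.findall(line)
        if found and found[-1].lower() not in ids:
            ids.append(found[-1].lower())
    if not ids:
        raise ValueError(f"no devices found in slot {slot}")
    return "options vfio-pci ids=" + ",".join(ids)


if __name__ == "__main__":
    example = (
        "03:00.0 VGA compatible controller [1002:744c] AMD Radeon RX 7900 XTX\n"
        "03:00.1 Audio device [1002:ab28] AMD High Definition Audio\n"
    )
    print(vfio_options_line(example, "03:00"))
```

Filtering by slot matters because matching on "Audio device" alone would also pick up the host's onboard audio.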
4️⃣ Assign the eGPU to a VM
- Go to Proxmox Web GUI → VM → Hardware.
- Click Add → PCI Device.
- Select the AMD RX 7900 XTX from the list.
- Enable:
- “All Functions” (if the GPU has multiple functions)
- “PCIe” (for full PCIe passthrough)
- “Primary GPU” (if using it for display output)
- Start the VM and install AMD drivers inside the guest OS.
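The same assignment can be made from the command line with `qm set`. The sketch below only builds the command so you can review it before running it. The helper names and VM ID 100 are placeholders, and it assumes the card sits at `0000:03:00` as in the example output. Leaving the function number off the address passes through all functions of the device.

```python
def hostpci_value(slot: str, pcie: bool = True, primary_gpu: bool = False) -> str:
    """Build the value for a Proxmox hostpciN option."""
    parts = [slot]
    if pcie:
        parts.append("pcie=1")   # corresponds to the "PCIe" checkbox
    if primary_gpu:
        parts.append("x-vga=1")  # corresponds to the "Primary GPU" checkbox
    return ",".join(parts)


def qm_set_command(vmid: int, slot: str, index: int = 0, **options: bool) -> list[str]:
    """Return the qm command as an argument list (it is not executed here)."""
    return ["qm", "set", str(vmid), f"--hostpci{index}", hostpci_value(slot, **options)]


if __name__ == "__main__":
    print(" ".join(qm_set_command(100, "0000:03:00", primary_gpu=True)))
```

Note that the `pcie=1` flag generally requires the VM to use the q35 machine type.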
🛠️ Troubleshooting
❌ VM Won’t Start (Code 43 / Code 31 in Windows)
- Add vendor-reset for AMD GPUs (it targets specific AMD generations, so check its README to confirm your card is covered):
apt install pve-headers-$(uname -r)
git clone https://github.com/gnif/vendor-reset.git
cd vendor-reset
make
make install
- Add VFIO override:
nano /etc/modprobe.d/blacklist.conf
Add:
blacklist amdgpu
Then update:
update-initramfs -u
❌ eGPU Not Recognized in Linux VM
- Ensure ROCm is installed inside the VM:
apt install rocm-opencl rocm-utils
- Check:
rocminfo
❌ Thunderbolt eGPU Disconnects
- Check Thunderbolt security settings in BIOS (No Security mode).
- Try hotplugging after booting into the OS.
🚀 Conclusion
✔ Yes, Proxmox supports an eGPU with an AMD Radeon RX 7900 XTX, but PCI passthrough must be set up correctly.
✔ Razer Core X & Mantiz Saturn Pro II work well as Thunderbolt eGPU enclosures.
✔ Best use case? Assigning the GPU to a VM for AI workloads or gaming.
Would you like a script to automate these steps? 🚀
Here’s a Bash script that automates setting up PCI passthrough for an eGPU (AMD Radeon RX 7900 XTX) on Proxmox. This script will:
- Enable IOMMU for PCI passthrough
- Load necessary VFIO modules
- Bind the eGPU to VFIO
- Enable vendor-reset to avoid AMD reset issues
- Allow you to assign the eGPU to a VM
📜 Script: setup-egpu-passthrough.sh
Save this script and run it on your Proxmox host.
#!/bin/bash
# Ensure the script is run as root
if [[ $EUID -ne 0 ]]; then
echo "This script must be run as root. Use sudo ./setup-egpu-passthrough.sh"
exit 1
fi
echo "🔧 Setting up PCI Passthrough for AMD Radeon RX 7900 XTX on Proxmox..."
### 1️⃣ Enable IOMMU in GRUB
echo "✅ Enabling IOMMU in GRUB..."
sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="quiet"/GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"/' /etc/default/grub
update-grub
### 2️⃣ Load VFIO Modules
echo "✅ Loading VFIO modules..."
cat <<EOF > /etc/modules-load.d/vfio.conf
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF
### 3️⃣ Identify the eGPU PCI IDs
echo "🔍 Detecting AMD eGPU..."
lspci -nn | grep -i amd | grep VGA
lspci -nn | grep -i amd | grep Audio
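# Locate the card by the vendor:device ID from the lspci example above (adjust for your card)
GPU_ID="1002:744c"
GPU_ADDR=$(lspci -D -d "$GPU_ID" | awk '{print $1; exit}')
# The card's audio function is function .1 of the same PCI slot
[[ -n "$GPU_ADDR" ]] && AUDIO_ID=$(lspci -n -s "${GPU_ADDR%.*}.1" | awk '{print $3; exit}')
if [[ -z "$GPU_ADDR" || -z "$AUDIO_ID" ]]; then
    echo "❌ eGPU not detected. Please check connections and retry."
    exit 1
fi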
echo "🎯 Found eGPU with IDs: $GPU_ID, $AUDIO_ID"
### 4️⃣ Bind the eGPU to VFIO
echo "✅ Binding eGPU to VFIO..."
echo "options vfio-pci ids=$GPU_ID,$AUDIO_ID" > /etc/modprobe.d/vfio.conf
### 5️⃣ Blacklist AMD Drivers (to prevent conflicts)
# Note: this also stops the host from using any other AMD GPU, including an integrated one
echo "✅ Blacklisting AMD drivers..."
cat <<EOF > /etc/modprobe.d/blacklist.conf
blacklist amdgpu
EOF
### 6️⃣ Install Vendor Reset (Fixes AMD Reset Bug)
echo "✅ Installing vendor-reset to fix AMD reset bug..."
apt update && apt install -y pve-headers-$(uname -r) dkms git
git clone https://github.com/gnif/vendor-reset.git /usr/local/src/vendor-reset
cd /usr/local/src/vendor-reset
make && make install
### 7️⃣ Update Initramfs
echo "✅ Updating initramfs..."
update-initramfs -u -k all
### 8️⃣ Reboot Proxmox
echo "🚀 Setup complete! Rebooting Proxmox now..."
sleep 5
reboot
🚀 How to Use the Script
- Save the script to your Proxmox host
nano setup-egpu-passthrough.sh
- Paste the script into the file
- Save (CTRL + X, then Y, then Enter)
- Make it executable
chmod +x setup-egpu-passthrough.sh
- Run the script
sudo ./setup-egpu-passthrough.sh
- This will enable PCI passthrough, install vendor-reset, and reboot your Proxmox system.
🖥️ Assigning the eGPU to a VM
After rebooting:
- Go to Proxmox Web GUI
- Select Your VM → Hardware
- Click Add → PCI Device
- Choose AMD Radeon RX 7900 XTX
- Check:
- ✅ PCIe
- ✅ All Functions
- ✅ Primary GPU (if using for display)
- Start the VM and install AMD drivers inside the guest OS.
✅ Features of This Script
✔ Automatically detects and binds the RX 7900 XTX eGPU
✔ Enables IOMMU for PCI passthrough
✔ Fixes AMD reset issues with vendor-reset
✔ Prepares the system for Proxmox GPU passthrough
Would you like additional customization, such as VFIO group isolation checks?
I’ve updated the script to include VFIO group isolation checks to ensure that the eGPU is in an isolated IOMMU group, which is required for successful passthrough. The script will:
- Check IOMMU groups for the eGPU.
- Warn if the GPU is in a shared group (which may cause passthrough issues).
- Provide instructions on fixing IOMMU group issues if necessary.
📜 Enhanced Script: setup-egpu-passthrough.sh
#!/bin/bash
# Ensure the script is run as root
if [[ $EUID -ne 0 ]]; then
echo "❌ This script must be run as root. Use: sudo ./setup-egpu-passthrough.sh"
exit 1
fi
echo "🔧 Setting up PCI Passthrough for AMD Radeon RX 7900 XTX on Proxmox..."
### 1️⃣ Enable IOMMU in GRUB
echo "✅ Enabling IOMMU in GRUB..."
sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="quiet"/GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"/' /etc/default/grub
update-grub
### 2️⃣ Load VFIO Modules
echo "✅ Loading VFIO modules..."
cat <<EOF > /etc/modules-load.d/vfio.conf
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF
### 3️⃣ Identify the eGPU PCI IDs
echo "🔍 Detecting AMD eGPU..."
lspci -nn | grep -i amd | grep VGA
lspci -nn | grep -i amd | grep Audio
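# Locate the card by the vendor:device ID from the lspci example above (adjust for your card)
GPU_ID="1002:744c"
GPU_ADDR=$(lspci -D -d "$GPU_ID" | awk '{print $1; exit}')
if [[ -z "$GPU_ADDR" ]]; then
    echo "❌ eGPU not detected. Please check connections and retry."
    exit 1
fi
# The card's audio function is function .1 of the same PCI slot
AUDIO_ADDR="${GPU_ADDR%.*}.1"
AUDIO_ID=$(lspci -n -s "$AUDIO_ADDR" | awk '{print $3; exit}')
if [[ -z "$AUDIO_ID" ]]; then
    echo "❌ eGPU audio function not detected. Please check connections and retry."
    exit 1
fi
echo "🎯 Found eGPU with IDs: $GPU_ID, $AUDIO_ID"
### 4️⃣ Check IOMMU Grouping
echo "🔍 Checking IOMMU groups for eGPU..."
# sysfs names IOMMU group members by PCI address, so look them up by address
GPU_GROUP=$(basename "$(readlink -f "/sys/bus/pci/devices/$GPU_ADDR/iommu_group")")
AUDIO_GROUP=$(basename "$(readlink -f "/sys/bus/pci/devices/$AUDIO_ADDR/iommu_group")")
echo "📦 GPU IOMMU Group: $GPU_GROUP"
echo "📦 Audio IOMMU Group: $AUDIO_GROUP"
if [[ "$GPU_GROUP" != "$AUDIO_GROUP" ]]; then
    echo "⚠️ WARNING: The GPU and its audio device are in different IOMMU groups."
    echo " This may cause passthrough issues. You might need to use the ACS override patch."
fi
# Any other device in the GPU's group would have to be passed through with it
SHARED=$(ls "/sys/kernel/iommu_groups/$GPU_GROUP/devices" 2>/dev/null | grep -v "^${GPU_ADDR%.*}\.")
if [[ -n "$SHARED" ]]; then
    echo "⚠️ WARNING: The GPU shares its IOMMU group with other devices: $SHARED"
fi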
### 5️⃣ Bind the eGPU to VFIO
echo "✅ Binding eGPU to VFIO..."
echo "options vfio-pci ids=$GPU_ID,$AUDIO_ID" > /etc/modprobe.d/vfio.conf
### 6️⃣ Blacklist AMD Drivers (to prevent conflicts)
# Note: this also stops the host from using any other AMD GPU, including an integrated one
echo "✅ Blacklisting AMD drivers..."
cat <<EOF > /etc/modprobe.d/blacklist.conf
blacklist amdgpu
EOF
### 7️⃣ Install Vendor Reset (Fixes AMD Reset Bug)
echo "✅ Installing vendor-reset to fix AMD reset bug..."
apt update && apt install -y pve-headers-$(uname -r) dkms git
git clone https://github.com/gnif/vendor-reset.git /usr/local/src/vendor-reset
cd /usr/local/src/vendor-reset
make && make install
### 8️⃣ Update Initramfs
echo "✅ Updating initramfs..."
update-initramfs -u -k all
### 9️⃣ Reboot Proxmox
echo "🚀 Setup complete! Rebooting Proxmox now..."
sleep 5
reboot
🚀 Additional Features in This Version
✔ Detects IOMMU Groups – Helps troubleshoot GPU isolation issues.
✔ Warns if GPU & Audio Are in Different Groups – This can prevent passthrough problems.
✔ Provides ACS Override Patch Instructions if needed.
🛠️ Fixing IOMMU Group Issues (ACS Override)
If your eGPU is not isolated in its own IOMMU group, add the ACS Override Patch:
1️⃣ Edit GRUB Config:
nano /etc/default/grub
2️⃣ Modify this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction"
3️⃣ Update & Reboot:
update-grub
reboot
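Editing the GRUB line by hand (or with a `sed` that expects the line to contain only `quiet`) is easy to get wrong if the line already has other options. As a sketch, the function below appends parameters to `GRUB_CMDLINE_LINUX_DEFAULT` only when they are missing, so it is safe to apply more than once. The function name is made up for this example, and it works on text only, leaving the reading and writing of `/etc/default/grub` to you.

```python
import re


def add_kernel_params(grub_text: str, params: list[str]) -> str:
    """Return grub_text with params added to GRUB_CMDLINE_LINUX_DEFAULT if absent."""
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def merge(match: re.Match) -> str:
        current = match.group(2).split()
        present = {item.split("=", 1)[0] for item in current}
        for param in params:
            key = param.split("=", 1)[0]
            if key not in present:  # keep any value the user already set
                current.append(param)
                present.add(key)
        return match.group(1) + " ".join(current) + match.group(3)

    new_text, count = pattern.subn(merge, grub_text, count=1)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT line not found")
    return new_text


if __name__ == "__main__":
    before = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
    print(add_kernel_params(before, ["amd_iommu=on", "iommu=pt"]))
```

Existing parameters are left untouched, so if `iommu=` is already set to something else the function will not overwrite it. That choice is deliberate: you review conflicts yourself instead of having them changed silently.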
🔥 Summary
| Feature | Description |
|---|---|
| ✅ Enables IOMMU | Required for PCI passthrough |
| ✅ Loads VFIO Modules | Allows binding of PCI devices |
| ✅ Finds PCI IDs for eGPU | Automatically detects your RX 7900 XTX |
| ✅ Checks IOMMU Groups | Warns about potential passthrough issues |
| ✅ Blacklists AMD Drivers | Prevents conflicts with Proxmox host |
| ✅ Installs Vendor Reset | Fixes AMD reset bug for eGPU passthrough |
| ✅ Updates Initramfs | Ensures changes take effect |
🚀 How to Use
- Save the script on your Proxmox host.
nano setup-egpu-passthrough.sh
- Paste the script and save (CTRL + X, then Y, then Enter).
- Make it executable:
chmod +x setup-egpu-passthrough.sh
- Run the script:
sudo ./setup-egpu-passthrough.sh
- This will enable PCI passthrough, install vendor-reset, and reboot.
🔎 After Reboot: Assign the eGPU to a VM
- Go to Proxmox Web GUI
- Select Your VM → Hardware
- Click Add → PCI Device
- Choose AMD Radeon RX 7900 XTX
- Check:
- ✅ PCIe
- ✅ All Functions
- ✅ Primary GPU (if using for display)
- Start the VM and install AMD drivers inside the guest OS.
🚀 Next Steps
Would you like:
- A script to check if your eGPU is properly isolated?
- A guide for setting up ROCm inside the VM for AI/LLM tasks?
- A troubleshooting guide for common PCI passthrough errors?
Let me know how you’d like to proceed! 🚀🔥
Great! Below are the scripts and guides for:
- Checking if your eGPU is properly isolated in an IOMMU group ✅
- Setting up ROCm inside the VM for AI/LLM tasks ✅
- A troubleshooting guide for common PCI passthrough errors ✅
1️⃣ Script: Check if eGPU is Properly Isolated in an IOMMU Group
This script will check if your AMD Radeon RX 7900 XTX is in an isolated IOMMU group and notify you if an ACS override is needed.
📜 Script: check-iommu-group.sh
#!/bin/bash
echo "🔍 Checking IOMMU Groups for PCI Passthrough Compatibility..."
# Ensure IOMMU is enabled (the kernel only populates iommu_groups when an IOMMU is active)
if [[ -z "$(ls -A /sys/kernel/iommu_groups 2>/dev/null)" ]]; then
    echo "❌ IOMMU is NOT enabled! You need to enable it in GRUB."
    exit 1
fi
# List all IOMMU groups and their devices
echo "📦 Listing IOMMU Groups..."
for d in /sys/kernel/iommu_groups/*/devices/*; do
    group=${d#/sys/kernel/iommu_groups/}
    echo "Group ${group%%/*}: $(basename "$d")"
done
# Find the eGPU's PCI IDs
# Locate the card by the vendor:device ID from the earlier lspci example (adjust for your card)
GPU_ID="1002:744c"
GPU_ADDR=$(lspci -D -d "$GPU_ID" | awk '{print $1; exit}')
if [[ -z "$GPU_ADDR" ]]; then
    echo "❌ eGPU not detected. Check connections!"
    exit 1
fi
# The card's audio function is function .1 of the same PCI slot
AUDIO_ADDR="${GPU_ADDR%.*}.1"
# Check IOMMU group of the GPU and Audio device (sysfs names group members by PCI address)
GPU_GROUP=$(basename "$(readlink -f "/sys/bus/pci/devices/$GPU_ADDR/iommu_group")")
AUDIO_GROUP=$(basename "$(readlink -f "/sys/bus/pci/devices/$AUDIO_ADDR/iommu_group")")
echo "🎯 GPU IOMMU Group: $GPU_GROUP"
echo "🎯 Audio IOMMU Group: $AUDIO_GROUP"
# If GPU and Audio are not in the same group, warn the user
if [[ "$GPU_GROUP" != "$AUDIO_GROUP" ]]; then
echo "⚠️ WARNING: The GPU and its Audio device are in different IOMMU groups."
echo "➡️ You may need the ACS override patch to fix this."
echo "➡️ Run the following command to apply ACS override:"
echo " sudo nano /etc/default/grub"
echo " Edit this line:"
echo " GRUB_CMDLINE_LINUX_DEFAULT=\"quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction\""
echo " Then run: sudo update-grub && reboot"
else
echo "✅ Your eGPU is properly isolated in an IOMMU group!"
fi
🚀 How to Use
- Save the script:
nano check-iommu-group.sh
- Paste the script and save (CTRL + X, then Y, then Enter).
- Make it executable:
chmod +x check-iommu-group.sh
- Run the script:
sudo ./check-iommu-group.sh
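For a closer look at what shares a group with the GPU, here is a small Python sketch of the same idea. It reads the IOMMU groups from sysfs and reports which other devices sit in the group that holds the card. The function names are hypothetical, the slot `0000:03:00` is taken from the earlier example output, and the `root` parameter exists only so the code can be tried against a test directory.

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict[int, list[str]]:
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups: dict[int, list[str]] = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def group_sharing(groups: dict[int, list[str]], slot: str):
    """Return (group number, other devices) for the group holding the slot.

    Returns None if no function of the slot appears in any group.
    """
    prefix = slot + "."
    for number, devices in sorted(groups.items()):
        if any(dev.startswith(prefix) for dev in devices):
            return number, [dev for dev in devices if not dev.startswith(prefix)]
    return None


if __name__ == "__main__":
    print(group_sharing(iommu_groups(), "0000:03:00"))
```

An empty list of other devices means the card's functions have the group to themselves, which is the situation you want for passthrough.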
2️⃣ Setting Up ROCm for AI/LLM Tasks in a VM
This guide will set up ROCm (Radeon Open Compute) inside your VM for AI workloads.
📌 Steps
1️⃣ Ensure Your VM Uses PCI Passthrough
- Make sure the AMD Radeon RX 7900 XTX is passed through to the VM.
- Inside the VM, check if it’s detected:
lspci | grep VGA
2️⃣ Install ROCm
- Ubuntu 22.04 / Debian-based Systems
sudo apt update
sudo apt install -y rocm-opencl rocm-utils rocminfo
- Fedora
sudo dnf install -y mesa-vulkan-drivers rocm-opencl
3️⃣ Verify ROCm Installation
rocminfo
This should list the AMD GPU.
4️⃣ Install PyTorch with ROCm
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
5️⃣ Test GPU Acceleration in Python
import torch
print(torch.cuda.is_available()) # Should return True
print(torch.cuda.get_device_name(0)) # Should return "AMD Radeon RX 7900 XTX"
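If the quick test prints False, a slightly more defensive check can tell you whether PyTorch is missing, whether it cannot see a GPU, or whether you accidentally installed a CUDA build instead of a ROCm one. This is a minimal sketch and the function name is just for illustration. It relies on `torch.version.hip`, which is set on ROCm builds and is None otherwise.

```python
def describe_backend() -> str:
    """Summarize which GPU backend PyTorch is using, if any."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no GPU is visible to it"
    # ROCm builds expose the GPU through the torch.cuda API and set torch.version.hip
    backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
    return f"{backend}: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_backend())
```

A "no GPU is visible" result inside the VM usually points back to the passthrough or driver steps, not to PyTorch itself.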
🔥 Your AMD eGPU is now ready for AI/LLM tasks inside the VM!
3️⃣ Troubleshooting Guide for Common PCI Passthrough Errors
| Error | Fix |
|---|---|
| VM won’t start after adding eGPU | Ensure pci=assign-bus is not in GRUB. |
| eGPU not detected inside VM | Run `lspci` inside the VM and check whether the GPU is listed. |
| Code 43 or Code 31 in Windows VM | Enable vendor-reset and blacklist amdgpu in Proxmox. |
| Screen flickering inside VM | Check virtio settings and enable “PCIe” passthrough in Proxmox. |
| VM crashes with “VFIO error” | Use pcie_acs_override=downstream,multifunction in GRUB. |
📌 Fixing PCI Passthrough Issues
1️⃣ Check Kernel Logs for Errors
dmesg | grep -i vfio
2️⃣ If the GPU is Not Isolated
- Run:
./check-iommu-group.sh
- If the GPU and Audio are in different groups, enable ACS override:
nano /etc/default/grub
Modify:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction"
Update and reboot:
update-grub && reboot
3️⃣ Reset the eGPU with Vendor Reset
If the eGPU fails to reinitialize after stopping a VM, remove the device and rescan the PCI bus (replace 0000:03:00.0 with your card's address):
echo "1" > /sys/bus/pci/devices/0000:03:00.0/remove
echo "1" > /sys/bus/pci/rescan
🎯 Final Summary
| ✅ Task | Script/Guide |
|---|---|
| Check if eGPU is properly isolated | check-iommu-group.sh |
| Install ROCm for AI/LLM tasks | Follow ROCm setup steps |
| Fix PCI Passthrough Errors | Use troubleshooting guide |
🚀 What’s Next?
Would you like:
- A guide to install and run Llama 3/Mistral AI models on your eGPU?
- A script to auto-detect and fix common eGPU passthrough issues?
- An advanced Proxmox VM tuning guide for AI workloads?
Let me know how I can further optimize your setup! 🚀🔥
