Pass through a PCIe SAS HBA/RAID controller to a VM in Proxmox

This is mostly a paraphrase of the Proxmox documentation, plus a few tips I noticed are missing from it.

Go into your PC’s BIOS and make sure IOMMU/VT-d is enabled. Where the setting lives will differ depending on your brand of mainboard.

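On the Proxmox host, add iommu=pt to the variable GRUB_CMDLINE_LINUX_DEFAULT in the file /etc/default/grub, then run update-grub. On Intel CPUs with a kernel older than 6.8 you also need intel_iommu=on there; on AMD the IOMMU is enabled by default.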
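If you'd rather script that edit than do it by hand, here is a minimal sketch. The function name add_kernel_param is made up for this example, it assumes GNU sed (which is what Proxmox ships), and it assumes the parameter contains no regex special characters. It only touches the file you point it at, so you can try it on a copy first.

```shell
# add_kernel_param FILE PARAM   (hypothetical helper, not part of Proxmox)
# Appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE unless it is already there.
add_kernel_param() {
    file=$1
    param=$2
    # Already present as a whole word inside the quoted value? Then do nothing.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${param}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    # Otherwise insert the parameter just before the closing quote.
    sed -i -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file"
}

# On the host this would be:
#   add_kernel_param /etc/default/grub iommu=pt && update-grub
```

Running it twice leaves a single copy of the parameter, so it is safe to re-run.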

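Edit /etc/modules to add the following (on kernel 6.2 and newer, vfio_virqfd is part of the vfio module and that line can be left out):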

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
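A small sketch for adding those lines without creating duplicates if you run it more than once. The helper names ensure_line and add_vfio_modules are invented for this example; the module list is just the one above.

```shell
# ensure_line FILE LINE   (hypothetical helper)
# Appends LINE to FILE only if no identical line exists yet.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# add_vfio_modules FILE   (hypothetical helper)
# Adds each module from the list above, once.
add_vfio_modules() {
    for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
        ensure_line "$1" "$m"
    done
}

# On the host this would be:
#   add_vfio_modules /etc/modules
```

After the reboot, lsmod | grep vfio is a quick way to see which of them actually loaded.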

Run update-initramfs -u -k all to rebuild the initramfs for all installed kernels, then reboot.

When it’s back up, run dmesg | grep -e DMAR -e IOMMU -e AMD-Vi to check if IOMMU is enabled. This is what I get on my Ryzen box:

[ 0.063186] AMD-Vi: Using global IVHD EFR:0x246577efa2254afa, EFR2:0x0
[ 0.350585] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.351328] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 0.351329] AMD-Vi: Extended features (0x246577efa2254afa, 0x0): PPR NX GT [5] IA GA PC GA_vAPIC
[ 0.351334] AMD-Vi: Interrupt remapping enabled
[ 0.351729] AMD-Vi: Virtual APIC enabled
[ 0.351939] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 5.167729] AMD-Vi: AMD IOMMUv2 loaded and initialized
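If you want a yes/no answer instead of eyeballing the log, something like the sketch below works. The function name iommu_status is made up, and the three patterns are an assumption: they are messages commonly seen on Intel and AMD boxes (two of them appear in my output above), but the exact wording varies with hardware and kernel version, so treat a "not detected" as a hint to read the log yourself.

```shell
# iommu_status   (hypothetical helper)
# Reads kernel log text on stdin. Prints "enabled" and returns 0 if a line
# that usually indicates an active IOMMU is found, otherwise returns 1.
iommu_status() {
    if grep -Eq 'DMAR: IOMMU enabled|AMD-Vi: Interrupt remapping enabled|AMD-Vi: Found IOMMU'; then
        echo enabled
    else
        echo "not detected"
        return 1
    fi
}

# On the host this would be:
#   dmesg | iommu_status
```

Another sign that the IOMMU is active is that /sys/kernel/iommu_groups contains numbered directories.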

It’s important that the PCIe device is in its own IOMMU group and not shared with other PCIe cards/devices. This one-liner lists every group and the devices in it:

shopt -s nullglob; for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do echo "IOMMU Group ${g##*/}:"; for d in $g/devices/*; do echo -e "\t$(lspci -nns ${d##*/})"; done; done;

Here’s the output on my machine:

IOMMU Group 0:
	00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 1:
	00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 2:
	00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 3:
	00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 4:
	00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 5:
	00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 6:
	00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 7:
	00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14dd]
IOMMU Group 8:
	00:08.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14dd]
IOMMU Group 9:
	00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 71)
	00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 10:
	00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e0]
	00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e1]
	00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e2]
	00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e3]
	00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e4]
	00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e5]
	00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e6]
	00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e7]
IOMMU Group 11:
	01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f4] (rev 01)
IOMMU Group 12:
	02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	03:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
IOMMU Group 13:
	02:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	04:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f4] (rev 01)
	05:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	05:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	05:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	05:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	05:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	07:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
	09:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f7] (rev 01)
	0a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f6] (rev 01)
IOMMU Group 14:
	02:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	0b:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f7] (rev 01)
IOMMU Group 15:
	02:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
	0c:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f6] (rev 01)
IOMMU Group 16:
	0d:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E18 PCIe4 NVMe Controller [1987:5018] (rev 01)
IOMMU Group 17:
	0e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c4)
IOMMU Group 18:
	0e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
IOMMU Group 19:
	0e:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] VanGogh PSP/CCP [1022:1649]
IOMMU Group 20:
	0e:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b6]
IOMMU Group 21:
	0e:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b7]
IOMMU Group 22:
	0f:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b8]

You can see that the SAS controller is in its own group (12), shared only with its PCI bridge. If I put the card into the slot above, it ends up in group 13, together with the Ethernet, USB and SATA controllers. Passing the SAS controller through to a guest VM from that slot takes out the Ethernet, USB and SATA controllers too, because the whole group goes with it. This is why I had to buy a SAS controller instead of using the onboard SATA.
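When you only care about one card, scanning the full list gets tedious. Here is a sketch that prints the group of a single device and whatever else shares it, read straight from sysfs. The function name group_mates and the optional second argument (an alternative sysfs root, handy for testing) are my own invention; the sysfs paths are the same ones the one-liner above walks.

```shell
# group_mates ADDR [SYSFS_ROOT]   (hypothetical helper)
# Prints the IOMMU group of PCI device ADDR (e.g. 0000:03:00.0), then every
# other device in that group, one per line.
group_mates() {
    addr=$1
    root=${2:-/sys}
    link=$root/bus/pci/devices/$addr/iommu_group
    [ -e "$link" ] || { echo "no IOMMU group for $addr" >&2; return 1; }
    # iommu_group is a symlink whose last path component is the group number.
    group=$(basename "$(readlink "$link")")
    echo "group $group"
    for d in "$root/kernel/iommu_groups/$group/devices"/*; do
        [ -e "$d" ] || continue
        name=${d##*/}
        [ "$name" = "$addr" ] || echo "$name"
    done
}

# On the host this would be:
#   group_mates 0000:03:00.0
```

If the only other entry is the PCI bridge the card hangs off, as with group 12 above, the card is fine to pass through.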

Now run lspci -nnk to see which driver your device is using. I have already set my SAS controller up for passthrough, so the kernel driver in use is vfio-pci. Before that, it showed mpt3sas as both the driver in use and the kernel module:

03:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
	Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA [1000:3020]
	Kernel driver in use: vfio-pci
	Kernel modules: mpt3sas
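The two things you need from that output are the vendor:device ID on the first line and the driver in use. As a rough sketch, this awk filter pulls both out of the lspci -nnk text for a single device. The name pci_summary is made up, and it assumes the output layout shown above (ID in square brackets on the unindented first line).

```shell
# pci_summary   (hypothetical helper)
# Reads `lspci -nnk` output for ONE device on stdin and prints
# "<vendor:device> <driver in use>" ("none" if no driver is bound).
pci_summary() {
    awk '
        # Unindented line = device header; keep the last [xxxx:xxxx] on it.
        /^[0-9a-f]/ {
            id = ""
            line = $0
            while (match(line, /\[[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f][0-9a-f][0-9a-f]\]/)) {
                id = substr(line, RSTART + 1, RLENGTH - 2)
                line = substr(line, RSTART + RLENGTH)
            }
        }
        /Kernel driver in use:/ { driver = $NF }
        END { print id, (driver == "" ? "none" : driver) }
    '
}

# On the host this would be:
#   lspci -nnk -s 03:00.0 | pci_summary
```

The subsystem ID on the indented second line is deliberately ignored, since vfio-pci is matched against the device ID from the first line.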

The aim here is to make sure the PCIe device is not used by the Proxmox host, leaving it free to be handed to your guest VM. We do this by editing /etc/modprobe.d/pve-blacklist.conf. Here are the contents of mine:

options vfio-pci ids=1000:0087
blacklist mpt3sas

This tells the host that PCIe device 1000:0087 (you can get this ID from the lspci -nnk command above) should use the vfio-pci module, and that the mpt3sas driver should not be loaded at all. If you have two or more PCIe cards using the same driver but only want one of those cards passed through to a VM, don’t blacklist the driver. Instead, keep the options vfio-pci line and force the Proxmox host to load the vfio-pci module before the actual driver:

Create a conf file: nano /etc/modprobe.d/vfio-before-mpt3sas.conf
Add this line to it to make sure vfio-pci is loaded before mpt3sas: softdep mpt3sas pre: vfio-pci
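Put together, the softdep variant ends up as one small file. The sketch below writes it for you; write_vfio_conf is a made-up helper name, and the directory argument is there so you can write to a scratch directory first and inspect the result before touching /etc/modprobe.d.

```shell
# write_vfio_conf DIR IDS DRIVER   (hypothetical helper)
# Writes DIR/vfio-before-DRIVER.conf with the options and softdep lines.
write_vfio_conf() {
    dir=$1 ids=$2 driver=$3
    cat > "$dir/vfio-before-$driver.conf" <<EOF
# Bind these vendor:device IDs to vfio-pci
options vfio-pci ids=$ids
# Load vfio-pci before $driver so it can claim the device first
softdep $driver pre: vfio-pci
EOF
}

# On the host this would be:
#   write_vfio_conf /etc/modprobe.d 1000:0087 mpt3sas
```

Keep in mind that ids= matches on vendor:device, so two identical cards with the same ID would both be claimed by vfio-pci.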

Whichever method you choose, you need to rebuild the initramfs again afterwards:

update-initramfs -u

Then reboot. When the system is back up, run lspci -nnk to see if the changes worked. You should see Kernel driver in use: vfio-pci under your PCIe device. If you don’t, something’s cooked, because the device has to be using the vfio-pci driver before it can be passed through.
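The same check can be done without lspci by reading the driver symlink in sysfs, which is handy in a script. This is only a sketch: driver_in_use and is_vfio_bound are names I made up, and the optional second argument is an alternative sysfs root for testing.

```shell
# driver_in_use ADDR [SYSFS_ROOT]   (hypothetical helper)
# Prints the name of the driver bound to PCI device ADDR, or "none".
driver_in_use() {
    link=${2:-/sys}/bus/pci/devices/$1/driver
    if [ -L "$link" ]; then
        # The symlink points at .../drivers/<name>; keep the last component.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# is_vfio_bound ADDR [SYSFS_ROOT]   (hypothetical helper)
# Succeeds only if the device is currently bound to vfio-pci.
is_vfio_bound() {
    [ "$(driver_in_use "$@")" = "vfio-pci" ]
}

# On the host this would be:
#   is_vfio_bound 0000:03:00.0 && echo "ready for passthrough"
```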

Now you can create your VM. Some tips:

  • Use the q35 machine type so it gets PCIe support
  • When adding the PCIe device to the VM, select the “PCI-Express” option
  • Disable ACPI Support - I was getting a bunch of pci_hp_register failed with error -16 errors on boot with Debian 12 with ACPI enabled
  • If the guest runs Linux, you might get an irq N: nobody cared (try booting with the "irqpoll" option) error when trying to use the PCIe card. If this happens, do what the error says: add “irqpoll” to the GRUB_CMDLINE_LINUX_DEFAULT variable in /etc/default/grub in the guest, then run update-grub and reboot.
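The first three tips can also be applied from the host shell with qm set instead of the web UI. The sketch below only prints the commands so you can review them before running anything. The function name passthrough_cmds is made up, VM ID 100 in the usage line is a placeholder, and hostpci0 assumes this is the first passed-through device on the VM.

```shell
# passthrough_cmds VMID ADDR   (hypothetical helper)
# Prints the qm commands for the tips above without executing them.
passthrough_cmds() {
    vmid=$1 addr=$2
    echo "qm set $vmid --machine q35"            # q35 machine type for PCIe support
    echo "qm set $vmid --hostpci0 $addr,pcie=1"  # same as ticking "PCI-Express"
    echo "qm set $vmid --acpi 0"                 # disable ACPI support
}

# Review the output, then run the lines you want on the Proxmox host:
#   passthrough_cmds 100 03:00.0
```

Make these changes while the VM is shut down so they all take effect on the next start.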