Skip to content

Commit

Permalink
[lspci] Fix handling of more than one PCIe root port (#1822)
Browse files Browse the repository at this point in the history
Summary:
I'm guessing most machines only have one PCIe root port, but things like ARM hosts tend to have more, and that breaks as it creates collisions.

As we want to keep backwards compatibility, and `lspci` is run with the defaults that doesn't show the root port if it's `0000`, only add it when it's non-zero, and also add it as a new extra field called `root_port` to denote the fact that there is one (useful when using the BDF to find files on disk, as when there is no `root_port`, one needs to add the extra `0000:`.

More detailed explanation I wrote on the slack channel:
On Linux, we run `lspci -vnnmk`, which will show the BDF in the first `Device:` field. So the first bug is that the regular expression on https://github.com/chef/ohai/blob/main/lib/ohai/plugins/linux/lspci.rb#L54 is missing a `\` before the `.`, therefore will will match `0001:02:03.4` as `01:02:03` which is plain broken. But fixing that we end up with `02:03.4` which is also plain wrong as it loses the root complex.
I see a few ways we could fix this:

1. Highly disruptive: we add -D to lspci, which means we'll always get the root complex even when it's 0000 . This is terrible 'cause it will basically break everything
2. We do this but only with an option in ohai. Less terrible, but means people have to know about that option
3. We do this but only when we notice more than one root complex. That's not really doable
4. We add a new field, root_complex in the PCI structure, which is 0000 by default and the actual root complex for everything else

I like 4 the most, as it mirrors closely what lspci does (https://github.com/pciutils/pciutils/blob/master/lspci.c#L285)
or maybe 5. We do it the same way lspci does it: if the root complex is 0, we just store the BDF as the ID, if it's not, we store the root complex + BDF
we can't really do 4 'cause we might have collisions... Like if you have 0000:01:02.3 and 0001:01:02.3 then they have to be stored differently. So we can either do:
Option A: 01:02.3 and 0001:01:02.3 (this can co-exist with everything today)
Option B: 00000:1:02.3 and 0001:01:02.3 (this most likely needs to become pci2)

Test Plan:
Before:
```
$ ohai | jq .pci\|keys
[
  "00:00.0",
  "01:00.0",
  "01:00.1",
  "02:00:0",
  "04:00:0",
  "08:00:0",
  "08:01:0",
  "08:02:0",
  "08:03:0",
  "08:04:0",
  "08:05:0",
  "09:00:0",
  "09:01:0"
]
```

After:
```
$ ohai | jq .pci\|keys
[
  "0002:00:00.0",
  "0004:00:00.0",
  "0008:00:00.0",
  "0008:01:00.0",
  "0008:02:01.0",
  "0008:02:02.0",
  "0008:03:00.0",
  "0008:04:00.0",
  "0008:05:00.0",
  "0009:00:00.0",
  "0009:01:00.0",
  "00:00.0",
  "01:00.0",
  "01:00.1"
]
```

Reviewers: jaymzh

Closes: #1693

Signed-off-by: Olivier Raginel <babar@meta.com>
  • Loading branch information
Babar authored May 28, 2024
1 parent 6d64237 commit 6337ab5
Show file tree
Hide file tree
Showing 2 changed files with 20 additions and 2 deletions.
6 changes: 5 additions & 1 deletion lib/ohai/plugins/linux/lspci.rb
Original file line number Diff line number Diff line change
Expand Up @@ -51,10 +51,14 @@ def standard_array(devices, d_id, tag, line)

case dev[0]
when "Device" # There are two different Device tags
if ( tmp = dev[1].match(/(#{hh}:#{hh}.#{h})/) )
if ( tmp = dev[1].match(/(#{hhhh}:)?(#{hh}:#{hh}\.#{h})/) )
# We have a device id
d_id = tmp[0] # From now on we will need this id
devices[d_id] = Mash.new
if tmp[1]
# We have a root complex
devices[d_id]["root_port"] = tmp[1][0..-2]
end
else
standard_form(devices, d_id, hhhh, "device", dev[1])
end
Expand Down
16 changes: 15 additions & 1 deletion spec/unit/plugins/linux/lspci_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,14 @@
Driver: nvme
Module: nvme
NUMANode: 0
Device: 0005:00:00.0
Class: PCI bridge [0604]
Vendor: Intel Corporation [8086]
Device: C620 Series Chipset Family PCI Express Root Port #1 [a190]
Rev: f9
Driver: pcieport
NUMANode: 0
LSPCI
allow(plugin).to receive(:shell_out).with("lspci -vnnmk").and_return(
mock_shell_out(0, @stdout, "")
Expand All @@ -96,7 +104,7 @@
it "lists all devices" do
plugin.run
expect(plugin[:pci].keys).to eq(
["00:1f.3", "00:1f.4", "00:1f.6", "02:00.0", "04:00.0", "05:00.0"]
["00:1f.3", "00:1f.4", "00:1f.6", "02:00.0", "04:00.0", "05:00.0", "0005:00:00.0"]
)
end

Expand Down Expand Up @@ -129,5 +137,11 @@
expect(plugin[:pci]["04:00.0"]["driver"]).to eq(["iwlwifi"])
expect(plugin[:pci]["04:00.0"]["module"]).to eq(["iwlwifi"])
end

it "populates the root port field" do
plugin.run
expect(plugin[:pci]["0005:00:00.0"]["driver"]).to eq(["pcieport"])
expect(plugin[:pci]["0005:00:00.0"]["root_port"]).to eq("0005")
end
end
end

0 comments on commit 6337ab5

Please sign in to comment.