Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supply chain security / simplify dependencies #152

Closed
lars-t-hansen opened this issue Apr 2, 2024 · 10 comments
Closed

Supply chain security / simplify dependencies #152

lars-t-hansen opened this issue Apr 2, 2024 · 10 comments
Assignees
Labels
question Further information is requested

Comments

@lars-t-hansen
Copy link
Collaborator

This has been on my mind for a while but the recent exploitation of xz makes it a good time to talk about it. Sonar runs (or will run) on every node on almost every supercomputer in Norway. It pulls in not too many dependencies but some of them (clap, serde, libc) are quite large and actively maintained and some (subprocess, libc) are expected to do dark things by their nature.

https://news.ycombinator.com/item?id=39902241
https://research.swtch.com/xz-timeline

serde is an interesting case in point because for a while (possibly still but I've not been able to substantiate that) it shipped a precompiled binary in order to speed up compilation time, serde-rs/serde#2538. With this type of setup, vetting what we pull in and how it works is very hard.

Sonar actually has very simple needs and most of its libraries have been pulled in not because we need the full functionality but only a small part of it. It may be to our benefit to go over the list of dependencies, remove what we can, and perhaps audit and pin what we can't remove, and define some kind of upgrade protocol. For example, we have simple needs for csv and json production and for command line parsing and could easily replace those libraries if we wanted to.

I'm uncertain about how worried to be about this. Sonar currently runs with no special privileges and its only superpowers are that (a) it runs regularly and everywhere and (b) if an admin sees it running, she will think it is fine that it's running. It is otherwise no worse than any user program, indeed user programs whose source and binary we do not control. On the other hand, being everywhere and running all the time under the radar it is a very convenient stepping stone for a more sophisticated attack leveraging other mechanisms. And it is not out of the realm of the possible that Sonar will need some special powers eventually, for some types of monitoring. And if not Sonar, then maybe a program that Sonar runs - same problem.

@lars-t-hansen lars-t-hansen added the question Further information is requested label Apr 2, 2024
@bast
Copy link
Member

bast commented Apr 2, 2024

Great point. Let's go together through list of dependencies and let's check what we use them for and how much work it would be to replace some of them.

@lars-t-hansen
Copy link
Collaborator Author

lars-t-hansen commented Apr 3, 2024

List of dependencies with uses, and some preliminary commentary:

chrono = "0.4"

  • chrono::Local::now() for local time
  • looks like this delegates to std::SystemTime and our use could be replaced by that

clap = { version = "4.5", features = ["derive"] }

  • argument parsing and argument error reporting, but our needs are very simple
  • hand-writing this would be easy

csv = "1.3"

  • csv writer to ensure proper quoting, but our needs are very simple
  • hand-writing this would be easy

env_logger = "0.11"

  • env_logger::init()
  • see also log below
  • this was added to "improve logging" in changeset f10ae98, but it effectively doubles the size of the binary and pulls in many dependencies recursively

hostname = "0.3"

  • hostname::get(), this delegates trivially to libc

libc = "0.2"

  • sysconf(_SC_CLK_TCK), though i believe Linux hardcodes this value as 100 and has exposed that fact
  • getpwuid_r to map uid to user name
  • libc comes from the rust project and probably has a high degree of trust, it is OK to depend on it (and hard not to)

log = "0.4"

  • used for error logging in a couple of places
  • could maybe be replaced by env_logger, or vice versa? See above.

num_cpus = "1.16"

page_size = "0.6"

  • page_size::get(), delegates to libc
  • this can be gotten from /proc I believe, or the necessary code can be inlined in sonar

serde_json = "1.0.114"

  • json serialization of sysinfo data, which are pretty simple
  • we could hand-code this

serde = { version = "1.0.197", features = ["derive"] }

  • required by serde_json
  • required by csv

signal-hook = "0.3"

  • catch signals so that lock files can be cleaned up while the lock file is held

subprocess = "0.2"

  • execute subprocess
  • pulls in libc

wait-timeout = "0.2"

  • not used, looks like it can just be removed

@lars-t-hansen
Copy link
Collaborator Author

A sensible approach might be:

  • we trust libc, because we probably have to and because it comes from the Rust project
  • we crib code for things that are simple wrappers around libc, we already did this once for the code in users.rs, if the license allows it
  • we create new implementations for higher-level code that is simple and obvious to rewrite (clap, csv, serde_json)
  • we unify and standardize where we can (log, env_logger)
  • we consider whether std:: libraries exist for some of the things we pull in from 3rd party libs

This will leave us with libc, signal-hook, subprocess, and maybe a logging lib, which we can then consider further.

lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 3, 2024
@lars-t-hansen
Copy link
Collaborator Author

A deeper matter is also what is pulled in recursively by the packages that we do keep. Cargo.lock is currently almost 700 lines, naming 74 packages, including many packages with names like "android*" and "wasm*", none of which are obvious for us. It's possible some of these dependencies can be controlled by features, and for the dependencies we can't get rid of we should definitely investigate that.

Also, we should be mindful of recursive dependencies when purging. Trying to get rid of log and keeping env_logger, say, is not going to work because env_logger pulls in log.

lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 3, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 3, 2024
@lars-t-hansen
Copy link
Collaborator Author

Sensible rules for taking new dependencies might be:

  • can we build on what we already depend on, even by adding a little extra code?
  • if not, can we build on new std dependencies?
  • if not, are the recursive dependencies of the new dependency acceptable? (as an example, env_logger pulls in env_filter which pulls in all of regex which pulls in a big tree of packages to support the regex engine)
  • are there features in the new dependency we can flip to pull in only what we need?

(In addition to an existing rule that I'm not sure we're paying attention to: are we only pulling in code with a compatible and desirable license?)

@lars-t-hansen
Copy link
Collaborator Author

Additional background: https://research.swtch.com/deps

lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 3, 2024
bast added a commit that referenced this issue Apr 3, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
@lars-t-hansen lars-t-hansen self-assigned this Apr 4, 2024
@lars-t-hansen lars-t-hansen changed the title Supply chain security Supply chain security / simplify dependencies Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
@lars-t-hansen
Copy link
Collaborator Author

lars-t-hansen commented Apr 4, 2024

With all the patches applied we are down to 4 explicit dependencies and 8 dependencies in total (and code size is reduced by approximately 75%). Of the 4, only libc comes from the rust project proper, the other three being signal-hook, page_size, and subprocess. Of the 8, signal-hook pulls in a private dependency, and subprocess pulls in 3 packages for winapi, which we do not need but it's not really in the way, probably.

lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
bast added a commit that referenced this issue Apr 4, 2024
For #152 - Remove env_logger and log dependencies
bast added a commit that referenced this issue Apr 4, 2024
bast pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 4, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 5, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 5, 2024
@lars-t-hansen
Copy link
Collaborator Author

With the patch to remove the dependency of signal-hook we may have come to the end of the line. I'm skeptical of trying to remove subprocess, even though it might be possible to replace it with std::process. We might reconsider if we decide to fix #88. Instead, I will spend some time vetting the subprocess library and see if we might consider pinning the version.

@lars-t-hansen
Copy link
Collaborator Author

Re subprocess: On hniksic/rust-subprocess#71 we read "The library is currently not very actively maintained, since std::process now has many more features compared to when I started writing subprocess." Indeed the last release was in May, 2022. There are no serious open bugs and crates.io shows a steady stream of downloads, but there isn't any evidence that this library is the best long-term bet.

bast added a commit that referenced this issue Apr 5, 2024
lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 5, 2024
@lars-t-hansen
Copy link
Collaborator Author

The fix to #88 will simplify removing the subprocess dependency, if we want to.

lars-t-hansen pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 22, 2024
bast pushed a commit to lars-t-hansen/sonar that referenced this issue Apr 24, 2024
@bast bast closed this as completed in ab8155e May 16, 2024
bast added a commit that referenced this issue May 16, 2024
Fix #152 - Bump version, documentation, tidy up
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants