Better sysadmin access management #4

Open
dalcde opened this issue Oct 1, 2020 · 5 comments
Labels
suggestions needed A problem in need of a solution triage This issue needs triaging

Comments

@dalcde

dalcde commented Oct 1, 2020

Problem statement

Currently, many SRCF services run on dedicated VMs, which only sysadmins are
given access to. For many of them, access control is managed by NIS ---
sysadmins get an -adm account on pip, which can be used to log in to any of
these machines with the same password. This currently comes with several problems:

  • This requires the machines to be in the internal network. In particular, it
    cannot be used on externally hosted servers.

  • We do not sync home directories, so we need to copy ssh keys into each VM by
    hand.

  • As a counterpart, it is troublesome to revoke ssh keys, because it has to be
    performed on each machine individually.

  • We may want to forbid password login into -adm accounts, but this is not
    possible due to the previous problem.

  • This presents us with a single point of failure --- if NIS master goes down,
    we cannot access any of our machines via ssh. We've recently had various
    incidents due to NIS misbehaving.

  • NIS is not well-supported by Ubuntu; we are using our own NIS build. While
    moving away from NIS is unlikely to happen soon, it would be nice to reduce
    dependence on NIS.

A better method to manage sysadmin VM access would be welcome.

Requirements

  • It should not rely on the machine being on a trusted network.

  • It should be impossible to escalate from a sysadmin's normal SRCF account to
    access to -adm accounts.

  • Sysadmins should not be locked out of the -adm account because another
    machine has failed (no single point of failure).

  • It should be based on SSH keys instead of passwords.

  • It should be easy to revoke keys.

  • It should be easy to give non-sysadmins access to the machine.

Possible solutions

Use SSH certificate authorities (see this Facebook engineering post).
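
As a rough illustration of what an SSH CA workflow might involve, here is a minimal sketch that only builds the `ssh-keygen` command lines for signing a sysadmin's public key and for adding a key to a key revocation list (KRL). The function names, file paths, one-week validity and the choice of the -adm username as principal are all assumptions made for the example, not something decided in this issue.

```python
from typing import List, Optional, Sequence


def sign_user_key_cmd(
    ca_key: str,
    pubkey: str,
    key_id: str,
    principals: Sequence[str],
    validity: str = "+1w",
    serial: Optional[int] = None,
) -> List[str]:
    """Build an `ssh-keygen -s` command that signs a user public key.

    All arguments are hypothetical values chosen by the caller; nothing
    here is run, the command is only assembled.
    """
    if not principals:
        raise ValueError("at least one principal is required")
    cmd = [
        "ssh-keygen",
        "-s", ca_key,                # CA private key used for signing
        "-I", key_id,                # certificate identity, shows up in sshd logs
        "-n", ",".join(principals),  # usernames the certificate is valid for
        "-V", validity,              # validity interval, e.g. "+1w"
    ]
    if serial is not None:
        cmd += ["-z", str(serial)]   # serial number, useful for later revocation
    cmd.append(pubkey)
    return cmd


def revoke_keys_cmd(krl_path: str, pubkeys: Sequence[str], update: bool = True) -> List[str]:
    """Build an `ssh-keygen -k` command that adds public keys to a KRL file."""
    cmd = ["ssh-keygen", "-k", "-f", krl_path]
    if update:
        cmd.append("-u")             # update an existing KRL instead of replacing it
    cmd += list(pubkeys)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and names, for illustration only.
    print(sign_user_key_cmd("/root/ca/user_ca", "alice-adm.pub", "alice-adm", ["alice-adm"]))
    print(revoke_keys_cmd("/etc/ssh/revoked_keys", ["old-key.pub"]))
```

On each VM, sshd would then need to trust the CA's public key (the `TrustedUserCAKeys` option in sshd_config) and, for revocation, read the KRL (the `RevokedKeys` option). The KRL still has to reach every machine somehow, so this does not by itself settle the distribution question.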

@dalcde dalcde added suggestions needed A problem in need of a solution triage This issue needs triaging labels Oct 1, 2020
@edwinbalani
Member

I think we could eliminate dependence on NIS if we wanted to.

There are libnss modules like libnss-extrausers that let you supply a passwd-style file at a path of your choice (ditto for shadow, etc.). We could have a small adm-passwd list with the necessary info:

  • -adm username
  • uid/gid (best to match against the NIS passwd database)
  • GECOS - i.e. real name

In essence, this would be an extract from the NIS passwd database, filtered to only contain the -adm accounts.
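
The extract described above could be sketched roughly as follows. This is only an illustration: it assumes the NIS passwd map is available as passwd(5)-style lines (for example from `ypcat passwd`), that -adm accounts are exactly those whose username ends in `-adm`, and the function name and sample data are made up.

```python
from typing import Iterable, List


def extract_adm_entries(passwd_lines: Iterable[str], suffix: str = "-adm") -> List[str]:
    """Return passwd(5)-style lines for accounts whose name ends in `suffix`.

    The password field is replaced with "x" so that no hash can leak into
    the extract, even if the source map happens to carry one.
    """
    kept = []
    for raw in passwd_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            continue  # skip malformed entries rather than propagate them
        if not fields[0].endswith(suffix):
            continue
        fields[1] = "x"
        kept.append(":".join(fields))
    return kept


if __name__ == "__main__":
    # Hypothetical sample data, not real SRCF accounts.
    sample = [
        "alice:x:1001:1001:Alice Example:/home/alice:/bin/bash",
        "alice-adm:x:2001:2001:Alice Example:/home/alice-adm:/bin/bash",
    ]
    for entry in extract_adm_entries(sample):
        print(entry)
```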


I think this could be a low-effort solution to distributing the minimum info sshd needs to be able to authenticate users with keys -- since SSH key authentication bypasses PAM password checking (which is where shadow entries would be needed). Note that if we don't want sysadmin password access, we definitely wouldn't even need an 'adm-shadow' file.

Additionally, I would say the information in the adm-passwd file would be far less sensitive than our NIS maps, and if it did leak also wouldn't disclose the personal data of the entire user base. (The names of the sysadmins are public on the web, for instance.)

As for how to distribute the adm-passwd file: probably with something like rsync as in #13, or a script to push it out. Given that sysadmins come and go so rarely, it could be a manually-triggered action.
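
Whichever transport is used, the receiving side probably wants to validate the file and swap it into place atomically, so that a half-written or corrupt file never becomes the live one. A minimal sketch, assuming a passwd(5)-style file with seven colon-separated fields; the function name and destination path are hypothetical:

```python
import os
import tempfile


def install_passwd_file(content: str, dest: str) -> None:
    """Validate passwd(5)-style content, then atomically replace `dest` with it."""
    for lineno, line in enumerate(content.splitlines(), 1):
        if line and len(line.split(":")) != 7:
            raise ValueError(f"line {lineno}: expected 7 colon-separated fields")

    directory = os.path.dirname(os.path.abspath(dest))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".adm-passwd.")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        os.chmod(tmp, 0o644)   # passwd-style files must be world-readable
        os.replace(tmp, dest)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


if __name__ == "__main__":
    # Hypothetical destination; a real deployment would use whatever path
    # the chosen NSS module reads from.
    install_passwd_file(
        "alice-adm:x:2001:2001:Alice Example:/home/alice-adm:/bin/bash\n",
        "./adm-passwd",
    )
```

Readers of the file see either the old version or the new one, never a partial write, which matters if sshd looks up a user while the update is in progress.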


Before anyone says this sounds like a reinvention of NIS: yes, it is, apart from the part that depends on a secure isolated network and ports <1024. Most proposed solutions to this problem will undoubtedly look like a reinvention of NIS.

@CMTC

CMTC commented Dec 17, 2021

I think that this is at best a sticking plaster. My impression is that if the SRCF is ever going to utilise more than one site effectively, it's going to have to find a way to replace NIS (and everything else that trusts the network) in more ways than just this. I expect that ultimately the answer is probably going to look like or be Kerberos.

@edwinbalani
Member

edwinbalani commented Dec 17, 2021

> I expect that ultimately the answer is probably going to look like or be Kerberos.

I wouldn't want that. Assuming a future with Kerberos itself, we'd have this situation:

  • You need a ticket to do anything. (Fair enough, this is how Kerberos hands out auth)
  • Tickets are time-limited and require refreshment with your Kerberos password (or a keytab file, which is equivalent). You either have to supply this on every login or keep it in plaintext/exploitable form somewhere, such as your SRCF homedir.

This is incompatible with passwordless access options (or at least a major pain to work around), which I think we should aim for as a general principle above all.

Despite all this, you do still need a way of distributing directory information -- LDAP could fill that gap, although if your LDAP goes down then you're up the creek as you would have been with NIS. (I've never toyed with LDAP but it also just seems to have a lot more complexity that would duplicate our own memberdb.) So, I'm strongly in favour of storing copies of our user directory (passwd and group maps) everywhere, and maybe even just not distributing the shadow map to servers that don't need it. (At this point I'm talking all users, not just the -adms.) In other words, make near-everything passwordless -- and I think this would be absolutely fine because we can reasonably require users of some services, which may live off-Thunder, to use something like SSH keys for authentication, and existing services in Thunder like shell hosting and authenticated mail can keep supporting passwords.

PS: If NFS is Kerberised too, then long-running jobs that require access to the filesystem become an absolute nightmare once your ticket expires.


One way of getting directory data everywhere would be to make every machine a NIS replica, although that's a bit heavy and as already said, NIS is insecure by default and not excellently maintained in the context of Ubuntu.

After a bit more searching around last night I did find nsscache, which seems to be a nice customisable/extensible tool for synchronising directory information between servers in any way you choose -- it seems that out of the box it will do HTTP, S3 (so, flavoured HTTP), LDAP, and your own mechanism if you write a Python class for it. nsscache has an associated library libnss_cache.so for integrating with Name Service Switch, to make the information available to getpwent(3) and friends, although it can also generate Berkeley DB files for use with libnss_dbm.so instead.
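
Whichever NSS backend ends up serving the data, a quick way to check that an account resolves on a given host is to ask the C library, which is what sshd does too. A small sketch using Python's standard `pwd` module, which wraps the system's getpwnam lookup and therefore goes through Name Service Switch; the function name and the account name in the usage example are hypothetical:

```python
import pwd
from typing import Optional


def lookup_account(name: str) -> Optional[pwd.struct_passwd]:
    """Return the passwd entry for `name` as seen through NSS, or None if absent."""
    try:
        return pwd.getpwnam(name)
    except KeyError:
        return None


if __name__ == "__main__":
    entry = lookup_account("alice-adm")  # hypothetical account name
    if entry is None:
        print("account does not resolve on this host")
    else:
        print(entry.pw_name, entry.pw_uid, entry.pw_gid, entry.pw_gecos)
```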


> (and everything else that trusts the network)

I'm guessing you mean NFS here, and/or services that rely on ident? Is there anything else?

@CMTC

CMTC commented Dec 17, 2021

I think the way I envisage Kerberos working would be users collecting tickets on their own machine, rather than on an SRCF one. Then the ticket and the problem of protecting it becomes much like an ssh key, except with a shorter life. My understanding is that sssd can be used on (SRCF) hosts to allow automatic renewal of tickets for long jobs, up to a maximum set by the KDC.

You're right; I used Kerberos as a shorthand for a combination of it and LDAP, which is what I think I mean. Although I've not tried it myself yet, LDAP seems like a good way to solve the problem of distributing directory information. I think you can set it up to do mutual TLS authentication, so no need for a privileged network. Presumably LDAP is easier to keep up than NIS because you aren't restricted to running it all in one physical location.

I was thinking of NFS as the obvious example of something else that trusts the network.

In any case, my assumption was that the main problem here is not directory services; these adm-only hosts are surely mainly only logged in to for maintenance, with the service they provide being unrelated to who can get a shell on them.

@edwinbalani
Member

> In any case, my assumption was that the main problem here is not directory services; these adm-only hosts are surely mainly only logged in to for maintenance, with the service they provide being unrelated to who can get a shell on them.

That's a fair point; my mind ended up wandering to user identity and authentication in general. That said, I am gravitating towards a general solution that meets all needs rather than an adm-specific one.


3 participants