Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

896 - GDPR Update and CI fixes #909

Closed
wants to merge 5 commits into from
Closed

896 - GDPR Update and CI fixes #909

wants to merge 5 commits into from

Conversation

csullivannet
Copy link

@csullivannet csullivannet commented Feb 28, 2020

Intended as the resolution to #896

A number of changes were made to the Jira API server side that break the implementation of this library. Primarily, username/name is generally no longer a valid query or response. This has had the effect of breaking a number of the tests as well as how we authenticate to our test instance in the first place.

I've made the decision to continue passing the credentials in plaintext, due to limitations noted first by @Addono here: #871 (comment)
Simply put, the tests will fail on PRs as they would not have access to the credentials necessary for the test instance otherwise. Other solutions are possible, but other than the potential for vandalism I didn't see the need for these credentials to be secret.

However, I have moved these variables out of the code and into the environment created by Travis. This gives people the option to define their own set, should they desire. The tests will fail fast if otherwise not set, as discussed here: #896 (comment)

A couple of other fixes have been made, where fields have either been removed or made mandatory:

  • The key field has been removed the myself response
  • permissions is now a mandatory field for the mypermissions request

Finally, some fixes have been applied to pass linting.

This PR is not likely to be a complete fix for all GDPR related issues. Merely, it fixes the code where tested, or the tests themselves, such that CI will be green again.

@csullivannet csullivannet added the Status: In progress Currently being worked on. label Feb 28, 2020
@ssbarnea ssbarnea self-requested a review February 28, 2020 20:39
@csullivannet csullivannet changed the title 896 - Fix Jira auth 896 - GDPR Update to fix CI Mar 2, 2020
@csullivannet csullivannet changed the title 896 - GDPR Update to fix CI 896 - GDPR Update and CI fixes Mar 2, 2020
@csullivannet csullivannet added Status: Ready All information is available, and a solution needs to be created. and removed Status: In progress Currently being worked on. labels Mar 2, 2020
@Addono
Copy link
Collaborator

Addono commented Mar 5, 2020

However, I have moved these variables out of the code and into the environment created by Travis. This gives people the option to define their own set, should they desire. ...

How would people be supposed to overwrite these values? I tried to overwrite them through the "Environment variables from repository settings", however it seems that the ordering is such that the one in the .travis.yaml take precedence.

This pipeline passes, while the environment variables are overwritten to use non-existent credentials for the admin user:
image

.travis.yml Outdated
@@ -74,3 +74,8 @@ env:
global:
- secure: "pGQGM5YmHvOgaKihOyzb3k6bdqLQnZQ2OXO9QrfXlXwtop3zvZQi80Q+01l230x2psDWlwvqWTknAjAt1w463fYXPwpoSvKVCsLSSbjrf2l56nrDqnoir+n0CBy288+eIdaGEfzcxDiuULeKjlg08zrqjcjLjW0bDbBrlTXsb5U="
- PIP_DISABLE_PIP_VERSION_CHECK=1
- CI_JIRA_URL="https://pycontribs.atlassian.net"
Copy link
Member

@ssbarnea ssbarnea Mar 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not put credentails in plain text. You should be able to define as project config and avoid using the travis file for them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this is a limitation of Travis. I've added all the variables to the Travis settings and dropped the env auth creds commit. This causes the build to fail on a KeyError as they are not passed to my fork for the PR.

This is by design from Travis CI:
https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions

If we expect external contributors to fork the project and have their PRs run these same tests, we necessarily must expose the credentials to these users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option would be to obfuscate them, e.g. store them encrypted together with the password. Security through obfuscation doesn't make it safer, but at least it requires someone to put a bit of effort in to get them and makes it harder for bots to scrape them.

Copy link
Author

@csullivannet csullivannet Mar 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssbarnea Would Addono's proposal be an acceptable solution? I'm not certain there's any real workaround to this as it's a hard limitation of Travis. If having the credentials to the cloud test instance in the code (even obfuscated) is completely unacceptable then our efforts might be better redirected towards provisioning local test instances as part of the CI.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssbarnea Hi, just bumping this into view since it looks like this has been sitting for a bit and it appears to be crucial for this PR.

Key is no longer a valid field returned by 'myself'. Instead,
to quickly validate we can authenticate we will check the display name
is as we expect.
Adds the now mandatory permissions parameter to when querying
`my_permissions`.
@csullivannet
Copy link
Author

csullivannet commented Mar 5, 2020

How would people be supposed to overwrite these values? I tried to overwrite them through the "Environment variables from repository settings", however it seems that the ordering is such that the one in the .travis.yaml take precedence.

My intention was that these variables are read from the environment only as an intentional choice. You can run the tests against another server locally, but not via CI. For a PR to be passable, I would imagine the tests _must_be performed against the same server so that we don't have to chase down config ghosts.

@Addono
Copy link
Collaborator

Addono commented Mar 5, 2020

My intention was that these variables are read from the environment only as an intentional choice. You can run the tests against another server locally, but not via CI. For a PR to be passable, I would imagine the tests _must_be performed against the same server so that we don't have to chase down config ghosts.

Good point, I agree that it's more important that PR MR are forced to be explicit about config changes.

@@ -160,7 +145,8 @@ def __init__(self):
logging=False,
max_retries=self.max_retries,
)
if not self.jira_admin.current_user():

if not self.jira_admin.current_user("displayName") == "CI Admin":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks local tests running against my atlas-run-standalone initiated Jira instance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll find a value that the local tests and cloud tests can agree on that exists for the admin user or otherwise make changes to the tests so that they agree.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might not be such a big issue, there is much more to be configured before an instance spinned up with atlas-run-standalone works.

Shouldn't this check be more like "check if the current user is logged in ([optional] with any admin account)", rather than "check if the current user is logged in with an account with exact name XYZ"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my reading of the original test it was to check whether the call to myself works, which shows you are currently logged in. To be honest at this point I don't remember why exactly I used displayName, but I do recall I was having issues with getting this test to pass until I checked the username.

So in terms of the original intention of the test, I agree with you. It would be better for this test to pass as long as it returns anything, although I'm assuming it throws an error or otherwise not 200 if you're not logged in. I'll revisit this test.

else:
self.CI_JIRA_USER_PASSWORD = "sd4s3dgec5fhg4tfsds3434"
for var in env_auth_vars:
setattr(self, var, os.environ[var])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't really like this way of getting the environment variables. It's not really clear where the self.CI_JIRA_ADMIN. Took me a while before I figured out that the magic happened here, as I started reading somewhere halfway and my IDE couldn't tell me where they came from.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would if be preferable to go back to declaring them line-by-line (but without having a fallback if/else clause- i.e. they must still be set by the environment)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the way to go to me.

def current_user(self, field="key"):
"""Returns the username or emailAddress of the current user. For anonymous
users it will return a value that evaluates as False.
def current_user(self, field="accountId"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not backwards compatible and will break non-cloud versions of Jira.

Copy link
Collaborator

@Addono Addono Mar 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I noticed that similar BC breaking changes have already been committed to master (e.g. a4e6b1b). Not too fond of this, but I guess restoring the BC could better be left for a separate PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI tests are only against the cloud version, which seems to be the priority for this project. Given the limited resources available to maintain it, I would tend to agree with breaking older on-prem versions in favour of keeping track with the cloud and maintaining BC on a best-effort basis.

The GDPR compliance changes also represent breaking changes within a major version of the API. This represents a fair amount of work to restore functionality to the library as more things get deprecated and dropped. As far as I can tell it also has caused a fair amount of divergence between on-prem and cloud versions, increasing the difficulty of maintenance.

With these factors in mind I'd prefer to prioritize getting fixes in to bring CI back to green and raise issues that arise afterwards, so that the project can gain momentum again. With CI green, it should be easier to put forward smaller PRs that are easier to merge to fix BC piece-by-piece.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that getting the CI green again should be a top priority. Over the weekend I was working on getting the CI to run against a Jira instance spinned up as part of the pipeline, as to get rid of the credential management issue and have more reproducible builds. So in that effort fixing CI and maintaining BC align.

Anyway, this was just the first BC breaking change I ran into. Shouldn't be considered a blocker for this PR.

@iramisvalentincapco
Copy link

Hello @Addono and @ssbarnea I was wondering if this PR has met the criteria for approval yet? Given the large number of changes to the JIRA API to be GDPR-compliant this would be crucial for usage of this package lib.

I understand that having proper passing CI checks is critical for re-enabling PR approvals for this project. My understanding is that this PR would resolve the CI issues that seem to be blocking the 17 other PRs that are sitting in a blocked state.

@Addono
Copy link
Collaborator

Addono commented May 20, 2020

Hi @iramisvalentincapco, the goal of this PR is indeed to get the CI to work again. However, work on that hasn't completed - see the Travis CI pipelines which is still failing.

The reason why stems from the fact that to get the CI to pass again three issues needs to be resolved:

  1. Fixing the codebase and test-suite to be compatible with the latest version of Jira. As no fixes are merged, the drifting API specs cause the current test suite to fail.
  2. Provisioning of a Jira instance to test against. Previously, a public Jira Cloud instance was used. However, the credentials to this instance can only be used for branches inside the repository. As giving credentials to this Jira instance to CI runs against PRs from untrusted sources would risk having them exposed.
  3. Updating the CI to make use of the changed made to accommodate (1) and (2).

I have seen that a lot of people have been trying to fix 1, 2, 3 or a combination of them. My own experience is that solving all issues concurrently is troublesome, as it's too much to "just" do it all in a weekend and so many changes are required that you lose track of it yourself, hence for the reviewer of the PR it will be even harder to fathom.

@iramisvalentincapco
Copy link

iramisvalentincapco commented May 20, 2020

@Addono @ssbarnea It seems like @csullivannet has addressed 2 via this comment #909 (comment) is the proposed solution acceptable?

If so, would it then be acceptable for the main repo to have a "fix/ci" branch or something along those lines so that fixes for 1, 2 and 3 can be more safely merged without impacting the master/release branch?

@ssbarnea
Copy link
Member

Yes, good idea. Create a new branch called fix/ci and this will be used to bring it back to green as a collaborative effort.

@Addono
Copy link
Collaborator

Addono commented May 21, 2020

First contribution to fix/ci is ready: Managed to get a containerized version of Jira Software running. However, no other branch than master exists, so I cannot create a PR.
https://github.com/Addono/jira/tree/feature/ci-run-jira-in-sidecar-container

The source code for the jira-run-standalone container can be found here:
https://github.com/Addono/docker-jira-software-standalone

Barely any of the tests pass, so debugging that would be the next step. Probably some can be fixed by cherry-picking from all open PRs.

@ssbarnea
Copy link
Member

Now there is a new branch https://github.com/pycontribs/jira/tree/fix/ci -- feel free to feed it.

@Addono
Copy link
Collaborator

Addono commented May 21, 2020

@ssbarnea cool! Created PR #942

@iramisvalentincapco
Copy link

@csullivannet I think after taking in the changes from #942 and changing the merge branch to the new fix/ci branch this PR may be able to get some more movement on the other standing issues with the PR.

@Addono
Copy link
Collaborator

Addono commented May 28, 2020

Yes, more work can not be done. But having the CI now test against a Jira Server version instead of Jira Cloud shows that recently (traced it back somewhere around master~10) changes were introduced which completely break support for Jira Server.

I think it's important for QA that both Server and Cloud are tested, as their APIs are growing more and more apart. Therefore, having something work on one version does not guarantee that it works on the other. At the very least we can know and document that if a new version would be released now is not suitable for usage with Server environments.

Sadly, this also means that the CI issue isn't completely solved, as now the tests do not run against a Cloud environment. I would suggest reintroducing the tests which run against the Jira Server environment using the encrypted secrets similarly as what was used before. The downside is that this means that this won't be available on runs which originate from PRs. However, this will still be able to run on pushes to fix/ci, which would prevent faulty commits from reaching master. Re-adding this part of the CI should be pretty trivial and at least get full coverage whilst fixing all the broken tests.

Anyhow. Any contributions to trying to fix the failing tests are much appreciated. My time budget is spent after I got this Jira Server CI thing to work again, so regrettably I cannot be of much help there.

@dsm1212
Copy link

dsm1212 commented Aug 27, 2020

Ah, sorry I submitted issue #962 about this. Is there any way to get this moving again?

@Addono
Copy link
Collaborator

Addono commented Aug 27, 2020

Ah, sorry I submitted issue #962 about this. Is there any way to get this moving again?

Yes, but it requires finding someone who's willing to spend time on getting all tests to pass again for Jira Server instances, from then on it will be possible to merge fixes (assuming they are not Jira Cloud specific).

For Jira Cloud specific fixes it would require finding a way on how to run the CI against a Jira Cloud instance. Which has been troublesome to setup, as because of this concern of exposing the credentials to the Jira Cloud instance used to run the CI against.

@SpoonOfDoom
Copy link

For Jira Cloud specific fixes it would require finding a way on how to run the CI against a Jira Cloud instance. Which has been troublesome to setup, as because of this concern of exposing the credentials to the Jira Cloud instance used to run the CI against.

Are there any ideas on how to do this? If I understand the discussion above correctly, we're talking about limitations of Travis that cannot be circumvented without having the credentials in the source?

Also, what is the worst case scenario if credentials to the test instance get leaked? As far as I'm aware, the cloud instance usage limits are hard, and there are no costs automatically incurred for traffic, used storage space, etc., is this correct?
I just want to understand the risks we are talking about, because on the surface the danger looks to be pretty mild, to be perfectly honest - especially if, as suggested, the credentials are stored in an encrypted manner to prevent scraping by bots.
I know it's not exactly best practice, but if that's what makes or breaks this project, it might be a valid pragmatic approach.

@Addono
Copy link
Collaborator

Addono commented Oct 27, 2020

Are there any ideas on how to do this? If I understand the discussion above correctly, we're talking about limitations of Travis that cannot be circumvented without having the credentials in the source?

Just some alternatives to including the credentials in the source which come to the top of my head right now:

  • Use Robotic Process Automation (RPA) to automate the setup of free Jira Cloud instances as part of the CI 🤷 . (Don't think Atlassian will be too keen on that 😉 , but probably it works albeit the result will be quite fragile.)
  • Automate getting temporary access which has sufficient rights for the test-user and still doesn't allow these test-user credential to harm the Jira instance or break the tests of other users.
  • Let people supply their own Jira server instance for every PR. [I think this has been suggested, which received the objection that you cannot trust the self-provisioned Jira instance. However, if this is limited to an Atlassian owned subdomain, then I guess we should be good.]
  • Build a mechanism for external PRs which only runs the CI when the code is approved by a maintainer of the repo. This maintainer then should first do a code review to check if there's anything funny going on before the CI runs. Doesn't give complete safety, as I'm sure a malicious user can come up with an innocently looking PR which leaks the credentials, however the risk might be acceptable.

I agree that the potential damage to include the credentials of a free Jira instance in the public CI pipeline is small, however this is @ssbarnea's call to make. If you feel like working on creating a workaround, then IMO I would spend my time on one of my latter two suggestions.

Also, what is the worst case scenario if credentials to the test instance get leaked? As far as I'm aware, the cloud instance usage limits are hard, and there are no costs automatically incurred for traffic, used storage space, etc., is this correct?

Another potential issue - don't think it's too major but just for the sake of showing that there are other potential issues than cost - of exposing the credentials one could argue for is that someone stealing the credentials can make changes such that the test suite breaks. The tests have some assumptions (e.g. certain users are present) about the environment which aren't part of a clean installation, so violating these assumptions (removing/renaming these users) breaks the tests for everyone.

However, the status quo of not having access to the test environment already breaks the test for pretty much everyone...

@adehad adehad mentioned this pull request Jul 27, 2021
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Ready All information is available, and a solution needs to be created.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants