Support for non-Basic Auth in scrapyd-deploy #15

Open
jwebb-va opened this issue Dec 16, 2015 · 3 comments

Comments

@jwebb-va

My team is working on a set of scrapy spiders which we want to deploy to a scrapyd server. Our scrapyd server is configured to use an oauth2 proxy to authenticate traffic.

Every request to our Scrapyd API must carry the following header to authenticate:

Authorization: Bearer 1/AbC123

where 1/AbC123 is an OAuth2 access token.

Currently the scrapyd-deploy utility only supports using Basic auth.

@jwebb-va
Author

I foresee a few ways to solve this issue...

scrapyd-deploy could support some kind of oauth2_bearer_token key in the scrapy.cfg file. This could add the HTTP Authorization header for us automatically.

[deploy]
url = https://scrapyd.example.com/
project = example
oauth2_bearer_token = 1/AbC123
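
As a rough sketch of how the first option might work, the snippet below reads a `[deploy]` section like the one above and attaches the token as a Bearer header. Note that `oauth2_bearer_token` is only the key proposed in this comment, not an existing scrapyd-deploy option, and `load_target` and `add_bearer_header` are hypothetical helper names. The request is built but never sent.

```python
import configparser
from urllib.request import Request

# Same layout as the proposed scrapy.cfg section; the
# oauth2_bearer_token key is hypothetical (proposed in this issue).
CFG_TEXT = """
[deploy]
url = https://scrapyd.example.com/
project = example
oauth2_bearer_token = 1/AbC123
"""


def load_target(cfg_text, section="deploy"):
    """Parse the config text and return one section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return dict(parser[section])


def add_bearer_header(request, target):
    """Attach 'Authorization: Bearer <token>' if the target defines a token."""
    token = target.get("oauth2_bearer_token")
    if token:
        request.add_header("Authorization", "Bearer " + token)


target = load_target(CFG_TEXT)
# listprojects.json is one of Scrapyd's JSON API endpoints.
request = Request(target["url"] + "listprojects.json")
add_bearer_header(request, target)
print(request.get_header("Authorization"))
```

A target without the key is left untouched, so existing configurations would keep working under this approach.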

A more generic solution might be to simply allow users to specify additional HTTP headers manually via command-line arguments. This would allow us to use OAuth2 authentication but would also allow other use cases which require custom headers.

scrapyd-deploy -h "Authorization: Bearer 1/AbC123", "Another-Header: foo"
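
A minimal sketch of the second option, assuming a repeatable header flag parsed with `argparse`. The `-H`/`--header` spelling and the `deploy-sketch` program name are assumptions made for illustration (argparse reserves `-h` for help by default), and none of this is an existing scrapyd-deploy option.

```python
import argparse


def parse_header(raw):
    """Split 'Name: value' into a (name, value) tuple, rejecting malformed input."""
    name, sep, value = raw.partition(":")
    if not sep or not name.strip():
        raise argparse.ArgumentTypeError("expected 'Name: value', got %r" % raw)
    return name.strip(), value.strip()


def build_parser():
    # Hypothetical CLI: one -H/--header flag per extra HTTP header.
    parser = argparse.ArgumentParser(prog="deploy-sketch")
    parser.add_argument(
        "-H", "--header",
        action="append",
        default=[],
        type=parse_header,
        metavar="HEADER",
        help="extra HTTP header in 'Name: value' form (repeatable)",
    )
    return parser


args = build_parser().parse_args(
    ["-H", "Authorization: Bearer 1/AbC123", "-H", "Another-Header: foo"]
)
headers = dict(args.header)
print(headers)
```

The resulting dictionary could then be applied to each outgoing request with `request.add_header(name, value)`.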

@madzohan

madzohan commented Jun 3, 2016

> Currently the scrapyd-deploy utility only supports using Basic auth.

Can you explain how it can be done?

@Digenis
Member

Digenis commented Jun 3, 2016

@VlaGrishenko,

    def _add_auth_header(request, target):
        if 'username' in target:
            u, p = target.get('username'), target.get('password', '')
            request.add_header('Authorization', basic_auth_header(u, p))

You need to add a username and password to your target in scrapy.cfg.
Scrapyd itself doesn't support authorization,
but you can configure it to listen only on 127.0.0.1
and then use Apache to proxy connections from elsewhere
while requiring them to provide credentials.

However, you would still be trusting users on the same computer, who can connect to 127.0.0.1 without auth.
What makes sense to me is enabling scrapyd to listen on a Unix domain socket (UDS)
to which only the Apache user has access, and then proxying it as described above.
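
For reference, here is a small sketch of the header value that Basic auth sends for a given username and password, which is what the authenticating proxy would check. It uses only the standard library as a stand-in for the `basic_auth_header` helper in the snippet above. The name `basic_auth_value` and the `user`/`pass` credentials are made up for illustration.

```python
import base64


def basic_auth_value(username, password=""):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Hypothetical target, as it might look after reading username and
# password from scrapy.cfg.
target = {"username": "user", "password": "pass"}
value = basic_auth_value(target["username"], target.get("password", ""))
print(value)  # Basic dXNlcjpwYXNz
```

Because the credentials are only base64-encoded, not encrypted, the proxy should be reached over HTTPS.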
