Test against Scylla alternator #27

Open
dimaqq opened this issue Apr 1, 2020 · 27 comments

@dimaqq
Contributor

dimaqq commented Apr 1, 2020

scylladb/scylladb#5796 (comment)

ScyllaDB is a fast reimplementation of Cassandra, and they have a DynamoDB compatibility layer called Alternator.

I've run some basic tests against their Docker image. It would be awesome to run a performance test now that aiodynamo is so much faster :)

@ojii
Contributor

ojii commented Apr 1, 2020

@dimaqq
Contributor Author

dimaqq commented Aug 24, 2021

I came. I saw. I failed. scylladb/scylladb#9240

@dimaqq
Contributor Author

dimaqq commented Aug 29, 2021

It's possible to run ScyllaDB Alternator in a container, if one uses the standard (non-nightly) build.
However, running Scylla in a container is painful on macOS, because roughly:

  • Scylla is a multimaster(?) database
  • each node must have a unique id
  • Scylla uses the ip address for that id
  • thus, Scylla refuses to start "bound" to e.g. [::]
  • so Scylla tries to determine the ip address it runs on at startup
  • (this is all fine and dandy in production, but it hurts us developers big time if we just want to test Scylla)
  • somehow, because of that, Scylla fails to start on the default docker network
  • meanwhile, it seems (?) a port can only be exported to host from the default network
  • moreover, Docker doesn't seem (?) to allow attaching two networks, default and custom, to the same container
  • as a result, it's just too hard to run Scylla in a container and dynamo client on the host

What can be done?

  1. get a developer or trial account with hosted ScyllaDB and Alternator, and thus test against a prod / fast db
  2. run the Scylla container on a custom network and run benchmarks from another container in the same custom network

@kittyandrew

kittyandrew commented Oct 28, 2021

I've run some basic tests against their Docker image. It would be awesome to run a performance test now that aiodynamo is so much faster :)

@dimaqq Have you run tests with aiodynamo? Can you elaborate on the versions (of both) you've been using?

I discovered aiodynamo recently and have enjoyed moving code over from slow boto3. It has been great so far, but now that I've tried to test ScyllaDB through its DynamoDB API, my aiodynamo code just doesn't work (while my boto3 code works fine).

Example of the issue I have:
boto3 script from scylladb docs (works fine):

import boto3
dynamodb = boto3.resource('dynamodb',endpoint_url='http://localhost:8000',
                  region_name='None', aws_access_key_id='None', aws_secret_access_key='None')

dynamodb.batch_write_item(RequestItems={
    'usertable': [{'PutRequest': {
        'Item': { 'key': 'test', 'x' : {'hello': 'world'} }
    }}]
})

And now aiodynamo code replicating example above (doesn't work):

import asyncio
from aiohttp import ClientSession

from aiodynamo.client import Client, URL
from aiodynamo.credentials import Credentials
from aiodynamo.http.aiohttp import AIOHTTP
from aiodynamo.expressions import HashKey


async def main():
    async with ClientSession() as session:
        client = Client(AIOHTTP(session), Credentials.auto(), region="None", endpoint=URL("http://localhost:8000"))

        await client.put_item("usertable", item={"key": "test", "x": {"hello": "world"}})

asyncio.run(main())

It hangs for a while and gives this output:

Traceback (most recent call last):
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/client.py", line 871, in send_request
    async for _ in self.throttle_config.attempts():
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/models.py", line 238, in attempts
    raise Throttled()
aiodynamo.errors.Throttled

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/anjfssa/aiodynamo_write.py", line 16, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/tmp/anjfssa/aiodynamo_write.py", line 14, in main
    await client.put_item("usertable", item={"key": "test", "x": {"hello": "world"}})
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/client.py", line 598, in put_item
    resp = await self.send_request(action="PutItem", payload=payload)
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/client.py", line 906, in send_request
    raise failed
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/client.py", line 886, in send_request
    return await self.http.post(
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/http/aiohttp.py", line 54, in post
    return cast(
  File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/tmp/anjfssa/.venv/lib/python3.9/site-packages/aiodynamo/http/aiohttp.py", line 22, in wrap_errors
    raise RequestFailed()
aiodynamo.http.base.RequestFailed

My system is Ubuntu 21.04, I ran ScyllaDB via docker-compose, and I've tried scylladb/scylla-nightly:latest, scylladb/scylla:latest, scylladb/scylla:4.4.4 and scylladb/scylla:4.3.6. All of them work with boto3, but none work with aiodynamo.
I couldn't see anything meaningful in the ScyllaDB logs during the request.

I just installed fresh versions of boto3 and aiodynamo[aiohttp] from PyPI for this repro.

Any ideas?

@dimaqq
Contributor Author

dimaqq commented Oct 28, 2021

Credentials.auto tries to pick up your credentials from the environment and, failing that, from the magical AWS URL, etc.
You probably don't want that.
The boto sample has aws_access_key_id='None', aws_secret_access_key='None'.
Note that these are strings, not Python None.
I would not be surprised if you have to replicate these settings for aiodynamo.

I think Throttled is a red herring: see #102, the actual error is not shown 🙈
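One way to mirror the boto3 settings without touching the client code is to put the same dummy strings into the environment before Credentials.auto() runs. This is a minimal standard-library sketch: the helper name is made up for illustration, and the variable names are the ones mentioned in this thread. It only covers the credential lookup, not any other failure.

```python
import os
from typing import Tuple


def use_dummy_aws_credentials(key_id: str = "None", secret: str = "None") -> Tuple[str, str]:
    """Set placeholder AWS credentials unless real ones are already present.

    Hypothetical helper: the defaults are the *strings* "None", matching the
    boto3 sample in this thread, not the Python None object.
    """
    os.environ.setdefault("AWS_ACCESS_KEY_ID", key_id)
    os.environ.setdefault("AWS_SECRET_ACCESS_KEY", secret)
    return os.environ["AWS_ACCESS_KEY_ID"], os.environ["AWS_SECRET_ACCESS_KEY"]


if __name__ == "__main__":
    # Call this before constructing the client so the environment lookup succeeds.
    print(use_dummy_aws_credentials())
```

Because setdefault is used, credentials that are already exported are left alone.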

@kittyandrew

Hm, I'm not sure about that. On the one hand, you are right: in the repro case I don't have any environment configured. But in my actual application I have dummy AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY values loaded into the environment, and it still didn't work.

I'll add it to the repro case to make sure it's not an issue.

@ojii
Contributor

ojii commented Oct 28, 2021

scylla returns the wrong (or at least a different) mimetype in JSON responses, and the aiohttp adaptor fails due to that. Either use the httpx adaptor or change the aiohttp one to ignore mimetypes.

With scylladb/scylla-nightly:latest, tests pass except for two:

  • tests/integration/test_client.py::test_exists fails because scylla doesn't support throughput and doesn't return any throughput information in describe_table.
  • tests/integration/test_client.py::test_update_item this one looks like an actual bug in scylladb. adding a new item to a string set fails.

@kittyandrew

... change the aiohttp one to ignore mimetypes.

When you say that, do you mean something like https://github.com/HENNGE/aiodynamo/blob/master/src/aiodynamo/http/aiohttp.py#L57, or something else? I'm trying to understand how easy that is to tweak, and whether it's worth it for me.

@ojii
Contributor

ojii commented Oct 28, 2021

When you say that, do you mean something like https://github.com/HENNGE/aiodynamo/blob/master/src/aiodynamo/http/aiohttp.py#L57, or something else? I'm trying to understand how easy that is to tweak, and whether it's worth it for me.

change that to content_type=None.
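To illustrate what that switch does: with a fixed content_type, the JSON decoding step refuses responses whose media type differs, and with None the check is skipped. The following is a standard-library sketch of that behaviour, not aiohttp's actual code; decode_json_response and its parameters are hypothetical names.

```python
import json
from typing import Any, Optional

# The media type DynamoDB uses for its JSON responses, per this thread.
DYNAMODB_JSON = "application/x-amz-json-1.0"


def decode_json_response(
    body: bytes,
    content_type_header: str,
    expected: Optional[str] = DYNAMODB_JSON,
) -> Any:
    """Decode a JSON body, optionally enforcing the response media type.

    Passing expected=None skips the media type check entirely.
    """
    if expected is not None:
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = content_type_header.split(";", 1)[0].strip().lower()
        if media_type != expected:
            raise ValueError(f"unexpected content type: {media_type!r}")
    return json.loads(body.decode("utf-8"))


if __name__ == "__main__":
    # A plain application/json reply is accepted only when the check is disabled.
    print(decode_json_response(b'{"ok": true}', "application/json", expected=None))
```

The strict default catches servers that answer with something other than a DynamoDB-style reply; the lenient mode trades that safety for compatibility.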

@kittyandrew

Eh, that still didn't make it work for me.
Running scylladb/scylla-nightly:latest with --alternator-port=8000 --alternator-write-isolation=always --smp=1.

@ojii
Contributor

ojii commented Oct 28, 2021

I run it with docker run --name scylla -p 8087:8000 scylladb/scylla-nightly:latest --alternator-port=8000 --alternator-write-isolation=always (I use port 8087 for my test db) and then just ran the test suite against it (after changing the aiohttp adaptor)

@kittyandrew

kittyandrew commented Oct 28, 2021

Ohh, whoops. I think it was my fault. At some point while tweaking the repro code I changed something in the URL, and the throttling error didn't help.

Can confirm httpx and the tweaked aiohttp adapter both work for me.
Thank you again, by the way. I really didn't expect to resolve this so quickly :)

I guess I will stick with httpx though, if the aiohttp fix won't be in the lib, since I really have no desire to maintain my own fork for this.

@kittyandrew

...or not. After plugging httpx into my application, I've seen the request time go from 0.03s with the aiohttp adapter to 0.7s with httpx.

All I changed was from

from aiodynamo.http.aiohttp import AIOHTTP  # @nocheckin: fork required for this to work.
from aiohttp import ClientSession

self.aioclient = ClientSession()
self.aiodynamo = Client(AIOHTTP(self.aioclient), Credentials.auto(), self.region, endpoint=URL(self.db_url))

to

from aiodynamo.http.httpx import HTTPX
from httpx import AsyncClient

self.aioclient = AsyncClient()
self.aiodynamo = Client(HTTPX(self.aioclient), Credentials.auto(), self.region, endpoint=URL(self.db_url))

And these numbers (0.03s and 0.7-1s) are similar both for scylladb and dynamodb-local for me, so I guess that's another issue. I just haven't used httpx before. Am I doing something terribly wrong here?

@dimaqq
Contributor Author

dimaqq commented Oct 29, 2021

I can't see anything obviously wrong here.

~1s response times are pretty bad 😱

Off the top of my head, I'd consider two aspects:

  • httpx may have different connection pool defaults than aiohttp (though defaults seem sane in the docs 🤔 )
  • httpx can also do HTTP/2, which aiohttp cannot. However, HTTP/2 is hard/rare without TLS, and if you are running the server in Docker, I don't think you have certs, so probably no HTTP/2 either.

psarna pushed a commit to scylladb/scylladb that referenced this issue Oct 31, 2021
Although the DynamoDB API responses are JSON, additional conventions apply
to these responses - such as how error codes are encoded in JSON. For this
reason, DynamoDB uses the content type `application/x-amz-json-1.0` instead
of the standard `application/json` in its responses.

Until this patch, Scylla used `application/json` in its responses. This
unexpected content-type didn't bother any of the AWS libraries which we
tested, but it does bother the aiodynamo library (see HENNGE/aiodynamo#27).

Moreover, we should return the x-amz-json-1.0 content type for future
proofing: It turns out that AWS already defined x-amz-json-1.1 - see:
https://awslabs.github.io/smithy/1.0/spec/aws/aws-json-1_1-protocol.html
The 1.1 content type differs (only) in how it encodes error replies.
If one day DynamoDB starts to use this new reply format (it doesn't yet)
and if DynamoDB libraries will need to differentiate between the two
reply formats, Alternator better return the right one.

This patch also includes a new test that the Content-Type header is
returned with the expected value. The test passes on DynamoDB, and
after this patch it starts to pass on Alternator as well.

Fixes #9554.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211031094621.1193387-1-nyh@scylladb.com>
@nyh

nyh commented Oct 31, 2021

I can't see anything obviously wrong here.

~1s response times are pretty bad 😱

Off the top of my head, I'd consider two aspects:

* `httpx` may have different connection pool defaults than `aiohttp` (though defaults seem sane in the docs 🤔 )

* `httpx` can also do HTTP/2 which `aiohttp` cannot, however, HTTP/2 is hard/rare without TLS, and if you are running server in Docker, I don't think you have certs, so probably no HTTP/2 either.

Alternator does not support HTTP 2 (neither does DynamoDB or DynamoDB local, by the way), so I doubt that's related.

This is a wild guess, but unexplained fraction-of-a-second delays could be a bad interaction between Nagle's algorithm and delayed ACK:

  1. For some reason, the client sends the request in two write() system calls (e.g., perhaps it sends the headers and the body in two separate system calls). The first system call immediately generates a packet and it is sent. The second packet is then delayed by Nagle's algorithm until the first packet is acknowledged (the TCP stack wrongly hopes that by that time the client will have sent even more data, and it can all be combined into a single packet).
  2. However, the server's TCP stack has the "delayed ACK" feature: it does not ack the first packet until some time has passed, hoping it can combine multiple acks or even piggyback an ack on the response. Because the server doesn't send an ack, the client can't send the second packet, i.e., the end of the request.

You can verify in Wireshark whether the timing makes sense for this explanation. If it's this problem, you can try setting the TCP_NODELAY option on the client's socket to disable Nagle's algorithm. Even more efficient is to use TCP_CORK to explicitly tell the kernel to send only one packet after several write system calls. I don't know whether httpx can do any of that, or maybe it already does and this is not the problem.
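For anyone wanting to try the TCP_NODELAY suggestion, here is a minimal sketch at the plain socket level using Python's standard library. The helper names are invented for illustration, and how (or whether) a given HTTP client exposes its underlying socket is not something this thread establishes.

```python
import socket


def disable_nagle(sock: socket.socket) -> None:
    """Ask the kernel to send small writes immediately on this TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    # The option can be set before connecting; no server is needed to check it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        disable_nagle(sock)
        print("TCP_NODELAY set:", nagle_disabled(sock))
```

If the delay disappears with the option set, that supports the Nagle/delayed-ACK explanation; if not, the cause is elsewhere.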

@dimaqq
Contributor Author

dimaqq commented Nov 1, 2021

Actually... I have used aiodynamo+httpx in the past, and the performance was fine.
The dev use was against dynalite and prod use was against aws dynamo.
I'll re-test the combo with newest library versions.

@kittyandrew

kittyandrew commented Nov 2, 2021

Okay, I think the slowness is my fault (my profiler's).

I was using cProfile for my code, and when I switched from sync boto3 to async aiodynamo, I didn't think much about asyncio causing issues with the profiler. So basically something inside the profiler is slowing down everything async by an order of magnitude, and even more so for httpx.

(I haven't re-tested with httpx though, since I have no need for it now that the aiohttp issue is fixed in Scylla.)
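A way to cross-check request latency without a profiler attached is to time the awaited call directly with a wall clock. The sketch below uses only the standard library; timed and fake_request are hypothetical names, and fake_request merely stands in for a real call such as a put_item request.

```python
import asyncio
import time
from typing import Awaitable, Callable, List


async def timed(make_call: Callable[[], Awaitable[object]], repeats: int = 5) -> List[float]:
    """Await make_call() several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        await make_call()
        durations.append(time.perf_counter() - start)
    return durations


async def fake_request() -> None:
    # Placeholder for a real database request; sleeps to simulate I/O.
    await asyncio.sleep(0.01)


if __name__ == "__main__":
    results = asyncio.run(timed(fake_request))
    print(["%.4f" % seconds for seconds in results])
```

Comparing these numbers with and without the profiler enabled shows how much of the measured time is profiler overhead.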

nyh added a commit to scylladb/scylladb that referenced this issue Dec 29, 2021
Although the DynamoDB API responses are JSON, additional conventions apply
to these responses - such as how error codes are encoded in JSON. For this
reason, DynamoDB uses the content type `application/x-amz-json-1.0` instead
of the standard `application/json` in its responses.

Until this patch, Scylla used `application/json` in its responses. This
unexpected content-type didn't bother any of the AWS libraries which we
tested, but it does bother the aiodynamo library (see HENNGE/aiodynamo#27).

Moreover, we should return the x-amz-json-1.0 content type for future
proofing: It turns out that AWS already defined x-amz-json-1.1 - see:
https://awslabs.github.io/smithy/1.0/spec/aws/aws-json-1_1-protocol.html
The 1.1 content type differs (only) in how it encodes error replies.
If one day DynamoDB starts to use this new reply format (it doesn't yet)
and if DynamoDB libraries will need to differentiate between the two
reply formats, Alternator better return the right one.

This patch also includes a new test that the Content-Type header is
returned with the expected value. The test passes on DynamoDB, and
after this patch it starts to pass on Alternator as well.

Fixes #9554.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211031094621.1193387-1-nyh@scylladb.com>
(cherry picked from commit 6ae0ea0)
@dimaqq
Contributor Author

dimaqq commented Feb 2, 2022

I gave it a go, again... but I'm a bit stuck:

  • dynamodb-admin can create table, put item, scan, get item
  • aiodynamo tries to create an item and reports "request failed", but the item is actually created 😖

@dimaqq
Contributor Author

dimaqq commented Feb 2, 2022

DEBUG:aiodynamo:sending request Request(url=URL('http://localhost:8000'), body=b'{"TableName":"test","Item":{"test":{"S":"test"},"quux":{"S":"sample-0"},"field-0":{"S":"value-0"},"field-1":{"S":"value-1"},"field-2":{"S":"value-2"},"field-3":{"S":"value-3"},"field-4":{"S":"value-4"},"field-5":{"S":"value-5"},"field-6":{"S":"value-6"},"field-7":{"S":"value-7"},"field-8":{"S":"value-8"},"field-9":{"S":"value-9"},"field-10":{"S":"value-10"},"field-11":{"S":"value-11"},"field-12":{"S":"value-12"},"field-13":{"S":"value-13"},"field-14":{"S":"value-14"},"field-15":{"S":"value-15"},"field-16":{"S":"value-16"},"field-17":{"S":"value-17"},"field-18":{"S":"value-18"},"field-19":{"S":"value-19"},"field-20":{"S":"value-20"},"field-21":{"S":"value-21"},"field-22":{"S":"value-22"},"field-23":{"S":"value-23"},"field-24":{"S":"value-24"},"field-25":{"S":"value-25"},"field-26":{"S":"value-26"},"field-27":{"S":"value-27"},"field-28":{"S":"value-28"},"field-29":{"S":"value-29"},"field-30":{"S":"value-30"},"field-31":{"S":"value-31"},"field-32":{"S":"value-32"},"field-33":{"S":"value-33"},"field-34":{"S":"value-34"},"field-35":{"S":"value-35"},"field-36":{"S":"value-36"},"field-37":{"S":"value-37"},"field-38":{"S":"value-38"},"field-39":{"S":"value-39"},"field-40":{"S":"value-40"},"field-41":{"S":"value-41"},"field-42":{"S":"value-42"},"field-43":{"S":"value-43"},"field-44":{"S":"value-44"},"field-45":{"S":"value-45"},"field-46":{"S":"value-46"},"field-47":{"S":"value-47"},"field-48":{"S":"value-48"},"field-49":{"S":"value-49"},"field-50":{"S":"value-50"},"field-51":{"S":"value-51"},"field-52":{"S":"value-52"},"field-53":{"S":"value-53"},"field-54":{"S":"value-54"},"field-55":{"S":"value-55"},"field-56":{"S":"value-56"},"field-57":{"S":"value-57"},"field-58":{"S":"value-58"},"field-59":{"S":"value-59"},"field-60":{"S":"value-60"},"field-61":{"S":"value-61"},"field-62":{"S":"value-62"},"field-63":{"S":"value-63"},"field-64":{"S":"value-64"},"field-65":{"S":"value-65"},"field-66":{"S":"va
lue-66"},"field-67":{"S":"value-67"},"field-68":{"S":"value-68"},"field-69":{"S":"value-69"},"field-70":{"S":"value-70"},"field-71":{"S":"value-71"},"field-72":{"S":"value-72"},"field-73":{"S":"value-73"},"field-74":{"S":"value-74"},"field-75":{"S":"value-75"},"field-76":{"S":"value-76"},"field-77":{"S":"value-77"},"field-78":{"S":"value-78"},"field-79":{"S":"value-79"},"field-80":{"S":"value-80"},"field-81":{"S":"value-81"},"field-82":{"S":"value-82"},"field-83":{"S":"value-83"},"field-84":{"S":"value-84"},"field-85":{"S":"value-85"},"field-86":{"S":"value-86"},"field-87":{"S":"value-87"},"field-88":{"S":"value-88"},"field-89":{"S":"value-89"},"field-90":{"S":"value-90"},"field-91":{"S":"value-91"},"field-92":{"S":"value-92"},"field-93":{"S":"value-93"},"field-94":{"S":"value-94"},"field-95":{"S":"value-95"},"field-96":{"S":"value-96"},"field-97":{"S":"value-97"},"field-98":{"S":"value-98"},"field-99":{"S":"value-99"}},"ReturnValues":"NONE"}')

and then

DEBUG:aiodynamo:request failed

@nyh

nyh commented Feb 2, 2022

@dimaqq what does "request failed" mean? Was there an HTTP error? What was the content of the HTTP reply?

@dimaqq
Contributor Author

dimaqq commented Feb 2, 2022

Right, here's a quick fix to get benchmarks running:

                     await response.json(
-                        content_type="application/x-amz-json-1.0", encoding="utf-8"
+                        content_type=None, encoding="utf-8"
                     ),

Looks like ScyllaDB Alternator returns a different MIME type than Amazon DynamoDB, cc @nyh

@nyh

nyh commented Feb 2, 2022

@dimaqq I thought I already fixed the mime type (scylladb/scylladb#9554) - which version of Scylla are you using? Can you please verify with "docker pull" that you are using a recent version, not some version that was called "latest" a year ago?

@dimaqq
Contributor Author

dimaqq commented Feb 2, 2022

Scylla version 4.5.3-0.20211223.c8f14886d with build-id 9a5b504c51cbe8feb1217517d6977f7793b2971e starting ...

It's what gets pulled today: docker pull scylladb/scylla:latest though docker images reports that it's 5 weeks old.

@dimaqq
Contributor Author

dimaqq commented Feb 2, 2022

Query performance (against a single node, running in Docker, backed by a bind-mounted host volume, Linux laptop SSD)

row/s: 9066.190242153862
MB/s: 20.90234811938829

So, officially faster than DynamoDB (global tables, max provisioning) which topped out at ~5K rows/s IIRC.
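As a quick sanity check on those two figures, the implied average row size and the ratio to the recalled DynamoDB number can be worked out directly. This sketch assumes "MB" means 10**6 bytes (the thread does not say) and takes the "~5K rows/s" recollection at face value.

```python
# Figures quoted in this thread.
rows_per_s = 9066.190242153862
mb_per_s = 20.90234811938829
dynamodb_rows_per_s = 5000  # "~5K rows/s IIRC", an approximate recollection

# Assumption: 1 MB = 10**6 bytes; with MiB the row size would come out larger.
bytes_per_row = mb_per_s * 1_000_000 / rows_per_s
speedup = rows_per_s / dynamodb_rows_per_s

print(f"average row size: {bytes_per_row:.0f} bytes")
print(f"ratio to the recalled DynamoDB rate: {speedup:.2f}x")
```

That works out to roughly 2.3 KB per row and a bit under twice the recalled DynamoDB rate.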

@nyh

nyh commented Feb 2, 2022

My mime-type fix reached 4.5 only 4 weeks ago (scylladb/scylladb@5d7064e) so this version is not recent enough for this fix.
Can you please try the nightly version? docker pull scylladb/scylla-nightly:latest

@dimaqq
Contributor Author

dimaqq commented Feb 3, 2022

Nightly cannot start up at the moment, with the same arguments as normal build:

Scylla version 5.0.dev-0.20220201.00a9326ae with build-id 13ed134794204a18277ef3aacd075fb9070c81b7 starting ...
command used: "/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --developer-mode=1 --smp 1 --overprovisioned --listen-address 172.27.0.2 --rpc-address 172.27.0.2 --seed-provider-parameters seeds=172.27.0.2 --alternator-address 172.27.0.2 --alternator-port 8000 --alternator-write-isolation always --blocked-reactor-notify-ms 999999999 --skip-wait-for-gossip-to-settle 0"
parsed command line options: [log-to-syslog, (positional) 1, log-to-stdout, (positional) 0, default-log-level, (positional) info, network-stack, (positional) posix, developer-mode: 1, smp, (positional) 1, overprovisioned, listen-address: 172.27.0.2, rpc-address: 172.27.0.2, seed-provider-parameters: seeds=172.27.0.2, alternator-address: 172.27.0.2, alternator-port: 8000, alternator-write-isolation: always, blocked-reactor-notify-ms, (positional) 999999999, skip-wait-for-gossip-to-settle: 0]
2022-02-03 00:04:01,874 INFO exited: scylla-server (exit status 1; not expected)
Traceback (most recent call last):
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 196, in <module>
    args.func(args)
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 122, in check_version
    current_version = sanitize_version(get_api('/storage_service/scylla_release_version'))
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 80, in get_api
    return get_json_from_url("http://" + api_address + path)
  File "/opt/scylladb/scripts/libexec/scylla-housekeeping", line 75, in get_json_from_url
    raise RuntimeError(f'Failed to get "{path}" due to the following error: {retval}')
RuntimeError: Failed to get "http://localhost:10000/storage_service/scylla_release_version" due to the following error: <urlopen error [Errno 99] Cannot assign requested address>
2022-02-03 00:04:05,736 INFO spawned: 'scylla-server' with pid 117
Scylla version 5.0.dev-0.20220201.00a9326ae with build-id 13ed134794204a18277ef3aacd075fb9070c81b7 starting ...
command used: "/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --developer-mode=1 --smp 1 --overprovisioned --listen-address 172.27.0.2 --rpc-address 172.27.0.2 --seed-provider-parameters seeds=172.27.0.2 --alternator-address 172.27.0.2 --alternator-port 8000 --alternator-write-isolation always --blocked-reactor-notify-ms 999999999 --skip-wait-for-gossip-to-settle 0"
parsed command line options: [log-to-syslog, (positional) 1, log-to-stdout, (positional) 0, default-log-level, (positional) info, network-stack, (positional) posix, developer-mode: 1, smp, (positional) 1, overprovisioned, listen-address: 172.27.0.2, rpc-address: 172.27.0.2, seed-provider-parameters: seeds=172.27.0.2, alternator-address: 172.27.0.2, alternator-port: 8000, alternator-write-isolation: always, blocked-reactor-notify-ms, (positional) 999999999, skip-wait-for-gossip-to-settle: 0]
2022-02-03 00:04:06,143 INFO exited: scylla-server (exit status 1; not expected)
2022-02-03 00:04:07,144 INFO gave up: scylla-server entered FATAL state, too many start retries too quickly

I think I've seen this before. IIRC it's due to Scylla refusing to listen on [::] or 0.0.0.0, because then the node doesn't know its own name, which would be bad in a cluster. Too bad that it hurts devex plenty 😢

@nyh

nyh commented Feb 3, 2022

I can't reproduce the above failure. I got a slightly newer nightly, but it worked:

$ docker run --name scylla -d -p 8000:8000 scylladb/scylla-nightly:latest --alternator-port=8000 --alternator-write-isolation=always
$ docker logs scylla |& less
...
Scylla version 5.0.dev-0.20220203.d309a8670 with build-id d3f2c9395a10f04bb599979dc3cb18f2b0ac4648 starting ...
...
$ curl http://localhost:8000/
healthy: localhost:8000

@syuu1228 does this scylla-housekeeping error seem familiar? Could it explain why scylla-server is not coming up?

I don't think Scylla is listening on 0.0.0.0 - why/where would it do that? A different problem might be that Scylla insists on listening on port 10000 (the REST API) on 127.0.0.1 by default, not on the address you give it. Maybe that's a problem in your Docker setup somehow (it works on mine...). You can try to override the REST API address with the "--api-address" option and see if it changes anything.
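The curl health check from the transcript can also be scripted, which is handy when a test run has to wait for the container to come up. A minimal standard-library sketch follows; the function names, retry counts and the injectable fetch parameter are illustrative assumptions, and the only thing taken from this thread is that the endpoint answers with a body starting with "healthy".

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_text(url: str, timeout: float = 2.0) -> str:
    """GET the URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def wait_until_healthy(
    url: str = "http://localhost:8000/",
    attempts: int = 30,
    delay: float = 1.0,
    fetch: Callable[[str], str] = fetch_text,
) -> bool:
    """Poll the endpoint until its body starts with "healthy" or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url).startswith("healthy"):
                return True
        except (urllib.error.URLError, OSError):
            pass  # The server is not accepting connections yet.
        if attempt + 1 < attempts:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    print("Alternator healthy:", wait_until_healthy(attempts=3))
```

The fetch parameter exists so the polling logic can be exercised without a running server.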
