ValueError('Invalid related remote id: ***') #3357

Open
prolibre opened this issue Apr 12, 2024 · 11 comments · May be fixed by #3471
Assignees
Labels
bug Something isn't working

Comments

@prolibre

In Bookwyrm 0.7.3

I often get this error (seen via Flower) when book changes are propagated between instances.
Looking at the task arguments, I think the ids are being confused. For example:

('Edition', 'Work', 'parent_work', 'https://bw.heraut.eu/book/52087', 'https://bouquins.zbeul.fr/book/42398')

52087 corresponds to an edition (on an instance) whereas 42398 corresponds to a book on the other instance.

It's a shame because changes are not propagated.

Traceback (most recent call last):
  File "/opt/bookwyrm/venv/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
    R = retval = fun(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/opt/bookwyrm/venv/lib/python3.11/site-packages/celery/app/trace.py", line 760, in protected_call
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/bookwyrm/bookwyrm/activitypub/base_activity.py", line 277, in set_related_field
    raise ValueError(f"Invalid related remote id: {related_remote_id}")
ValueError: Invalid related remote id: https://bw.heraut.eu/book/52087

@prolibre prolibre added the bug Something isn't working label Apr 12, 2024
@hughrun
Contributor

hughrun commented Oct 27, 2024

This is pretty confusing, but I have also seen this error.

I think you have the Edition and the Work the wrong way around.

Here's the full context for this error:

instance = origin_model.find_existing_by_remote_id(related_remote_id)
if not instance:
    raise ValueError(f"Invalid related remote id: {related_remote_id}")

origin_model is Work and related_remote_id is obviously https://bouquins.zbeul.fr/book/42398. I think it should actually be https://bw.heraut.eu/book/52089.

@prolibre can you explain more (if you can remember) where the update happened? I don't quite understand what we're looking at here: is it an update from bw.heraut.eu that has been received by bouquins.zbeul.fr, or is it the other way around, or is this appearing on the server where the update was made (in which case - which one)?

I've tracked this back through the code and it looks like the remote_id in this case might actually be the origin_id (which is why it can't be found). But I can't work out why that would ever happen. Any more context would be great.

@prolibre
Author

Hello @hughrun

The most recent error in my logs (I get them quite often) concerns a book that can't be found on my instance: opening https://bw.heraut.eu/book/71527 says the book doesn't exist.

Error:
('Edition', 'Work', 'parent_work', 'https://bw.heraut.eu/book/71527', 'https://books.theunseen.city/book/323924')
Received 2024-10-26 15:19:45, failed 2024-10-26 15:19:49 (the timestamps matter, as you'll see below).

However, if I search by title (Kon-Tiki) I find a book: https://bw.heraut.eu/book/71529/s/kon-tiki (with an id not far from the one in my logs).
Book 71529 was created on my instance at 2024-10-26 15:19:46, then modified at 2024-10-26 15:19:52. Its origin_id (in my database) is https://books.theunseen.city/book/323928, which is a different id from the one in the log (323924)!

All this is probably not very clear, but I think this error comes from the book import phase on my instance.

@hughrun
Contributor

hughrun commented Oct 27, 2024

@prolibre Thank you!

Actually this is very clear and I now realise I misunderstood what was happening. You've basically confirmed it's the same bug as #3019.

71527 actually does exist in your database (https://bw.heraut.eu/book/71527.json) but as you can see it doesn't have any Editions attached, which shouldn't normally be possible. Because there is no Edition data, it displays in the web as a 404.
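
A check for this orphaned state could be sketched as below. This is a minimal illustration, not BookWyrm code: it assumes the JSON view of a Work carries a "type" of "Work" and an "editions" list of edition URLs, and the payloads shown are hypothetical, using example.org addresses.

```python
def is_orphaned_work(activity: dict) -> bool:
    """Return True when a serialized Work lists no editions.

    Assumption: the Work JSON has "type" == "Work" and an "editions"
    key holding a list of edition URLs. Anything else is not a Work
    and is therefore never reported as orphaned.
    """
    if activity.get("type") != "Work":
        return False
    # A missing key and an empty list are both treated as "no editions".
    return not activity.get("editions")


# Hypothetical payloads shaped like the .json view of a book.
orphan = {"id": "https://example.org/book/1", "type": "Work", "editions": []}
healthy = {
    "id": "https://example.org/book/2",
    "type": "Work",
    "editions": ["https://example.org/book/3"],
}
edition = {"id": "https://example.org/book/3", "type": "Edition"}

print(is_orphaned_work(orphan), is_orphaned_work(healthy), is_orphaned_work(edition))
```

Feeding the function the parsed body of a book's .json URL would flag Works that resolve as JSON but show a 404 in the web view.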

I think I may have just worked out why this happens - more soon.

@hughrun
Contributor

hughrun commented Oct 28, 2024

@prolibre sorry I only just picked up on this:

"All this is probably not very clear, but I think this error comes from the book import phase on my instance."

Do you mean user imports? If so I strongly recommend you disable them until we have merged #3431 and issued a new release.

Or do you mean book imports e.g. from GoodReads?

Or do you mean something else?

@prolibre
Author

@hughrun ah no sorry, I meant when adding a book from another instance. The term “import” is indeed confusing.

@hughrun
Contributor

hughrun commented Oct 28, 2024

This is very perplexing: I spent most of the day looking into this and I can't work out how it is possible.

@prolibre are there any other errors that seem to happen around the same time?

@prolibre
Author

@hughrun For the moment I can't see anything, but I'm going to increase the number of processes stored by Flower to keep more error logs.

@prolibre
Author

prolibre commented Oct 29, 2024

@hughrun
Well, I have the impression that these errors are linked to duplicates.
I recently had 1 success and then 1 error with modifications corresponding to the same duplicate book on my instance (71755 and 71754).

SUCCESS
('Edition', 'Work', 'parent_work', 'https://bw.heraut.eu/book/71755', 'https://books.theunseen.city/book/441207')

FAILURE
('Edition', 'Work', 'parent_work', 'https://bw.heraut.eu/book/71754', 'https://books.theunseen.city/book/441207')

Traceback (most recent call last):
  File "/opt/bookwyrm/venv/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
    R = retval = fun(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/opt/bookwyrm/venv/lib/python3.11/site-packages/celery/app/trace.py", line 760, in protected_call
    return self.run(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/bookwyrm/bookwyrm/activitypub/base_activity.py", line 280, in set_related_field
    raise ValueError(f"Invalid related remote id: {related_remote_id}")
ValueError: Invalid related remote id: https://bw.heraut.eu/book/71754
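
To spot this pattern across many logged tasks, the argument tuples can be grouped by their last element. The sketch below is only an illustration: it assumes each tuple has the five-element shape shown above, and the names local_url and remote_url are hypothetical labels for the fourth and fifth elements, not BookWyrm's own parameter names.

```python
from collections import defaultdict


def remotes_with_multiple_locals(task_args):
    """Group task tuples by their 5th element and keep the conflicts.

    Returns a dict mapping each remote URL to the sorted list of local
    URLs (4th element) it was paired with, but only when there is more
    than one distinct local URL.
    """
    grouped = defaultdict(set)
    for _model, _origin_model, _field, local_url, remote_url in task_args:
        grouped[remote_url].add(local_url)
    return {
        remote: sorted(local_urls)
        for remote, local_urls in grouped.items()
        if len(local_urls) > 1
    }


# The two task argument tuples quoted above.
tasks = [
    ("Edition", "Work", "parent_work",
     "https://bw.heraut.eu/book/71755",
     "https://books.theunseen.city/book/441207"),
    ("Edition", "Work", "parent_work",
     "https://bw.heraut.eu/book/71754",
     "https://books.theunseen.city/book/441207"),
]

conflicts = remotes_with_multiple_locals(tasks)
for remote, local_urls in conflicts.items():
    print(remote, "->", local_urls)
```

Any remote URL reported here was paired with more than one local book, which matches the duplicate seen with 71754 and 71755.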

@hughrun
Contributor

hughrun commented Oct 31, 2024

@prolibre thank you, this is a super useful example. It's definitely the same bug as #3019.

I can see that three identical Editions came in: two on https://bw.heraut.eu/book/71755 and one on https://bw.heraut.eu/book/71754.

I'm investigating this. It's a significant bug.

@hughrun
Contributor

hughrun commented Nov 1, 2024

Ok I've figured this out. It's late here: I'll post an explanation tomorrow my time.

@hughrun
Contributor

hughrun commented Nov 3, 2024

Hmm, OK, I have not figured this out after all. It really makes no sense. Here is what I know for sure from testing this a bit:

  1. These errors happen when an Activity is received by an instance. For example, a user on another instance, who is followed by someone on our instance, adds a book to their To-Read shelf.
  2. In such circumstances, the Edition will be imported by two parallel processes (the book as a reference on the Status, and the book as a ShelfBook). This is explained a little in #3019 (Duplicate book after manual addition).
  3. The set_related_field jobs always fail at find_existing_by_remote_id on the first Work, and usually all succeed on the second Work.
  4. Sometimes, but not always, duplicate Editions are created.
  5. Usually the first Work ends up orphaned with no Editions. It will 404 if you open the URL in a browser, but will resolve if you load it as JSON.
  6. The remote_id on the first Work definitely exists both before and after this failure and is sent correctly to the function.
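
Point 2 describes two parallel processes importing the same Edition. As a toy illustration only (this is not BookWyrm code, and it is not a confirmed explanation of this bug), the sketch below shows how an unguarded "look up, then create" sequence run by two workers can leave duplicate rows, and how serialising the sequence avoids it. Store, import_unsafe and import_safe are hypothetical names, and a Barrier is used purely to force the unlucky interleaving deterministically.

```python
import threading


class Store:
    """Toy in-memory table: auto-incrementing id -> remote_id."""

    def __init__(self):
        self.rows = {}
        self._next_id = 1
        self._id_lock = threading.Lock()

    def find(self, remote_id):
        # Return the id of the first row with this remote_id, or None.
        return next((pk for pk, rid in self.rows.items() if rid == remote_id), None)

    def create(self, remote_id):
        with self._id_lock:
            pk = self._next_id
            self._next_id += 1
            self.rows[pk] = remote_id
        return pk


def import_unsafe(store, remote_id, barrier):
    existing = store.find(remote_id)  # both workers look first...
    barrier.wait()                    # ...and only then may either one create
    if existing is None:
        store.create(remote_id)       # so both create a row


def import_safe(store, remote_id, lock):
    with lock:                        # look-up and create happen as one step
        if store.find(remote_id) is None:
            store.create(remote_id)


def run_two_workers(worker, extra):
    """Run the same import in two threads and return the row count."""
    store = Store()
    threads = [
        threading.Thread(target=worker, args=(store, "https://remote.example/book/1", extra))
        for _ in range(2)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(store.rows)


unsafe_count = run_two_workers(import_unsafe, threading.Barrier(2))
safe_count = run_two_workers(import_safe, threading.Lock())
print(unsafe_count, safe_count)
```

In a real deployment the workers are separate processes sharing a database, so an in-process lock like this one would not help; the sketch only shows the shape of the interleaving.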

@hughrun hughrun self-assigned this Nov 16, 2024
@hughrun hughrun linked a pull request Nov 16, 2024 that will close this issue
2 participants