
drive.download() does not work as expected #350

Open
wants to merge 1 commit into main

Conversation

gmaclennan

Calling await drive.download() does not download the expected data if you do not wait for the drive to update first. Currently drive.update({ wait: true }) does not work as expected either. This is possibly because hyperbee read streams have not been updated to account for hypercore's change in default behaviour, where updates are now local-only by default.

This PR adds a failing test, but I'm not sure if the fix belongs here or in hyperbee.


// Must do this in order for this test to pass
// await drive2.db.core.update({ wait: true })
await drive2.download()
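
For reference, a fuller sketch of the scenario (an assumed setup for illustration; the actual test added in this PR may differ):

const Corestore = require('corestore')
const Hyperdrive = require('hyperdrive')

// Writer side
const store1 = new Corestore('./store1')
const drive1 = new Hyperdrive(store1)
await drive1.put('/file.txt', Buffer.from('hello'))

// Reader side, pointing at the writer's key
const store2 = new Corestore('./store2')
const drive2 = new Hyperdrive(store2, drive1.key)
await drive2.ready()

// Replicate the two stores over a pair of duplex streams
const s1 = store1.replicate(true)
const s2 = store2.replicate(false)
s1.pipe(s2).pipe(s1)

// Without waiting for drive2 to learn the latest length first,
// this returns without downloading the expected data
await drive2.download()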
Contributor

I think it's expected behaviour, because you haven't received the first version/length update yet.

If I understand correctly, you can only "update" once, to wait to "bootstrap" into the network so that you receive the latest length from peers; subsequent updates won't pull a new length, so you should not rely on force-updating to receive one.
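
If that reading is right, the behaviour looks roughly like this (a hedged sketch of that understanding, not verified against the hypercore source):

// First call: waits to "bootstrap" into the network and receives the
// latest known length from connected peers
await core.update({ wait: true })

// Later calls: resolve against what peers have already pushed to you;
// they don't actively pull a newer length from peers
await core.update({ wait: true })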

I might be wrong, I was also confused by this for a long time haha

In other news, we're working to improve this so replicating becomes way easier!

Contributor

Funnily enough, check out this really old version of drives:
https://github.com/holepunchto/drives/blob/43a2f23fc39a0cce713b4d7a62779490d1ccc11f/download.js#L40
I experienced something similar

Contributor

To summarize: I would say it's ok to wait for the first update.

We just need to fix Hyperbee so you can just do drive.update({ wait: true }) instead of accessing the db core, etc., but we have a PR for that already.

Author

Thanks @LuKks - the behaviour seemed to change with Hypercore 10.6.0, which made updates local-only. At least, I had a test that did something similar to this (without calling update({ wait: true }) first) that worked prior to 10.6.0, but maybe I was using it wrong!

I do find the update() stuff confusing - I'm never sure when I need to call it to make sure things work as expected, and since we don't exclusively use hyperswarm for discovery (we also use mdns), I'm never quite sure how the findingPeers stuff is supposed to work.

Author

FYI I have been using this for implementing a "live" download functionality: https://github.com/digidem/mapeo-core-next/blob/main/lib/blob-store/live-download.js#L87

Is this something you would consider adding to hyperdrive?

Contributor

Once the Hyperbee update method is fixed, you can just do it like so:

const Hyperswarm = require('hyperswarm')

const swarm = new Hyperswarm()
const done = drive.findingPeers() // hold drive requests until done() is called
swarm.on('connection', (socket) => drive.replicate(socket))
swarm.join(drive.discoveryKey)
swarm.flush().then(done, done) // release once the swarm has flushed
await drive.update({ wait: true })

Contributor

Keep in mind something like this:

Let's say you already have the code above working and everything is ok, and then later you run this a second time:
await drive.update({ wait: true })

You would expect the drive to PULL the latest length from other peers, right? That's the main issue: peers must PUSH new data to you, in this case the latest version/length.

So you only get background updates for new versions! I think you can always use the Hypercore 'append' event to know when you receive an update.
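
For example, a small sketch (using drive.db.core for the underlying Hypercore, as elsewhere in this thread):

drive.db.core.on('append', () => {
  // Fires whenever the core grows, including when a peer pushes
  // new blocks during replication
  console.log('drive updated, new version:', drive.version)
})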

Contributor

Ah, sorry, I didn't answer your question about how findingPeers works.

First check this: https://github.com/holepunchto/hyperdrive#const-done--drivefindingpeers
Very importantly, it doesn't depend on Hyperswarm.

When you do this:
const done = core.findingPeers()
it does literally almost nothing; it's just an internal counter that gets incremented.

All requests made after that point are put on hold until you call done(), so call it when your mdns discovery has connected to all the peers it found, that's it.
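
A sketch of that flow with a non-Hyperswarm source of peers (mdnsDiscovery here is a hypothetical emitter standing in for your mdns implementation):

const done = drive.findingPeers() // bump the internal counter

mdnsDiscovery.on('peer', (socket) => drive.replicate(socket))
mdnsDiscovery.once('flushed', () => done()) // initial search finished

// Held until done() is called, then resolves with the latest length
await drive.update({ wait: true })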

Author

Ah right, thanks for that @LuKks, helpful. I think in our case the main challenge is deciding when "done" happens - new devices can appear at any time, so it's never actually "done".

Contributor

When you start the mdns search, if there is a peer available, it normally appears extremely fast, right?

So you could just wait 3-5 seconds for a peer to appear, and otherwise continue without one.

When you join a swarm topic (i.e. the drive's discovery key) and there are no peers, the update takes a few seconds as well; you should try it out.
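
A sketch of that timeout approach (mdnsDiscovery and the 5-second figure are illustrative):

const done = drive.findingPeers()

let finished = false
const finish = () => {
  if (finished) return // only release once
  finished = true
  done()
}

setTimeout(finish, 5000) // give a peer ~5 seconds to appear
mdnsDiscovery.once('peer', finish) // or release as soon as one connects

await drive.update({ wait: true })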
