You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
An input stream opened to a cache file can outlive the cache file itself, in which case the current cache code will delete it (or attempt to) before the stream is closed. Under windows, this fails immediately; on max/linux, it succeeds. In either case, the file continues to exist for some time but is not counted against the cache total.
In general, even without explicit cache clearing, any cache file input stream might still be in used when the cache is asked to store more files, which can result in a cleanup, so this needs to be handled in a general fashion
We cannot track the lifetime of open input streams, so we have to handle this another way:
Fix tests that leave the stream unconsumed -- this is the cause of the immediate failure
Under windows, if we fail to delete, we keep track of the file as obsolete, count it against the total disk space, and attempt to delete it later until it succeeds
Under macos/linux, possibly keep track an unlinked but undeleted inode, and count it against the total disk space
The text was updated successfully, but these errors were encountered:
As @radeusgd points out, the Data.fetch interface is meant to be the only one used by end-users, and the Data methods always close streams.
This suggests that should consider still-open streams to be either (1) a library bug, or (2) a race condition caused by cache pressure during a long fetch. We don't have to concern ourselves with possibility that an end user will try to open a stream and keep it open for a long period.
Solving the race condition requires checking for successful file deletion, and retrying, but we can assume that undeleted files will only last a short time.
Indeed, using HTTP.fetch without consuming the output is a bug. I wonder what we could do to detect these.
Perhaps we could add a piece of code to the finalizer that logs a warning when a stream is closed by GC reclaiming its associated Managed_Resource (but such warning should only apply to the HTTP.fetch streams, other streams like the materialized one are supposed to be handled by GC like that). When the stream is manually closed (i.e. by being consumed - for example like in Data.fetch by being materialized) this would not be logged. We could then inspect if our tests end up logging this.
To protect against unexpected unclosed streams, it might be possible to add a wrapper to detect stream closure, and defer cleanup of obsolete cache files until the streams are closed. This would require extensive testing, though, so it's not a quick fix.
But for detecting unclosed streams in current library code (and end-user code that goes throught Data.fetch) this logging would be enough.
An input stream opened to a cache file can outlive the cache file itself, in which case the current cache code will delete it (or attempt to) before the stream is closed. Under windows, this fails immediately; on max/linux, it succeeds. In either case, the file continues to exist for some time but is not counted against the cache total.
In general, even without explicit cache clearing, any cache file input stream might still be in used when the cache is asked to store more files, which can result in a cleanup, so this needs to be handled in a general fashion
We cannot track the lifetime of open input streams, so we have to handle this another way:
The text was updated successfully, but these errors were encountered: