Module fetching

Asynchronous retrieval of missing data.

FetchingDataSource combines a local storage implementation with a remote data availability provider to create a data source which caches data locally, but which is capable of fetching missing data from a remote source, either proactively or on demand.

This implementation supports three kinds of data fetching.

§Proactive Fetching

Proactive fetching means actively scanning the local database for missing objects and proactively retrieving them from a remote provider, even if those objects have yet to be requested by a client. Doing this increases the chance of success and decreases latency when a client does eventually ask for those objects. This is also the mechanism by which a query service joining a network late, or having been offline for some time, is able to catch up with the events on the network that it missed.

The current implementation of proactive fetching is meant to be the simplest effective algorithm which still gives us a reasonable range of configuration options for experimentation. It is subject to change as we learn about the behavior of proactive fetching in a realistic system.

Proactive fetching is currently implemented by a background task which performs periodic scans of the database, identifying and retrieving missing objects. This task is generally low priority, since missing objects are rare, and it will take care not to monopolize resources that could be used to serve requests. To reduce load and to optimize for the common case where blocks are usually not missing once they have already been retrieved, we distinguish between major and minor scans.

Minor scans are lightweight and can run very frequently. They only look for missing blocks among blocks that are new since the previous scan, so the more frequently minor scans run, the less work each one has to do. This gives low latency for retrieval of newly produced blocks that we failed to receive initially. Between minor scans, the task sleeps for a configurable duration to wait for new blocks to be produced and to give other tasks full access to all shared resources.

Every nth scan (n is configurable) is a major scan. A major scan covers all blocks starting from 0, which guarantees that we will eventually retrieve all blocks, even if for some reason we have lost a block that we previously had (due to storage failures and corruptions, or simple bugs in this software). Major scans are rather expensive (although they release control of shared resources many times over the course of the scan), but because it is rather unlikely that a major scan will discover any missing blocks that the next minor scan would have missed, it is acceptable for major scans to run very infrequently.
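The major/minor schedule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: `ScanKind`, `ScanSchedule`, `major_interval`, and `next_scan` are hypothetical names, and the sketch only decides which block range each scan should cover. The sleep between scans and the actual retrieval of missing blocks are left out.

```rust
use std::ops::Range;

/// Which kind of scan to run on a given iteration (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ScanKind {
    Major,
    Minor,
}

/// Hypothetical scheduler state; field names are illustrative only.
pub struct ScanSchedule {
    /// Every `major_interval`-th scan is a major scan (the "n" in the text).
    major_interval: u64,
    /// Number of scans planned so far.
    scans_planned: u64,
    /// Chain height observed when the previous scan was planned.
    scanned_up_to: u64,
}

impl ScanSchedule {
    pub fn new(major_interval: u64) -> Self {
        assert!(major_interval > 0, "major_interval must be positive");
        Self {
            major_interval,
            scans_planned: 0,
            scanned_up_to: 0,
        }
    }

    /// Decide the kind of the next scan and the block range it should cover.
    pub fn next_scan(&mut self, chain_height: u64) -> (ScanKind, Range<u64>) {
        self.scans_planned += 1;
        let kind = if self.scans_planned % self.major_interval == 0 {
            ScanKind::Major
        } else {
            ScanKind::Minor
        };
        let start = match kind {
            // A major scan re-checks everything from block 0.
            ScanKind::Major => 0,
            // A minor scan only covers blocks that are new since the last scan.
            ScanKind::Minor => self.scanned_up_to.min(chain_height),
        };
        self.scanned_up_to = chain_height;
        (kind, start..chain_height)
    }
}
```

With `ScanSchedule::new(3)`, the first two calls to `next_scan` yield minor scans over only the newly produced blocks, and the third yields a major scan starting again from 0.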

§Active Fetching

Active fetching means reaching out to a remote data availability provider to retrieve a missing resource, upon receiving a request for that resource from a client. Not every request for a missing resource triggers an active fetch. To avoid spamming peers with requests for missing data, we only actively fetch resources that are known to exist somewhere. This means we can actively fetch a leaf or header when it is requested by height and that height is less than the current chain height. We can fetch a block when the corresponding header (matched by height, hash, or payload hash) either exists or can itself be actively fetched.
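The rule for when an active fetch is allowed can be summarized in a small decision function. The sketch below is a simplification under stated assumptions: `Request`, `HeaderStatus`, and `can_fetch_actively` are hypothetical names introduced for illustration and do not correspond to types in this module.

```rust
/// What we know about the header corresponding to a requested block
/// (hypothetical type for illustration).
pub enum HeaderStatus {
    /// The header already exists in storage.
    Stored,
    /// The header is missing, but it is identified by this height.
    ByHeight(u64),
    /// We have no header showing that the block exists.
    Unknown,
}

/// Simplified description of a client request (hypothetical type).
pub enum Request {
    /// A leaf or header requested by height.
    LeafOrHeaderByHeight(u64),
    /// A block, together with the status of its corresponding header.
    Block(HeaderStatus),
}

/// Returns true if the resource is known to exist somewhere, so that asking
/// a remote provider for it is worthwhile.
pub fn can_fetch_actively(req: &Request, chain_height: u64) -> bool {
    match req {
        // Anything below the current chain height must exist.
        Request::LeafOrHeaderByHeight(h) => *h < chain_height,
        // The header exists, so the block does too.
        Request::Block(HeaderStatus::Stored) => true,
        // The header itself can be actively fetched, so the block can follow.
        Request::Block(HeaderStatus::ByHeight(h)) => *h < chain_height,
        // No evidence that the block exists: do not contact peers.
        Request::Block(HeaderStatus::Unknown) => false,
    }
}
```

For example, with a chain height of 10, a header requested at height 5 qualifies for an active fetch, while one requested at height 10 does not.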

§Passive Fetching

For requests that cannot be actively fetched (for example, a block requested by hash, where we do not have a header proving that a block with that hash exists), we use passive fetching. This essentially means waiting passively until the query service receives an object that satisfies the request. This object may be received because it was actively fetched in response to a different request for the same object, one that permitted an active fetch, or because it was fetched proactively.
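One way to picture passive fetching is as a subscription: the request registers a predicate and waits, and whichever path eventually stores a matching object wakes it up. The following is a minimal synchronous sketch using standard-library channels. The `Notifier` type here and its `wait_for` and `notify` methods are hypothetical and are not the API of this module, which is asynchronous.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical registry of passive waiters, each with a predicate describing
/// the object it is waiting for.
pub struct Notifier<T> {
    waiters: Vec<(Box<dyn Fn(&T) -> bool>, Sender<T>)>,
}

impl<T: Clone> Notifier<T> {
    pub fn new() -> Self {
        Self { waiters: Vec::new() }
    }

    /// Register a passive fetch: the returned receiver yields the first
    /// object passed to `notify` that satisfies `pred`.
    pub fn wait_for(&mut self, pred: impl Fn(&T) -> bool + 'static) -> Receiver<T> {
        let (tx, rx) = channel();
        self.waiters.push((Box::new(pred), tx));
        rx
    }

    /// Called whenever an object arrives, whether from an active fetch for
    /// another request or from a proactive scan.
    pub fn notify(&mut self, obj: &T) {
        self.waiters.retain(|(pred, tx)| {
            if pred(obj) {
                // Ignore send errors: the waiter may have given up already.
                let _ = tx.send(obj.clone());
                false // satisfied, so drop this waiter
            } else {
                true // keep waiting
            }
        });
    }
}
```

A waiter registered with `wait_for(|h| *h == 7)` on a `Notifier<u64>` stays pending while unrelated objects arrive and receives its value once `notify(&7)` is called.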

Modules§

block 🔒
Fetchable implementation for BlockQueryData and PayloadQueryData.
header 🔒
Header fetching.
leaf 🔒
Fetchable implementation for LeafQueryData.
state_cert 🔒
Fetching for light client state update certificates.
transaction 🔒
Transaction fetching.
vid 🔒
Fetchable implementation for VidCommonQueryData.

Structs§

AggregatorMetrics 🔒
Builder
Builder for FetchingDataSource with configuration.
Fetcher 🔒
Asynchronous retrieval and storage of Fetchable resources.
FetchingDataSource
The most basic kind of data source.
Heights 🔒
Notifiers 🔒
Pruner
ScannerMetrics 🔒

Traits§

AvailabilityProvider
A provider which can be used as a fetcher by the availability service.
FetchRequest 🔒
Fetchable 🔒
Objects which can be fetched from a remote DA provider and cached in local storage.
RangedFetchable 🔒
ResultExt 🔒
Storable 🔒
An object which can be stored in the database.

Functions§

passive 🔒
Turn a fallible passive fetch future into an infallible “fetch”.
range_chunks 🔒
Break a range into fixed-size chunks.
range_chunks_rev 🔒
Break a range into fixed-size chunks, starting from the end and moving towards the start.
select_some 🔒
Get the result of the first future to return Some, if either do.

Type Aliases§

PassiveFetch 🔒