kappa-core is a minimal peer-to-peer database, based on append-only logs and materialized views.
New to kappa architecture? There is a short introduction.
This example sets up an on-disk log store and an in-memory view store. The view tallies the sum of all of the numbers in the logs, and provides an API for getting that sum.
var kappa = require('kappa-core')
var view = require('kappa-view')
var memdb = require('memdb')
// Store logs in a directory called "log". Store views in memory.
var core = kappa('./log', { valueEncoding: 'json' })
var store = memdb()
// View definition
var sumview = view(store, function (db) {
  return {
    // Called with a batch of log entries to be processed by the view.
    // No further entries are processed by this view until 'next()' is called.
    map: function (entries, next) {
      db.get('sum', function (err, value) {
        var sum
        if (err && err.notFound) sum = 0
        else if (err) return next(err)
        else sum = value

        // Add up the batch inside the callback, where 'sum' is in scope
        entries.forEach(function (entry) {
          if (typeof entry.value === 'number') sum += entry.value
        })
        db.put('sum', sum, next)
      })
    },

    // Whatever is defined in the "api" object is publicly accessible
    api: {
      get: function (core, cb) {
        this.ready(function () { // wait for all views to catch up
          db.get('sum', cb) // read the stored total
        })
      }
    }
  }
})
// the api will be mounted at core.api.sum
core.use('sum', 1, sumview) // name the view 'sum' and consider the 'sumview' logic as version 1
core.writer('default', function (err, writer) {
  writer.append(1, function (err) {
    core.api.sum.get(function (err, value) {
      console.log(value) // 1
    })
  })
})
var kappa = require('kappa-core')
Create a new kappa-core database.
- storage is a random-access-storage function, or a string. If a string is given, random-access-file is used with that string as the filename.
- Valid opts include:
  - valueEncoding: a string describing how the data will be encoded.
  - multifeed: a preconfigured instance of multifeed.
Get or create a local writable log called name. If it already exists, it is returned; otherwise it is created. A writer is an instance of hypercore.
Fetch a log / feed by its public key (a Buffer or hex string).
An array of all hypercores in the kappa-core. Check a feed's key to find the one you want, or check its writable / readable properties.
Only populated once core.ready(fn) is fired.
Install a view called name to the kappa-core instance. A view is an object of the form:
// All are optional except "map"
{
  // Process each batch of entries
  map: function (entries, next) {
    entries.forEach(function (entry) {
      // ...
    })
    next()
  },

  // Your useful functions for users of this view to call
  api: {
    someSyncFunction: function (core) { return ... },
    someAsyncFunction: function (core, cb) { process.nextTick(cb, ...) }
  },

  // Save progress state so processing can resume on later runs of the program.
  // Not required if you're using the "kappa-view" module, which handles this for you.
  fetchState: function (cb) { ... },
  storeState: function (state, cb) { ... },
  clearState: function (cb) { ... },

  // Runs after each batch of entries is done processing and progress is persisted
  indexed: function (entries) { ... },

  // Number of entries to process in a batch
  maxBatch: 100
}
NOTE: The kappa-core instance core is always passed as the first parameter in all of the api functions you define.
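To make the batching contract concrete, here is a minimal sketch in plain JavaScript of how a view's map function could be driven: entries are handed over in chunks of at most maxBatch, and the next chunk is only delivered once next() has been called. The processInBatches helper is hypothetical and is not part of kappa-core; it only illustrates the behaviour described above.

```javascript
// Hypothetical helper, not part of kappa-core: feeds 'entries' to 'view.map'
// in batches of at most 'view.maxBatch', one batch at a time.
function processInBatches (view, entries, done) {
  var size = view.maxBatch || 100
  var offset = 0

  function step (err) {
    if (err) return done(err)
    if (offset >= entries.length) return done(null)
    var batch = entries.slice(offset, offset + size)
    offset += size
    // The next batch is only handed over once the view calls next().
    view.map(batch, step)
  }

  step(null)
}

// Usage: record the size of each batch the view receives.
var batchSizes = []
var view = {
  maxBatch: 2,
  map: function (entries, next) {
    batchSizes.push(entries.length)
    next()
  }
}

var entries = [1, 2, 3, 4, 5].map(function (n) { return { value: n } })
processInBatches(view, entries, function (err) {
  if (err) throw err
  console.log(batchSizes) // [ 2, 2, 1 ]
})
```

Passing an error to next() stops the loop in this sketch, which mirrors the `return next(err)` pattern used in the sum view example.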
version is an integer that represents what version you want to consider the view logic as. Whenever you change it (generally by incrementing it by 1), the underlying data generated by the view will be wiped, and the view will be regenerated again from scratch. This provides a means to change the logic or data structure of a view over time in a way that is future-compatible.
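The wipe-and-regenerate behaviour can be pictured with a small plain JavaScript sketch. It is not kappa-core's implementation: installView, stored, and rebuildV2 are made-up names, and the "log" is just an array of numbers. The point is only that a changed version number discards the derived data and rebuilds it from the log.

```javascript
// Hypothetical sketch, not kappa-core's implementation.
function installView (stored, version, rebuild) {
  if (stored.version !== version) {
    // The view logic changed: drop the derived data and start over.
    stored.data = {}
    stored.version = version
    rebuild(stored.data)
    return true
  }
  return false // same version: keep the existing data
}

var logEntries = [1, 2, 3]
var stored = { version: 1, data: { sum: 6 } }

// Version 2 of the view tracks a count as well as a sum.
function rebuildV2 (data) {
  data.sum = 0
  data.count = 0
  logEntries.forEach(function (n) {
    data.sum += n
    data.count += 1
  })
}

var wiped = installView(stored, 2, rebuildV2)
console.log(wiped, stored.data) // true { sum: 6, count: 3 }

var wipedAgain = installView(stored, 2, rebuildV2)
console.log(wipedAgain) // false
```

Because the logs themselves are never modified, rebuilding a view from scratch loses nothing; only the derived data is thrown away.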
The fetchState, storeState, and clearState functions are optional: they tell the view where to store its state information about what log entries have been indexed thus far. If they are not passed in, the state is kept in memory (i.e. entries are reprocessed on each fresh run of the program). You can use any backend you want (like leveldb) to store the Buffer object state. If you use a module like kappa-view, it will handle state management on your behalf.
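As a rough illustration, the three state functions could be backed by something as simple as a variable. In the sketch below, createStateStore is a hypothetical helper, the (err, state) callback shape is an assumption, and a real program would swap the in-memory variable for a persistent backend so that progress survives restarts.

```javascript
// In-memory stand-in for a persistent backend such as leveldb.
// 'createStateStore' is a hypothetical helper, and the (err, state)
// callback shape is an assumption.
function createStateStore () {
  var saved = null
  return {
    fetchState: function (cb) { cb(null, saved) },
    storeState: function (state, cb) {
      saved = Buffer.from(state) // keep a copy of the Buffer
      cb(null)
    },
    clearState: function (cb) {
      saved = null
      cb(null)
    }
  }
}

var stateStore = createStateStore()
var fetched = []

stateStore.storeState(Buffer.from('progress-marker'), function () {
  stateStore.fetchState(function (err, state) {
    fetched.push(state.toString())
    stateStore.clearState(function () {
      stateStore.fetchState(function (err, state) {
        fetched.push(state)
      })
    })
  })
})

console.log(fetched) // [ 'progress-marker', null ]
```

The three functions would sit alongside map in the view object passed to core.use.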
indexed is an optional function to run whenever a new batch of entries has been indexed and written to storage. It receives an array of entries.
Wait until all views named by viewNames are caught up. E.g.
// one
core.ready('sum', function () { ... })
// or several
core.ready(['kv', 'refs', 'spatial'], function () { ... })
If viewNames is [] or not included, all views will be waited on.
Pause some or all of the views' indexing process. If no viewNames are given, they will all be paused. cb is called once the views finish up any entries they're in the middle of processing and are fully stopped.
Resume some or all paused views. If no viewNames are given, all views are resumed.
Create a duplex replication stream. opts are passed in to multifeed's API of the same name.
Ensure that isInitiator is set to true on one side and false on the other. This is necessary for setting up the encryption mechanism.
Event emitted when an error within kappa-core has occurred. This is very important to listen on, lest things suddenly seem to break and it's not immediately clear why.
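The sketch below shows why an 'error' listener matters, assuming the kappa-core instance behaves like a standard Node.js EventEmitter. Here core is a plain EventEmitter used as a stand-in, not a real kappa-core database, and the error message is invented for the example.

```javascript
var EventEmitter = require('events')

// Stand-in for a kappa-core instance: a plain Node.js EventEmitter.
var core = new EventEmitter()

// Without a listener, emitting 'error' throws.
var threw = false
try {
  core.emit('error', new Error('view failed'))
} catch (err) {
  threw = true
}

// With a listener, the error is delivered to it instead of being thrown.
var seen = []
core.on('error', function (err) {
  seen.push(err.message)
})
core.emit('error', new Error('view failed'))

console.log(threw, seen) // true [ 'view failed' ]
```

Registering the listener right after creating the database means failures are reported where you can see them.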
With npm installed, run
$ npm install kappa-core
Here are some useful modules that play well with kappa-core for building materialized views:
- unordered-materialized-bkd: spatial index
- unordered-materialized-kv: key/value store
- unordered-materialized-backrefs: back-references
kappa-core is built atop two major building blocks:
- hypercore, which is used for (append-only) log storage
- materialized views, which are built by traversing logs in potentially out-of-order sequence
hypercore provides some very useful superpowers:
- all data is cryptographically associated with a writer's public key
- partial replication: parts of logs can be selectively sync'd between peers, instead of all-or-nothing, without loss of cryptographic integrity
Building views in arbitrary sequence is more challenging than when order is known to be topological or sorted in some way, but confers some benefits:
- most programs are only interested in the latest values of data; the long tail of history can be traversed asynchronously at leisure after the tips of the logs are processed
- the views are tolerant of partially available data. By contrast, many of the modules listed in the section below depend on topological completeness: all entries referenced by an entry must be present for indexes to function. Tolerating partial data makes things like the equivalent of a shallow clone (think git) possible, where a small subset of the full dataset can be used and built on without breaking anything.
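A toy example may help show what "order does not matter" means for a view. The sketch below is not how kappa-core or the unordered-materialized modules work internally. It assumes a hypothetical entry shape of { key, value, seq } where the highest seq wins, and shows that the view reaches the same result whether it sees the tip of the log first or the full history in order.

```javascript
// Hypothetical entry shape: { key, value, seq }. Higher seq wins.
function applyEntries (state, entries) {
  entries.forEach(function (entry) {
    var current = state[entry.key]
    if (!current || entry.seq > current.seq) {
      state[entry.key] = { value: entry.value, seq: entry.seq }
    }
  })
  return state
}

var history = [
  { key: 'color', value: 'red', seq: 0 },
  { key: 'color', value: 'green', seq: 1 },
  { key: 'color', value: 'blue', seq: 2 }
]

// Tip first: the latest value is usable before any history arrives.
var tipFirst = applyEntries({}, [history[2]])
console.log(tipFirst.color.value) // blue

// The older entries arrive in a later batch and change nothing.
applyEntries(tipFirst, [history[0], history[1]])

// Oldest to newest, for comparison.
var inOrder = applyEntries({}, history)

console.log(tipFirst.color.value, inOrder.color.value) // blue blue
```

The view is already useful after the first batch, which is the property that lets the long tail of history be processed later or skipped entirely.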
kappa-core is built atop ideas from a huge body of others' work:
- flumedb
- secure scuttlebutt
- hypercore
- hyperdb
- forkdb
- hyperlog
- a harmonious meshing of ideas with @substack in the south of Spain
ISC