CHANGELOG.md

WIP

  • Introduce stream abstraction
    • meta only mode for sync
  • Importing existing connections
  • Complete stream abstraction
    • API-only stateless mode
    • stream specific sync

2023-02-18 Removed NEXT_PUBLIC_SERVER_URL env var

  • We now automatically get the needed value from window / hostname / VERCEL_URL depending on the deployment environment
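The fallback order described above can be sketched as follows. This is an illustrative sketch, not the actual Venice implementation; the function name `getServerUrl` and the localhost default are assumptions.

```typescript
// Illustrative sketch of the resolution order: browser window ->
// request host header -> VERCEL_URL -> local dev fallback.
function getServerUrl(req?: {
  headers: Record<string, string | string[] | undefined>
}): string {
  // 1. In the browser, derive the URL from window.location
  if (typeof window !== 'undefined') {
    return window.location.origin
  }
  // 2. On the server, fall back to the incoming request's Host header
  const host = req?.headers['host']
  if (typeof host === 'string') {
    return `https://${host}`
  }
  // 3. On Vercel, use the VERCEL_URL env var set at deploy time
  if (process.env.VERCEL_URL) {
    return `https://${process.env.VERCEL_URL}`
  }
  // 4. Local development default (assumed port)
  return 'http://localhost:3000'
}
```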

2023-01-27 Replace pg_cron + graphile-worker with inngest for reliable background sync

  • Inngest is much more reliable, scalable, and debuggable.
  • It also decouples the core worker implementation from Postgres / Supabase if you choose to use a different metadata backend.
  • Step function support in Inngest will allow us to work around the 10–60 second limit on Vercel functions for syncs that take longer.
    • It also opens up the possibility of implementing sync with more advanced patterns, whether granular resumable sync or fan-out.
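The idea behind step functions, which makes longer syncs possible despite per-invocation time limits, can be sketched generically. This is an illustration of the checkpoint-and-resume pattern, not the Inngest SDK; `runStep` and the in-memory checkpoint store are assumptions for the sketch (a real implementation persists checkpoints durably).

```typescript
// Each step's result is checkpointed before moving on, so a re-run
// after a timeout or crash skips already-completed work and resumes
// where it left off. Steps become the unit of retry, not the whole sync.
type Checkpoints = Map<string, unknown>

async function runStep<T>(
  checkpoints: Checkpoints,
  id: string,
  fn: () => Promise<T> | T,
): Promise<T> {
  if (checkpoints.has(id)) {
    // Resume: reuse the saved result instead of re-executing the step
    return checkpoints.get(id) as T
  }
  const result = await fn()
  checkpoints.set(id, result) // persist before the next step
  return result
}
```

Fan-out falls out of the same primitive: each connection's sync can be its own step (or its own event), so one slow connection doesn't consume the whole invocation's time budget.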

2022-10-29 Background sync MVP

Venice will now automatically sync in the background in addition to listening for webhooks.

This is particularly useful in situations where webhooks aren't always available (e.g. syncing the current portfolio value as prices update) or if your server was offline or encountered a problem while handling a webhook.

By default we will look for all connections that have not been synced in the last 24 hours.
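The default staleness check can be sketched as a simple filter. The `Connection` shape, `lastSyncedAt` field, and function name below are illustrative assumptions, not Venice's actual schema.

```typescript
// Select connections due for a background sync: never synced, or not
// synced within the last 24 hours (the default window described above).
interface Connection {
  id: string
  lastSyncedAt: Date | null
}

const DEFAULT_STALE_AFTER_MS = 24 * 60 * 60 * 1000 // 24 hours

function connectionsToSync(
  connections: Connection[],
  now: Date = new Date(),
): Connection[] {
  return connections.filter(
    (c) =>
      c.lastSyncedAt === null || // never synced: always pick up
      now.getTime() - c.lastSyncedAt.getTime() > DEFAULT_STALE_AFTER_MS,
  )
}
```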

Upgrade requirements

  • Add a WORKER_INVOCATION_SECRET env var for security
  • Ensure your Postgres instance supports the two required extensions (pg_cron, plus an HTTP extension it uses to POST to the worker endpoint; see implementation detail below)
  • Make sure your NEXT_PUBLIC_SERVER_URL is set correctly if not already

Implementation detail

The main design goal is to keep the infrastructure lean and use existing tooling as much as possible, so we kept the job queue in Postgres via graphile-worker.

  • graphile-worker's main run function is implemented as a Next.js API route, /api/worker; each run can execute multiple jobs.
    • This means that the actual work performed by the tasks happens inside HTTP requests.
    • This design should scale well because you only have to scale the HTTP server (which is stateless and automatic on Vercel).
  • All business logic lives inside graphile-worker tasks. Notably, graphile-worker has its own implementation of cron; there are two tasks (the recurring task scheduleTasks, and syncPipeline).
    • Recurring tasks (aka cron) in graphile-worker are not to be confused with pg_cron; they serve separate purposes. graphile-worker is a Node program and cannot run indefinitely in serverless environments, so pg_cron is needed to trigger the worker event loop.
  • pg_cron has only a single schedule, runWorkerEveryMinute, which does an HTTP POST to $NEXT_PUBLIC_SERVER_URL/api/worker to trigger the worker.
    • This means that jobs will normally run once per minute.
    • If you are using Supabase, we added an optimization in the form of a database webhook that triggers /api/worker anytime a new job gets added, so jobs should run with sub-second latency on Supabase.
  • graphile-worker normally deletes completed jobs; we instead archive them into the graphile_worker.jobs_completed table for logging / analytics.
  • Migrations are required and run as part of the build:worker command in apps/web. This should be automatic if you use Vercel.
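The run-inside-an-HTTP-request design above can be sketched generically. This illustrates the pattern only; the `Job`/`TaskList` shapes and `runOnce` below are assumptions for the sketch, not graphile-worker's actual API (which pulls jobs from Postgres, not an in-memory array).

```typescript
// The worker's "run" is an ordinary request handler body that drains
// pending jobs, so scaling the worker just means scaling the HTTP
// server. One invocation can execute many jobs, not just one.
type Job = { id: number; task: string; payload: unknown }
type TaskList = Record<string, (payload: unknown) => Promise<void> | void>

async function runOnce(queue: Job[], tasks: TaskList): Promise<number> {
  let executed = 0
  while (queue.length > 0) {
    const job = queue.shift()!
    const task = tasks[job.task]
    if (task) {
      await task(job.payload) // the actual work happens inside the HTTP request
      executed++
    }
  }
  return executed
}
```

In the real route, the handler would also verify the WORKER_INVOCATION_SECRET before running any jobs, and archive completed jobs instead of discarding them.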

Architecture diagram

Limitations

  • If you are using Vercel, sync tasks may not take more than 10 seconds on the hobby plan or 60 seconds on the team plan; otherwise the function execution will time out.