Bulk Data Import Server (Data Consumer)
This server is an experimental prototype implementation of a Data Consumer application, as defined in the Bulk Data Ping and Pull Import Proposal. A hosted instance is available online, but you will have more options if you run it locally.

The server behaves like a Bulk Data client and consumes bulk-data NDJSON files. The online version discards imported files immediately. When run locally, the server can also be configured to store the files temporarily in the filesystem, or to upload them to an S3 bucket.

Also, the online version is configured to use the Bulk Data Reference Implementation server as its Data Provider. To use other bulk data servers you will have to run this app locally.
```sh
git clone https://github.com/smart-on-fhir/bulk-import-consumer.git
cd bulk-import-consumer
```
Once you are in the project folder, make sure you are using NodeJS >= 15. If you have nvm, just run `nvm use`. Then install the dependencies:

```sh
npm i
```
You need to set a few configuration variables before the server is started. There is an example configuration file `example.env` in the project root. Start by renaming it to `.env` and then edit it as needed.
- `NODE_ENV` - (optional, string) Can be `production` or `development`. Defaults to `production`.
- `PORT` - (optional, number) Defaults to `3001`.
- `HOST` - (optional, string) Defaults to `localhost`.
- `JOBS_PATH` - (optional, string) A path to a folder in which the import jobs are (temporarily) stored. Should be relative to the project root. Defaults to `jobs`.
- `JOBS_ID_LENGTH` - (optional, number) The length of the random job ID, also used as a sub-folder name inside the `JOBS_PATH` folder. Defaults to `8`.
- `JOBS_MAX_AGE` - (optional, number) Number of minutes after which the import jobs are deleted. If the server stores files in the file system, those files will be deleted as well. Note that this timeout is computed starting from the moment the import procedure is started, so make sure it is long enough for the import to complete, plus some additional time for the files to remain available if you need them. Defaults to `5`.
- `JSON_CONTENT_TYPES` - (optional, string) Comma-separated list of mime types that should be recognized as JSON. You shouldn't need to change this one. Defaults to `application/json,application/fhir+json,application/json+fhir`.
- `JWT_SECRET` - (REQUIRED, string) Random string secret used by the server to sign tokens.
- `ACCESS_TOKEN_EXPIRE_IN` - (optional, string) The lifespan (in minutes) of the access tokens issued by this server. Defaults to `5`.
- `NDJSON_MAX_LINE_LENGTH` - (optional, number) The longest NDJSON line (as number of characters) that we can parse without taking too much memory. Defaults to `500000`.
- `PUBLIC_KEY` - (REQUIRED, string) The public key, which should be provided while registering this server as a client of a Data Provider.
- `PRIVATE_KEY` - (REQUIRED, string) The private key used by this server to sign tokens sent to the Data Provider. This should be a JWK as a JSON string.
- `DESTINATION_TYPE` - (optional, string) What to do with imported files. Options are:
  - `dev-null` (default) - discard them immediately
  - `tmp-fs` - store them in the file system (see `JOBS_PATH`, `JOBS_ID_LENGTH` and `JOBS_MAX_AGE`)
  - `s3` - upload them to an S3 bucket (see AWS options below)
- `AWS_S3_BUCKET_NAME` - (string, required if `DESTINATION_TYPE` is `s3`)
- `AWS_ACCESS_KEY_ID` - (string, required if `DESTINATION_TYPE` is `s3`)
- `AWS_SECRET_ACCESS_KEY` - (string, required if `DESTINATION_TYPE` is `s3`)
- `AWS_API_VERSION` - (string, required if `DESTINATION_TYPE` is `s3`) Can be `2006-03-01` or `latest`. Defaults to `2006-03-01`.
- `AWS_REGION` - (string, required if `DESTINATION_TYPE` is `s3`) Defaults to `us-east-1`.
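Putting this together, a minimal `.env` for running locally might look like the following sketch. The key values shown here are placeholders, not working keys - generate real ones with the built-in key generator described below:

```sh
# Minimal example .env -- values are illustrative placeholders
NODE_ENV=development
PORT=3001
HOST=localhost

# REQUIRED: random secret used by the server to sign its tokens
JWT_SECRET=change-me-to-a-long-random-string

# REQUIRED: this server's key pair as JWK JSON strings (placeholders shown)
PUBLIC_KEY={"kty":"EC","crv":"P-384","x":"...","y":"..."}
PRIVATE_KEY={"kty":"EC","crv":"P-384","x":"...","y":"...","d":"..."}

# Discard imported files immediately (the default)
DESTINATION_TYPE=dev-null
```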
After the configuration is complete, start the server by running:

```sh
npm start
```

Then open the URL printed in the terminal.
To trigger an import you should make a "ping" request to the server as described in the import proposal. The import kick-off endpoint of the server is `http://{HOST}:{PORT}/$import`.
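As an illustration, a kick-off request could be put together like this. The payload field names (`exportUrl`, `exportType`) are assumptions based on the import proposal, not a verified contract for this server, and the bearer token placeholder stands in for a real token obtained via SMART backend services authentication:

```js
// Sketch of an import "ping" (kick-off) request. Field names are
// assumptions based on the Bulk Data import proposal, not a verified API.
const kickoffUrl = "http://localhost:3001/$import";

const body = JSON.stringify({
  // Bulk Data export endpoint of the Data Provider
  exportUrl: "https://bulk-data.smarthealthit.org/fhir/$export",
  exportType: "dynamic", // ask the provider to run a fresh export
});

const request = {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    // In a real request, obtain this via SMART backend services auth
    Authorization: "Bearer <access-token>",
  },
  body,
};

console.log(kickoffUrl, request);
// To actually send it (Node >= 18):
// const res = await fetch(kickoffUrl, request);
```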
You can use this server for testing if you are extending your Data Provider with import ping functionality. You can also try the sample import client from https://github.com/smart-on-fhir/bulk-import-client.
In order for an export and import to happen, both the Data Provider and Data consumer need to be registered as clients of each other.
- The consumer needs to know the provider and allow it to send ping (import kick-off) requests.
- The provider needs to know the consumer and allow it to make bulk data exports.
- Both sides should use SMART backend services authentication
There is a dedicated UI for that. Just start the server and go to its URL (`http://{HOST}:{PORT}`).
The client should have a public/private key pair. The private key is used by the client to sign its authentication tokens. The server only needs to know the client's public key, which can be provided in two ways:
- Static - at registration time, as a JWK
- Dynamic - at authentication time, the server will fetch it from the JWKS URL provided during registration
Once a public key is provided as `JWK` or `JWKS URI`, click "Register" and you'll get back a `client_id` which the client should use to authenticate.
This is basically the same procedure but in reverse order. You need to have a public/private key pair. The server already has those keys pre-configured, and you can change/re-generate them if you want (see below).

- Provide your public key as JWK - you can find your public key in `PUBLIC_KEY` in your `.env` configuration file.
- Provide a JWKS URL - your JWKS URL is `http://{HOST}:{PORT}/jwks`. Note that this will only work if this server is deployed on the internet. If it is running on localhost or in your local network, the Data Provider won't be able to access it.
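For reference, a JWKS document is simply a JSON object with a `keys` array of public JWKs (RFC 7517), so the `/jwks` endpoint returns something of this shape (values here are placeholders):

```json
{
  "keys": [
    {
      "kty": "EC",
      "crv": "P-384",
      "x": "...",
      "y": "...",
      "kid": "example-key-id"
    }
  ]
}
```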
The server comes with a key generator which can also be used to generate its own keys. To do so:

- Go to the server UI, open the generator, select an algorithm and click generate keys.
- Copy the generated public and private keys (as JWK) and paste them as `PUBLIC_KEY` and `PRIVATE_KEY` in your config file.
- Restart the server.
WARNING: Once the keys are regenerated, you will have to do the registrations (described above) again.