Skip to content

Commit

Permalink
first release!
Browse files Browse the repository at this point in the history
  • Loading branch information
ak--47 committed Mar 1, 2022
1 parent 9c37bb9 commit a017d1c
Show file tree
Hide file tree
Showing 3 changed files with 145 additions and 91 deletions.
77 changes: 52 additions & 25 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,19 @@
**note:** if you're trying to add real-time tracking mixpanel to a node.js web application - this module is **NOT** what you want; you want **[mixpanel-node](https://github.com/mixpanel/mixpanel-node)** the official node.js SDK.

## wat.

![stream events, users, and groups into mixpanel](https://aktunes.neocities.org/mixpanel-import.gif)

This module is designed for streaming large amounts of event or object data to Mixpanel from a node.js environment. It implements the [`/import`](https://developer.mixpanel.com/reference/events#import-events), [`/engage`](https://developer.mixpanel.com/reference/profile-set), and [`/groups`](https://developer.mixpanel.com/reference/group-set-property) APIs by streaming JSON files that are compliant with Mixpanel's [data model](https://developer.mixpanel.com/docs/data-structure-deep-dive).

This is particularly useful for running one-time backfills or streaming data into Mixpanel from cloud-based data pipelines.
This utility is particularly useful for running one-time backfills or streaming data into Mixpanel from cloud-based data pipelines where RETL is not available.


## tldr;

this module can be used in *two ways*; as a native module or as a standalone ETL.
this module can be used in *two ways*:
- as a module via `require('mixpanel-import')`
- as a standalone script via `npx mixpanel-import`

### module usage
install `mixpanel-import` as a dependency
Expand All @@ -39,22 +44,9 @@ run it and providing a path to the data you wish to import:
```
$ node index.js ./pathToData
```
when running stand-alone, `pathToData` can be a `.json`, `.jsonl`, or `.ndjson` file OR a directory which contains said files.
when running stand-alone, `pathToData` can be a `.json`, `.jsonl`, `.ndjson`, or `.txt` file OR a directory which contains said files.

you will also need a `.env` configuration file with the following values:

```
# if using service account auth; these 3 values are required:
MP_PROJECT={{your-mp-project}}
MP_ACCT={{your-service-acct}}
MP_PASS={{your-service-pass}}
# if using secret based auth; only this value is required
MP_SECRET={{your-api-secret}}
# this is optional (but strongly encouraged)
MP_TOKEN={{your-mp-token}}
```
you will also need a [`.env` configuration file](#environment-variables)

## arguments

Expand All @@ -63,7 +55,7 @@ when using `mixpanel-import` in code, you will pass in 3 arguments: [`credentia
### credentials
mixpanel's ingestion APIs authenticate with [service accounts](https://developer.mixpanel.com/reference/service-accounts) OR [API secrets](https://developer.mixpanel.com/reference/authentication#service-account); service accounts are the preferred authentication method.

using a service account:
#### service account:
```javascript
const creds = {
acct: `{{my-servce-acct}}`, //service acct username
Expand All @@ -73,14 +65,16 @@ const creds = {
}
const importedData = await mpImport(creds, data, options);
```
using secrets:
#### API secret:
```javascript
const creds = {
secret: `{{my-api-secret}}`, //api secret (deprecated auth)
token: `{{my-project-token}}` //project token
}
const importedData = await mpImport(creds, data, options);
```

#### environment variables:
note: it is possible to delegate the authentication details to environment variables, using a `.env` file of the form:

```
Expand All @@ -96,7 +90,7 @@ MP_SECRET={{your-api-secret}}
MP_TOKEN={{your-mp-token}}
```

if using environment variables for authentication, pass `null` as the first argument to the module:
if using environment variables for authentication, pass `null` as the `creds` (first argument) to the module:

```javascript
const importedData = await mpImport(null, data, options);
Expand All @@ -107,12 +101,12 @@ the `data` param represents the data you wish to import; this might be [events](

the value of data can be:

- a path to a _file_, which contains records as `.json`, or `.jsonl/.ndjson`
- a path to a _file_, which contains records as `.json`, `.jsonl`, `.ndjson`, or `.txt`
```javascript
const data = `./myEventsToImport.json`
const importedData = await mpImport(creds, data, options);
```
- a path to a _directory_, which contains files that have records as `.json`, or `.jsonl/.ndjson`
- a path to a _directory_, which contains files that have records as `.json`, `.jsonl`, `.ndjson`, or `.txt`
```javascript
const data = `./myEventsToImport/`
const importedData = await mpImport(creds, data, options);
Expand Down Expand Up @@ -145,13 +139,46 @@ const options = {
strict: true, //use strict mode?
logs: false, //print to stdout?

//a reference to a function that will be called on every record
//useful if you need to transform the data before streaming
//a function reference to be called on every record
//useful if you need to transform the data
transformFunc: function noop(a) { return a }
}
```
**note**: the `recordType` param is very important; by default this module assumes you wish to import `events` but change this value to `user` or `group` if you are importing other entities.
**note**: the `recordType` param is very important; by default this module assumes you wish to import `event` records but change this value to `user` or `group` if you are importing other entities.

## recipies
the `transformFunc` is useful because it can preprocess the records using arbitrary javascript. here are some examples:

- putting a `token` on every `user` record:
```javascript
function addToken(user) {
user.token = `{{my token}}`
return user
}

let res = await mpImport(creds, data, { transformFunc: addToken, recordType: 'user' })
```
- constructing an `$insert_id` for each event:

```javascript
const md5 = require('md5')

function addInsert(event) {
let hash = md5(event);
event.properties.$insert_id = hash;
return event
}
let res = await mpImport(creds, data, { transformFunc: addInsert }
```
## test data
sometimes it's helpful to generate test data, so this module includes [a separate utility](https://github.com/ak--47/mixpanel-import/blob/main/generateFakeData.js) to do that:
```bash
$ npm run generate
```
`someTestData.json` will be written to `./testData`
## why?
because... i needed this and it didn't exists... so i made it.
Expand Down
Loading

0 comments on commit a017d1c

Please sign in to comment.