Commit

Merge pull request #274 from NEU-DSG/cm/contributor-audio-migrations
Database migrations for contributor audio and curation
GracefulLemming authored May 18, 2023
2 parents cb3334b + bf20a07 commit 258ebaf
Showing 5 changed files with 86 additions and 20 deletions.
11 changes: 6 additions & 5 deletions doc/database/media.md
@@ -5,11 +5,12 @@
A timed media resource like video or audio, from an external source.
The main use case is audio recordings of each document.

| column | type | description |
| ------------- | ------- | --------------------------------------- |
| `id` | `uuid` | Primary key |
| `url` | `text` | Full URL for this media resource |
| `recorded_at` | `date?` | Date and time this resource was created |
| column | type | description |
| ------------- | --------------------- | -------------------------------------------------------------------------------------------- |
| `id` | `uuid` | Primary key |
| `url` | `text` | Full URL for this media resource |
| `recorded_at` | `date?` | Date and time this resource was created |
| `recorded_by` | `uuid? -> dailp_user` | The user that recorded this audio, if the audio was recorded by a Contributor on the website |

- Deleting also deletes all `media_slice` rows that reference it

14 changes: 14 additions & 0 deletions doc/database/readme.md
@@ -4,6 +4,19 @@ The docs here describe every single one of our database tables and columns.
If you write a database schema migration, you should change the corresponding docs in this folder to match the new shape of the database.
If you work with database tables that aren't sufficiently documented here, please add the missing documentation!

## How to write a new migration

To create a migration file, use the following command inside your `nix develop` shell.

```zsh
cd types
sqlx migrate add <migration_description>
```

To test your migration without clearing your database, run `sqlx migrate run`.

Other developers will get your migrations when they run `dev-migrate-schema`.

## Abbreviations in this Folder

Most of our columns are `not null`, which is long to write so we introduced shorthand for describing database columns.
@@ -33,3 +46,4 @@ Most of our columns are `not null`, which is long to write so we introduced shor
- [collections](./collections.md): Edited collections tables
- [words](./words.md): Words, word parts, and abbreviation systems
- [media](./media.md): Audio and image resources
- [user](./user.md): User account records
13 changes: 13 additions & 0 deletions doc/database/user.md
@@ -0,0 +1,13 @@
# User

## `dailp_user`

Metadata associated with a user. `dailp_user.id` is equal to the user's `sub` claim in
AWS Cognito. Users are not to be confused with `contributor` entries, which are imported
from Google Sheets.

| column | type | description |
| -------------- | ------ | -------------------------------------------------- |
| `id` | `uuid` | Primary key, AWS Cognito `sub` claim |
| `display_name` | `text` | How the user's name should be presented in the app |
| `created_at` | `date` | When the user record was created |
41 changes: 26 additions & 15 deletions doc/database/words.md
@@ -2,24 +2,35 @@

## `word`

| column | type | description |
| ------------------- | ------------------------ | --------------------------------------------------------------------------------------------------- |
| `id` | `uuid` | Primary key |
| `source_text` | `text` | Unambiguous transcription of the whole word |
| `simple_phonetics` | `text?` | Romanized phonetic spelling |
| `phonemic` | `text?` | Underlying phonemic representation, with more pronunciation details |
| `english_gloss` | `text?` | English translation |
| `recorded_at` | `date?` | When this word was written, only specified if it differs from when the document overall was written |
| `commentary` | `text?` | Linguistic or historical commentary supplied by an annotator |
| `audio_slice_id` | `uuid? -> media_slice` | Audio recording of the word read aloud |
| `document_id` | `uuid -> document` | Document the word is in |
| `page_number` | `text?` | Page number, only supplied for documents like dictionaries that may not have `document_page` rows |
| `index_in_document` | `bigint` | Position of the word in the whole document |
| `page_id` | `uuid? -> document_page` | Physical page containing this word |
| `character_range` | `int8range?` | Order of words in a paragraph is determined by character indices |
| column | type | description |
| ------------------------ | ------------------------ | --------------------------------------------------------------------------------------------------- |
| `id` | `uuid` | Primary key |
| `source_text` | `text` | Unambiguous transcription of the whole word |
| `simple_phonetics` | `text?` | Romanized phonetic spelling |
| `phonemic` | `text?` | Underlying phonemic representation, with more pronunciation details |
| `english_gloss` | `text?` | English translation |
| `recorded_at` | `date?` | When this word was written, only specified if it differs from when the document overall was written |
| `commentary` | `text?` | Linguistic or historical commentary supplied by an annotator |
| `audio_slice_id` | `uuid? -> media_slice` | Audio recording of the word read aloud, as ingested from Google Sheets. |
| `curated_audio_slice_id` | `uuid? -> media_slice` | A Contributor audio recording of the word read aloud, which has been selected by an Editor |
| `audio_curated_by` | `uuid? -> dailp_user` | The Editor who selected the Contributor audio recording to show, if one has been selected |
| `document_id` | `uuid -> document` | Document the word is in |
| `page_number` | `text?` | Page number, only supplied for documents like dictionaries that may not have `document_page` rows |
| `index_in_document` | `bigint` | Position of the word in the whole document |
| `page_id` | `uuid? -> document_page` | Physical page containing this word |
| `character_range` | `int8range?` | Order of words in a paragraph is determined by character indices |

- One of `page_id` or `character_range` must be supplied
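
The rule above could be enforced by the database itself. The following is a minimal sketch, assuming "one of" means "at least one" and assuming a `check` constraint is an acceptable mechanism; the source does not say how (or whether) the real schema enforces it. SQLite (via Python's built-in `sqlite3`) stands in for PostgreSQL so the sketch runs without a server, and the `word` table is cut down to the relevant columns.

```python
import sqlite3

# SQLite stands in for PostgreSQL; UUIDs and int8range are stored as text here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table word (
      id text primary key,
      page_id text,
      character_range text,
      -- Assumption: at least one of the two locators must be present.
      check (page_id is not null or character_range is not null)
    )
    """
)

# A word located by page is accepted.
conn.execute("insert into word values ('w1', 'p1', null)")

# A word with neither locator violates the check constraint.
try:
    conn.execute("insert into word values ('w2', null, null)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

word_count = conn.execute("select count(*) from word").fetchone()[0]
```

The same `check (...)` expression is valid PostgreSQL, so it could be carried into a migration if the team wants the rule enforced at the schema level.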

## `word_user_media`

A join table linking user audio contributions to words in documents. This is a many-to-many relationship, so it should be indexed on both keys, with a compound unique constraint; that is, you cannot link the same audio to the same word multiple times. Additions should be written as upserts.

| column | type | description |
| ---------------- | --------------------- | ---------------------------------------- |
| `word_id` | `uuid -> word` | Word that is associated with the media slice. |
| `media_slice_id` | `uuid -> media_slice` | Media slice that is associated with the word. |
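
An upsert against this table might look like the sketch below. It is illustrative only: SQLite (via Python's built-in `sqlite3`) stands in for PostgreSQL, UUIDs are stored as text, the foreign keys are omitted, and `link_audio` is a hypothetical helper name, not something from the codebase.

```python
import sqlite3
import uuid

# Cut-down copy of the join table; the compound primary key provides uniqueness.
SCHEMA = """
create table word_user_media (
  word_id text not null,
  media_slice_id text not null,
  primary key (word_id, media_slice_id)
);
"""

# The conflict target matches the compound key, so re-linking a pair is a no-op.
UPSERT = """
insert into word_user_media (word_id, media_slice_id)
values (?, ?)
on conflict (word_id, media_slice_id) do nothing
"""


def link_audio(conn, word_id, media_slice_id):
    """Hypothetical helper: attach a media slice to a word idempotently."""
    conn.execute(UPSERT, (word_id, media_slice_id))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

word_id, slice_id = str(uuid.uuid4()), str(uuid.uuid4())
link_audio(conn, word_id, slice_id)
link_audio(conn, word_id, slice_id)  # repeated link neither raises nor duplicates

link_count = conn.execute("select count(*) from word_user_media").fetchone()[0]
```

The `on conflict (...) do nothing` clause has the same shape in PostgreSQL; only the parameter placeholders would differ (`$1, $2` rather than `?`).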

## `word_segment`

A part of a word, also known as a morpheme within a morphemic segmentation.
27 changes: 27 additions & 0 deletions types/migrations/20230504182127_add_user_audio.sql
@@ -0,0 +1,27 @@
-- Add the dailp_user table, contributor audio attribution, and curated audio columns

create table dailp_user (
  id autouuid primary key,
  display_name text not null,
  created_at date not null
);

alter table media_resource
  add column recorded_by uuid,
  add constraint recorded_by_fkey foreign key (recorded_by) references dailp_user (id) on delete set null;


alter table word
  add column curated_audio_slice_id uuid,
  add constraint curated_audio_slice_id_fkey
    foreign key (curated_audio_slice_id) references media_slice (id) on delete set null,
  add column audio_curated_by uuid,
  add constraint audio_curated_by_fkey
    foreign key (audio_curated_by) references dailp_user (id) on delete set null;


create table word_user_media (
  word_id uuid not null references word (id) on delete cascade,
  media_slice_id uuid not null references media_slice (id) on delete cascade,
  primary key (word_id, media_slice_id)
);
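
The `on delete set null` clauses in this migration mean that deleting a `dailp_user` row should leave any referencing `word` row in place and only clear its `audio_curated_by` column. The sketch below illustrates that behavior under stated assumptions: SQLite (via Python's built-in `sqlite3`) stands in for PostgreSQL, the tables are reduced to the columns involved, and the ids and display name are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("pragma foreign_keys = on")

conn.executescript(
    """
    create table dailp_user (
      id text primary key,
      display_name text not null,
      created_at date not null
    );

    create table word (
      id text primary key,
      audio_curated_by text references dailp_user (id) on delete set null
    );

    -- Sample data: one Editor who curated audio for one word.
    insert into dailp_user values ('u1', 'Example Editor', '2023-05-18');
    insert into word values ('w1', 'u1');

    -- Removing the user must not remove the word.
    delete from dailp_user where id = 'u1';
    """
)

remaining_words = conn.execute("select id, audio_curated_by from word").fetchall()
```

By contrast, the `on delete cascade` clauses on `word_user_media` remove the link rows themselves when either the word or the media slice is deleted.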
