Updating sr readme, adding raws, & batch info #70

Merged · 14 commits · Nov 18, 2023
13 changes: 6 additions & 7 deletions repository_level_conventions.md
@@ -7,7 +7,7 @@
> Some estimates of imprecision are given by Margin of Error.
> Directionality definitions help frame the boundaries meant by annotated times.
> The fields in the gold datasets should be standardized.
> Naming conventions - batches: `issue-number-identifiers.txt`
> Naming conventions - batches: `repoName-issueNumber(-identifier).txt`

## Data Formatting and Precision Conventions
### Time Point Notation
@@ -80,12 +80,11 @@ especially in cases of human perception.
The conventions for precision hold until the project requires new ones.

## File Naming Conventions
Batches should be named all in lower case in this format: `issue-number-identifiers.txt`.
Where the issue number is the repository (usually [AAPB Collaborations Repo Issues](https://github.com/clamsproject/aapb-collaboration)) that is the discussion/documentation
of how this batch was chosen and created. This includes an issue number.
Any other identifiers come after this, and can be used to denote different batches created from the same issue.
Because batches can be reused for disparate projects, identifiers should indicate some property about the GUIDs in that batch,
as opposed to what happens to that batch during a project.
Batches should be named all in lower case in this format: `repoName-issueNumber(-identifier).txt` (parentheses mark optional parts).
The `repoName-issueNumber` part points to a GitHub issue (usually on [AAPB Collaborations Repo Issues](https://github.com/clamsproject/aapb-collaboration)) that contains the discussion/documentation of how this batch was chosen and created.
Any other `identifier`s come after this and can be used to denote different batches created from the same issue. This allows a family of batches to stay together in the usual "listing" operations of file systems.
Because batches can be reused for disparate projects, an identifier can indicate some property of the GUIDs in that batch,
but should not indicate particulars of the annotation project in which the batch was used.
If no real discerning quality can be used as an identifier, use sequential letters (`a`, `b`, `c`, `d`, ...) to enumerate the batches.
Finally, the whole batch name should use lowercase letters and `-` dashes.
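
As a hypothetical illustration of the convention: a batch chosen and documented in issue 27 of the `aapb-collaboration` repo would be named `aapb-collaboration-27.txt`, and two batches drawn from that same issue could be distinguished as `aapb-collaboration-27-a.txt` and `aapb-collaboration-27-b.txt`.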

41 changes: 21 additions & 20 deletions scene-recognition/readme.md
@@ -1,24 +1,25 @@
# Scene Recognition

## Project Overview
This project is a new attempt at detecting "frames of interest" or "scenes with text" in general as an update to the previous efforts of seeking different frames out separately.
"Frames of Interest" tend to be frames from a video that contain textual information on screen that is useful for archiving purposes. This can include slates, chyrons, images of people/video subjects, and credits.

> The essential goal for scene recognition is to assign the semantic labels to the given images, these semantic labels are defined by human beings including different natural views, indoor scenes, outdoor environments and etc.
> -- [Scene recognition: A comprehensive survey](https://www.sciencedirect.com/science/article/pii/S003132032030011X)

This project is an attempt at developing a dataset for a new CLAMS app that detects "frames of interest", or performs "scene recognition" in general, as an update to the previous efforts of seeking different kinds of frames out separately.
"Frames of Interest" tend to be frames from a video that contain information (primarily in some overlaid textual form) on screen that is useful for archiving purposes. This can include slates, chyrons, credits, and images of people or other visual objects.

From the annotation side, the project is done by sampling videos at a certain rate (currently 1 frame every 2 seconds) to create a diverse set of frames as a collection of stills (hereafter called "image sets").
The frames are then annotated for whether or not they fit one of the interest categories.

Downstream of this project, results from the Scene Recognition detection can be used to stitch together time intervals of when an audiovisual phenomenon takes place in the video.
Conceptually, the annotation project simply annotates stills found at recurring intervals (but arbitrarily chosen) that do not themselves describe
the start and end times of a phenomena. A model trained with this information could label more fine-grained-ly when moments display a phenomenon,
and post-processing/data-smoothening can determine when the phenomena truly starts and ends via computer vision.
Conceptually, the annotation project simply annotates stills sampled at recurring (but arbitrarily chosen) intervals that do not themselves describe the start and end times of a _scene_. Additional post-processing software can stitch these still-level annotations into time-interval annotations, but manually annotating time intervals is out of scope for the project.
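
As a rough illustration of what such post-processing could do — a minimal sketch, not the project's actual stitching code — consecutive identically-labeled stills can be merged into intervals, assuming one label per still and the 2-second sampling step mentioned above:

```python
SAMPLE_STEP = 2.0  # assumed seconds between sampled stills

def stitch_intervals(frame_labels):
    """Merge consecutive identically-labeled stills into (start, end, label) spans."""
    intervals = []
    for i, label in enumerate(frame_labels):
        t = i * SAMPLE_STEP
        if intervals and intervals[-1][2] == label:
            intervals[-1][1] = t + SAMPLE_STEP  # extend the open interval
        else:
            intervals.append([t, t + SAMPLE_STEP, label])
    return [tuple(iv) for iv in intervals]

# e.g. stills labeled S(late), S, C(hyron), C, C, -(negative)
print(stitch_intervals(["S", "S", "C", "C", "C", "-"]))
# [(0.0, 4.0, 'S'), (4.0, 10.0, 'C'), (10.0, 12.0, '-')]
```

Real smoothing would also need to handle isolated mislabeled stills and gaps between samples, which this sketch ignores.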

### Specs
* Annotation Project Name - `scene-recognition`
* Annotator Demographics
* Number of annotators - 2
* Occupational description - College Student and Metadata Operation Specialist at GBH
* Age range - 20-40s
* Education - College & PhD
* Education - Higher education
* Annotation Environment Information
* Name - Keystroke Labeler
* Version - unknown
@@ -28,27 +29,26 @@ and post-processing/data-smoothening can determine when the phenomena truly starts
* Batch information: There are two batches used for the training and evaluation split during the first iteration (Scenes With Text): [`27-a`](231002-aapb-collaboration-27-a) was densely-seen/labeled (20 GUIDs), while [`27-b`](231002-aapb-collaboration-27-b) was sparsely-seen/labeled (21 GUIDs).
* The split is done at the video/image-set level so that similar images from the same video do not appear in both training and evaluation; a hypothetical sketch follows this list.
* Other version control information - none
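
A hypothetical sketch of such a video-level split (the function and GUID names are made up; this is not the project's actual code) — partitioning whole GUIDs first guarantees that all stills from one video land on the same side:

```python
import random

def split_by_video(guids, eval_fraction=0.5, seed=0):
    """Partition whole videos (GUIDs) into train/eval sets so that
    near-duplicate stills from one video never leak across the split."""
    guids = sorted(guids)               # deterministic base order
    random.Random(seed).shuffle(guids)
    cut = int(len(guids) * eval_fraction)
    return guids[cut:], guids[:cut]     # (train_guids, eval_guids)

# e.g. 41 videos, as in the 27-a (20 GUIDs) + 27-b (21 GUIDs) batches
train, evaluation = split_by_video([f"cpb-aacip-{i:03d}" for i in range(41)])
```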

## Tool Installation: Keystroke Labeler
[Keystroke Labeler](https://github.com/WGBH-MLA/keystrokelabeler) Annotation Tool is developed in collaboration with GBH by Owen King.
Documentation: Explanation of inner parts and fields in the labeler [here](https://github.com/WGBH-MLA/keystrokelabeler/blob/main/labeler_data_readme.md).
Please see the first link for installation and usage.
We use [Keystroke Labeler](https://github.com/WGBH-MLA/keystrokelabeler), an annotation tool developed by Owen King in collaboration with GBH, for this project.
Documentation of the tool, including an explanation of the labeler's inner parts and fields, can be found [here](https://github.com/WGBH-MLA/keystrokelabeler/blob/main/labeler_data_readme.md).
Please refer to the tool's source code repository for installation and usage instructions.

#### Tool Access
Currently CLAMS annotators are accessing the tool via a local-host instance built through Ivanti. Each instance is one GUID/video on its own, and changing the name of the saved file is not possible nor necessary.
### Tool Access
Currently CLAMS annotators access the tool via web app instances deployed on servers that the CLAMS team manages. Each instance serves one GUID/video on its own, and once annotation is done for a video, annotators must _export_ the annotation data into a csv or json file and upload it to a shared cloud storage space (Google Drive). This is because the tool doesn't support saving on the server; during the export process annotators must rename the file to match the video GUID.

## Annotation Guidelines
> [!Important]
> Please read this explanation of the types of frames first.
> [`Types of frames`](https://docs.google.com/document/d/1IyM_rCsCr_1XQ39j36WMX-XnVVBT4T_01j-M0eYqyDs/edit) serves as the guidelines for this project, along with the more specific instructions in this `readme.md`.
### Preparation
The tool must be downloaded or accessed via Ivanti.
Then still images must be extracted from chosen videos.
A sampling rate is recommended, e.g. 1 frame every 2 seconds.

### Preparation (Project manager)
The annotation project manager first needs to extract still images from the chosen videos, using the extraction script included in the tool's source code (so far all annotation has been done with images sampled at 1 frame every 2 seconds).
This sampling is intended to give some diversity to the frames extracted from the video.
The set of frames must then be loaded into the [tool](https://github.com/WGBH-MLA/keystrokelabeler/blob/main/labeler_data_readme.md).
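
For illustration only — the real extraction is done by the script shipped with the labeler's source code — a minimal OpenCV sketch of the same idea, sampling 1 frame every 2 seconds (paths and file naming here are assumptions), could look like:

```python
import cv2  # pip install opencv-python

def extract_stills(video_path, out_dir, every_sec=2.0):
    """Save one still every `every_sec` seconds of video."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if fps is unreadable
    step = max(1, round(fps * every_sec))    # frames between samples
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            cv2.imwrite(f"{out_dir}/frame_{idx:06d}.jpg", frame)
            saved += 1
        idx += 1
    cap.release()
    return saved
```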

### What to Annotate
### What to Annotate (Annotator)
The tool creates an annotation file with several columns for each frame.
For each frame, pick which category of Frame of Interest it fits, or none.
Then choose a subtype if needed.
@@ -75,8 +75,8 @@ The subtypes of slates (blue) are also important to annotate.
However, the non-important cases (grey) are various negative cases that are not frames of interest. These may resemble positive cases and are sometimes hard to distinguish from one another. Do the best possible, but move on if too much time is spent figuring out the distinctions.
Add the [modifier](https://docs.google.com/document/d/1IyM_rCsCr_1XQ39j36WMX-XnVVBT4T_01j-M0eYqyDs/edit#heading=h.xnfilznsrhpe) where needed, i.e. pick the most preferred, clearest `type label` and add "Shift" when making the key combo.

### How to Annotate It
The tool uses one or two key-combination presses to annotate the different kinds of frames. A key combination can be a single key, or could be a combo like "Shift P". Press the relevant one to annotate the `type`.
### How to Annotate It (Annotators)
The tool uses one or two key-combination presses to annotate the different kinds of frames. A key combination can be a single key or a combo like "Shift + P". Press the relevant one to annotate the `type`.
To add a `subtype`, you will need to enter editor mode; use the "Esc" key to do that.
In editor mode, you will be able to use the up and down arrows to move between the `type` and `subtype` fields.
Press the key combo needed to annotate the main `type`, then press down to move to `subtype` and press another key combo for the relevant choice. Move on with "Enter"/"Return".
@@ -104,6 +104,7 @@ Sections with all the same label were also skimmed unless something caught the eye
No `.csv` files were edited to reflect corrections/checker decisions.

Results:

* `cpb-aacip-516-8c9r20sq57.csv` #1
* There are many frames where Shift should have been used. These are not counted, but suspect about 12/920.
* Important errors: 3/920 (all classified as positive but should be negative: false positives).