Reformatting of Gold Data and process.pys #56

Closed
6 of 7 tasks
jarumihooi opened this issue Oct 7, 2023 · 0 comments · Fixed by #77

Comments

Collaborator

jarumihooi commented Oct 7, 2023

Because

There is now a new template and a new set of conventions for the whole dataset repository, so the existing gold data no longer matches the conventions that were requested to make the data easier to understand.
As a result, all 5 existing projects may need to be redone by:

  1. Redoing the process.py.
  2. Regenerating the gold datasets.
  3. Informing the affected downstream stakeholders (e.g. MMIF? tool evaluators? for tool ingestion).

When redoing, these are the two major conventions to conform to:
A. Time format - times should be displayed/stored in ISO time format. This can be done in a few ways: the value can be saved directly as hh:mm:ss.mmm, or saved as two integers (seconds and milliseconds) that are converted back into the more readable hh:mm:ss.mmm.
B. Column headers/fields - the gold data should use conventional names, such as start and end instead of start_time and end_time. The chyrons gold data is an example where this is needed. Other column fields should be investigated for similar cases.
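As a rough illustration of conventions A and B, here is a minimal Python sketch of the kind of helpers a process.py might use. Everything in it is an assumption for illustration: the function names, the `HEADER_RENAMES` mapping, and the idea that the gold data is CSV are hypothetical, and the actual process.py scripts may be organized quite differently.

```python
import csv
import io

# Hypothetical mapping from old header names to the conventional ones.
HEADER_RENAMES = {"start_time": "start", "end_time": "end"}


def ms_to_timestamp(total_ms: int) -> str:
    """Format a non-negative millisecond count as hh:mm:ss.mmm."""
    if total_ms < 0:
        raise ValueError("time must be non-negative")
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def seconds_ms_to_timestamp(seconds: int, ms: int) -> str:
    """Format the 'two integers' storage option as hh:mm:ss.mmm."""
    return ms_to_timestamp(seconds * 1000 + ms)


def timestamp_to_ms(stamp: str) -> int:
    """Parse hh:mm:ss.mmm (fraction optional) back into milliseconds."""
    hms, _, frac = stamp.partition(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    # Pad or truncate the fraction to exactly three digits.
    ms = int(frac.ljust(3, "0")[:3]) if frac else 0
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + ms


def rename_headers(csv_text: str) -> str:
    """Return the CSV text with headers renamed per HEADER_RENAMES."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = [HEADER_RENAMES.get(name, name) for name in reader.fieldnames]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({HEADER_RENAMES.get(k, k): v for k, v in row.items()})
    return out.getvalue()
```

For example, `ms_to_timestamp(3723045)` gives `01:02:03.045`, and `rename_headers` applied to a file with the header `start_time,end_time,text` produces `start,end,text` with the rows unchanged. Keeping the renames in a single mapping would make it easier to apply the same convention across all five projects.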

Done when

  • Investigate all existing golds datasets for similar fields to rename into one convention.

  • Slates - update process.py for time & field_names, regenerate golds data

  • Chyrons - as above

  • NamedEntity - as above

  • NamedEntityWiki - as above

  • Transcript - as above

  • Inform downstream tasks/teams where needed.

Additional context

No response
