Annotation of spelling correction for CLIN28 Shared Task

Metadata

  • Status: Completed
  • Type: Specific
  • Work Package: WP3
  • Research Coordinators: Merijn Beeksma
  • Coordinators for CLARIAH: Maarten van Gompel
  • Participating Institutes: Radboud University, Nijmegen
  • End-users: The CLIN28 shared-task organisers
  • Developers: The CLIN28 shared-task organisers
  • Interest Groups: Text
  • Task IDs: T062 (FLAT), T108 (FoLiA)

Description

A gold-standard corpus of spelling errors and their corrections had to be established for the CLIN28 Shared Task (2018).

What is the research about?

The shared task set out to assess the efficacy of the spelling correction systems submitted by its participants. An annotation environment was needed so that annotators could establish the gold standard to evaluate those systems against.

What problem is hindering the research?

(What is currently lacking that inhibits this research?)

What is needed to do the research?

Data

Data was extracted from Wikipedia and stored in the FoLiA format.

Tools

We needed an annotation environment that supports spelling correction in many forms, including complexities such as run-on errors, split errors, missing words and redundant words. FLAT was chosen as the solution, because both it and the underlying FoLiA format have extensive spelling correction features.
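
To illustrate why these error types need more than word-for-word replacement, here is a minimal Python sketch. It is not the FLAT or FoLiA API: the `Correction` record, the `classify` function, the labels and the example tokens are all hypothetical and invented for this illustration. The sketch assumes a correction can be modelled as a span of original tokens mapped to a span of corrected tokens, so the four complex cases differ only in how many tokens sit on each side.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Correction:
    """Hypothetical record: a span of original tokens and its corrected form."""
    original: List[str]
    new: List[str]


def classify(c: Correction) -> str:
    """Name the error type from the token counts alone (illustrative labels)."""
    n_orig, n_new = len(c.original), len(c.new)
    if n_orig == 0 and n_new == 0:
        raise ValueError("a correction needs at least one token on either side")
    if n_orig == 0:
        return "missing word"    # a word had to be inserted
    if n_new == 0:
        return "redundant word"  # a word had to be deleted
    if n_orig == 1 and n_new > 1:
        return "run-on error"    # several words wrongly written as one
    if n_orig > 1 and n_new == 1:
        return "split error"     # one word wrongly written as several
    return "replacement"         # any other substitution


# Invented examples, one per error type named above.
examples = [
    Correction(["inthe"], ["in", "the"]),
    Correction(["in", "to"], ["into"]),
    Correction([], ["the"]),
    Correction(["the"], []),
    Correction(["teh"], ["the"]),
]

for ex in examples:
    print(ex.original, "->", ex.new, ":", classify(ex))
```

Under this assumption, an annotation environment has to let a single correction cover zero, one or several tokens on either side.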

What software and services are involved?

References

References to related resources and publications and especially links to related use-cases: