NOTES.md

In this derived version of the original Intercontinental Dictionary Series (IDS), all entries have been roughly mapped to the International Phonetic Alphabet, in the version provided by the Cross-Linguistic Transcription Systems initiative (https://clts.clld.org). The conversions should be largely consistent, but many erroneous conversions may well remain, in cases where experts on the respective language varieties would select different transcription values. We emphasize that this is only a first step towards a standardization of the IDS data, and that additional work can and should be carried out by experts in the future.

After releasing a first version, we found several errors in languages for which the original data confuses orthographies with phonetic or phonemic transcriptions. The updated version tries to take this into account by removing the affected languages and by planning, step by step, to provide targeted segmentations for particular languages in the dataset. A recent example of such a targeted segmentation is the segmented version of the Panoan languages in the dataset keypano, which contains targeted transcriptions for two dozen languages. These languages have also been excluded from this version, since we consider the segmented transcriptions provided there to be the better solution. This also means that the current dataset will hopefully shrink in the future, whenever we manage to provide targeted segmentations for more languages in the sample.