
New another language to add or change. #28

Open
GLWmax opened this issue Mar 19, 2024 · 7 comments

Comments

@GLWmax

GLWmax commented Mar 19, 2024

Error:
download_lexicon.py --name 'vgt' --directory ./vgt
usage: download_lexicon.py [-h] --name {signsuisse} --directory DIRECTORY
download_lexicon.py: error: argument --name: invalid choice: 'vgt' (choose from 'signsuisse')

I have added 'vgt' to the language models:


from .types import Gloss
from .common import load_spacy_model

LANGUAGE_MODELS_SPACY = {
    "de": "de_core_news_lg",
    "fr": "fr_core_news_lg",
    "vgt": "vgt_core_news_lg",
    "en": "en_core_web_lg",
}


def text_to_gloss(text: str, language: str, ignore_punctuation: bool = False) -> Gloss:
    if language not in LANGUAGE_MODELS_SPACY:
        raise NotImplementedError("Don't know language '%s'." % language)

    model_name = LANGUAGE_MODELS_SPACY[language]

    # disable unnecessary components to make lemmatization faster
    spacy_model = load_spacy_model(model_name, disable=("parser", "ner"))

    doc = spacy_model(text)

    glosses = []  # type: Gloss

    for token in doc:
        if ignore_punctuation is True:
            if token.is_punct:
                continue

        gloss = (token.text, token.lemma_)
        glosses.append(gloss)

    return glosses

@AmitMY
Collaborator

AmitMY commented Mar 21, 2024

I see that you are trying to use this repository with VGT.

The download_lexicon script does not support any VGT dataset, so to support Flemish Sign Language (VGT) you would have to go through the following process:

  1. Collect a lexicon (download videos from https://vlaamsegebarentaal.be/signbank/signs/show_all/ or collect your own).
  2. Extract poses using this library and the command video_to_pose --format mediapipe -i example.mp4 -o example.pose
  3. Construct a lexicon CSV file that maps the words to their pose files, for example https://github.com/sign-language-processing/spoken-to-signed-translation/blob/main/assets/dummy_lexicon/index.csv:
path,spoken_language,signed_language,start,end,words,glosses,priority
sgg/kleine.pose,de,sgg,0,0,kleine,Kleine,0
sgg/kinder.pose,de,sgg,0,0,kinder,Kinder,0
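
For step 3, a minimal sketch of how such an index.csv could be generated from a directory of pose files. This is not part of the repository: build_index is a hypothetical helper, and it assumes one .pose file per word, with the file name (without extension) used as the gloss and its lowercase form as the word. The start, end, and priority values of 0 are simply copied from the example rows above.

```python
import csv
from pathlib import Path

# Column order taken from the example index.csv header above.
FIELDS = ["path", "spoken_language", "signed_language",
          "start", "end", "words", "glosses", "priority"]


def build_index(lexicon_dir, spoken_language, signed_language):
    """Write <lexicon_dir>/index.csv with one row per .pose file (hypothetical helper)."""
    lexicon_dir = Path(lexicon_dir)
    rows = []
    for pose_file in sorted(lexicon_dir.rglob("*.pose")):
        name = pose_file.stem  # assumption: the file name is the word itself
        rows.append({
            "path": pose_file.relative_to(lexicon_dir).as_posix(),
            "spoken_language": spoken_language,
            "signed_language": signed_language,
            "start": 0,      # copied from the example rows
            "end": 0,        # copied from the example rows
            "words": name.lower(),
            "glosses": name,
            "priority": 0,   # copied from the example rows
        })

    with open(lexicon_dir / "index.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling build_index("lexicon", "nl", "vgt") on a directory that contains vgt/hallo.pose would then produce a row starting with vgt/hallo.pose,nl,vgt. If your file names do not match the words, the words and glosses columns need to be filled in by hand.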

Once you have this index.csv in a directory called, say, lexicon, you can run, for example:

text_to_gloss_to_pose \
  --text "Hallo mijn naam is john." \
  --glosser "simple" \
  --lexicon "lexicon" \
  --spoken-language "nl" \
  --signed-language "vgt" \
  --pose "quick_test.pose"
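
Before running the command above, it can help to check which words of the sentence actually have an entry in the lexicon. The following is only a rough sketch under stated assumptions: missing_words is a hypothetical helper, it splits the text on whitespace and strips punctuation, and it compares against the words column directly. The real glosser may tokenize or lemmatize differently, so treat the result as a hint rather than a guarantee.

```python
import csv
import string


def missing_words(index_csv, text, spoken_language, signed_language):
    """Return the words of `text` with no row in the lexicon for this language pair."""
    known = set()
    with open(index_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            same_pair = (row["spoken_language"] == spoken_language
                         and row["signed_language"] == signed_language)
            if same_pair:
                known.add(row["words"].lower())

    # Naive tokenization (assumption): whitespace split, punctuation stripped.
    tokens = [t.strip(string.punctuation).lower() for t in text.split()]
    return [t for t in tokens if t and t not in known]
```

For example, missing_words("lexicon/index.csv", "Hallo mijn naam is john.", "nl", "vgt") lists the words that still need a pose file and a row in index.csv.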

@KhayitboevElbekjon

Hello, I have one problem. Please take a look:

download_lexicon
--name
--directory <path_to_directory>

What should I put in "name" and "directory" in this code?

@KhayitboevElbekjon

Which file should I run to use this program?

@AmitMY
Collaborator

AmitMY commented Apr 9, 2024

> Hello, I have one problem. Please take a look:
>
> download_lexicon --name --directory <path_to_directory>
>
> What should I put in "name" and "directory" in this code?

The only dataset available in this repository is signsuisse.
If you have a further question that is not related to the issue at hand, please open a separate issue.
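
The "invalid choice" error reported at the top of this thread is what Python's argparse prints when a value is not in an argument's choices list. The sketch below is not the repository's actual download_lexicon code; build_parser and its datasets parameter are hypothetical, and it only illustrates why --name accepts signsuisse and nothing else until another dataset is implemented and registered.

```python
import argparse


def build_parser(datasets=("signsuisse",)):
    """Build a parser that mimics the usage line shown in the error (illustrative only)."""
    parser = argparse.ArgumentParser(prog="download_lexicon")
    # `choices` makes argparse reject any dataset name that is not listed.
    parser.add_argument("--name", choices=list(datasets), required=True)
    parser.add_argument("--directory", required=True)
    return parser


# Accepted: signsuisse is in the list of choices.
args = build_parser().parse_args(["--name", "signsuisse", "--directory", "./lexicon"])
```

Passing --name vgt to this parser makes argparse print an "invalid choice" message and exit, which matches the error in the first comment.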

@cleong110

https://www.corpusvgt.be/ might work

@KhayitboevElbekjon

KhayitboevElbekjon commented Jun 12, 2024 via email
