update docs and add french translation
tamiyoshiroya committed Oct 4, 2023
1 parent 6e061e4 commit e819d5a
Showing 68 changed files with 933 additions and 422 deletions.
428 changes: 13 additions & 415 deletions README.md

Large diffs are not rendered by default.

5 changes: 5 additions & 0 deletions _navbar.md
@@ -0,0 +1,5 @@
<!-- Nav bar-->

* Translation
* 🇫🇷 [French](/fr/)
* 🇺🇸 English
14 changes: 13 additions & 1 deletion _sidebar.md
@@ -1,2 +1,14 @@
* [Documentation](/)
* [**Getting Started**](home.md)
* **Configuration**
* * [Project configuration](projectConfig.md)
* * [Collaborating](collaborating.md)
* **Treebank graphic annotation**
* * [Annotation](annotation.md)
* * [Advanced Annotation Options](advancedAnnotation.md)
* [**Github Synchronization**](githubSync.md)
* [**Parser**](parser.md)
* [**Blind Annotation**](blindAnnotation.md)




90 changes: 90 additions & 0 deletions advancedAnnotation.md
@@ -0,0 +1,90 @@
## Tree types

In ArboratorGrew, trees are organized into distinct categories, including:

- **"Your Trees"**: The trees you have worked on.
- **"The Most Recent Trees"**: For each sentence, the most recently annotated tree.
- **"Your Recent Trees Filled Up with the Most Recent Trees"**: A combination of your recent trees and the most recent trees added.
- **"All Trees"**: An inclusive collection of all available trees.
- **"Validated Trees"**: Trees that have been reviewed and approved by a validator.
- **"Pending Trees"**: Trees associated with sentences that do not have validated trees.

These tree types will be used in the following features.
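
The categories above can be read as simple selections over the set of saved trees. The following is a minimal sketch of that reading, not ArboratorGrew's actual implementation: the `Tree` record, its `saved_at` and `validated` fields, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tree:
    # Hypothetical record: one saved tree for one sentence and one user.
    sent_id: str
    user: str
    saved_at: int          # larger value = saved more recently
    validated: bool = False


def most_recent(trees):
    """'The Most Recent Trees': for each sentence, the latest saved tree."""
    latest = {}
    for tree in trees:
        current = latest.get(tree.sent_id)
        if current is None or tree.saved_at > current.saved_at:
            latest[tree.sent_id] = tree
    return latest


def your_trees(trees, me):
    """'Your Trees': the trees saved under your own name."""
    return [t for t in trees if t.user == me]


def yours_filled_up(trees, me):
    """Your own latest tree where you have one, else the most recent tree."""
    filled = most_recent(trees)
    filled.update(most_recent(your_trees(trees, me)))
    return filled


def pending(trees):
    """'Pending Trees': trees of sentences that have no validated tree."""
    validated_ids = {t.sent_id for t in trees if t.validated}
    return [t for t in trees if t.sent_id not in validated_ids]


TREES = [
    Tree("s1", "alice", saved_at=1),
    Tree("s1", "bob", saved_at=2),
    Tree("s2", "bob", saved_at=3),
    Tree("s2", "carol", saved_at=4, validated=True),
]
```

With this sample data, `yours_filled_up(TREES, "alice")` keeps Alice's tree for `s1` and falls back to the latest tree for `s2`, which she has not annotated.
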

## Grew search

One of ArboratorGrew's most powerful features is its pattern search system, built on **[Grew](https://grew.fr/)**. You can search using a variety of criteria, including `POS` query, `Form` query, `Lemma` query, `Dependency Relation` query, and `Relation and Tags` query. Each query runs against the tree type you have selected.

<div style="text-align:center">
<img src="assets/images/grew-search.png" alt="drawing" width="900"/>
</div>

?> The nodes that match the pattern are then highlighted in the
trees on the results page.

<div style="text-align:center">
<img src="assets/images/Grew-result.png" alt="drawing" width="900"/>
</div>


!> To detect errors, you can filter these results with negative patterns (patterns that must not appear in the graph). Once a faulty tree has been found, it can be edited and saved directly.
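
To make the idea of a negative pattern concrete, here is a small Python sketch that finds verbs that have no subject. It only mimics the logic of such a search on a toy sentence; it does not call Grew, and the token dictionaries and the function name are assumptions made for this example. In Grew's own request syntax, a comparable query would be written roughly as `pattern { V [upos=VERB] } without { V -[nsubj]-> S }`.

```python
# Toy sentence: each token has an id, a form, a UPOS tag,
# the id of its governor (0 = root) and its dependency relation.
SENTENCE = [
    {"id": 1, "form": "Dogs", "upos": "NOUN", "head": 2, "deprel": "nsubj"},
    {"id": 2, "form": "bark", "upos": "VERB", "head": 0, "deprel": "root"},
    {"id": 3, "form": "and", "upos": "CCONJ", "head": 4, "deprel": "cc"},
    {"id": 4, "form": "run", "upos": "VERB", "head": 2, "deprel": "conj"},
]


def verbs_without_subject(tokens):
    """Return ids of VERB tokens that have no `nsubj` dependent."""
    matches = []
    for tok in tokens:
        # Positive part of the pattern: the node must be a verb.
        if tok["upos"] != "VERB":
            continue
        # Negative part: no token may depend on it with relation nsubj.
        has_subject = any(
            dep["head"] == tok["id"] and dep["deprel"] == "nsubj"
            for dep in tokens
        )
        if not has_subject:
            matches.append(tok["id"])
    return matches
```

On the toy sentence, only token 4 (`run`) matches, because token 2 (`bark`) already has `Dogs` as its subject.
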

## Grew Rewrite

With ArboratorGrew, you can modify and rewrite your trees using Grew rewriting rules (see **[Grew Rules](https://grew.fr/doc/rule/)**).

<div style="text-align:center">
<img src="assets/images/grew-writing-rule.png" alt="drawing" width="900"/>
</div>

Nodes that match the specified rule will be highlighted on the results page. To save these results, you can choose to either select individual results or opt for all results and then click the 'Apply Rules' button.

## Relation tables

ArboratorGrew also offers the capability to cluster the treebank according to one or multiple features. These features can be employed to construct a relation table that provides a comprehensive summary of all dependencies within a project, focusing on the dependency relation.

<div style="text-align:center">
<img src="assets/images/relation-table.png" alt="drawing" width="900"/>
</div>

!> A relation table is a good way to look for rare structures and potential errors inside a treebank. The user can directly access the trees that match the negative pattern and update them.
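
As a rough illustration of what such a table summarizes, the sketch below counts dependencies by relation, governor category, and dependent category, then lists the cells that occur only once. The token format and the function names are assumptions made for this example; ArboratorGrew builds its tables through Grew, not through this code.

```python
from collections import Counter

# Toy treebank of one sentence: "The dog chased the cat".
TOKENS = [
    {"id": 1, "upos": "DET", "head": 2, "deprel": "det"},
    {"id": 2, "upos": "NOUN", "head": 3, "deprel": "nsubj"},
    {"id": 3, "upos": "VERB", "head": 0, "deprel": "root"},
    {"id": 4, "upos": "DET", "head": 5, "deprel": "det"},
    {"id": 5, "upos": "NOUN", "head": 3, "deprel": "obj"},
]


def relation_table(tokens):
    """Count (relation, governor UPOS, dependent UPOS) triples."""
    by_id = {tok["id"]: tok for tok in tokens}
    table = Counter()
    for tok in tokens:
        if tok["head"] == 0:
            continue  # the root has no governor token
        governor = by_id[tok["head"]]
        table[(tok["deprel"], governor["upos"], tok["upos"])] += 1
    return table


def rare_cells(table, max_count=1):
    """Cells seen at most `max_count` times: candidates for review."""
    return sorted(cell for cell, n in table.items() if n <= max_count)
```

In a real treebank the rare cells are the interesting ones: a relation that appears once between two unexpected categories is often an annotation error.
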

## Lexicon

Lexicon is one of the advanced options available in **[Arborator-Grew](https://arboratorgrew.elizia.net/#)**. The user selects two lists of features:
- $L_1 = [f_1, …, f_m]$ as main features.
- $L_2 = [g_1, …, g_n]$ as auxiliary features.

<div style="text-align:center">
<img src="assets/images/select-lexicon-features.png" alt="drawing" width="900"/>
</div>

The output table lists the tuples of values of the main features $f_i$ for which there is more than one tuple of values of the auxiliary features $g_j$. The idea is to show only the $f$ values that are ambiguous with respect to the $g$ values.

#### Example

For $L_1 = [$`Form`, `Lemma`, `Upos`$]$ and $L_2 = [$`Gender`, `Number`$]$

<div style="text-align:center">
<img src="assets/images/13-lexicon-exemple.png" alt="drawing" width="900"/>
</div>

This shows the entries that have more than one pair of values for `Gender` and `Number` with the same (`Form`, `Lemma`, `Upos`) combination.


<div style="text-align:center">
<img src="assets/images/14-lexicon-ambiguous.png" alt="drawing" width="900"/>
</div>

After that, the user can correct them directly using the Grew rewrite option.
<br/><br/>
<div style="text-align:center">
<img src="assets/images/modify-lexicon.png" alt="drawing" width="900"/>
</div>


?> The second list $L_2$ is optional. If it is not given, all the entries of the lexicon are displayed.
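
The selection rule for the lexicon can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the tool's code: the token dictionaries, the `"_"` placeholder for a missing feature, and the function name `lexicon` are assumptions.

```python
from collections import defaultdict


def lexicon(tokens, main, aux=()):
    """Group tokens by their main-feature values.

    With auxiliary features, keep only the entries that have more than
    one distinct tuple of auxiliary values (the ambiguous entries).
    Without auxiliary features, keep every entry.
    """
    values = defaultdict(set)
    for tok in tokens:
        key = tuple(tok.get(f, "_") for f in main)
        values[key].add(tuple(tok.get(g, "_") for g in aux))
    keep_all = not aux
    return {
        key: sorted(vals)
        for key, vals in values.items()
        if keep_all or len(vals) > 1
    }


TOKENS = [
    {"Form": "fils", "Lemma": "fils", "Upos": "NOUN",
     "Gender": "Masc", "Number": "Sing"},
    {"Form": "fils", "Lemma": "fils", "Upos": "NOUN",
     "Gender": "Masc", "Number": "Plur"},
    {"Form": "chat", "Lemma": "chat", "Upos": "NOUN",
     "Gender": "Masc", "Number": "Sing"},
]
```

Calling `lexicon(TOKENS, ["Form", "Lemma", "Upos"], ["Gender", "Number"])` keeps only `fils`, which occurs with two different (`Gender`, `Number`) pairs. Omitting the second list returns both entries.
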
58 changes: 58 additions & 0 deletions annotation.md
@@ -0,0 +1,58 @@
## Treebank Graphic Annotation

To get started with treebank annotation, you have two options:
- Import a `CoNLL` file.
- Enter raw text and use the different options of the tokenizer.

<div style="text-align:center">
<img src="assets/images/upload-sample.png" alt="drawing" width="900"/>
</div>

Arborator turns the CoNLL data into graphical trees. In the annotation user interface:

?> To create a dependency relation between two tokens, drag the arc from one token to the other.

?> To create the root dependency, drag the arc upward from the token.

?> To change a token's features, click on the token; you can then set its `Universal Features`, `Miscellaneous Features`, and `Lemmas`.

?> To assign a category to a token, click on the underscore and select the appropriate category.

?> To delete an annotation, use the purple delete button, which removes any of the annotation parts mentioned above.
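
For readers who want to see what sits behind the graphical tree, the sketch below reads a small CoNLL-U block and shows that drawing a relation amounts to filling the `HEAD` and `DEPREL` columns of the dependent token. It assumes the standard ten-column CoNLL-U layout; the function names are hypothetical and this is not the code ArboratorGrew runs.

```python
# Two-token sentence in CoNLL-U; the second token is not yet attached.
CONLLU = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t_\t_\t_\t_\n"
)


def parse_conllu(block):
    """Read the token lines of one CoNLL-U sentence into dictionaries."""
    tokens = []
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],
            "upos": cols[3],
            "feats": cols[5],
            "head": int(cols[6]) if cols[6].isdigit() else None,
            "deprel": cols[7],
        })
    return tokens


def set_relation(tokens, dependent_id, head_id, deprel):
    """Return a copy of the tokens with one dependency (re)assigned.

    A head_id of 0 corresponds to the root dependency.
    """
    return [
        dict(tok, head=head_id, deprel=deprel)
        if tok["id"] == dependent_id else tok
        for tok in tokens
    ]
```

For example, `set_relation(parse_conllu(CONLLU), 2, 0, "root")` attaches `bark` as the root, which is the data-level counterpart of creating the root dependency in the interface.
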



<div style="text-align:center">
<video autoplay loop width="900">
<source src="assets/videos/1-Annotation.webm" type=video/webm>
</video>
</div>


## Annotation Functionalities

There is a toolbar located on the top left of the document area. From it the user can perform these actions:

<div style="text-align:center">
<img src="assets/images/edit-treebank.png" alt="drawing" width="400"/>
</div>

- Each time a change is made in the tree (e.g. a new annotation or relation is added), a yellow floppy-disk icon appears to indicate that there are changes to **save**. Click on the **Save button** to **save** the changes.

- You have the option to apply **tags** to your trees using the **tagging feature**. These tags serve as a valuable organizational tool, simplifying your annotation process. Additionally, they facilitate effective collaboration within your team by enabling you to communicate your progress. There are predefined tags available, and you also have the option to create your own tags.

<div style="text-align:center">
<img src="assets/images/tags.png" alt="drawing" width="400"/>
</div>

- You can activate the difference mode to highlight annotation differences between annotators in the tree structure. This feature can also be activated directly by right-clicking on the user whose tree you want to compare.

- You can also export the tree as a `PNG`, `SVG`, or `CoNLL` file.
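
The difference mode can be thought of as a token-by-token comparison of two annotators' trees. The sketch below is one simple way to compute such a comparison, assuming each tree is a mapping from token id to a `(head, relation)` pair; this representation and the function name are illustrative assumptions, not the tool's internal format.

```python
def tree_diff(tree_a, tree_b):
    """Return the ids of tokens whose head or relation differs."""
    differing = []
    for tok_id in sorted(tree_a.keys() | tree_b.keys()):
        # A token missing from one tree also counts as a difference.
        if tree_a.get(tok_id) != tree_b.get(tok_id):
            differing.append(tok_id)
    return differing


# Two annotations of the same three-token sentence.
ALICE = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "obj")}
BOB = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "obl")}
```

Here the two annotators agree on everything except the relation of token 3, so that is the only token a difference view would need to highlight.
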

#### Tokens editing options

- You can manipulate the tokens of a sentence by merging them, splitting them, or inserting a token at a particular position. To do this, select the token you want to modify in the sentence input displayed in the tree view; a menu with the available options appears.

<div style="text-align:center">
<img src="assets/images/edit-tokens.png" alt="drawing" width="200"/>
</div>
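
Editing tokens changes token numbering, so the dependencies that point at later tokens have to be renumbered. The sketch below shows one possible way to merge a token with its right neighbour while keeping the tree consistent. The policy used here (concatenate the forms, keep the first token's attachment unless it pointed at the token being absorbed) is an assumption for illustration; the source does not describe how ArboratorGrew resolves these cases.

```python
def merge_tokens(tokens, first_id):
    """Merge token `first_id` with the token right after it."""
    second_id = first_id + 1
    by_id = {tok["id"]: tok for tok in tokens}
    first, second = by_id[first_id], by_id[second_id]

    merged = dict(first, form=first["form"] + second["form"])
    if first["head"] == second_id:
        # The first token depended on the absorbed one:
        # inherit the absorbed token's attachment instead.
        merged["head"], merged["deprel"] = second["head"], second["deprel"]

    def remap(index):
        # Ids after the absorbed token shift down by one;
        # links to the absorbed token now point at the merged token.
        if index == second_id:
            return first_id
        return index - 1 if index > second_id else index

    result = []
    for tok in tokens:
        if tok["id"] == second_id:
            continue
        source = merged if tok["id"] == first_id else tok
        result.append(
            dict(source, id=remap(source["id"]), head=remap(source["head"]))
        )
    return result


TOKENS = [
    {"id": 1, "form": "ice", "head": 2, "deprel": "compound"},
    {"id": 2, "form": "cream", "head": 3, "deprel": "nsubj"},
    {"id": 3, "form": "melts", "head": 0, "deprel": "root"},
]
```

Merging tokens 1 and 2 yields a single token `icecream` that is the subject of `melts`, and `melts` becomes token 2.
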
Binary file added assets/images/1-github.png
Binary file added assets/images/10-github.png
Binary file removed assets/images/14-github-sync.png
Binary file not shown.
Binary file removed assets/images/15-github-sync.png
Binary file not shown.
Binary file removed assets/images/16-github-sync.png
Binary file not shown.
Binary file removed assets/images/17-github-sync.png
Binary file not shown.
Binary file removed assets/images/18-github-sync.png
Binary file not shown.
Binary file removed assets/images/19-github-sync.png
Binary file not shown.
Binary file removed assets/images/2-Project-creation.png
Binary file not shown.
Binary file added assets/images/2-github.png
Binary file removed assets/images/20-github-sync.png
Binary file not shown.
Binary file removed assets/images/21-github-sync.png
Binary file not shown.
Binary file removed assets/images/24-upload-sample.png
Binary file not shown.
Binary file removed assets/images/26-show-divergences.png
Binary file not shown.
Binary file removed assets/images/3-add-users.png
Binary file not shown.
Binary file added assets/images/3-github.png
Binary file removed assets/images/4-Functionnalities.png
Binary file not shown.
Binary file added assets/images/4-github.png
Binary file removed assets/images/5-Grew-search.png
Binary file not shown.
Binary file added assets/images/5-github.png
Binary file added assets/images/6-github.png
Binary file removed assets/images/7-Relation-table.png
Binary file not shown.
Binary file added assets/images/7-github.png
File renamed without changes
Binary file removed assets/images/8-project-configuration.png
Binary file not shown.
Binary file removed assets/images/9-Apply-Rule.png
Diff not rendered.
File renamed without changes
File renamed without changes
File renamed without changes
Binary file added assets/images/add-user.png
File renamed without changes
Binary file added assets/images/edit-treebank.png
Binary file added assets/images/grew-search.png
Binary file added assets/images/grew-writing-rule.png
File renamed without changes
File renamed without changes
File renamed without changes
Binary file added assets/images/parser.png
Binary file added assets/images/project-config.png
Binary file added assets/images/project-creation.png
Binary file added assets/images/relation-table.png
Binary file added assets/images/select-lexicon-features.png
File renamed without changes
Binary file added assets/images/tags.png
Binary file added assets/images/upload-sample.png
36 changes: 36 additions & 0 deletions blindAnnotation.md
@@ -0,0 +1,36 @@
## Blind annotation mode

Blind annotation is widely used in research: annotators independently label trees without access to the annotations made by other annotators. This practice can reduce bias and improve objectivity. Blind annotation can also be used to teach syntactic annotation to students.

This concept is implemented in ArboratorGrew through the Blind Annotation mode. To activate this feature, an administrator can configure it during project creation or within the project settings window.

## Blind annotation levels

In ArboratorGrew, the blind annotation mode offers multiple levels of configuration, allowing the admins to adjust the degree of blind annotation for each sample. The levels are described below:

| Blind annotation level | Property |
| ------------------- |-------------------------------------------------------------------------- |
| `1:validated_visible`| <div style="width:100%,">While editing, annotators can see the validated tree, differences are highlighted in red, and they can access the statistics.</div>|
| `2:local_feedback`|<div style="width:100%,">The validated tree is not visible, but differences are still highlighted and statistics are available.</div> |
|`3:global_feedback`|Annotators can only access the statistics.|
|`4:no_feedback`|<div style="width:100%,">No feedback is given to annotators; only the validator can see the trees and access the statistics.</div>|
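
The four levels form a scale in which each step removes one kind of feedback. The sketch below encodes the table as a small lookup, assuming the three kinds of feedback named in it; the enum and the dictionary keys are hypothetical names, not identifiers from ArboratorGrew.

```python
from enum import IntEnum


class BlindLevel(IntEnum):
    VALIDATED_VISIBLE = 1
    LOCAL_FEEDBACK = 2
    GLOBAL_FEEDBACK = 3
    NO_FEEDBACK = 4


def annotator_feedback(level):
    """Return what an annotator gets at a given blind annotation level."""
    level = BlindLevel(level)
    return {
        # Level 1 only: the validated tree itself is shown.
        "validated_tree": level <= BlindLevel.VALIDATED_VISIBLE,
        # Levels 1 and 2: differences with the validated tree are highlighted.
        "highlighted_differences": level <= BlindLevel.LOCAL_FEEDBACK,
        # Levels 1 to 3: statistics are available.
        "statistics": level <= BlindLevel.GLOBAL_FEEDBACK,
    }
```

For instance, `annotator_feedback(2)` reports highlighted differences and statistics but no validated tree, matching the `2:local_feedback` row.
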


## Tree types in the blind annotation mode

Within this mode, we have three distinct types of trees: the base tree, the user tree, and the validated tree.

**Base Tree**: The base tree serves as the starting point for the annotator's work. It provides the initial structure upon which the annotation process begins.

**User Tree**: This refers to the tree created and modified by the annotator during the annotation process. It represents the annotator's input and efforts.

**Validated Tree**: The validated tree signifies the version of the tree that has been approved by a validator.

Users have access to the same annotation options as previously mentioned. However, due to the limited visibility of trees for annotators, certain tree types may not be accessible to them.

For all the features where tree types need to be selected, the validator has access to all trees, including the most recent trees, validated trees, and pending trees. The annotator, however, has access only to their own trees.





31 changes: 31 additions & 0 deletions collaborating.md
@@ -0,0 +1,31 @@
## Collaborating in projects

Collaborative annotation is one of the most powerful features of Arborator-Grew. It lets multiple users share access to a project so they can work together.

When you create a project, you are the **owner**. In this case, you have total control over the treebank. You can manage its configuration, invite collaborators and view the trees of other users.


?> In addition to the owner, the following roles exist:

- `Administrator` has the same options as the **owner**, except for GitHub synchronization and project freezing. The **admin** can:
  - Modify the settings of the project.
  - Assign new members to the project.
  - Upload new samples and use the tokenizer.
  - Edit the sentences of a sample (for example, split or merge tokens).
  - Use the parser.
  - View the other users' trees.
  - Remove samples or users' trees.

- `Validator` takes the role of the linguist by correcting the treebank and choosing the validated tree among the annotators' trees.

- `Annotator` Each sample has a list of **annotators**. **Annotators** can browse and modify the treebank (modify in the sense that a modified tree is saved under their name).

- `Guest` This role is only available for private projects, where you choose which users can view the treebank. Guests can only view it.
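
The role descriptions above can be summarized as a permission lookup. The sketch below is only one way to model them: the action names are invented for this example, and the validator's and annotator's permission sets are an interpretation of the prose, not a list taken from ArboratorGrew.

```python
# Hypothetical action names derived from the administrator's list.
ADMIN_ACTIONS = {
    "edit_settings",
    "add_members",
    "upload_samples",
    "edit_sentences",
    "run_parser",
    "view_all_trees",
    "remove_samples_or_trees",
}

PERMISSIONS = {
    # The owner can do everything an admin can, plus two reserved actions.
    "owner": ADMIN_ACTIONS | {"github_sync", "freeze_project"},
    "admin": ADMIN_ACTIONS,
    # Assumed from the description: validators pick the validated tree.
    "validator": {"view_all_trees", "edit_own_trees", "set_validated_tree"},
    "annotator": {"browse_treebank", "edit_own_trees"},
    "guest": {"browse_treebank"},
}


def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())
```

Modeling the owner as the admin set plus extras keeps the two roles in step: any action later added for admins is available to the owner as well.
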

## Inviting Collaborators

You can invite collaborators directly to your project by accessing the project settings. You can search for the user in the user list, define the role and click on the "share" button.

<div style="text-align:center">
<img src="assets/images/add-user.png" alt="drawing" width="900"/>
</div>
19 changes: 19 additions & 0 deletions fr/README.md
@@ -0,0 +1,19 @@
## Welcome to the ArboratorGrew documentation

This is the official documentation for **[ArboratorGrew](https://arboratorgrew.elizia.net/#/)**

A **collaborative** annotation tool for **treebank** development.

Below you will find a brief summary of Arborator's main features and use cases. For more details and tutorials, use the navigation bar on the left.

Let's take a look at the documentation.


## Main features

Arborator-Grew combines the features of two pre-existing tools: `Arborator` and `Grew`.

**[Arborator](https://arborator.ilpga.fr/)** is a widely used online, graphical, and collaborative tool for annotating dependency treebanks.

**[Grew](https://grew.fr/)** is a graph querying and rewriting tool specialized in the structures needed in NLP, i.e. syntactic and semantic dependency trees and graphs. Grew also has an online version, **[Grew-match](http://match.grew.fr/)**, where all the treebanks of **[UD Universal Dependencies](https://universaldependencies.org/)** and **[SUD Surface Syntactic Universal Dependencies](https://surfacesyntacticud.github.io/)** can be queried.
5 changes: 5 additions & 0 deletions fr/_navbar.md
@@ -0,0 +1,5 @@
<!-- Nav bar-->

* Translation
* 🇫🇷 French
* [🇺🇸 English](/)
14 changes: 14 additions & 0 deletions fr/_sidebar.md
@@ -0,0 +1,14 @@
* [**Getting Started**](/fr/home.md)
* **Project Configuration**
* * [Project Configuration](/fr/projectConfig.md)
* * [Collaboration](/fr/collaborating.md)
* **Treebank Graphic Annotation**
* * [Annotation](/fr/annotation.md)
* * [Advanced Annotation Options](/fr/advancedAnnotation.md)
* [**Github Synchronization**](/fr/githubSync.md)
* [**Parser**](/fr/parser.md)
* [**Blind Annotation**](/fr/blindAnnotation.md)




84 changes: 84 additions & 0 deletions fr/advancedAnnotation.md
@@ -0,0 +1,84 @@
## Tree types

In ArboratorGrew, trees are organized into different categories, including:

- **"Your Trees"**: The trees you have worked on.
- **"The Most Recent Trees"**: For each sentence, the most recently annotated tree.
- **"Your Recent Trees Filled Up with the Most Recent Trees"**: A combination of your recent trees and the most recent trees added.
- **"All Trees"**: An inclusive collection of all available trees.
- **"Validated Trees"**: Trees that have been reviewed and approved by a validator.
- **"Pending Trees"**: Trees associated with sentences that do not have validated trees.

These tree types are used in the following features.

## Grew search

ArboratorGrew offers an exceptional set of features, notably its powerful pattern search system. With **[Grew](https://grew.fr/)**, ArboratorGrew lets you search using various criteria, including `POS` query, `Form` query, `Lemma` query, `Dependency Relation` query, and `Relation and Tags` query, all adapted to the specific tree type you have selected.

<div style="text-align:center">
<img src="assets/images/grew-search.png" alt="drawing" width="900"/>
</div>

?> The nodes that match the pattern are then highlighted in the trees on the results page.

<div style="text-align:center">
<img src="assets/images/Grew-result.png" alt="drawing" width="900"/>
</div>


?> To detect errors, you can filter these results with negative patterns (patterns that must not appear in the graph). Once a faulty tree has been found, it can be edited and saved directly.

## Grew Rewrite

With ArboratorGrew, you can modify and rewrite your trees using Grew rewriting rules (see **[Grew Rules](https://grew.fr/doc/rule/)**).

<div style="text-align:center">
<img src="assets/images/grew-writing-rule.png" alt="drawing" width="900"/>
</div>

Nodes that match the specified rule are highlighted on the results page. To save these results, you can either select individual results or select all results, then click the 'Apply Rules' button.

## Relation tables

ArboratorGrew also lets you cluster the treebank according to one or more features. These features can be used to build a relation table that provides a complete summary of all dependencies within a project, with a focus on the dependency relation.

<div style="text-align:center">
<img src="assets/images/relation-table.png" alt="drawing" width="900"/>
</div>

!> This is an excellent way to look for rare structures and potential errors inside a treebank. The user can directly access the trees that match the negative pattern and update them. This makes it much easier to detect and correct errors in the treebank.

## Lexicon

The lexicon is one of the advanced options available in **[Arborator-Grew](https://arboratorgrew.elizia.net/#)**. The user selects two lists of features:
- $L_1 = [f_1, …, f_m]$ as main features.
- $L_2 = [g_1, …, g_n]$ as auxiliary features.

<div style="text-align:center">
<img src="assets/images/select-lexicon-features.png" alt="drawing" width="900"/>
</div>

The output table lists the tuples of values of the main features $f_i$ for which there is more than one tuple of values of the auxiliary features $g_j$. The idea is to show only the $f$ values that are ambiguous with respect to the $g$ values.

#### Example

For $L_1 = [$`Form`, `Lemma`, `Upos`$]$ and $L_2 = [$`Gender`, `Number`$]$

<div style="text-align:center">
<img src="assets/images/13-lexicon-exemple.png" alt="drawing" width="900"/>
</div>

This shows the entries that have more than one pair of values for `Gender` and `Number` with the same (`Form`, `Lemma`, `Upos`) combination.


<div style="text-align:center">
<img src="assets/images/14-lexicon-ambiguous.png" alt="drawing" width="900"/>
</div>

After that, the user can correct them directly using the Grew rewrite option.
<br/><br/>
<div style="text-align:center">
<img src="assets/images/modify-lexicon.png" alt="drawing" width="900"/>
</div>


?> The second list $L_2$ is optional. If it is not given, all the entries of the lexicon are displayed.