An R wrapper to the spaCy "industrial strength natural language processing" Python library from https://spacy.io.
## Installing the package
1. Install the **spacyr** R package:
* From CRAN:
```{r, eval = FALSE}
install.packages("spacyr")
```
* From GitHub:
To install the latest development version from source, run:
```{r, eval = FALSE}
remotes::install_github("quanteda/spacyr")
```
2. Install spaCy and requirements
Simply run:
```{r, eval = FALSE}
library(spacyr)
spacy_install()
```
If you want to install a specific version, simply add it to the install command:
```{r, eval = FALSE}
library(spacyr)
spacy_install(version = "apple")
```
Check the version tool at <https://spacy.io/usage> to see what is available.
3. (optional) Add more language models
If left unchanged, `spacy_install()` adds the default "en_core_web_sm" model. You can add more language models with `spacy_download_langmodel()`. For instance, to install a small and efficient German language model:
```{r, eval = FALSE}
spacy_download_langmodel("de_core_news_sm")
```
Check out available models at <https://spacy.io/usage/models>.
If you run into any problems, you can try the manual installation path described below.
### Manual installation and troubleshooting
`spacy_install()` performs a number of tasks to set up a virtual environment in which spaCy is installed.
Virtual environments are the recommended way to install Python applications: Python has no central check for dependency conflicts (a role that CRAN plays in the `R` world), so conflicts between packages are a lot more common.
Hence each Python package and its dependencies are usually installed in their own folder.
Usually, none of this should concern you.
However, experience shows that some systems run into installation problems that are hard for developers to foresee.
Below, we therefore explain how you can perform the steps in `spacy_install()` manually, to debug any problems that might occur.
Please only file a GitHub issue after you have tried to manually run through the steps, so we can provide you with more targeted help.
1. Install Python
You can use your own installation of Python for the steps below.
By default, `spacy_install()` downloads and installs a minimal Python version in the default directory used by the `reticulate` package for simplicity.
This can be done with a single command:
```{r eval=FALSE}
python_exe <- reticulate::install_python()
```
The function returns the path to the Python executable file.
You can run this again at any time to get that path (the installation is skipped if the files are already present).
If you prefer to use a specific version of Python, you can use this function to install it and it will be picked up by `spacyr`.
2. Set up a virtual environment
By default, `spacyr` uses an environment called "r-spacyr", which is located in a directory managed by `reticulate`.
We can create it with:
```{r eval=FALSE}
reticulate::virtualenv_create("r-spacyr", python = python_exe)
```
If this causes trouble for some reason, you can install the environment in any location that is convenient for you like so:
```{r eval=FALSE}
reticulate::virtualenv_create("path/to/directory", python = python_exe)
```
Note that `spacyr` does not know that this environment exists unless you tell it through the environment variable `SPACY_PYTHON`.
You can do that either in each session with:
```{r eval=FALSE}
Sys.setenv(SPACY_PYTHON = "path/to/directory")
```
or you can put it into your `.Renviron` file to make the change permanent.
This little helper function opens the file for editing:
```{r eval=FALSE}
usethis::edit_r_environ(scope = "user")
```
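As a rough sketch of how an `.Renviron` entry takes effect, the snippet below writes the variable to a temporary file and loads it with base R's `readRenviron()`. The temporary file is only a stand-in for your real `.Renviron` (an assumption made so the example does not touch your actual configuration), and `"path/to/directory"` is the same placeholder used above.

```{r eval=FALSE}
# A temporary file stands in for ~/.Renviron in this sketch
renviron <- tempfile("Renviron")

# Entries use NAME=value syntax, one per line
writeLines("SPACY_PYTHON=path/to/directory", renviron)

# R reads the real .Renviron at startup; here we load the file by hand
readRenviron(renviron)

# The variable is now visible to the session
Sys.getenv("SPACY_PYTHON")
```

In your real `.Renviron`, only the `SPACY_PYTHON=...` line is needed; R picks it up the next time a session starts.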
We also need to tell `reticulate` that it should use this environment from now on.
```{r eval=FALSE}
reticulate::use_virtualenv(Sys.getenv("SPACY_PYTHON", unset = "r-spacyr"))
```
We use `Sys.getenv("SPACY_PYTHON", unset = "r-spacyr")` to check if `SPACY_PYTHON` is set and use the default otherwise.
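To illustrate the fallback, here is a minimal sketch using only base R. The helper `resolve_spacy_env()` is hypothetical (it is not part of `spacyr`); it simply wraps the `Sys.getenv()` call shown above.

```{r eval=FALSE}
# Hypothetical helper, not part of spacyr: returns the environment to use
resolve_spacy_env <- function(default = "r-spacyr") {
  Sys.getenv("SPACY_PYTHON", unset = default)
}

# Remember the current setting so it can be restored afterwards
old <- Sys.getenv("SPACY_PYTHON", unset = NA)

Sys.unsetenv("SPACY_PYTHON")
resolve_spacy_env()  # variable not set: falls back to "r-spacyr"

Sys.setenv(SPACY_PYTHON = "path/to/directory")
resolve_spacy_env()  # variable set: returns the custom location

# Restore the original state of the variable
if (is.na(old)) Sys.unsetenv("SPACY_PYTHON") else Sys.setenv(SPACY_PYTHON = old)
```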
3. Install spaCy
Installing `spaCy` and its dependencies is again done through `reticulate`.
We check again if `SPACY_PYTHON` is set, in case you chose a non-default folder.
```{r eval=FALSE}
reticulate::py_install("spacy", envname = Sys.getenv("SPACY_PYTHON", unset = "r-spacyr"))
```
4. Install spaCy language models
The language models are installed in the same way.
```{r eval=FALSE}
reticulate::py_install("en_core_web_sm", envname = Sys.getenv("SPACY_PYTHON", unset = "r-spacyr"))
```
If any of those steps fail, please file an [issue](https://github.com/quanteda/spacyr/issues) (after checking if one already exists for your error).
You can also use the individual commands to customise your setup.
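Once the steps above have run, a quick smoke test can help narrow down where a problem lies. The following is only a sketch: `check_spacyr_setup()` is a hypothetical helper (not part of `spacyr`) that tries to initialise spaCy, parse one sentence, and shut down again, returning a short status message instead of stopping on an error. It assumes the "en_core_web_sm" model from step 4.

```{r eval=FALSE}
# Hypothetical helper, not part of spacyr: reports whether the setup works
check_spacyr_setup <- function(model = "en_core_web_sm") {
  if (!requireNamespace("spacyr", quietly = TRUE)) {
    return("spacyr is not installed")
  }
  tryCatch({
    # Start spaCy with the requested language model
    spacyr::spacy_initialize(model = model)
    # Parse a short sentence; the result is a data.frame of tokens
    parsed <- spacyr::spacy_parse("spaCy is installed.")
    # Shut down the Python process again
    spacyr::spacy_finalize()
    sprintf("OK: parsed %d tokens", nrow(parsed))
  }, error = function(e) {
    # Return the error text so it can be quoted in an issue report
    paste("Setup problem:", conditionMessage(e))
  })
}

check_spacyr_setup()
```

If the message starts with "Setup problem:", including that text in your issue report should make it easier to see which step failed.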