Merge pull request #41 from hthetiot/update-libstemmer
update to libstemmer_c-2.2.0 from snowballstem.org
hthetiot authored Oct 25, 2023
2 parents dee3c19 + 7e06edf commit 3608e63
Showing 117 changed files with 24,937 additions and 18,217 deletions.
21 changes: 16 additions & 5 deletions README.md
@@ -15,13 +15,16 @@ Examples:
> "argued", "argues", "arguing", and "argus" reduce to the stem "argu" (illustrating the case where the stem
> is not itself a word or root) but "argument" and "arguments" reduce to the stem "argument".
-This library is using bindings to the [libstemmer](http://snowball.tartarus.org/download.html) C library.
+This library is using bindings to the [libstemmer](https://snowballstem.org/download.html) C library.
+It's support

More about Stemming:
- [Stemming wikipedia](http://en.wikipedia.org/wiki/Stemming)
- [Racinisation wikipedia](http://fr.wikipedia.org/wiki/Racinisation)

+More about Snowball libstemmer library:
+- [https://snowballstem.org](https://snowballstem.org)

## Install
```
npm install node-snowball
@@ -57,13 +60,17 @@ snowball.stemword(
### Supported language second argument:

* arabic
+* armenian
* basque
* catalan
+* catalan
* danish
* dutch
+* dutch
* english
* finnish
* french
+* french
* german
* greek
* hindi
@@ -75,12 +82,14 @@ snowball.stemword(
* nepali
* norwegian
* portuguese
-* spanish
-* swedish
* romanian
* russian
+* serbian
+* spanish
+* swedish
* tamil
* turkish
+* yiddish
* porter (not a language)

### Supported encoding third argument:
@@ -91,10 +100,12 @@ snowball.stemword(

## To compile, run

-Supported NodeJS versions: 10.x, 11.x, 12.x
+Supported NodeJS versions: 14.x, 16.x, 18.x

```
-npm build .
+npm npm run clean
+npm npm run configure
+npm npm run build
npm test
```
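
The README hunks above describe `snowball.stemword` as taking a language as its second argument and an encoding as its third. A minimal JavaScript sketch of a guard around that call follows. `VISIBLE_LANGUAGES`, `checkLanguage`, and `stemWith` are hypothetical names, not part of node-snowball. The set holds only the languages visible in this diff (the collapsed part of the list is left out), and the `'UTF_8'` string in the usage comment is an assumption based on the libstemmer file names.

```javascript
// Languages visible in the README hunks of this commit.
// The collapsed middle of the list is intentionally not filled in.
const VISIBLE_LANGUAGES = new Set([
  'arabic', 'armenian', 'basque', 'catalan', 'danish', 'dutch', 'english',
  'finnish', 'french', 'german', 'greek', 'hindi', 'nepali', 'norwegian',
  'portuguese', 'romanian', 'russian', 'serbian', 'spanish', 'swedish',
  'tamil', 'turkish', 'yiddish', 'porter',
]);

// Hypothetical helper: normalise a language name and reject unknown ones.
function checkLanguage(name) {
  const lang = String(name).trim().toLowerCase();
  if (!VISIBLE_LANGUAGES.has(lang)) {
    throw new RangeError(`language not in the visible list: ${lang}`);
  }
  return lang;
}

// Hypothetical wrapper: the stemmer function is injected, so the sketch
// runs without the native addon being built.
function stemWith(stemword, word, language, encoding) {
  return stemword(word, checkLanguage(language), encoding);
}

// Possible usage once the addon is installed (encoding string is an assumption):
// const snowball = require('node-snowball');
// stemWith(snowball.stemword, 'arguing', 'english', 'UTF_8');
```

Injecting the stemmer keeps the language check testable on its own, separate from the native build.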

6 changes: 4 additions & 2 deletions binding.gyp
@@ -19,7 +19,6 @@
"src/libstemmer/src_c/stem_ISO_8859_1_finnish.c",
"src/libstemmer/src_c/stem_ISO_8859_1_french.c",
"src/libstemmer/src_c/stem_ISO_8859_1_german.c",
-"src/libstemmer/src_c/stem_ISO_8859_1_hungarian.c",
"src/libstemmer/src_c/stem_ISO_8859_1_indonesian.c",
"src/libstemmer/src_c/stem_ISO_8859_1_irish.c",
"src/libstemmer/src_c/stem_ISO_8859_1_italian.c",
@@ -32,6 +31,7 @@
"src/libstemmer/src_c/stem_ISO_8859_2_romanian.c",
"src/libstemmer/src_c/stem_KOI8_R_russian.c",
"src/libstemmer/src_c/stem_UTF_8_arabic.c",
+"src/libstemmer/src_c/stem_UTF_8_armenian.c",
"src/libstemmer/src_c/stem_UTF_8_basque.c",
"src/libstemmer/src_c/stem_UTF_8_catalan.c",
"src/libstemmer/src_c/stem_UTF_8_danish.c",
@@ -53,10 +53,12 @@
"src/libstemmer/src_c/stem_UTF_8_portuguese.c",
"src/libstemmer/src_c/stem_UTF_8_romanian.c",
"src/libstemmer/src_c/stem_UTF_8_russian.c",
+"src/libstemmer/src_c/stem_UTF_8_serbian.c",
"src/libstemmer/src_c/stem_UTF_8_spanish.c",
"src/libstemmer/src_c/stem_UTF_8_swedish.c",
"src/libstemmer/src_c/stem_UTF_8_tamil.c",
-"src/libstemmer/src_c/stem_UTF_8_turkish.c"
+"src/libstemmer/src_c/stem_UTF_8_turkish.c",
+"src/libstemmer/src_c/stem_UTF_8_yiddish.c",
]
}
]
2 changes: 1 addition & 1 deletion package.json
@@ -23,7 +23,7 @@
"configure": "node-gyp configure",
"clean": "node-gyp clean",
"build": "node-gyp build",
-"release": "npm run configure && npm run build",
+"release": "npm run clean && npm run configure && npm run build",
"test": "jest"
},
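
The updated `release` script runs `clean` before `configure` and `build`, joined with `&&`, so a failing step stops the chain. A minimal JavaScript sketch of the same sequencing is shown here. `RELEASE_STEPS`, `npmRun`, and `runRelease` are hypothetical names for illustration, not part of this package.

```javascript
const { spawnSync } = require('child_process');

// Step order taken from the updated "release" script in package.json.
const RELEASE_STEPS = ['clean', 'configure', 'build'];

// Default runner: executes `npm run <step>` through the shell and returns
// its exit status (a missing status, e.g. a killed process, counts as failure).
function npmRun(step) {
  const result = spawnSync(`npm run ${step}`, { stdio: 'inherit', shell: true });
  return result.status === null ? 1 : result.status;
}

// Runs the steps in order and stops at the first non-zero status,
// mirroring how `&&` short-circuits in the shell.
function runRelease(run = npmRun) {
  const done = [];
  for (const step of RELEASE_STEPS) {
    if (run(step) !== 0) {
      return { ok: false, failed: step, done };
    }
    done.push(step);
  }
  return { ok: true, failed: null, done };
}
```

Passing a custom `run` callback allows the ordering to be checked without invoking npm.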
"devDependencies": {
2 changes: 2 additions & 0 deletions src/libstemmer/COPYING
@@ -1,5 +1,7 @@
Copyright (c) 2001, Dr Martin Porter
Copyright (c) 2004,2005, Richard Boulton
+Copyright (c) 2013, Yoshiki Shibukawa
+Copyright (c) 2006,2007,2009,2010,2011,2014-2019, Olly Betts
All rights reserved.

Redistribution and use in source and binary forms, with or without
8 changes: 7 additions & 1 deletion src/libstemmer/MANIFEST
@@ -1,4 +1,4 @@
-README
+README.rst
src_c/stem_ISO_8859_1_basque.c
src_c/stem_ISO_8859_1_basque.h
src_c/stem_ISO_8859_1_catalan.c
@@ -39,6 +39,8 @@ src_c/stem_KOI8_R_russian.c
src_c/stem_KOI8_R_russian.h
src_c/stem_UTF_8_arabic.c
src_c/stem_UTF_8_arabic.h
+src_c/stem_UTF_8_armenian.c
+src_c/stem_UTF_8_armenian.h
src_c/stem_UTF_8_basque.c
src_c/stem_UTF_8_basque.h
src_c/stem_UTF_8_catalan.c
@@ -81,6 +83,8 @@ src_c/stem_UTF_8_romanian.c
src_c/stem_UTF_8_romanian.h
src_c/stem_UTF_8_russian.c
src_c/stem_UTF_8_russian.h
+src_c/stem_UTF_8_serbian.c
+src_c/stem_UTF_8_serbian.h
src_c/stem_UTF_8_spanish.c
src_c/stem_UTF_8_spanish.h
src_c/stem_UTF_8_swedish.c
@@ -89,6 +93,8 @@ src_c/stem_UTF_8_tamil.c
src_c/stem_UTF_8_tamil.h
src_c/stem_UTF_8_turkish.c
src_c/stem_UTF_8_turkish.h
+src_c/stem_UTF_8_yiddish.c
+src_c/stem_UTF_8_yiddish.h
runtime/api.c
runtime/api.h
runtime/header.h
11 changes: 7 additions & 4 deletions src/libstemmer/Makefile
@@ -1,10 +1,13 @@
include mkinc.mak
+ifeq ($(OS),Windows_NT)
+EXEEXT=.exe
+endif
CFLAGS=-O2
CPPFLAGS=-Iinclude
-all: libstemmer.o stemwords
-libstemmer.o: $(snowball_sources:.c=.o)
+all: libstemmer.a stemwords$(EXEEXT)
+libstemmer.a: $(snowball_sources:.c=.o)
	$(AR) -cru $@ $^
-stemwords: examples/stemwords.o libstemmer.o
+stemwords$(EXEEXT): examples/stemwords.o libstemmer.a
	$(CC) $(CFLAGS) -o $@ $^
clean:
-	rm -f stemwords *.o src_c/*.o examples/*.o runtime/*.o libstemmer/*.o
+	rm -f stemwords$(EXEEXT) libstemmer.a *.o src_c/*.o examples/*.o runtime/*.o libstemmer/*.o