Repositories list
21 repositories
common-crawl-utils (Public): Various Common Crawl utilities in Clojure.
dictionary-annotator (Public): Fast and configurable UIMA dictionary annotator.
crawling-framework (Public): Easily crawl news portals or blog sites using Storm Crawler.
beagle (Public): Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.
- Leiningen template for AWS Lambda custom runtime with GraalVM native image compiled Clojure projects.
doccano (Public)
gf-wordnet (Public)
openccg (Public)
snowball (Public): Snowball version of the Porter stemmer for the Lithuanian language.
docx-utils (Public)
es-utils (Public)
- Demonstration of how to use the Crawling Framework to set up a simple science news crawler and store results in ElasticSearch. Use this configuration to set up your own crawler.
docker-images (Public): Official source for Docker configurations, images, and examples of Dockerfiles for various TokenMill products and projects.
fast-url-access-checker (Public): Easily run HTTP GET requests against a list of URLs to check their HTTP status.
timewords (Public): Multilingual library to easily parse date strings to java.util.Date objects.
spaCy (Public)
faraday (Public)
metadata-detector (Public archive)
ltlangpack (Public archive)