tosca

Tools for Statistical Content Analysis
created at TU Dortmund University.

About

tosca is a framework for statistical methods in content analysis. It offers a pipeline for preprocessing and modeling text corpora, linking to the implementation of Latent Dirichlet Allocation in the lda package. Plot routines for corpora both before and after modeling support the descriptive analysis of text corpora and topic models. In addition, tosca implements Chang's intruder words and intruder topics, as well as reasoned sampling of text ids, which yields effective sets of texts for human labeling/coding with respect to the accuracy of estimating precision and recall.

Installation

For examples of how to use tosca, see the vignette.
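
A minimal sketch of installing and loading the package from an R session, assuming tosca is available on CRAN under the name "tosca". The `requireNamespace` guard is an illustrative choice to avoid reinstalling an existing copy, not part of the package's own instructions.

```r
# Assumption: tosca is published on CRAN under the name "tosca".
# Install only if the package is not already present.
if (!requireNamespace("tosca", quietly = TRUE)) {
  install.packages("tosca")
}

# Attach the package for the current session.
library(tosca)

# List the vignettes shipped with the installed package.
vignette(package = "tosca")
```

The last call only lists the available vignettes; the names it reports can then be passed to `vignette()` to open one.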

Citation

For a BibTeX entry please use citation(package = "tosca").

Contribution

This R package is licensed under the GPLv3. For feature requests, issues, and bug reports, please use the issue tracker.
