---
title: abclass
output: github_document
---
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/abclass)](https://CRAN.R-project.org/package=abclass)
[![Build Status](https://github.com/wenjie2wang/abclass/workflows/R-CMD-check/badge.svg)](https://github.com/wenjie2wang/abclass/actions)
[![codecov](https://codecov.io/gh/wenjie2wang/abclass/branch/main/graph/badge.svg)](https://app.codecov.io/gh/wenjie2wang/abclass)
The package **abclass** provides implementations of the multi-category
angle-based classifiers (Zhang & Liu, 2014) with the large-margin unified
machines (Liu et al., 2011) for high-dimensional data.

> **Note**
> This package is still experimental and under active development.
> The function interface is subject to change without guarantee of backward
> compatibility.
## Installation
One can install the released version from
[CRAN](https://CRAN.R-project.org/package=abclass).
``` r
install.packages("abclass")
```
Alternatively, the version under development can be installed as follows:
``` r
if (! require(remotes)) install.packages("remotes")
remotes::install_github("wenjie2wang/abclass", upgrade = "never")
```
## Getting Started
A toy example is as follows:
```{r example-abclass}
library(abclass)
packageVersion("abclass")
## toy example for demonstration purposes
## reference: example 1 in Zhang and Liu (2014)
ntrain <- 400 # size of training set
ntest <- 10000 # size of testing set
p0 <- 5 # number of actual predictors
p1 <- 45 # number of random predictors
k <- 5 # number of categories
set.seed(1)
n <- ntrain + ntest; p <- p0 + p1
train_idx <- seq_len(ntrain)
y <- sample(k, size = n, replace = TRUE) # response
mu <- matrix(rnorm(p0 * k), nrow = k, ncol = p0) # mean vector
## normalize the mean vectors so that they lie on the unit circle
mu <- mu / apply(mu, 1, function(a) sqrt(sum(a ^ 2)))
x0 <- t(sapply(y, function(i) rnorm(p0, mean = mu[i, ], sd = 0.25)))
x1 <- matrix(rnorm(p1 * n, sd = 0.3), nrow = n, ncol = p1)
x <- cbind(x0, x1)
train_x <- x[train_idx, ]
test_x <- x[- train_idx, ]
y <- factor(paste0("label_", y))
train_y <- y[train_idx]
test_y <- y[- train_idx]
### logistic deviance loss with elastic-net penalty
model1 <- cv.abclass(train_x, train_y, nlambda = 100, nfolds = 5,
                     loss = "logistic", grouped = FALSE)
pred1 <- predict(model1, test_x)
table(test_y, pred1)
mean(test_y == pred1) # accuracy
### with groupwise lasso
model2 <- cv.abclass(train_x, train_y, nlambda = 100, nfolds = 5,
                     loss = "logistic", grouped = TRUE)
pred2 <- predict(model2, test_x)
table(test_y, pred2)
mean(test_y == pred2) # accuracy
## tuning by ET-Lasso instead of cross-validation
model3 <- et.abclass(train_x, train_y, nlambda = 100,
                     loss = "logistic", grouped = TRUE)
pred3 <- predict(model3, test_x)
table(test_y, pred3)
mean(test_y == pred3) # accuracy
```
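The examples above all use the logistic deviance loss. The package also exposes other large-margin losses through the same `loss` argument; the sketch below assumes that `"lum"` (the large-margin unified machines loss of Liu et al., 2011) is among the accepted values, and reuses the training and testing data from the chunk above:

``` r
## a sketch, assuming cv.abclass() also accepts loss = "lum";
## train_x, train_y, test_x, and test_y come from the example above
model4 <- cv.abclass(train_x, train_y, nlambda = 100, nfolds = 5,
                     loss = "lum", grouped = TRUE)
pred4 <- predict(model4, test_x)
mean(test_y == pred4) # accuracy
```

Check `?cv.abclass` for the full set of supported losses and their tuning parameters.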
## References
- Zhang, C., & Liu, Y. (2014). Multicategory Angle-Based Large-Margin
  Classification. *Biometrika*, 101(3), 625--640.
- Liu, Y., Zhang, H. H., & Wu, Y. (2011). Hard or Soft Classification?
  Large-Margin Unified Machines. *Journal of the American Statistical
  Association*, 106(493), 166--177.
## License
[GNU General Public License](https://www.gnu.org/licenses/) (≥ 3)
Copyright holder: Eli Lilly and Company