<div id="toc_container" class="display-none">
<p class="toc_title"><i class="fas fa-list"></i>Contents</p>
<ul class="toc_list">
<li><a href="#Sample data description"><span class="toc_label">0</span>Sample data description</a></li>
<li><a href="#Multiple linear regression"><span class="toc_label">1</span>Multiple linear regression</a></li>
<ul>
<li><a href="#Data encoding - regression with categorical variables"><span class="toc_label">Notes:</span>Data encoding - regression with categorical variables</a></li>
<li><a href="#2D linear regression with scikit-learn"><span class="toc_label">Pythonic Tip:</span>2D regression with scikit-learn</a></li>
<li><a href="#Forcing zero y-intercept"><span class="toc_label">Pythonic Tip:</span>Forcing zero y-intercept</a></li>
<li><a href="#3D+ linear regression with scikit-learn"><span class="toc_label">Pythonic Tip:</span>3D+ regression with scikit-learn</a></li>
</ul>
<li><a href="#Introduction to multicollinearity"><span class="toc_label">2</span>Introduction to multicollinearity</a></li>
</ul>
</div>
<div id="toc_container" class="display-none">
<p class="toc_title"><i class="fas fa-list"></i>Contents</p>
<ul class="toc_list">
<li><a href="#1.-Review-on-Word2Vec-Skip-Gram"><span class="toc_label">1.</span>
Review on Word2Vec Skip-Gram</a></li>
<ul>
<li><a href="#1.1.-Review-on-Softmax"><span class="toc_label">1.1.</span>
Review on Softmax</a></li>
<li><a href="#1.2.-Softmax-is-computationally-very-expensive"><span class="toc_label">1.2.</span>
Softmax is computationally very expensive</a></li>
</ul>
<li><a href="#2.-Skip-Gram-Negative-Sampling"><span class="toc_label">2.</span>
Skip-Gram negative sampling</a></li>
<li><a href="#sample_pop_var"><span class="toc_label">Notes:</span>Population variance $\sigma^2$ vs. Sample variance $s^2$</a></li>
<ul>
<li><a href="#2.1.-How-does-negative-sampling-work?"><span class="toc_label">2.1.</span>
How does negative sampling work?</a></li>
<ul>
<li><a href="#2.1.1.-What-is-a-positive-word-$c_{pos}$?"><span class="toc_label">2.1.1.</span>
What is a positive word $c_{pos}$?</a></li>
<li><a href="#2.1.2.-What-is-a-negative-word-$c_{neg}$?"><span class="toc_label">2.1.2.</span>
What is a negative word $c_{neg}$?</a></li>
<li><a href="#2.1.3.-What-is-a-noise-distribution-$P_n(w)$?"><span class="toc_label">2.1.3.</span>
What is a noise distribution $P_n(w)$?</a></li>
<li><a href="#2.1.4.-How-are-negative-samples-drawn?"><span class="toc_label">2.1.4.</span>
How are negative samples drawn?</a></li>
</ul>
<li><a href="#2.2.-Derivation-of-Cost-Function-in-Negative-Sampling"><span class="toc_label">2.2.</span>
Derivation of cost function in negative sampling</a></li>
<li><a href="#2.3.-Derivation-of-gradients"><span class="toc_label">2.3.</span>
Derivation of gradients</a></li>
<ul>
<li><a href="#2.3.1.-Gradients-with-respect-to-output-weight-matrix-$\frac{\partial-J}{\partial-W_{output}}$"><span class="toc_label">2.3.1.</span>
Gradients with respect to output weight matrix $\frac{\partial J}{\partial W_{output}}$</a></li>
<li><a href="#2.3.2.-Gradients-with-respect-to-input-weight-matrix-$\frac{\partial-J}{\partial-W_{input}}$"><span class="toc_label">2.3.2.</span>
Gradients with respect to input weight matrix $\frac{\partial J}{\partial W_{input}}$</a></li>
</ul>
<li><a href="#2.4-Negative-Sampling-Algorithm"><span class="toc_label">2.4.</span>
Negative sampling algorithm</a></li>
<li><a href="#2.5.-Numerical-Demonstration"><span class="toc_label">2.5.</span>
Numerical demonstration</a></li>
</ul>
</ul>
</div>
<div id="toc_container" class="display-none">
<p class="toc_title"><i class="fas fa-list"></i>Contents</p>
<ul class="toc_list">
<li><a href="#Understanding confidence interval with analogy"><span class="toc_label">1</span>Understanding confidence interval with analogy</a></li>
</ul>
</div>