<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>n4mine's blog</title>
<link>https://n4mine.github.io/</link>
<description>Recent content on n4mine's blog</description>
<generator>Hugo -- gohugo.io</generator>
<language>zh-cn</language>
<lastBuildDate>Sun, 13 Mar 2022 16:17:20 +0800</lastBuildDate>
<atom:link href="https://n4mine.github.io/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>About</title>
<link>https://n4mine.github.io/about/</link>
<pubDate>Wed, 23 Jul 2014 07:41:43 +0000</pubDate>
<guid>https://n4mine.github.io/about/</guid>
<description><h3 id="技能栈">Skills</h3>
<ul>
<li>Golang, Shell, Python, Lua</li>
<li>Linux</li>
</ul>
<h3 id="联系我">Contact</h3>
<hr>
<ul>
<li><a href="http://weibo.com/n4mine">n4mine@weibo</a></li>
<li><a href="http://github.com/n4mine">n4mine@github</a></li>
</ul>
<h3 id="工作经历">Work experience</h3>
<hr>
<table>
<thead>
<tr>
<th>Company</th>
<th>Position</th>
<th>Period</th>
</tr>
</thead>
<tbody>
<tr>
<td>DiDi</td>
<td>Operations development engineer</td>
<td>2016.03 ~</td>
</tr>
<tr>
<td>Meituan</td>
<td>SRE</td>
<td>2015.01 ~ 2016.03</td>
</tr>
<tr>
<td>Xiaomi</td>
<td>SRE</td>
<td>2013.11 ~ 2015.01</td>
</tr>
<tr>
<td>Information Center of the Ministry of Industry and Information Technology</td>
<td>Operations engineer</td>
<td>2012.02 ~ 2013.11</td>
</tr>
<tr>
<td>Changchun Jiacheng Network Engineering Co., Ltd.</td>
<td>Engineer</td>
<td>2009.05 ~ 2011.06</td>
</tr>
</tbody>
</table>
</description>
</item>
<item>
<title>When Observability Meets Serverless</title>
<link>https://n4mine.github.io/post/observability_and_serverless/</link>
<pubDate>Sun, 13 Mar 2022 16:17:20 +0800</pubDate>
<guid>https://n4mine.github.io/post/observability_and_serverless/</guid>
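<description><h2 id="背景">Background</h2>
<p>The nature of observability (monitoring) products means that their data is written far more often than it is read (writes &gt;&gt; reads).</p>
<p>Most technologies on the market are heavily optimized for writes.</p>
<p>For reads, however, they are often "powerless": adding concurrency seems to be the "limit" of what people can do, and even that is still bounded by the bottleneck of a single machine.</p>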
<p><img src="https://n4mine.github.io/img/read_and_write.png" alt="read and write"></p>
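<h2 id="溯源">Root causes</h2>
<p>The reasons reads cannot reach high performance can be roughly grouped as follows:</p>
<ol>
<li>Compared with writes, which have better locality, reads tend to be "random", and random access is often the root cause of limited performance.</li>
<li>Even though SSDs have been around for years, disk speed is still the bottleneck of most systems, which is why in-memory TSDBs such as <code>google monarch</code> and <code>facebook beringei</code> have appeared in the industry.</li>
<li>Once a company grows to a certain size, cost inevitably comes to the table, and every discussion turns to <code>ROI</code> and <code>cutting costs and improving efficiency</code>.</li>
</ol>
<p>So which methods or technologies can improve or solve these problems?</p>
<h2 id="曙光">A glimmer of hope</h2>
<p>Serverless: ever since AWS launched Lambda in 2014, FaaS has taken off.</p>
<p>I will not go into the definitions of serverless, Lambda, and FaaS here, and will focus only on their characteristics:</p>
<ol>
<li>Event-driven</li>
<li>Pay-as-you-go billing</li>
<li>Good scalability</li>
</ol>
<p>Now consider the read scenario in observability:</p>
<ol>
<li>Not long-running</li>
<li>Bursty queries that fetch a large amount of data at once</li>
<li>"Cost-sensitive"</li>
</ol>
<p>This seems to match the use cases of FaaS.</p>
<h2 id="业界">In the industry</h2>
<p>Let's look at how the industry uses serverless to address the insufficient read capacity for observability data.</p>
<h3 id="cortex-query-frontend">Cortex Query Frontend</h3>
<p>The Cortex Query Frontend was first proposed by Tom Wilkie in 2019. Its design can be summarized by the diagram below.</p>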
<p><img src="https://n4mine.github.io/img/frontend.jpg" alt="frontend"></p>
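<p>The queriers below the query frontend do not serve requests directly. Instead, they consume the queries that the frontend has split up.</p>
<p>This reminds me of the well-known cartoon about parallelism in golang.</p>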
<p><img src="https://n4mine.github.io/img/parallelism.png" alt="parallelism"></p>
<h3 id="tempo-backend-search">Tempo Backend Search</h3>
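<p>Following the same architectural lineage as Cortex, tempo, which handles traces, recently introduced an experimental <code>Backend search</code>, part of which uses serverless. Its main architecture is shown below.</p>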
<p><img src="https://n4mine.github.io/img/tempo.jpg" alt="tempo"></p>
<h3 id="honeycomb-retriever">Honeycomb Retriever</h3>
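<p>While cortex and tempo are still used by relatively few people, or are still at the experimental stage,</p>
<p>honeycomb has already put lambda into production and has accumulated a lot of production-grade experience.</p>
<p>honeycomb's top-level architecture is shown in the diagram below; lambda is used to query the data stored on s3.</p>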
<p><img src="https://n4mine.github.io/img/hny.jpg" alt="hny"></p>
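<h2 id="结尾">Conclusion</h2>
<p>Serverless, a technology/product that has become hugely popular in recent years, is well suited to querying observability data. I believe it will shine in the observability field.</p>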
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>Thoughts on Reading "The Self-Cultivation of Monitoring: The Past Decade and the Next Decade"</title>
<link>https://n4mine.github.io/post/reviews_of_monitoring/</link>
<pubDate>Sat, 19 Feb 2022 11:22:37 +0800</pubDate>
<guid>https://n4mine.github.io/post/reviews_of_monitoring/</guid>
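<description><p>After reading <a href="https://mp.weixin.qq.com/s/iOSc4jFRRv61kPdrZ85dnQ">"The Self-Cultivation of Monitoring: The Past Decade and the Next Decade"</a>, I have a few thoughts.</p>
<p>From being one of falcon's earliest users, to becoming a falcon developer, to building a brand-new monitoring architecture on top of vm as the underlying storage: seven or eight years have gone by in a flash :)</p>
<p>The real reason prometheus became popular is that it backed the right horse and arrived at the right time: as k8s spread, cloud and cloud native came up more and more often.
Within the industry in particular, everyone talks about prometheus; even colleagues of ours who have never used prometheus have started discussing federation and remote_write/read.</p>
<p>Popular as prometheus is, some of its fixed, even stubborn, paradigms and its initial single-node positioning have made later solutions such as thanos/cortex/vm/m3 increasingly popular.
In particular, vm supports both pull and push at ingestion, which resolves the industry's dithering, and even flame wars, over collection methods that prometheus's opinionated ingestion logic brought about. Of course, prometheus has also made some "improvements" over the last 2 years.</p>
<p>I am glad to see Nightingale's architectural shift: it dares to break with the old and has moved one step closer to cloud-native monitoring.</p>
<p>Let me venture a few "predictions" about the future of cloud-native monitoring technology:</p>
<ol>
<li>A closed technical loop inside the monitoring architecture is a key lever for making monitoring cloud native</li>
<li>Both pull and push collection will be indispensable in the future; neither will prevail over the other, and anyone fixated on only one of them will suffer the consequences</li>
<li>Alerting will gradually focus on SLOs rather than trying to cover everything</li>
<li>Although various costs have come down in the cloud era, what gets collected will still not be "everything that should be collected", nor will it stay as "highly abstracted" as it is today. At least until there is a breakthrough in storage architecture, cost remains the biggest "obstacle" on the way to the "collect everything" goal of so-called observability</li>
<li>Connecting MTL (metrics, traces, logging) is the most important product form of the past few years and the next few</li>
<li>Relative to alerting capabilities, a variety of graphing products will be needed more and more</li>
<li>Multi-replica in-memory TSDBs combined with cheap long-term storage will become more and more popular</li>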
</ol>
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>Sampling 2xx Logs in nginx</title>
<link>https://n4mine.github.io/post/nginx_sample_log_2xx_only/</link>
<pubDate>Sat, 11 Sep 2021 09:50:48 +0800</pubDate>
<guid>https://n4mine.github.io/post/nginx_sample_log_2xx_only/</guid>
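<description><h1 id="背景">Background</h1>
<p>In some scenarios we do not need to record every nginx 2xx log entry,
but we do need to record <code>all</code> 4xx and 5xx entries.
The solutions I found online either skip 2xx logs entirely or sample all logs, and neither meets this requirement.
Below we use <code>ngx_lua</code> to achieve the following:</p>
<ol>
<li>For 1xx, 2xx, and 3xx, sample a given percentage</li>
<li>For 4xx and 5xx, record 100%</li>
</ol>
<p>Here is the code:</p>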
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">lua_shared_dict c 1m<span class="p">;</span>
</span></span><span class="line"><span class="cl">init_by_lua_block <span class="o">{</span>
</span></span><span class="line"><span class="cl"> ngx.shared.c:add<span class="o">(</span><span class="s2">&#34;countme&#34;</span>, 0<span class="o">)</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">server <span class="o">{</span>
</span></span><span class="line"><span class="cl"> listen 8888<span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> location /logme <span class="o">{</span>
</span></span><span class="line"><span class="cl"> <span class="nb">set</span> <span class="nv">$logcon</span> 1<span class="p">;</span>
</span></span><span class="line"><span class="cl"> log_by_lua_block <span class="o">{</span>
</span></span><span class="line"><span class="cl"> <span class="nb">local</span> <span class="nv">c</span> <span class="o">=</span> ngx.shared.c
</span></span><span class="line"><span class="cl"> c:incr<span class="o">(</span><span class="s2">&#34;countme&#34;</span>, 1<span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="nb">local</span> <span class="nv">ks</span> <span class="o">=</span> <span class="m">10</span> -- keep 1/10
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="k">if</span> <span class="o">(</span>ngx.status/100 &lt; <span class="m">4</span> and c:get<span class="o">(</span><span class="s2">&#34;countme&#34;</span><span class="o">)</span>%ks ~<span class="o">=</span> 0<span class="o">)</span>
</span></span><span class="line"><span class="cl"> <span class="k">then</span>
</span></span><span class="line"><span class="cl"> ngx.var.logcon <span class="o">=</span> <span class="m">0</span>
</span></span><span class="line"><span class="cl"> end
</span></span><span class="line"><span class="cl"> <span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> access_log /tmp/access.log main <span class="k">if</span><span class="o">=</span><span class="nv">$logcon</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"> error_log /tmp/error.log<span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="k">return</span> 200<span class="p">;</span>
</span></span><span class="line"><span class="cl"> <span class="c1">#return 400;</span>
</span></span><span class="line"><span class="cl"> <span class="o">}</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span></code></pre></td></tr></table>
</div>
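</div><p>In the code above, <code>ks = 10</code> means that only roughly 10% of the 1xx, 2xx, and 3xx logs are recorded, while every 4xx and 5xx entry is still kept.
I won't write up the testing process; write a loop yourself and see the effect :)</p>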
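<p>The sampling decision itself can be sketched outside nginx. The following Go version is only an illustration of the same rule, under the assumption that one shared counter is incremented for every request; the names <code>Sampler</code>, <code>NewSampler</code> and <code>ShouldLog</code> are hypothetical. Unlike the Lua snippet, which increments and then reads the counter in two separate calls, this sketch uses the value returned by the atomic increment.</p>

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Sampler logs every response with status >= 400 and one in every
// keepOneIn responses below 400. Hypothetical type for illustration.
type Sampler struct {
	keepOneIn uint64
	count     uint64
}

// NewSampler returns a Sampler; keepOneIn = 10 keeps about 1/10 of
// the responses below 400. A value of 0 is treated as 1 (keep all).
func NewSampler(keepOneIn uint64) *Sampler {
	if keepOneIn == 0 {
		keepOneIn = 1
	}
	return &Sampler{keepOneIn: keepOneIn}
}

// ShouldLog counts the request and reports whether it should be logged.
// Every request advances the counter, including 4xx and 5xx ones.
func (s *Sampler) ShouldLog(status int) bool {
	n := atomic.AddUint64(&s.count, 1)
	if status >= 400 {
		return true // errors are always recorded
	}
	return n%s.keepOneIn == 0
}

func main() {
	s := NewSampler(10)
	kept2xx, kept5xx := 0, 0
	for i := 0; i < 100; i++ {
		if s.ShouldLog(200) {
			kept2xx++
		}
	}
	for i := 0; i < 5; i++ {
		if s.ShouldLog(500) {
			kept5xx++
		}
	}
	fmt.Println(kept2xx, kept5xx) // prints: 10 5
}
```

<p>Because the counter is shared by all status codes, the fraction of sampled 2xx entries is only approximately 1/10 when error responses are mixed into the traffic.</p>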
<p>Have fun :)</p>
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>Implementing a Dynamic nginx upstream with map</title>
<link>https://n4mine.github.io/post/nginx-dynamic-upstream-use-map/</link>
<pubDate>Mon, 16 Mar 2020 18:40:57 +0800</pubDate>
<guid>https://n4mine.github.io/post/nginx-dynamic-upstream-use-map/</guid>
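<description><h1 id="背景">Background</h1>
<p>In our scenario, nginx's upstream servers are a bunch of containers.
A container's hostname does not change, but its IP may.
I want to avoid changing the nginx configuration when a container's IP changes.</p>
<h1 id="实现">Implementation</h1>
<p>Here is the code</p>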
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-nginx" data-lang="nginx"><span class="line"><span class="cl"><span class="k">map</span> <span class="nv">$x</span> <span class="nv">$backend</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="kn">1</span> <span class="n">server1</span><span class="p">:</span><span class="mi">12345</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"> <span class="kn">2</span> <span class="n">server2</span><span class="p">:</span><span class="mi">12345</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"> <span class="kn">3</span> <span class="n">server3</span><span class="p">:</span><span class="mi">12345</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="kn">default</span> <span class="n">backupserver</span><span class="p">:</span><span class="mi">12345</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">server</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="kn">listen</span> <span class="mi">7777</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="kn">location</span> <span class="s">/d</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="kn">resolver</span> <span class="mi">114</span><span class="s">.114.114.114</span> <span class="s">valid=60s</span> <span class="s">ipv6=off</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"> <span class="kn">set_by_lua_block</span> <span class="nv">$x</span> <span class="p">{</span> <span class="kn">return</span> <span class="s">math.random(3)</span> <span class="err">}</span>
</span></span><span class="line"><span class="cl"> <span class="s">proxy_pass</span> <span class="s">http://</span><span class="nv">$backend</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Let's run a performance test</p>
<p>Static upstream</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">$ ./wrk --latency -d 1m -t <span class="m">10</span> -c <span class="m">100</span> http://127.0.0.1:7777/s
</span></span><span class="line"><span class="cl">Running 1m <span class="nb">test</span> @ http://127.0.0.1:7777/s
</span></span><span class="line"><span class="cl"> <span class="m">10</span> threads and <span class="m">100</span> connections
</span></span><span class="line"><span class="cl"> Thread Stats Avg Stdev Max +/- Stdev
</span></span><span class="line"><span class="cl"> Latency 5.93ms 5.66ms 177.31ms 97.39%
</span></span><span class="line"><span class="cl"> Req/Sec 1.85k 238.42 2.30k 69.12%
</span></span><span class="line"><span class="cl"> Latency Distribution
</span></span><span class="line"><span class="cl"> 50% 5.16ms
</span></span><span class="line"><span class="cl"> 75% 5.85ms
</span></span><span class="line"><span class="cl"> 90% 7.06ms
</span></span><span class="line"><span class="cl"> 99% 27.34ms
</span></span><span class="line"><span class="cl"> <span class="m">1105047</span> requests in 1.00m, 188.59MB <span class="nb">read</span>
</span></span><span class="line"><span class="cl"> Non-2xx or 3xx responses: <span class="m">1105047</span>
</span></span><span class="line"><span class="cl">Requests/sec: 18412.50
</span></span><span class="line"><span class="cl">Transfer/sec: 3.14MB
</span></span></code></pre></td></tr></table>
</div>
</div><p>Dynamic upstream</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">$ ./wrk --latency -d 1m -t <span class="m">10</span> -c <span class="m">100</span> http://127.0.0.1:7777/d
</span></span><span class="line"><span class="cl">Running 1m <span class="nb">test</span> @ http://127.0.0.1:7777/d
</span></span><span class="line"><span class="cl"> <span class="m">10</span> threads and <span class="m">100</span> connections
</span></span><span class="line"><span class="cl"> Thread Stats Avg Stdev Max +/- Stdev
</span></span><span class="line"><span class="cl"> Latency 6.33ms 3.98ms 101.19ms 96.57%
</span></span><span class="line"><span class="cl"> Req/Sec 1.68k 185.44 2.27k 77.58%
</span></span><span class="line"><span class="cl"> Latency Distribution
</span></span><span class="line"><span class="cl"> 50% 5.73ms
</span></span><span class="line"><span class="cl"> 75% 6.46ms
</span></span><span class="line"><span class="cl"> 90% 7.28ms
</span></span><span class="line"><span class="cl"> 99% 25.91ms
</span></span><span class="line"><span class="cl"> <span class="m">1005401</span> requests in 1.00m, 171.58MB <span class="nb">read</span>
</span></span><span class="line"><span class="cl"> Non-2xx or 3xx responses: <span class="m">1005401</span>
</span></span><span class="line"><span class="cl">Requests/sec: 16751.96
</span></span><span class="line"><span class="cl">Transfer/sec: 2.86MB
</span></span></code></pre></td></tr></table>
</div>
</div><p>Have fun :)</p>
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>Get Funcname and Callername in Golang</title>
<link>https://n4mine.github.io/post/get-funcname-and-callername-in-golang/</link>
<pubDate>Wed, 21 Aug 2019 17:48:57 +0800</pubDate>
<guid>https://n4mine.github.io/post/get-funcname-and-callername-in-golang/</guid>
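<description><h1 id="背景">Background</h1>
<p>In golang, we often need to know the name of the current function and of its caller.</p>
<p>For example, in monitoring scenarios, obtaining the function name automatically is a fairly common need.</p>
<h1 id="实现">Implementation</h1>
<p>Here is the code</p>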
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span> <span class="nf">FuncAndCallerFunc</span><span class="p">()</span> <span class="p">(</span><span class="kt">string</span><span class="p">,</span> <span class="kt">string</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">pc</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">([]</span><span class="kt">uintptr</span><span class="p">,</span> <span class="mi">15</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="nx">n</span> <span class="o">:=</span> <span class="nx">runtime</span><span class="p">.</span><span class="nf">Callers</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="nx">pc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="nx">frames</span> <span class="o">:=</span> <span class="nx">runtime</span><span class="p">.</span><span class="nf">CallersFrames</span><span class="p">(</span><span class="nx">pc</span><span class="p">[:</span><span class="nx">n</span><span class="p">])</span>
</span></span><span class="line"><span class="cl"> <span class="nx">frame</span><span class="p">,</span> <span class="nx">more</span> <span class="o">:=</span> <span class="nx">frames</span><span class="p">.</span><span class="nf">Next</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="kd">var</span> <span class="nx">caller</span> <span class="kt">string</span>
</span></span><span class="line"><span class="cl"> <span class="k">if</span> <span class="nx">more</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">callerFra</span><span class="p">,</span> <span class="nx">_</span> <span class="o">:=</span> <span class="nx">frames</span><span class="p">.</span><span class="nf">Next</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="nx">caller</span> <span class="p">=</span> <span class="nx">callerFra</span><span class="p">.</span><span class="nx">Function</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl"> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="nx">caller</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">caller</span> <span class="p">=</span> <span class="s">&#34;-&#34;</span>
</span></span><span class="line"><span class="cl"> <span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"> <span class="k">return</span> <span class="nx">frame</span><span class="p">.</span><span class="nx">Function</span><span class="p">,</span> <span class="nx">caller</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Let's test it</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">type</span> <span class="nx">A</span> <span class="kd">struct</span><span class="p">{}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">func</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">_f</span><span class="p">,</span> <span class="nx">_c</span> <span class="o">:=</span> <span class="nf">FuncAndCallerFunc</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Printf</span><span class="p">(</span><span class="s">&#34;in func: %v, caller is: %v\n&#34;</span><span class="p">,</span> <span class="nx">_f</span><span class="p">,</span> <span class="nx">_c</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="nx">a</span> <span class="o">:=</span> <span class="nx">A</span><span class="p">{}</span>
</span></span><span class="line"><span class="cl"> <span class="nf">x</span><span class="p">(</span><span class="nx">a</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">func</span> <span class="nf">x</span><span class="p">(</span><span class="nx">a</span> <span class="nx">A</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">_f</span><span class="p">,</span> <span class="nx">_c</span> <span class="o">:=</span> <span class="nf">FuncAndCallerFunc</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Printf</span><span class="p">(</span><span class="s">&#34;in func: %v, caller is: %v\n&#34;</span><span class="p">,</span> <span class="nx">_f</span><span class="p">,</span> <span class="nx">_c</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"> <span class="nx">a</span><span class="p">.</span><span class="nb">print</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">func</span> <span class="p">(</span><span class="nx">A</span><span class="p">)</span> <span class="nb">print</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">_f</span><span class="p">,</span> <span class="nx">_c</span> <span class="o">:=</span> <span class="nf">FuncAndCallerFunc</span><span class="p">()</span>
</span></span><span class="line"><span class="cl"> <span class="nx">fmt</span><span class="p">.</span><span class="nf">Printf</span><span class="p">(</span><span class="s">&#34;in func: %v, caller is: %v\n&#34;</span><span class="p">,</span> <span class="nx">_f</span><span class="p">,</span> <span class="nx">_c</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The output is</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">in func: main.main, <span class="nb">caller</span> is: runtime.main
</span></span><span class="line"><span class="cl">in func: main.x, <span class="nb">caller</span> is: main.main
</span></span><span class="line"><span class="cl">in func: main.A.print, <span class="nb">caller</span> is: main.x
</span></span></code></pre></td></tr></table>
</div>
</div><p><em>-EOF-</em></p></description>
</item>
<item>
<title>Understanding Containers in 15 Minutes</title>
<link>https://n4mine.github.io/post/understand-container-in-15-minutes/</link>
<pubDate>Sat, 27 Jul 2019 20:02:57 +0800</pubDate>
<guid>https://n4mine.github.io/post/understand-container-in-15-minutes/</guid>
<description><h2 id="什么是容器">What is a container?</h2>
<p><img src="https://n4mine.github.io/img/upgrade-vr.png" alt="upgrade vr"></p>
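<p>A container is really just like the picture above:<br>
put a VR player somewhere they believe is home (<code>chroot</code>), create the environment the target needs (<code>namespaces</code>), and then supply water and food at regular intervals and in fixed amounts (<code>cgroups</code>) to keep them alive.</p>
<p>chroot, namespaces, and cgroups are the core technologies of containers.</p>
<p>Using only a handful of commands, this article takes 15 minutes to give readers an intuitive understanding of <code>containers</code>.</p>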
<h2 id="chroot">chroot</h2>
<p>chroot is fairly simple, so it is not demonstrated here.</p>
<h2 id="namespaces">namespaces</h2>
<p>namespaces, as defined on <a href="https://en.wikipedia.org/wiki/Linux_namespaces">wikipedia</a>: <code>Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources.</code></p>
<p>For documentation, see <a href="https://lwn.net/Articles/531114/">namespaces</a></p>
<p>Different <code>processes</code> can live in different namespaces, which isolates them from one another.</p>
<p>By type, there are 6 kinds of namespaces, each used for a different scenario. These 6 namespaces are:</p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Macro</th>
<th>What it isolates</th>
</tr>
</thead>
<tbody>
<tr>
<td>Mount namespaces</td>
<td>CLONE_NEWNS</td>
<td>Mount points</td>
</tr>
<tr>
<td>UTS namespaces</td>
<td>CLONE_NEWUTS</td>
<td>Hostname and NIS domain name</td>
</tr>
<tr>
<td>IPC namespaces</td>
<td>CLONE_NEWIPC</td>
<td>System V IPC, POSIX message queues</td>
</tr>
<tr>
<td>PID namespaces</td>
<td>CLONE_NEWPID</td>
<td>Process IDs</td>
</tr>
<tr>
<td>Network namespaces</td>
<td>CLONE_NEWNET</td>
<td>Network devices, stacks, ports, etc.</td>
</tr>
<tr>
<td>User namespaces</td>
<td>CLONE_NEWUSER</td>
<td>User and group IDs</td>
</tr>
</tbody>
</table>
<h3 id="uts-namespaces">uts namespaces</h3>
<p>Let's start with the simplest one, <code>uts namespaces</code>.</p>
<p>Our first goal: isolate the <code>hostname</code>.</p>
<ol>
<li>Check the hostname of the current namespace and the namespace the current process lives in:</li>
</ol>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># hostname</span>
</span></span><span class="line"><span class="cl">outside
</span></span><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># readlink /proc/$$/ns/uts</span>
</span></span><span class="line"><span class="cl">uts:<span class="o">[</span>4026531838<span class="o">]</span>
</span></span></code></pre></td></tr></table>
</div>
</div><ol start="2">
<li>Unshare the UTS namespace and start a new process:</li>
</ol>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># unshare --uts /bin/sh</span>
</span></span><span class="line"><span class="cl">sh-4.2# hostname inside
</span></span><span class="line"><span class="cl">sh-4.2# hostname
</span></span><span class="line"><span class="cl">inside
</span></span><span class="line"><span class="cl">sh-4.2# readlink /proc/<span class="nv">$$</span>/ns/uts
</span></span><span class="line"><span class="cl">uts:<span class="o">[</span>4026532328<span class="o">]</span>
</span></span><span class="line"><span class="cl">sh-4.2# <span class="nb">exit</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>As shown above, the new hostname has taken effect inside, and the inner process's UTS namespace has changed (<code>4026531838</code> -&gt; <code>4026532328</code>).</p>
<ol start="3">
<li>Verify that the hostname of the outer namespace is unaffected:</li>
</ol>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># hostname</span>
</span></span><span class="line"><span class="cl">outside
</span></span></code></pre></td></tr></table>
</div>
</div><p>How is this done? The following trace shows it:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># strace -fe unshare,execve unshare --uts /bin/sh</span>
</span></span><span class="line"><span class="cl">execve<span class="o">(</span><span class="s2">&#34;/usr/bin/unshare&#34;</span>, <span class="o">[</span><span class="s2">&#34;unshare&#34;</span>, <span class="s2">&#34;--uts&#34;</span>, <span class="s2">&#34;/bin/sh&#34;</span><span class="o">]</span>, <span class="o">[</span>/* <span class="m">27</span> vars */<span class="o">])</span> <span class="o">=</span> <span class="m">0</span>
</span></span><span class="line"><span class="cl">unshare<span class="o">(</span>CLONE_NEWUTS<span class="o">)</span> <span class="o">=</span> <span class="m">0</span> <span class="c1"># &lt;-- the key call</span>
</span></span><span class="line"><span class="cl">execve<span class="o">(</span><span class="s2">&#34;/bin/sh&#34;</span>, <span class="o">[</span><span class="s2">&#34;/bin/sh&#34;</span><span class="o">]</span>, <span class="o">[</span>/* <span class="m">27</span> vars */<span class="o">])</span> <span class="o">=</span> <span class="m">0</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>For <code>unshare(2)</code>, see the <a href="http://man7.org/linux/man-pages/man2/unshare.2.html">man page</a>.</p>
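<p>The same idea can be sketched in Go. This is a minimal, Linux-only sketch, not how <code>unshare(1)</code> is implemented: instead of calling <code>unshare(2)</code> in the current process, it asks the Go runtime to create the child with <code>CLONE_NEWUTS</code> through <code>syscall.SysProcAttr.Cloneflags</code>. The helper names <code>nsID</code> and <code>utsCommand</code> are made up for this example, and running it needs root (CAP_SYS_ADMIN).</p>

```go
// Linux-only sketch: run a shell in a new UTS namespace,
// roughly what `unshare --uts /bin/sh` does.
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// nsID returns the namespace link of the current process,
// e.g. "uts:[4026531838]" for kind "uts".
func nsID(kind string) (string, error) {
	return os.Readlink("/proc/self/ns/" + kind)
}

// utsCommand prepares (but does not start) a command whose child
// process is created in a new UTS namespace via CLONE_NEWUTS.
func utsCommand(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: syscall.CLONE_NEWUTS}
	return cmd
}

func main() {
	if id, err := nsID("uts"); err == nil {
		fmt.Println("parent namespace:", id)
	}
	// Change the hostname inside the new namespace, then print the
	// hostname and the child's namespace ID.
	cmd := utsCommand("/bin/sh", "-c",
		"hostname inside && hostname && readlink /proc/$$/ns/uts")
	if err := cmd.Run(); err != nil {
		// Creating a namespace needs privileges; report instead of crashing.
		fmt.Println("run failed (are you root?):", err)
	}
}
```

<p>Run as root, the child should report a different <code>uts:[...]</code> ID than the parent, and the host's hostname stays untouched. Adding further flags such as <code>syscall.CLONE_NEWPID</code> to <code>Cloneflags</code> isolates more resource types in the same way.</p>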
<h3 id="pid-namespaces">pid namespaces</h3>
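<p>Changing the hostname without affecting the host is not all that exciting on its own. Next, let's see how to isolate <code>PID resources</code>.</p>
<p>Goal: see a <code>brand-new</code> <code>set</code> of <code>pids</code> in the new process.</p>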
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># unshare --fork --pid --mount-proc /bin/sh</span>
</span></span><span class="line"><span class="cl">sh-4.2# ps
</span></span><span class="line"><span class="cl"> PID TTY TIME CMD
</span></span><span class="line"><span class="cl"> <span class="m">1</span> pts/5 00:00:00 sh
</span></span><span class="line"><span class="cl"> <span class="m">2</span> pts/5 00:00:00 ps
</span></span></code></pre></td></tr></table>
</div>
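</div><p>With that, the pids are isolated in the new namespaces; from inside this <code>container</code>, it looks like a brand-new set of pids (the first pid is 1).</p>
<p>How is this done? Again, verify with <code>strace</code>:</p>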
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># strace -fe unshare,execve,mount unshare --fork --pid --mount-proc /bin/sh</span>
</span></span><span class="line"><span class="cl">execve<span class="o">(</span><span class="s2">&#34;/usr/bin/unshare&#34;</span>, <span class="o">[</span><span class="s2">&#34;unshare&#34;</span>, <span class="s2">&#34;--fork&#34;</span>, <span class="s2">&#34;--pid&#34;</span>, <span class="s2">&#34;--mount-proc&#34;</span>, <span class="s2">&#34;/bin/sh&#34;</span><span class="o">]</span>, <span class="o">[</span>/* <span class="m">27</span> vars */<span class="o">])</span> <span class="o">=</span> <span class="m">0</span>
</span></span><span class="line"><span class="cl">unshare<span class="o">(</span>CLONE_NEWNS<span class="p">|</span>CLONE_NEWPID<span class="o">)</span> <span class="o">=</span> <span class="m">0</span> <span class="c1"># &lt;-- the key call</span>
</span></span><span class="line"><span class="cl">Process <span class="m">97554</span> attached
</span></span><span class="line"><span class="cl"><span class="o">[</span>pid 97554<span class="o">]</span> mount<span class="o">(</span><span class="s2">&#34;none&#34;</span>, <span class="s2">&#34;/proc&#34;</span>, NULL, MS_REC<span class="p">|</span>MS_PRIVATE, NULL<span class="o">)</span> <span class="o">=</span> <span class="m">0</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>pid 97554<span class="o">]</span> mount<span class="o">(</span><span class="s2">&#34;proc&#34;</span>, <span class="s2">&#34;/proc&#34;</span>, <span class="s2">&#34;proc&#34;</span>, MS_NOSUID<span class="p">|</span>MS_NODEV<span class="p">|</span>MS_NOEXEC, NULL<span class="o">)</span> <span class="o">=</span> <span class="m">0</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>pid 97554<span class="o">]</span> execve<span class="o">(</span><span class="s2">&#34;/bin/sh&#34;</span>, <span class="o">[</span><span class="s2">&#34;/bin/sh&#34;</span><span class="o">]</span>, <span class="o">[</span>/* <span class="m">27</span> vars */<span class="o">])</span> <span class="o">=</span> <span class="m">0</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="cgroups">cgroups</h2>
<p>cgroups, as defined by <a href="https://en.wikipedia.org/wiki/Cgroups">Wikipedia</a>: <code>cgroups is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes</code></p>
<p>All we need to know is that cgroups can limit resources. Let's get a hands-on feel for what cgroups do.</p>
<p>What to limit: <code>the number of processes</code></p>
<p>Start a new <code>sh</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># sh</span>
</span></span><span class="line"><span class="cl">sh-4.2# <span class="nb">echo</span> <span class="nv">$$</span>
</span></span><span class="line"><span class="cl"><span class="m">14838</span> <span class="c1"># &lt;-- remember this pid</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>First, mount the cgroup v1 pids subsystem:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># mkdir -p cgroup/pids</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># mount -t cgroup -o pids pids ./cgroup/pids</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Configure the cgroup:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside ~<span class="o">]</span><span class="c1"># cd ./cgroup/pids/</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>root@outside pids<span class="o">]</span><span class="c1"># mkdir x</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>root@outside pids<span class="o">]</span><span class="c1"># cd x</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>root@outside x<span class="o">]</span><span class="c1"># echo 3 &gt; pids.max # &lt;-- allow at most 3 processes in this cgroup</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Add the sh started above to the newly created cgroup (by default, its child processes are affected as well):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside x<span class="o">]</span><span class="c1"># echo 14838 &gt; cgroup.procs</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now let's see the effect:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">sh-4.2# pstree -p <span class="nv">$$</span>
</span></span><span class="line"><span class="cl">sh<span class="o">(</span>14838<span class="o">)</span>───pstree<span class="o">(</span>15421<span class="o">)</span>
</span></span><span class="line"><span class="cl">sh-4.2# sleep <span class="m">100</span> <span class="p">&amp;</span> <span class="c1"># &lt;-- start the 1st process</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>1<span class="o">]</span> <span class="m">15427</span>
</span></span><span class="line"><span class="cl">sh-4.2# sleep <span class="m">100</span> <span class="p">&amp;</span> <span class="c1"># &lt;-- start the 2nd process</span>
</span></span><span class="line"><span class="cl"><span class="o">[</span>2<span class="o">]</span> <span class="m">15429</span>
</span></span><span class="line"><span class="cl">sh-4.2# sleep <span class="m">100</span> <span class="p">&amp;</span> <span class="c1"># &lt;-- the 3rd process fails to start, because the original sh also counts</span>
</span></span><span class="line"><span class="cl">sh: fork: retry: Resource temporarily unavailable
</span></span><span class="line"><span class="cl">sh: fork: retry: Resource temporarily unavailable
</span></span><span class="line"><span class="cl">sh: fork: retry: Resource temporarily unavailable
</span></span><span class="line"><span class="cl">sh: fork: retry: Resource temporarily unavailable
</span></span><span class="line"><span class="cl">sh: fork: Resource temporarily unavailable
</span></span></code></pre></td></tr></table>
</div>
</div><p>We can check the current number of <code>pids</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="o">[</span>root@outside x<span class="o">]</span><span class="c1"># cat pids.current</span>
</span></span><span class="line"><span class="cl"><span class="m">3</span> <span class="c1"># &lt;-- indeed 3</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="总结">Summary</h2>
<p>Using the <code>unshare</code> command and <code>cgroups</code>, we have taken a first look at the basic technologies behind containers.</p>
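<p>As a recap of the cgroups steps, the sketch below writes the same two control files (<code>pids.max</code> and <code>cgroup.procs</code>) from Go. It is only an illustration under stated assumptions: the function names <code>limitPids</code> and <code>dryRun</code> are hypothetical, and <code>dryRun</code> points at a temporary directory so the code can run without root. On a real system the directory would be the cgroup created above (<code>./cgroup/pids/x</code>).</p>

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// limitPids writes the two control files used in the walkthrough:
// pids.max (the limit) and cgroup.procs (the process to add).
func limitPids(cgroupDir string, limit, pid int) error {
	maxPath := filepath.Join(cgroupDir, "pids.max")
	if err := os.WriteFile(maxPath, []byte(strconv.Itoa(limit)+"\n"), 0o644); err != nil {
		return err
	}
	procsPath := filepath.Join(cgroupDir, "cgroup.procs")
	return os.WriteFile(procsPath, []byte(strconv.Itoa(pid)+"\n"), 0o644)
}

// dryRun exercises limitPids against a temporary directory and returns
// what was written, so the sketch works without a mounted cgroup.
func dryRun(limit, pid int) (maxFile, procsFile string, err error) {
	dir, err := os.MkdirTemp("", "cgroup-sketch")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	if err := limitPids(dir, limit, pid); err != nil {
		return "", "", err
	}
	m, err := os.ReadFile(filepath.Join(dir, "pids.max"))
	if err != nil {
		return "", "", err
	}
	p, err := os.ReadFile(filepath.Join(dir, "cgroup.procs"))
	if err != nil {
		return "", "", err
	}
	return string(m), string(p), nil
}

func main() {
	m, p, err := dryRun(3, os.Getpid())
	if err != nil {
		fmt.Println("dry run failed:", err)
		return
	}
	fmt.Printf("pids.max=%q cgroup.procs=%q\n", m, p)
}
```

<p>Against a real cgroup directory, the kernel interprets these writes instead of storing them as plain files, which is what makes the third <code>sleep</code> fail in the demonstration.</p>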
<p>Deeper topics about containers are left for readers to explore on their own.</p>
<p>Have fun :)</p>
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>A First Look at the LTTB Downsampling Algorithm</title>
<link>https://n4mine.github.io/post/lttb-downsample/</link>
<pubDate>Thu, 25 Jul 2019 17:44:41 +0800</pubDate>
<guid>https://n4mine.github.io/post/lttb-downsample/</guid>
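<description><h2 id="降采样">Downsampling</h2>
<p>What is downsampling? For time-series data, downsampling usually means reducing every N <code>raw</code> data points to a single point by some algorithm, so that the trend of the curve can be kept over longer retention periods.</p>
<p>The benefits of downsampling:</p>
<ol>
<li>Lower cost. For example, reducing 6 data points to 1 gives a compression ratio of 6:1, which is already a high ratio in some complex scenarios.</li>
<li>Less computation. After downsampling, front-end rendering speed and resource usage also improve greatly.</li>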
</ol>
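<h2 id="降采样算法">Downsampling algorithms</h2>
<p><code>Averaging</code> is a common downsampling algorithm. For example, a <a href="https://github.com/devtoolkits/downsample">demo</a> I wrote averages N points to produce 1 new data point.</p>
<p>For example, given the raw data (period 10s)</p>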
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="p">[]</span><span class="nx">Point</span><span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">Point</span><span class="p">{</span><span class="mi">10</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">},</span>
</span></span><span class="line"><span class="cl"> <span class="nx">Point</span><span class="p">{</span><span class="mi">20</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">},</span>
</span></span><span class="line"><span class="cl"> <span class="nx">Point</span><span class="p">{</span><span class="mi">30</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">},</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>the result after downsampling (period 15s) is</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="p">[]</span><span class="nx">Point</span><span class="p">{</span>
</span></span><span class="line"><span class="cl"> <span class="nx">Point</span><span class="p">{</span><span class="mi">10</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">},</span>
</span></span><span class="line"><span class="cl"> <span class="nx">Point</span><span class="p">{</span><span class="mi">25</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">},</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><blockquote>
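<p>Aside: why are the timestamps in the implementation above aligned towards the past?
Mainly because it is easier to understand. For example, if the current time is <code>21:28:35</code> and the period is <code>30s</code>, aligning towards the past gives <code>21:28:30</code>, while aligning towards the future gives <code>21:29:00</code>. <code>21:29:00</code> is in the future, which is hard to explain to users.</p>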
</blockquote>
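<p>The averaging scheme and the alignment rule above can be sketched as follows. This is not the code of the linked demo: the field names of <code>Point</code>, the helper names, and the choice to anchor buckets at the first point's timestamp are assumptions made here, picked so that the sample input yields the two points shown above.</p>

```go
package main

import "fmt"

// Point is a (timestamp in seconds, value) pair; field names are assumed.
type Point struct {
	Ts    int64
	Value float64
}

// alignDown aligns ts to the start of its step-sized window (towards the past).
func alignDown(ts, step int64) int64 {
	return ts - ts%step
}

// downsampleAvg averages the points that fall into each step-sized bucket.
// Buckets are anchored at the first point's timestamp, and each output point
// carries the start time of its bucket. The input must be sorted by Ts.
func downsampleAvg(points []Point, step int64) []Point {
	if len(points) == 0 || step <= 0 {
		return nil
	}
	var out []Point
	origin := points[0].Ts
	bucket := origin
	sum, n := 0.0, 0
	for _, p := range points {
		start := origin + alignDown(p.Ts-origin, step)
		if start != bucket && n > 0 {
			// The point belongs to a new bucket: flush the previous one.
			out = append(out, Point{bucket, sum / float64(n)})
			sum, n = 0, 0
		}
		bucket = start
		sum += p.Value
		n++
	}
	return append(out, Point{bucket, sum / float64(n)})
}

func main() {
	raw := []Point{{10, 0.5}, {20, 0.3}, {30, 0.4}}
	fmt.Println(downsampleAvg(raw, 15))

	// 21:28:35 expressed as seconds since midnight, aligned to a 30s period.
	now := int64(21*3600 + 28*60 + 35)
	aligned := alignDown(now, 30)
	fmt.Printf("%02d:%02d:%02d\n", aligned/3600, aligned%3600/60, aligned%60)
}
```

<p>With the sample data, the points at 10s and 20s fall into the bucket starting at 10 and the point at 30s into the bucket starting at 25; the aligned time printed at the end is 21:28:30, matching the aside.</p>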
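<p>So what is the problem with <code>averaging</code> as a downsampling algorithm? A picture makes it obvious (<code>all data below comes from a live production environment</code>).</p>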
<p><img src="https://n4mine.github.io/img/mix_origin_avg_60s.png" alt="origin vs avg 60s"></p>
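<p>In the figure above, every 6 points of the original curve are downsampled to 1 point.</p>
<p>Red is the original curve; green is the curve after average downsampling.</p>
<p>As you can see, the details are <code>completely lost</code>.</p>
<p>So is there an algorithm that both <code>preserves detail</code> and <code>achieves the goal of downsampling</code>?</p>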
<h2 id="lttb">LTTB</h2>
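<p>LTTB (Largest Triangle Three Buckets): <a href="https://skemman.is/bitstream/1946/15343/3/SS_MSthesis.pdf">link to the paper</a></p>
<p>One sentence sums up what the algorithm does: <strong>use fewer data points while keeping the <code>visual characteristics</code> of the original curve</strong></p>
<p>Does it live up to that? See the figure below:</p>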
<p><img src="https://n4mine.github.io/img/mix_origin_lttb_60s.png" alt="origin vs lttb 60s"></p>
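<p>In the figure above, red is the original curve and green is the curve after LTTB downsampling.</p>
<p>As you can see, the details are <code>preserved</code>.</p>
<p>At this point the effect is verified: the algorithm meets our needs.</p>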
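<p>For readers who want to see the mechanics, here is a compact sketch of LTTB as commonly described: keep the first and last points, split the rest into <code>threshold-2</code> buckets, and from each bucket keep the point that forms the largest triangle with the previously kept point and the average of the next bucket. It is written for this article and is not the code from the repository linked at the end; the <code>Point</code> field names and the function name <code>lttb</code> are assumptions.</p>

```go
package main

import (
	"fmt"
	"math"
)

// Point is a (timestamp, value) pair; field names are assumed.
type Point struct {
	Ts    int64
	Value float64
}

// lttb downsamples data to at most threshold points.
func lttb(data []Point, threshold int) []Point {
	if threshold < 3 || threshold >= len(data) {
		return data // nothing to do
	}
	out := make([]Point, 0, threshold)
	out = append(out, data[0]) // always keep the first point
	every := float64(len(data)-2) / float64(threshold-2)
	a := 0 // index of the most recently selected point
	for i := 0; i < threshold-2; i++ {
		// Average of the next bucket, used as the third triangle corner.
		nextStart := int(math.Floor(float64(i+1)*every)) + 1
		nextEnd := int(math.Floor(float64(i+2)*every)) + 1
		if nextEnd > len(data) {
			nextEnd = len(data)
		}
		var avgX, avgY float64
		for _, p := range data[nextStart:nextEnd] {
			avgX += float64(p.Ts)
			avgY += p.Value
		}
		n := float64(nextEnd - nextStart)
		avgX, avgY = avgX/n, avgY/n

		// Pick the point in the current bucket with the largest triangle area.
		start := int(math.Floor(float64(i)*every)) + 1
		ax, ay := float64(data[a].Ts), data[a].Value
		best, bestArea := start, -1.0
		for j := start; j < nextStart; j++ {
			area := math.Abs((ax-avgX)*(data[j].Value-ay)-
				(ax-float64(data[j].Ts))*(avgY-ay)) / 2
			if area > bestArea {
				best, bestArea = j, area
			}
		}
		out = append(out, data[best])
		a = best
	}
	return append(out, data[len(data)-1]) // always keep the last point
}

func main() {
	// A flat series with one spike at index 4.
	data := make([]Point, 10)
	for i := range data {
		data[i] = Point{Ts: int64(i * 10)}
	}
	data[4].Value = 10
	fmt.Println(lttb(data, 4))
}
```

<p>In this small example the spike survives downsampling from 10 points to 4, whereas averaging would flatten it. That is the behaviour seen in the figures: extreme points carry large triangle areas, so they tend to be kept.</p>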
<p>So how does LTTB perform when a higher compression ratio is needed?</p>
<h2 id="lttb-在不同-threshold-下的表现">LTTB at different thresholds</h2>
<h3 id="原始-10s---降采至-60s">Raw 10s -&gt; downsampled to 60s</h3>
<p><img src="https://n4mine.github.io/img/mix_origin_lttb_60s.png" alt="origin vs lttb 60s"></p>
<ul>
<li>Compression ratio 6:1</li>
<li>The vast majority of <code>details</code> are preserved</li>
</ul>
<h3 id="原始-10s---降采至-90s">Raw 10s -&gt; downsampled to 90s</h3>
<p><img src="https://n4mine.github.io/img/mix_origin_lttb_90s.png" alt="origin vs lttb 90s"></p>
<ul>
<li>Compression ratio 9:1</li>
</ul>
<h3 id="原始-10s---降采至-180s">Raw 10s -&gt; downsampled to 180s</h3>
<p><img src="https://n4mine.github.io/img/mix_origin_lttb_180s.png" alt="origin vs lttb 180s"></p>
<ul>
<li>Compression ratio 18:1</li>
</ul>
<h3 id="原始-10s---降采至-300s">Raw 10s -&gt; downsampled to 300s</h3>
<p><img src="https://n4mine.github.io/img/mix_origin_lttb_300s.png" alt="origin vs lttb 300s"></p>
<ul>
<li>Compression ratio 30:1</li>
</ul>
<p>When downsampling to 300s, many details are lost. But what does <code>average</code> downsampling look like at this point?</p>
<p><img src="https://n4mine.github.io/img/mix_origin_avg_300s.png" alt="origin vs avg 300s"></p>
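<p>The shape of the original curve is no longer recognizable at all; only a rough <code>trend</code> is visible.</p>
<h2 id="总结">Summary</h2>
<p>There is nothing profound in this article. It simply examines, from a user's point of view, how the LTTB algorithm behaves in different scenarios.</p>
<p>Overall, it is suitable as a downsampling algorithm for production use and is far better than <code>average</code> downsampling.</p>
<p>Appendix: all related code is available on <a href="https://github.com/n4mine/lttb-practice">github</a></p>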
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>cacheserver: Design Ideas Behind an In-Memory TSDB</title>
<link>https://n4mine.github.io/post/cacheserver-in-memory-tsdb-design/</link>
<pubDate>Mon, 22 Jul 2019 22:14:31 +0800</pubDate>
<guid>https://n4mine.github.io/post/cacheserver-in-memory-tsdb-design/</guid>
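<description><h2 id="cacheserver是什么">What is cacheserver?</h2>
<ul>
<li>A service based on Facebook's Gorilla paper: a high-performance, high-compression time-series database implemented in memory</li>
<li>Its principles were described in an earlier blog post, see <a href="https://n4mine.github.io/post/in-memory-tsdb/#%E6%95%B0%E6%8D%AE%E6%A8%A1%E5%9E%8B%E7%9A%84%E5%AE%9E%E7%8E%B0">Falcon Storage Optimization: The Birth of a High-Performance In-Memory TSDB#Implementation of the data model</a></li>
<li>This article mainly describes some of the thinking that went into designing cacheserver</li>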
</ul>
<p><img src="https://n4mine.github.io/img/cacheserver.png" alt="cacheserver"></p>
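<h2 id="核心架构">Core architecture</h2>
<p>As shown in the figure above:</p>
<ol>
<li>instance: a cacheserver instance</li>
<li>shard: one of several groups of series chunks inside a cacheserver instance</li>
<li>chunks: a slice of chunk</li>
<li>chunk: the (ts, value) data for a period of time</li>
<li>series: a single monitoring curve</li>
</ol>
<p>Each of these components is introduced in turn below. For details, see <a href="https://n4mine.github.io/post/in-memory-tsdb/#%E6%95%B0%E6%8D%AE%E6%A8%A1%E5%9E%8B%E7%9A%84%E5%AE%9E%E7%8E%B0">Falcon Storage Optimization: The Birth of a High-Performance In-Memory TSDB#Implementation of the data model</a>.</p>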
<h3 id="instance">instance</h3>
<p>An instance is a deployed cacheserver instance.</p>
<p>Cluster or sharding? Sharding.</p>
<h3 id="shard">shard</h3>
<p>Why are shards needed? Per-shard locks reduce lock contention.</p>
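<p>A minimal sketch of the per-shard locking idea, assuming a fixed shard count and an FNV hash of the series key (both are choices made for this illustration, not details taken from cacheserver). Writers to series in different shards do not block each other because each shard has its own lock.</p>

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 16 // hypothetical value

// shard owns a subset of the series and its own lock.
type shard struct {
	mu     sync.RWMutex
	series map[string][]float64
}

// Store spreads series across shards by hashing the series key.
type Store struct {
	shards [shardCount]*shard
}

func NewStore() *Store {
	s := &Store{}
	for i := range s.shards {
		s.shards[i] = &shard{series: make(map[string][]float64)}
	}
	return s
}

// shardFor maps a series key to its shard.
func (s *Store) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return s.shards[h.Sum32()%shardCount]
}

// Append locks only the shard that owns key.
func (s *Store) Append(key string, v float64) {
	sh := s.shardFor(key)
	sh.mu.Lock()
	sh.series[key] = append(sh.series[key], v)
	sh.mu.Unlock()
}

// Len returns the number of values stored for key.
func (s *Store) Len(key string) int {
	sh := s.shardFor(key)
	sh.mu.RLock()
	defer sh.mu.RUnlock()
	return len(sh.series[key])
}

func main() {
	store := NewStore()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			key := fmt.Sprintf("series-%d", i)
			for j := 0; j < 100; j++ {
				store.Append(key, float64(j))
			}
		}(i)
	}
	wg.Wait()
	fmt.Println(store.Len("series-0"))
}
```

<p>With a single global lock, all eight goroutines would serialize on every write; with shards, contention is limited to keys that happen to hash to the same shard.</p>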
<h3 id="chunkschunk">chunks/chunk</h3>
<p>A chunk is the structure that actually stores series data; it holds the series as a bit stream.
chunks is simply a slice of chunk used as a ring buffer, so a fixed amount of space can hold multiple chunks.</p>
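<p>The ring-buffer idea can be sketched like this. The type and field names (<code>Chunk</code>, <code>Chunks</code>, <code>Start</code>, <code>Bits</code>) are invented for the example and the bit stream is left empty; the point is only that a fixed-size slice keeps the most recent chunks and overwrites the oldest one.</p>

```go
package main

import "fmt"

// Chunk stands in for a compressed block of (ts, value) data.
type Chunk struct {
	Start int64  // first timestamp covered by the chunk
	Bits  []byte // compressed bit stream (unused in this sketch)
}

// Chunks is a fixed-size ring of chunks: once full, the oldest is overwritten.
type Chunks struct {
	buf  []Chunk
	head int // index where the next chunk will be written
	size int // number of valid chunks
}

// NewChunks allocates a ring with the given capacity (must be > 0).
func NewChunks(capacity int) *Chunks {
	return &Chunks{buf: make([]Chunk, capacity)}
}

// Push stores a chunk, overwriting the oldest one when the ring is full.
func (c *Chunks) Push(ch Chunk) {
	c.buf[c.head] = ch
	c.head = (c.head + 1) % len(c.buf)
	if c.size < len(c.buf) {
		c.size++
	}
}

// Ordered returns the valid chunks from oldest to newest.
func (c *Chunks) Ordered() []Chunk {
	out := make([]Chunk, 0, c.size)
	start := (c.head - c.size + len(c.buf)) % len(c.buf)
	for i := 0; i < c.size; i++ {
		out = append(out, c.buf[(start+i)%len(c.buf)])
	}
	return out
}

func main() {
	ring := NewChunks(3)
	for start := int64(0); start < 5; start++ {
		ring.Push(Chunk{Start: start})
	}
	for _, ch := range ring.Ordered() {
		fmt.Println(ch.Start)
	}
}
```

<p>After five pushes into a ring of capacity three, only the three most recent chunks remain, and memory usage per series stays constant.</p>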
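<h2 id="设计思想">Design ideas</h2>
<h3 id="为什么独立成一个服务而不是嵌入到现有存储中">Why a standalone service instead of being embedded in the existing storage</h3>
<p>From the very beginning, cacheserver was positioned as a standalone service.
That way it and <code>graph</code> can back each other up.
For example, if <code>graph</code> goes down, cacheserver can still be queried to show users recent data, while <code>graph</code> itself is the primary storage behind cacheserver.</p>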
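<h3 id="为什么将数据放在内存中">Why keep the data in memory</h3>
<ol>
<li>It is fast</li>
<li>The algorithm from the Gorilla paper has a high compression ratio (11:1), so memory is enough for hot data</li>
<li>Redis? As far as I know, <code>baidu</code>'s internal TSDB keeps its hot data in <code>redis</code>.</li>
</ol>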
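<p>Part of why the compression ratio is high: the Gorilla paper stores timestamps as delta-of-delta values, and for data reported at a fixed period almost all of those are zero, which can be encoded in a single bit. The sketch below only computes the delta-of-delta sequence to show the effect; it is an illustration, not cacheserver's encoder, and the function name is made up.</p>

```go
package main

import "fmt"

// deltaOfDeltas returns, for each timestamp from the third one on,
// the difference between the current delta and the previous delta.
func deltaOfDeltas(ts []int64) []int64 {
	if len(ts) < 3 {
		return nil
	}
	out := make([]int64, 0, len(ts)-2)
	prevDelta := ts[1] - ts[0]
	for i := 2; i < len(ts); i++ {
		delta := ts[i] - ts[i-1]
		out = append(out, delta-prevDelta)
		prevDelta = delta
	}
	return out
}

func main() {
	// Points reported every 10s, with one arriving a second late.
	fmt.Println(deltaOfDeltas([]int64{10, 20, 30, 41, 51}))
}
```

<p>For the sample timestamps the deltas are 10, 10, 11, 10, so the delta-of-delta values are 0, 1, -1: small numbers that need far fewer bits than full 64-bit timestamps.</p>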
<h3 id="怎么解决-series-爆炸问题">How to deal with series explosion</h3>
<p>Check periodically, and remove inactive data from memory directly.</p>
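<p>One way to picture the periodic check, as a sketch under assumptions: each series records the time of its last write, and a sweep deletes those that have been idle longer than a TTL. The names <code>series</code>, <code>lastWrite</code>, and <code>evictInactive</code>, as well as the one-hour TTL, are hypothetical.</p>

```go
package main

import (
	"fmt"
	"time"
)

// series keeps only what the sweep needs: the time of the last write.
type series struct {
	lastWrite time.Time
}

// evictInactive removes series that have not been written within ttl
// and returns how many were removed.
func evictInactive(all map[string]*series, now time.Time, ttl time.Duration) int {
	removed := 0
	for key, s := range all {
		if now.Sub(s.lastWrite) > ttl {
			delete(all, key) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

// demo runs one sweep over a tiny data set.
func demo() (removed, remaining int) {
	now := time.Now()
	all := map[string]*series{
		"cpu.idle/host=a": {lastWrite: now.Add(-10 * time.Second)},
		"cpu.idle/host=b": {lastWrite: now.Add(-2 * time.Hour)},
	}
	removed = evictInactive(all, now, time.Hour)
	return removed, len(all)
}

func main() {
	removed, remaining := demo()
	fmt.Println("removed:", removed, "remaining:", remaining)
}
```

<p>In a running service the sweep would be triggered by a timer (for example a <code>time.Ticker</code>) and would need to hold the lock of the shard it is scanning.</p>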
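<h3 id="怎么解决-series-identify">How to handle series identify</h3>
<p>What is <code>series identify</code>? It means finding the matching series by searching on tags.
cacheserver does not solve this problem! That is the job of the index; it is essentially a search problem and should not be solved here.
In cacheserver, every series is user-defined and carries no business meaning.</p>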
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>A Brief Look at the Monitoring Hierarchy Model</title>
<link>https://n4mine.github.io/post/monitoring-system-hierarchy/</link>
<pubDate>Thu, 18 Jul 2019 15:37:11 +0800</pubDate>
<guid>https://n4mine.github.io/post/monitoring-system-hierarchy/</guid>
<description><p>Start with one picture; the rest is improvised.</p>
<p><img src="https://n4mine.github.io/img/monitoring_system_hierarchy.png" alt="Monitoring System Hierarchy"></p>
<ul>
<li>
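<p>From the user's perspective, the users of a monitoring system generally fall into customers and developers</p>
<ul>
<li>Customers care only about the business</li>
<li>Developers care about applications, services, and infrastructure (it is 2019; they should no longer care about infrastructure, as discussed below)</li>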
</ul>
</li>
<li>
<p>Monitoring should be used <code>Top-down</code>, not <code>Bottom-up</code></p>
</li>
<li>
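<p>Once the basic framework is in place, monitoring developers should meet users' <code>Top-down</code> needs as soon as possible</p>
<ul>
<li>Real-time aggregation cannot meet the need in terms of either efficiency or cost (for example, a single query that computes over tens of thousands of curves in real time)</li>
<li>Pre-aggregation is the industry trend, for example Prometheus's <code>recording rules</code></li>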
</ul>
</li>
<li>
<p><code>drill down</code> is surely an effective way to address developers' biggest pain point when using a monitoring system</p>
<ul>
<li>
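<p>Where is the problem?</p>
<ul>
<li>The monitoring system only provides numeric curves, one by one, while developers want to see <code>raw logs</code> on the curves</li>
<li>Developers want to report traceIDs, but curves that carry a traceID may overwhelm the monitoring system's time-series database</li>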
</ul>
</li>
<li>
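<p>How to solve it?</p>
<ul>
<li>The monitoring system provides the capability (a heterogeneous storage model) to drill down from an application curve to a service, and then to downstream services. See the <code>drill down</code> in the red box in the figure above</li>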
</ul>
</li>
</ul>
</li>
<li>
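<p>Why should developers no longer care about infrastructure? Why is the <code>drill down</code> to infrastructure grey in the figure above?</p>
<ul>
<li>The monolith era is long gone. This is the container era, and losing eight or ten instances should not affect the whole system</li>
<li>What should they care about?
<ul>
<li>Whether the application is healthy</li>
<li>Whether the SLO is being met</li>
<li>How many minutes of unavailability are left this quarter for you to <code>squander</code> on innovation and exploration</li>
<li>As for how much memory a single instance uses or whether its CPU bottoms out, let it be</li>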
</ul>
</li>
</ul>
</li>
<li>
<p>What should monitoring look like in the future?</p>
<ul>
<li><code>drilllllllllll down</code>, even down to the logs (developers' favorite)</li>
<li><code>observability</code> is where things are heading</li>
</ul>
</li>
</ul>
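<p>The "minutes of unavailability left this quarter" in the list above is an error budget, and the arithmetic is simple enough to sketch. The 99.9% SLO, the 90-day quarter, and the 45 minutes of downtime below are illustrative assumptions, as are the function names.</p>

```go
package main

import "fmt"

// errorBudgetMinutes returns how many minutes of unavailability
// an SLO (e.g. 0.999) allows over a window of the given length.
func errorBudgetMinutes(slo, windowMinutes float64) float64 {
	return (1 - slo) * windowMinutes
}

// remainingMinutes subtracts the downtime already spent from the budget.
func remainingMinutes(slo, windowMinutes, downtimeMinutes float64) float64 {
	return errorBudgetMinutes(slo, windowMinutes) - downtimeMinutes
}

func main() {
	quarter := 90 * 24 * 60.0 // a 90-day quarter in minutes
	fmt.Printf("budget: %.1f min\n", errorBudgetMinutes(0.999, quarter))
	fmt.Printf("left after 45 min of downtime: %.1f min\n",
		remainingMinutes(0.999, quarter, 45))
}
```

<p>A 99.9% SLO over 90 days allows about 129.6 minutes of unavailability. Tracking that one number tells developers more than watching the memory or CPU of individual instances.</p>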
<p><em>-EOF-</em></p></description>
</item>
<item>
<title>Falcon Storage Optimization: The Birth of a High-Performance In-Memory TSDB</title>
<link>https://n4mine.github.io/post/in-memory-tsdb/</link>
<pubDate>Mon, 04 Mar 2019 14:48:36 +0800</pubDate>
<guid>https://n4mine.github.io/post/in-memory-tsdb/</guid>