<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Hexo</title>
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<meta name="og:type" content="blog">
<meta name="og:title">
<meta name="og:url" content="http://yoursite.com/">
<meta name="og:image">
<meta name="og:site_name" content="Hexo">
<meta name="og:description">
<meta name="twitter:card" content="summary">
<link rel="alternate" href="/atom.xml" title="Hexo" type="application/atom+xml">
<link rel="icon" href="/favicon.png">
<link href="http://fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css">
<link rel="stylesheet" href="/css/style.css" type="text/css">
<!--[if lt IE 9]><script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script><![endif]-->
</head>
<body>
<div id="container">
<div id="wrap">
<header id="header">
<div id="banner"></div>
<div id="header-outer" class="outer">
<div id="header-title" class="inner">
<h1 id="logo-wrap">
<a href="/" id="logo">Hexo</a>
</h1>
</div>
<div id="header-inner" class="inner">
<nav id="main-nav">
<a id="main-nav-toggle" class="nav-icon"></a>
<a class="main-nav-link" href="/">Home</a>
<a class="main-nav-link" href="/archives">Archives</a>
</nav>
<nav id="sub-nav">
<a id="nav-rss-link" class="nav-icon" href="/atom.xml" title="RSS Feed"></a>
<a id="nav-search-btn" class="nav-icon" title="Search"></a>
</nav>
<div id="search-form-wrap">
<form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" results="0" class="search-form-input" placeholder="Search"><input type="submit" value="" class="search-form-submit"><input type="hidden" name="q" value="site:http://yoursite.com"></form>
</div>
</div>
</div>
</header>
<div class="outer">
<section id="main">
<article id="post-hello-world" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/03/06/hello-world/" class="article-date">
<time datetime="2014-03-06T10:19:08.000Z" itemprop="datePublished">Mar 6 2014</time>
</a>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/03/06/hello-world/">Hello World</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>Welcome to <a href="http://zespia.tw/hexo" target="_blank">Hexo</a>! This is your very first post. Check the <a href="http://zespia.tw/hexo/docs" target="_blank">documentation</a> to learn how to use it.</p>
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/03/06/hello-world/" data-id="mwu3frsbgo2f2fsa" class="article-share-link">Share</a>
</footer>
</div>
</article>
<article id="post-generate-dynamic-proxy-class-at-runtime-with-ilgenerator-c-sharp" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/03/01/generate-dynamic-proxy-class-at-runtime-with-ilgenerator-c-sharp/" class="article-date">
<time datetime="2014-02-28T18:50:56.000Z" itemprop="datePublished">Mar 1 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">挨踢</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/03/01/generate-dynamic-proxy-class-at-runtime-with-ilgenerator-c-sharp/">Generating a Dynamic Proxy at Runtime with C# ILGenerator</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<h2 id="-">Problem Description</h2>
<p>In C# you often need to invoke objects or services dynamically through a single entry point, such as:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
</pre></td><td class="code"><pre><span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">class</span> ProxyBase
{
<span class="keyword">protected</span> <span class="keyword">abstract</span> <span class="keyword">object</span> <span class="title">Invoke</span>(<span class="keyword">object</span> someMethodRelatedInfo, <span class="keyword">object</span>[] arguments);
}
</pre></td></tr></table></figure>
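<p>Reflection, remote services, and calling methods inside the context of a hosted dynamic script engine from C# all fit this model.</p>
<p>A sound engineering approach is to define these service methods in an interface, which gives you strong type checking, avoids unnecessary bugs, and eases maintenance. For example:</p>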
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
</pre></td><td class="code"><pre><span class="keyword">public</span> <span class="keyword">interface</span> IFooService
{
<span class="keyword">void</span> MethodWithNoReturn();
<span class="keyword">int</span> MethodTakeParameterAndReturn(<span class="keyword">int</span> a, <span class="keyword">int</span> b);
}
</pre></td></tr></table></figure>
<p>Each backend needs its own concrete invocation implementation:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
7
8
9
</pre></td><td class="code"><pre><span class="keyword">public</span> <span class="keyword">class</span> FooProxyBase : ProxyBase
{
<span class="keyword">protected</span> <span class="keyword">override</span> <span class="keyword">object</span> <span class="title">Invoke</span>(<span class="keyword">object</span> someMethodRelatedInfo, <span class="keyword">object</span>[] arguments)
{
<span class="comment">// Pack to JSON and send via http</span>
<span class="comment">// Or adapt and call other classes</span>
<span class="comment">// Or whatever</span>
}
}
</pre></td></tr></table></figure>
<p>The final proxy class inherits from the invocation implementation class and also implements the service contract interface:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="code"><pre><span class="keyword">public</span> <span class="keyword">class</span> FooService : FooProxyBase, IFooService
{
<span class="preprocessor">#<span class="keyword">region</span> Implement IFooService</span>
<span class="keyword">public</span> <span class="keyword">void</span> <span class="title">MethodWithNoReturn</span>()
{
Invoke(<span class="string">"MethodWithNoReturn"</span>, <span class="keyword">new</span> <span class="keyword">object</span>[<span class="number">0</span>]);
}
<span class="keyword">public</span> <span class="keyword">int</span> <span class="title">MethodTakeParameterAndReturn</span>(<span class="keyword">int</span> a, <span class="keyword">int</span> b)
{
<span class="keyword">return</span> (<span class="keyword">int</span>)Invoke(<span class="string">"MethodTakeParameterAndReturn"</span>, <span class="keyword">new</span> <span class="keyword">object</span>[] { a, b });
}
<span class="preprocessor">#<span class="keyword">endregion</span></span>
}
</pre></td></tr></table></figure>
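<p>This has an obvious problem: the proxy class is full of repetitive code, and the more methods there are, the more tedious it is to write. The point of interest here is generating the proxy class dynamically. Once that is in place, a single line of code replaces a huge hand-written proxy class:</p>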
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
</pre></td><td class="code"><pre>IFooService proxy = ProxyEmitter.CreateProxy&lt;FooProxyBase, IFooService&gt;(<span class="comment">/*Constructor parameters are supported*/</span>);
</pre></td></tr></table></figure>
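<p>There are many ways to generate a proxy class dynamically (for example, generating source code and compiling it). The approach taken here is to read the methods of the service interface through Reflection at runtime, build the proxy class dynamically, and implement the method bodies in .Net IL with ILGenerator.Emit.</p>
<h2 id="-">Implementation Notes</h2>
<p>MSDN has a detailed walkthrough of the scaffolding code for dynamically creating an Assembly, Module, and Type, so that is not the focus of this post. See the source code for the actual implementation.</p>
<p>This section records how the IL for a few pieces of logic in this project is generated. A few facts are essential:</p>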
<ul>
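<li>The .Net CLR is a stack-based virtual machine</li>
<li>The .Net CLR is strongly typed (when generating C# classes)</li>
<li>Arguments are pushed onto the stack in order</li>
<li>The first argument of a non-static method is always the this pointer</li>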
</ul>
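<p>As a minimal sketch of the stack model and argument order, the following emits a tiny static method with DynamicMethod (a standard System.Reflection.Emit class; nothing here suggests the project itself uses it). Because the method is static there is no this pointer, so ldarg.0 is the first declared parameter; in an instance method the same parameter would be ldarg.1. The names StackDemo and BuildSubtract are made up for illustration.</p>
<figure class="highlight cs"><table><tr><td class="code"><pre>using System;
using System.Reflection.Emit;

public static class StackDemo
{
    // Builds the equivalent of: static int Subtract(int a, int b) { return a - b; }
    public static Func&lt;int, int, int&gt; BuildSubtract()
    {
        var method = new DynamicMethod("Subtract", typeof(int), new[] { typeof(int), typeof(int) });
        var il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push a (static method: argument 0 is the first parameter)
        il.Emit(OpCodes.Ldarg_1); // push b
        il.Emit(OpCodes.Sub);     // pop b and a, push a - b
        il.Emit(OpCodes.Ret);     // return the value left on the stack
        return (Func&lt;int, int, int&gt;)method.CreateDelegate(typeof(Func&lt;int, int, int&gt;));
    }

    public static void Main()
    {
        Console.WriteLine(BuildSubtract()(5, 3)); // a - b with a = 5, b = 3
    }
}
</pre></td></tr></table></figure>
<p>Subtraction is used instead of addition so that the push order is visible in the result: swapping the two ldarg instructions would compute b - a.</p>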
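<p>1. Constructors with parameters</p>
<p>Many scenarios in C# that involve automatic generation (such as serialization) require a parameterless constructor, which can be frustrating at times. Supporting constructors with parameters is in fact feasible.</p>
<p>If the base class only has a constructor that takes parameters, the derived class's constructor must pass it all the required arguments:</p>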
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
</pre></td><td class="code"><pre>class Derived: Base
{
<span class="keyword">public</span> <span class="title">Derived</span>(<span class="keyword">int</span> may, <span class="keyword">string</span> para, <span class="keyword">object</span>[] meters): <span class="title">base</span>(may, para, meters) {}
}
</pre></td></tr></table></figure>
<p>To implement the code above in IL, push the arguments onto the stack again and then call the base class's ctor:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
</pre></td><td class="code"><pre><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">EmitCtor</span>(TypeBuilder tBuilder, ConstructorInfo ctor)
{
<span class="keyword">var</span> pTypes = ctor.GetParameters().Select(p => p.ParameterType).ToArray();
<span class="keyword">var</span> builder = Emitter.GetConstructor(
tBuilder,
MethodAttributes.Public |
MethodAttributes.HideBySig |
MethodAttributes.SpecialName |
MethodAttributes.RTSpecialName,
pTypes
);
<span class="keyword">var</span> ilGen = builder.GetILGenerator();
<span class="comment">// No locals</span>
<span class="comment">// Load all args, note arg 0 is this pointer, so must emit one more</span>
<span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i &lt;= pTypes.Length; i++)
{
DoEmit(ilGen, OpCodes.Ldarg_S, i);
}
<span class="comment">// Call base ctor</span>
DoEmit(ilGen, OpCodes.Call, ctor);
<span class="comment">// Return</span>
DoEmit(ilGen, OpCodes.Ret);
}
</pre></td></tr></table></figure>
<p>The generated IL looks like this:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
</pre></td><td class="code"><pre>IL_0000: ldarg<span class="number">.0</span>
IL_0001: ldarg<span class="number">.1</span>
IL_0002: ldarg<span class="number">.2</span>
IL_0003: ldarg<span class="number">.3</span>
IL_0004: call instance <span class="keyword">void</span> Base::.ctor(int32, <span class="keyword">string</span>, <span class="keyword">object</span>[])
IL_0009: ret
</pre></td></tr></table></figure>
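<p>For reference, here is a self-contained sketch of the same pass-through constructor written against the raw TypeBuilder API rather than the Emitter.GetConstructor and DoEmit helpers used above, whose implementations are not shown in this post. The Base class, CtorDemo, and the assembly and module names are illustrative only. It assumes a runtime where the static AssemblyBuilder.DefineDynamicAssembly is available (.NET Framework 4.5 or later, or .NET Core); older runtimes would go through AppDomain.CurrentDomain.DefineDynamicAssembly instead.</p>
<figure class="highlight cs"><table><tr><td class="code"><pre>using System;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;

// Illustrative base class whose only constructor takes parameters.
public class Base
{
    public readonly int May;
    public readonly string Para;
    public readonly object[] Meters;

    public Base(int may, string para, object[] meters)
    {
        May = may; Para = para; Meters = meters;
    }
}

public static class CtorDemo
{
    public static Type BuildDerived()
    {
        var asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("CtorDemoAssembly"), AssemblyBuilderAccess.Run);
        var module = asm.DefineDynamicModule("CtorDemoModule");
        var tBuilder = module.DefineType(
            "Derived", TypeAttributes.Public | TypeAttributes.Class, typeof(Base));

        var baseCtor = typeof(Base).GetConstructors().Single();
        var pTypes = baseCtor.GetParameters().Select(p =&gt; p.ParameterType).ToArray();
        var ctor = tBuilder.DefineConstructor(
            MethodAttributes.Public | MethodAttributes.HideBySig,
            CallingConventions.Standard, pTypes);

        var il = ctor.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);                 // this
        for (int i = 1; i &lt;= pTypes.Length; i++)
        {
            il.Emit(OpCodes.Ldarg, (short)i);     // ldarg takes a 16-bit operand
        }
        il.Emit(OpCodes.Call, baseCtor);          // Base::.ctor(int32, string, object[])
        il.Emit(OpCodes.Ret);

        return tBuilder.CreateType();
    }

    public static void Main()
    {
        var instance = (Base)Activator.CreateInstance(
            BuildDerived(), 7, "hi", new object[] { 1, 2 });
        Console.WriteLine(instance.May + " " + instance.Para + " " + instance.Meters.Length);
    }
}
</pre></td></tr></table></figure>
<p>Note the explicit cast on the ldarg operand: ILGenerator.Emit writes as many operand bytes as the C# type of the value you pass, so the value has to match the operand size of the opcode (a short for ldarg, a byte for ldarg.s).</p>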
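<p>2. Array initialization</p>
<p>Because of the shape of Invoke, the generator has to create a lot of object[] instances and pack the arguments into them.
To create a local variable, you first need to declare it:</p>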
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
</pre></td><td class="code"><pre>ilGen.DeclareLocal(<span class="keyword">typeof</span>(<span class="keyword">object</span>[]));
</pre></td></tr></table></figure>
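<p>Each method's execution environment maintains a list of locals, and IL code loads locals onto the stack and stores them back by index.
Create the array object and store it in the local:</p>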
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
7
</pre></td><td class="code"><pre><span class="comment">// Initialize array</span>
<span class="comment">// IL_0006: ldc.i4.x</span>
DoEmit(ilGen, OpCodes.Ldc_I4_S, pTypes.Length);
<span class="comment">// IL_0007: newarr [mscorlib]System.Object</span>
DoEmit(ilGen, OpCodes.Newarr, <span class="keyword">typeof</span>(Object));
<span class="comment">// IL_000c: stloc.0</span>
DoEmit(ilGen, OpCodes.Stloc_0);
</pre></td></tr></table></figure>
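<p>Assigning each array element takes 4 to 5 IL instructions:</p>
<ul>
<li>ldloc.? pushes the array onto the stack</li>
<li>ldc.i4.? pushes the index of the current element</li>
<li>Push the value to assign to the element (here an argument, loaded with ldarg.s; note that argument 0 is the this pointer)</li>
<li>If the value is a value type, box it</li>
<li>stelem.ref performs the assignment</li>
</ul>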
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
</pre></td><td class="code"><pre><span class="comment">// Now fill the array</span>
<span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span>; i &lt; pTypes.Length; i++)
{
<span class="comment">// Load the array first</span>
<span class="comment">// IL_000X + 00: ldloc.0</span>
DoEmit(ilGen, OpCodes.Ldloc_0);
<span class="comment">// Push the index</span>
<span class="comment">// IL_000X + 01: ldc_i4_x</span>
DoEmit(ilGen, OpCodes.Ldc_I4_S, i);
<span class="comment">// Load argument i + 1 (note that argument 0 is this pointer(?))</span>
<span class="comment">// IL_000X + 02: ldarg_X</span>
DoEmit(ilGen, OpCodes.Ldarg_S, i + <span class="number">1</span>);
<span class="comment">// Box value type</span>
<span class="keyword">if</span> (pTypes[i].IsValueType)
{
<span class="comment">// IL_000X + 03: box pTypes[i]</span>
DoEmit(ilGen, OpCodes.Box, pTypes[i]);
}
<span class="comment">// Set array element</span>
<span class="comment">// IL_00X + ??: stelem.ref</span>
DoEmit(ilGen, OpCodes.Stelem_Ref);
}
</pre></td></tr></table></figure>
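<p>Putting the pieces together, the sketch below shows one way a whole proxy method body could be emitted: pack the arguments into an object[], call Invoke, and convert the result. It is not the ProxyEmitter implementation (see the source code for that). ProxySketch, EchoProxyBase, and the assembly and module names are hypothetical, it uses the raw ILGenerator API instead of the DoEmit helper, it assumes the base class has a public parameterless constructor, and it ignores ref/out parameters and generic methods.</p>
<figure class="highlight cs"><table><tr><td class="code"><pre>using System;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;

public abstract class ProxyBase
{
    protected abstract object Invoke(object someMethodRelatedInfo, object[] arguments);
}

// Hypothetical backend: logs the call and adds the two arguments if there are two.
public class EchoProxyBase : ProxyBase
{
    protected override object Invoke(object someMethodRelatedInfo, object[] arguments)
    {
        Console.WriteLine(someMethodRelatedInfo + "(" + string.Join(", ", arguments) + ")");
        return arguments.Length == 2 ? (object)((int)arguments[0] + (int)arguments[1]) : null;
    }
}

public interface IFooService
{
    void MethodWithNoReturn();
    int MethodTakeParameterAndReturn(int a, int b);
}

public static class ProxySketch
{
    public static TInterface Create&lt;TBase, TInterface&gt;() where TBase : ProxyBase, new()
    {
        var asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("ProxySketchAssembly"), AssemblyBuilderAccess.Run);
        var tBuilder = asm.DefineDynamicModule("ProxySketchModule").DefineType(
            "GeneratedProxy", TypeAttributes.Public | TypeAttributes.Class,
            typeof(TBase), new[] { typeof(TInterface) });
        tBuilder.DefineDefaultConstructor(MethodAttributes.Public);

        var invoke = typeof(ProxyBase).GetMethod(
            "Invoke", BindingFlags.Instance | BindingFlags.NonPublic);

        foreach (var method in typeof(TInterface).GetMethods())
        {
            var pTypes = method.GetParameters().Select(p =&gt; p.ParameterType).ToArray();
            var mBuilder = tBuilder.DefineMethod(
                method.Name,
                MethodAttributes.Public | MethodAttributes.Virtual | MethodAttributes.Final |
                MethodAttributes.HideBySig | MethodAttributes.NewSlot,
                method.ReturnType, pTypes);
            var il = mBuilder.GetILGenerator();

            il.DeclareLocal(typeof(object[]));            // local 0 holds the argument array
            il.Emit(OpCodes.Ldc_I4, pTypes.Length);
            il.Emit(OpCodes.Newarr, typeof(object));
            il.Emit(OpCodes.Stloc_0);
            for (int i = 0; i &lt; pTypes.Length; i++)
            {
                il.Emit(OpCodes.Ldloc_0);                 // the array
                il.Emit(OpCodes.Ldc_I4, i);               // the element index
                il.Emit(OpCodes.Ldarg, (short)(i + 1));   // argument 0 is this
                if (pTypes[i].IsValueType) il.Emit(OpCodes.Box, pTypes[i]);
                il.Emit(OpCodes.Stelem_Ref);
            }

            il.Emit(OpCodes.Ldarg_0);                     // this
            il.Emit(OpCodes.Ldstr, method.Name);          // someMethodRelatedInfo
            il.Emit(OpCodes.Ldloc_0);                     // arguments
            il.Emit(OpCodes.Callvirt, invoke);

            if (method.ReturnType == typeof(void)) il.Emit(OpCodes.Pop);
            else il.Emit(OpCodes.Unbox_Any, method.ReturnType); // unbox or cast the result
            il.Emit(OpCodes.Ret);

            tBuilder.DefineMethodOverride(mBuilder, method);
        }
        return (TInterface)Activator.CreateInstance(tBuilder.CreateType());
    }

    public static void Main()
    {
        var proxy = Create&lt;EchoProxyBase, IFooService&gt;();
        proxy.MethodWithNoReturn();
        Console.WriteLine(proxy.MethodTakeParameterAndReturn(1, 2));
    }
}
</pre></td></tr></table></figure>
<p>The sketch passes the method name as someMethodRelatedInfo, mirroring the hand-written FooService above. unbox.any is used for the return value because it unboxes value types and behaves like castclass for reference types, so one instruction covers both cases.</p>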
<h2 id="-">Source Code and Usage</h2>
<p>See <a href="https://github.com/akfish/ProxyEmitter" target="_blank">GitHub</a>.</p>
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/03/01/generate-dynamic-proxy-class-at-runtime-with-ilgenerator-c-sharp/" data-id="avpkn1yjod865ca8" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/.Net/">.Net</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/compiler/">compiler</a></li></ul>
</footer>
</div>
</article>
<article id="post-the-making-of-sarcasm-2-ast-generators-and-fun-with-visualization" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/02/28/the-making-of-sarcasm-2-ast-generators-and-fun-with-visualization/" class="article-date">
<time datetime="2014-02-27T18:46:18.000Z" itemprop="datePublished">Feb 28 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">挨踢</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/02/28/the-making-of-sarcasm-2-ast-generators-and-fun-with-visualization/">The Making Of Sarcasm (2) - AST, Generators and Fun with Visualization</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<p>In <a href="http://catx.me/2014/02/25/the-making-of-sarcasm-1/" target="_blank">part 1</a> we discussed the design goals of Sarcasm and devised a grammar specification that covers most of Irony's features.</p>
<p>Continuing from <a href="https://github.com/akfish/Sarcasm/commit/15c9e6e1ef69bd1150d51af558e3a897e09accb8" target="_blank">commit 15c9e6</a>, in which I implemented the <a href="https://github.com/akfish/Sarcasm/blob/15c9e6e1ef69bd1150d51af558e3a897e09accb8/Sarcasm/Parser/SarcasmGrammar.cs" target="_blank">grammar specs</a> by hand with Irony, we will discuss the following topics:</p>
<ul>
<li>Construction of abstract syntax tree</li>
<li>Generator workflow</li>
<li>MarkDown generator</li>
<li>Having some fun with visualization</li>
</ul>
<h2 id="ast-overview">AST Overview</h2>
<p>After the <a href="https://github.com/akfish/Sarcasm/blob/develop/Sarcasm/Parser/SarcasmGrammar.cs" target="_blank">grammar class</a> is implemented, the first step is to create a parser instance:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
</pre></td><td class="code"><pre><span class="keyword">var</span> language = <span class="keyword">new</span> LanguageData(<span class="keyword">new</span> SarcasmGrammar());
<span class="keyword">var</span> parser = <span class="keyword">new</span> Irony.Parsing.Parser(language);
</pre></td></tr></table></figure>
<p>With the Parser instance, parsing source code is simply:</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
</pre></td><td class="code"><pre><span class="keyword">var</span> parseTree = parser.Parse(sourceCode);
<span class="keyword">var</span> parseRoot = parseTree.Root;
<span class="keyword">var</span> astRoot = parseRoot.AstNode;
</pre></td></tr></table></figure>
<p>If something is wrong with the grammar or source code, parseRoot and astRoot will be null. For now I will not go into error handling.</p>
<p>Two kinds of trees are generated when Irony parses the source code: a parse tree and an optional abstract syntax tree (AST). To create an AST, you must do the following:</p>
<p>1. Set the language flag in the grammar class's constructor.</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
</pre></td><td class="code"><pre>LanguageFlags |= LanguageFlags.CreateAst;
</pre></td></tr></table></figure>
<p>2. Create a set of AST node classes deriving from AstNode in the Irony.Interpreter.Ast namespace (also remember to add a reference to the assembly Irony.Interpreter.dll).</p>
<figure class="highlight cs"><table><tr><td class="gutter"><pre>1
2
3
4
5
6
</pre></td><td class="code"><pre><span class="comment">// Making your own base class will make life easier</span>
<span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">class</span> SarcasmNode : AstNode {<span class="comment">/*...*/</span>}
<span class="comment">// For other nodes</span>
<span class="keyword">public</span> <span class="keyword">class</span> Document : SarcasmNode {<span class="comment">/*...*/</span>}
<span class="keyword">public</span> <span class="keyword">class</span> IdNode : SarcasmNode {<span class="comment">/*...*/</span>}
<span class="comment">/*...*/</span>
</pre></td></tr></table></figure>
<p>3. Assign an AST node class to each Terminal/NonTerminal instance.</p>
<pre class="brush:csharp">// For terminals
var ID = new IdentifierTerminal("ID");
ID.AstConfig.NodeType = typeof (IdNode);
// For non-terminals
var Directive = new NonTerminal("Directive", typeof(DirectiveNode));</pre>
<p>4. Override the AST node's Init method to handle initialization.</p>
<pre class="brush:csharp">// In AST node class
public AstNode ChildNode;
public override void Init(AstContext context, ParseTreeNode treeNode)
{
// Keep this
base.Init(context, treeNode);
// treeNode is the corresponding node in parse tree, contains information like:
// Token
var token = treeNode.Token;
// Term
var term = treeNode.Term;
// Child nodes
var nodes = treeNode.GetMappedChildNodes();
// Set AsString to a human readable format. It will be used to display AST in Irony.GrammarExplorer
AsString = "Id: " + token.Text;
// Use AddChild to build tree structure, it returns an AstNode instance of child's AST node
ChildNode = AddChild(string.Empty, nodes[0]);
}</pre>
<p>That's almost all you need to know about how to construct an AST. However, if you mess it up, things can get ugly, since the debug information is extremely unhelpful. The most common exception you will get is:</p>
<blockquote>
<p>System.NullReferenceException: Object reference not set to an instance of an object.
at Irony.Ast.AstBuilder.BuildAst(ParseTreeNode parseNode) in f:\Dev\Tool\Irony_2013_12_12\Irony\Ast\AstBuilder.cs:line 97</p>
</blockquote>
<p>This will not help you at all. But I will tell you that it always has to do with forgetting to set the AST node type on one of your Terminals/NonTerminals.</p>
<p>Here are some tips I learned the hard way (mostly the only way, since Irony's documentation is poor):</p>
<ul>
<li>Assign an AST node type to all Terminals/NonTerminals, including any intermediate/temporary ones.
<ul>
<li>Except the ones marked as transient. They will not be created at all.</li>
</ul>
</li>
<li>CommentTerminals will NOT be mapped to the AST at all. You will get the above error regardless of whether the AST node type is set.</li>
</ul>
<h2 id="generator-workflow">Generator Workflow</h2>
<p>The AST marks the watershed between a compiler's front end and back end. Although there is still some work (e.g. type validation and semantic analysis) left to be done, we can already generate something from this AST. The most commonly used approach here is the <a href="http://en.wikipedia.org/wiki/Visitor_pattern" target="_blank">visitor pattern</a>:</p>
<p>1. Declare an interface for the visitor, with one overload for each AST node type.</p>
<pre class="brush:csharp"> public interface ISarcasmVisitor
{
void Visit(IdNode node);
void Visit(Document node);
void Visit(StringValueNode node);
// others
}</pre>
<p>2. Add a virtual/abstract Accept method to the AST base class and implement it in every derived class.</p>
<pre class="brush:csharp"> public abstract class SarcasmNode : AstNode
{
public abstract void Accept(ISarcasmVisitor visitor);
}
public class Document : SarcasmNode
{
public override void Accept(ISarcasmVisitor visitor)
{
visitor.Visit(this);
}
}</pre>
<p>3. Then we can create a generator by implementing a specific ISarcasmVisitor for each workflow: not only target code generation but also outlining and semantic analysis.</p>
<pre class="brush:csharp"> public abstract class TargetGenerator
{
public SarcasmParser Parser { get; set; }
protected TargetGenerator(SarcasmParser parser)
{
Parser = parser;
}
protected virtual void BeforeVisitor(StreamWriter writer, ISarcasmVisitor visitor) { }
protected abstract ISarcasmVisitor MakeVisitor(StreamWriter writer);
protected virtual void AfterVisitor(StreamWriter writer, ISarcasmVisitor visitor) { }
public bool Generate(StreamReader sourceStream, StreamWriter targetStream)
{
// Parse first
Parser.Parse(sourceStream);
if (Parser.IsValid)
{
var visitor = MakeVisitor(targetStream);
BeforeVisitor(targetStream, visitor);
// Visit AST
Parser.Document.Accept(visitor);
AfterVisitor(targetStream, visitor);
}
targetStream.Flush();
return Parser.IsValid;
}
}</pre>
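<p>To make the double dispatch concrete, here is a stripped-down sketch of the same pattern with no Irony dependency. The node and visitor types (DocumentNode, HeaderNode, RuleNode, IVisitor, MarkdownVisitor) are simplified stand-ins invented for this example, not the actual Sarcasm classes, and the output format is only loosely modeled on the MarkDown generator.</p>
<pre class="brush:csharp">using System;
using System.Collections.Generic;
using System.IO;

// One Visit overload per node type, as in ISarcasmVisitor.
public interface IVisitor
{
    void Visit(DocumentNode node);
    void Visit(HeaderNode node);
    void Visit(RuleNode node);
}

public abstract class Node
{
    // Each node calls back the overload that matches its own static type.
    public abstract void Accept(IVisitor visitor);
}

public class DocumentNode : Node
{
    public readonly List&lt;Node&gt; Children = new List&lt;Node&gt;();
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public class HeaderNode : Node
{
    public int Level;
    public string Text;
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public class RuleNode : Node
{
    public string Name;
    public string Body;
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}

// A concrete workflow: write a MarkDown-like rendering of the tree.
public class MarkdownVisitor : IVisitor
{
    private readonly TextWriter _writer;
    public MarkdownVisitor(TextWriter writer) { _writer = writer; }

    public void Visit(DocumentNode node)
    {
        foreach (var child in node.Children) child.Accept(this);
    }

    public void Visit(HeaderNode node)
    {
        _writer.WriteLine(new string('#', node.Level) + " " + node.Text);
    }

    public void Visit(RuleNode node)
    {
        // Four leading spaces make a MarkDown code block.
        _writer.WriteLine("    " + node.Name + " := " + node.Body + ";");
    }
}

public static class VisitorDemo
{
    public static void Main()
    {
        var document = new DocumentNode();
        document.Children.Add(new HeaderNode { Level = 1, Text = "Grammar" });
        document.Children.Add(new RuleNode { Name = "SimpleValue", Body = "STRING | ID" });
        document.Accept(new MarkdownVisitor(Console.Out));
    }
}</pre>
<p>Adding another workflow, such as outlining, then means writing one more IVisitor implementation while the node classes stay untouched.</p>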
<h2 id="markdown-generation">MarkDown Generation</h2>
<p>MarkDown generation for Sarcasm is very straightforward, since the syntax is already in MarkDown. All I need to do is remove comment delimiters, add the correct number of line endings, format grammar rules into blocks, and escape some special characters.</p>
<p>Again I won't bother with the details here. Just see <a href="https://github.com/akfish/Sarcasm/blob/develop/Sarcasm/Generator/MarkDownGenerator.cs" target="_blank">the code</a> for yourself.</p>
<h2 id="something-fun-with-visualization">Something Fun with Visualization</h2>
<p>The original plan was to start generating the C# parser class from here. Then I found an interesting project, <a href="http://arborjs.org" target="_blank">arbor.js</a> (especially its <a href="http://arborjs.org/halfviz/" target="_blank">halfviz</a> demo), and decided to do something fun with it. The idea is to make a better tool for debugging. What debug information is better than a visualized one?</p>
<p>The halfviz demo converts a simple language called HalfTone to a node network. With the generator framework in place, it took me less than half an hour to <a href="https://github.com/akfish/Sarcasm/blob/feature/Visualization/Sarcasm/Generator/HalfToneGenerator.cs" target="_blank">generate a node</a> representation from a Sarcasm grammar source file. This can be used to visualize references between terminals and non-terminals:</p>
<p><a href="http://catx.me/wordpress/wp-content/uploads/2014/02/vis.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/vis.png" alt="sarcasm-vis"></a></p>
<p>You can play with it live <a href="http://arborjs.org/halfviz/#/NjM0OA" target="_blank">here</a>. It looks more confusing in this form, for now. But with some interaction (filtering, folding, and highlighting, for example), it can help developers quickly navigate through the grammar.
Here's another concept of how to visualize grammar-related errors in this form (click to enlarge):
<a href="http://catx.me/wordpress/wp-content/uploads/2014/02/concept.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/concept.png" alt="sarcasm-concept"></a></p>
<p>Imagine viewing build errors in Visual Studio with this graph and navigating to the responsible line by clicking on a node. I will definitely try to create something like that later, when I begin building the tool chain for Sarcasm.</p>
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/02/28/the-making-of-sarcasm-2-ast-generators-and-fun-with-visualization/" data-id="cskpvglgbht2slya" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/.Net/">.Net</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/C++/">C++</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Markdown/">Markdown</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Sarcasm/">Sarcasm</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/compiler/">compiler</a></li></ul>
</footer>
</div>
</article>
<article id="post-the-making-of-sarcasm-1" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/02/25/the-making-of-sarcasm-1/" class="article-date">
<time datetime="2014-02-24T20:13:46.000Z" itemprop="datePublished">Feb 25 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">挨踢</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/02/25/the-making-of-sarcasm-1/">The Making Of Sarcasm (1) - Design Goals And Grammar</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
<h2 id="introduction">Introduction</h2>
<p>This is not a tutorial on how to use <a href="https://irony.codeplex.com/" target="_blank">Irony.net</a>. When I am done with this series of articles, hopefully we will never need to deal with Irony directly ever again.</p>
<p>In case you don't know what Irony is, here is the introduction from its official site:</p>
<blockquote>
<p><strong>Irony</strong> is a development kit for implementing languages on .NET platform. Unlike most existing yacc/lex-style solutions Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly in c# using operator overloading to express grammar constructs. Irony's scanner and parser modules use the grammar encoded as c# class to control the parsing process.</p>
</blockquote>
<p>Looks fantastic. However, after I tried for days to implement a CoffeeScript grammar with it, I encountered some issues:</p>
<ul>
<li>While constructing a grammar directly in C# sounds cool, the syntax is just not as clean and efficient as a specially designed DSL would be.</li>
<li>There is absolutely no compile-time checking of the grammar. You have to compile it into a DLL first, then load it with Irony.GrammarExplorer.</li>
<li>It is extremely hard, if not impossible, to track any grammar errors back to the source code.</li>
<li>On top of that, debug information on Shift-Reduce and Reduce-Reduce conflicts is almost unreadable for a complex grammar.</li>
</ul>
<p>It's a nice concept with poor tooling, which makes it scale poorly as the complexity of the grammar grows. After some painstaking effort to get my CoffeeScript parser to work, I finally decided to do something about it and create:</p>
<blockquote>
<p><strong>Sarcasm</strong>, an EBNF-like DSL that generates Irony.</p>
</blockquote>
<p>The design goals are to:</p>
<ul>
<li>Implement a DSL that allows developers to define a grammar in a cleaner, more efficient syntax that looks very much like EBNF notation</li>
<li>Generate an Irony grammar implementation (in C#) and a nicely formatted grammar specification document (in MarkDown)</li>
<li>Enable compile-time error checking and grammar validation</li>
<li>Trace any errors back to the source code</li>
<li>Improve the readability of debug information for grammar conflicts</li>
<li>Provide the necessary Visual Studio language services, templates and tools</li>
</ul>
<h2 id="sarcasm-workflow">Sarcasm Workflow</h2>
<ol>
<li>The developer writes a grammar specification file (.sarc).</li>
<li>The compiler checks for syntax errors and generates both the Irony grammar class (in C#) and the spec docs (in MarkDown).</li>
<li>VS continues the build process.</li>
<li>If the build fails, the Sarcasm tools filter through all error messages and map related errors back to specific tokens in the .sarc file.</li>
<li>If the build succeeds, the Sarcasm tools load the assembly and validate the grammar.</li>
<li>The Sarcasm tools translate any grammar conflicts and errors into a readable format and trace them back to the specific rule in the .sarc file.</li>
</ol>
<p>The entire workflow should be seamlessly integrated with Visual Studio.</p>
<h2 id="sarcasm-grammar">Sarcasm Grammar</h2>
<p>In a nutshell, the Sarcasm grammar is a hybrid of MarkDown and a modified EBNF notation. Here's a quick snippet:</p>
<pre class="brush:plain"># H1

/*
Block comment
*/

// Single Line Comment

// Directive
@class SarcasmGrammar

## H2

// Declarations
ID = new IdentifierTerminal("ID");
STRING = new StringLiteral("STRING", "\"", StringOptions.AllowsAllEscapes);

// Production Rules
SimpleValue := STRING | ID;

// Repeat
Ids := ID{};
Ids := ID*;
Ids := ID?;
Ids := ID+;

// Repeat with delimiters
Ids := ID{","};
Ids := ID*(".");
Ids := ID+(",");

### H3</pre>
<p>As you can see, the grammar consists of:</p>
<ul>
<li>Markdown headers (starting with one or more <span style="text-decoration: underline;">#</span>), used directly for outlining.</li>
<li>Comments (single-line and block). All other text content goes into comments, and Markdown syntax can be used inside them.</li>
<li>Directives (starting with <span style="text-decoration: underline;">@</span>), which configure compiler behavior such as the generated class name.</li>
<li>Declarations, which declare and initialize grammar terminals.</li>
<li>Production rules, which specify the grammar rules.</li>
</ul>
<p>I won't go into full detail here, but you can see for yourself:</p>
<p>Here is the <a href="https://gist.github.com/akfish/9167407#file-sarcasm-sarc" target="_blank">full grammar of Sarcasm, written in Sarcasm</a>.</p>
<p>And here is the <a href="https://gist.github.com/akfish/9167407#file-sarcasm-md" target="_blank">Markdown specification document generated from that file</a>.</p>
<p>While the <a href="https://github.com/akfish/Sarcasm" target="_blank">project</a> is still at an early stage of development, the grammar is mostly complete. I should be able to bootstrap it in a day or two.</p>
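<p>To make the five kinds of construct in a .sarc file concrete, here is a rough sketch that classifies each line by its leading syntax. It is not the Sarcasm compiler: the function name <code>classifySarcLines</code> and the category labels are made up for illustration, and a real front end would tokenize properly rather than look at line prefixes.</p>

```javascript
// Hypothetical sketch: label each non-empty line of a .sarc source by its leading syntax.
// The categories mirror the list in this post; this is not how Sarcasm itself is implemented.
function classifySarcLines(source) {
  const result = [];
  let inBlockComment = false;
  for (const raw of source.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === '') continue;
    let kind;
    if (inBlockComment) {
      kind = 'comment';
      if (line.includes('*/')) inBlockComment = false;
    } else if (line.startsWith('/*')) {
      kind = 'comment';
      inBlockComment = !line.includes('*/');
    } else if (line.startsWith('//')) {
      kind = 'comment';
    } else if (line.startsWith('#')) {
      kind = 'header';
    } else if (line.startsWith('@')) {
      kind = 'directive';
    } else if (line.includes(':=')) {
      kind = 'rule';          // production rule, e.g. Ids := ID+(",");
    } else if (line.includes('=')) {
      kind = 'declaration';   // terminal declaration, e.g. ID = new IdentifierTerminal("ID");
    } else {
      kind = 'unknown';
    }
    result.push({ kind, text: line });
  }
  return result;
}

const sampleSarc = [
  '# H1',
  '/*',
  'Block comment',
  '*/',
  '// Directive',
  '@class SarcasmGrammar',
  'ID = new IdentifierTerminal("ID");',
  'Ids := ID+(",");',
].join('\n');

console.log(classifySarcLines(sampleSarc).map((entry) => entry.kind));
```

<p>Checking for <code>:=</code> before <code>=</code> matters, because every production rule also contains an equals sign.</p>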
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/02/25/the-making-of-sarcasm-1/" data-id="z8m5qkpyi1fxxbdq" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/.Net/">.Net</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/compiler/">compiler</a></li></ul>
</footer>
</div>
</article>
<article id="post-why-pseudo-random-generator-use-magic-number-9301-49297-233280" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/02/22/why-pseudo-random-generator-use-magic-number-9301-49297-233280/" class="article-date">
<time datetime="2014-02-22T07:01:45.000Z" itemprop="datePublished">Feb 22 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">IT</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/02/22/why-pseudo-random-generator-use-magic-number-9301-49297-233280/">Why a JavaScript Random Number Generator Uses 9301, 49297, and 233280 as Magic Numbers</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
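<p>Today I answered this question on Zhihu: <a href="http://www.zhihu.com/question/22818104" target="_blank">A JS random number generation snippet that often turns up online is shown below. Why does it use the three numbers 9301, 49297, and 233280 as its constants?</a></p>
<p>The code from the question:</p>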
<pre class="brush:js">function rnd( seed ){
    seed = ( seed * 9301 + 49297 ) % 233280; //Magic!
    return seed / ( 233280.0 );
};

function rand(number){
    today = new Date();
    seed = today.getTime();
    return Math.ceil( rnd( seed ) * number );
};

myNum=(rand(5));</pre>
<p>After some digging, I finally found the answer: the choice of these three numbers does have a mathematical basis.</p>
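<p><strong>Entry-level selection criteria</strong>
This kind of random number generator is called a linear congruential generator (LCG). The rand provided by almost all runtime libraries is an LCG, of the form:
[latex]I_{n+1}=aI_n + c\ (mod\ m)[/latex]
The generated sequence has a maximum period of m and yields values between 0 and m-1. To reach this maximum period, the parameters must satisfy:</p>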
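<ul>
<li>c and m are coprime</li>
<li>a - 1 is divisible by every prime factor of m</li>
<li>if m is a multiple of 4, a - 1 must also be a multiple of 4</li>
</ul>
<p>These three conditions are known as the Hull-Dobell theorem.
A random number generator whose period is too short will not be taken seriously, so this is the first requirement.
As you can see, the parameters a = 9301, c = 49297, m = 233280 satisfy all three.</p>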
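<p>The three conditions are easy to check mechanically. The sketch below is my own illustration, not code from the Zhihu question: the helper names (<code>satisfiesHullDobell</code>, <code>lcgPeriod</code>) are made up, and the brute-force period count is only there as a cross-check for small moduli.</p>

```javascript
// Greatest common divisor via the Euclidean algorithm.
function gcd(x, y) {
  while (y !== 0) {
    [x, y] = [y, x % y];
  }
  return x;
}

// Distinct prime factors of n by trial division (fine for small moduli).
function primeFactors(n) {
  const factors = [];
  for (let p = 2; p * p <= n; p++) {
    if (n % p === 0) {
      factors.push(p);
      while (n % p === 0) n /= p;
    }
  }
  if (n > 1) factors.push(n);
  return factors;
}

// Hypothetical helper: true when (a, c, m) meets all three Hull-Dobell conditions.
function satisfiesHullDobell(a, c, m) {
  const coprime = gcd(c, m) === 1;
  const primesDivide = primeFactors(m).every((p) => (a - 1) % p === 0);
  const multipleOfFour = m % 4 !== 0 || (a - 1) % 4 === 0;
  return coprime && primesDivide && multipleOfFour;
}

// Brute-force cross-check: count steps until the sequence returns to its seed.
// The step cap keeps the loop finite for parameter sets that never return.
function lcgPeriod(a, c, m, seed = 0) {
  let x = seed;
  let steps = 0;
  do {
    x = (a * x + c) % m;
    steps++;
  } while (x !== seed && steps <= m);
  return steps;
}

console.log(primeFactors(233280));                     // [ 2, 3, 5 ]
console.log(satisfiesHullDobell(9301, 49297, 233280)); // true
console.log(lcgPeriod(9301, 49297, 233280));           // 233280
```

<p>Since 233280 = 2<sup>6</sup> × 3<sup>6</sup> × 5 and 9300 = 2<sup>2</sup> × 3 × 5<sup>2</sup> × 31, the second and third conditions can also be verified by hand.</p>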
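<p><strong>Advanced selection criteria</strong>
To hold your own among random number generators, clearing the entry bar is not enough.
From an engineering standpoint, the value of [latex](m - 1)a + c[/latex] should be small enough (within reason) to avoid overflow.
From a security (practicality) standpoint, the generator must also show good randomness, which can be evaluated with Knuth's Spectral Test (see [2]); it should pass the Spectral Test in 2, 3, 4, 5, and 6 dimensions. The Spectral Test examines the lattice structure that the generated sequence forms in hyperspace. IBM's infamous RANDU subroutine could not even pass the 3-dimensional Spectral Test. Here is a picture to poke fun at it:</p>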
<p><a href="http://catx.me/wordpress/wp-content/uploads/2014/02/800px-Randu.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/800px-Randu.png" alt="800px-Randu"></a></p>
<p>Each point represents three consecutive values generated by RANDU, and you can see that all of them fall on just 15 two-dimensional planes.</p>
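<p>The planes come from a simple linear relation. RANDU is usually defined as x<sub>k+1</sub> = 65539 · x<sub>k</sub> mod 2<sup>31</sup>, and because 65539 = 2<sup>16</sup> + 3, squaring the multiplier gives 65539<sup>2</sup> ≡ 6 · 65539 − 9 (mod 2<sup>31</sup>). Every triple therefore satisfies x<sub>k+2</sub> = 6x<sub>k+1</sub> − 9x<sub>k</sub> (mod 2<sup>31</sup>). The sketch below just checks that relation numerically; the function names are mine.</p>

```javascript
const RANDU_MULTIPLIER = 65539; // 2^16 + 3
const RANDU_MODULUS = 2 ** 31;

// One RANDU step. The product stays below 2^53, so double arithmetic is exact here.
function randuNext(x) {
  return (RANDU_MULTIPLIER * x) % RANDU_MODULUS;
}

// Check x[k+2] = 6*x[k+1] - 9*x[k] (mod 2^31) for `count` consecutive triples.
function randuTriplesAreCoplanar(seed, count) {
  let x0 = seed;
  let x1 = randuNext(x0);
  for (let i = 0; i < count; i++) {
    const x2 = randuNext(x1);
    // JavaScript's % can return a negative value, so normalize into [0, 2^31).
    const predicted =
      (((6 * x1 - 9 * x0) % RANDU_MODULUS) + RANDU_MODULUS) % RANDU_MODULUS;
    if (predicted !== x2) return false;
    x0 = x1;
    x1 = x2;
  }
  return true;
}

console.log(randuTriplesAreCoplanar(1, 10000)); // true
```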
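<p>Given these requirements, the value of c should ideally:</p>
<ul>
<li>be prime (c = 49297 is prime)</li>
<li>be close to [latex](\frac{1}{2}-\frac{1}{6}\sqrt{3})m[/latex] (49297.86460172205 when m = 233280)</li>
</ul>
<p>With these basic criteria in place, the range of candidate parameters shrinks considerably. Run a program that applies the Spectral Test and you get the usable parameter sets:</p>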
<p><a href="http://catx.me/wordpress/wp-content/uploads/2014/02/Unnamed-QQ-Screenshot20140222141315.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/Unnamed-QQ-Screenshot20140222141315.png" alt="Magic Number for LCG Random Generator"></a></p>
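<p>The numeric criteria from this section can be reproduced in a few lines. This is only a sanity check under my own naming (<code>checkIncrement</code> is a made-up helper), not the program that produced the table above.</p>

```javascript
// Trial-division primality test, adequate for numbers this small.
function isPrime(n) {
  if (n < 2) return false;
  for (let p = 2; p * p <= n; p++) {
    if (n % p === 0) return false;
  }
  return true;
}

// Hypothetical helper that evaluates the criteria discussed above for one parameter set.
function checkIncrement(a, c, m) {
  const largestIntermediate = (m - 1) * a + c;   // worst case of a * I_n + c
  const targetC = (0.5 - Math.sqrt(3) / 6) * m;  // (1/2 - sqrt(3)/6) * m
  return {
    largestIntermediate,
    fitsInDouble: Number.isSafeInteger(largestIntermediate),
    targetC,
    distanceFromTarget: Math.abs(c - targetC),
    cIsPrime: isPrime(c),
  };
}

console.log(checkIncrement(9301, 49297, 233280));
```

<p>For these parameters the largest intermediate value is 2169777276. That is above the signed 32-bit limit but far below 2<sup>53</sup>, so it is exact in a JavaScript double.</p>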
<p>References: <a href="http://nuclear.fis.ucm.es/COMP-PHYS/RANDOM/RandomNumbers.pdf" target="_blank">[1]</a> <a href="http://random.mat.sbg.ac.at/tests/theory/spectral/" target="_blank">[2]</a></p>
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/02/22/why-pseudo-random-generator-use-magic-number-9301-49297-233280/" data-id="gs2pxzxs2iyogqc1" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Algorithm/">Algorithm</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Magic Number/">Magic Number</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Math/">Math</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Random/">Random</a></li></ul>
</footer>
</div>
</article>
<article id="post-visual-studio-gplex-gppg-project-config" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/02/08/visual-studio-gplex-gppg-project-config/" class="article-date">
<time datetime="2014-02-07T21:10:49.000Z" itemprop="datePublished">Feb 8 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">IT</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/02/08/visual-studio-gplex-gppg-project-config/">Configuring GPLEX/GPPG in Visual Studio</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
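<p>When I started implementing the compiler for Coffee# (CoffeeScript for .Net), I came across <a href="http://gplex.codeplex.com" target="_blank">GPLEX</a> and <a href="http://gppg.codeplex.com/" target="_blank">GPPG</a>, a pair of Lex- and Yacc-like tools for .Net. They generate a scanner and a parser implemented in C#, which makes it quick to build the front end of a compiler. To use them, you write a .lex file (lexical specification) and a .y file (grammar specification), run the tools to generate .cs files, and add those files to the project to be compiled.</p>
<p>By editing the .csproj file, you can have VS handle the generation from the .lex/.y files and their dependencies automatically. The project directory layout is:</p>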
<pre class="brush:plain">CoffeeSharp
├─documents
├─src
│  └─CoffeeSharp
│     ├─CoffeeSharp.csproj
│     ├─Scanner.lex
│     ├─Scanner.cs
│     ├─Parser.y
│     ├─Parser.cs
│     └─...
└─tools
   ├─gppg.exe
   └─gplex.exe</pre>
<p>Edit the CoffeeSharp.csproj file:</p>
<p>1. Declare the generated files and what they depend on:</p>
<p><pre class="brush:xml"> <!-- Generated with GPLEX and GPPG-->
<ItemGroup>
<Compile Include="Scanner.cs">
<AutoGen>True</AutoGen>
<DependentUpon>Scanner.lex</DependentUpon>
</Compile>
<Compile Include="Parser.cs">
<AutoGen>True</AutoGen>
<DependentUpon>Parser.y</DependentUpon>
</Compile>
</ItemGroup></pre>
2. Define the Lex/Yacc items (replacing the existing entries for the .lex/.y files)</p>
<p><pre class="brush:xml"> <!-- Lexer And Parser Specification Files -->
<ItemGroup>
<Lex Include="Scanner.lex" />
<Yacc Include="Parser.y" />
</ItemGroup></pre>
3. Define the build rules for the targets; adjust the command lines to match your setup</p>
<p><pre class="brush:xml"> <!-- Lex Target -->
<Target Name="LexGenerator" Inputs="@(Lex)" Outputs="@(Lex->'%(RelativeDir)%(Filename).cs')">
<Exec Command="$(SolutionDir)..\tools\gplex.exe /unicode /out:@(Lex ->'%(RelativeDir)%(Filename).cs') %(Lex.Identity)" />
<CreateItem Include="%(Lex.RelativeDir)%(Lex.Filename).cs">
<Output TaskParameter="Include" ItemName="FileWrites" />
</CreateItem>
</Target>
<!-- Yacc Target -->
<Target Name="YaccGenerator" Inputs="@(Yacc)" Outputs="@(Yacc->'%(RelativeDir)%(Filename).cs')">
<Exec Command="$(SolutionDir)..\tools\gppg.exe /gplex /out:@(Yacc ->'%(RelativeDir)%(Filename).cs') %(Yacc.Identity)" />
<CreateItem Include="%(Yacc.RelativeDir)%(Yacc.Filename).cs">
<Output TaskParameter="Include" ItemName="FileWrites" />
</CreateItem>
</Target></pre>
4. Hook the targets into the build dependencies</p>
<p><pre class="brush:xml"><PropertyGroup>
<BuildDependsOn>LexGenerator;YaccGenerator;$(BuildDependsOn)</BuildDependsOn>
<CompileDependsOn>LexGenerator;YaccGenerator;$(CompileDependsOn)</CompileDependsOn>
</PropertyGroup></pre>
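Reload the project file, and the generated .cs files appear nested under their corresponding .lex/.y files:
<a href="http://catx.me/wordpress/wp-content/uploads/2014/02/yacc-lex-config.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/yacc-lex-config.png" alt="yacc-lex-config"></a>Errors reported during generation also show up in VS:
<a href="http://catx.me/wordpress/wp-content/uploads/2014/02/lex-error.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/lex-error.png" alt="lex-error"></a>
Compared with running the generators from a pre-build event, the biggest advantages of this approach are that the dependencies are explicit, the generated files are harder to edit by mistake, and VS regenerates the code only when the .lex/.y files have changed. On-demand generation saves a lot of trouble. For example, suppose the project has a git pre-commit hook that <a href="http://catx.me/2014/01/15/run-msbuild-and-mstest-from-git-pre-commit-hook/" title="Running MSBuild and MSTest from a Git Pre-Commit Hook" target="_blank">checks that the build and tests pass before every commit</a>. Many code generators stamp their output with a timestamp, so every build changes the generated files, and every commit is followed by a fresh diff.</p>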
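<p>The "regenerate only when the .lex/.y files change" behavior comes from the <code>Inputs</code> and <code>Outputs</code> attributes on the targets, which MSBuild compares by timestamp. The sketch below is a simplified model of that decision, not MSBuild's actual implementation; the function name and the use of <code>null</code> for a missing file are my own conventions.</p>

```javascript
// Simplified model of a timestamp-based incremental build decision.
// inputTimes / outputTimes: modification times in milliseconds.
// A null entry in outputTimes stands for an output file that does not exist yet.
function needsRegeneration(inputTimes, outputTimes) {
  if (outputTimes.length === 0 || outputTimes.some((t) => t === null)) {
    return true; // nothing generated yet
  }
  const newestInput = Math.max(...inputTimes);
  const oldestOutput = Math.min(...outputTimes);
  // An output that is the same age as, or newer than, every input counts as up to date.
  return newestInput > oldestOutput;
}

console.log(needsRegeneration([100], [200]));  // Scanner.cs newer than Scanner.lex
console.log(needsRegeneration([300], [200]));  // Scanner.lex edited after generation
console.log(needsRegeneration([100], [null])); // Scanner.cs missing
```

<p>A pre-build event has no such check, so it runs the generators on every build.</p>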
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/02/08/visual-studio-gplex-gppg-project-config/" data-id="zsr73ko61j60k4w5" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/.Net/">.Net</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/CoffeeScript/">CoffeeScript</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/Visual Studio 2012/">Visual Studio 2012</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/compiler/">compiler</a></li></ul>
</footer>
</div>
</article>
<article id="post-get-and-visualize-grammar-definition-of-coffee-script-from-source-code" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/02/06/get-and-visualize-grammar-definition-of-coffee-script-from-source-code/" class="article-date">
<time datetime="2014-02-05T16:55:24.000Z" itemprop="datePublished">Feb 6 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">IT</a>
</div>
</div>
<div class="article-inner">
<header class="article-header">
<h1 itemprop="name">
<a class="article-title" href="/2014/02/06/get-and-visualize-grammar-definition-of-coffee-script-from-source-code/">Extracting and Visualizing the Grammar from the CoffeeScript Source Code</a>
</h1>
</header>
<div class="article-entry" itemprop="articleBody">
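<p>I have recently been looking into whether CoffeeScript can be compiled to run on the .Net CLR. None of the CoffeeScript compiler implementations I looked at came with a specification of the grammar. Rebuilding one by hand would be a painful amount of work and could also introduce compatibility problems. So I read the source code and found that a small trick solves the problem.</p>
<p>CoffeeScript's parser is generated by jison, and the entire grammar is defined in <a href="http://coffeescript.org/documentation/docs/grammar.html" target="_blank">grammar.coffee</a>. The code is easy to modify: remove the call to jison, format the grammar definition with JSON.stringify(), and print it. Run the <a href="https://gist.github.com/akfish/8827385" target="_blank">modified code</a>:</p>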
<blockquote>
<p>coffee grammar.coffee</p>
<p>and you get a long grammar definition in jison format:</p>
<p><pre class="brush:js">{
"tokens":" TERMINATOR TERMINATOR TERMINATOR STATEMENT INDENT OUTDENT INDENT OUTDENT IDENTIFIER NUMBER STRING JS REGEX BOOL = = INDENT OUTDENT : : INDENT OUTDENT RETURN RETURN HERECOMMENT PARAM_START PARAM_END -> => , , ... = ... . ?. :: :: INDEX_START INDEX_END INDEX_SOAK { } , TERMINATOR INDENT OUTDENT CLASS CLASS CLASS EXTENDS CLASS EXTENDS CLASS CLASS CLASS EXTENDS CLASS EXTENDS SUPER SUPER FUNC_EXIST CALL_START CALL_END CALL_START CALL_END THIS @ @ [ ] [ ] .. ... [ ] , TERMINATOR INDENT OUTDENT INDENT OUTDENT , TRY TRY TRY FINALLY TRY FINALLY CATCH THROW ( ) ( INDENT OUTDENT ) WHILE WHILE WHEN UNTIL UNTIL WHEN LOOP LOOP FOR FOR FOR OWN , FORIN FOROF FORIN WHEN FOROF WHEN FORIN BY FORIN WHEN BY FORIN BY WHEN SWITCH INDENT OUTDENT SWITCH INDENT ELSE OUTDENT SWITCH INDENT OUTDENT SWITCH INDENT ELSE OUTDENT LEADING_WHEN LEADING_WHEN TERMINATOR IF ELSE IF ELSE POST_IF POST_IF UNARY - + -- ++ -- ++ ? + - MATH SHIFT COMPARE LOGIC RELATION COMPOUND_ASSIGN COMPOUND_ASSIGN INDENT OUTDENT EXTENDS",
"bnf":
{
"Root":
[
["","return $$ = new yy.Block;",null],
["Body","return $$ = $1;",null],
["Block TERMINATOR","return $$ = $1;",null]
],
"Body":
[
["Line","$$ = yy.Block.wrap([$1]);",null],
["Body TERMINATOR Line","$$ = $1.push($3);",null],
["Body TERMINATOR","$$ = $1;",null]
],
"Line":
[
["Expression","$$ = $1;",null],
["Statement","$$ = $1;",null]
],
...</pre>
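This is already usable, but still not very readable. After some searching I found a <a href="http://bottlecaps.de/convert/" target="_blank">converter</a> from jison to the W3C grammar notation, which yields this <a href="https://gist.github.com/akfish/8827385" target="_blank">grammar</a>:</p>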
<p><pre class="brush:plain">Root ::= Body?
Body ::= Line ( TERMINATOR Line | TERMINATOR )*
Line ::= Expression
| Statement
Statement
::= Return
| Comment
| STATEMENT
Expression
::= Value
| Invocation
| Code
| Operation
| Assign
| If
| Try
| While
| For
| Switch
| Class
| Throw
...</pre>
Finally, I found a site for visualizing grammars, <a href="http://bottlecaps.de/rr/ui" target="_blank">Railroad Diagram Generator</a>, and used it to render the grammar, just for fun:</p>
</blockquote>
<p><a href="http://catx.me/wordpress/wp-content/uploads/2014/02/coffee-grammar.png" target="_blank"><img src="http://catx.me/wordpress/wp-content/uploads/2014/02/coffee-grammar.png" alt="coffee-grammar"></a></p>
<p>The full diagram is at: <a href="http://project.catx.me/other/coffee-grammar.xhtml" target="_blank">http://project.catx.me/other/coffee-grammar.xhtml</a></p>
<p>Source code and the full grammar definition: <a href="https://gist.github.com/akfish/8827385" target="_blank">https://gist.github.com/akfish/8827385</a></p>
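<p>For a feel of what the jison-to-W3C conversion involves, here is a naive sketch that flattens the <code>bnf</code> object from the JSON output into one line per nonterminal. It is not the converter linked above: the function name is made up, and unlike the real tool it does not rewrite left recursion into <code>*</code> or <code>?</code> operators.</p>

```javascript
// Naive sketch: turn a jison-style "bnf" object into one EBNF-like line per nonterminal.
// Each alternative is an array whose first element is the space-separated symbol pattern;
// the semantic action and options that follow it are ignored.
function jisonBnfToEbnf(bnf) {
  return Object.entries(bnf)
    .map(([name, alternatives]) => {
      const patterns = alternatives.map(([pattern]) =>
        pattern === '' ? '/* empty */' : pattern
      );
      return `${name} ::= ${patterns.join(' | ')}`;
    })
    .join('\n');
}

// Two rules copied from the JSON output shown earlier in this post.
const sampleBnf = {
  Body: [
    ['Line', '$$ = yy.Block.wrap([$1]);', null],
    ['Body TERMINATOR Line', '$$ = $1.push($3);', null],
    ['Body TERMINATOR', '$$ = $1;', null],
  ],
  Line: [
    ['Expression', '$$ = $1;', null],
    ['Statement', '$$ = $1;', null],
  ],
};

console.log(jisonBnfToEbnf(sampleBnf));
// Body ::= Line | Body TERMINATOR Line | Body TERMINATOR
// Line ::= Expression | Statement
```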
</div>
<footer class="article-footer">
<a data-url="http://yoursite.com/2014/02/06/get-and-visualize-grammar-definition-of-coffee-script-from-source-code/" data-id="rjdbkm9vv579knfw" class="article-share-link">Share</a>
<ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/.Net/">.Net</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/CoffeeScript/">CoffeeScript</a></li><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/compiler/">compiler</a></li></ul>
</footer>
</div>
</article>
<article id="post-run-msbuild-and-mstest-from-git-pre-commit-hook" class="article article-type-post" itemscope itemprop="blogPost">
<div class="article-meta">
<a href="/2014/01/15/run-msbuild-and-mstest-from-git-pre-commit-hook/" class="article-date">
<time datetime="2014-01-15T13:06:56.000Z" itemprop="datePublished">Jan 15 2014</time>
</a>
<div class="article-category">
<a class="article-category-link" href="/categories/挨踢/">IT</a>