Implementing Your Own JSON Parser in C# (LALR(1) + miniDFA)

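JSON is a widely used data format with a simple grammar. This article shows how to use bitParser (see "Own Your Own Parser: a C# generator of LALR(1) syntax parsers and miniDFA lexical analyzers") to quickly build a simple and efficient JSON parser.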

The complete code can be viewed and downloaded at (https://gitee.com/bitzhuwei/bitParser-demos/tree/master/bitzhuwei.JsonFormat.TestConsole).

The grammar of the JSON format

A detailed description of the JSON format can be found at (https://ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf ). From it we can derive the following grammar:

// Json grammar according to ECMA-404 2nd Edition / December 2017
Json = Object | Array ;
Object = '{' '}' | '{' Members '}' ;
Array = '[' ']' | '[' Elements ']' ;
Members = Members ',' Member | Member ;
Elements = Elements ',' Element | Element ;
Member = 'string' ':' Value ;
Element = Value ;
Value = 'null' | 'true' | 'false' | 'number' | 'string'
      | Object | Array ;

%%"([^"\\\u0000-\u001F]|\\["\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"%% 'string'
%%[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?%% 'number'

In fact, I had an AI draft this grammar and then tidied it up myself.

This grammar states that:

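  1. A Json is either an Object or an Array.

  2. An Object contains zero or more key-value pairs ("key" : value), enclosed in { }.

  3. An Array contains zero or more values, enclosed in [ ].

  4. A value has one of the following types: null, true, false, number, string, Object, Array.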

Where:

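null, true and false mean exactly what they say, so their token definitions can be omitted. Written out explicitly in the grammar, they would look like this: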

%%null%% 'null'
%%true%% 'true'
%%false%% 'false'

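{, }, [, ], the comma and the colon are likewise literal, so their definitions can also be omitted. Written out explicitly in the grammar, they would look like this: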

%%{%% '{'
%%}%% '}'
%%[%% '['
%%]%% ']'
%%,%% ','
%%:%% ':'

number can be described by the following diagram:

[Figure: syntax diagram of number]

The diagram shows that the regular expression for the number token consists of four parts in sequence:

[-]?  (0|[1-9][0-9]*)  ([.][0-9]+)?  ([eE][+-]?[0-9]+)? 
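As a quick sanity check of the four-part pattern, here is a minimal sketch that tests it with .NET's Regex class. The names JsonNumberPattern and IsJsonNumber are made up for this illustration and are not part of bitParser. The \A and \z anchors are an addition so that the whole input has to match.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not generated by bitParser.
public static class JsonNumberPattern {
    // The four parts, in order: sign, integer part, fraction, exponent.
    // \A and \z (added here) anchor the match to the entire input.
    private static readonly Regex number = new Regex(
        @"\A[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?\z");

    public static bool IsJsonNumber(string text) => number.IsMatch(text);
}
```

Under this pattern, inputs such as 0, -12.5 and 1e10 are accepted, while 01, 1., .5 and +1 are rejected, because the integer part is mandatory and may not have a leading zero or a plus sign.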

string can be described by the following diagram:

[Figure: syntax diagram of string]

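The diagram shows that the regular expression for the string token is a sequence of ordinary characters or escape sequences wrapped in double quotes:

" (  [^"\\\u0000-\u001F]  |  \\["\\/bfnrt]  |  \\u[0-9A-Fa-f]{4}  )*  "
/* Actual meaning:
   any character except ", \ and the control characters (\u0000-\u001F)
   \", \\, \/, \b, \f, \n, \r, \t
   \uNNNN
*/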

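The string pattern can be tried out the same way. The sketch below assumes the pattern with all of its backslash escapes written out in full, and wraps it in .NET's Regex class. JsonStringPattern and IsJsonString are hypothetical names used only here, and the \A and \z anchors are again an addition.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not generated by bitParser.
public static class JsonStringPattern {
    // In a C# verbatim string, "" stands for one double quote, and the
    // backslashes reach the regex engine unchanged.
    private static readonly Regex str = new Regex(
        @"\A""([^""\\\u0000-\u001F]|\\[""\\/bfnrt]|\\u[0-9A-Fa-f]{4})*""\z");

    // The argument is the raw token text, including the surrounding quotes.
    public static bool IsJsonString(string text) => str.IsMatch(text);
}
```

A token containing the two characters \ and n is accepted, whereas a token containing a raw line feed, an unknown escape such as \x, or a \u escape with fewer than four hex digits is rejected.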
The Object and Array alternatives in the Value rule mean that data in JSON can be nested.
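To make the nesting concrete, here is a minimal sketch of a recursive data model that mirrors the grammar. All type names (Node, ScalarNode, ArrayNode, ObjectNode, NodeDepth) are invented for this illustration; the types that bitParser actually generates may look quite different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data model for illustration; not bitParser's generated types.
public abstract class Node { }

// null, true, false, number or string, kept as raw text.
public sealed class ScalarNode : Node {
    public string Text { get; }
    public ScalarNode(string text) { Text = text; }
}

// Array = '[' ']' | '[' Elements ']'
public sealed class ArrayNode : Node {
    public List<Node> Elements { get; } = new List<Node>();
}

// Object = '{' '}' | '{' Members '}'
public sealed class ObjectNode : Node {
    public List<KeyValuePair<string, Node>> Members { get; } =
        new List<KeyValuePair<string, Node>>();
}

public static class NodeDepth {
    // A scalar has depth 0; a container has depth 1 + its deepest child.
    public static int Depth(Node node) {
        int deepest = 0;
        switch (node) {
            case ArrayNode array:
                foreach (var element in array.Elements)
                    deepest = Math.Max(deepest, Depth(element));
                return 1 + deepest;
            case ObjectNode obj:
                foreach (var member in obj.Members)
                    deepest = Math.Max(deepest, Depth(member.Value));
                return 1 + deepest;
            default:
                return 0;
        }
    }
}
```

Because ArrayNode and ObjectNode hold Node children, a value can contain containers to any depth, exactly as the Value rule allows.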

Feeding this grammar to bitParser as input generates, in one step, the JSON parser code and documentation described in the following sections.

The generated lexical analyzer

DFA

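The DFA folder contains all the lexical states of the lexical analyzer, generated according to the theory of deterministic finite automata.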

Initial state lexicalState0
using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        private static readonly Action<LexicalContext, char, CurrentStateWrap> lexicalState0 =
        static (context, c, wrap) => {
            if (false) { /* for simpler code generation purpose. */ }
            /* user-input condition code */
            /* [1-9] */
            else if (/* possible Vt : 'number' */
            /* no possible signal */
            /* [xxx] scope */
            '1'/*'\u0031'(49)*/ <= c && c <= '9'/*'\u0039'(57)*/) {
                BeginToken(context);
                ExtendToken(context, st.@number);
                wrap.currentState = lexicalState1;
            }
            /* user-input condition code */
            /* 0 */
            else if (/* possible Vt : 'number' */
            /* no possible signal */
            /* single char */
            c == '0'/*'\u0030'(48)*/) {
                BeginToken(context);
                ExtendToken(context, st.@number);
                wrap.currentState = lexicalState2;
            }
            /* user-input condition code */
            /* [-] */
            else if (/* possible Vt : 'number' */
            /* no possible signal */
            /* [xxx] scope */
            c == '-'/*'\u002D'(45)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState3;
            }
            /* user-input condition code */
            /* " */
            else if (/* possible Vt : 'string' */
            /* no possible signal */
            /* single char */
            c == '"'/*'\u0022'(34)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState4;
            }
            /* user-input condition code */
            /* f */
            else if (/* possible Vt : 'false' */
            /* no possible signal */
            /* single char */
            c == 'f'/*'\u0066'(102)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState5;
            }
            /* user-input condition code */
            /* t */
            else if (/* possible Vt : 'true' */
            /* no possible signal */
            /* single char */
            c == 't'/*'\u0074'(116)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState6;
            }
            /* user-input condition code */
            /* n */
            else if (/* possible Vt : 'null' */
            /* no possible signal */
            /* single char */
            c == 'n'/*'\u006E'(110)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState7;
            }
            /* user-input condition code */
            /* : */
            else if (/* possible Vt : ':' */
            /* no possible signal */
            /* single char */
            c == ':'/*'\u003A'(58)*/) {
                BeginToken(context);
                ExtendToken(context, st.@Colon符);
                wrap.currentState = lexicalState8;
            }
            /* user-input condition code */
            /* , */
            else if (/* possible Vt : ',' */
            /* no possible signal */
            /* single char */
            c == ','/*'\u002C'(44)*/) {
                BeginToken(context);
                ExtendToken(context, st.@Comma符);
                wrap.currentState = lexicalState9;
            }
            /* user-input condition code */
            /* ] */
            else if (/* possible Vt : ']' */
            /* no possible signal */
            /* single char */
            c == ']'/*'\u005D'(93)*/) {
                BeginToken(context);
                ExtendToken(context, st.@RightBracket符);
                wrap.currentState = lexicalState10;
            }
            /* user-input condition code */
            /* [ */
            else if (/* possible Vt : '[' */
            /* no possible signal */
            /* single char */
            c == '['/*'\u005B'(91)*/) {
                BeginToken(context);
                ExtendToken(context, st.@LeftBracket符);
                wrap.currentState = lexicalState11;
            }
            /* user-input condition code */
            /* } */
            else if (/* possible Vt : '}' */
            /* no possible signal */
            /* single char */
            c == '}'/*'\u007D'(125)*/) {
                BeginToken(context);
                ExtendToken(context, st.@RightBrace符);
                wrap.currentState = lexicalState12;
            }
            /* user-input condition code */
            /* { */
            else if (/* possible Vt : '{' */
            /* no possible signal */
            /* single char */
            c == '{'/*'\u007B'(123)*/) {
                BeginToken(context);
                ExtendToken(context, st.@LeftBrace符);
                wrap.currentState = lexicalState13;
            }
            /* deal with everything else. */
            else if (c == ' ' || c == '\r' || c == '\n' || c == '\t' || c == '\0') {
                wrap.currentState = lexicalState0; // skip them.
            }
            else { // unexpected char.
                BeginToken(context);
                ExtendToken(context);
                AcceptToken(st.Error错, context);
                wrap.currentState = lexicalState0;
            }
        };
    }
}

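The implementation in the DFA folder is the original and most intuitive one. It has since been superseded by more efficient implementations, and the folder is now kept for study and reference only. I therefore changed the extension of its C# files from cs to cs_ so that they are not compiled.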
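The idea of one function per lexical state can be shown on a much smaller scale. The sketch below is not bitParser code: FunctionStateSketch, State and IsInteger are hypothetical names, and the automaton only recognizes the integer part 0|[1-9][0-9]* of a number.

```csharp
using System;

// Hypothetical miniature of the "one function per state" idea; not bitParser code.
public static class FunctionStateSketch {
    // A state maps the current character to the next state; null means "no transition".
    public delegate State State(char c);

    private static readonly State start = c =>
        c == '0' ? zero : ('1' <= c && c <= '9') ? digits : null;

    // Nothing may follow a lone 0.
    private static readonly State zero = c => null;

    // After a leading 1-9, any further digits are allowed.
    private static readonly State digits = c =>
        ('0' <= c && c <= '9') ? digits : null;

    public static bool IsInteger(string text) {
        State state = start;
        foreach (char c in text) {
            state = state(c);
            if (state == null) return false;
        }
        // zero and digits are the accepting states.
        return state == zero || state == digits;
    }
}
```

Each state is a delegate stored in a static field, which is easy to read and debug but costs one delegate object and one compiled method per state.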

miniDFA

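The miniDFA folder contains all the lexical states of the minimized finite automaton obtained with Hopcroft's algorithm. It differs from the DFA only in that the number of lexical states may be smaller.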
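For readers who want to see what minimization does, here is a minimal sketch. Note that it uses plain Moore-style partition refinement rather than Hopcroft's worklist algorithm; it finds the same groups of equivalent states but is less efficient. The array-based DFA representation and the name DfaMinimizerSketch are assumptions made for this example and do not come from bitParser.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: Moore-style partition refinement, not bitParser's implementation.
public static class DfaMinimizerSketch {
    // next[s][a] is the target of state s on input symbol a, or -1 if there is none.
    // accept[s] is 0 for non-accepting states and a positive token id for accepting ones.
    // Returns block[s]: states with the same block number are equivalent and can be merged.
    // Unreachable states are not removed by this sketch.
    public static int[] Minimize(int[][] next, int[] accept) {
        int n = next.Length;
        int[] block = (int[])accept.Clone();          // initial partition: by accepted token
        int count = new HashSet<int>(accept).Count;   // number of blocks so far
        while (true) {
            var ids = new Dictionary<string, int>();
            var refined = new int[n];
            for (int s = 0; s < n; s++) {
                // Signature: own block plus the block reached on every symbol.
                var key = new StringBuilder().Append(block[s]);
                foreach (int target in next[s]) {
                    key.Append(',').Append(target < 0 ? -1 : block[target]);
                }
                string signature = key.ToString();
                if (!ids.TryGetValue(signature, out int id)) {
                    id = ids.Count;
                    ids.Add(signature, id);
                }
                refined[s] = id;
            }
            if (ids.Count == count) return refined;   // no block was split: stable
            count = ids.Count;
            block = refined;
        }
    }
}
```

For example, in a three-state DFA where state 0 goes to state 1 on one symbol and to state 2 on another, and states 1 and 2 both accept the same token with no outgoing transitions, states 1 and 2 end up in the same block, so the minimized automaton has two states instead of three.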

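It is the second implementation, and it too has been superseded by more efficient ones. The folder is now kept for study and reference only, so I changed the extension of its C# files from cs to cs_ so that they are not compiled.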

tableDFA

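The tableDFA folder holds the miniDFA in two-dimensional array form (ElseIf[][]). It represents the same content as miniDFA; the difference is that miniDFA uses one function (Action<LexicalContext, char, CurrentStateWrap>) per lexical state, whereas tableDFA uses one array (ElseIf[]) per lexical state. This reduces memory usage.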
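The array-per-state idea can be sketched as follows. Segment is a deliberately simplified, hypothetical stand-in for bitParser's ElseIf (which also carries actions and token types), and the tiny table only recognizes the literal true.

```csharp
using System;

// Hypothetical, simplified stand-in for ElseIf; not bitParser's real type.
public readonly struct Segment {
    public readonly char Min, Max;   // inclusive character range
    public readonly int NextState;
    public Segment(char min, char max, int nextState) {
        Min = min; Max = max; NextState = nextState;
    }
}

public static class TableLexerSketch {
    // One row per lexical state; each row lists that state's outgoing character ranges.
    // States 0..3 expect t, r, u, e in turn; state 4 is the accepting state.
    public static readonly Segment[][] Rows = {
        new[] { new Segment('t', 't', 1) },
        new[] { new Segment('r', 'r', 2) },
        new[] { new Segment('u', 'u', 3) },
        new[] { new Segment('e', 'e', 4) },
        new Segment[0],
    };

    // Returns the next state for (state, c), or -1 if no segment covers c.
    public static int Next(int state, char c) {
        foreach (var segment in Rows[state]) {
            if (segment.Min <= c && c <= segment.Max) return segment.NextState;
        }
        return -1;
    }

    public static bool Matches(string text) {
        int state = 0;
        foreach (char c in text) {
            state = Next(state, c);
            if (state < 0) return false;
        }
        return state == 4;
    }
}
```

All states now share one generic lookup loop, and the automaton itself is plain data. This also means the table can be stored outside the source code.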

miniDFA in two-dimensional array form
using System; using bitzhuwei.Compiler;  namespace bitzhuwei.JsonFormat {     partial class CompilerJson {         private static readonly ElseIf[] omitChars = new ElseIf[] {                 new('u0000'/*(0)*/, nextStateId: 0, Acts.None),                 new('t'/*'u0009'(9)*/, 'n'/*'u000A'(10)*/, nextStateId: 0, Acts.None),                 new('r'/*'u000D'(13)*/, nextStateId: 0, Acts.None),                 new(' '/*'u0020'(32)*/, nextStateId: 0, Acts.None),          };         private static readonly ElseIf[][] lexiStates = new ElseIf[47][];         static void InitializeLexiTable() {             ElseIf segment_48_48_25_3_ints_number = new('0'/*'u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number);//refered 2 times             ElseIf segment_49_57_24_3_ints_number = new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number);//refered 2 times             ElseIf segment_48_57_37_2_ints_number = new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 37, Acts.Extend, st.@number);//refered 3 times             ElseIf segment_48_57_38_2_ints_number = new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 38, Acts.Extend, st.@number);//refered 2 times             ElseIf segment_48_57_44_2_ints_number = new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 44, Acts.Extend, st.@number);//refered 3 times             ElseIf segment_48_57_45_2_ints_number = new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 45, Acts.Extend, st.@number);//refered 2 times             ElseIf segment_46_46_8_0 = new('.'/*'u002E'(46)*/, 8, Acts.None);//refered 9 times             ElseIf segment_48_48_33_2_ints_number = new('0'/*'u0030'(48)*/, 33, Acts.Extend, st.@number);//refered 2 times             ElseIf segment_49_57_32_2_ints_number = new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 32, Acts.Extend, st.@number);//refered 2 times             ElseIf segment_69_69_7_0 = new('E'/*'u0045'(69)*/, 7, Acts.None);//refered 11 times             ElseIf segment_101_101_7_0 = new('e'/*'u0065'(101)*/, 7, 
Acts.None);//refered 11 times             ElseIf segment_0_65535_0_4_ints_number = new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number);//refered 13 times             ElseIf segment_48_48_40_2_ints_number = new('0'/*'u0030'(48)*/, 40, Acts.Extend, st.@number);//refered 3 times             ElseIf segment_49_57_39_2_ints_number = new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 39, Acts.Extend, st.@number);//refered 3 times             ElseIf segment_48_57_41_2_ints_number = new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 41, Acts.Extend, st.@number);//refered 2 times             lexiStates[0] = new ElseIf[] {             // possible Vt: 'string'             /*0*/new('"'/*'u0022'(34)*/, 2, Acts.Begin),             // possible Vt: ','             /*1*/new(','/*'u002C'(44)*/, 27, Acts.Begin | Acts.Extend, st.@Comma符),             // possible Vt: 'number'             /*2*/new('-'/*'u002D'(45)*/, 1, Acts.Begin),             // possible Vt: 'number'             /*3*///new('0'/*'u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number),             /*3*/segment_48_48_25_3_ints_number,             // possible Vt: 'number'             /*4*///new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number),             /*4*/segment_49_57_24_3_ints_number,             // possible Vt: ':'             /*5*/new(':'/*'u003A'(58)*/, 26, Acts.Begin | Acts.Extend, st.@Colon符),             // possible Vt: '['             /*6*/new('['/*'u005B'(91)*/, 29, Acts.Begin | Acts.Extend, st.@LeftBracket符),             // possible Vt: ']'             /*7*/new(']'/*'u005D'(93)*/, 28, Acts.Begin | Acts.Extend, st.@RightBracket符),             // possible Vt: 'false'             /*8*/new('f'/*'u0066'(102)*/, 3, Acts.Begin),             // possible Vt: 'null'             /*9*/new('n'/*'u006E'(110)*/, 5, Acts.Begin),             // possible Vt: 'true'             /*10*/new('t'/*'u0074'(116)*/, 4, Acts.Begin),             // possible Vt: '{'             
/*11*/new('{'/*'u007B'(123)*/, 31, Acts.Begin | Acts.Extend, st.@LeftBrace符),             // possible Vt: '}'             /*12*/new('}'/*'u007D'(125)*/, 30, Acts.Begin | Acts.Extend, st.@RightBrace符),             };             lexiStates[1] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number),             segment_48_48_25_3_ints_number,             // possible Vt: 'number'             //new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number),             segment_49_57_24_3_ints_number,             };             lexiStates[2] = new ElseIf[] {             // possible Vt: 'string'             new(' '/*'u0020'(32)*/, '!'/*'u0021'(33)*/, 2, Acts.None),             // possible Vt: 'string'             new('"'/*'u0022'(34)*/, 36, Acts.Extend, st.@string),             // possible Vt: 'string'             new('#'/*'u0023'(35)*/, '['/*'u005B'(91)*/, 2, Acts.None),             // possible Vt: 'string'             new('\'/*'u005C'(92)*/, 9, Acts.None),             // possible Vt: 'string'             new(']'/*'u005D'(93)*/, 'uFFFF'/*�(65535)*/, 2, Acts.None),             };             lexiStates[3] = new ElseIf[] {             // possible Vt: 'false'             new('a'/*'u0061'(97)*/, 10, Acts.None),             };             lexiStates[4] = new ElseIf[] {             // possible Vt: 'true'             new('r'/*'u0072'(114)*/, 6, Acts.None),             };             lexiStates[5] = new ElseIf[] {             // possible Vt: 'null'             new('u'/*'u0075'(117)*/, 11, Acts.None),             };             lexiStates[6] = new ElseIf[] {             // possible Vt: 'true'             new('u'/*'u0075'(117)*/, 18, Acts.None),             };             lexiStates[7] = new ElseIf[] {             // possible Vt: 'number'             new('+'/*'u002B'(43)*/, 12, Acts.None),             // possible Vt: 'number'             new('-'/*'u002D'(45)*/, 12, Acts.None),     
        // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 37, Acts.Extend, st.@number),             segment_48_57_37_2_ints_number,             };             lexiStates[8] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 38, Acts.Extend, st.@number),             segment_48_57_38_2_ints_number,             };             lexiStates[9] = new ElseIf[] {             // possible Vt: 'string'             new('"'/*'u0022'(34)*/, 2, Acts.None),             // possible Vt: 'string'             new('/'/*'u002F'(47)*/, 2, Acts.None),             // possible Vt: 'string'             new('\'/*'u005C'(92)*/, 2, Acts.None),             // possible Vt: 'string'             new('b'/*'u0062'(98)*/, 2, Acts.None),             // possible Vt: 'string'             new('f'/*'u0066'(102)*/, 2, Acts.None),             // possible Vt: 'string'             new('n'/*'u006E'(110)*/, 2, Acts.None),             // possible Vt: 'string'             new('r'/*'u0072'(114)*/, 2, Acts.None),             // possible Vt: 'string'             new('t'/*'u0074'(116)*/, 2, Acts.None),             // possible Vt: 'string'             new('u'/*'u0075'(117)*/, 13, Acts.None),             };             lexiStates[10] = new ElseIf[] {             // possible Vt: 'false'             new('l'/*'u006C'(108)*/, 17, Acts.None),             };             lexiStates[11] = new ElseIf[] {             // possible Vt: 'null'             new('l'/*'u006C'(108)*/, 19, Acts.None),             };             lexiStates[12] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 37, Acts.Extend, st.@number),             segment_48_57_37_2_ints_number,             };             lexiStates[13] = new ElseIf[] {             // possible Vt: 'string'             new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 14, Acts.None),             // possible Vt: 'string'             
new('A'/*'u0041'(65)*/, 'F'/*'u0046'(70)*/, 14, Acts.None),             // possible Vt: 'string'             new('a'/*'u0061'(97)*/, 'f'/*'u0066'(102)*/, 14, Acts.None),             };             lexiStates[14] = new ElseIf[] {             // possible Vt: 'string'             new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 15, Acts.None),             // possible Vt: 'string'             new('A'/*'u0041'(65)*/, 'F'/*'u0046'(70)*/, 15, Acts.None),             // possible Vt: 'string'             new('a'/*'u0061'(97)*/, 'f'/*'u0066'(102)*/, 15, Acts.None),             };             lexiStates[15] = new ElseIf[] {             // possible Vt: 'string'             new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 16, Acts.None),             // possible Vt: 'string'             new('A'/*'u0041'(65)*/, 'F'/*'u0046'(70)*/, 16, Acts.None),             // possible Vt: 'string'             new('a'/*'u0061'(97)*/, 'f'/*'u0066'(102)*/, 16, Acts.None),             };             lexiStates[16] = new ElseIf[] {             // possible Vt: 'string'             new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 2, Acts.None),             // possible Vt: 'string'             new('A'/*'u0041'(65)*/, 'F'/*'u0046'(70)*/, 2, Acts.None),             // possible Vt: 'string'             new('a'/*'u0061'(97)*/, 'f'/*'u0066'(102)*/, 2, Acts.None),             };             lexiStates[17] = new ElseIf[] {             // possible Vt: 'false'             new('s'/*'u0073'(115)*/, 22, Acts.None),             };             lexiStates[18] = new ElseIf[] {             // possible Vt: 'true'             new('e'/*'u0065'(101)*/, 42, Acts.Extend, st.@true),             };             lexiStates[19] = new ElseIf[] {             // possible Vt: 'null'             new('l'/*'u006C'(108)*/, 43, Acts.Extend, st.@null),             };             lexiStates[20] = new ElseIf[] {             // possible Vt: 'number'             new('+'/*'u002B'(43)*/, 23, Acts.None),             // possible Vt: 'number'             
new('-'/*'u002D'(45)*/, 23, Acts.None),             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 44, Acts.Extend, st.@number),             segment_48_57_44_2_ints_number,             };             lexiStates[21] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 45, Acts.Extend, st.@number),             segment_48_57_45_2_ints_number,             };             lexiStates[22] = new ElseIf[] {             // possible Vt: 'false'             new('e'/*'u0065'(101)*/, 46, Acts.Extend, st.@false),             };             lexiStates[23] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 44, Acts.Extend, st.@number),             segment_48_57_44_2_ints_number,             };             lexiStates[24] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, 33, Acts.Extend, st.@number),             segment_48_48_33_2_ints_number,             // possible Vt: 'number'             //new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 32, Acts.Extend, st.@number),             segment_49_57_32_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[25] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             new('0'/*'u0030'(48)*/, 35, 
Acts.Extend, st.@number),             // possible Vt: 'number'             new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 34, Acts.Extend, st.@number),             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[26] = new ElseIf[] {             // possible Vt: ':'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Colon符),             };             lexiStates[27] = new ElseIf[] {             // possible Vt: ','             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Comma符),             };             lexiStates[28] = new ElseIf[] {             // possible Vt: ']'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBracket符),             };             lexiStates[29] = new ElseIf[] {             // possible Vt: '['             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBracket符),             };             lexiStates[30] = new ElseIf[] {             // possible Vt: '}'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBrace符),             };             lexiStates[31] = new ElseIf[] {             // possible Vt: '{'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBrace符),             };             lexiStates[32] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, 40, Acts.Extend, st.@number),             segment_48_48_40_2_ints_number,             // possible Vt: 'number'             
//new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 39, Acts.Extend, st.@number),             segment_49_57_39_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[33] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, 33, Acts.Extend, st.@number),             segment_48_48_33_2_ints_number,             // possible Vt: 'number'             //new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 32, Acts.Extend, st.@number),             segment_49_57_32_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[34] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 41, Acts.Extend, st.@number),             segment_48_57_41_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, 
Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[35] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[36] = new ElseIf[] {             // possible Vt: 'string'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@string),             };             lexiStates[37] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 37, Acts.Extend, st.@number),             segment_48_57_37_2_ints_number,             // possible Vt: 'number'             new('E'/*'u0045'(69)*/, 20, Acts.None),             // possible Vt: 'number'             new('e'/*'u0065'(101)*/, 20, Acts.None),             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[38] = new ElseIf[] {             // possible Vt: 'number'             new('.'/*'u002E'(46)*/, 21, Acts.None),             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 38, Acts.Extend, st.@number),             segment_48_57_38_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,          
   // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[39] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, 40, Acts.Extend, st.@number),             segment_48_48_40_2_ints_number,             // possible Vt: 'number'             //new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 39, Acts.Extend, st.@number),             segment_49_57_39_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[40] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, 40, Acts.Extend, st.@number),             segment_48_48_40_2_ints_number,             // possible Vt: 'number'             //new('1'/*'u0031'(49)*/, '9'/*'u0039'(57)*/, 39, Acts.Extend, st.@number),             segment_49_57_39_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             
//new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[41] = new ElseIf[] {             // possible Vt: 'number'             //new('.'/*'u002E'(46)*/, 8, Acts.None),             segment_46_46_8_0,             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 41, Acts.Extend, st.@number),             segment_48_57_41_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[42] = new ElseIf[] {             // possible Vt: 'true'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@true),             };             lexiStates[43] = new ElseIf[] {             // possible Vt: 'null'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@null),             };             lexiStates[44] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 44, Acts.Extend, st.@number),             segment_48_57_44_2_ints_number,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[45] = new ElseIf[] {             // possible Vt: 'number'             //new('0'/*'u0030'(48)*/, '9'/*'u0039'(57)*/, 45, Acts.Extend, st.@number),             segment_48_57_45_2_ints_number,             // possible Vt: 'number'             //new('E'/*'u0045'(69)*/, 7, Acts.None),             segment_69_69_7_0,             // possible Vt: 
'number'             //new('e'/*'u0065'(101)*/, 7, Acts.None),             segment_101_101_7_0,             // possible Vt: 'number'             //new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),             segment_0_65535_0_4_ints_number,             };             lexiStates[46] = new ElseIf[] {             // possible Vt: 'false'             new('u0000'/*(0)*/, 'uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@false),             };         }     } } 

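It is the third implementation, and it too has been superseded by a more efficient one. The folder is now kept for study and reference only, so I changed the extension of its C# files from cs to cs_ so that they are not compiled.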

Json.LexiTable.gen.bin

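Here the miniDFA in two-dimensional array form (ElseIf[][]) is written to a binary file. When the JSON parser is loaded, reading this file yields the ElseIf[][] miniDFA again. The whole ElseIf[][] therefore no longer has to be hard-coded in the source, which further reduces memory usage.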
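A round trip of such a table through a binary stream might look like the sketch below. The file layout (row count, then per row a segment count followed by min, max and next-state values) and the names TableSegment and LexiTableFileSketch are assumptions for illustration; the real Json.LexiTable.gen.bin format is not described here and may well differ.

```csharp
using System.IO;
using System.Text;

// Hypothetical, simplified segment; the real ElseIf also stores actions and token types.
public struct TableSegment {
    public char Min, Max;
    public int NextState;
    public TableSegment(char min, char max, int nextState) {
        Min = min; Max = max; NextState = nextState;
    }
}

// Hypothetical file layout for illustration only.
public static class LexiTableFileSketch {
    public static void Save(Stream stream, TableSegment[][] rows) {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true)) {
            writer.Write(rows.Length);
            foreach (var row in rows) {
                writer.Write(row.Length);
                foreach (var segment in row) {
                    // Chars are stored as raw UTF-16 code units (2 bytes each).
                    writer.Write((ushort)segment.Min);
                    writer.Write((ushort)segment.Max);
                    writer.Write(segment.NextState);
                }
            }
        }
    }

    public static TableSegment[][] Load(Stream stream) {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true)) {
            var rows = new TableSegment[reader.ReadInt32()][];
            for (int r = 0; r < rows.Length; r++) {
                rows[r] = new TableSegment[reader.ReadInt32()];
                for (int s = 0; s < rows[r].Length; s++) {
                    char min = (char)reader.ReadUInt16();
                    char max = (char)reader.ReadUInt16();
                    rows[r][s] = new TableSegment(min, max, reader.ReadInt32());
                }
            }
            return rows;
        }
    }
}
```

Writing the characters as ushort values rather than with BinaryWriter.Write(char) avoids going through a text encoding, so range ends such as '\uFFFF' are stored as exactly two bytes.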

To make debugging and reference easier, I also provide a corresponding text version:

Json.LexiTable.gen.txt
ElseIf
4 omit chars:
0('\u0000'/*(0)*/->'\u0000'/*(0)*/)=>None,0
0('\t'/*'\u0009'(9)*/->'\n'/*'\u000A'(10)*/)=>None,0
0('\r'/*'\u000D'(13)*/->'\r'/*'\u000D'(13)*/)=>None,0
0(' '/*'\u0020'(32)*/->' '/*'\u0020'(32)*/)=>None,0

0 re-used int[] Vts:
0 re-used IfVt ifVt:
0 re-used IfVt[] ifVts:
15 re-used ElseIf2 segment:
25('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Begin, Extend,11
24('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Begin, Extend,11
37('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
38('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
44('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
45('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
8('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0
33('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
32('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
7('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,0
7('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,11
40('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
39('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
41('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
47 ElseIf2[] row:
LexiTable.Rows[0] has 13 segments:
2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Begin,0
27(','/*'\u002C'(44)*/->','/*'\u002C'(44)*/)=>Begin, Extend,5
1('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>Begin,0
-1
-2
26(':'/*'\u003A'(58)*/->':'/*'\u003A'(58)*/)=>Begin, Extend,7
29('['/*'\u005B'(91)*/->'['/*'\u005B'(91)*/)=>Begin, Extend,3
28(']'/*'\u005D'(93)*/->']'/*'\u005D'(93)*/)=>Begin, Extend,4
3('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>Begin,0
5('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>Begin,0
4('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>Begin,0
31('{'/*'\u007B'(123)*/->'{'/*'\u007B'(123)*/)=>Begin, Extend,1
30('}'/*'\u007D'(125)*/->'}'/*'\u007D'(125)*/)=>Begin, Extend,2

LexiTable.Rows[1] has 2 segments:
-1
-2

LexiTable.Rows[2] has 5 segments:
2(' '/*'\u0020'(32)*/->'!'/*'\u0021'(33)*/)=>None,0
36('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Extend,6
2('#'/*'\u0023'(35)*/->'['/*'\u005B'(91)*/)=>None,0
9('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,0
2(']'/*'\u005D'(93)*/->'\uFFFF'/*(65535)*/)=>None,0

LexiTable.Rows[3] has 1 segments:
10('a'/*'\u0061'(97)*/->'a'/*'\u0061'(97)*/)=>None,0

LexiTable.Rows[4] has 1 segments:
6('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0

LexiTable.Rows[5] has 1 segments:
11('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0

LexiTable.Rows[6] has 1 segments:
18('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0

LexiTable.Rows[7] has 3 segments:
12('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,0
12('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0
-3

LexiTable.Rows[8] has 1 segments:
-4

LexiTable.Rows[9] has 9 segments:
2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>None,0
2('/'/*'\u002F'(47)*/->'/'/*'\u002F'(47)*/)=>None,0
2('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,0
2('b'/*'\u0062'(98)*/->'b'/*'\u0062'(98)*/)=>None,0
2('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>None,0
2('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>None,0
2('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0
2('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>None,0
13('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0

LexiTable.Rows[10] has 1 segments:
17('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0

LexiTable.Rows[11] has 1 segments:
19('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0

LexiTable.Rows[12] has 1 segments:
-3

LexiTable.Rows[13] has 3 segments:
14('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
14('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
14('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[14] has 3 segments:
15('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
15('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
15('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[15] has 3 segments:
16('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
16('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
16('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[16] has 3 segments:
2('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
2('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
2('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[17] has 1 segments:
22('s'/*'\u0073'(115)*/->'s'/*'\u0073'(115)*/)=>None,0

LexiTable.Rows[18] has 1 segments:
42('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,9

LexiTable.Rows[19] has 1 segments:
43('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>Extend,8

LexiTable.Rows[20] has 3 segments:
23('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,0
23('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0
-5

LexiTable.Rows[21] has 1 segments:
-6

LexiTable.Rows[22] has 1 segments:
46('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,10

LexiTable.Rows[23] has 1 segments:
-5

LexiTable.Rows[24] has 6 segments:
-7
-8
-9
-10
-11
-12

LexiTable.Rows[25] has 6 segments:
-7
35('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
34('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
-10
-11
-12

LexiTable.Rows[26] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,7

LexiTable.Rows[27] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,5

LexiTable.Rows[28] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,4

LexiTable.Rows[29] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,3

LexiTable.Rows[30] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,2

LexiTable.Rows[31] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,1

LexiTable.Rows[32] has 6 segments:
-7
-13
-14
-10
-11
-12

LexiTable.Rows[33] has 6 segments:
-7
-8
-9
-10
-11
-12

LexiTable.Rows[34] has 5 segments:
-7
-15
-10
-11
-12

LexiTable.Rows[35] has 4 segments:
-7
-10
-11
-12

LexiTable.Rows[36] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,6

LexiTable.Rows[37] has 4 segments:
-3
20('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,0
20('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0
-12

LexiTable.Rows[38] has 5 segments:
21('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0
-4
-10
-11
-12

LexiTable.Rows[39] has 6 segments:
-7
-13
-14
-10
-11
-12

LexiTable.Rows[40] has 6 segments:
-7
-13
-14
-10
-11
-12

LexiTable.Rows[41] has 5 segments:
-7
-15
-10
-11
-12

LexiTable.Rows[42] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,9

LexiTable.Rows[43] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,8

LexiTable.Rows[44] has 2 segments:
-5
-12

LexiTable.Rows[45] has 4 segments:
-6
-10
-11
-12

LexiTable.Rows[46] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*(65535)*/)=>Accept,10

This is the fourth implementation, and it is the one currently in use. To simplify the load path, I moved it from the Json.genLexicalAnalyzer folder to the Json.gen folder.
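Each row of the table is a list of segments: an inclusive char range, the next lexical state, and the action to run. As a rough illustration of how such a row could be consulted at run time, here is a minimal sketch. The tuple layout and the `NextState` helper are hypothetical names of mine, not bitParser's actual types, and binary search is only one possible lookup strategy. The state numbers follow my reading of LexiTable.Rows[0], where the re-used segments -1 and -2 stand for '0' and '1'..'9'.

```csharp
using System;

// Hypothetical layout (not bitParser's real types): each segment is an
// inclusive char range plus the lexical state to enter next.
// The row is kept sorted by range so that it can be binary-searched.
(char min, char max, int nextState)[] row0 = {
    ('"', '"', 2),   // opening quote of a 'string'
    ('-', '-', 1),   // sign of a 'number'
    ('0', '0', 25),  // 'number' that starts with 0
    ('1', '9', 24),  // 'number' that starts with 1-9
    ('f', 'f', 3),   // first char of 'false'
};

// Returns the next state for c, or -1 when no segment covers c.
static int NextState((char min, char max, int nextState)[] row, char c) {
    int lo = 0, hi = row.Length - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (c < row[mid].min) { hi = mid - 1; }
        else if (c > row[mid].max) { lo = mid + 1; }
        else { return row[mid].nextState; }
    }
    return -1;
}

Console.WriteLine(NextState(row0, '7')); // 24
```

Storing ranges instead of one entry per char keeps a row short even when a segment covers the whole range '\u0000'..'\uFFFF', as the Accept rows do.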

Json.LexicalScripts.gen.cs

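These are the functions that any lexical state may call. They fall into three kinds: Begin, Extend, and Accept. Together they record a token's start position (Begin) and end position (Extend), then set its type, line, column, and other information and append it to the List<Token> tokens list (Accept).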

Json.LexicalScripts.gen.cs
using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        // this is where new <see cref="Token"/> starts.
        private static void BeginToken(LexicalContext context) {
            if (context.analyzingToken.type != AnalyzingToken.NotYet) {
                context.analyzingToken.Reset(index: context.result.Count, start: context.cursor);
            }
        }

        // extend value of current token(<see cref="LexicalContext.analyzingToken"/>)
        private static void ExtendToken(LexicalContext context, int Vt) {
            context.analyzingToken.ends[Vt] = context.cursor;
        }
        private static void ExtendToken2(LexicalContext context, params int[] Vts) {
            for (int i = 0; i < Vts.Length; i++) {
                var Vt = Vts[i];
                context.analyzingToken.ends[Vt] = context.cursor;
            }
        }
        private static void ExtendToken3(LexicalContext context, params IfVt[] ifVts) {
            for (int i = 0; i < ifVts.Length; i++) {
                var Vt = ifVts[i].Vt;
                context.analyzingToken.ends[Vt] = context.cursor;
            }
        }

        // accept current Token
        // set Token.type and neutralize the last LexicalContext.MoveForward()
        private static void AcceptToken(LexicalContext context, int Vt) {
            var startIndex = context.analyzingToken.start.index;
            var end = context.analyzingToken.ends[Vt];
            context.analyzingToken.value = context.sourceCode.Substring(
                startIndex, end.index - startIndex + 1);
            context.analyzingToken.type = Vt;

            // cancel forward steps for post-regex
            var backStep = context.cursor.index - end.index;
            if (backStep > 0) { context.MoveBack(backStep); }
            // next operation: LexicalContext.MoveForward();

            var token = context.analyzingToken.Dump(
#if DEBUG
                context.stArray,
#endif
                end);
            context.result.Add(token);
            // no comment to skip
            context.lastSyntaxValidToken = token;
            if (token.type == st.Error错) {
                context.result.token2ErrorInfo.Add(token,
                    new TokenErrorInfo(token, "token type unrecognized!"));
            }
        }
        private static void AcceptToken2(LexicalContext context, params int[] Vts) {
            AcceptToken(context, Vts[0]);
        }
        private static void AcceptToken3(LexicalContext context, params IfVt[] ifVts) {
            var typeSet = false;
            int lastType = st.@终;
            if (context.lastSyntaxValidToken != null) {
                lastType = context.lastSyntaxValidToken.type;
            }
            for (var i = 0; i < ifVts.Length; i++) {
                var ifVt = ifVts[i];
                if (ifVt.signalCondition == context.signalCondition
                 // if preVt is string.Empty, let's use the first type.
                 // otherwise, preVt must be the lastType.
                 && (ifVt.preVt == st.@终 // default preVt
                  || ifVt.preVt == lastType)) { // <'Vt'>
                    context.analyzingToken.type = ifVt.Vt;
                    if (ifVt.nextSignal != null) { context.signalCondition = ifVt.nextSignal; }
                    typeSet = true;
                    break;
                }
            }
            if (!typeSet) {
                for (var i = 0; i < ifVts.Length; i++) {
                    var ifVt = ifVts[i];
                    if (// ignore signal condition and try to assign a type.
                        // if preVt is string.Empty, let's use the first type.
                        // otherwise, preVt must be the lastType.
                        (ifVt.preVt == st.@终 // default preVt
                      || ifVt.preVt == lastType)) { // <'Vt'>
                        context.analyzingToken.type = ifVt.Vt;
                        context.signalCondition = LexicalContext.defaultSignal;
                        typeSet = true;
                        break;
                    }
                }
            }

            var startIndex = context.analyzingToken.start.index;
            var end = context.analyzingToken.start;
            if (!typeSet) {
                // we failed to assign type according to lexi statements.
                // this indicates token error in source code or inappropriate lexi statements.
                //throw new Exception("Algorithm error: token type not set!");
                context.analyzingToken.type = st.Error错;
                context.signalCondition = LexicalContext.defaultSignal;
                // choose longest value
                for (int i = 0; i < context.analyzingToken.ends.Length; i++) {
                    var item = context.analyzingToken.ends[i];
                    if (end.index < item.index) { end = item; }
                }
            }
            else { end = context.analyzingToken.ends[context.analyzingToken.type]; }
            context.analyzingToken.value = context.sourceCode.Substring(startIndex, end.index - startIndex + 1);

            // cancel forward steps for post-regex
            var backStep = context.cursor.index - end.index;
            if (backStep > 0) { context.MoveBack(backStep); }
            // next operation: context.MoveForward();

            var token = context.analyzingToken.Dump(
#if DEBUG
                context.stArray,
#endif
                end);
            context.result.Add(token);
            // no comment to skip
            context.lastSyntaxValidToken = token;
            if (token.type == st.Error错) {
                context.result.token2ErrorInfo.Add(token,
                    new TokenErrorInfo(token, "token type unrecognized!"));
            }
        }
    }
}
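To make the interplay of Begin, Extend, and Accept easier to follow, here is a deliberately tiny sketch of the same protocol. It is not the generated lexer: the `Scan` function and its local variables are hypothetical, and it only recognizes runs of digits and the ',' token. What it shares with AcceptToken is the idea that the char which reveals the end of a token has only been looked at, so the cursor steps back to just after the recorded end before scanning continues.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature lexer: digit runs and ',' only; other chars are skipped.
static List<string> Scan(string source) {
    var tokens = new List<string>();
    int start = -1; // set by Begin; -1 means no token is being analyzed
    int end = -1;   // set by Extend; last index at which the token was still valid
    int cursor = 0;
    // cursor == source.Length acts as an end-of-input char that flushes the last token
    while (cursor <= source.Length) {
        char c = cursor < source.Length ? source[cursor] : '\0';
        bool isDigit = '0' <= c && c <= '9';
        if (start < 0) {
            if (isDigit) { start = cursor; end = cursor; } // Begin + Extend
            else if (c == ',') { tokens.Add(","); }        // single-char token
            cursor++;
        }
        else if (isDigit) {
            end = cursor; // Extend
            cursor++;
        }
        else {
            // Accept: the value spans start..end inclusive
            tokens.Add(source.Substring(start, end - start + 1));
            // c was only read to discover that the number had ended;
            // step back so it is scanned again from the initial state
            cursor = end + 1;
            start = -1;
        }
    }
    return tokens;
}

Console.WriteLine(string.Join(" | ", Scan("12,3"))); // 12 | , | 3
```

In the generated code, the end position is tracked per token type (`ends[Vt]`), which lets one scan keep several candidate types alive at once; this sketch tracks a single end for brevity.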

Json.LexicalReservedWords.gen.cs

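This file records all reserved words of the Json grammar (the counterpart of keywords in a programming language), namely '{', '}', '[', ']', ',', ':', 'null', 'true', and 'false'. It is only a helper, so there is no need to dwell on it.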

Json.LexicalReservedWords.gen.cs
using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {

        public static class reservedWord {
            /// <summary>
            /// {
            /// </summary>
            public const string @LeftBrace符 = "{";
            /// <summary>
            /// }
            /// </summary>
            public const string @RightBrace符 = "}";
            /// <summary>
            /// [
            /// </summary>
            public const string @LeftBracket符 = "[";
            /// <summary>
            /// ]
            /// </summary>
            public const string @RightBracket符 = "]";
            /// <summary>
            /// ,
            /// </summary>
            public const string @Comma符 = ",";
            /// <summary>
            /// :
            /// </summary>
            public const string @Colon符 = ":";
            /// <summary>
            /// null
            /// </summary>
            public const string @null = "null";
            /// <summary>
            /// true
            /// </summary>
            public const string @true = "true";
            /// <summary>
            /// false
            /// </summary>
            public const string @false = "false";

        }

        /// <summary>
        /// if <paramref name="token"/> is a reserved word, assign the corresponding type and return true.
        /// <para>otherwise, return false.</para>
        /// </summary>
        /// <param name="token"></param>
        /// <returns></returns>
        private static bool CheckReservedWord(AnalyzingToken token) {
            bool isReservedWord = true;
            switch (token.value) {
            case reservedWord.@LeftBrace符: token.type = st.@LeftBrace符; break;
            case reservedWord.@RightBrace符: token.type = st.@RightBrace符; break;
            case reservedWord.@LeftBracket符: token.type = st.@LeftBracket符; break;
            case reservedWord.@RightBracket符: token.type = st.@RightBracket符; break;
            case reservedWord.@Comma符: token.type = st.@Comma符; break;
            case reservedWord.@Colon符: token.type = st.@Colon符; break;
            case reservedWord.@null: token.type = st.@null; break;
            case reservedWord.@true: token.type = st.@true; break;
            case reservedWord.@false: token.type = st.@false; break;

            default: isReservedWord = false; break;
            }

            return isReservedWord;
        }
    }
}

README.gen.md

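This is the documentation for the lexical analyzer. It uses mermaid to draw the state machine of each token and the combined state machine of the whole grammar, as shown in the figure below.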

[Figure: mermaid state-machine diagrams from README.gen.md]

I know you can't make it out. Neither can I. Find a big screen and open README.gen.md directly.

The generated syntax parser


Dictionary<int, LRParseAction>

Json.Dict.LALR(1).gen.cs_ is the LALR(1) syntax state machine; each syntax state is a Dictionary<int, LRParseAction> object.

Json.Dict.LALR(1).gen.cs_
using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {

        private static Dictionary<int, LRParseAction>[] InitializeSyntaxStates() {
            const int syntaxStateCount = 29;
            var states = new Dictionary<int, LRParseAction>[syntaxStateCount];
            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            #region create objects of syntax states
            states[0] = new(capacity: 5);
            states[1] = new(capacity: 1);
            states[2] = new(capacity: 1);
            states[3] = new(capacity: 1);
            states[4] = new(capacity: 4);
            states[5] = new(capacity: 13);
            states[6] = new(capacity: 4);
            states[7] = new(capacity: 2);
            states[8] = new(capacity: 2);
            states[9] = new(capacity: 1);
            states[10] = new(capacity: 4);
            states[11] = new(capacity: 2);
            states[12] = new(capacity: 2);
            states[13] = new(capacity: 2);
            states[14] = new(capacity: 3);
            states[15] = new(capacity: 3);
            states[16] = new(capacity: 3);
            states[17] = new(capacity: 3);
            states[18] = new(capacity: 3);
            states[19] = new(capacity: 3);
            states[20] = new(capacity: 3);
            states[21] = new(capacity: 4);
            states[22] = new(capacity: 2);
            states[23] = new(capacity: 10);
            states[24] = new(capacity: 4);
            states[25] = new(capacity: 11);
            states[26] = new(capacity: 2);
            states[27] = new(capacity: 2);
            states[28] = new(capacity: 2);
            #endregion create objects of syntax states

            #region re-used actions
            LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// referred 4 times
            LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// referred 4 times
            LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// referred 2 times
            LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// referred 2 times
            LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// referred 3 times
            LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// referred 3 times
            LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// referred 3 times
            LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// referred 3 times
            LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// referred 3 times
            LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// referred 3 times
            LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// referred 3 times
            LRParseAction aReduce2 = new(regulations[2]);// referred 4 times
            LRParseAction aReduce7 = new(regulations[7]);// referred 2 times
            LRParseAction aReduce4 = new(regulations[4]);// referred 4 times
            LRParseAction aReduce9 = new(regulations[9]);// referred 2 times
            LRParseAction aReduce11 = new(regulations[11]);// referred 2 times
            LRParseAction aReduce12 = new(regulations[12]);// referred 3 times
            LRParseAction aReduce13 = new(regulations[13]);// referred 3 times
            LRParseAction aReduce14 = new(regulations[14]);// referred 3 times
            LRParseAction aReduce15 = new(regulations[15]);// referred 3 times
            LRParseAction aReduce16 = new(regulations[16]);// referred 3 times
            LRParseAction aReduce17 = new(regulations[17]);// referred 3 times
            LRParseAction aReduce18 = new(regulations[18]);// referred 3 times
            LRParseAction aReduce3 = new(regulations[3]);// referred 4 times
            LRParseAction aReduce5 = new(regulations[5]);// referred 4 times
            LRParseAction aReduce6 = new(regulations[6]);// referred 2 times
            LRParseAction aReduce10 = new(regulations[10]);// referred 2 times
            LRParseAction aReduce8 = new(regulations[8]);// referred 2 times
            #endregion re-used actions

            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            #region init actions of syntax states
            // syntaxStates[0]:
            // [-1] Json' : ⏳ Json ;☕ '¥'
            // [0] Json : ⏳ Object ;☕ '¥'
            // [1] Json : ⏳ Array ;☕ '¥'
            // [2] Object : ⏳ '{' '}' ;☕ '¥'
            // [3] Object : ⏳ '{' Members '}' ;☕ '¥'
            // [4] Array : ⏳ '[' ']' ;☕ '¥'
            // [5] Array : ⏳ '[' Elements ']' ;☕ '¥'
            /*0*/states[0].Add(st.Json枝, new(LRParseAction.Kind.Goto, states[1]));
            /*1*/states[0].Add(st.Object枝, new(LRParseAction.Kind.Goto, states[2]));
            /*2*/states[0].Add(st.Array枝, new(LRParseAction.Kind.Goto, states[3]));
            /*3*/states[0].Add(st.@LeftBrace符, aShift4);
            /*4*/states[0].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[1]:
            // [-1] Json' : Json ⏳ ;☕ '¥'
            /*5*/states[1].Add(st.@终, LRParseAction.accept);
            // syntaxStates[2]:
            // [0] Json : Object ⏳ ;☕ '¥'
            /*6*/states[2].Add(st.@终, new(regulations[0]));
            // syntaxStates[3]:
            // [1] Json : Array ⏳ ;☕ '¥'
            /*7*/states[3].Add(st.@终, new(regulations[1]));
            // syntaxStates[4]:
            // [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥'
            // [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥'
            // [6] Members : ⏳ Members ',' Member ;☕ ',' '}'
            // [7] Members : ⏳ Member ;☕ ',' '}'
            // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'
            /*8*/states[4].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[6]));
            /*9*/states[4].Add(st.Members枝, new(LRParseAction.Kind.Goto, states[7]));
            /*10*/states[4].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[8]));
            /*11*/states[4].Add(st.@string, aShift9);
            // syntaxStates[5]:
            // [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥'
            // [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥'
            // [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']'
            // [9] Elements : ⏳ Element ;☕ ',' ']'
            // [11] Element : ⏳ Value ;☕ ',' ']'
            // [12] Value : ⏳ 'null' ;☕ ',' ']'
            // [13] Value : ⏳ 'true' ;☕ ',' ']'
            // [14] Value : ⏳ 'false' ;☕ ',' ']'
            // [15] Value : ⏳ 'number' ;☕ ',' ']'
            // [16] Value : ⏳ 'string' ;☕ ',' ']'
            // [17] Value : ⏳ Object ;☕ ',' ']'
            // [18] Value : ⏳ Array ;☕ ',' ']'
            // [2] Object : ⏳ '{' '}' ;☕ ',' ']'
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'
            // [4] Array : ⏳ '[' ']' ;☕ ',' ']'
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'
            /*12*/states[5].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[10]));
            /*13*/states[5].Add(st.Elements枝, new(LRParseAction.Kind.Goto, states[11]));
            /*14*/states[5].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[12]));
            /*15*/states[5].Add(st.Value枝, aGoto13);
            /*16*/states[5].Add(st.@null, aShift14);
            /*17*/states[5].Add(st.@true, aShift15);
            /*18*/states[5].Add(st.@false, aShift16);
            /*19*/states[5].Add(st.@number, aShift17);
            /*20*/states[5].Add(st.@string, aShift18);
            /*21*/states[5].Add(st.Object枝, aGoto19);
            /*22*/states[5].Add(st.Array枝, aGoto20);
            /*23*/states[5].Add(st.@LeftBrace符, aShift4);
            /*24*/states[5].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[6]:
            // [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥'
            /*25*/states[6].Add(st.@Comma符, aReduce2);
            /*26*/states[6].Add(st.@RightBracket符, aReduce2);
            /*27*/states[6].Add(st.@RightBrace符, aReduce2);
            /*28*/states[6].Add(st.@终, aReduce2);
            // syntaxStates[7]:
            // [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥'
            // [6] Members : Members ⏳ ',' Member ;☕ ',' '}'
            /*29*/states[7].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[21]));
            /*30*/states[7].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[22]));
            // syntaxStates[8]:
            // [7] Members : Member ⏳ ;☕ ',' '}'
            /*31*/states[8].Add(st.@Comma符, aReduce7);
            /*32*/states[8].Add(st.@RightBrace符, aReduce7);
            // syntaxStates[9]:
            // [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}'
            /*33*/states[9].Add(st.@Colon符, new(LRParseAction.Kind.Shift, states[23]));
            // syntaxStates[10]:
            // [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥'
            /*34*/states[10].Add(st.@Comma符, aReduce4);
            /*35*/states[10].Add(st.@RightBracket符, aReduce4);
            /*36*/states[10].Add(st.@RightBrace符, aReduce4);
            /*37*/states[10].Add(st.@终, aReduce4);
            // syntaxStates[11]:
            // [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥'
            // [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']'
            /*38*/states[11].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[24]));
            /*39*/states[11].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[25]));
            // syntaxStates[12]:
            // [9] Elements : Element ⏳ ;☕ ',' ']'
            /*40*/states[12].Add(st.@Comma符, aReduce9);
            /*41*/states[12].Add(st.@RightBracket符, aReduce9);
            // syntaxStates[13]:
            // [11] Element : Value ⏳ ;☕ ',' ']'
            /*42*/states[13].Add(st.@Comma符, aReduce11);
            /*43*/states[13].Add(st.@RightBracket符, aReduce11);
            // syntaxStates[14]:
            // [12] Value : 'null' ⏳ ;☕ ',' ']' '}'
            /*44*/states[14].Add(st.@Comma符, aReduce12);
            /*45*/states[14].Add(st.@RightBracket符, aReduce12);
            /*46*/states[14].Add(st.@RightBrace符, aReduce12);
            // syntaxStates[15]:
            // [13] Value : 'true' ⏳ ;☕ ',' ']' '}'
            /*47*/states[15].Add(st.@Comma符, aReduce13);
            /*48*/states[15].Add(st.@RightBracket符, aReduce13);
            /*49*/states[15].Add(st.@RightBrace符, aReduce13);
            // syntaxStates[16]:
            // [14] Value : 'false' ⏳ ;☕ ',' ']' '}'
            /*50*/states[16].Add(st.@Comma符, aReduce14);
            /*51*/states[16].Add(st.@RightBracket符, aReduce14);
            /*52*/states[16].Add(st.@RightBrace符, aReduce14);
            // syntaxStates[17]:
            // [15] Value : 'number' ⏳ ;☕ ',' ']' '}'
            /*53*/states[17].Add(st.@Comma符, aReduce15);
            /*54*/states[17].Add(st.@RightBracket符, aReduce15);
            /*55*/states[17].Add(st.@RightBrace符, aReduce15);
            // syntaxStates[18]:
            // [16] Value : 'string' ⏳ ;☕ ',' ']' '}'
            /*56*/states[18].Add(st.@Comma符, aReduce16);
            /*57*/states[18].Add(st.@RightBracket符, aReduce16);
            /*58*/states[18].Add(st.@RightBrace符, aReduce16);
            // syntaxStates[19]:
            // [17] Value : Object ⏳ ;☕ ',' ']' '}'
            /*59*/states[19].Add(st.@Comma符, aReduce17);
            /*60*/states[19].Add(st.@RightBracket符, aReduce17);
            /*61*/states[19].Add(st.@RightBrace符, aReduce17);
            // syntaxStates[20]:
            // [18] Value : Array ⏳ ;☕ ',' ']' '}'
            /*62*/states[20].Add(st.@Comma符, aReduce18);
            /*63*/states[20].Add(st.@RightBracket符, aReduce18);
            /*64*/states[20].Add(st.@RightBrace符, aReduce18);
            // syntaxStates[21]:
            // [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥'
            /*65*/states[21].Add(st.@Comma符, aReduce3);
            /*66*/states[21].Add(st.@RightBracket符, aReduce3);
            /*67*/states[21].Add(st.@RightBrace符, aReduce3);
            /*68*/states[21].Add(st.@终, aReduce3);
            // syntaxStates[22]:
            // [6] Members : Members ',' ⏳ Member ;☕ ',' '}'
            // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'
            /*69*/states[22].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[26]));
            /*70*/states[22].Add(st.@string, aShift9);
            // syntaxStates[23]:
            // [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}'
            // [12] Value : ⏳ 'null' ;☕ ',' '}'
            // [13] Value : ⏳ 'true' ;☕ ',' '}'
            // [14] Value : ⏳ 'false' ;☕ ',' '}'
            // [15] Value : ⏳ 'number' ;☕ ',' '}'
            // [16] Value : ⏳ 'string' ;☕ ',' '}'
            // [17] Value : ⏳ Object ;☕ ',' '}'
            // [18] Value : ⏳ Array ;☕ ',' '}'
            // [2] Object : ⏳ '{' '}' ;☕ ',' '}'
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' '}'
            // [4] Array : ⏳ '[' ']' ;☕ ',' '}'
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}'
            /*71*/states[23].Add(st.Value枝, new(LRParseAction.Kind.Goto, states[27]));
            /*72*/states[23].Add(st.@null, aShift14);
            /*73*/states[23].Add(st.@true, aShift15);
            /*74*/states[23].Add(st.@false, aShift16);
            /*75*/states[23].Add(st.@number, aShift17);
            /*76*/states[23].Add(st.@string, aShift18);
            /*77*/states[23].Add(st.Object枝, aGoto19);
            /*78*/states[23].Add(st.Array枝, aGoto20);
            /*79*/states[23].Add(st.@LeftBrace符, aShift4);
            /*80*/states[23].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[24]:
            // [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥'
            /*81*/states[24].Add(st.@Comma符, aReduce5);
            /*82*/states[24].Add(st.@RightBracket符, aReduce5);
            /*83*/states[24].Add(st.@RightBrace符, aReduce5);
            /*84*/states[24].Add(st.@终, aReduce5);
            // syntaxStates[25]:
            // [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']'
            // [11] Element : ⏳ Value ;☕ ',' ']'
            // [12] Value : ⏳ 'null' ;☕ ',' ']'
            // [13] Value : ⏳ 'true' ;☕ ',' ']'
            // [14] Value : ⏳ 'false' ;☕ ',' ']'
            // [15] Value : ⏳ 'number' ;☕ ',' ']'
            // [16] Value : ⏳ 'string' ;☕ ',' ']'
            // [17] Value : ⏳ Object ;☕ ',' ']'
            // [18] Value : ⏳ Array ;☕ ',' ']'
            // [2] Object : ⏳ '{' '}' ;☕ ',' ']'
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'
            // [4] Array : ⏳ '[' ']' ;☕ ',' ']'
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'
            /*85*/states[25].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[28]));
            /*86*/states[25].Add(st.Value枝, aGoto13);
            /*87*/states[25].Add(st.@null, aShift14);
            /*88*/states[25].Add(st.@true, aShift15);
            /*89*/states[25].Add(st.@false, aShift16);
            /*90*/states[25].Add(st.@number, aShift17);
            /*91*/states[25].Add(st.@string, aShift18);
            /*92*/states[25].Add(st.Object枝, aGoto19);
            /*93*/states[25].Add(st.Array枝, aGoto20);
            /*94*/states[25].Add(st.@LeftBrace符, aShift4);
            /*95*/states[25].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[26]:
            // [6] Members : Members ',' Member ⏳ ;☕ ',' '}'
            /*96*/states[26].Add(st.@Comma符, aReduce6);
            /*97*/states[26].Add(st.@RightBrace符, aReduce6);
            // syntaxStates[27]:
            // [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}'
            /*98*/states[27].Add(st.@Comma符, aReduce10);
            /*99*/states[27].Add(st.@RightBrace符, aReduce10);
            // syntaxStates[28]:
            // [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']'
            /*100*/states[28].Add(st.@Comma符, aReduce8);
            /*101*/states[28].Add(st.@RightBracket符, aReduce8);
            #endregion init actions of syntax states

            return states;
        }
    }
}

The other three Json.Dict.*.gen.cs_ files are the syntax state machines for LR(0), SLR(1), and LR(1); I won't go over them here.

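This was the first and most intuitive implementation, and it has since been replaced by more efficient ones. The folder is now kept only for study and reference, so I changed the extension of the C# files from cs to cs_ to keep them from being compiled.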
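To show how a dictionary-per-state table drives parsing, here is a minimal sketch of an LR loop. It uses only the five states needed to accept the input `[ ]`, copied from the listing of states 0, 1, 3, 5, and 10. Everything else is an assumption of mine: the `(kind, arg)` tuples stand in for LRParseAction, the `Parse` function is hypothetical, and the numeric ids (in particular `End`) are illustrative values rather than the ones defined in the generated `st` class.

```csharp
using System;
using System.Collections.Generic;

// Illustrative ids only; the generated `st` class defines the real values.
const int End = 0, LBracket = 3, RBracket = 4, JsonNode = 12, ArrayNode = 14;

// rules[i] = (left-hand node, number of right-hand symbols)
var rules = new Dictionary<int, (int left, int length)> {
    [1] = (JsonNode, 1),   // [1] Json : Array ;
    [4] = (ArrayNode, 2),  // [4] Array : '[' ']' ;
};

// One dictionary per syntax state: symbol id -> (kind, argument).
// kind: 's' = shift to state, 'g' = goto state, 'r' = reduce by rule, 'a' = accept.
var states = new Dictionary<int, Dictionary<int, (char kind, int arg)>> {
    [0]  = new() { [LBracket] = ('s', 5), [ArrayNode] = ('g', 3), [JsonNode] = ('g', 1) },
    [1]  = new() { [End] = ('a', 0) },
    [3]  = new() { [End] = ('r', 1) },
    [5]  = new() { [RBracket] = ('s', 10) },
    [10] = new() { [End] = ('r', 4) },
};

bool Parse(int[] tokens) {
    var stack = new Stack<int>();
    stack.Push(0);
    int i = 0;
    while (true) {
        int symbol = i < tokens.Length ? tokens[i] : End;
        // a missing key means the current state has no action for this symbol
        if (!states[stack.Peek()].TryGetValue(symbol, out var action)) { return false; }
        switch (action.kind) {
        case 's': stack.Push(action.arg); i++; break;
        case 'r':
            var rule = rules[action.arg];
            for (int k = 0; k < rule.length; k++) { stack.Pop(); }
            stack.Push(states[stack.Peek()][rule.left].arg); // goto on the reduced node
            break;
        case 'a': return true;
        default: return false;
        }
    }
}

Console.WriteLine(Parse(new[] { LBracket, RBracket })); // True
```

The sketch only tracks state numbers; a real parser would also build tree nodes during each reduce.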

int[]+LRParseAction[]

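Json.Table.LALR(1).gen.cs_ is the LALR(1) syntax state machine; each syntax state is an object that holds an int[] and an LRParseAction[]. Each pair int[t] and LRParseAction[t] together replaces one key/value pair of the Dictionary<int, LRParseAction> object, which reduces memory usage and slightly improves run-time performance.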
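The parallel-array idea can be sketched as follows. This is an illustration under stated assumptions, not the generated code: strings stand in for LRParseAction, `TryFind` is a hypothetical helper, and the use of binary search assumes the keys are stored in ascending order (they are in the listing for states[0], but the lookup that bitParser actually performs is not shown in this article).

```csharp
using System;

// Hypothetical stand-in for one syntax state: nodes[t] is the key and
// actions[t] is the matching value (strings replace LRParseAction here).
// The ids are the ones shown in the comments of the states[0] listing.
int[] nodes = { 1, 3, 12, 13, 14 };
string[] actions = { "aShift4", "aShift5", "Goto states[1]", "Goto states[2]", "Goto states[3]" };

// Assumes nodes is sorted ascending so that binary search is valid.
bool TryFind(int node, out string action) {
    int t = Array.BinarySearch(nodes, node);
    action = t >= 0 ? actions[t] : "";
    return t >= 0;
}

if (TryFind(3, out var found)) {
    Console.WriteLine(found); // aShift5
}
```

Two flat arrays avoid the buckets and per-entry overhead of a hash table, which matters little for one state but adds up across a grammar with many states.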

Json.Table.LALR(1).gen.cs_
using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {

        private static LRParseState[] InitializeSyntaxStates() {
            const int syntaxStateCount = 29;
            var states = new LRParseState[syntaxStateCount];
            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            for (var i = 0; i < syntaxStateCount; i++) { states[i] = new(); }

            #region re-used actions
            LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// referred 4 times
            LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// referred 4 times
            LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// referred 2 times
            LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// referred 2 times
            LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// referred 3 times
            LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// referred 3 times
            LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// referred 3 times
            LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// referred 3 times
            LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// referred 3 times
            LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// referred 3 times
            LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// referred 3 times
            LRParseAction aReduce2 = new(regulations[2]);// referred 4 times
            LRParseAction aReduce7 = new(regulations[7]);// referred 2 times
            LRParseAction aReduce4 = new(regulations[4]);// referred 4 times
            LRParseAction aReduce9 = new(regulations[9]);// referred 2 times
            LRParseAction aReduce11 = new(regulations[11]);// referred 2 times
            LRParseAction aReduce12 = new(regulations[12]);// referred 3 times
            LRParseAction aReduce13 = new(regulations[13]);// referred 3 times
            LRParseAction aReduce14 = new(regulations[14]);// referred 3 times
            LRParseAction aReduce15 = new(regulations[15]);// referred 3 times
            LRParseAction aReduce16 = new(regulations[16]);// referred 3 times
            LRParseAction aReduce17 = new(regulations[17]);// referred 3 times
            LRParseAction aReduce18 = new(regulations[18]);// referred 3 times
            LRParseAction aReduce3 = new(regulations[3]);// referred 4 times
            LRParseAction aReduce5 = new(regulations[5]);// referred 4 times
            LRParseAction aReduce6 = new(regulations[6]);// referred 2 times
            LRParseAction aReduce10 = new(regulations[10]);// referred 2 times
            LRParseAction aReduce8 = new(regulations[8]);// referred 2 times
            #endregion re-used actions

            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            #region init actions of syntax states
            // syntaxStates[0]:
            // [-1] Json' : ⏳ Json ;☕ '¥'
            // [0] Json : ⏳ Object ;☕ '¥'
            // [1] Json : ⏳ Array ;☕ '¥'
            // [2] Object : ⏳ '{' '}' ;☕ '¥'
            // [3] Object : ⏳ '{' Members '}' ;☕ '¥'
            // [4] Array : ⏳ '[' ']' ;☕ '¥'
            // [5] Array : ⏳ '[' Elements ']' ;☕ '¥'
            states[0].nodes = new int[] {
                /*0*/st.@LeftBrace符, // (1) -> aShift4
                /*1*/st.@LeftBracket符, // (3) -> aShift5
                /*2*/st.Json枝, // (12) -> new(LRParseAction.Kind.Goto, states[1])
                /*3*/st.Object枝, // (13) -> new(LRParseAction.Kind.Goto, states[2])
                /*4*/st.Array枝, // (14) -> new(LRParseAction.Kind.Goto, states[3])
            };
            states[0].actions = new LRParseAction[] {
                /*0*//* st.@LeftBrace符(1), */aShift4,
                /*1*//* st.@LeftBracket符(3), */aShift5,
    /*2*//* st.Json枝(12), */new(LRParseAction.Kind.Goto, states[1]),                 /*3*//* st.Object枝(13), */new(LRParseAction.Kind.Goto, states[2]),                 /*4*//* st.Array枝(14), */new(LRParseAction.Kind.Goto, states[3]),             };             // syntaxStates[1]:             // [-1] Json' : Json ⏳ ;☕ '¥'              states[1].nodes = new int[] {                 /*5*/st.@终, // (0) -> LRParseAction.accept             };             states[1].actions = new LRParseAction[] {                 /*5*//* st.@终(0), */LRParseAction.accept,             };             // syntaxStates[2]:             // [0] Json : Object ⏳ ;☕ '¥'              states[2].nodes = new int[] {                 /*6*/st.@终, // (0) -> new(regulations[0])             };             states[2].actions = new LRParseAction[] {                 /*6*//* st.@终(0), */new(regulations[0]),             };             // syntaxStates[3]:             // [1] Json : Array ⏳ ;☕ '¥'              states[3].nodes = new int[] {                 /*7*/st.@终, // (0) -> new(regulations[1])             };             states[3].actions = new LRParseAction[] {                 /*7*//* st.@终(0), */new(regulations[1]),             };             // syntaxStates[4]:             // [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥'              // [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥'              // [6] Members : ⏳ Members ',' Member ;☕ ',' '}'              // [7] Members : ⏳ Member ;☕ ',' '}'              // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'              states[4].nodes = new int[] {                 /*8*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[6])                 /*9*/st.@string, // (6) -> aShift9                 /*10*/st.Members枝, // (15) -> new(LRParseAction.Kind.Goto, states[7])                 /*11*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[8])             };             states[4].actions = new LRParseAction[] {                 /*8*//* st.@RightBrace符(2), 
*/new(LRParseAction.Kind.Shift, states[6]),                 /*9*//* st.@string(6), */aShift9,                 /*10*//* st.Members枝(15), */new(LRParseAction.Kind.Goto, states[7]),                 /*11*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[8]),             };             // syntaxStates[5]:             // [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥'              // [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥'              // [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']'              // [9] Elements : ⏳ Element ;☕ ',' ']'              // [11] Element : ⏳ Value ;☕ ',' ']'              // [12] Value : ⏳ 'null' ;☕ ',' ']'              // [13] Value : ⏳ 'true' ;☕ ',' ']'              // [14] Value : ⏳ 'false' ;☕ ',' ']'              // [15] Value : ⏳ 'number' ;☕ ',' ']'              // [16] Value : ⏳ 'string' ;☕ ',' ']'              // [17] Value : ⏳ Object ;☕ ',' ']'              // [18] Value : ⏳ Array ;☕ ',' ']'              // [2] Object : ⏳ '{' '}' ;☕ ',' ']'              // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'              // [4] Array : ⏳ '[' ']' ;☕ ',' ']'              // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'              states[5].nodes = new int[] {                 /*12*/st.@LeftBrace符, // (1) -> aShift4                 /*13*/st.@LeftBracket符, // (3) -> aShift5                 /*14*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[10])                 /*15*/st.@string, // (6) -> aShift18                 /*16*/st.@null, // (8) -> aShift14                 /*17*/st.@true, // (9) -> aShift15                 /*18*/st.@false, // (10) -> aShift16                 /*19*/st.@number, // (11) -> aShift17                 /*20*/st.Object枝, // (13) -> aGoto19                 /*21*/st.Array枝, // (14) -> aGoto20                 /*22*/st.Elements枝, // (16) -> new(LRParseAction.Kind.Goto, states[11])                 /*23*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[12])                 /*24*/st.Value枝, // (19) -> 
aGoto13             };             states[5].actions = new LRParseAction[] {                 /*12*//* st.@LeftBrace符(1), */aShift4,                 /*13*//* st.@LeftBracket符(3), */aShift5,                 /*14*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[10]),                 /*15*//* st.@string(6), */aShift18,                 /*16*//* st.@null(8), */aShift14,                 /*17*//* st.@true(9), */aShift15,                 /*18*//* st.@false(10), */aShift16,                 /*19*//* st.@number(11), */aShift17,                 /*20*//* st.Object枝(13), */aGoto19,                 /*21*//* st.Array枝(14), */aGoto20,                 /*22*//* st.Elements枝(16), */new(LRParseAction.Kind.Goto, states[11]),                 /*23*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[12]),                 /*24*//* st.Value枝(19), */aGoto13,             };             // syntaxStates[6]:             // [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥'              states[6].nodes = new int[] {                 /*25*/st.@终, // (0) -> aReduce2                 /*26*/st.@RightBrace符, // (2) -> aReduce2                 /*27*/st.@RightBracket符, // (4) -> aReduce2                 /*28*/st.@Comma符, // (5) -> aReduce2             };             states[6].actions = new LRParseAction[] {                 /*25*//* st.@终(0), */aReduce2,                 /*26*//* st.@RightBrace符(2), */aReduce2,                 /*27*//* st.@RightBracket符(4), */aReduce2,                 /*28*//* st.@Comma符(5), */aReduce2,             };             // syntaxStates[7]:             // [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥'              // [6] Members : Members ⏳ ',' Member ;☕ ',' '}'              states[7].nodes = new int[] {                 /*29*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[21])                 /*30*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[22])             };             states[7].actions = new LRParseAction[] {                 
/*29*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[21]),                 /*30*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[22]),             };             // syntaxStates[8]:             // [7] Members : Member ⏳ ;☕ ',' '}'              states[8].nodes = new int[] {                 /*31*/st.@RightBrace符, // (2) -> aReduce7                 /*32*/st.@Comma符, // (5) -> aReduce7             };             states[8].actions = new LRParseAction[] {                 /*31*//* st.@RightBrace符(2), */aReduce7,                 /*32*//* st.@Comma符(5), */aReduce7,             };             // syntaxStates[9]:             // [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}'              states[9].nodes = new int[] {                 /*33*/st.@Colon符, // (7) -> new(LRParseAction.Kind.Shift, states[23])             };             states[9].actions = new LRParseAction[] {                 /*33*//* st.@Colon符(7), */new(LRParseAction.Kind.Shift, states[23]),             };             // syntaxStates[10]:             // [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥'              states[10].nodes = new int[] {                 /*34*/st.@终, // (0) -> aReduce4                 /*35*/st.@RightBrace符, // (2) -> aReduce4                 /*36*/st.@RightBracket符, // (4) -> aReduce4                 /*37*/st.@Comma符, // (5) -> aReduce4             };             states[10].actions = new LRParseAction[] {                 /*34*//* st.@终(0), */aReduce4,                 /*35*//* st.@RightBrace符(2), */aReduce4,                 /*36*//* st.@RightBracket符(4), */aReduce4,                 /*37*//* st.@Comma符(5), */aReduce4,             };             // syntaxStates[11]:             // [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥'              // [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']'              states[11].nodes = new int[] {                 /*38*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[24])                 /*39*/st.@Comma符, // (5) -> 
new(LRParseAction.Kind.Shift, states[25])             };             states[11].actions = new LRParseAction[] {                 /*38*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[24]),                 /*39*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[25]),             };             // syntaxStates[12]:             // [9] Elements : Element ⏳ ;☕ ',' ']'              states[12].nodes = new int[] {                 /*40*/st.@RightBracket符, // (4) -> aReduce9                 /*41*/st.@Comma符, // (5) -> aReduce9             };             states[12].actions = new LRParseAction[] {                 /*40*//* st.@RightBracket符(4), */aReduce9,                 /*41*//* st.@Comma符(5), */aReduce9,             };             // syntaxStates[13]:             // [11] Element : Value ⏳ ;☕ ',' ']'              states[13].nodes = new int[] {                 /*42*/st.@RightBracket符, // (4) -> aReduce11                 /*43*/st.@Comma符, // (5) -> aReduce11             };             states[13].actions = new LRParseAction[] {                 /*42*//* st.@RightBracket符(4), */aReduce11,                 /*43*//* st.@Comma符(5), */aReduce11,             };             // syntaxStates[14]:             // [12] Value : 'null' ⏳ ;☕ ',' ']' '}'              states[14].nodes = new int[] {                 /*44*/st.@RightBrace符, // (2) -> aReduce12                 /*45*/st.@RightBracket符, // (4) -> aReduce12                 /*46*/st.@Comma符, // (5) -> aReduce12             };             states[14].actions = new LRParseAction[] {                 /*44*//* st.@RightBrace符(2), */aReduce12,                 /*45*//* st.@RightBracket符(4), */aReduce12,                 /*46*//* st.@Comma符(5), */aReduce12,             };             // syntaxStates[15]:             // [13] Value : 'true' ⏳ ;☕ ',' ']' '}'              states[15].nodes = new int[] {                 /*47*/st.@RightBrace符, // (2) -> aReduce13                 /*48*/st.@RightBracket符, // (4) -> aReduce13             
    /*49*/st.@Comma符, // (5) -> aReduce13             };             states[15].actions = new LRParseAction[] {                 /*47*//* st.@RightBrace符(2), */aReduce13,                 /*48*//* st.@RightBracket符(4), */aReduce13,                 /*49*//* st.@Comma符(5), */aReduce13,             };             // syntaxStates[16]:             // [14] Value : 'false' ⏳ ;☕ ',' ']' '}'              states[16].nodes = new int[] {                 /*50*/st.@RightBrace符, // (2) -> aReduce14                 /*51*/st.@RightBracket符, // (4) -> aReduce14                 /*52*/st.@Comma符, // (5) -> aReduce14             };             states[16].actions = new LRParseAction[] {                 /*50*//* st.@RightBrace符(2), */aReduce14,                 /*51*//* st.@RightBracket符(4), */aReduce14,                 /*52*//* st.@Comma符(5), */aReduce14,             };             // syntaxStates[17]:             // [15] Value : 'number' ⏳ ;☕ ',' ']' '}'              states[17].nodes = new int[] {                 /*53*/st.@RightBrace符, // (2) -> aReduce15                 /*54*/st.@RightBracket符, // (4) -> aReduce15                 /*55*/st.@Comma符, // (5) -> aReduce15             };             states[17].actions = new LRParseAction[] {                 /*53*//* st.@RightBrace符(2), */aReduce15,                 /*54*//* st.@RightBracket符(4), */aReduce15,                 /*55*//* st.@Comma符(5), */aReduce15,             };             // syntaxStates[18]:             // [16] Value : 'string' ⏳ ;☕ ',' ']' '}'              states[18].nodes = new int[] {                 /*56*/st.@RightBrace符, // (2) -> aReduce16                 /*57*/st.@RightBracket符, // (4) -> aReduce16                 /*58*/st.@Comma符, // (5) -> aReduce16             };             states[18].actions = new LRParseAction[] {                 /*56*//* st.@RightBrace符(2), */aReduce16,                 /*57*//* st.@RightBracket符(4), */aReduce16,                 /*58*//* st.@Comma符(5), */aReduce16,             };             // 
syntaxStates[19]:             // [17] Value : Object ⏳ ;☕ ',' ']' '}'              states[19].nodes = new int[] {                 /*59*/st.@RightBrace符, // (2) -> aReduce17                 /*60*/st.@RightBracket符, // (4) -> aReduce17                 /*61*/st.@Comma符, // (5) -> aReduce17             };             states[19].actions = new LRParseAction[] {                 /*59*//* st.@RightBrace符(2), */aReduce17,                 /*60*//* st.@RightBracket符(4), */aReduce17,                 /*61*//* st.@Comma符(5), */aReduce17,             };             // syntaxStates[20]:             // [18] Value : Array ⏳ ;☕ ',' ']' '}'              states[20].nodes = new int[] {                 /*62*/st.@RightBrace符, // (2) -> aReduce18                 /*63*/st.@RightBracket符, // (4) -> aReduce18                 /*64*/st.@Comma符, // (5) -> aReduce18             };             states[20].actions = new LRParseAction[] {                 /*62*//* st.@RightBrace符(2), */aReduce18,                 /*63*//* st.@RightBracket符(4), */aReduce18,                 /*64*//* st.@Comma符(5), */aReduce18,             };             // syntaxStates[21]:             // [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥'              states[21].nodes = new int[] {                 /*65*/st.@终, // (0) -> aReduce3                 /*66*/st.@RightBrace符, // (2) -> aReduce3                 /*67*/st.@RightBracket符, // (4) -> aReduce3                 /*68*/st.@Comma符, // (5) -> aReduce3             };             states[21].actions = new LRParseAction[] {                 /*65*//* st.@终(0), */aReduce3,                 /*66*//* st.@RightBrace符(2), */aReduce3,                 /*67*//* st.@RightBracket符(4), */aReduce3,                 /*68*//* st.@Comma符(5), */aReduce3,             };             // syntaxStates[22]:             // [6] Members : Members ',' ⏳ Member ;☕ ',' '}'              // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}'              states[22].nodes = new int[] {                 /*69*/st.@string, // 
(6) -> aShift9                 /*70*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[26])             };             states[22].actions = new LRParseAction[] {                 /*69*//* st.@string(6), */aShift9,                 /*70*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[26]),             };             // syntaxStates[23]:             // [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}'              // [12] Value : ⏳ 'null' ;☕ ',' '}'              // [13] Value : ⏳ 'true' ;☕ ',' '}'              // [14] Value : ⏳ 'false' ;☕ ',' '}'              // [15] Value : ⏳ 'number' ;☕ ',' '}'              // [16] Value : ⏳ 'string' ;☕ ',' '}'              // [17] Value : ⏳ Object ;☕ ',' '}'              // [18] Value : ⏳ Array ;☕ ',' '}'              // [2] Object : ⏳ '{' '}' ;☕ ',' '}'              // [3] Object : ⏳ '{' Members '}' ;☕ ',' '}'              // [4] Array : ⏳ '[' ']' ;☕ ',' '}'              // [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}'              states[23].nodes = new int[] {                 /*71*/st.@LeftBrace符, // (1) -> aShift4                 /*72*/st.@LeftBracket符, // (3) -> aShift5                 /*73*/st.@string, // (6) -> aShift18                 /*74*/st.@null, // (8) -> aShift14                 /*75*/st.@true, // (9) -> aShift15                 /*76*/st.@false, // (10) -> aShift16                 /*77*/st.@number, // (11) -> aShift17                 /*78*/st.Object枝, // (13) -> aGoto19                 /*79*/st.Array枝, // (14) -> aGoto20                 /*80*/st.Value枝, // (19) -> new(LRParseAction.Kind.Goto, states[27])             };             states[23].actions = new LRParseAction[] {                 /*71*//* st.@LeftBrace符(1), */aShift4,                 /*72*//* st.@LeftBracket符(3), */aShift5,                 /*73*//* st.@string(6), */aShift18,                 /*74*//* st.@null(8), */aShift14,                 /*75*//* st.@true(9), */aShift15,                 /*76*//* st.@false(10), */aShift16,                 /*77*//* 
st.@number(11), */aShift17,                 /*78*//* st.Object枝(13), */aGoto19,                 /*79*//* st.Array枝(14), */aGoto20,                 /*80*//* st.Value枝(19), */new(LRParseAction.Kind.Goto, states[27]),             };             // syntaxStates[24]:             // [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥'              states[24].nodes = new int[] {                 /*81*/st.@终, // (0) -> aReduce5                 /*82*/st.@RightBrace符, // (2) -> aReduce5                 /*83*/st.@RightBracket符, // (4) -> aReduce5                 /*84*/st.@Comma符, // (5) -> aReduce5             };             states[24].actions = new LRParseAction[] {                 /*81*//* st.@终(0), */aReduce5,                 /*82*//* st.@RightBrace符(2), */aReduce5,                 /*83*//* st.@RightBracket符(4), */aReduce5,                 /*84*//* st.@Comma符(5), */aReduce5,             };             // syntaxStates[25]:             // [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']'              // [11] Element : ⏳ Value ;☕ ',' ']'              // [12] Value : ⏳ 'null' ;☕ ',' ']'              // [13] Value : ⏳ 'true' ;☕ ',' ']'              // [14] Value : ⏳ 'false' ;☕ ',' ']'              // [15] Value : ⏳ 'number' ;☕ ',' ']'              // [16] Value : ⏳ 'string' ;☕ ',' ']'              // [17] Value : ⏳ Object ;☕ ',' ']'              // [18] Value : ⏳ Array ;☕ ',' ']'              // [2] Object : ⏳ '{' '}' ;☕ ',' ']'              // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']'              // [4] Array : ⏳ '[' ']' ;☕ ',' ']'              // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']'              states[25].nodes = new int[] {                 /*85*/st.@LeftBrace符, // (1) -> aShift4                 /*86*/st.@LeftBracket符, // (3) -> aShift5                 /*87*/st.@string, // (6) -> aShift18                 /*88*/st.@null, // (8) -> aShift14                 /*89*/st.@true, // (9) -> aShift15                 /*90*/st.@false, // (10) -> aShift16                 /*91*/st.@number, 
// (11) -> aShift17                 /*92*/st.Object枝, // (13) -> aGoto19                 /*93*/st.Array枝, // (14) -> aGoto20                 /*94*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[28])                 /*95*/st.Value枝, // (19) -> aGoto13             };             states[25].actions = new LRParseAction[] {                 /*85*//* st.@LeftBrace符(1), */aShift4,                 /*86*//* st.@LeftBracket符(3), */aShift5,                 /*87*//* st.@string(6), */aShift18,                 /*88*//* st.@null(8), */aShift14,                 /*89*//* st.@true(9), */aShift15,                 /*90*//* st.@false(10), */aShift16,                 /*91*//* st.@number(11), */aShift17,                 /*92*//* st.Object枝(13), */aGoto19,                 /*93*//* st.Array枝(14), */aGoto20,                 /*94*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[28]),                 /*95*//* st.Value枝(19), */aGoto13,             };             // syntaxStates[26]:             // [6] Members : Members ',' Member ⏳ ;☕ ',' '}'              states[26].nodes = new int[] {                 /*96*/st.@RightBrace符, // (2) -> aReduce6                 /*97*/st.@Comma符, // (5) -> aReduce6             };             states[26].actions = new LRParseAction[] {                 /*96*//* st.@RightBrace符(2), */aReduce6,                 /*97*//* st.@Comma符(5), */aReduce6,             };             // syntaxStates[27]:             // [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}'              states[27].nodes = new int[] {                 /*98*/st.@RightBrace符, // (2) -> aReduce10                 /*99*/st.@Comma符, // (5) -> aReduce10             };             states[27].actions = new LRParseAction[] {                 /*98*//* st.@RightBrace符(2), */aReduce10,                 /*99*//* st.@Comma符(5), */aReduce10,             };             // syntaxStates[28]:             // [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']'              states[28].nodes = new int[] {            
     /*100*/st.@RightBracket符, // (4) -> aReduce8                 /*101*/st.@Comma符, // (5) -> aReduce8             };             states[28].actions = new LRParseAction[] {                 /*100*//* st.@RightBracket符(4), */aReduce8,                 /*101*//* st.@Comma符(5), */aReduce8,             };             #endregion init actions of syntax states              return states;         }     } } 

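The other four Json.Table.*.gen.cs_ files hold the syntax-parsing state machines for LL(1), LR(0), SLR(1) and LR(1). They follow the same pattern, so they are not repeated here.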

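This is the second implementation, and it has since been superseded by a more efficient one. The folder is now kept only for study and reference, so I changed the C# file extension from cs to cs_ to keep these files from being compiled.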
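The paired `nodes`/`actions` arrays in the listing can be read as a small lookup table: `actions[i]` is the action to take when the next node has type `nodes[i]`. The following is a minimal sketch of such a lookup, not bitParser's actual code. `MiniKind`, `MiniAction`, `MiniState` and `MiniTable` are hypothetical stand-ins for the real `LRParseState`/`LRParseAction` types, and the use of binary search is an assumption based only on the fact that every `nodes` array in the listing is in ascending order.

```csharp
using System;

// Hypothetical stand-ins; the real types live in bitzhuwei.Compiler and may differ.
public enum MiniKind { Shift, Goto, Reduce, Accept }

public readonly record struct MiniAction(MiniKind Kind, int Target);

public sealed class MiniState {
    public int[] Nodes = Array.Empty<int>();                  // node types, ascending
    public MiniAction[] Actions = Array.Empty<MiniAction>();  // Actions[i] pairs with Nodes[i]

    // Returns the action for nodeType, or null when this state has no entry
    // for it (which a parser would report as a syntax error).
    public MiniAction? Find(int nodeType) {
        int i = Array.BinarySearch(Nodes, nodeType);
        return i >= 0 ? Actions[i] : (MiniAction?)null;
    }
}

public static class MiniTable {
    // Mirrors states[4] of the listing:
    // '}'(2) -> Shift 6, 'string'(6) -> Shift 9, Members(15) -> Goto 7, Member(17) -> Goto 8.
    public static MiniState State4() => new MiniState {
        Nodes = new[] { 2, 6, 15, 17 },
        Actions = new[] {
            new MiniAction(MiniKind.Shift, 6),
            new MiniAction(MiniKind.Shift, 9),
            new MiniAction(MiniKind.Goto, 7),
            new MiniAction(MiniKind.Goto, 8),
        },
    };
}
```

Calling `MiniTable.State4().Find(6)` yields the Shift action to state 9, while `Find(1)` yields null because '{' is not allowed directly after '{' in this state.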

Json.Table.*.gen.bin

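As with the lexical analyzer, the array form (int[] + LRParseAction[]) of the syntax parse table is written into a binary file. When the Json parser loads, it reads this file to rebuild the array-form (int[] + LRParseAction[]) parse table. The whole table no longer has to be hard-coded in the source, which further reduces memory usage.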
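To make the idea of writing the table to a binary file concrete, here is a minimal round-trip sketch built on `BinaryWriter` and `BinaryReader`. The byte layout is an assumption chosen for illustration and is not the real Json.Table.*.gen.bin format. `TableBinarySketch` and the `(byte kind, int target)` encoding of an action are likewise hypothetical.

```csharp
using System;
using System.IO;

public static class TableBinarySketch {
    // Assumed layout (NOT the real .gen.bin format):
    // int32 stateCount, then per state:
    //   int32 n, n x int32 node type, n x (byte kind, int32 target).
    public static byte[] Write(int[][] nodes, (byte kind, int target)[][] actions) {
        using var ms = new MemoryStream();
        using (var w = new BinaryWriter(ms)) {
            w.Write(nodes.Length);
            for (int s = 0; s < nodes.Length; s++) {
                w.Write(nodes[s].Length);
                foreach (var n in nodes[s]) { w.Write(n); }
                foreach (var (kind, target) in actions[s]) {
                    w.Write(kind);
                    w.Write(target);
                }
            }
        }
        // ToArray still works after the writer has closed the stream.
        return ms.ToArray();
    }

    public static (int[][] nodes, (byte kind, int target)[][] actions) Read(byte[] data) {
        using var r = new BinaryReader(new MemoryStream(data));
        int count = r.ReadInt32();
        var nodes = new int[count][];
        var actions = new (byte kind, int target)[count][];
        for (int s = 0; s < count; s++) {
            int n = r.ReadInt32();
            nodes[s] = new int[n];
            for (int i = 0; i < n; i++) { nodes[s][i] = r.ReadInt32(); }
            actions[s] = new (byte kind, int target)[n];
            for (int i = 0; i < n; i++) {
                byte kind = r.ReadByte();
                int target = r.ReadInt32();
                actions[s][i] = (kind, target);
            }
        }
        return (nodes, actions);
    }
}
```

With this layout the table exists in the source only as a load routine, and the data itself lives in the file. A real format could shrink further by storing each re-used action once and referring to it by index, as the text dump of the table suggests.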

For easier debugging and reference, I also provide a matching text format. For example, here is the LALR(1) syntax parse table:

Json.Table.LALR(1).gen.txt
conflicts(0)=not sovled(0)+solved(0)(0 warnings)
29 states.
28 re-used actions
[0]:Shift[4] [1]:Shift[5] [2]:Shift[9] [3]:Goto[13]
[4]:Shift[14] [5]:Shift[15] [6]:Shift[16] [7]:Shift[17] [8]:Shift[18]
[9]:Goto[19] [10]:Goto[20] [11]:Reduce[2] [12]:Reduce[7] [13]:Reduce[4]
[14]:Reduce[9] [15]:Reduce[11] [16]:Reduce[12] [17]:Reduce[13] [18]:Reduce[14]
[19]:Reduce[15] [20]:Reduce[16] [21]:Reduce[17] [22]:Reduce[18] [23]:Reduce[3]
[24]:Reduce[5] [25]:Reduce[6] [26]:Reduce[10] [27]:Reduce[8]
states[0].nodes[5]: 1 3 12 13 14
states[0].actions[5]: -4(0)Shift[4] -4(1)Shift[5] Goto[1] Goto[2] Goto[3]
states[1].nodes[1]: 0
states[1].actions[1]: Accept[0]
states[2].nodes[1]: 0
states[2].actions[1]: Reduce[0]
states[3].nodes[1]: 0
states[3].actions[1]: Reduce[1]
states[4].nodes[4]: 2 6 15 17
states[4].actions[4]: Shift[6] -2(2)Shift[9] Goto[7] Goto[8]
states[5].nodes[13]: 1 3 4 6 8 9 10 11 13 14 16 18 19
states[5].actions[13]: -4(0)Shift[4] -4(1)Shift[5] Shift[10] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[11] Goto[12] -2(3)Goto[13]
states[6].nodes[4]: 0 2 4 5
states[6].actions[4]: -4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2]
states[7].nodes[2]: 2 5
states[7].actions[2]: Shift[21] Shift[22]
states[8].nodes[2]: 2 5
states[8].actions[2]: -2(12)Reduce[7] -2(12)Reduce[7]
states[9].nodes[1]: 7
states[9].actions[1]: Shift[23]
states[10].nodes[4]: 0 2 4 5
states[10].actions[4]: -4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4]
states[11].nodes[2]: 4 5
states[11].actions[2]: Shift[24] Shift[25]
states[12].nodes[2]: 4 5
states[12].actions[2]: -2(14)Reduce[9] -2(14)Reduce[9]
states[13].nodes[2]: 4 5
states[13].actions[2]: -2(15)Reduce[11] -2(15)Reduce[11]
states[14].nodes[3]: 2 4 5
states[14].actions[3]: -3(16)Reduce[12] -3(16)Reduce[12] -3(16)Reduce[12]
states[15].nodes[3]: 2 4 5
states[15].actions[3]: -3(17)Reduce[13] -3(17)Reduce[13] -3(17)Reduce[13]
states[16].nodes[3]: 2 4 5
states[16].actions[3]: -3(18)Reduce[14] -3(18)Reduce[14] -3(18)Reduce[14]
states[17].nodes[3]: 2 4 5
states[17].actions[3]: -3(19)Reduce[15] -3(19)Reduce[15] -3(19)Reduce[15]
states[18].nodes[3]: 2 4 5
states[18].actions[3]: -3(20)Reduce[16] -3(20)Reduce[16] -3(20)Reduce[16]
states[19].nodes[3]: 2 4 5
states[19].actions[3]: -3(21)Reduce[17] -3(21)Reduce[17] -3(21)Reduce[17]
states[20].nodes[3]: 2 4 5
states[20].actions[3]: -3(22)Reduce[18] -3(22)Reduce[18] -3(22)Reduce[18]
states[21].nodes[4]: 0 2 4 5
states[21].actions[4]: -4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3]
states[22].nodes[2]: 6 17
states[22].actions[2]: -2(2)Shift[9] Goto[26]
states[23].nodes[10]: 1 3 6 8 9 10 11 13 14 19
states[23].actions[10]: -4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[27]
states[24].nodes[4]: 0 2 4 5
states[24].actions[4]: -4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5]
states[25].nodes[11]: 1 3 6 8 9 10 11 13 14 18 19
states[25].actions[11]: -4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[28] -2(3)Goto[13]
states[26].nodes[2]: 2 5
states[26].actions[2]: -2(25)Reduce[6] -2(25)Reduce[6]
states[27].nodes[2]: 2 5
states[27].actions[2]: -2(26)Reduce[10] -2(26)Reduce[10]
states[28].nodes[2]: 4 5
states[28].actions[2]: -2(27)Reduce[8] -2(27)Reduce[8]

This is the third implementation and the one currently in use. To simplify the load path, I moved it from the Json.genSyntaxParser folder to the Json.gen folder.

Json.Regulations.gen.cs_

This is an array that records every rule of the Json grammar:

Json.Regulations.gen.cs_
using System; using bitzhuwei.Compiler;  namespace bitzhuwei.JsonFormat {     partial class CompilerJson {         public static readonly IReadOnlyList<Regulation> regulations = new Regulation[] {             // [0] Json = Object ;             new(0, st.Json枝, st.Object枝),              // [1] Json = Array ;             new(1, st.Json枝, st.Array枝),              // [2] Object = '{' '}' ;             new(2, st.Object枝, st.@LeftBrace符, st.@RightBrace符),              // [3] Object = '{' Members '}' ;             new(3, st.Object枝, st.@LeftBrace符, st.Members枝, st.@RightBrace符),              // [4] Array = '[' ']' ;             new(4, st.Array枝, st.@LeftBracket符, st.@RightBracket符),              // [5] Array = '[' Elements ']' ;             new(5, st.Array枝, st.@LeftBracket符, st.Elements枝, st.@RightBracket符),              // [6] Members = Members ',' Member ;             new(6, st.Members枝, st.Members枝, st.@Comma符, st.Member枝),              // [7] Members = Member ;             new(7, st.Members枝, st.Member枝),              // [8] Elements = Elements ',' Element ;             new(8, st.Elements枝, st.Elements枝, st.@Comma符, st.Element枝),              // [9] Elements = Element ;             new(9, st.Elements枝, st.Element枝),              // [10] Member = 'string' ':' Value ;             new(10, st.Member枝, st.@string, st.@Colon符, st.Value枝),              // [11] Element = Value ;             new(11, st.Element枝, st.Value枝),              // [12] Value = 'null' ;             new(12, st.Value枝, st.@null),              // [13] Value = 'true' ;             new(13, st.Value枝, st.@true),              // [14] Value = 'false' ;             new(14, st.Value枝, st.@false),              // [15] Value = 'number' ;             new(15, st.Value枝, st.@number),              // [16] Value = 'string' ;             new(16, st.Value枝, st.@string),              // [17] Value = Object ;             new(17, st.Value枝, st.Object枝),              // [18] Value = Array ;             new(18, st.Value枝, 
st.Array枝),          };     } } 

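To reduce memory usage, this hard-coded implementation has also been replaced by a binary file (Json.Regulations.gen.bin). The folder is now kept only for study and reference, so I changed the C# file extension from cs to cs_ to keep the file from being compiled.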

Text format corresponding to Json.Regulations.gen.bin
19
12 = 1 (13)
12 = 1 (14)
13 = 2 (1 | 2)
13 = 3 (1 | 15 | 2)
14 = 2 (3 | 4)
14 = 3 (3 | 16 | 4)
15 = 3 (15 | 5 | 17)
15 = 1 (17)
16 = 3 (16 | 5 | 18)
16 = 1 (18)
17 = 3 (6 | 7 | 19)
18 = 1 (19)
19 = 1 (8)
19 = 1 (9)
19 = 1 (10)
19 = 1 (11)
19 = 1 (6)
19 = 1 (13)
19 = 1 (14)
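Comparing this dump with the regulations array, each entry appears to read as "left node type = number of right-hand nodes (right-hand node types)", preceded by the total rule count. For example, `13 = 2 (1 | 2)` matches `[2] Object = '{' '}'`. Under that reading, a small parser for the text form could look like the sketch below. `MiniRegulation` and `RegulationText` are hypothetical names, and this is not the loader bitParser actually uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real Regulation type.
public sealed record MiniRegulation(int Index, int Left, int[] Right);

public static class RegulationText {
    // Parses "count, then `left = n (r1 | r2 | ...)` repeated count times".
    public static List<MiniRegulation> Parse(string text) {
        // Punctuation carries no information beyond separation, so treat it as whitespace.
        var parts = text.Split(
            new[] { ' ', '\t', '\r', '\n', '=', '(', ')', '|' },
            StringSplitOptions.RemoveEmptyEntries);
        int pos = 0;
        int count = int.Parse(parts[pos++]);
        var result = new List<MiniRegulation>(count);
        for (int i = 0; i < count; i++) {
            int left = int.Parse(parts[pos++]);
            int n = int.Parse(parts[pos++]);
            var right = new int[n];
            for (int k = 0; k < n; k++) { right[k] = int.Parse(parts[pos++]); }
            result.Add(new MiniRegulation(i, left, right));
        }
        return result;
    }
}
```

Feeding the full dump above to `RegulationText.Parse` should give 19 entries whose `Left` and `Right` values line up with the `new(index, left, right...)` calls in Json.Regulations.gen.cs_.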

In summary, the result looks like this:

[Image: C#实现自己的Json解析器(LALR(1)+miniDFA)]

The generated extractor

Extraction means visiting every node of the syntax tree in post-order and pulling out the semantic information at each visit.

For example, the syntax tree of { "a": 0.3, "b": true, "a": "again" } looks like this:

R[0] Json = Object ;⛪T[0->12]
 └─R[3] Object = '{' Members '}' ;⛪T[0->12]
    ├─T[0]='{' {
    ├─R[6] Members = Members ',' Member ;⛪T[1->11]
    │  ├─R[6] Members = Members ',' Member ;⛪T[1->7]
    │  │  ├─R[7] Members = Member ;⛪T[1->3]
    │  │  │  └─R[10] Member = 'string' ':' Value ;⛪T[1->3]
    │  │  │     ├─T[1]='string' "a"
    │  │  │     ├─T[2]=':' :
    │  │  │     └─R[15] Value = 'number' ;⛪T[3]
    │  │  │        └─T[3]='number' 0.3
    │  │  ├─T[4]=',' ,
    │  │  └─R[10] Member = 'string' ':' Value ;⛪T[5->7]
    │  │     ├─T[5]='string' "b"
    │  │     ├─T[6]=':' :
    │  │     └─R[13] Value = 'true' ;⛪T[7]
    │  │        └─T[7]='true' true
    │  ├─T[8]=',' ,
    │  └─R[10] Member = 'string' ':' Value ;⛪T[9->11]
    │     ├─T[9]='string' "a"
    │     ├─T[10]=':' :
    │     └─R[16] Value = 'string' ;⛪T[11]
    │        └─T[11]='string' "again"
    └─T[12]='}' }
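To see which order post-order traversal produces on this tree, the following minimal sketch walks the subtree of the first member (`"a": 0.3`) and records the labels as they are visited. `MiniNode` and `PostOrder` are hypothetical names used only for illustration. The real tree is built from `LRNode` objects, whose shape is not reproduced here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tree node: a label plus ordered children.
public sealed class MiniNode {
    public string Label;
    public List<MiniNode> Children = new();
    public MiniNode(string label, params MiniNode[] children) {
        Label = label;
        Children.AddRange(children);
    }
}

public static class PostOrder {
    // Returns the labels in post-order: all children first (left to right), then the node.
    public static List<string> Visit(MiniNode root) {
        var order = new List<string>();
        Walk(root, order);
        return order;
    }

    private static void Walk(MiniNode node, List<string> order) {
        foreach (var child in node.Children) { Walk(child, order); }
        order.Add(node.Label);
    }

    // The subtree R[10] Member = 'string' ':' Value for the first member of the example.
    public static MiniNode FirstMember() =>
        new MiniNode("R[10]",
            new MiniNode("T[1]"),
            new MiniNode("T[2]"),
            new MiniNode("R[15]", new MiniNode("T[3]")));
}
```

`PostOrder.Visit(PostOrder.FirstMember())` yields T[1], T[2], T[3], R[15], R[10]. Leaves come before the rule that covers them, so every rule finds its right-hand side already processed when it is visited.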

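In post-order, the extractor first visits T[0], T[1], T[2] and T[3] and pushes each of them onto the stack. It then visits R[15] Value = 'number' ;⛪T[3], at which point it should run: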

// [15] Value = 'number' ;
var r0 = (Token)context.rightStack.Pop(); // pop T[3]
var left = new JsonValue(JsonValue.Kind.Number, r0.value);
context.rightStack.Push(left); // push the Value

Next it visits R[10] Member = 'string' ':' Value ;⛪T[1->3], at which point it should run:

// [10] Member = 'string' ':' Value ;
var r0 = (JsonValue)context.rightStack.Pop(); // pop the Value
var r1 = (Token)context.rightStack.Pop(); // pop T[2]
var r2 = (Token)context.rightStack.Pop(); // pop T[1]
var left = new JsonMember(key: r2.value, value: r0);
context.rightStack.Push(left); // push the Member

Continuing step by step, it eventually reaches the root node R[0] Json = Object ;⛪T[0->12], at which point it should run:

// [0] Json = Object ;
var r0 = (List<JsonMember>)context.rightStack.Pop(); // pop the Member list
var left = new Json(r0);
context.rightStack.Push(left); // push the Json

The whole syntax tree has now been visited, and the stack context.rightStack holds exactly one object: the final Json. At this point it should run:

// [-1] Json' = Json ;
context.result = (Json)context.rightStack.Pop();
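The pop-and-push pattern above can be tried in isolation. The sketch below replays the first two reductions for the member `"a": 0.3` on a plain `Stack<object>`. It only illustrates the technique: `MiniToken`, `MiniValue`, `MiniMember` and `ExtractSketch` are hypothetical stand-ins for the real `Token`, `JsonValue`, `JsonMember` and generated extractor, and the leading '{' token is left out for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for Token, JsonValue and JsonMember.
public sealed record MiniToken(string Type, string Value);
public sealed record MiniValue(string Kind, string Text);
public sealed record MiniMember(string Key, MiniValue Value);

public static class ExtractSketch {
    // [15] Value = 'number' ;
    public static void ReduceValueNumber(Stack<object> stack) {
        var r0 = (MiniToken)stack.Pop();             // the 'number' token
        stack.Push(new MiniValue("Number", r0.Value));
    }

    // [10] Member = 'string' ':' Value ;
    // Items come off the stack in reverse order of the right-hand side.
    public static void ReduceMember(Stack<object> stack) {
        var r0 = (MiniValue)stack.Pop();             // Value
        stack.Pop();                                 // ':' carries no information
        var r2 = (MiniToken)stack.Pop();             // the key 'string' token
        stack.Push(new MiniMember(r2.Value, r0));
    }

    // Replays the visit order T[1], T[2], T[3], R[15], R[10].
    public static MiniMember Run() {
        var stack = new Stack<object>();
        stack.Push(new MiniToken("string", "\"a\""));
        stack.Push(new MiniToken(":", ":"));
        stack.Push(new MiniToken("number", "0.3"));
        ReduceValueNumber(stack);
        ReduceMember(stack);
        return (MiniMember)stack.Pop();
    }
}
```

`ExtractSketch.Run()` returns a member whose key is the raw token text `"a"` (quotes included) and whose value is a Number with text 0.3. Each reduction pops exactly as many items as its rule has right-hand nodes and pushes one result, so the stack depth always matches the unreduced part of the tree.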
Complete extractor code: InitializeExtractorItems
using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        /// <summary>
        /// <see cref="LRNode.type"/> -&gt; <see cref="Action{LRNode, TContext{Json}}"/>
        /// </summary>
        private static readonly Action<LRNode, TContext<Json>>?[]
            @jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];

        /// <summary>
        /// initialize dict for extractor.
        /// </summary>
        private static void InitializeExtractorItems() {
            var extractorItems = @jsonExtractorItems;

            #region obsolete
            //extractorDict.Add(st.NotYet,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.Error,
            //(node, context) => {
            // nothing to do.
            //});
            //extractorDict.Add(st.blockComment,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.inlineComment,
            //(node, context) => {
            // not needed.
            //});
            #endregion obsolete

            extractorItems[st.@终/*0*/] = static (node, context) => {
                // [-1] Json' = Json ;
                // dumped by user-defined extractor
                context.result = (Json)context.rightStack.Pop();
            }; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... };
            const int lexiVtCount = 11;
            extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 0: { // [0] Json = Object ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonMember>)context.rightStack.Pop();
                    var left = new Json(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 1: { // [1] Json = Array ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonValue>)context.rightStack.Pop();
                    var left = new Json(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 2: { // [2] Object = '{' '}' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new List<JsonMember>();
                    context.rightStack.Push(left);
                }
                break;
                case 3: { // [3] Object = '{' Members '}' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (List<JsonMember>)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = r1;
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 4: { // [4] Array = '[' ']' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new List<JsonValue>();
                    context.rightStack.Push(left);
                }
                break;
                case 5: { // [5] Array = '[' Elements ']' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (List<JsonValue>)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = r1;
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 6: { // [6] Members = Members ',' Member ;
                    // dumped by user-defined extractor
                    var r0 = (JsonMember)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (List<JsonMember>)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 7: { // [7] Members = Member ;
                    // dumped by user-defined extractor
                    var r0 = (JsonMember)context.rightStack.Pop();
                    var left = new List<JsonMember>();
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 8: { // [8] Elements = Elements ',' Element ;
                    // dumped by user-defined extractor
                    var r0 = (JsonValue)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (List<JsonValue>)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 9: { // [9] Elements = Element ;
                    // dumped by user-defined extractor
                    var r0 = (JsonValue)context.rightStack.Pop();
                    var left = new List<JsonValue>();
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 10: { // [10] Member = 'string' ':' Value ;
                    // dumped by user-defined extractor
                    var r0 = (JsonValue)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (Token)context.rightStack.Pop();
                    var left = new JsonMember(key: r2.value, value: r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... };
            /*
            extractorItems[st.Element枝(18) - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 11: { // [11] Element = Value ;
                    // dumped by DefaultExtractor
                    // var r0 = (VnValue)context.rightStack.Pop();
                    // var left = new VnElement(r0);
                    // context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Element枝(18) - lexiVtCount] = (node, context) => { ... };
            */
            extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 12: { // [12] Value = 'null' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.Null, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 13: { // [13] Value = 'true' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.True, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 14: { // [14] Value = 'false' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.False, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 15: { // [15] Value = 'number' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.Number, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 16: { // [16] Value = 'string' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.String, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 17: { // [17] Value = Object ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonMember>)context.rightStack.Pop();
                    var left = new JsonValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 18: { // [18] Value = Array ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonValue>)context.rightStack.Pop();
                    var left = new JsonValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };

        }
    }
}
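
The listing indexes its dispatch array with expressions such as st.Json枝 - lexiVtCount. Judging from the numeric comments in the code, slot 0 belongs to the end symbol and the 8 Vn node types (12 to 19) land in slots 1 to 8. The sketch below reproduces only that arithmetic. The constant names, the DispatchDemo class and the string-returning actions are made up for illustration and are not part of bitParser; how the real runtime treats an empty slot (such as the commented-out Element entry) is not shown in the article, so the "(no action)" fallback is purely an assumption.

```csharp
using System;

public static class DispatchDemo {
    // Numeric values copied from the comments in the listing; the names are hypothetical.
    const int End = 0;            // st.@终
    const int JsonNode = 12;      // st.Json枝
    const int ElementNode = 18;   // st.Element枝 (no hand-written action)
    const int ValueNode = 19;     // st.Value枝
    const int LexiVtCount = 11;   // offset subtracted from every Vn node type

    // One slot for the end symbol plus eight slots for the Vn node types.
    static readonly Func<string>[] items = new Func<string>[1 + 8];

    static DispatchDemo() {
        items[End] = () => "end";
        items[JsonNode - LexiVtCount] = () => "Json";
        items[ValueNode - LexiVtCount] = () => "Value";
        // items[ElementNode - LexiVtCount] is intentionally left empty.
    }

    // Maps a node type to its slot in the dispatch array.
    public static int SlotOf(int nodeType) =>
        nodeType == End ? 0 : nodeType - LexiVtCount;

    // Looks up the action for a node type; empty slots yield a placeholder.
    public static string Dispatch(int nodeType) {
        var action = items[SlotOf(nodeType)];
        return action == null ? "(no action)" : action();
    }
}
```

An array lookup like this avoids hashing on every node visit, which fits the article's stated goal of parsing as efficiently as possible.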

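Different applications need different semantic information, so the extractor code that bitParser generates in one click does not look like the listing above. The generated version only flattens the syntax tree and keeps as much source-code information as possible, as shown below: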

One-click generated extractor code
using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        /// <summary>
        /// <see cref="LRNode.type"/> -&gt; <see cref="Action{LRNode, TContext{Json}}"/>
        /// </summary>
        private static readonly Action<LRNode, TContext<Json>>?[]
            @jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];

        /// <summary>
        /// initialize dict for extractor.
        /// </summary>
        private static void InitializeExtractorItems() {
            var extractorItems = @jsonExtractorItems;

            #region obsolete
            //extractorDict.Add(st.NotYet,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.Error,
            //(node, context) => {
            // nothing to do.
            //});
            //extractorDict.Add(st.blockComment,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.inlineComment,
            //(node, context) => {
            // not needed.
            //});
            #endregion obsolete

            extractorItems[st.@终/*0*/] = static (node, context) => {
                // [-1] Json' = Json ;
                // dumped by ExternalExtractor
                var @final = (VnJson)context.rightStack.Pop();
                var left = new Json(@final);
                context.result = left; // final step, no need to push into stack.
            }; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... };
            const int lexiVtCount = 11;
            extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 0: { // [0] Json = Object ;
                    // dumped by InheritExtractor
                    // class VnObject : VnJson
                    var r0 = (VnObject)context.rightStack.Pop();
                    var left = r0;
                    context.rightStack.Push(left);
                }
                break;
                case 1: { // [1] Json = Array ;
                    // dumped by InheritExtractor
                    // class VnArray : VnJson
                    var r0 = (VnArray)context.rightStack.Pop();
                    var left = r0;
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 2: { // [2] Object = '{' '}' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnObject(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 3: { // [3] Object = '{' Members '}' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (VnMembers)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnObject(r2, r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 4: { // [4] Array = '[' ']' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnArray(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 5: { // [5] Array = '[' Elements ']' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (VnElements)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnArray(r2, r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 6: { // [6] Members = Members ',' Member ;
                    // dumped by ListExtractor 2
                    var r0 = (VnMember)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (VnMembers)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 7: { // [7] Members = Member ;
                    // dumped by ListExtractor 1
                    var r0 = (VnMember)context.rightStack.Pop();
                    var left = new VnMembers(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 8: { // [8] Elements = Elements ',' Element ;
                    // dumped by ListExtractor 2
                    var r0 = (VnElement)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (VnElements)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 9: { // [9] Elements = Element ;
                    // dumped by ListExtractor 1
                    var r0 = (VnElement)context.rightStack.Pop();
                    var left = new VnElements(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 10: { // [10] Member = 'string' ':' Value ;
                    // dumped by DefaultExtractor
                    var r0 = (VnValue)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (Token)context.rightStack.Pop();
                    var left = new VnMember(r2, r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Element枝/*18*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 11: { // [11] Element = Value ;
                    // dumped by DefaultExtractor
                    var r0 = (VnValue)context.rightStack.Pop();
                    var left = new VnElement(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Element枝/*18*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 12: { // [12] Value = 'null' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 13: { // [13] Value = 'true' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 14: { // [14] Value = 'false' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 15: { // [15] Value = 'number' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 16: { // [16] Value = 'string' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 17: { // [17] Value = Object ;
                    // dumped by DefaultExtractor
                    var r0 = (VnObject)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 18: { // [18] Value = Array ;
                    // dumped by DefaultExtractor
                    var r0 = (VnArray)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };

        }
    }
}

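This generated code is deliberately conservative and takes the smallest possible step. You can keep developing on top of it, or write your own extraction actions for each node type. Since the goal in this article is to parse Json text files as efficiently as possible, I wrote all of the extraction actions by hand.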

Tests

Test case 0
{}
Test case 1
[]
Test case 2
{ "a": 0.3 }
Test case 3
{
  "a": 0.3,
  "b": true
}
Test case 4
{
  "a": 0.3,
  "b": true,
  "a": "again"
}
Test case 5
{
  "a": 0.3,
  "b": true,
  "a": "again",
  "array": [
    1,
    true,
    null,
    "str",
    {
      "t": 100,
      "array2": [ false, 3.14, "tmp" ]
    }
  ]
}

The Json parser parses all of the test cases above correctly. You can also validate them at (https://jsonlint.com/).
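
Test cases 4 and 5 repeat the key "a". Because the hand-written extractor collects members into a List<JsonMember>, both entries should survive in source order. The sketch below contrasts that with a dictionary, which keeps one value per key. MemberSketch and DuplicateKeyDemo are simplified stand-ins invented for this illustration, not classes from the demo project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a key/value member; values are kept as raw text.
public sealed record MemberSketch(string Key, string Value);

public static class DuplicateKeyDemo {
    // The members of test case 4, in source order.
    public static List<MemberSketch> Members() => new List<MemberSketch> {
        new MemberSketch("a", "0.3"),
        new MemberSketch("b", "true"),
        new MemberSketch("a", "\"again\""),
    };

    // A list keeps every member, duplicates included.
    public static int CountInList() => Members().Count;

    // A dictionary filled through its indexer keeps only the last value per key.
    public static Dictionary<string, string> ToDictionaryLastWins() {
        var dict = new Dictionary<string, string>();
        foreach (var m in Members()) {
            dict[m.Key] = m.Value;
        }
        return dict;
    }
}
```

Keeping a list leaves the policy for duplicate keys (first wins, last wins, or reject) to the caller instead of baking it into the parser.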

The code that calls the Json parser is:

using System.IO; // File.ReadAllText

var compiler = new bitzhuwei.JsonFormat.CompilerJson();
var sourceCode = File.ReadAllText("xxx.json"); // read the Json text
var tokens = compiler.Analyze(sourceCode); // lexical analysis
var syntaxTree = compiler.Parse(tokens); // syntax analysis
var json = compiler.Extract(syntaxTree.root, tokens, sourceCode); // semantic extraction
// use json ...

End
