CSharp初体验

入门

初来乍到了解一门新的语言,它可能和熟悉的c/c++有不小差别,整体上需要首先了解下语法文件的整体结构。例如,源文件整体结构如何。

乍看CSharp源文件(compile unit)的结构,官网主要是通过文字描述的整体结构,而下面的形式化语法,描述也不太符合自定向下这种类型的语法结构描述方法,这样对于新手来了解这种语言的整体结构来说就有些困难。

好在有一个开源的dotgnu项目,该项目的官方文档中显示,项目已经在2012年正式废弃(可能更早已经没有更新了)。从工程的语法描述文件来看,它还没有涉及到lambda表达式这种重要语法功能的支持,不知道是因为项目启动时暂时没有支持,或者是启动时CSharp还没有这种语法功能。

As of December 2012, the DotGNU project has been decommissioned, until and unless a substantial new volunteer effort arises. The exception is the libjit component, which is now a separate libjit package.

dotgnu

尽管该项目比较久远,但是它的语法描述是通过经典的yacc语法描述,这样对于理解整体结构时最为直观的。其中对于整体结构的描述大致如下。从这个描述来看,整个源文件的结构顶层只能包含using、namespace、class、enum、struct、module、interface、delegate这些声明。

///@file: DotGnupnetcscccsharpcs_grammar.y /*  * Outer level of the C# input file.  */  CompilationUnit 	: /* empty */	{ 				/* The input file is empty */ 				CCTypedWarning("-empty-input", 							   "file contains no declarations"); 				ResetState(); 			} 	| OuterDeclarationsRecoverable		{ 				/* Check for empty input and finalize the parse */ 				if(!HaveDecls) 				{ 					CCTypedWarning("-empty-input", 								   "file contains no declarations"); 				} 				ResetState(); 			} 	| OuterDeclarationsRecoverable NonOptAttributes	{ 				/* A file that contains declarations and assembly attributes */ 				if($2) 				{ 					InitGlobalNamespace(); 					CCPluginAddStandaloneAttrs 						(ILNode_StandaloneAttr_create 							((ILNode*)CurrNamespaceNode, $2)); 				} 				ResetState(); 			} 	| NonOptAttributes	{ 				/* A file that contains only assembly attributes */ 				if($1) 				{ 					InitGlobalNamespace(); 					CCPluginAddStandaloneAttrs 						(ILNode_StandaloneAttr_create 							((ILNode*)CurrNamespaceNode, $1)); 				} 				ResetState(); 			} 	;  /*  * Note: strictly speaking, declarations should be ordered so  * that using declarations always come before namespace members.  * We have relaxed this to make error recovery easier.  */ OuterDeclarations 	: OuterDeclaration 	| OuterDeclarations OuterDeclaration 	;  OuterDeclaration 	: UsingDirective 	| NamespaceMemberDeclaration 	| error			{ 				/* 				 * This production recovers from errors at the outer level 				 * by skipping invalid tokens until a namespace, using, 				 * type declaration, or attribute, is encountered. 				 */ 			#ifdef YYEOF 				while(yychar != YYEOF) 			#else 				while(yychar >= 0) 			#endif 				{ 					if(yychar == NAMESPACE || yychar == USING || 					   yychar == PUBLIC || yychar == INTERNAL || 					   yychar == UNSAFE || yychar == SEALED || 					   yychar == ABSTRACT || yychar == CLASS || 					   yychar == STRUCT || yychar == DELEGATE || 					   yychar == ENUM || yychar == INTERFACE || 					   yychar == '[') 					{ 						/* This token starts a new outer-level declaration */ 						break; 					} 					else if(yychar == '}' && CurrNamespace.len != 0) 					{ 						/* Probably the end of the enclosing namespace */ 						break; 					} 					else if(yychar == ';') 					{ 						/* Probably the end of an outer-level declaration, 						   so restart the parser on the next token */ 						yychar = YYLEX; 						break; 					} 					yychar = YYLEX; 				} 			#ifdef YYEOF 				if(yychar != YYEOF) 			#else 				if(yychar >= 0) 			#endif 				{ 					yyerrok; 				} 				NestingLevel = 0; 			} 	; ///.... OptNamespaceMemberDeclarations 	: /* empty */ 	| OuterDeclarations 	;  NamespaceMemberDeclaration 	: NamespaceDeclaration 	| TypeDeclaration			{ CCPluginAddTopLevel($1); } 	;  TypeDeclaration 	: ClassDeclaration			{ $$ = $1; } 	| ModuleDeclaration			{ $$ = $1; } 	| StructDeclaration			{ $$ = $1; } 	| InterfaceDeclaration		{ $$ = $1; } 	| EnumDeclaration			{ $$ = $1; } 	| DelegateDeclaration		{ $$ = $1; } 	;  

roslyn

微软官方开源了CSharp的实现,所以最标准的解释应该是来自微软官方代码。遗憾的是这个工程是使用CSharp开发的,所以项目内对于语法的解析也不是通过yacc文件描述,而是手工实现的一个编译器解析。猜测代码应该位于

///@file: roslynsrcCompilersCSharpPortableParser          internal CompilationUnitSyntax ParseCompilationUnitCore()         {             SyntaxToken? tmp = null;             SyntaxListBuilder? initialBadNodes = null;             var body = new NamespaceBodyBuilder(_pool);             try             {                 this.ParseNamespaceBody(ref tmp, ref body, ref initialBadNodes, SyntaxKind.CompilationUnit);                  var eof = this.EatToken(SyntaxKind.EndOfFileToken);                 var result = _syntaxFactory.CompilationUnit(body.Externs, body.Usings, body.Attributes, body.Members, eof);                  if (initialBadNodes != null)                 {                     // attach initial bad nodes as leading trivia on first token                     result = AddLeadingSkippedSyntax(result, initialBadNodes.ToListNode());                     _pool.Free(initialBadNodes);                 }                  return result;             }             finally             {                 body.Free(_pool);             }         }             private void ParseNamespaceBody(             [NotNullIfNotNull(nameof(openBraceOrSemicolon))] ref SyntaxToken? openBraceOrSemicolon,             ref NamespaceBodyBuilder body,             ref SyntaxListBuilder? initialBadNodes,             SyntaxKind parentKind)         {             // "top-level" expressions and statements should never occur inside an asynchronous context             Debug.Assert(!IsInAsync);              bool isGlobal = openBraceOrSemicolon == null;              var saveTerm = _termState;             _termState |= TerminatorState.IsNamespaceMemberStartOrStop;             NamespaceParts seen = NamespaceParts.None;             var pendingIncompleteMembers = _pool.Allocate<MemberDeclarationSyntax>();             bool reportUnexpectedToken = true;              try             {                 while (true)                 {                     switch (this.CurrentToken.Kind)                     {                         case SyntaxKind.NamespaceKeyword:                             // incomplete members must be processed before we add any nodes to the body:                             AddIncompleteMembers(ref pendingIncompleteMembers, ref body);                              var attributeLists = _pool.Allocate<AttributeListSyntax>();                             var modifiers = _pool.Allocate();                              body.Members.Add(adjustStateAndReportStatementOutOfOrder(ref seen, this.ParseNamespaceDeclaration(attributeLists, modifiers)));                              _pool.Free(attributeLists);                             _pool.Free(modifiers);                              reportUnexpectedToken = true;                             break;                          case SyntaxKind.CloseBraceToken:                             // A very common user error is to type an additional }                              // somewhere in the file.  This will cause us to stop parsing                             // the root (global) namespace too early and will make the                              // rest of the file unparseable and unusable by intellisense.                             // We detect that case here and we skip the close curly and                             // continue parsing as if we did not see the }                             if (isGlobal)                             {                                 // incomplete members must be processed before we add any nodes to the body:                                 ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);                                  var token = this.EatToken();                                 token = this.AddError(token,                                     IsScript ? ErrorCode.ERR_GlobalDefinitionOrStatementExpected : ErrorCode.ERR_EOFExpected);                                  this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, token);                                 reportUnexpectedToken = true;                                 break;                             }                             else                             {                                 // This token marks the end of a namespace body                                 return;                             }                          case SyntaxKind.EndOfFileToken:                             // This token marks the end of a namespace body                             return;                          case SyntaxKind.ExternKeyword:                             if (isGlobal && !ScanExternAliasDirective())                             {                                 // extern member or a local function                                 goto default;                             }                             else                             {                                 // incomplete members must be processed before we add any nodes to the body:                                 ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);                                  var @extern = ParseExternAliasDirective();                                 if (seen > NamespaceParts.ExternAliases)                                 {                                     @extern = this.AddErrorToFirstToken(@extern, ErrorCode.ERR_ExternAfterElements);                                     this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, @extern);                                 }                                 else                                 {                                     body.Externs.Add(@extern);                                     seen = NamespaceParts.ExternAliases;                                 }                                  reportUnexpectedToken = true;                                 break;                             }                          case SyntaxKind.UsingKeyword:                             if (isGlobal && (this.PeekToken(1).Kind == SyntaxKind.OpenParenToken || (!IsScript && IsPossibleTopLevelUsingLocalDeclarationStatement())))                             {                                 // Top-level using statement or using local declaration                                 goto default;                             }                             else                             {                                 parseUsingDirective(ref openBraceOrSemicolon, ref body, ref initialBadNodes, ref seen, ref pendingIncompleteMembers);                             }                              reportUnexpectedToken = true;                             break;                          case SyntaxKind.IdentifierToken:                             if (this.CurrentToken.ContextualKind != SyntaxKind.GlobalKeyword || this.PeekToken(1).Kind != SyntaxKind.UsingKeyword)                             {                                 goto default;                             }                             else                             {                                 parseUsingDirective(ref openBraceOrSemicolon, ref body, ref initialBadNodes, ref seen, ref pendingIncompleteMembers);                             }                              reportUnexpectedToken = true;                             break;                          case SyntaxKind.OpenBracketToken:                             if (this.IsPossibleGlobalAttributeDeclaration())                             {                                 // incomplete members must be processed before we add any nodes to the body:                                 ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);                                  var attribute = this.ParseAttributeDeclaration();                                 if (!isGlobal || seen > NamespaceParts.GlobalAttributes)                                 {                                     RoslynDebug.Assert(attribute.Target != null, "Must have a target as IsPossibleGlobalAttributeDeclaration checks for that");                                     attribute = this.AddError(attribute, attribute.Target.Identifier, ErrorCode.ERR_GlobalAttributesNotFirst);                                     this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, attribute);                                 }                                 else                                 {                                     body.Attributes.Add(attribute);                                     seen = NamespaceParts.GlobalAttributes;                                 }                                  reportUnexpectedToken = true;                                 break;                             }                              goto default;                          default:                             var memberOrStatement = isGlobal ? this.ParseMemberDeclarationOrStatement(parentKind) : this.ParseMemberDeclaration(parentKind);                             if (memberOrStatement == null)                             {                                 // incomplete members must be processed before we add any nodes to the body:                                 ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);                                  // eat one token and try to parse declaration or statement again:                                 var skippedToken = EatToken();                                 if (reportUnexpectedToken && !skippedToken.ContainsDiagnostics)                                 {                                     skippedToken = this.AddError(skippedToken,                                         IsScript ? ErrorCode.ERR_GlobalDefinitionOrStatementExpected : ErrorCode.ERR_EOFExpected);                                      // do not report the error multiple times for subsequent tokens:                                     reportUnexpectedToken = false;                                 }                                  this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, skippedToken);                             }                             else if (memberOrStatement.Kind == SyntaxKind.IncompleteMember && seen < NamespaceParts.MembersAndStatements)                             {                                 pendingIncompleteMembers.Add(memberOrStatement);                                 reportUnexpectedToken = true;                             }                             else                             {                                 // incomplete members must be processed before we add any nodes to the body:                                 AddIncompleteMembers(ref pendingIncompleteMembers, ref body);                                  body.Members.Add(adjustStateAndReportStatementOutOfOrder(ref seen, memberOrStatement));                                 reportUnexpectedToken = true;                             }                             break;                     }                 }             }             finally             {                 _termState = saveTerm;                  // adds pending incomplete nodes:                 AddIncompleteMembers(ref pendingIncompleteMembers, ref body);                 _pool.Free(pendingIncompleteMembers);             }              MemberDeclarationSyntax adjustStateAndReportStatementOutOfOrder(ref NamespaceParts seen, MemberDeclarationSyntax memberOrStatement)             {                 switch (memberOrStatement.Kind)                 {                     case SyntaxKind.GlobalStatement:                         if (seen < NamespaceParts.MembersAndStatements)                         {                             seen = NamespaceParts.MembersAndStatements;                         }                         else if (seen == NamespaceParts.TypesAndNamespaces)                         {                             seen = NamespaceParts.TopLevelStatementsAfterTypesAndNamespaces;                              if (!IsScript)                             {                                 memberOrStatement = this.AddError(memberOrStatement, ErrorCode.ERR_TopLevelStatementAfterNamespaceOrType);                             }                         }                          break;                      case SyntaxKind.NamespaceDeclaration:                     case SyntaxKind.FileScopedNamespaceDeclaration:                     case SyntaxKind.EnumDeclaration:                     case SyntaxKind.StructDeclaration:                     case SyntaxKind.ClassDeclaration:                     case SyntaxKind.InterfaceDeclaration:                     case SyntaxKind.DelegateDeclaration:                     case SyntaxKind.RecordDeclaration:                     case SyntaxKind.RecordStructDeclaration:                         if (seen < NamespaceParts.TypesAndNamespaces)                         {                             seen = NamespaceParts.TypesAndNamespaces;                         }                         break;                      default:                         if (seen < NamespaceParts.MembersAndStatements)                         {                             seen = NamespaceParts.MembersAndStatements;                         }                         break;                 }                  return memberOrStatement;             }              void parseUsingDirective(                 ref SyntaxToken? openBrace,                 ref NamespaceBodyBuilder body,                 ref SyntaxListBuilder? initialBadNodes,                 ref NamespaceParts seen,                 ref SyntaxListBuilder<MemberDeclarationSyntax> pendingIncompleteMembers)             {                 // incomplete members must be processed before we add any nodes to the body:                 ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBrace, ref body, ref initialBadNodes);                  var @using = this.ParseUsingDirective();                 if (seen > NamespaceParts.Usings)                 {                     @using = this.AddError(@using, ErrorCode.ERR_UsingAfterElements);                     this.AddSkippedNamespaceText(ref openBrace, ref body, ref initialBadNodes, @using);                 }                 else                 {                     body.Usings.Add(@using);                     seen = NamespaceParts.Usings;                 }             }         } 

乌龙

因为这个这种手撕的编译器代码看起来过于晦涩,又回头看了下CSharp的官方语言描述,其中是有编译单元入口描述的,只是隐藏的位置比较深,所以刚开始没看到([流汗]),这个最顶层的语法结构就是compilation_unit,从这个依次向下可以看到对于该结构的逐层描述和细化。从这个语法描述结构来看,最顶层的结构的确只能宝库using开始的结构,然后就是namespace,以及type_declaration。

// Source: §14.2 Compilation units compilation_unit     : extern_alias_directive* using_directive* global_attributes?       namespace_member_declaration*     ;      // Source: §22.3 Attribute specification global_attributes     : global_attribute_section+     ;  // Source: §14.6 Namespace member declarations namespace_member_declaration     : namespace_declaration     | type_declaration     ;  // Source: §14.7 Type declarations type_declaration     : class_declaration     | struct_declaration     | interface_declaration     | enum_declaration     | delegate_declaration     ; // Source: §14.3 Namespace declarations namespace_declaration     : 'namespace' qualified_identifier namespace_body ';'?     ;      global_attribute_section     : '[' global_attribute_target_specifier attribute_list ']'     | '[' global_attribute_target_specifier attribute_list ',' ']'     ;      

lambda表达式

在众多表达式中,这种lambda是一种比较顺手的语法结构,经在很多项目中出镜率还是很高的,所以还是要看下这个语法。在这个语法描述中,可以看到,关键的是"=>"这个语法结构,在这个结构之前,可以使用括弧(explicit_anonymous_function_signature),也可以不使用(implicit_anonymous_function_signature)。这种语法其实很难使用yacc语法描述,因为它对上下文的依赖非常强。

// Source: §12.19.1 General lambda_expression     : 'async'? anonymous_function_signature '=>' anonymous_function_body     ; anonymous_function_signature     : explicit_anonymous_function_signature     | implicit_anonymous_function_signature     ;  explicit_anonymous_function_signature     : '(' explicit_anonymous_function_parameter_list? ')'     ; implicit_anonymous_function_signature     : '(' implicit_anonymous_function_parameter_list? ')'     | implicit_anonymous_function_parameter     ;  implicit_anonymous_function_parameter_list     : implicit_anonymous_function_parameter       (',' implicit_anonymous_function_parameter)*     ;  implicit_anonymous_function_parameter     : identifier     ;      

其它=>

搜索语法中的这个'=>',可以发现除了lambda表达式之外,还有其他的场景使用,例如local_function_body。同样是这种语法结构,那么如何区域分是lambda表达式还是local_function呢?其实看下语法的上下文就可以看到,localfunction中'=>'前面是需要有类型(return_type)声明,而lambda表达式中的implicit_anonymous_function_parameter是作为expression来出现的,而顾名思义,expression表达式的前面是不可能出现type这种类型前缀引导的。

这里再次看到,CSharp这种语言是很难通过yacc这种通用的语法工具来描述。

// Source: §13.6.4 Local function declarations local_function_declaration     : local_function_header local_function_body     ;  local_function_header     : local_function_modifier* return_type identifier type_parameter_list?         ( formal_parameter_list? ) type_parameter_constraints_clause*     ; local_function_modifier     : 'async'     | 'unsafe'     ;  local_function_body     : block     | '=>' null_conditional_invocation_expression ';'     | '=>' expression ';'     ; 

推论

全局变量

一个直接的推论是:不存在类似于C/C++中“全局变量”的概念

main函数

由于不存在全局变量或者函数,所以也不存在类似于C/C++的全局main函数入口,所以整个应用(application)的入口只能位于某个class(不特定)内部,语言规定作为必须声明为static public类型。

what if no namespace

从语法上看,namespace并不是必须的,如果没有把声明放在namespace中,那么和C++一样,声明会放在全局globalnamespace中。

栗子

但是,按照语法规范写的代码并不代表就是合法的。例如下面根据语法规范写的代码,大部分都是错误:-(——编程好难啊……

using System;  //命名空间不能直接包含字段或方法之类的成员 int leela = 1;  namespace harry { 	class harry 	{ 		public static int fry(int x, int y) 		{ 			int localfunc() => x + y; 			//只有 assignment、call、increment、decrement 和 new 对象表达式可用作语句 			z => z + 1; 			//error CS0149: 应输入方法名称 			int dd = ((int a) => a + 1)(1); 			return localfunc(); 		} 		public static int Main() 		{ 			return fry(3, 7); 		} 	}; }   namespace tsecer { 	//命名空间不能直接包含字段或方法之类的成员 	void tsecer(){} } 

发表评论

评论已关闭。

相关文章