Context-free Grammars for Programming Languages

The structure of English is given in terms of subjects, verbs, etc. The structure of a computer program is given in terms of procedures, statements, expressions, etc. For example, an arithmetic expresion consisting of just addition and multiplication can be described using the following rules:

	<expression> ::= <expression> + <term> | <term>
	<term>       ::= <term> * <factor> | <factor>
	<factor>     ::= (  <expression> ) 
	               | <name> | <integer>
	<name>	   ::= <letter> | <name> <letter> | <name> <digit>
	<integer>	   ::= <digit> | <integer> <digit>
	<letter>     ::= A | B | ... |Z 
	<digit>	   ::= 0 | 1 | 2 | ... | 9
Here, we have used "::=" for is defined as rather than an arrow, --> as before. The metalanguage BNF (Backus-Naur Form) is a way of specifying context-free languages, and BNF was originally defined using ::= rather than -->. As long as we understand what is meant and what the capabilities of this grammatical notation are, the notation doesn't matter. We will often omit the angle brackets, <>, when writing BNF.

Unlike natural languages like English, all the legal strings in a programming language can be specified using a context-free grammar. However, grammars for programming languages specofy semantically incorrect strings as well. For example context-free languages cannot be used to tell if a variable, say, A, declared to be of type boolean is used in an arithmentic expression A + 1.

Parse Tree for A * B A parse tree shows the structure of a string using the grammar.

Derivations

Terminals and Nonterminals

Productions

Start Symbol

Sentential Form and Sentence

Extended BNF

Epsilon Productions

Send questions and comments to: Karen Lemone