Parsing | Set 2 (Bottom Up or Shift Reduce Parsers)

In this article, we are discussing the Bottom Up parser.
Bottom Up Parsers / Shift Reduce Parsers
Bottom-up parsers build the parse tree from the leaves to the root. Bottom-up parsing can be defined as an attempt to reduce the input string w to the start symbol of the grammar by tracing out the rightmost derivation of w in reverse.

Classification of bottom up parsers

LR parsing is the general form of shift-reduce parsing. The L stands for scanning the input from left to right, and the R stands for constructing a rightmost derivation in reverse.
Benefits of LR parsing:

  1. Many programming languages are parsed using some variation of an LR parser. C++ and Perl are notable exceptions.

  2. LR parsers can be implemented very efficiently.

  3. Of all the parsers that scan their input from left to right, LR parsers detect syntactic errors as soon as possible.

Here we will look at the construction of the GOTO graph of a grammar, which is used by all four LR parsing techniques. For solving questions in GATE, we have to construct the GOTO graph directly for the given grammar to save time.

LR(0) Parser
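We need two functions: Closure() and Goto(). Both work on the LR(0) items of the augmented grammar, so those are defined first.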

Augmented Grammar
If G is a grammar with start symbol S then G’, the augmented grammar for G, is the grammar with new start symbol S’ and a production S’ -> S. The purpose of this new starting production is to indicate to the parser when it should stop parsing and announce acceptance of input.
Let a grammar be
S -> AA
A -> aA | b
The augmented grammar for the above grammar will be
S’ -> S
S -> AA
A -> aA | b
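
As a minimal sketch of the augmentation step, the grammar can be held as a list of (left side, right side) pairs. The representation and the helper name augment are assumptions made for illustration, not part of any library.

```python
def augment(productions, start):
    """Return (augmented productions, new start symbol).

    productions: list of (lhs, rhs) pairs, rhs being a tuple of symbols.
    The new production S' -> S is placed first, so it gets index 0.
    """
    nonterminals = {lhs for lhs, _ in productions}
    new_start = start + "'"
    # Keep adding primes in the unlikely case the name is already taken.
    while new_start in nonterminals:
        new_start += "'"
    return [(new_start, (start,))] + list(productions), new_start


grammar = [("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
augmented, new_start = augment(grammar, "S")
for lhs, rhs in augmented:
    print(lhs, "->", " ".join(rhs))
```

Putting S' -> S at index 0 leaves the original productions numbered from 1, which is convenient when reduce entries are later written as R1, R2 and so on.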

LR(0) Items
An LR(0) item of a grammar G is a production of G with a dot at some position of the right side.
For example, the production S -> ABC yields four items:
S -> .ABC
S -> A.BC
S -> AB.C
S -> ABC.
The production A -> ε generates only one item, A -> . (the right side is empty, so the dot has only one possible position).
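
The enumeration of items can be sketched in a few lines. Here an item is assumed to be a triple (lhs, rhs, dot), where dot is the index the dot sits in front of; lr0_items and show are illustrative helper names.

```python
def lr0_items(lhs, rhs):
    """All LR(0) items of one production; an empty rhs stands for epsilon."""
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item in the usual dotted notation."""
    lhs, rhs, dot = item
    return f"{lhs} -> {''.join(rhs[:dot])}.{''.join(rhs[dot:])}"


items = [show(item) for item in lr0_items("S", ("A", "B", "C"))]
print(items)    # ['S -> .ABC', 'S -> A.BC', 'S -> AB.C', 'S -> ABC.']

epsilon_items = [show(item) for item in lr0_items("A", ())]
print(epsilon_items)    # ['A -> .']
```

A production with n symbols on the right side yields n + 1 items, which is why the ε-production yields exactly one.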

Closure Operation:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by the two rules:

  1. Initially every item in I is added to closure(I).
  2. If A -> α.Bβ is in closure(I) and B -> γ is a production, then add the item B -> .γ to closure(I) if it is not already there. We apply this rule until no more items can be added to closure(I).
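
The two rules can be sketched as a worklist loop. This is one possible implementation under assumed conventions: items are (lhs, rhs, dot) triples and GRAMMAR is the augmented example grammar written as a list of pairs.

```python
# Augmented example grammar: S' -> S, S -> AA, A -> aA | b
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]


def closure(items, grammar=GRAMMAR):
    """Return closure(I) as a frozenset of (lhs, rhs, dot) items."""
    result = set(items)        # rule 1: every item of I is in closure(I)
    worklist = list(items)
    while worklist:            # rule 2, repeated until nothing new appears
        lhs, rhs, dot = worklist.pop()
        if dot == len(rhs):
            continue           # dot at the end: no symbol follows it
        symbol = rhs[dot]      # the B in A -> alpha . B beta
        for p_lhs, p_rhs in grammar:
            new_item = (p_lhs, p_rhs, 0)
            if p_lhs == symbol and new_item not in result:
                result.add(new_item)
                worklist.append(new_item)
    return frozenset(result)


I0 = closure({("S'", ("S",), 0)})
for item in sorted(I0):
    print(item)
```

For the example grammar, I0 contains S' -> .S, S -> .AA, A -> .aA and A -> .b. If the symbol after the dot is a terminal, no production has it on the left side, so nothing is added.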


Goto Operation:
Goto(I, X) is computed in two steps:
  1. For every item A -> α.Xβ in I, move the dot past X to get A -> αX.β.
  2. Apply closure to the set of items obtained in the first step.
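
A sketch of Goto under the same assumed representation, with items as (lhs, rhs, dot) triples. The closure helper is repeated so that the snippet runs on its own.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]


def closure(items, grammar=GRAMMAR):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs):
            for p_lhs, p_rhs in grammar:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar=GRAMMAR):
    """Goto(I, X): move the dot over X wherever possible, then close."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in items
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved, grammar)


I0 = closure({("S'", ("S",), 0)})
print(sorted(goto(I0, "a")))   # items A -> .aA, A -> a.A, A -> .b
print(sorted(goto(I0, "S")))   # the single item S' -> S.
```

When no item of I has X right after the dot, the result is the empty set, which means there is no transition on X from that state.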

Construction of GOTO graph-

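  • State I0 – closure of the augmented start item S’ -> .S
  • Using I0, find the whole collection of sets of LR(0) items by applying Goto repeatedly; the sets are the states of a DFA and the Goto moves are its transitions
  • Convert the DFA to the LR(0) parsing table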
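
The first two steps can be sketched as a breadth-first exploration starting from I0. The state numbers produced below depend on the order in which symbols are tried (here S, A, a, b, which is an assumption), so they need not match the numbering of a hand-drawn graph.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
SYMBOLS = ["S", "A", "a", "b"]   # nonterminals first, then terminals


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs):
            for p_lhs, p_rhs in GRAMMAR:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    return closure({(l, r, d + 1) for l, r, d in items
                    if d < len(r) and r[d] == symbol})


def canonical_collection():
    """Return (states, edges): states[i] is item set Ii, and
    edges[(i, X)] = j records the transition Goto(Ii, X) = Ij."""
    start_lhs, start_rhs = GRAMMAR[0]
    states = [closure({(start_lhs, start_rhs, 0)})]
    edges = {}
    i = 0
    while i < len(states):             # states found later are visited too
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if not target:
                continue               # no transition on this symbol
            if target not in states:
                states.append(target)
            edges[(i, symbol)] = states.index(target)
        i += 1
    return states, edges


states, edges = canonical_collection()
print(len(states), "states,", len(edges), "transitions")
for (i, symbol), j in sorted(edges.items()):
    print(f"I{i} --{symbol}--> I{j}")
```

For the example grammar this yields seven states, I0 to I6, and with this exploration order I5 is the state containing only S -> AA.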

Construction of LR(0) parsing table:

  • The action function takes as arguments a state i and a terminal a (or $ , the input end marker). The value of ACTION[i, a] can have one of four forms:
    1. Shift j, where j is a state.
    2. Reduce A -> β.
    3. Accept
    4. Error
  • We extend the GOTO function, defined on sets of items, to states: if GOTO[Ii , A] = Ij then GOTO also maps a state i and a nonterminal A to state j.
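
To see how the four ACTION forms and the GOTO map are used while parsing, here is a small driver sketch. The table is written by hand for the grammar S -> AA, A -> aA | b under one possible state numbering, so the state numbers, the tuple encoding of entries and the name parse are all assumptions for illustration.

```python
# Production number -> (left side, length of right side)
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

ACTION = {
    (0, "a"): ("shift", 3), (0, "b"): ("shift", 4),
    (1, "$"): ("accept",),
    (2, "a"): ("shift", 3), (2, "b"): ("shift", 4),
    (3, "a"): ("shift", 3), (3, "b"): ("shift", 4),
}
# LR(0) has no lookahead, so a reducing state reduces on every column.
for state, production in ((4, 3), (5, 1), (6, 2)):
    for terminal in ("a", "b", "$"):
        ACTION[(state, terminal)] = ("reduce", production)

GOTO = {(0, "S"): 1, (0, "A"): 2, (2, "A"): 5, (3, "A"): 6}


def parse(tokens):
    """Return the list of productions used, or None on a syntax error."""
    tokens = list(tokens) + ["$"]      # append the input end marker
    stack = [0]                        # stack of states, I0 at the bottom
    pos = 0
    reductions = []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:              # blank cell: error
            return None
        if entry[0] == "shift":
            stack.append(entry[1])
            pos += 1
        elif entry[0] == "reduce":
            lhs, length = PRODUCTIONS[entry[1]]
            del stack[len(stack) - length:]        # pop one state per symbol
            stack.append(GOTO[(stack[-1], lhs)])   # then follow GOTO
            reductions.append(entry[1])
        else:                          # accept
            return reductions


print(parse("abb"))   # [3, 2, 3, 1]
print(parse("ab"))    # None
```

The reduction sequence 3, 2, 3, 1 for abb is the rightmost derivation S => AA => Ab => aAb => abb read backwards.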

Consider the grammar
S -> AA
A -> aA | b
Augmented grammar
S’ -> S
S -> AA
A -> aA | b

The LR(0) parsing table for above GOTO graph will be –

The action part of the table contains all the terminals of the grammar (plus $), whereas the goto part contains all the nonterminals. For every state of the goto graph we write all its goto operations in the table. If goto is applied on a terminal, it is written in the action part as a shift Sj, where j is the target state; if goto is applied on a nonterminal, the target state is written in the goto part. If a state contains a reduced production (i.e. the dot has reached the end of the production and no further closure can be applied), it is written as Ri, where i is the number of the production.
An LR(0) parser uses no lookahead, so a reduction is written under every terminal and $ in the row of that state. For ex: in I5 S -> AA is reduced, so R1 is written under a, b and $. Writing it only under the terminals in follow of the left side of the production, here follow(S) = {$}, is the refinement used by the SLR(1) parser.
If in a state the augmented production S’ -> S is reduced (the item S’ -> S.), it is written under the $ symbol as accept.

NOTE: If any state contains both a shifted and a reduced production (a shift-reduce conflict), or two reduced productions (a reduce-reduce conflict), it is called a conflict situation and the grammar is not an LR(0) grammar.
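
The table filling and the conflict check can be sketched together. The sketch assumes a pure LR(0) table, in which a reduce entry fills the whole action row of its state. Items are (lhs, rhs, dot) triples, and build_table is an illustrative name. State numbers follow the exploration order S, A, a, b.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("A", "A")), ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS, TERMINALS = ["S", "A"], ["a", "b"]


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs):
            for p_lhs, p_rhs in GRAMMAR:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    return closure({(l, r, d + 1) for l, r, d in items
                    if d < len(r) and r[d] == symbol})


def canonical_collection():
    states, edges, i = [closure({(*GRAMMAR[0], 0)})], {}, 0
    while i < len(states):
        for symbol in NONTERMINALS + TERMINALS:
            target = goto(states[i], symbol)
            if target:
                if target not in states:
                    states.append(target)
                edges[(i, symbol)] = states.index(target)
        i += 1
    return states, edges


def build_table():
    """Return (action, goto_table, conflicts) for the LR(0) automaton."""
    states, edges = canonical_collection()
    action, goto_table, conflicts = {}, {}, []

    def set_action(state, terminal, entry):
        old = action.get((state, terminal))
        if old is not None and old != entry:
            conflicts.append((state, terminal, old, entry))
        action[(state, terminal)] = entry

    for (i, symbol), j in edges.items():
        if symbol in TERMINALS:
            set_action(i, symbol, ("shift", j))      # Sj in the action part
        else:
            goto_table[(i, symbol)] = j              # plain state number
    for i, state in enumerate(states):
        for lhs, rhs, dot in state:
            if dot < len(rhs):
                continue                             # not a reduced item
            if lhs == GRAMMAR[0][0]:
                set_action(i, "$", ("accept",))      # S' -> S. under $
            else:
                number = GRAMMAR.index((lhs, rhs))   # Ri
                for terminal in TERMINALS + ["$"]:
                    set_action(i, terminal, ("reduce", number))
    return action, goto_table, conflicts


action, goto_table, conflicts = build_table()
print("conflicts:", conflicts)   # conflicts: []
```

An empty conflict list means no cell received two different entries, so the example grammar is LR(0). A grammar with a shift-reduce or reduce-reduce conflict would show up as one or more tuples in that list.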

This article is attributed to GeeksforGeeks.org
