What is Marpa and when is it needed?

E. Choroba

Praha.pm (Prague)

Currently unemployed

PerlMonks, CPAN, StackOverflow

choroba@matfyz.cz

What is Marpa?

But before we start…

Who was Marpa?

Marpa the Translator

Marpa painting at Holy Isle

What is Marpa?

Marpa::R2 is a parser module by Jeffrey Kegler.

What is a parser?

Classic example

Grammar: Backus-Naur form

Expression ::= Term
             | Expression + Term
             | Expression - Term
Term       ::= Factor
             | Term * Factor
             | Term / Factor
Factor     ::= number
             | ( Expression )
             | + Factor
             | - Factor

Parses (or generates) mathematical expressions of arbitrary complexity.

Classic example

Grammar: Backus-Naur form

Expression ::= Term
             | Expression + Term
             | Expression - Term
Term       ::= Factor
             | Term * Factor
             | Term / Factor
Factor     ::= number
             | ( Expression )
             | + Factor
             | - Factor
7 + 2 * 3
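
To make "what is a parser?" concrete before Marpa enters the picture, here is a minimal hand-written sketch in plain Perl that walks the grammar above rule by rule and evaluates the input as it goes. It is not how Marpa works internally; it is a simple recursive-descent evaluator, and the sub names (tokenize, expression, term, factor, evaluate) are made up for this illustration. Loops stand in for the repeated + - and * / alternatives.

```perl
use strict;
use warnings;

# Split the input into numbers, operators and parentheses; skip whitespace.
sub tokenize {
    my ($input) = @_;
    return [ $input =~ m{ [0-9]+ (?: \. [0-9]+ )? | [-+*/()] }gx ];
}

# Expression: terms joined by + or -
sub expression {
    my ($tokens) = @_;
    my $value = term($tokens);
    while (@$tokens && $tokens->[0] =~ m{^[-+]$}) {
        my $op    = shift @$tokens;
        my $right = term($tokens);
        $value = $op eq '+' ? $value + $right : $value - $right;
    }
    return $value;
}

# Term: factors joined by * or /
sub term {
    my ($tokens) = @_;
    my $value = factor($tokens);
    while (@$tokens && $tokens->[0] =~ m{^[*/]$}) {
        my $op    = shift @$tokens;
        my $right = factor($tokens);
        $value = $op eq '*' ? $value * $right : $value / $right;
    }
    return $value;
}

# Factor: number, parenthesised expression, or signed factor
sub factor {
    my ($tokens) = @_;
    my $token = shift @$tokens;
    die "Unexpected end of input\n" unless defined $token;
    if ($token eq '(') {
        my $value = expression($tokens);
        my $close = shift @$tokens;
        die "Missing )\n" unless defined $close && $close eq ')';
        return $value;
    }
    return factor($tokens) if $token eq '+';
    if ($token eq '-') {
        my $value = factor($tokens);
        return -$value;
    }
    die "Unexpected token $token\n" unless $token =~ m{^[0-9]};
    return $token;
}

sub evaluate {
    my ($input) = @_;
    my $tokens  = tokenize($input);
    my $value   = expression($tokens);
    die "Trailing input: @$tokens\n" if @$tokens;
    return $value;
}

print evaluate('7 + 2 * 3'), "\n";
```

The multiplication is handled inside term before expression ever sees the +, which is exactly what the layering of Expression, Term and Factor in the grammar expresses.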

Classic example

#!/usr/bin/perl
use warnings;
use strict;

use Marpa::R2;

my $rules = << '__G__';
lexeme default = latm => 1
:default ::= action => ::first

:start     ::= Expression
Expression ::= Term
             | Expression (plus) Term        action => add
             | Expression (minus) Term       action => subtract
Term       ::= Factor
             | Term (asterisk) Factor        action => multiply
             | Term (slash) Factor           action => divide
Factor     ::= number
             | (lbrace) Expression (rbrace)
             | (plus) Factor
             | (minus) Factor                action => negate

number     ~ [0-9.]+
plus       ~ '+'
minus      ~ '-'
asterisk   ~ '*'
slash      ~ '/'
lbrace     ~ '('
rbrace     ~ ')'

whitespace ~ [ \t]+
:discard   ~ whitespace

__G__

sub multiply { $_[1] * $_[2] }
sub divide   { $_[1] / $_[2] }
sub add      { $_[1] + $_[2] }
sub subtract { $_[1] - $_[2] }
sub negate   { - $_[1] }

my $input   = shift;
my $grammar = 'Marpa::R2::Scanless::G'->new({ source => \$rules });
my $value   = $grammar->parse(\$input, { semantics_package => 'main' });

print $$value;
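
To see what the parser recognises before any arithmetic happens, the actions can be swapped for Marpa's array descriptor [name,values], which returns each rule as a nested array. The following is a sketch under a few assumptions: the operators are left unhidden so that they show up in the tree, Term is written left-recursively (Term * Factor) so that chains such as 2 * 3 * 4 parse too, and parse_tree is a hypothetical helper name.

```perl
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;

my $tree_rules = <<'__G__';
lexeme default = latm => 1
:default ::= action => [name,values]

:start     ::= Expression
Expression ::= Term
             | Expression plus Term
             | Expression minus Term
Term       ::= Factor
             | Term asterisk Factor
             | Term slash Factor
Factor     ::= number
             | (lbrace) Expression (rbrace)
             | plus Factor
             | minus Factor

number     ~ [0-9.]+
plus       ~ '+'
minus      ~ '-'
asterisk   ~ '*'
slash      ~ '/'
lbrace     ~ '('
rbrace     ~ ')'

whitespace ~ [ \t]+
:discard   ~ whitespace
__G__

# Hypothetical helper: returns the parse as nested arrays
# [ rule name, child, child, ... ].
sub parse_tree {
    my ($input) = @_;
    my $grammar = Marpa::R2::Scanless::G->new({ source => \$tree_rules });
    my $recce   = Marpa::R2::Scanless::R->new({ grammar => $grammar });
    $recce->read(\$input);
    my $value_ref = $recce->value;
    die "No parse for '$input'\n" unless defined $value_ref;
    return $$value_ref;
}

$Data::Dumper::Indent = 1;
print Dumper(parse_tree('7 + 2 * 3'));
```

In the dumped structure the node for 2 * 3 sits below the node for +, so the multiplication is grouped first.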

Rule precedence

Expression ::= Term
             | Expression (plus) Term        action => add
             | Expression (minus) Term       action => subtract
Term       ::= Factor
             | Term (asterisk) Factor        action => multiply
             | Term (slash) Factor           action => divide
Factor     ::= number
             | (lbrace) Expression (rbrace)
             | (plus) Factor
             | (minus) Factor                action => negate

number   ~ [0-9.]+
plus     ~ '+'
minus    ~ '-'
asterisk ~ '*'
slash    ~ '/'
lbrace   ~ '('
rbrace   ~ ')'

Rule precedence

Expression ::= (lbrace) Expression (rbrace)                          assoc => group
             | number
             | (minus) Expression                action => negate
            || Expression (asterisk) Expression  action => multiply
             | Expression (slash) Expression     action => divide
            || Expression (plus) Expression      action => add
             | Expression (minus) Expression     action => subtract




number   ~ [0-9.]+
plus     ~ '+'
minus    ~ '-'
asterisk ~ '*'
slash    ~ '/'
lbrace   ~ '('
rbrace   ~ ')'

No need for Factor and Term.
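
A quick way to check that the prioritized grammar really replaces the Term and Factor layers is to run it. The sketch below wraps the rules from this slide in a complete script; the package name Calc::Actions and the helper evaluate are assumptions made for the example, and the recognizer is built explicitly with Marpa::R2::Scanless::R rather than through parse().

```perl
use strict;
use warnings;
use Marpa::R2;

my $rules = <<'__G__';
lexeme default = latm => 1
:default ::= action => ::first

:start     ::= Expression
Expression ::= (lbrace) Expression (rbrace)      assoc => group
             | number
             | (minus) Expression                action => negate
            || Expression (asterisk) Expression  action => multiply
             | Expression (slash) Expression     action => divide
            || Expression (plus) Expression      action => add
             | Expression (minus) Expression     action => subtract

number     ~ [0-9.]+
plus       ~ '+'
minus      ~ '-'
asterisk   ~ '*'
slash      ~ '/'
lbrace     ~ '('
rbrace     ~ ')'

whitespace ~ [ \t]+
:discard   ~ whitespace
__G__

# Actions live in a package of their own (name assumed for this sketch).
sub Calc::Actions::multiply { $_[1] * $_[2] }
sub Calc::Actions::divide   { $_[1] / $_[2] }
sub Calc::Actions::add      { $_[1] + $_[2] }
sub Calc::Actions::subtract { $_[1] - $_[2] }
sub Calc::Actions::negate   { - $_[1] }

my $grammar = Marpa::R2::Scanless::G->new({ source => \$rules });

# Hypothetical helper: parse one expression and return its value.
sub evaluate {
    my ($input) = @_;
    my $recce = Marpa::R2::Scanless::R->new({
        grammar           => $grammar,
        semantics_package => 'Calc::Actions',
    });
    $recce->read(\$input);
    my $value_ref = $recce->value;
    die "No parse for '$input'\n" unless defined $value_ref;
    return $$value_ref;
}

print "$_ = ", evaluate($_), "\n" for '7 + 2 * 3', '(7 + 2) * 3', '8 / 2 / 2';
```

Alternatives separated by || bind less tightly from top to bottom, and operators are left associative unless an assoc adverb says otherwise, which is why 8 / 2 / 2 is read as (8 / 2) / 2.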

Adding exponentiation

Expression ::= (lbrace) Expression (rbrace)                          assoc => group
             | number
             | (minus) Expression                action => negate
            || Expression (caret) Expression     action => power     assoc => right
            || Expression (asterisk) Expression  action => multiply
             | Expression (slash) Expression     action => divide
            || Expression (plus) Expression      action => add
             | Expression (minus) Expression     action => subtract

number   ~ [0-9.]+
plus     ~ '+'
minus    ~ '-'
asterisk ~ '*'
slash    ~ '/'
caret    ~ '^'
lbrace   ~ '('
rbrace   ~ ')'

Right associative:
4^3^2 = 4^(3^2)

Missing action:
sub power    { $_[1] ** $_[2] }
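
Perl's own ** operator is right associative as well, so the grouping requested by assoc => right matches what a Perl programmer would expect. A small plain-Perl check (no Marpa involved, variable names chosen for the example):

```perl
use strict;
use warnings;

my $right = 4 ** (3 ** 2);   # 4 ** 9
my $left  = (4 ** 3) ** 2;   # 64 ** 2
my $bare  = 4 ** 3 ** 2;     # Perl groups this like $right

print "right: $right\n";
print "left:  $left\n";
print "bare:  $bare\n";
```

The first and last lines print the same number, while the left-grouped variant is much smaller.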

Adding exponentiation (without priorities)

:start     ::= Expression
Expression ::= Term
             | Expression (plus) Term        action => add
             | Expression (minus) Term       action => subtract
Term       ::= Exp
             | Term (asterisk) Exp           action => multiply
             | Term (slash) Exp              action => divide
Exp        ::= Factor (caret) Exp            action => power
             | Factor
Factor     ::= number
             | (lbrace) Expression (rbrace)
             | (plus) Factor
             | (minus) Factor                action => negate

number     ~ [0-9.]+
plus       ~ '+'
minus      ~ '-'
asterisk   ~ '*'
slash      ~ '/'
lbrace     ~ '('
rbrace     ~ ')'
caret      ~ '^'

Better control of the number format

number     ~ [.0-9]+

Problem: this also accepts 0..0

Alligator

Better control of the number format

number     ~ sign_maybe digit_many e
           | sign_maybe digit_any '.' digit_many e_maybe
           | sign_maybe digit_many e_maybe
           | sign_maybe non_zero digit_any

empty      ~
sign_maybe ~ [+-] | empty
digit      ~ [0-9]
non_zero   ~ [1-9]
digit_any  ~ digit*
digit_many ~ digit+
e          ~ [Ee] sign_maybe digit_many
e_maybe    ~ e | empty

Understands the following: -12E+12, .12e5
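
For readers more at home with regular expressions, the lexical rules above can be approximated by a single Perl pattern. This is a hand translation offered as a sketch, not something Marpa produces, and it tests whole strings rather than tokens inside a larger input. The variable names are chosen for the example.

```perl
use strict;
use warnings;

# Optional sign, then digits or optional digits + '.' + digits,
# then an optional exponent with its own optional sign.
my $number = qr{
    \A
    [+-]?
    (?: [0-9]+ | [0-9]* \. [0-9]+ )
    (?: [Ee] [+-]? [0-9]+ )?
    \z
}x;

for my $candidate ('-12E+12', '.12e5', '42', '0..0', '1.', 'e5') {
    printf "%-8s %s\n", $candidate,
        $candidate =~ $number ? 'number' : 'not a number';
}
```

The two examples from the slide are accepted, while 0..0, which the simple character class let through, is rejected.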

Add variables

a = 3 * 7;
b = a * 2;
print b + 1

Add variables (grammar)

:start     ::= Program
Program    ::= Statement semicolon Program       action => none
             | Statement                         action => none
Statement  ::= Assign                            action => none
             | Output                            action => none
Assign     ::= Var (eq) Expression               action => store
Output     ::= (print) List                      action => show
List       ::= Expression (comma) List           action => concat
             | Expression
Expression ::= (lbrace) Expression (rbrace)                          assoc => group
             | number
             | (minus) Expression                action => negate
             | Var                               action => interpolate
            || Expression (caret) Expression     action => power     assoc => right
            || Expression (asterisk) Expression  action => multiply
             | Expression (slash) Expression     action => divide
            || Expression (plus) Expression      action => add
             | Expression (minus) Expression     action => subtract
Var        ::= varname

varname    ~ alpha | alpha alnum
alpha      ~ [a-zA-Z]
alnum      ~ [a-zA-Z0-9]+

semicolon  ~ ';'
eq         ~ '='
comma      ~ ','
print      ~ 'print'

Add variables (new actions)

use feature qw{ say };   # show() below uses say

my %vars;
sub none        {}
sub show        { say $_[1] }
sub concat      { $_[1] . $_[2] }
sub store       { $vars{ $_[1] } = $_[2] }
sub interpolate { $vars{ $_[1] } // die "Unknown variable $_[1]" }

Add strings

r = 3 ^ 3; print "The result is: ", r

Add strings (grammar)

:start     ::= Program
Program    ::= Statement semicolon Program       action => none
             | Statement                         action => none
Statement  ::= Assign                            action => none
             | Output                            action => none
Assign     ::= Var (eq) Expression               action => store
Output     ::= (print) List                      action => show
List       ::= Expression (comma) List           action => concat
             | Expression
             | String (comma) List               action => concat
             | String

Expression ::= (lbrace) Expression (rbrace)                          assoc => group
             | number
             | (minus) Expression                action => negate
             | Var                               action => interpolate
            || Expression (caret) Expression     action => power     assoc => right
            || Expression (asterisk) Expression  action => multiply
             | Expression (slash) Expression     action => divide
            || Expression (plus) Expression      action => add
             | Expression (minus) Expression     action => subtract
String     ::= (quote) string (quote)
Var        ::= varname

semicolon  ~ ';'
eq         ~ '='
comma      ~ ','
print      ~ 'print'
string     ~ [^"]*
quote      ~ '"'

Make semicolons optional

a=1 b=2 c=a+b print c

Make semicolons optional (Ruby Slippers)

Make semicolons optional (before)

my $grammar = 'Marpa::R2::Scanless::G'->new({ source => \$rules });
my $value   = $grammar->parse(\$input, { semantics_package => 'main' });

print $$value if defined $$value;

Make semicolons optional (code)

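my $grammar = 'Marpa::R2::Scanless::G'->new({source => \$rules});
# rejection => 'event': when no lexeme is acceptable, read() and resume()
# return control to us instead of dying.
my $recce   = 'Marpa::R2::Scanless::R'->new({grammar           => $grammar,
                                             semantics_package => 'main',
                                             rejection         => 'event'});
my $last_pos = -1;
for ( $recce->read(\$input);          # start reading
      $recce->pos < length $input;    # stopped before the end of the input
      $recce->resume                  # carry on after the inserted token
) {
    # Ruby Slippers: if the parser would accept a semicolon here, provide one.
    if (grep 'semicolon' eq $_, @{ $recce->terminals_expected }) {
        # Zero-length lexeme: nothing is consumed from the input.
        $recce->lexeme_read('semicolon', $recce->pos, 0);
        $last_pos = $recce->pos;
        warn 'Semicolon inserted at ', $last_pos;
    } else {
        die "No lexeme found at ", $recce->pos;
    }
}
my $value = $recce->value;

print $$value if defined $$value;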

Me showing Marpa to people

Me using Marpa:

Conclusion

Thank you

https://e-choroba.eu/23-Marpa