%{
open Basic_types ;;

let phrase_of_cmd c =
  match c with
    "RUN" -> Run
  | "LIST" -> List
  | "END" -> End
  | _ -> failwith "line : unexpected command"
;;

let bin_op_of_rel r =
  match r with
    "=" -> EQUAL
  | "<" -> INF
  | "<=" -> INFEQ
  | ">" -> SUP
  | ">=" -> SUPEQ
  | "<>" -> DIFF
  | _ -> failwith "line : unexpected relation symbol"
;;
%}
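As a quick check of the two helper functions of the header, the following sketch copies them into a standalone OCaml file. The two type declarations are assumptions standing in for Basic_types, which is not shown in this section; only the constructors needed here are declared, and the command "QUIT" is a hypothetical invalid input.

```ocaml
(* Assumed stand-ins for the declarations of Basic_types (not shown here). *)
type bin_op = EQUAL | INF | INFEQ | SUP | SUPEQ | DIFF
type phrase = Run | List | End

(* Maps the string carried by an Lcmd token to a phrase. *)
let phrase_of_cmd c =
  match c with
    "RUN" -> Run
  | "LIST" -> List
  | "END" -> End
  | _ -> failwith "line : unexpected command"

(* Maps the string carried by an Lrel token to a binary operator. *)
let bin_op_of_rel r =
  match r with
    "=" -> EQUAL
  | "<" -> INF
  | "<=" -> INFEQ
  | ">" -> SUP
  | ">=" -> SUPEQ
  | "<>" -> DIFF
  | _ -> failwith "line : unexpected relation symbol"

let () =
  assert (phrase_of_cmd "LIST" = List);
  assert (bin_op_of_rel "<=" = INFEQ);
  (* Any other string raises Failure. *)
  assert (try ignore (phrase_of_cmd "QUIT"); false with Failure _ -> true)
```

Both functions are partial by design: the lexer is expected to pass them only the strings it recognizes, so the final case signals an internal error rather than a user error.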
The names of the lexical units are self-explanatory, and they are described in file basic_lexer.mll (see page ??). The lexical units are the following:

%token <int> Lint
%token <string> Lident
%token <string> Lstring
%token <string> Lcmd
%token Lplus Lminus Lmult Ldiv Lmod
%token <string> Lrel
%token Land Lor Lneg
%token Lpar Rpar
%token <string> Lrem
%token Llet Lprint Linput Lif Lthen Lgoto
%token Lequal
%token Leol

The precedence rules between operators again follow the values assigned by the functions priority_uop and priority_binop, defined when the grammar of our Basic was first given (see page ??).
The symbol Lop will be used to process unary minus. It is not a terminal of the grammar, but a "pseudo non-terminal": a symbol that serves only to carry a precedence level, and so allows an operator to be overloaded when two of its uses should not receive the same precedence. This is the case with the minus symbol (-). We will come back to this point once the grammar rules have been given.

%right Lneg
%left Land Lor
%left Lequal Lrel
%left Lmod
%left Lplus Lminus
%left Lmult Ldiv
%nonassoc Lop

Since the start symbol is line, the function generated will return the syntax tree for the parsed line.

%start line
%type <Basic_types.phrase> line

The grammar rules follow. They do not call for particular remarks, except for the rule that handles unary minus, which is discussed after the grammar.

%%
line :
    Lint inst Leol              { Line {num=$1; inst=$2} }
  | Lcmd Leol                   { phrase_of_cmd $1 }
  ;

inst :
    Lrem                        { Rem $1 }
  | Lgoto Lint                  { Goto $2 }
  | Lprint exp                  { Print $2 }
  | Linput Lident               { Input $2 }
  | Lif exp Lthen Lint          { If ($2, $4) }
  | Llet Lident Lequal exp      { Let ($2, $4) }
  ;

exp :
    Lint                        { ExpInt $1 }
  | Lident                      { ExpVar $1 }
  | Lstring                     { ExpStr $1 }
  | Lneg exp                    { ExpUnr (NOT, $2) }
  | exp Lplus exp               { ExpBin ($1, PLUS, $3) }
  | exp Lminus exp              { ExpBin ($1, MINUS, $3) }
  | exp Lmult exp               { ExpBin ($1, MULT, $3) }
  | exp Ldiv exp                { ExpBin ($1, DIV, $3) }
  | exp Lmod exp                { ExpBin ($1, MOD, $3) }
  | exp Lequal exp              { ExpBin ($1, EQUAL, $3) }
  | exp Lrel exp                { ExpBin ($1, (bin_op_of_rel $2), $3) }
  | exp Land exp                { ExpBin ($1, AND, $3) }
  | exp Lor exp                 { ExpBin ($1, OR, $3) }
  | Lminus exp %prec Lop        { ExpUnr(OPPOSITE, $2) }
  | Lpar exp Rpar               { $2 }
  ;
%%

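To make the semantic actions concrete, here is a sketch of the values the rules would build for two sample lines. The type declarations are assumptions inferred from the constructors used in the actions (Basic_types itself is not shown in this section, and its real declarations may differ); the two Basic lines are hypothetical inputs.

```ocaml
(* Assumed shape of Basic_types, inferred from the semantic actions. *)
type unr_op = OPPOSITE | NOT
type bin_op = PLUS | MINUS | MULT | DIV | MOD
            | EQUAL | INF | INFEQ | SUP | SUPEQ | DIFF | AND | OR
type expression =
    ExpInt of int
  | ExpVar of string
  | ExpStr of string
  | ExpUnr of unr_op * expression
  | ExpBin of expression * bin_op * expression
type command =
    Rem of string
  | Goto of int
  | Print of expression
  | Input of string
  | If of expression * int
  | Let of string * expression
type line = { num : int; inst : command }
type phrase = Line of line | List | Run | End

(* 10 LET X = 1 + 2 * 3
   Lmult is declared after Lplus, so it binds tighter. *)
let l10 =
  Line { num = 10;
         inst = Let ("X", ExpBin (ExpInt 1, PLUS,
                                  ExpBin (ExpInt 2, MULT, ExpInt 3))) }

(* 20 IF X < 10 THEN 10
   Lrel carries the string "<", which bin_op_of_rel maps to INF. *)
let l20 =
  Line { num = 20;
         inst = If (ExpBin (ExpVar "X", INF, ExpInt 10), 10) }
```

The nesting of ExpBin in the first value is decided entirely by the precedence declarations, since the exp rules themselves are ambiguous.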
The rule that calls for a remark is:

  exp : ... | Lminus exp %prec Lop { ExpUnr(OPPOSITE, $2) }

It concerns the use of unary -. The keyword %prec that appears in it declares that this rule should receive the precedence of Lop (here, the highest precedence).
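The effect of %prec Lop can be illustrated by writing the trees out by hand. The sketch below uses a cut-down, assumed version of the expression types and compares two readings of the hypothetical input - 2 * 3. It assumes the usual yacc default, under which a rule without %prec takes the precedence of its rightmost terminal (here Lminus, which is lower than Lmult).

```ocaml
(* Cut-down, assumed versions of the expression types; only what the
   example needs. *)
type unr_op = OPPOSITE
type bin_op = MULT
type expression =
    ExpInt of int
  | ExpUnr of unr_op * expression
  | ExpBin of expression * bin_op * expression

(* "- 2 * 3" with %prec Lop: the rule has the highest precedence, so the
   parser reduces "- 2" before looking at "*". *)
let with_prec =
  ExpBin (ExpUnr (OPPOSITE, ExpInt 2), MULT, ExpInt 3)

(* The reading expected without %prec: the rule would have the precedence
   of Lminus, lower than Lmult, so the parser would shift "*" first. *)
let without_prec =
  ExpUnr (OPPOSITE, ExpBin (ExpInt 2, MULT, ExpInt 3))

let () =
  (* The two readings are different syntax trees. *)
  assert (with_prec <> without_prec)
```

For multiplication both trees happen to denote the same number; the point is that the declaration fixes which tree is built, so that unary minus always applies to the smallest expression that follows it.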
The lexer is described in file basic_lexer.mll:

{
open Basic_parser ;;

let string_chars s = String.sub s 1 ((String.length s) - 2) ;;
}

rule lexer = parse
    [' ' '\t']          { lexer lexbuf }
  | '\n'                { Leol }
  | '!'                 { Lneg }
  | '&'                 { Land }
  | '|'                 { Lor }
  | '='                 { Lequal }
  | '%'                 { Lmod }
  | '+'                 { Lplus }
  | '-'                 { Lminus }
  | '*'                 { Lmult }
  | '/'                 { Ldiv }
  | ['<' '>']           { Lrel (Lexing.lexeme lexbuf) }
  | "<="                { Lrel (Lexing.lexeme lexbuf) }
  | ">="                { Lrel (Lexing.lexeme lexbuf) }
  | "REM" [^ '\n']*     { Lrem (Lexing.lexeme lexbuf) }
  | "LET"               { Llet }
  | "PRINT"             { Lprint }
  | "INPUT"             { Linput }
  | "IF"                { Lif }
  | "THEN"              { Lthen }
  | "GOTO"              { Lgoto }
  | "RUN"               { Lcmd (Lexing.lexeme lexbuf) }
  | "LIST"              { Lcmd (Lexing.lexeme lexbuf) }
  | "END"               { Lcmd (Lexing.lexeme lexbuf) }
  | ['0'-'9']+          { Lint (int_of_string (Lexing.lexeme lexbuf)) }
  | ['A'-'z']+          { Lident (Lexing.lexeme lexbuf) }
  | '"' [^ '"']* '"'    { Lstring (string_chars (Lexing.lexeme lexbuf)) }

Two of these regular expressions deserve a remark. The first concerns comment lines ("REM" [^ '\n']*). This rule recognizes the keyword REM followed by an arbitrary number of characters other than '\n'. The second remark concerns character strings ('"' [^ '"']* '"'), considered as sequences of characters different from " and contained between two ".

The files are compiled with the following sequence of commands:

  ocamlc -c basic_types.mli
  ocamlyacc basic_parser.mly
  ocamllex basic_lexer.mll
  ocamlc -c basic_parser.mli
  ocamlc -c basic_lexer.ml
  ocamlc -c basic_parser.ml

These commands generate the files basic_lexer.cmo and basic_parser.cmo, which may be linked into an application.
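The helper string_chars from the lexer header is small enough to try on its own. The sketch below repeats its definition and applies it to two lexemes as the string rule would deliver them, that is, with their enclosing double quotes; the sample strings are hypothetical.

```ocaml
(* Same definition as in the lexer header: drop the first and the last
   character of the lexeme, i.e. the enclosing double quotes. *)
let string_chars s = String.sub s 1 ((String.length s) - 2)

let () =
  (* The lexeme of the Basic string "HELLO" is 7 characters long. *)
  assert (string_chars "\"HELLO\"" = "HELLO");
  (* The empty Basic string "" yields the empty OCaml string. *)
  assert (string_chars "\"\"" = "")
```

The function relies on the regular expression guaranteeing at least two characters; on a shorter string, String.sub would raise Invalid_argument.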
In the application, the expression

  match parse (input_line stdin) with

is replaced with

  match line lexer (Lexing.from_string ((input_line stdin) ^ "\n")) with

Note that we must put back at the end of the line the character '\n', which function input_line had filtered out. This is necessary because the '\n' character indicates the end of a command line (Leol).