Jukka Lehtosalo edited this page Mar 30, 2016 · 8 revisions

The mypy parser translates a list of tokens into an abstract syntax tree (AST). (The term parse tree is sometimes used informally as a synonym.)

You can run the parse.py module as a script to experiment with the parser (this assumes that mypy is an alias for the mypy.py script):

$ cd mypy         # mypy repo
$ cat samples/hello.py
print('Hello, world')
$ mypy parse.py samples/hello.py
MypyFile:1(
  samples/hello.py
  ExpressionStmt:1(
    CallExpr:1(
      NameExpr(print)
      Args(
        StrExpr(Hello, world)))))

The names MypyFile, ExpressionStmt, CallExpr, NameExpr and StrExpr refer to AST node classes defined in mypy/nodes.py. The numbers after colons are line numbers.
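To make the dump format concrete, here is a minimal sketch of how such a tree could be built and printed. The class names mirror the node names in the output above, but the fields, the Args helper, and the dump_node function are assumptions made for illustration; they are not the actual definitions in mypy/nodes.py.

```python
from dataclasses import dataclass
from typing import List


def dump_node(label: str, children: list, indent: int) -> str:
    """Render a node header, then its children indented by two spaces.

    The closing parenthesis is appended to the last child's line,
    which is what produces the run of ')' at the end of the dump.
    """
    pad = " " * indent
    lines = [f"{pad}{label}("]
    for child in children:
        if isinstance(child, str):
            lines.append(f"{pad}  {child}")
        else:
            lines.append(child.dump(indent + 2))
    return "\n".join(lines) + ")"


@dataclass
class NameExpr:
    name: str

    def dump(self, indent: int = 0) -> str:
        return " " * indent + f"NameExpr({self.name})"


@dataclass
class StrExpr:
    value: str

    def dump(self, indent: int = 0) -> str:
        return " " * indent + f"StrExpr({self.value})"


@dataclass
class Args:
    # Hypothetical helper that groups call arguments for printing.
    items: list

    def dump(self, indent: int = 0) -> str:
        return dump_node("Args", self.items, indent)


@dataclass
class CallExpr:
    line: int
    callee: NameExpr
    args: list

    def dump(self, indent: int = 0) -> str:
        return dump_node(f"CallExpr:{self.line}", [self.callee, Args(self.args)], indent)


@dataclass
class ExpressionStmt:
    line: int
    expr: CallExpr

    def dump(self, indent: int = 0) -> str:
        return dump_node(f"ExpressionStmt:{self.line}", [self.expr], indent)


@dataclass
class MypyFile:
    line: int
    path: str
    defs: List[ExpressionStmt]

    def dump(self, indent: int = 0) -> str:
        return dump_node(f"MypyFile:{self.line}", [self.path] + list(self.defs), indent)


# Hand-built tree for print('Hello, world') on line 1.
call = CallExpr(1, NameExpr("print"), [StrExpr("Hello, world")])
tree = MypyFile(1, "samples/hello.py", [ExpressionStmt(1, call)])
print(tree.dump())
```

Printing the tree reproduces the same nesting as the parser output shown earlier, with each composite node carrying its line number after the colon.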

The parser is hand-written and fairly straightforward. The only significant complication is ambiguity in the mypy syntax. For example, a statement of the form set<T> y, where T is a type argument, could be interpreted as either a variable declaration or an expression involving the < and > operators. The parser gives precedence to variable declarations over expressions, which works well in practice. Only the parser deals with this ambiguity; later compiler passes do not need to be aware of it.
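The "declarations first" rule can be sketched as a speculative parse: try to read a type followed by a single name, and fall back to an expression only if that fails. The snippet below is a simplified illustration, not mypy's parser; tokenize, parse_type, and classify are hypothetical names, and the grammar covers only identifiers, commas, and angle brackets.

```python
import re
from typing import List, Optional


def tokenize(src: str) -> List[str]:
    # Identifiers plus the three punctuation tokens this sketch understands.
    return re.findall(r"[A-Za-z_]\w*|[<>,]", src)


def parse_type(tokens: List[str], pos: int) -> Optional[int]:
    """Try to read `name` or `name<type, ...>` starting at pos.

    Returns the position just past the type, or None if no type matches.
    """
    if pos >= len(tokens) or not tokens[pos].isidentifier():
        return None
    pos += 1
    if pos < len(tokens) and tokens[pos] == "<":
        pos += 1
        while True:
            next_pos = parse_type(tokens, pos)
            if next_pos is None:
                return None
            pos = next_pos
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1
                continue
            break
        if pos >= len(tokens) or tokens[pos] != ">":
            return None
        pos += 1
    return pos


def classify(src: str) -> str:
    """Prefer the declaration reading; otherwise treat the input as an expression."""
    tokens = tokenize(src)
    end = parse_type(tokens, 0)
    # A declaration is a type followed by exactly one name that ends the statement.
    if end is not None and end == len(tokens) - 1 and tokens[end].isidentifier():
        return "declaration"
    return "expression"


print(classify("set<T> y"))
print(classify("a < b"))
print(classify("a < b > c"))
```

Note that `a < b > c` is classified as a declaration even though it could also be read as a chained comparison. That is the precedence rule at work: whenever the declaration reading succeeds, it wins.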

The parser performs only minimal consistency checking. As a result, it also accepts many invalid programs. The next compiler pass, SemanticAnalyzer, performs additional checks.
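The same division of labor can be observed in CPython's standard ast module, which serves here only as an analogy; it is not mypy's parser, and the checks mypy's SemanticAnalyzer performs are not shown. In this sketch, a return statement outside any function parses without complaint, and only the later compilation step rejects it.

```python
import ast

# 'return' outside a function: grammatically well-formed, but not a valid program.
source = "return 42\n"

# The parser alone accepts it and produces an AST.
module = ast.parse(source)
parsed_ok = isinstance(module.body[0], ast.Return)

# A later pass (here, CPython's compiler) is the one that rejects it.
try:
    compile(source, "<example>", "exec")
    rejected_later = False
except SyntaxError:
    rejected_later = True

print(parsed_ok, rejected_later)
```

Keeping the parser permissive in this way means each check lives in one place, in the pass that has enough context to report it properly.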