From 49c6cec2fb45c52c361f518de282f2de0ddd4fe0 Mon Sep 17 00:00:00 2001 From: Viresh Gupta Date: Sun, 6 May 2018 14:31:17 +0530 Subject: [PATCH] cEP-0018: Integration of ANTLR in coala Propose integration of ANTLR in coala. Closes https://github.com/coala/cEPs/issues/118 --- README.md | 1 + cEP-0018.md | 274 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 275 insertions(+) create mode 100644 cEP-0018.md diff --git a/README.md b/README.md index 6e7189afa..c1e2b25a8 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,7 @@ in the `cEP-0000.md` document. | [cEP-0012](cEP-0012.md) | coala's Command Line Interface | This cEP describes the design of coala's Command Line Interface (the action selection screen). | | [cEP-0013](cEP-0013.md) | Cobot Enhancement and porting | This cEP describes about the new features that are to be added to cobot as a part of the [GSoC project](https://summerofcode.withgoogle.com/projects/#4913450777051136). | | [cEP-0014](cEP-0014.md) | Generate relevant coafile sections | This cEP proposes a framework for coala-quickstart to generate more relevant `.coafile` sections based on project files and user preferences. | +| [cEP-0018](cEP-0018.md) | Integration of ANTLR into coala | This cEP describes how an API based on ANTLR will be constructed and maintained | | [cEP-0019](cEP-0019.md) | Meta-review System | This cEP describes the details of the process of implementing the meta-review system as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#5188493739819008). | | [cEP-0027](cEP-0027.md) | coala Bears Testing API | This cEP describes the implementation process of `BaseTestHelper` class and `GlobalBearTestHelper` class to improve testing API of coala as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#6625036551585792). | | [cEP-0021](cEP-0021.md) | Profile Bears | This cEP describes the detailed implementation of a profiling interface as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#6109762077327360). | diff --git a/cEP-0018.md b/cEP-0018.md new file mode 100644 index 000000000..e8e500277 --- /dev/null +++ b/cEP-0018.md @@ -0,0 +1,274 @@ +# Integration of ANTLR into coala core + +| Metadata | | +| -------- | --------------------------------------------- | +| cEP | 18 | +| Version | 1.0 | +| Title | Integration of ANTLR into coala core | +| Authors | Viresh Gupta | +| Status | Proposed | +| Type | Feature | + +## Abstract + +This document describes how an API based on ANTLR will be constructed and +maintained. + +## Introduction + +ANTLR provides parsers for various language grammars and thus we can provide +an interface from within coala and make it available to bear writers so that +they can write advanced linting bears. This will aim at supporting the more +flexible visitor based method of AST traversal (as opposed to listener based +mechanisms). + +The proposal is to introduce a parallel concept to the coala-bears library, +which will be called `coala-antlr` hereon. + +## Proposed Change + +Here is the detailed implementation stepwise: + +1. There will be a separate repository named as `coala-antlr` which will + be installable via `pip install coala-antlr`. +2. A new package would be introduced in the coala-antlr repository, + which would maintain all the dependencies for `antlr-ast` based + bears. This would be the `coantlib` package +3. The `coantlib` package will provide endpoints relevant to the + visitor model of traversing the ast generated via ANTLR. +4. Another package named `coantparsers` will be created inside + `coala-antlr` repository. This package will hold parsers and lexers + generated for various languages generated (with python target) beforehand + for the supported set of grammars. +5. `coantlib` will have an `ASTLoader` class that will be responsible + for loading the AST of a given file depending on the user input, using file + extension as a fallback. + For e.g, a .c file will be loaded using the parser from + `coantlib` for c language if no language is specified. +6. Another `ASTWalker` class will be responsible for providing a single method + to resolve the language supplied by the bear and create a walker instance + of the appropriate type. +7. Also `coantlib.walkers` will have several walker classes derived from + `ASTWalker` class that will be language specific and grammar dependant. + For e.g a `Py3ASTWalker`. +8. The bears dependant on the `coantlib` will reside in the same repository as + the `coantlib`. All of them will be in a different package, the `coantbears` + package, all of them installable together via a single option in the + setup.py along with `coantlib`. +9. The bears will be able to use a high level API by defining the `LANGUAGE` + instance variable, which will return to them a walker with high level API's + in case the language is supported. (for e.g if `LANGUAGE='Python'`, they + will automatically get a `BasePyASTWalker` instance in the walker class + variable) + +## Management of the new `coala-antlr` repository + +Managing a new repository is a heavy task, and this will be highly automated as +follows: + +1. The parsers in `coala-antlr/coantparsers` would be generated and + pushed via CI builds whenever a new commit is pushed. +2. The cib tool can be enhanced to deal with the installation of bears that + require only some specified parsers (for e.g a `PyUnusedVarBear` would + only require parser for python). +3. There will be nightly cron jobs for keeping parsers up-to-date. +4. For adding support for a new language, a new Walker class deriving + `ASTWalker` would be added and the rest will be automatically taken care by + the library. + +## Code Samples/Prototypes + +Here is a prototype for the implementations within `coantlib`: + +```python +import antlr4 +import coantparsers +import inspector + +from antlr4 import InputStream +from coalib.bearlib.languages.definitions import * +from coalib.bearlib.languages.Language import parse_lang_str, Languages + + +def resolve_with_package_info(lang): + """ + Use the inspector methods to search amongst all + classes that inherit ASTWalker within this module + and return that class whose supported language matches + """ + + +class ASTLoader(object): + + mapping = { + C: [coantparsers.clexer, coantparsers.cparser], + Python: [coantparsers.pylexer, coantparsers.pyparser], + JavaScript: [coantparsers.jslexer, coantparsers.jsparser], + ... + } + + @staticmethod + def load_file(file, filename, lang='auto'): + """ + Loads file's AST into memory + """ + if(lang == 'auto'): + ext = get_extension(filename) + else: + ext = Languages(parse_lang_str(lang)[0])[0] + if ext is None: + raise FileExtensionNotSupported('Unknown File Type') + inputStream = InputStream(''.join(file)) + lexer = mapping[ext][0](inputStream) + parser = mapping[ext][1](lexer) + return parser + + +class ASTWalker(object): + treeRoot = None + tree = None + + @staticmethod + def make_walker(): + """ + Resolves walker object for appropriate type + """ + if lang == 'auto': + return ASTWalker + else: + return resolve_with_package_info(lang) + + @staticmethod + def get_walker(file, filename, lang): + """ + Returns a walker to the required language + """ + ret_class = make_walker(lang) + if not ret_class: + raise Exception('Un-supported Language') + else: + return ret_class(file, filename) + + def __init__(file, filename): + self.tree = ASTLoader.load_file(file, filename) + self.treeRoot = self.tree + ... +``` + +### Prototype of `Py3ASTWalker` + +These kinds of Walkers will supply a high level walker API + +```python +"""Prototype of Python ASTWalkers.""" + + +class BasePyASTWalker(ASTWalker): + """Base class for Python Walkers. This will be an abstract class.""" + + LANGUAGES = {'Python'} + + def get_imports(): + """Return list of strings containing import statements.""" + + def get_methods(): + """Return list of strings containing all methods from the file.""" + ... + + +class Py2ASTWalker(BasePyASTWalker): + """Python 2 specific walker.""" + + LANGUAGES = {'Python 2'} + + def get_print_statements(): + """Return list of print statements, since print is a keyword in py2.""" + ... + + +class Py3ASTWalker(BasePyASTWalker): + """Python 3 specific walker.""" + + LANGUAGES = {'Python3'} + + def get_integer_div_stmts(): + """ + Return a list of integer divide statements. + + since / and // are different in py3, this is py3 specific + """ + ... +``` + +### Prototype of `ASTBear` class implementation + +```python +"""Prototype of ASTBear class.""" + +from coantlib import ASTWalker +from coalib.bears.LocalBear import Bear + + +class ASTBear(LocalBear): + """ + ASTBear class. + + This class is similar to Local Bear except for the added capability of + initialising walker for use by the instance of this bear. + """ + + walker = None + + def initialise(file, filename): + """Initialise self.walker according to the file and language.""" + if LANGUAGES: + self.walker = ASTWalker.get_walker(file, + filename, + self.LANGUAGES[0]) + else: + self.walker = ASTWalker.get_walker(file, filename, 'auto') + + def run(self, + filename, + file, + tree, + *args, + dependency_results=None, + **kwargs + ): + """Actual run method - to be implemented by bear.""" + raise NotImplementedError('Needs to be done by bear') +``` + +### A test bear + +```python +"""A sample Bear.""" + +from coantlib.ASTBear import ASTBear +from coalib.results.Result import Result + + +class TestBear(ASTBear): + """A test bear that uses walker from coantlib for linting.""" + + def run(self, + filename, + file, + tree, + *args, + dependency_results=None, + **kwargs + ): + """Essence of logic for bear.""" + self.initialise(file, filename) + violations = {} + method_list = self.walker.get_methods() + for method in method_list: + """ + logic for checking naming convention + and yielding an apt result + wherever required + """ + ... +```