Preprocessor for ULP/RTC macros #43

Merged
merged 29 commits into from
Aug 9, 2021
Commits
79db90f
add units test for the .set directive
wnienhaus Jul 22, 2021
84d734d
add support for left aligned assembler directives (e.g. .set)
wnienhaus Jul 22, 2021
ec81ecc
fix a crash bug where BSS size calculation was attempted on the value…
wnienhaus Jul 22, 2021
c184924
raise error when attempting to store values in .bss section
wnienhaus Jul 29, 2021
25d34b0
fix reference to non-existing variable
wnienhaus Jul 22, 2021
76a81ac
fix typo in comment of instruction definition
wnienhaus Jul 22, 2021
56f4530
add support for the .global directive. only symbols flagged as global…
wnienhaus Jul 22, 2021
9907b10
let SymbolTable.export() optionally export non-global symbols too
wnienhaus Jul 22, 2021
27ab850
support ULP opcodes in upper case
wnienhaus Jul 22, 2021
54b117e
add a compatibility test for the recent fixes and improvements
wnienhaus Jul 22, 2021
feb42dc
add support for evaluating expressions
wnienhaus Jul 22, 2021
87507c9
add a compatibility test for evaluating expressions
wnienhaus Jul 23, 2021
99352a3
docs: add that expressions are now supported
wnienhaus Jul 29, 2021
d76fd26
add preprocessor that can replace simple #define values in code
wnienhaus Jul 23, 2021
4dded94
allow assembler to skip comment removal to avoid removing comments twice
wnienhaus Aug 7, 2021
219f939
fix evaluation of expressions during first assembler pass
wnienhaus Jul 25, 2021
5c3eeb8
remove no-longer-needed pass dependent code from SymbolTable
wnienhaus Jul 26, 2021
3e8c0d5
add support for macros such as WRITE_RTC_REG
wnienhaus Jul 26, 2021
ac1de99
add simple include file processing
wnienhaus Jul 26, 2021
8d88fd1
add support for using a btree database (DefinesDB) to store defines f…
wnienhaus Jul 27, 2021
46f1442
add special handling for the BIT macro used in the esp-idf framework
wnienhaus Jul 27, 2021
2f6ee78
add include processor tool for populating a defines.db from include f…
wnienhaus Jul 28, 2021
69ae946
add compatibility tests using good example code off the net
wnienhaus Jul 28, 2021
4f90f76
add documentation for the preprocessor
wnienhaus Jul 29, 2021
d44384f
fix use of treg field in i_move instruction to match binutils-esp32 o…
wnienhaus Jul 28, 2021
254adf9
allow specifying the address for reg_rd and reg_wr in 32-bit words
wnienhaus Jul 28, 2021
c3bd101
support .int data type
wnienhaus Jul 29, 2021
2a0a39a
refactor: small improvements based on PR comments.
wnienhaus Aug 9, 2021
47d5e8a
Updated LICENSE file and added AUTHORS file
wnienhaus Aug 9, 2021
10 changes: 9 additions & 1 deletion .github/workflows/run_tests.yaml
@@ -70,5 +70,13 @@ jobs:
export PATH=$PATH:${{ steps.build_micropython.outputs.bin_dir }}
export PATH=$PATH:${{ steps.build_binutils.outputs.bin_dir }}
cd tests
ln -s ../binutils-esp32ulp # already cloned earlier. reuse.
./01_compat_tests.sh

- name: Run compat tests with RTC macros
id: compat_rtc_tests
run: |
export PATH=$PATH:${{ steps.build_micropython.outputs.bin_dir }}
export PATH=$PATH:${{ steps.build_binutils.outputs.bin_dir }}
cd tests
ln -s ../binutils-esp32ulp # already cloned earlier. reuse.
./02_compat_rtc_tests.sh
8 changes: 8 additions & 0 deletions AUTHORS
@@ -0,0 +1,8 @@
E-mail addresses listed here are not intended for support.

py-esp32-ulp authors
--------------------
py-esp32-ulp is written and maintained by Thomas Waldmann and various contributors:

- Thomas Waldmann <[email protected]>
- Wilko Nienhaus <[email protected]>
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
The MIT License (MIT)

Copyright (c) 2018 Thomas Waldmann
Copyright 2018-2021 by the py-esp32-ulp authors, see AUTHORS file

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
9 changes: 9 additions & 0 deletions README.rst
@@ -17,9 +17,18 @@ Status

The most commonly used simple stuff should work.

Expressions in assembly source code are supported and get evaluated during
assembling. Only expressions evaluating to a single integer are supported.
Constants defined with ``.set`` are supported in expressions.
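As a rough illustration of what "evaluating to a single integer" with ``.set`` constants can look like, here is a minimal sketch. It is not the project's implementation: ``eval_int_expr`` and the ``constants`` dict are hypothetical names, ``/`` is assumed to mean integer division, and the sketch relies on CPython's ``ast`` module, which may not be available on MicroPython.

```python
import ast
import operator

# Hypothetical helper, not part of py-esp32-ulp: evaluate an integer
# expression, resolving names from a dict of constants.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.floordiv,  # assumption: treat / as integer division
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.BitAnd: operator.and_,
    ast.BitOr: operator.or_,
}


def eval_int_expr(expr, constants):
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return constants[node.id]  # KeyError for unknown symbols
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError('unsupported expression: %r' % expr)
    return walk(ast.parse(expr, mode='eval').body)


# 'base' stands in for a constant defined earlier with ".set base, 0x10"
constants = {'base': 0x10}
value = eval_int_expr('(base + 2) << 1', constants)  # value == 36
```

Walking the syntax tree rather than calling ``eval`` keeps the set of accepted operators explicit and rejects anything that is not plain integer arithmetic.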

We have some unit tests and also compatibility tests that check whether the
output is identical to the binutils-esp32ulp output.

There is a simple preprocessor that understands just enough to allow assembling
ULP source files containing convenience macros such as WRITE_RTC_REG. The
preprocessor and how to use it is documented here:
`Preprocessor support <docs/preprocess.rst>`_.

There might be some stuff missing, some bugs and other symptoms of alpha
software. Also, error and exception handling is rather rough yet.

138 changes: 138 additions & 0 deletions docs/preprocess.rst
@@ -0,0 +1,138 @@
Preprocessor
---------------------

py-esp32-ulp contains a small preprocessor, which aims to fulfill one goal:
facilitate assembling of ULP code from Espressif and other open-source
projects to loadable/executable machine code without modification.

Such code uses convenience macros (``READ_RTC_*`` and ``WRITE_RTC_*``)
provided by the ESP-IDF framework, along with constants defined in the
framework's include files (such as ``RTC_GPIO_IN_REG``), to make reading
and writing from/to peripheral registers much easier.

In order to do this the preprocessor has two capabilities:

1. Parse and replace identifiers defined with ``#define``
2. Recognise the ``WRITE_RTC_*`` and ``READ_RTC_*`` macros and expand
them in a way that mirrors what the real ESP-IDF macros do.
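The first capability can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the project's ``Preprocessor`` class: ``expand_defines`` is a hypothetical name, only simple ``#define NAME value`` lines are recognised, and replaced values are not expanded again.

```python
import re

# Hypothetical sketch, not the project's Preprocessor class.
_DEFINE = re.compile(r'^\s*#define\s+(\w+)\s+(.+?)\s*$')
_IDENT = re.compile(r'\b[A-Za-z_]\w*\b')


def expand_defines(src):
    defines = {}
    out = []
    for line in src.splitlines():
        m = _DEFINE.match(line)
        if m:
            # remember the define; the directive itself produces no output
            defines[m.group(1)] = m.group(2)
            continue
        # replace every identifier that has a known define, keep the rest
        out.append(_IDENT.sub(
            lambda t: defines.get(t.group(0), t.group(0)), line))
    return '\n'.join(out)


src = "#define LED_PIN 2\n  move r0, LED_PIN"
result = expand_defines(src)  # result == "  move r0, 2"
```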


Usage
------------------------

Normally the assembler is called as follows:

.. code-block:: python

src = "..full assembler file contents"
assembler = Assembler()
assembler.assemble(src)
...

With the preprocessor, simply pass the source code through the preprocessor first:

.. code-block:: python

from preprocess import preprocess

src = "..full assembler file contents"
src = preprocess(src)
assembler = Assembler()
assembler.assemble(src)
...


Using a "Defines Database"
--------------------------

Because the py-esp32-ulp assembler was built for running on the ESP32
microcontroller with limited RAM, the preprocessor aims to work there too.

To handle a large number of defined constants (such as the ``RTC_*`` constants from
the ESP-IDF) the preprocessor can use a database (based on BerkeleyDB) stored on the
device's filesystem for looking up defines.

The database needs to be populated before preprocessing. (Usually, when only using
constants from the ESP-IDF, this is a one-time step, because the include files
don't change.) The database can be reused for all subsequent preprocessor runs.

(The database can also be generated on a PC and then deployed to the ESP32, to
save processing effort on the device. In that case the include files themselves
are not needed on the device either.)

1. Build the defines database

The ``esp32_ulp.parse_to_db`` tool can be used to generate the defines
database from include files. The resulting file will be called
``defines.db``.

(The following assumes running on a PC. To do this on the device, refer to the
`esp32_ulp/parse_to_db.py <../esp32_ulp/parse_to_db.py>`_ file.)

.. code-block:: bash

# general command
micropython -m esp32_ulp.parse_to_db path/to/include.h

# loading specific ESP-IDF include files
micropython -m esp32_ulp.parse_to_db esp-idf/components/soc/esp32/include/soc/soc_ulp.h

# loading multiple files at once
micropython -m esp32_ulp.parse_to_db esp-idf/components/soc/esp32/include/soc/*.h

# if file system space is not a concern, the following can be convenient
# by including all relevant include files from the ESP-IDF framework.
# This results in an approximately 2MB large database.
micropython -m esp32_ulp.parse_to_db \
esp-idf/components/soc/esp32/include/soc/*.h \
esp-idf/components/esp_common/include/*.h

# most ULP code uses only 5 include files. Parsing only those into the
# database should thus allow assembling virtually all ULP code one would
# find or want to write.
# This results in an approximately 250kB large database.
micropython -m esp32_ulp.parse_to_db \
esp-idf/components/soc/esp32/include/soc/{soc,soc_ulp,rtc_cntl_reg,rtc_io_reg,sens_reg}.h

2. Using the defines database during preprocessing

When using the ``preprocess.preprocess`` convenience function, the
preprocessor automatically uses a defines database, even if the database
does not exist. An absent database is treated like an empty one, and no
empty database file is created when none is needed, so the filesystem is
not cluttered.

If you do not want the preprocessor to use a DefinesDB, pass ``False`` to
the ``use_defines_db`` argument of the ``preprocess`` convenience function,
or instantiate the ``Preprocessor`` class directly, without passing it a
DefinesDB instance via ``use_db``.
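The two steps above can be sketched with a plain ``dict`` standing in for the on-device database. Everything here is hypothetical (``parse_defines``, ``lookup``, the ``EXAMPLE_*`` names), and the lookup order (defines from the source file first, then the database) is an assumption rather than something this document states.

```python
import re

# Hypothetical sketch: a dict stands in for the key/value database file.
_DEFINE = re.compile(r'^\s*#define\s+(\w+)\s+(.+?)\s*$')


def parse_defines(header_text, db):
    """Store every simple '#define NAME value' from header_text in db."""
    for line in header_text.splitlines():
        m = _DEFINE.match(line)
        if m:
            db[m.group(1)] = m.group(2)
    return db


def lookup(name, local_defines, db):
    """Assumed order: defines from the source file win over the database."""
    if name in local_defines:
        return local_defines[name]
    return db.get(name)  # None when the identifier is unknown


header = "#define EXAMPLE_BASE 0x1000\n#define FLAG_ONLY\n"
db = parse_defines(header, {})  # value-less defines such as FLAG_ONLY are skipped
```

Values are kept as strings, which matches the idea of a key/value store of defines that is only evaluated once an identifier is actually used.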

Design choices
--------------

The preprocessor does not support:

1. Function style macros such as :code:`#define f(a,b) (a+b)`

This is not important, because there are only a few RTC macros that need
to be supported and they are simply implemented as Python functions.

Since the preprocessor will understand ``#define`` directives directly in the
assembler source file, include mechanisms are not needed in some cases
(simply copying the needed ``#define`` statements from include files into the
assembler source will work).

2. ``#include`` directives

The preprocessor does not currently follow ``#include`` directives. To
limit space requirements (both in memory and on the filesystem), the
preprocessor relies on a database of defines (key/value pairs). This
database should be populated before using the preprocessor, by using the
``esp32_ulp.parse_to_db`` tool (see section above), which parses include
files for identifiers defined therein.

3. Preserving comments

The assumption is that the output will almost always go into the
assembler directly, so preserving comments is not very useful and
would add a lot of complexity.
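To illustrate the first design choice, a function-style macro can be handled by an ordinary Python function instead of a general macro engine. The sketch below is hypothetical: ``expand_write_rtc_reg`` is not the project's function, and the shape of the expansion (a ``reg_wr`` with high bit, low bit and an 8-bit value) is an assumption modelled on the ESP-IDF macro that should be checked against ``soc_ulp.h``. Splitting on commas also means nested calls with commas in their arguments are not handled.

```python
import re

# Hypothetical sketch: one macro, implemented as one Python function.
_CALL = re.compile(r'^\s*WRITE_RTC_REG\s*\((.*)\)\s*$')


def expand_write_rtc_reg(line):
    m = _CALL.match(line)
    if not m:
        return line  # not a WRITE_RTC_REG call, leave untouched
    reg, low_bit, bit_width, value = [a.strip() for a in m.group(1).split(',')]
    # assumed expansion: reg_wr addr, high_bit, low_bit, value
    return '\treg_wr %s, (%s) + (%s) - 1, %s, (%s) & 0xFF' % (
        reg, low_bit, bit_width, low_bit, value)


expanded = expand_write_rtc_reg('WRITE_RTC_REG(EXAMPLE_REG, 14, 1, 1)')
```

The resulting line still contains expressions and identifiers, so it relies on the ``#define`` replacement and on the assembler's expression evaluation to become a final instruction.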
4 changes: 3 additions & 1 deletion esp32_ulp/__main__.py
@@ -2,14 +2,16 @@

from .util import garbage_collect

from .preprocess import preprocess
from .assemble import Assembler
from .link import make_binary
garbage_collect('after import')


def src_to_binary(src):
assembler = Assembler()
assembler.assemble(src)
src = preprocess(src)
assembler.assemble(src, remove_comments=False) # comments already removed by preprocessor
garbage_collect('before symbols export')
addrs_syms = assembler.symbols.export()
for addr, sym in addrs_syms:
76 changes: 41 additions & 35 deletions esp32_ulp/assemble.py
@@ -3,7 +3,7 @@
"""

from . import opcodes
from .nocomment import remove_comments
from .nocomment import remove_comments as do_remove_comments
from .util import garbage_collect

TEXT, DATA, BSS = 'text', 'data', 'bss'
@@ -12,13 +12,10 @@


class SymbolTable:
def __init__(self, symbols, bases):
def __init__(self, symbols, bases, globals):
self._symbols = symbols
self._bases = bases
self._pass = None

def set_pass(self, _pass):
self._pass = _pass
self._globals = globals

def set_bases(self, bases):
self._bases = bases
@@ -32,38 +29,28 @@ def get_from(self):
def set_sym(self, symbol, stype, section, value):
entry = (stype, section, value)
if symbol in self._symbols and entry != self._symbols[symbol]:
raise Exception('redefining symbol %s with different value %r -> %r.' % (label, self._symbols[symbol], entry))
raise Exception('redefining symbol %s with different value %r -> %r.' % (symbol, self._symbols[symbol], entry))
self._symbols[symbol] = entry

def has_sym(self, symbol):
return symbol in self._symbols

def get_sym(self, symbol):
try:
entry = self._symbols[symbol]
except KeyError:
if self._pass == 1:
entry = (REL, TEXT, 0) # for a dummy, this is good enough
else:
raise
entry = self._symbols[symbol]
return entry

def dump(self):
for symbol, entry in self._symbols.items():
print(symbol, entry)

def export(self):
addrs_syms = [(self.resolve_absolute(entry), symbol) for symbol, entry in self._symbols.items()]
def export(self, incl_non_globals=False):
addrs_syms = [(self.resolve_absolute(entry), symbol)
for symbol, entry in self._symbols.items()
if incl_non_globals or symbol in self._globals]
return sorted(addrs_syms)

def to_abs_addr(self, section, offset):
try:
base = self._bases[section]
except KeyError:
if self._pass == 1:
base = 0 # for a dummy this is good enough
else:
raise
base = self._bases[section]
return base + offset

def resolve_absolute(self, symbol):
@@ -93,16 +80,19 @@ def resolve_relative(self, symbol):
from_addr = self.to_abs_addr(self._from_section, self._from_offset)
return sym_addr - from_addr

def set_global(self, symbol):
self._globals[symbol] = True
pass


class Assembler:

def __init__(self, symbols=None, bases=None):
self.symbols = SymbolTable(symbols or {}, bases or {})
def __init__(self, symbols=None, bases=None, globals=None):
self.symbols = SymbolTable(symbols or {}, bases or {}, globals or {})
opcodes.symbols = self.symbols # XXX dirty hack

def init(self, a_pass):
self.a_pass = a_pass
self.symbols.set_pass(a_pass)
self.sections = dict(text=[], data=[])
self.offsets = dict(text=0, data=0, bss=0)
self.section = TEXT
@@ -118,7 +108,7 @@ def parse_line(self, line):
"""
if not line:
return
has_label = line[0] not in '\t '
has_label = line[0] not in '\t .'
if has_label:
label_line = line.split(None, 1)
if len(label_line) == 2:
@@ -150,8 +140,10 @@ def append_section(self, value, expected_section=None):
if expected_section is not None and s is not expected_section:
raise TypeError('only allowed in %s section' % expected_section)
if s is BSS:
# just increase BSS size by value
self.offsets[s] += value
if int.from_bytes(value, 'little') != 0:
raise ValueError('attempt to store non-zero value in section .bss')
# just increase BSS size by length of value
self.offsets[s] += len(value)
else:
self.sections[s].append(value)
self.offsets[s] += len(value)
@@ -231,9 +223,12 @@ def d_align(self, align=4, fill=None):
self.fill(self.section, amount, fill)

def d_set(self, symbol, expr):
value = int(expr) # TODO: support more than just integers
value = int(opcodes.eval_arg(expr))
self.symbols.set_sym(symbol, ABS, None, value)

def d_global(self, symbol):
self.symbols.set_global(symbol)

def append_data(self, wordlen, args):
data = [int(arg).to_bytes(wordlen, 'little') for arg in args]
self.append_section(b''.join(data))
@@ -245,6 +240,11 @@ def d_word(self, *args):
self.append_data(2, args)

def d_long(self, *args):
self.d_int(*args)

def d_int(self, *args):
# .long and .int are identical as per GNU assembler documentation
# https://sourceware.org/binutils/docs/as/Long.html
self.append_data(4, args)

def assembler_pass(self, lines):
@@ -263,16 +263,22 @@ def assembler_pass(self, lines):
continue
else:
# machine instruction
func = getattr(opcodes, 'i_' + opcode, None)
func = getattr(opcodes, 'i_' + opcode.lower(), None)
if func is not None:
instruction = func(*args)
# during the first pass, symbols are not all known yet.
# so some expressions may not evaluate to something (yet).
# instruction building requires sane arguments however.
# since all instructions are 4 bytes long, we simply skip
# building instructions during pass 1, and append an "empty
# instruction" to the section to get the right section size.
instruction = 0 if self.a_pass == 1 else func(*args)
self.append_section(instruction.to_bytes(4, 'little'), TEXT)
continue
raise Exception('Unknown opcode or directive: %s' % opcode)
raise ValueError('Unknown opcode or directive: %s' % opcode)
self.finalize_sections()

def assemble(self, text):
lines = remove_comments(text)
def assemble(self, text, remove_comments=True):
lines = do_remove_comments(text) if remove_comments else text.splitlines()
self.init(1) # pass 1 is only to get the symbol table right
self.assembler_pass(lines)
self.symbols.set_bases(self.compute_bases())
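The reasoning in the comment above (skip building instructions in pass 1 and append an empty 4-byte instruction so that section sizes come out right) can be shown with a toy two-pass loop. This is a simplified sketch, not the project's ``Assembler``: the ``label:`` and ``jump NAME`` syntax and the ``assemble`` function here are hypothetical, and an "instruction" is just the 4-byte address of its target.

```python
def assemble(lines):
    """Toy two-pass assembler: 'label:' defines a symbol, 'jump NAME' uses it."""
    symbols = {}
    output = []
    for a_pass in (1, 2):
        offset = 0
        output = []
        for line in lines:
            line = line.strip()
            if line.endswith(':'):
                if a_pass == 1:
                    symbols[line[:-1]] = offset  # record the label address
                continue
            if a_pass == 1:
                # forward references are unknown here, so emit a placeholder
                # of the right size instead of building the instruction
                word = 0
            else:
                op, arg = line.split()
                word = symbols[arg]  # every label is known by pass 2
            output.append(word.to_bytes(4, 'little'))
            offset += 4
    return symbols, b''.join(output)


# the first two jumps refer to a label that is defined later
symbols, binary = assemble(['jump end', 'jump end', 'end:', 'jump end'])
```

Because every instruction has the same size, the placeholder in pass 1 yields the same offsets as the real instruction in pass 2, which is what makes the label addresses collected in pass 1 valid.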