
Commit 2844227

Merge pull request #43 from wnienhaus/preprocessor
Preprocessor for ULP/RTC macros
2 parents d880b88 + 47d5e8a commit 2844227

26 files changed

+1490
-69
lines changed

.github/workflows/run_tests.yaml

+9-1
Original file line number | Diff line number | Diff line change
@@ -70,5 +70,13 @@ jobs:
7070
export PATH=$PATH:${{ steps.build_micropython.outputs.bin_dir }}
7171
export PATH=$PATH:${{ steps.build_binutils.outputs.bin_dir }}
7272
cd tests
73-
ln -s ../binutils-esp32ulp # already cloned earlier. reuse.
7473
./01_compat_tests.sh
74+
75+
- name: Run compat tests with RTC macros
76+
id: compat_rtc_tests
77+
run: |
78+
export PATH=$PATH:${{ steps.build_micropython.outputs.bin_dir }}
79+
export PATH=$PATH:${{ steps.build_binutils.outputs.bin_dir }}
80+
cd tests
81+
ln -s ../binutils-esp32ulp # already cloned earlier. reuse.
82+
./02_compat_rtc_tests.sh

AUTHORS

+8
@@ -0,0 +1,8 @@
1+
E-mail addresses listed here are not intended for support.
2+
3+
py-esp32-ulp authors
4+
--------------------
5+
py-esp32-ulp is written and maintained by Thomas Waldmann and various contributors:
6+
7+
- Thomas Waldmann <[email protected]>
8+
- Wilko Nienhaus <[email protected]>

LICENSE

+1-1
@@ -1,6 +1,6 @@
11
The MIT License (MIT)
22

3-
Copyright (c) 2018 Thomas Waldmann
3+
Copyright 2018-2021 by the py-esp32-ulp authors, see AUTHORS file
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal

README.rst

+9
@@ -17,9 +17,18 @@ Status
1717

1818
The most commonly used simple stuff should work.
1919

20+
Expressions in assembly source code are supported and get evaluated during
21+
assembling. Only expressions evaluating to a single integer are supported.
22+
Constants defined with ``.set`` are supported in expressions.
23+
2024
We have some unit tests and also compatibility tests that compare the output
2125
whether it is identical with binutils-esp32ulp output.
2226

27+
There is a simple preprocessor that understands just enough to allow assembling
28+
ULP source files containing convenience macros such as WRITE_RTC_REG. The
29+
preprocessor and how to use it are documented here:
30+
`Preprocessor support <docs/preprocess.rst>`_.
31+
2332
There might be some stuff missing, some bugs and other symptoms of alpha
2433
software. Also, error and exception handling is rather rough yet.
2534

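The README change above says that expressions are evaluated while assembling, must reduce to a single integer, and may use constants defined with ``.set``. The following is a minimal sketch of that idea, not the project's implementation: ``eval_int_expr`` and the ``constants`` dict are hypothetical names, and the sketch uses CPython's ``ast`` module, which may not be available on MicroPython.

```python
import ast
import operator

# Operators this sketch accepts; anything else is rejected.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.BitOr: operator.or_,
    ast.BitAnd: operator.and_,
}


def eval_int_expr(expr, constants):
    """Evaluate an integer-only expression (hypothetical helper).

    `constants` stands in for symbols defined with `.set`.
    """
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.Name):
            return constants[node.id]  # KeyError if the constant is unknown
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError('unsupported expression: %r' % expr)

    return walk(ast.parse(expr, mode='eval'))


# A constant as it might be defined by `.set base, 0x10`
print(eval_int_expr('base + 2 * 4', {'base': 0x10}))
```

Walking the syntax tree instead of calling ``eval`` keeps the accepted grammar small, which matches the "single integer" restriction stated in the README.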
docs/preprocess.rst

+138
@@ -0,0 +1,138 @@
1+
Preprocessor
2+
---------------------
3+
4+
py-esp32-ulp contains a small preprocessor, which aims to fulfill one goal:
5+
facilitate assembling of ULP code from Espressif and other open-source
6+
projects to loadable/executable machine code without modification.
7+
8+
Such code uses convenience macros (``READ_RTC_*`` and ``WRITE_RTC_*``)
9+
provided by the ESP-IDF framework, along with constants defined in the
10+
framework's include files (such as ``RTC_GPIO_IN_REG``), to make reading
11+
and writing from/to peripheral registers much easier.
12+
13+
In order to do this the preprocessor has two capabilities:
14+
15+
1. Parse and replace identifiers defined with ``#define``
16+
2. Recognise the ``WRITE_RTC_*`` and ``READ_RTC_*`` macros and expand
17+
them in a way that mirrors what the real ESP-IDF macros do.
18+
19+
20+
Usage
21+
------------------------
22+
23+
Normally the assembler is called as follows:
24+
25+
.. code-block:: python
26+
27+
src = "..full assembler file contents"
28+
assembler = Assembler()
29+
assembler.assemble(src)
30+
...
31+
32+
With the preprocessor, simply pass the source code via the preprocessor first:
33+
34+
.. code-block:: python
35+
36+
from preprocess import preprocess
37+
38+
src = "..full assembler file contents"
39+
src = preprocess(src)
40+
assembler = Assembler()
41+
assembler.assemble(src)
42+
...
43+
44+
45+
Using a "Defines Database"
46+
--------------------------
47+
48+
Because the py-esp32-ulp assembler was built for running on the ESP32
49+
microcontroller with limited RAM, the preprocessor aims to work there too.
50+
51+
To handle a large number of defined constants (such as the ``RTC_*`` constants from
52+
the ESP-IDF) the preprocessor can use a database (based on BerkeleyDB) stored on the
53+
device's filesystem for looking up defines.
54+
55+
The database needs to be populated before preprocessing. (Usually, when only using
56+
constants from the ESP-IDF, this is a one-time step, because the include files
57+
don't change.) The database can be reused for all subsequent preprocessor runs.
58+
59+
(The database can also be generated on a PC and then deployed to the ESP32, to
60+
save processing effort on the device. In that case the include files themselves
61+
are not needed on the device either.)
62+
63+
1. Build the defines database
64+
65+
The ``esp32_ulp.parse_to_db`` tool can be used to generate the defines
66+
database from include files. The resulting file will be called
67+
``defines.db``.
68+
69+
(The following assumes running on a PC. To do this on device, refer to the
70+
`esp32_ulp/parse_to_db.py <../esp32_ulp/parse_to_db.py>`_ file.)
71+
72+
.. code-block:: bash
73+
74+
# general command
75+
micropython -m esp32_ulp.parse_to_db path/to/include.h
76+
77+
# loading specific ESP-IDF include files
78+
micropython -m esp32_ulp.parse_to_db esp-idf/components/soc/esp32/include/soc/soc_ulp.h
79+
80+
# loading multiple files at once
81+
micropython -m esp32_ulp.parse_to_db esp-idf/components/soc/esp32/include/soc/*.h
82+
83+
# if file system space is not a concern, the following can be convenient
84+
# by including all relevant include files from the ESP-IDF framework.
85+
# This results in an approximately 2MB large database.
86+
micropython -m esp32_ulp.parse_to_db \
87+
esp-idf/components/soc/esp32/include/soc/*.h \
88+
esp-idf/components/esp_common/include/*.h
89+
90+
# most ULP code uses only 5 include files. Parsing only those into the
91+
# database should thus allow assembling virtually all ULP code one would
92+
# find or want to write.
93+
# This results in an approximately 250kB large database.
94+
micropython -m esp32_ulp.parse_to_db \
95+
esp-idf/components/soc/esp32/include/soc/{soc,soc_ulp,rtc_cntl_reg,rtc_io_reg,sens_reg}.h
96+
97+
2. Using the defines database during preprocessing
98+
99+
The preprocessor will automatically use a defines database, when using the
100+
``preprocess.preprocess`` convenience function, even when the database does
101+
not exist (an absent database is treated like an empty database, and care
102+
is taken not to create an empty database file, cluttering up the filesystem,
103+
when not needed).
104+
105+
If you do not want the preprocessor to use a DefinesDB, pass ``False`` to
106+
the ``use_defines_db`` argument of the ``preprocess`` convenience function,
107+
or instantiate the ``Preprocessor`` class directly, without passing it a
108+
DefinesDB instance via ``use_db``.
109+
110+
Design choices
111+
--------------
112+
113+
The preprocessor does not support:
114+
115+
1. Function style macros such as :code:`#define f(a,b) (a+b)`
116+
117+
This is not important, because there are only few RTC macros that need
118+
to be supported and they are simply implemented as Python functions.
119+
120+
Since the preprocessor will understand ``#define`` directives directly in the
121+
assembler source file, include mechanisms are not needed in some cases
122+
(simply copying the needed ``#define`` statements from include files into the
123+
assembler source will work).
124+
125+
2. ``#include`` directives
126+
127+
The preprocessor does not currently follow ``#include`` directives. To
128+
limit space requirements (both in memory and on the filesystem), the
129+
preprocessor relies on a database of defines (key/value pairs). This
130+
database should be populated before using the preprocessor, by using the
131+
``esp32_ulp.parse_to_db`` tool (see section above), which parses include
132+
files for identifiers defined therein.
133+
134+
3. Preserving comments
135+
136+
The assumption is that the output will almost always go into the
137+
assembler directly, so preserving comments is not very useful and
138+
would add a lot of complexity.
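The first capability described in docs/preprocess.rst (replacing identifiers defined with ``#define``, with a fallback lookup in a defines database) can be sketched as below. This is an illustration under stated assumptions, not the code added by this commit: ``expand_defines`` is a hypothetical name, a plain dict stands in for the defines database, replacement is a single non-recursive pass, and comments are not handled.

```python
import re

# An identifier that starts at a word boundary, so the "x10" inside
# "0x10" is not mistaken for a name.
_IDENT = re.compile(r'\b[A-Za-z_]\w*')


def expand_defines(src, db=None):
    """Replace #define'd identifiers in `src` (hypothetical helper).

    `db` stands in for the defines database; any mapping of
    name -> replacement text works for this sketch.
    """
    defines = {}

    def lookup(match):
        name = match.group(0)
        if name in defines:
            return defines[name]
        if db is not None and name in db:
            return db[name]
        return name  # not a known define; leave untouched

    out = []
    for line in src.splitlines():
        stripped = line.strip()
        if stripped.startswith('#define'):
            parts = stripped.split(None, 2)
            # Function-style macros are not supported; they are dropped.
            if len(parts) >= 2 and '(' not in parts[1]:
                defines[parts[1]] = parts[2] if len(parts) == 3 else ''
            continue  # directives never reach the assembler
        out.append(_IDENT.sub(lookup, line))
    return '\n'.join(out)


src = "#define LED_PIN 2\n  move r0, LED_PIN\n  move r1, RTC_X"
print(expand_defines(src, {'RTC_X': '0x3ff48400'}))
```

Checking the in-source defines before the database mirrors the documented idea that ``#define`` statements copied into the assembler source work without any database at all.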

esp32_ulp/__main__.py

+3-1
@@ -2,14 +2,16 @@
22

33
from .util import garbage_collect
44

5+
from .preprocess import preprocess
56
from .assemble import Assembler
67
from .link import make_binary
78
garbage_collect('after import')
89

910

1011
def src_to_binary(src):
1112
assembler = Assembler()
12-
assembler.assemble(src)
13+
src = preprocess(src)
14+
assembler.assemble(src, remove_comments=False) # comments already removed by preprocessor
1315
garbage_collect('before symbols export')
1416
addrs_syms = assembler.symbols.export()
1517
for addr, sym in addrs_syms:

esp32_ulp/assemble.py

+41-35
@@ -3,7 +3,7 @@
33
"""
44

55
from . import opcodes
6-
from .nocomment import remove_comments
6+
from .nocomment import remove_comments as do_remove_comments
77
from .util import garbage_collect
88

99
TEXT, DATA, BSS = 'text', 'data', 'bss'
@@ -12,13 +12,10 @@
1212

1313

1414
class SymbolTable:
15-
def __init__(self, symbols, bases):
15+
def __init__(self, symbols, bases, globals):
1616
self._symbols = symbols
1717
self._bases = bases
18-
self._pass = None
19-
20-
def set_pass(self, _pass):
21-
self._pass = _pass
18+
self._globals = globals
2219

2320
def set_bases(self, bases):
2421
self._bases = bases
@@ -32,38 +29,28 @@ def get_from(self):
3229
def set_sym(self, symbol, stype, section, value):
3330
entry = (stype, section, value)
3431
if symbol in self._symbols and entry != self._symbols[symbol]:
35-
raise Exception('redefining symbol %s with different value %r -> %r.' % (label, self._symbols[symbol], entry))
32+
raise Exception('redefining symbol %s with different value %r -> %r.' % (symbol, self._symbols[symbol], entry))
3633
self._symbols[symbol] = entry
3734

3835
def has_sym(self, symbol):
3936
return symbol in self._symbols
4037

4138
def get_sym(self, symbol):
42-
try:
43-
entry = self._symbols[symbol]
44-
except KeyError:
45-
if self._pass == 1:
46-
entry = (REL, TEXT, 0) # for a dummy, this is good enough
47-
else:
48-
raise
39+
entry = self._symbols[symbol]
4940
return entry
5041

5142
def dump(self):
5243
for symbol, entry in self._symbols.items():
5344
print(symbol, entry)
5445

55-
def export(self):
56-
addrs_syms = [(self.resolve_absolute(entry), symbol) for symbol, entry in self._symbols.items()]
46+
def export(self, incl_non_globals=False):
47+
addrs_syms = [(self.resolve_absolute(entry), symbol)
48+
for symbol, entry in self._symbols.items()
49+
if incl_non_globals or symbol in self._globals]
5750
return sorted(addrs_syms)
5851

5952
def to_abs_addr(self, section, offset):
60-
try:
61-
base = self._bases[section]
62-
except KeyError:
63-
if self._pass == 1:
64-
base = 0 # for a dummy this is good enough
65-
else:
66-
raise
53+
base = self._bases[section]
6754
return base + offset
6855

6956
def resolve_absolute(self, symbol):
@@ -93,16 +80,19 @@ def resolve_relative(self, symbol):
9380
from_addr = self.to_abs_addr(self._from_section, self._from_offset)
9481
return sym_addr - from_addr
9582

83+
def set_global(self, symbol):
84+
self._globals[symbol] = True
85+
pass
86+
9687

9788
class Assembler:
9889

99-
def __init__(self, symbols=None, bases=None):
100-
self.symbols = SymbolTable(symbols or {}, bases or {})
90+
def __init__(self, symbols=None, bases=None, globals=None):
91+
self.symbols = SymbolTable(symbols or {}, bases or {}, globals or {})
10192
opcodes.symbols = self.symbols # XXX dirty hack
10293

10394
def init(self, a_pass):
10495
self.a_pass = a_pass
105-
self.symbols.set_pass(a_pass)
10696
self.sections = dict(text=[], data=[])
10797
self.offsets = dict(text=0, data=0, bss=0)
10898
self.section = TEXT
@@ -118,7 +108,7 @@ def parse_line(self, line):
118108
"""
119109
if not line:
120110
return
121-
has_label = line[0] not in '\t '
111+
has_label = line[0] not in '\t .'
122112
if has_label:
123113
label_line = line.split(None, 1)
124114
if len(label_line) == 2:
@@ -150,8 +140,10 @@ def append_section(self, value, expected_section=None):
150140
if expected_section is not None and s is not expected_section:
151141
raise TypeError('only allowed in %s section' % expected_section)
152142
if s is BSS:
153-
# just increase BSS size by value
154-
self.offsets[s] += value
143+
if int.from_bytes(value, 'little') != 0:
144+
raise ValueError('attempt to store non-zero value in section .bss')
145+
# just increase BSS size by length of value
146+
self.offsets[s] += len(value)
155147
else:
156148
self.sections[s].append(value)
157149
self.offsets[s] += len(value)
@@ -231,9 +223,12 @@ def d_align(self, align=4, fill=None):
231223
self.fill(self.section, amount, fill)
232224

233225
def d_set(self, symbol, expr):
234-
value = int(expr) # TODO: support more than just integers
226+
value = int(opcodes.eval_arg(expr))
235227
self.symbols.set_sym(symbol, ABS, None, value)
236228

229+
def d_global(self, symbol):
230+
self.symbols.set_global(symbol)
231+
237232
def append_data(self, wordlen, args):
238233
data = [int(arg).to_bytes(wordlen, 'little') for arg in args]
239234
self.append_section(b''.join(data))
@@ -245,6 +240,11 @@ def d_word(self, *args):
245240
self.append_data(2, args)
246241

247242
def d_long(self, *args):
243+
self.d_int(*args)
244+
245+
def d_int(self, *args):
246+
# .long and .int are identical as per GNU assembler documentation
247+
# https://sourceware.org/binutils/docs/as/Long.html
248248
self.append_data(4, args)
249249

250250
def assembler_pass(self, lines):
@@ -263,16 +263,22 @@ def assembler_pass(self, lines):
263263
continue
264264
else:
265265
# machine instruction
266-
func = getattr(opcodes, 'i_' + opcode, None)
266+
func = getattr(opcodes, 'i_' + opcode.lower(), None)
267267
if func is not None:
268-
instruction = func(*args)
268+
# during the first pass, symbols are not all known yet.
269+
# so some expressions may not evaluate to something (yet).
270+
# instruction building requires sane arguments however.
271+
# since all instructions are 4 bytes long, we simply skip
272+
# building instructions during pass 1, and append an "empty
273+
# instruction" to the section to get the right section size.
274+
instruction = 0 if self.a_pass == 1 else func(*args)
269275
self.append_section(instruction.to_bytes(4, 'little'), TEXT)
270276
continue
271-
raise Exception('Unknown opcode or directive: %s' % opcode)
277+
raise ValueError('Unknown opcode or directive: %s' % opcode)
272278
self.finalize_sections()
273279

274-
def assemble(self, text):
275-
lines = remove_comments(text)
280+
def assemble(self, text, remove_comments=True):
281+
lines = do_remove_comments(text) if remove_comments else text.splitlines()
276282
self.init(1) # pass 1 is only to get the symbol table right
277283
self.assembler_pass(lines)
278284
self.symbols.set_bases(self.compute_bases())
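The comment added in ``assembler_pass`` explains that pass 1 appends an "empty instruction" instead of building a real one, because every instruction is 4 bytes long and only the section size matters until all symbols are known. The toy sketch below shows that two-pass idea in isolation. It is not the project's assembler: ``assemble_two_pass``, the ``encode`` callback and the label syntax are hypothetical, and offsets here are plain byte offsets.

```python
def assemble_two_pass(lines, encode):
    """Toy two-pass assembler (hypothetical helper).

    Pass 1 records label offsets using 4-byte placeholders.
    Pass 2 builds real instruction words with all labels known.
    """
    symbols = {}
    words = []
    for a_pass in (1, 2):
        offset = 0
        words = []
        for line in lines:
            line = line.strip()
            if not line:
                continue
            if line.endswith(':'):
                if a_pass == 1:
                    symbols[line[:-1]] = offset
                continue
            # Forward references cannot be resolved in pass 1, so a
            # zero word is appended just to keep the offsets right.
            word = 0 if a_pass == 1 else encode(line, symbols)
            words.append(word.to_bytes(4, 'little'))
            offset += 4
    return b''.join(words), symbols


def toy_encode(line, symbols):
    # Made-up encoding: the instruction word is simply the target offset.
    return symbols[line.split()[1]]


binary, symbols = assemble_two_pass(
    ['start:', 'jump end', 'jump start', 'end:'], toy_encode)
print(symbols, binary.hex())
```

Skipping instruction building in pass 1 means the symbol table no longer needs dummy entries for unknown symbols, which is consistent with the removal of ``set_pass`` and the ``KeyError`` fallbacks in this diff.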
