Note: only historical bugs are listed in this file.
For the up-to-date bug list, see https://github.com/cosmos72/stmx/issues
KNOWN BUGS
see https://github.com/cosmos72/stmx/issues
32-bit CCL: "#<bogus object @ 0x...> is not of required type short-float" in retry-funs
FIXED BUGS:
7) fixed race condition in GV6/%UPDATE-STAT - it was also the cause of bug 6
6) with HW-TRANSACTIONS enabled, test suite no longer hangs - thanks to bugfix 7.
5) consistent reads were not fully guaranteed. The implementation allowed
   transactions to read inconsistent TVARs in some circumstances.
   Reason: when reading a TVAR, its version must be read twice (and, depending
   on the compiler/CPU, other things are needed as well).
   See doc/consistent-reads.md for a full solution, which has been implemented.
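
The "read the version twice" idea in bug 5 can be sketched roughly as follows. This is only a single-threaded illustration in portable Common Lisp: `demo-tvar`, `demo-write`, `consistent-read` and the `between` hook are hypothetical names, not STMX's API, and the extra steps a real multi-threaded implementation needs (the "other things" mentioned above) are deliberately left out. See doc/consistent-reads.md for the actual solution.

```lisp
;; Hypothetical stand-in for a TVAR: a value plus a version counter
;; that a writer bumps on every commit.
(defstruct demo-tvar
  (version 0)
  (value nil))

(defun demo-write (var new-value)
  "Simulate a commit: store NEW-VALUE, then bump the version."
  (setf (demo-tvar-value var) new-value)
  (incf (demo-tvar-version var))
  var)

(defun consistent-read (var &key (between (lambda ())))
  "Return (values VALUE VERSION) where VERSION was the same before and
after VALUE was read.  BETWEEN is a test hook (an assumption of this sketch)
called between the two version reads; it stands in for a concurrent writer.
No memory barriers are used, so this shows the control flow only."
  (loop
    (let* ((v1 (demo-tvar-version var))      ; first version read
           (value (demo-tvar-value var)))    ; value read
      (funcall between)
      (let ((v2 (demo-tvar-version var)))    ; second version read
        (when (eql v1 v2)
          (return (values value v2)))))))    ; otherwise loop and read again
```

If a writer sneaks in between the two version reads, the versions differ and the read is simply repeated, so the caller never sees a value paired with the wrong version.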
4) initargs of transactional classes were wrapped in TVARs multiple times,
   depending on the length of the list returned by
   (closer-mop:slot-definition-initargs slot)
3) when a transaction signals an error, (run-atomic) was calling (valid? log)
   without locking, so it could get spurious "invalid" answers and
   unnecessarily re-run the transaction (only a waste of resources, not a bug).
   But it could also get spurious "valid" answers when the TVARs it read match
   the current values partially before another thread commits, and partially
   after.
   In the latter case, (run-atomic) would propagate the error to the caller
   => BUG, it must instead re-run the transaction.
   How to fix: when a transaction signals an error, also validate the log
   WITH locking.
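
The decision described in bug 3 can be sketched like this, under stated assumptions: `demo-run-once` is a hypothetical name, and `locked-valid-p` stands for a function that validates the log while holding the necessary locks (the locking itself is not shown, to keep the sketch in portable Common Lisp). It is not the real (run-atomic).

```lisp
(defun demo-run-once (tx-fn locked-valid-p)
  "Call TX-FN once.  On normal return, yield (values :ok RESULT).
If TX-FN signals an error, validate the log WITH locking by calling
LOCKED-VALID-P (hypothetical; assumed to take the locks itself):
  - log valid   => the error is genuine, signal it again for the caller
  - log invalid => the transaction saw inconsistent data, return :RERUN."
  (handler-case (values :ok (funcall tx-fn))
    (error (e)
      (if (funcall locked-valid-p)
          (error e)        ; propagate the original condition
          :rerun))))       ; caller should re-run the transaction
```

The point of the design is that an error is only trusted when the reads that led to it are confirmed consistent under the lock.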
2) (wait-once) was returning without removing the log from the tvars'
   waiting lists. Each tvar would remove all its waiting logs when it
   notified that it had changed, but unchanged tvars could accumulate a LOT
   of enqueued logs, leaking memory.
   Solution: replaced the tvar waiting queue with a hash-table, so now
   (wait-once), before returning, explicitly removes its log from the tvars'
   waiting lists.
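
The fix for bug 2 might look roughly like the sketch below. All names (`demo-var`, `demo-log`, `enqueue-log`, `dequeue-log`, `demo-wait-once`) are hypothetical, the sleep is replaced by a caller-supplied function, and the use of `unwind-protect` is a choice made for this sketch rather than something the notes above state.

```lisp
;; Hypothetical stand-ins: each var keeps its waiters in a hash table
;; keyed by the log object, so one specific log can be removed cheaply.
(defstruct demo-var
  (waiting (make-hash-table :test 'eq)))

(defstruct demo-log
  (reads '()))   ; list of demo-var instances read by the transaction

(defun enqueue-log (log)
  "Register LOG in the waiting table of every var it has read."
  (dolist (var (demo-log-reads log))
    (setf (gethash log (demo-var-waiting var)) t)))

(defun dequeue-log (log)
  "Remove LOG from every waiting table, including vars that never changed."
  (dolist (var (demo-log-reads log))
    (remhash log (demo-var-waiting var))))

(defun demo-wait-once (log sleep-fn)
  "Enqueue LOG, call SLEEP-FN (a placeholder for the real sleep), and
always dequeue LOG before returning, even on a non-local exit."
  (enqueue-log log)
  (unwind-protect
       (funcall sleep-fn)
    (dequeue-log log)))
```

With a plain queue, removing one particular log would mean scanning the queue; keying the table by the log itself makes the explicit removal straightforward.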
1) (commit) could call (condition-notify) too early, before the relevant thread
   is sleeping in (condition-wait) inside (wait-once).
   Solution: since we cannot keep locks on tlogs while also locking
   tvars (DEADLOCK!), we added a flag 'prevent-sleep to tlog,
   and always read/write it while holding a lock on the tlog,
   then we do in (commit):
     (with-lock-held (lock-of log)
       (setf (prevent-sleep-of log) t)
       (condition-notify log (lock-of log)))
   and in (wait-once):
     (with-lock-held (lock-of log)
       (setf (prevent-sleep-of log) nil)) ;; needed? should be the initial value
     ;; ... loop on (reads-of log) to enqueue on their waiting list,
     ;; WHILE checking if they changed ...
     (with-lock-held (lock-of log)
       (unless (prevent-sleep-of log)
         (condition-wait log (lock-of log))))
NOT BUGS
4) (retry) triggers a call to (valid? log) without locking, so in theory it may
   get spurious "valid" answers exactly like in bug 3. Is it a bug or not?
   Not a bug.
   Reason: if the tlog appears valid, (wait-tlog) will repeat the validity
   check with locking before actually sleeping.
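
The reasoning above amounts to "cheap check first, authoritative check before committing to sleep". A minimal sketch, assuming hypothetical names (`demo-decide-retry` and its two predicate arguments) and assuming that an invalid log leads to an immediate re-run:

```lisp
(defun demo-decide-retry (unlocked-valid-p locked-valid-p)
  "Return :SLEEP only when the log is still valid under the lock,
otherwise :RERUN.  UNLOCKED-VALID-P is the cheap check that may give
spurious answers; LOCKED-VALID-P is assumed to take the locks itself.
Both are hypothetical stand-ins for the checks described above."
  (cond ((not (funcall unlocked-valid-p)) :rerun)  ; at worst a wasted re-run
        ((funcall locked-valid-p) :sleep)          ; confirmed valid: safe to wait
        (t :rerun)))                               ; spurious "valid" caught here
```

A spurious "valid" from the unlocked check is harmless here because it never decides anything on its own: the locked check always has the last word before sleeping.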