[basic.indet] Reproducibility of erroneous byte values #7801

frederick-vs-ja · 2025-03-28T14:05:50Z

[basic.indet]/1.2 seems too terse to interpret. Does the paragraph require each erroneous byte value to be reproducible every time the same point of execution is reached (possibly even across different executions)?

Can we add an example to clarify which implementation strategies are allowed and which are not?

Eisenwave · 2025-03-28T19:47:17Z

I think this wording is defective. The proposal states design intent:

We propose to change the semantics of reading an uninitialized variable:

Default-initialization of an automatic-storage object initializes the object with a fixed value defined by the implementation; [...]

[basic.indet] p1.2 states:

otherwise, the bytes have erroneous values, where each value is determined by the implementation independently of the state of the program.

To me, there is a mismatch here. The value is intended to be fixed, i.e. some kind of constant, but "independently of the state of the program" doesn't achieve that effect. For example, it would be valid to choose a random value, possibly even truly random using some kind of hardware RNG instruction. That is independent of the "state of the program", but not constant, and not deterministic.

What is the "state of the program" anyway? A C++ program is static information; it's a collection of translation units. The execution has state; the program doesn't; the program simply exists.

Last but not least, it seems perfectly valid to do the following:

// The bytes of x have erroneous values, and the implementation uses 0xCC for all of them.
// The bytes of y have erroneous values, and the implementation uses 0xFF for all of them.
int x, y;

In other words, you can use constants, but not always the same constant.
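To make the candidate strategies concrete, here is a minimal sketch that models them as ordinary functions producing the initial bytes of one object. Everything in it is hypothetical (the `Bytes` alias, the function names, the `decl_index` parameter, the 0xCC/0xFF patterns); it is not how any compiler implements this, and it takes no position on which strategies the wording permits. It only shows that none of the three consults prior memory contents, yet only the first yields one fixed value.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical model: the initial bytes of one object of `size` bytes.
using Bytes = std::vector<unsigned char>;

// Strategy 1: a single constant for every object.
Bytes fill_single_constant(std::size_t size) {
    return Bytes(size, static_cast<unsigned char>(0xCC));
}

// Strategy 2: a constant per object, picked by a hypothetical declaration
// index. Each object gets a constant, but not always the same one.
Bytes fill_per_object_constant(std::size_t size, std::size_t decl_index) {
    static constexpr unsigned char patterns[] = {0xCC, 0xFF};
    return Bytes(size, patterns[decl_index % 2]);
}

// Strategy 3: fresh random bytes on every call. This does not depend on
// any state of the running program, but it is neither constant nor
// deterministic.
Bytes fill_random(std::size_t size) {
    std::random_device rd;
    std::uniform_int_distribution<int> dist(0, 255);
    Bytes out(size);
    for (auto& b : out) {
        b = static_cast<unsigned char>(dist(rd));
    }
    return out;
}

// Usage sketch: x and y from the example above, modelled as 4-byte objects.
void demo_fill_strategies() {
    const Bytes x = fill_per_object_constant(4, 0);
    const Bytes y = fill_per_object_constant(4, 1);
    assert(x != y);  // both are constants, but not the same constant
}
```

The model deliberately never reads an uninitialized variable, so it can be compiled and run under any recent C++ standard.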

tl;dr: the proposal wants one fixed erroneous value for all ints (or whatever the type is), and the wording doesn't seem to guarantee that.

@jensmaurer thoughts?

t3nsor · 2025-04-01T15:03:51Z

The purpose of the "independent of the state of the program" wording is just to make it clear that the implementation can't simply give you whatever bytes were at that (concrete machine) memory address before. But we can't say that in standardese, so the wording is a bit weaselly.

I think it's important to keep that context in mind. There's a limit to how precise we can make this wording. That also means we probably shouldn't twist ourselves into knots figuring out how to exclude implausible implementation strategies like true RNGs.

I don't know the history of the prose part of the paper, but I think at some point there was a hope that it would be able to eliminate all UB caused by reading erroneous variables, by ensuring that the erroneous value is one that's valid for its type. When it got to CWG, it became clear that we couldn't guarantee this in general, because you can create an unsigned char buffer with initially erroneous values and then construct an object into it whose type isn't known until runtime, and thus the value representation used for the initialization of the buffer might not be valid for that type. It's possible that the "fixed value" wording in the prose is just a relic from when you were supposed to get a fixed value that would depend only on the type (to ensure that it's valid).
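A rough sketch of that byte-level mismatch, under loudly stated assumptions: the fill pattern 0xCC is hypothetical, and the model assumes an ABI where the only valid `bool` bytes are 0x00 and 0x01. The helper names and the `Kind` enum are made up for illustration. The code never reads an uninitialized object; it fills the buffer explicitly and only inspects bytes, so it says nothing about what the standard guarantees for objects later constructed in the buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical pattern an implementation might use for a raw byte buffer.
constexpr unsigned char kFillPattern = 0xCC;

// Assumption: on the modelled ABI, bool has exactly two valid
// representations, the single bytes 0x00 and 0x01.
constexpr bool is_valid_bool_byte(unsigned char b) {
    return b == 0x00 || b == 0x01;
}

// Models the buffer being pre-filled before the program knows which type
// it will construct there.
template <std::size_t N>
std::array<unsigned char, N> make_prefilled_buffer() {
    std::array<unsigned char, N> buf{};
    buf.fill(kFillPattern);
    return buf;
}

// The type to construct is only known at run time.
enum class Kind { UnsignedChar, Bool };

// Would the first byte of the buffer be a valid value for the chosen type?
template <std::size_t N>
bool first_byte_valid_for(const std::array<unsigned char, N>& buf, Kind kind) {
    static_assert(N > 0, "buffer must not be empty");
    switch (kind) {
        case Kind::UnsignedChar: return true;  // every byte value is valid
        case Kind::Bool:         return is_valid_bool_byte(buf[0]);
    }
    return false;
}

// Usage sketch: the same pre-filled buffer suits one type but not the other.
void demo_runtime_type() {
    const auto buf = make_prefilled_buffer<16>();
    assert(first_byte_valid_for(buf, Kind::UnsignedChar));
    assert(!first_byte_valid_for(buf, Kind::Bool));
}
```

Since the fill happens before `kind` is known, no single pattern chosen for the buffer can be tailored to the eventual type, which is the limitation described above.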
