[basic.indet] Reproducibility of erroneous byte values #7801

Open
frederick-vs-ja opened this issue Mar 28, 2025 · 2 comments

@frederick-vs-ja
Contributor

[basic.indet]/1.2 seems too terse to read unambiguously. Does the paragraph require that each erroneous byte value be reproducible each time the execution point is reached (possibly even across different executions)?

Can we add an example to clarify which implementation strategies are allowed and which are not?

@Eisenwave
Contributor

Eisenwave commented Mar 28, 2025

I think this wording is defective. The proposal states design intent:

We propose to change the semantics of reading an uninitialized variable:

Default-initialization of an automatic-storage object initializes the object with a fixed value defined by the implementation; [...]

[basic.indet] p1.2 states:

otherwise, the bytes have erroneous values, where each value is determined by the implementation independently of the state of the program.

To me, there is a mismatch here. The value is intended to be fixed, i.e. some kind of constant, but "independently of the state of the program" doesn't achieve that effect. For example, it would be valid to choose a random value, possibly even truly random using some kind of hardware RNG instruction. That is independent of the "state of the program", but not constant, and not deterministic.

What is the "state of the program" anyway? A C++ program is static information; it's a collection of translation units. The execution has state; the program doesn't; the program simply exists.

Last but not least, it seems perfectly valid to do the following:

// The bytes of x have erroneous value, and the implementation uses 0xCC for all.
// The bytes of y have erroneous value, and the implementation uses 0xFF for all.
int x, y;

In other words, you can use constants, but not always the same constant.
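The strategies mentioned so far (one fixed constant, a different constant per variable, and an RNG) can be compared in a small model. This is only a sketch: `Bytes`, `fill_fixed`, `fill_per_variable`, `fill_random` and `check_strategies` are hypothetical names, `std::random_device` merely stands in for a hardware RNG, and the functions simulate what an implementation might store rather than reading any uninitialized object.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical model: the bytes an implementation might store into a fresh
// automatic int. None of these names come from the standard or a compiler.
using Bytes = std::array<std::uint8_t, sizeof(int)>;

// Strategy A: one fixed constant for every object (the paper's stated intent).
inline Bytes fill_fixed() {
    Bytes b;
    b.fill(0xCC);
    return b;
}

// Strategy B: a constant per variable, but not the same constant everywhere.
inline Bytes fill_per_variable(std::size_t variable_index) {
    Bytes b;
    b.fill(variable_index % 2 == 0 ? 0xCC : 0xFF);
    return b;
}

// Strategy C: bytes drawn from a source that ignores the execution's state.
inline Bytes fill_random() {
    std::random_device rd;  // stand-in for a hardware RNG instruction
    std::uniform_int_distribution<int> dist(0, 255);
    Bytes b;
    for (auto& byte : b) byte = static_cast<std::uint8_t>(dist(rd));
    return b;
}

// Which strategies give reproducible bytes?
inline void check_strategies() {
    assert(fill_fixed() == fill_fixed());                  // A: always the same
    assert(fill_per_variable(0) == fill_per_variable(0));  // B: same per variable
    assert(fill_per_variable(0) != fill_per_variable(1));  // B: differs across variables
    // C: nothing can be asserted; two draws may or may not match.
}
```

All three choose bytes without looking at what the execution has done so far, yet only strategy A gives the single fixed value that the proposal's prose describes.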

tl;dr: the proposal wants a fixed erroneous value for all ints (or whatever type), and the wording doesn't seem to guarantee that.

@jensmaurer thoughts?

@t3nsor
Contributor

t3nsor commented Apr 1, 2025

The purpose of the "independent of the state of the program" wording is just to make it clear that the implementation can't just give you whatever bytes were at that (concrete machine) memory address before. But we can't say that in standardese, so the wording is a bit weasely.
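The distinction drawn here can be sketched with a simulated stack slot. Everything below is a hypothetical model (`Slot`, `earlier_call`, `reuse_stale`, `reuse_pattern` and the `0xCC` pattern are illustrative assumptions, not real compiler behavior): one strategy hands back whatever an earlier call left behind, the other overwrites the slot without consulting its previous contents.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of one reusable 4-byte stack slot.
struct Slot {
    std::array<std::uint8_t, 4> bytes{};
};

// An earlier call leaves some value behind in the slot.
inline void earlier_call(Slot& slot, std::uint32_t leftover) {
    std::memcpy(slot.bytes.data(), &leftover, sizeof leftover);
}

// Strategy the wording is meant to rule out: the new object's bytes are
// whatever was previously stored at that address.
inline std::array<std::uint8_t, 4> reuse_stale(const Slot& slot) {
    return slot.bytes;
}

// Strategy the wording is meant to allow: overwrite the slot with a pattern
// chosen without looking at its previous contents.
inline std::array<std::uint8_t, 4> reuse_pattern(Slot& slot) {
    slot.bytes.fill(0xCC);
    return slot.bytes;
}

// Two executions that differ only in what the earlier call stored.
inline void check_slot_reuse() {
    Slot a, b;
    earlier_call(a, 0x11111111u);
    earlier_call(b, 0x22222222u);
    assert(reuse_stale(a) != reuse_stale(b));      // depends on prior state
    assert(reuse_pattern(a) == reuse_pattern(b));  // independent of prior state
}
```

In this model the stale-bytes result changes with the leftover value, while the pattern fill yields the same bytes in both runs.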

I think it's important to keep that context in mind. There's a limit to how precise we can make this wording. That also means we probably shouldn't twist ourselves into knots figuring out how to exclude implausible implementation strategies like true RNGs.

I don't know the history of the prose part of the paper, but I think at some point there was a hope that it would be able to eliminate all UB caused by reading erroneous variables, by ensuring that the erroneous value is one that's valid for its type. When it got to CWG, it became clear that we couldn't guarantee this in general, because you can create an unsigned char buffer with initially erroneous values and then construct an object into it whose type isn't known until runtime, and thus the value representation used for the initialization of the buffer might not be valid for that type. It's possible that the "fixed value" wording in the prose is just a relic from when you were supposed to get a fixed value that would depend only on the type (to ensure that it's valid).
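A rough illustration of the buffer problem, under stated assumptions: the fill pattern is `0xCC`, and `bool` is taken to occupy one byte whose only valid representations are 0 and 1 (typical of mainstream ABIs, not guaranteed by the standard). `Kind`, `byte_is_valid_for` and `fill_is_valid_for` are hypothetical names. The sketch only inspects bytes as `unsigned char`; it never reads a `bool` with an invalid representation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::uint8_t kFill = 0xCC;  // assumed fill pattern

// The type later constructed in the buffer, chosen only at run time.
enum class Kind { UnsignedChar, Bool };

// Assumption: a bool byte is valid only if it holds 0 or 1.
inline bool byte_is_valid_for(Kind kind, std::uint8_t byte) {
    switch (kind) {
        case Kind::UnsignedChar: return true;  // every byte value is valid
        case Kind::Bool:         return byte == 0 || byte == 1;
    }
    return false;
}

// The buffer is filled before the eventual type is known, so the pattern
// cannot be tailored to that type.
inline bool fill_is_valid_for(Kind kind) {
    unsigned char buffer[16];
    std::memset(buffer, kFill, sizeof buffer);  // stands in for the implementation's fill
    return byte_is_valid_for(kind, buffer[0]);
}

inline void check_fill() {
    assert(fill_is_valid_for(Kind::UnsignedChar));
    assert(!fill_is_valid_for(Kind::Bool));  // 0xCC is not a valid bool here
}
```

Under these assumptions, no single byte pattern chosen at buffer creation can be guaranteed valid for every type that might later be constructed in that storage.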
