Implement changes of function reference proposal #2562

Open
wants to merge 2 commits into
base: main

zherczeg
Contributor

No description provided.

@zherczeg
Contributor Author

@sbc100 @keithw since I don't know enough about the project, these are just some random changes in the parser.

The aim is parsing https://github.com/WebAssembly/function-references/blob/main/test/core/call_ref.wast

It is doing something (parsing correctly is surely an exaggeration) with call_ref $ii and ref null $ii. The next item is ref.null $ii. It looks like it will need API changes, since a token type will not be enough. Would adding an index to it be a good idea?

Could you check that these changes make sense? There is a Type::Reference enum in the code, but the function references proposal introduces two other enums and does not use Reference. This is strange to me.

Thank you for the help.

@zherczeg
Contributor Author

This patch goes in a new direction. The 32-bit type field of Var is split into two 16-bit fields. The second can be used to store WebAssembly types (negative 7-bit numbers). This way Var can be used in a more generic way.
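The split described above can be pictured with a minimal sketch. It assumes nothing about wabt's real Var beyond what the comment says: the names PackedVarTag, VarKind and MakeTypedVar are hypothetical, and the actual patch may lay the fields out differently.

```cpp
#include <cstdint>

// Hypothetical sketch, not wabt's actual Var: the kind tag and an optional
// type code share one 32-bit slot as two 16-bit fields.
enum class VarKind : int16_t { Index, Name };

struct PackedVarTag {
  VarKind kind;      // what the Var holds: an index or a name
  int16_t opt_type;  // negative WebAssembly type code; 0 means "no type"

  bool has_type() const { return opt_type != 0; }
};

// The pair occupies the same space as the original 32-bit field.
static_assert(sizeof(PackedVarTag) == sizeof(uint32_t),
              "two 16-bit fields fit in the original 32-bit slot");

// FuncRef = -0x10 is taken from the Type enum shown in the diff.
constexpr int16_t kFuncRef = -0x10;

inline PackedVarTag MakeTypedVar(VarKind kind, int16_t type_code) {
  return PackedVarTag{kind, type_code};
}
```

Since valid type codes are all negative, 0 is free to act as the "no optional type" marker, which is why no extra flag is needed.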

@zherczeg zherczeg force-pushed the function_ref branch 2 times, most recently from c804a98 to c40c098 Compare March 13, 2025 12:53
@zherczeg
Contributor Author

There is something I don't understand. There is a Reference = -0x15 in type.h. However, in
https://github.com/WebAssembly/function-references/blob/main/proposals/function-references/Overview.md
(ref ht) is defined as -0x1c and (ref null ht) is -0x1d. Is the old code obsolete?
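For orientation, the negative codes quoted here map to single bytes in the binary format by keeping the low 7 bits, which is how the existing entries pair -0x10 with 0x70 and -0x15 with 0x6b. The sketch below only does that arithmetic; EncodeTypeCode is a hypothetical helper, and it does not settle whether Reference is obsolete.

```cpp
#include <cstdint>

// Single-byte signed LEB128 encoding of a negative type code in [-64, -1]:
// keep the low 7 bits and leave the continuation bit (0x80) clear.
// Hypothetical helper for illustration; not a wabt function.
constexpr uint8_t EncodeTypeCode(int32_t code) {
  return static_cast<uint8_t>(code & 0x7f);
}

// Pairs already listed in type.h according to the diff.
static_assert(EncodeTypeCode(-0x10) == 0x70, "FuncRef");
static_assert(EncodeTypeCode(-0x11) == 0x6f, "ExternRef");
static_assert(EncodeTypeCode(-0x15) == 0x6b, "Reference");

// The two codes quoted from the proposal's Overview.md.
static_assert(EncodeTypeCode(-0x1c) == 0x64, "(ref ht)");
static_assert(EncodeTypeCode(-0x1d) == 0x63, "(ref null ht)");
```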

The key change of the patch is that Var can also behave like Type, and transformation between the two is possible after names are resolved.

Btw, may I ask how actively maintained the wabt project is?

@sbc100
Member

sbc100 commented Mar 13, 2025

I believe ht refers to Heap Type which would not be needed for function references. I believe heap types refer to things like structs and arrays in the GC proposal.

@zherczeg
Contributor Author

That is possible. Unfortunately, not everything is described clearly in the text. For example, ref.null ht: [] -> [(ref null ht)]. Which encoding is used for ht here?

https://github.com/WebAssembly/function-references/blob/main/proposals/function-references/Overview.md

@zherczeg
Contributor Author

I have tried to track down the v8 implementation for ref.null.

https://github.com/v8/v8/blob/main/src/wasm/function-body-decoder-impl.h#L386

It calls read_heap_type. It is in the same file.

https://github.com/v8/v8/blob/main/src/wasm/function-body-decoder-impl.h#L217

Starts with:
decoder->read_i33v(pc, "heap type");

So it looks like it uses the s33 decoding that the specification mentions.
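A minimal sketch of what an s33 reader could look like follows. It is not the V8 code and not the ReadS33Leb128 added by this patch: DecodeS33 is a hypothetical name, and it returns a signed int64_t for readability. The point is that one decoder covers both cases: a non-negative result is a type index, and a negative result is an abstract heap type code such as -0x10.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration. Decodes a signed 33-bit LEB128 value
// from [p, end). Returns the number of bytes consumed, or 0 on error.
size_t DecodeS33(const uint8_t* p, const uint8_t* end, int64_t* out_value) {
  uint64_t result = 0;
  int shift = 0;
  // A 33-bit value needs at most ceil(33 / 7) = 5 bytes.
  for (size_t i = 0; i < 5 && p + i < end; ++i) {
    const uint8_t byte = p[i];
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if (byte & 0x80) {
      continue;  // continuation bit set: more bytes follow
    }
    if (i == 4) {
      // In the fifth byte, bit 4 is the sign bit (value bit 32); bits 5 and 6
      // lie beyond 33 bits and must repeat the sign bit.
      const uint8_t top = byte & 0x70;
      if (top != 0 && top != 0x70) {
        return 0;
      }
    }
    if (byte & 0x40) {
      result |= ~uint64_t{0} << shift;  // sign-extend a negative value
    }
    *out_value = static_cast<int64_t>(result);
    return i + 1;
  }
  return 0;  // truncated input, or continuation bit set on the fifth byte
}
```

With this shape, the byte 0x70 decodes to -0x10 and the byte 0x05 decodes to type index 5.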

@zherczeg zherczeg force-pushed the function_ref branch 5 times, most recently from c94e5a1 to a959acb Compare March 14, 2025 16:37
@zherczeg
Contributor Author

I have some good news! This test is fully working now (including the interpreter):
https://github.com/WebAssembly/function-references/blob/main/test/core/call_ref.wast

Only 9 failures remain. They all look related to the CallRef syntax change, so some tests need to be updated.

The patch is quite large. Some parts are questionable. I will comment on these parts later to help the review.

@zherczeg zherczeg force-pushed the function_ref branch 2 times, most recently from 78596d6 to b79aead Compare March 15, 2025 06:49
@@ -181,7 +181,7 @@ class BinaryReaderLogging : public BinaryReaderDelegate {
Result OnCatchExpr(Index tag_index) override;
Result OnCatchAllExpr() override;
Result OnCallIndirectExpr(Index sig_index, Index table_index) override;
- Result OnCallRefExpr() override;
+ Result OnCallRefExpr(Type sig_type) override;
Contributor Author

I think type sounds better than Index, although it could be an Index.

Index,
Name,
};

struct Var {
// Var can represent variables or types.
Contributor Author

This is a major internal change. It does not increase the size of Var. I think it is better than constructing a Var/Type pair.


Location loc;

private:
void Destroy();

VarType type_;
// Can be set to Type::Enum types, 0 represent no optional type.
int16_t opt_type_;
Contributor Author

This range should be enough for a long time.

@@ -63,6 +63,7 @@ void WriteS32Leb128(Stream* stream, T value, const char* desc) {
size_t ReadU32Leb128(const uint8_t* p, const uint8_t* end, uint32_t* out_value);
size_t ReadU64Leb128(const uint8_t* p, const uint8_t* end, uint64_t* out_value);
size_t ReadS32Leb128(const uint8_t* p, const uint8_t* end, uint32_t* out_value);
size_t ReadS33Leb128(const uint8_t* p, const uint8_t* end, uint64_t* out_value);
Contributor Author

The spec adds this new type; the output is uint64_t. It could be merged with ReadS32Leb128 by adding a bool* argument.

@@ -220,18 +221,6 @@ class SharedValidator {
Result OnUnreachable(const Location&);

private:
struct FuncType {
Contributor Author

This is moved to type checker.

@@ -46,6 +46,8 @@ class Type {
FuncRef = -0x10, // 0x70
ExternRef = -0x11, // 0x6f
Reference = -0x15, // 0x6b
Contributor Author

This is completely missing from the proposal. Maybe it was removed at some point? Currently Ref and Reference are considered equal, and one or the other is used at random. I wanted to wait for feedback before doing anything with it.

Contributor Author

In v8 it is kStructRefCode:
https://chromium.googlesource.com/v8/v8/+/refs/heads/main/src/wasm/wasm-constants.h#46

Probably used by GC. Probably kArrayRefCode will be needed as well.

Result BinaryReaderLogging::name(Type type) { \
LOGF(#name "(%s)\n", type.GetName().c_str()); \
return reader_->name(type); \
#define DEFINE_TYPE(name) \
Contributor Author

Hm, this might be reverted. I will check.

@@ -897,7 +897,7 @@ Result BinaryReaderObjdumpDisassemble::OnOpcodeType(Type type) {
if (!in_function_body) {
return Result::Ok;
}
- if (current_opcode == Opcode::SelectT) {
+ if (current_opcode == Opcode::SelectT || current_opcode == Opcode::CallRef) {
Contributor Author

Maybe more opcodes could go here (ref.null?). This is confusing to me: the two toString variants work differently, although they could be merged.

@@ -288,6 +286,40 @@ size_t ReadS32Leb128(const uint8_t* p,
}
}

size_t ReadS33Leb128(const uint8_t* p,
Contributor Author

These readers don't check that the minimum number of bytes is used for the encoding. This is important for UTF-8; I am not sure about LEB128.
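On the LEB128 side, as far as I know the WebAssembly binary format does not require minimal encodings; it only bounds the length at ceil(N/7) bytes, so a padded encoding such as 0x83 0x00 is still a well-formed 3. The sketch below illustrates that behavior with a hypothetical DecodeU32 that is not wabt's ReadU32Leb128.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical minimal decoder for illustration. Accepts padded encodings,
// but rejects anything longer than ceil(32 / 7) = 5 bytes.
// Returns the number of bytes consumed, or 0 on error.
size_t DecodeU32(const uint8_t* p, const uint8_t* end, uint32_t* out_value) {
  uint32_t result = 0;
  for (size_t i = 0; i < 5 && p + i < end; ++i) {
    const uint8_t byte = p[i];
    // The fifth byte may only carry value bits 28..31; a set continuation
    // bit or any unused bit makes the encoding invalid.
    if (i == 4 && (byte & 0xf0) != 0) {
      return 0;
    }
    result |= static_cast<uint32_t>(byte & 0x7f) << (7 * i);
    if ((byte & 0x80) == 0) {
      *out_value = result;
      return i + 1;
    }
  }
  return 0;  // truncated input
}
```

Under that reading, a minimality check in the readers would reject input that other implementations accept.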

@@ -545,6 +546,17 @@ Result ResolveFuncTypes(Module* module, Errors* errors) {
}

if (func) {
if (!func->local_type_list.empty()) {
Contributor Author

This is kind of a bugfix. Named references have never been resolved for local types. Maybe we could do it on the compressed list, but that can be done in another patch.

@zherczeg zherczeg force-pushed the function_ref branch 4 times, most recently from 6587cf7 to 0f7353f Compare March 16, 2025 04:16
@@ -0,0 +1,19 @@
;;; TOOL: run-interp-spec
Contributor Author

It turned out that call_ref.wast is already in the repository, and it is passing now. Other tests in the directory (e.g. binary.wast) are also passing now; they could be updated later as well.

@zherczeg zherczeg marked this pull request as ready for review March 16, 2025 04:37
@zherczeg
Contributor Author

I think the patch is ready. There is a lot of follow-up work ahead.

@@ -189,13 +189,32 @@ void ScriptValidator::PrintError(const Location* loc, const char* format, ...) {
errors_->emplace_back(ErrorLevel::Error, *loc, buffer);
}

static Result CheckType(Type actual, Type expected) {
Contributor Author

If I understand correctly, this is only used for comparing expected results. This means the "module" context is not present, and expected values are immediate values. So maybe an even more restricted checker could work.

@zherczeg zherczeg force-pushed the function_ref branch 3 times, most recently from 58411a5 to bf0400f Compare March 18, 2025 05:06
@zherczeg
Contributor Author

Who should I cc to get review for this patch? Thank you!

@@ -259,8 +284,8 @@ struct FuncSignature {
// So to use this type we need to translate its name into
// a proper index from the module type section.
// This is the mapping from parameter/result index to its name.
- std::unordered_map<uint32_t, std::string> param_type_names;
- std::unordered_map<uint32_t, std::string> result_type_names;
+ std::unordered_map<uint32_t, Var> param_type_names;
Contributor Author

If the name is not found, we need to throw an error. A string has no location. I think a Var is nicer than creating a name / location pair.
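The argument for keeping a location next to the name can be sketched as follows. The types SourceLoc and NamedTypeRef and the function ResolveTypeName are hypothetical stand-ins, not wabt's Location, Var, or resolver; the error message format is also just an assumption for illustration.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration; not wabt's Location or Var.
struct SourceLoc {
  std::string filename;
  int line;
};

struct NamedTypeRef {
  std::string name;  // e.g. "$ii"
  SourceLoc loc;     // where the name was written in the source
};

// Looks up a named type. On failure, fills *error with a message that points
// at the place the name was used, which a bare std::string could not provide.
bool ResolveTypeName(
    const NamedTypeRef& ref,
    const std::unordered_map<std::string, uint32_t>& type_bindings,
    uint32_t* out_index,
    std::string* error) {
  auto it = type_bindings.find(ref.name);
  if (it == type_bindings.end()) {
    *error = ref.loc.filename + ":" + std::to_string(ref.loc.line) +
             ": undefined type \"" + ref.name + "\"";
    return false;
  }
  *out_index = it->second;
  return true;
}
```

Storing one object per parameter that already bundles name and location avoids keeping two parallel maps in sync.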

@zherczeg
Contributor Author

This patch and #2567 cover the majority of the function references proposal. There are still missing pieces, such as compile-time checking that all non-null references are set before use, and typed tables.

Is there anybody here who can review these patches, or could someone explain how reviewing works here?

@zherczeg
Contributor Author

I got no reply to my previous comment. This patch perfectly aligns with the aims of the project and improves the current implementation, yet there is still no feedback. I suspect there is something I don't know. Another project was mentioned in #2551. Is this project in some kind of maintenance mode, with improvements no longer accepted? Negative news is not a problem as long as it is clearly communicated.

@SoniEx2
Collaborator

SoniEx2 commented Mar 24, 2025

expect slow responses, we don't know about other maintainers but we've been a bit busy... will review when we have time

@zherczeg zherczeg force-pushed the function_ref branch 3 times, most recently from 2c100eb to f4dec03 Compare March 25, 2025 13:53
@@ -282,7 +284,7 @@ Result SharedValidator::OnElemSegmentElemType(const Location& loc,
if (elem.is_active) {
// Check that the type of the elem segment matches the table in which
// it is active.
- result |= CheckType(loc, elem.table_type, elem_type, "elem segment");
+ result |= CheckType(loc, elem_type, elem.table_type, "elem segment");
Contributor Author

I just found this bug. The actual/expected arguments are swapped. The tests are updated accordingly.
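Why the argument order matters can be shown with a deliberately simplified model. With plain equality the order would not change the result, but once a non-null reference may be used where a nullable one is expected, the check is no longer symmetric. RefType and Matches below are hypothetical and ignore the rest of the real subtyping rules.

```cpp
#include <cstdint>

// Simplified model for illustration only: a reference type is a type index
// plus a nullability flag. Real subtyping rules cover more cases.
struct RefType {
  uint32_t type_index;
  bool nullable;
};

// True when a value of type `actual` may be used where `expected` is
// required: same index, and a nullable value never fits a non-null slot.
bool Matches(RefType actual, RefType expected) {
  return actual.type_index == expected.type_index &&
         (expected.nullable || !actual.nullable);
}
```

In this model, an element segment of type (ref 0) fits a table of type (ref null 0), so Matches(elem, table) holds, while the swapped call Matches(table, elem) does not.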

Member

Is this a pre-existing bug? If so, maybe it can be split out into a separate PR?

@zherczeg zherczeg force-pushed the function_ref branch 2 times, most recently from 1ce2902 to 0f93fb0 Compare March 26, 2025 03:34
Parse (ref index) form, not just (ref $name) form
Resolve references in types and locals
Display location when a named reference is not found
@zherczeg zherczeg force-pushed the function_ref branch 2 times, most recently from 7da12b4 to dae2b75 Compare April 4, 2025 09:52
@zherczeg
Contributor Author

zherczeg commented Apr 4, 2025

The patch is rebased on top of #2571. Thanks to the fixes, it is now < 900 lines of new code, including test changes.

@zherczeg zherczeg force-pushed the function_ref branch 2 times, most recently from ced415d to 3e533cb Compare April 4, 2025 11:04
Support named references for globals, locals, tables, elems
Support named references for call_ref, ref_null