[FEATURE] Harness Sonic's speed and PreTouch for deep copying objects with Copy method #729

Open
scr-oath opened this issue Jan 11, 2025 · 5 comments

Comments

@scr-oath

Some ad hoc testing shows that using something like this for deep copy beats many of the libraries that are actually built for this purpose (by a factor of at least 20 for github.com/mohae/deepcopy, and close to 100 for others). I know benchmarking is tricky, but I do believe that sonic is quite performant, and I wonder whether using it for copying could be faster still if the interim JSON representation could be skipped.

I would love to see a copy method in some form; func Copy(src, dst any) error seems like a reasonable pattern.

Alternatively, is there any support for transforming to/from ast.Node from/to interfaces? (It looked like getting an ast.Node was only possible from JSON []byte via sonic.Get, and getting data back out of it only via node.MarshalJSON().)

func BenchmarkSonic(b *testing.B) {
	for i := 0; i < b.N; i++ {
		data, err := sonic.Marshal(auctionResponse)
		require.NoError(b, err)

		var to *hookstage.AuctionResponsePayload
		require.NoError(b, sonic.Unmarshal(data, &to))

		require.NotSame(b, auctionResponse.BidResponse, to.BidResponse)
	}
}
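
The benchmark above asserts only that the top-level pointers differ. A stronger check is that mutating the copy leaves the source untouched. The sketch below shows that check under some assumptions: encoding/json is swapped in for sonic.Marshal/sonic.Unmarshal so the example has no external dependencies, and Bid, Response, and deepCopyJSON are hypothetical names, not types from prebid-server or sonic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Bid and Response are hypothetical stand-ins for the real payload types.
type Bid struct {
	Price float64
}

type Response struct {
	Bids []*Bid
}

// deepCopyJSON round-trips src through JSON into a freshly allocated value.
// encoding/json is used here as a stand-in for sonic.Marshal/sonic.Unmarshal.
func deepCopyJSON[T any](src T) (T, error) {
	var dst T
	data, err := json.Marshal(src)
	if err != nil {
		return dst, fmt.Errorf("marshal src: %w", err)
	}
	if err := json.Unmarshal(data, &dst); err != nil {
		return dst, fmt.Errorf("unmarshal dst: %w", err)
	}
	return dst, nil
}

func main() {
	src := &Response{Bids: []*Bid{{Price: 1.5}}}

	dst, err := deepCopyJSON(src)
	if err != nil {
		panic(err)
	}

	// Mutate a nested pointer in the copy; the source must not change.
	dst.Bids[0].Price = 9.9
	fmt.Println(src.Bids[0].Price, dst.Bids[0].Price) // prints "1.5 9.9"
}
```

Any serialization-based copy only carries over what the encoder can see: unexported fields and fields tagged `json:"-"` are silently dropped.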
@scr-oath
Author

It may be that ast's intention to be "self-contained" means that it doesn't benefit from all of the Pretouch and asm optimizations… in any case, some mechanism of copying without serialization would be really helpful, and most likely the fastest mechanism available.

@AsterDY
Collaborator

AsterDY commented Jan 13, 2025

What's the purpose of deep-copying? ast.Node uses pass-by-reference semantics to stay efficient.

@scr-oath
Author

scr-oath commented Feb 4, 2025

Ok, so here's a mini-dive into the use-case… The context is in the open source project https://github.com/prebid/prebid-server - it is a server that handles advertising auctions using the OpenRTB json schema.

Use case: duplicating work to many workers in parallel for advertising bidding using prebid

A client (browser, mobile, video player) sends information about the page and all "impressions" for that page, which may include formats, sizes, and other information that helps advertisers both decide on a price to "bid" and provide "advertising creatives" that fit in that slot.

Because this server's job is to fan out to the various advertisers (or intermediary SSPs and DSPs, which themselves forward requests), a slice of impressions (large, deep structures) is, today, shallow copied into slices destined for each bidder. When this is done in massively parallel fashion, and when any hooks are configured to mutate impressions, contention issues are possible (there is no locking at the moment).
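
To make the shallow-copy problem concrete, here is a minimal sketch. Imp is a hypothetical, heavily simplified stand-in for an OpenRTB impression, and shallowCopy is an illustrative helper rather than prebid-server code. Copying the slice duplicates the struct values, but reference-typed fields such as maps still point at the same underlying data.

```go
package main

import "fmt"

// Imp is a hypothetical, much-simplified stand-in for an OpenRTB impression.
type Imp struct {
	ID  string
	Ext map[string]string // reference type: shared between shallow copies
}

// shallowCopy copies the slice elements, but each element's Ext map
// still refers to the same underlying map as the original.
func shallowCopy(imps []Imp) []Imp {
	out := make([]Imp, len(imps))
	copy(out, imps)
	return out
}

func main() {
	original := []Imp{{ID: "imp-1", Ext: map[string]string{"bidder": "none"}}}

	forBidderA := shallowCopy(original)
	forBidderB := shallowCopy(original)

	// A hook mutates bidder A's copy.
	forBidderA[0].Ext["bidder"] = "A" // shared map: visible to bidder B
	forBidderA[0].ID = "imp-1-A"      // value field: private to this copy

	fmt.Println(forBidderB[0].Ext["bidder"]) // prints "A"
	fmt.Println(forBidderB[0].ID)            // prints "imp-1"
}
```

In this single-goroutine sketch the shared map only produces a surprising value. With one goroutine per bidder, the same unsynchronized map write is a data race.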

Observation: sonic.Marshal/Unmarshal is faster than many/any of the alternatives dedicated to "deepcopy"

Several libraries from the Awesome Go list were tried, and sonic.Marshal/Unmarshal seems to beat them all. I suspect this is because it has, or can have, the Pretouch "JIT" compilation, so that reflection doesn't need to be performed on subsequent invocations.

Benchmarks

I have some results I can share, though it would take time to share the actual benchmark code:

Name                 Library
BenchmarkCopierDeep  http://github.com/jinzhu/copier
BenchmarkDeepcopy    http://github.com/barkimedes/go-deepcopy
BenchmarkMohaeDC     http://github.com/mohae/deepcopy
BenchmarkSonic       http://github.com/bytedance/sonic
BenchmarkCopierDeep
BenchmarkCopierDeep-48            	      55	  20612264 ns/op
BenchmarkDeepcopy
BenchmarkDeepcopy-48              	     100	  10627142 ns/op
BenchmarkMohaeDC
BenchmarkMohaeDC-48               	     159	   7561499 ns/op
BenchmarkSonic
BenchmarkSonic-48                 	    3634	    316686 ns/op
BenchmarkSonicCopy

@scr-oath
Author

scr-oath commented Feb 4, 2025

So it seems that sonic outperforms the others by a factor of roughly 24x to 65x in these benchmarks.

Here's the code we use for the copy. It's pretty fast already, but I'm wondering whether it could be faster still if sonic's innards could go straight from any to any: it wouldn't have to generate the []byte data, and could directly harness whatever it does with Pretouch to write the values into place.

// SonicCopy copies src to dst using sonic.Marshal and sonic.Unmarshal.
// NOTE: numbers assigned to any interface may change their type - e.g. int(1) -> float64(1)
func SonicCopy[D ~*S, S any](dst D, src S) error {
	data, err := sonic.Marshal(src)
	if err != nil {
		return errors.WrapPrefix(err, "failed to marshal src", 0)
	}
	if err = sonic.Unmarshal(data, dst); err != nil {
		return errors.WrapPrefix(err, "failed to unmarshal dst", 0)
	}
	return nil
}

// SonicCopyWithAPI copies src to dst using sonic.Marshal and sonic.Unmarshal using the given api
// NOTE: numbers assigned to any interface may change their type - e.g. int(1) -> float64(1)
func SonicCopyWithAPI[D ~*S, S any](dst D, src S, api sonic.API) error {
	data, err := api.Marshal(src)
	if err != nil {
		return errors.WrapPrefix(err, "failed to marshal src", 0)
	}
	if err = api.Unmarshal(data, dst); err != nil {
		return errors.WrapPrefix(err, "failed to unmarshal dst", 0)
	}
	return nil
}
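
The NOTE in the comments above is easy to reproduce. The sketch below uses encoding/json as a stand-in for sonic (an assumption made so the example has no external dependencies); roundTrip is a hypothetical helper. With encoding/json's default settings, a number decoded into an any comes back as float64 regardless of its original Go type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip serializes src to JSON and decodes it into a fresh any value.
// encoding/json stands in for sonic here.
func roundTrip(src any) (any, error) {
	data, err := json.Marshal(src)
	if err != nil {
		return nil, err
	}
	var dst any
	if err := json.Unmarshal(data, &dst); err != nil {
		return nil, err
	}
	return dst, nil
}

func main() {
	src := map[string]any{"count": int(1)}

	dst, err := roundTrip(src)
	if err != nil {
		panic(err)
	}

	// JSON has a single number type, so the int is decoded as float64.
	copied := dst.(map[string]any)
	fmt.Printf("%T -> %T\n", src["count"], copied["count"]) // prints "int -> float64"
}
```

Fields with concrete numeric types (for example a struct field declared as int) are not affected; the type change only applies to values held in any.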

@xiaost
Collaborator

xiaost commented Feb 5, 2025

@scr-oath Thanks for your feedback.

Yes, as you said, the best solution would be a deep copy that doesn't generate []byte.

But that raises another issue: we would have to maintain the optimised deep-copy code.

Generating []byte is necessary (as of now) because, by default, sonic keeps references into the generated bytes for values like string or []byte instead of creating new objects. This makes sonic fast, so the generated []byte still matters even after Unmarshal.
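
The trade-off between referencing a buffer and copying out of it can be illustrated with plain byte slices. This is only a sketch of the general aliasing idea, not sonic's internals; viewField and copyField are hypothetical helpers, and the offsets are hard-coded for the sample input.

```go
package main

import "fmt"

// viewField returns a sub-slice that aliases buf: no allocation, no copy,
// but the result is only valid while buf stays alive and unmodified.
func viewField(buf []byte, start, end int) []byte {
	return buf[start:end]
}

// copyField returns an independent copy of the same bytes.
func copyField(buf []byte, start, end int) []byte {
	return append([]byte(nil), buf[start:end]...)
}

func main() {
	buf := []byte(`{"id":"imp-1"}`)

	// Bytes 7..11 hold the value imp-1 in this sample input.
	view := viewField(buf, 7, 12)
	owned := copyField(buf, 7, 12)

	// Reusing or overwriting the buffer changes the view, not the copy.
	buf[7] = 'X'
	fmt.Println(string(view), string(owned)) // prints "Xmp-1 imp-1"
}
```

The view is cheaper to create, which is the speed benefit described above, but it ties the lifetime of the decoded value to the buffer it came from.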
