Commit 7c40544

prattmic authored and gopherbot committed
internal/runtime/maps: use match to skip non-full slots in iteration
Iteration over swissmaps with low load (think map with large hint but only one entry) is significantly regressed vs old maps. See noswiss vs swiss-tip below (+60%).

Currently we visit every single slot and individually check if the slot is full or not.

We can do much better by using the control word to find all full slots in a group in a single operation. This lets us skip completely empty groups, for instance.

Always using the control match approach is great for maps with low load, but is a regression for mostly full maps. Mostly full maps have the majority of slots full, so most calls to mapiternext will return the next slot. In that case, doing the full group match on every call is more expensive than checking the individual slot.

Thus we take a hybrid approach: on each call, we first check an individual slot. If that slot is full, we're done. If that slot is non-full, then we fall back to doing full group matches.

This trade-off works well. Mostly empty maps perform nearly as well as with matching on every call, and mostly full maps perform nearly as well as with individual slot checks on every call.

The fast path is placed above the slow path loop rather than combined (with some sort of `useMatch` variable) into a single loop to help the compiler's code generation. The compiler really struggles with code generation on a combined loop for some reason, yielding ~15% additional instructions/op.
Comparison with old maps prior to this CL:

                                                  │   noswiss    │              swiss-tip              │
                                                  │    sec/op    │    sec/op      vs base              │
MapIter/Key=int64/Elem=int64/len=6-12               11.53n ± 2%     10.64n ± 2%    -7.72% (p=0.002 n=6)
MapIter/Key=int64/Elem=int64/len=64-12             10.180n ± 2%     9.670n ± 5%    -5.01% (p=0.004 n=6)
MapIter/Key=int64/Elem=int64/len=65536-12           10.78n ± 1%     10.15n ± 2%    -5.84% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=6-12        6.116n ± 2%     6.840n ± 2%   +11.84% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=64-12       2.403n ± 2%     3.892n ± 0%   +61.95% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=65536-12    1.940n ± 3%     3.237n ± 1%   +66.81% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=6-12                66.20n ± 2%     60.14n ± 3%    -9.15% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=64-12               97.24n ± 1%    171.35n ± 1%   +76.21% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=65536-12           826.1n ± 12%    842.5n ± 10%         ~ (p=0.937 n=6)
geomean                                             17.93n          20.96n        +16.88%

After this CL:

                                                  │   noswiss    │              swiss-cl               │
                                                  │    sec/op    │    sec/op      vs base              │
MapIter/Key=int64/Elem=int64/len=6-12               11.53n ± 2%     10.90n ± 3%    -5.42% (p=0.002 n=6)
MapIter/Key=int64/Elem=int64/len=64-12             10.180n ± 2%     9.719n ± 9%    -4.53% (p=0.043 n=6)
MapIter/Key=int64/Elem=int64/len=65536-12           10.78n ± 1%     10.07n ± 2%    -6.63% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=6-12        6.116n ± 2%     7.022n ± 1%   +14.82% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=64-12       2.403n ± 2%     1.475n ± 1%   -38.63% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=65536-12    1.940n ± 3%     1.210n ± 6%   -37.67% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=6-12                66.20n ± 2%     61.54n ± 2%    -7.02% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=64-12               97.24n ± 1%    110.10n ± 1%   +13.23% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=65536-12           826.1n ± 12%     504.7n ± 6%   -38.91% (p=0.002 n=6)
geomean                                             17.93n          15.29n        -14.74%

For #54766.
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10
Change-Id: Ic07f9df763239e85be57873103df5007144fdaef
Reviewed-on: https://go-review.googlesource.com/c/go/+/627156
Auto-Submit: Michael Pratt <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
1 parent 6e9c56e commit 7c40544
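
The hybrid approach described in the commit message can be sketched in isolation. The following is a simplified model, not the runtime's code: the `iter` type, its fields, and the `collect` helper are hypothetical, there is no directory, table growth, or `entryOffset`, and the group match is recomputed on every slow-path step rather than carried through the loop. It only illustrates the "check one slot first, then fall back to a whole-group match" idea.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	groupSlots = 8                  // slots per group, as in the diff
	msb        = 0x8080808080808080 // top bit of every control byte
	emptyBit   = 0x80               // set for empty and deleted slots
)

// iter is a hypothetical iterator over control words only.
// Byte i of each word is the control byte of slot i.
type iter struct {
	groups []uint64
	idx    uint64 // next slot to examine, counted across all groups
}

// next returns the global index of the next full slot, if any.
func (it *iter) next() (uint64, bool) {
	total := uint64(len(it.groups)) * groupSlots
	if it.idx >= total {
		return 0, false
	}

	// Fast path: look at a single control byte.
	g, s := it.idx/groupSlots, it.idx%groupSlots
	if byte(it.groups[g]>>(8*s))&emptyBit == 0 {
		it.idx++
		return it.idx - 1, true
	}
	it.idx++

	// Slow path: match a whole group and jump to its first full slot.
	for it.idx < total {
		g, s = it.idx/groupSlots, it.idx%groupSlots
		match := ^it.groups[g] & msb
		match &^= (uint64(1) << (8 * s)) - 1 // ignore slots below s
		if match == 0 {
			it.idx += groupSlots - s // skip the rest of this group
			continue
		}
		found := g*groupSlots + uint64(bits.TrailingZeros64(match))>>3
		it.idx = found + 1
		return found, true
	}
	return 0, false
}

// collect drains an iterator over the given control words.
func collect(groups []uint64) []uint64 {
	it := &iter{groups: groups}
	var out []uint64
	for {
		i, ok := it.next()
		if !ok {
			return out
		}
		out = append(out, i)
	}
}

func main() {
	groups := []uint64{
		0x8080808080808080, // all empty
		0x8080808080808080, // all empty
		0x8080808005808080, // only slot 3 full
		0x0102030405060708, // all full
	}
	fmt.Println(collect(groups)) // [19 24 25 26 27 28 29 30 31]
}
```

In this model the two empty groups cost one match each, while the fully populated last group is walked entirely on the fast path.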

3 files changed: +268 -73 lines changed

src/internal/runtime/maps/group.go

+20 -1
@@ -44,11 +44,19 @@ func (b bitset) first() uintptr {
 	return uintptr(sys.TrailingZeros64(uint64(b))) >> 3
 }

-// removeFirst removes the first set bit (that is, resets the least significant set bit to 0).
+// removeFirst removes the first set bit (that is, resets the least significant
+// set bit to 0).
 func (b bitset) removeFirst() bitset {
 	return b & (b - 1)
 }

+// removeBelow removes all set bits below slot i (non-inclusive).
+func (b bitset) removeBelow(i uintptr) bitset {
+	// Clear all bits below slot i's byte.
+	mask := (uint64(1) << (8*uint64(i))) - 1
+	return b &^ bitset(mask)
+}
+
 // Each slot in the hash table has a control byte which can have one of three
 // states: empty, deleted, and full. They have the following bit patterns:
 //
@@ -124,6 +132,17 @@ func (g ctrlGroup) matchEmptyOrDeleted() bitset {
 	return bitset(v & bitsetMSB)
 }

+// matchFull returns the set of slots in the group that are full.
+func (g ctrlGroup) matchFull() bitset {
+	// An empty slot is 1000 0000
+	// A deleted slot is 1111 1110
+	// A full slot is 0??? ????
+	//
+	// A slot is full iff bit 7 is unset.
+	v := uint64(g)
+	return bitset(^v & bitsetMSB)
+}
+
 // groupReference is a wrapper type representing a single slot group stored at
 // data.
 //
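
To see how `matchFull`, `removeBelow`, `first`, and `removeFirst` combine, here is a minimal standalone sketch. It copies the bit tricks from the diff but uses `math/bits` in place of the runtime-internal `sys` package, takes the control word as a plain `uint64`, and adds a hypothetical `fullSlots` helper and sample control word that are not part of the commit.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset holds one byte per slot; only the top bit of each byte is ever set.
type bitset uint64

const bitsetMSB = 0x8080808080808080

// first returns the index of the lowest slot present in the set.
func (b bitset) first() uintptr {
	return uintptr(bits.TrailingZeros64(uint64(b))) >> 3
}

// removeFirst clears the least significant set bit.
func (b bitset) removeFirst() bitset { return b & (b - 1) }

// removeBelow clears all slots below i; slot i itself is kept.
func (b bitset) removeBelow(i uintptr) bitset {
	mask := (uint64(1) << (8 * uint64(i))) - 1
	return b &^ bitset(mask)
}

// matchFull marks every slot whose control byte has bit 7 unset.
func matchFull(ctrls uint64) bitset {
	return bitset(^ctrls & bitsetMSB)
}

// fullSlots (hypothetical helper) lists the full slots at or after start.
func fullSlots(ctrls uint64, start uintptr) []uintptr {
	var out []uintptr
	for m := matchFull(ctrls).removeBelow(start); m != 0; m = m.removeFirst() {
		out = append(out, m.first())
	}
	return out
}

func main() {
	// Byte i of the word is slot i (slot 0 is the least significant byte).
	// Slots 1, 4 and 6 are full, slot 2 is deleted (0xFE), the rest are
	// empty (0x80).
	ctrls := uint64(0x80_12_80_34_80_FE_56_80)

	fmt.Println(fullSlots(ctrls, 0)) // [1 4 6]
	fmt.Println(fullSlots(ctrls, 2)) // [4 6]
	fmt.Println(fullSlots(ctrls, 7)) // []
}
```

A single AND and NOT replaces up to eight per-slot checks, which is why an all-empty group can be skipped in one step.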

src/internal/runtime/maps/table.go

+216 -72
@@ -584,6 +584,83 @@ func (it *Iter) Elem() unsafe.Pointer {
 	return it.elem
 }

+func (it *Iter) nextDirIdx() {
+	// Skip other entries in the directory that refer to the same
+	// logical table. There are two cases of this:
+	//
+	// Consider this directory:
+	//
+	// - 0: *t1
+	// - 1: *t1
+	// - 2: *t2a
+	// - 3: *t2b
+	//
+	// At some point, the directory grew to accomodate a split of
+	// t2. t1 did not split, so entries 0 and 1 both point to t1.
+	// t2 did split, so the two halves were installed in entries 2
+	// and 3.
+	//
+	// If dirIdx is 0 and it.tab is t1, then we should skip past
+	// entry 1 to avoid repeating t1.
+	//
+	// If dirIdx is 2 and it.tab is t2 (pre-split), then we should
+	// skip past entry 3 because our pre-split t2 already covers
+	// all keys from t2a and t2b (except for new insertions, which
+	// iteration need not return).
+	//
+	// We can achieve both of these by using to difference between
+	// the directory and table depth to compute how many entries
+	// the table covers.
+	entries := 1 << (it.m.globalDepth - it.tab.localDepth)
+	it.dirIdx += entries
+	it.tab = nil
+	it.group = groupReference{}
+	it.entryIdx = 0
+}
+
+// Return the appropriate key/elem for key at slotIdx index within it.group, if
+// any.
+func (it *Iter) grownKeyElem(key unsafe.Pointer, slotIdx uintptr) (unsafe.Pointer, unsafe.Pointer, bool) {
+	newKey, newElem, ok := it.m.getWithKey(it.typ, key)
+	if !ok {
+		// Key has likely been deleted, and
+		// should be skipped.
+		//
+		// One exception is keys that don't
+		// compare equal to themselves (e.g.,
+		// NaN). These keys cannot be looked
+		// up, so getWithKey will fail even if
+		// the key exists.
+		//
+		// However, we are in luck because such
+		// keys cannot be updated and they
+		// cannot be deleted except with clear.
+		// Thus if no clear has occurred, the
+		// key/elem must still exist exactly as
+		// in the old groups, so we can return
+		// them from there.
+		//
+		// TODO(prattmic): Consider checking
+		// clearSeq early. If a clear occurred,
+		// Next could always return
+		// immediately, as iteration doesn't
+		// need to return anything added after
+		// clear.
+		if it.clearSeq == it.m.clearSeq && !it.typ.Key.Equal(key, key) {
+			elem := it.group.elem(it.typ, slotIdx)
+			if it.typ.IndirectElem() {
+				elem = *((*unsafe.Pointer)(elem))
+			}
+			return key, elem, true
+		}
+
+		// This entry doesn't exist anymore.
+		return nil, nil, false
+	}
+
+	return newKey, newElem, true
+}
+
 // Next proceeds to the next element in iteration, which can be accessed via
 // the Key and Elem methods.
 //
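
The directory skip computed in `nextDirIdx` can be illustrated with a small model. This is a sketch under assumptions: the `table` struct, `coveredEntries`, and `visitOrder` are hypothetical stand-ins, and only the `1 << (globalDepth - localDepth)` arithmetic is taken from the diff.

```go
package main

import "fmt"

// table is a hypothetical stand-in for the runtime's table type; only the
// depth needed for the skip computation is modelled.
type table struct {
	name       string
	localDepth uint8
}

// coveredEntries returns how many consecutive directory entries a table with
// the given local depth occupies in a directory of the given global depth.
func coveredEntries(globalDepth, localDepth uint8) int {
	return 1 << (globalDepth - localDepth)
}

// visitOrder walks the directory, visiting each logical table exactly once.
func visitOrder(dir []*table, globalDepth uint8) []string {
	var names []string
	for i := 0; i < len(dir); {
		t := dir[i]
		names = append(names, t.name)
		i += coveredEntries(globalDepth, t.localDepth)
	}
	return names
}

func main() {
	// The directory from the comment: t1 did not split, t2 did.
	t1 := &table{"t1", 1}
	t2a := &table{"t2a", 2}
	t2b := &table{"t2b", 2}
	dir := []*table{t1, t1, t2a, t2b}

	fmt.Println(visitOrder(dir, 2)) // [t1 t2a t2b]

	// An iterator still holding the pre-split t2 (local depth 1) at
	// dirIdx 2 skips both halves at once.
	fmt.Println(coveredEntries(2, 1)) // 2
}
```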
@@ -698,8 +775,8 @@ func (it *Iter) Next() {
 	}

 	// Continue iteration until we find a full slot.
-	for it.dirIdx < it.m.dirLen {
-		// Find next table.
+	for ; it.dirIdx < it.m.dirLen; it.nextDirIdx() {
+		// Resolve the table.
 		if it.tab == nil {
 			dirIdx := int((uint64(it.dirIdx) + it.dirOffset) & uint64(it.m.dirLen-1))
 			newTab := it.m.directoryAt(uintptr(dirIdx))
@@ -725,7 +802,90 @@ func (it *Iter) Next() {
 		// N.B. Use it.tab, not newTab. It is important to use the old
 		// table for key selection if the table has grown. See comment
 		// on grown below.
-		for ; it.entryIdx <= it.tab.groups.entryMask; it.entryIdx++ {
+
+		if it.entryIdx > it.tab.groups.entryMask {
+			// Continue to next table.
+			continue
+		}
+
+		// Fast path: skip matching and directly check if entryIdx is a
+		// full slot.
+		//
+		// In the slow path below, we perform an 8-slot match check to
+		// look for full slots within the group.
+		//
+		// However, with a max load factor of 7/8, each slot in a
+		// mostly full map has a high probability of being full. Thus
+		// it is cheaper to check a single slot than do a full control
+		// match.
+
+		entryIdx := (it.entryIdx + it.entryOffset) & it.tab.groups.entryMask
+		slotIdx := uintptr(entryIdx & (abi.SwissMapGroupSlots - 1))
+		if slotIdx == 0 || it.group.data == nil {
+			// Only compute the group (a) when we switch
+			// groups (slotIdx rolls over) and (b) on the
+			// first iteration in this table (slotIdx may
+			// not be zero due to entryOffset).
+			groupIdx := entryIdx >> abi.SwissMapGroupSlotsBits
+			it.group = it.tab.groups.group(it.typ, groupIdx)
+		}
+
+		if (it.group.ctrls().get(slotIdx) & ctrlEmpty) == 0 {
+			// Slot full.
+
+			key := it.group.key(it.typ, slotIdx)
+			if it.typ.IndirectKey() {
+				key = *((*unsafe.Pointer)(key))
+			}
+
+			grown := it.tab.index == -1
+			var elem unsafe.Pointer
+			if grown {
+				newKey, newElem, ok := it.grownKeyElem(key, slotIdx)
+				if !ok {
+					// This entry doesn't exist
+					// anymore. Continue to the
+					// next one.
+					goto next
+				} else {
+					key = newKey
+					elem = newElem
+				}
+			} else {
+				elem = it.group.elem(it.typ, slotIdx)
+				if it.typ.IndirectElem() {
+					elem = *((*unsafe.Pointer)(elem))
+				}
+			}
+
+			it.entryIdx++
+			it.key = key
+			it.elem = elem
+			return
+		}
+
+	next:
+		it.entryIdx++
+
+		// Slow path: use a match on the control word to jump ahead to
+		// the next full slot.
+		//
+		// This is highly effective for maps with particularly low load
+		// (e.g., map allocated with large hint but few insertions).
+		//
+		// For maps with medium load (e.g., 3-4 empty slots per group)
+		// it also tends to work pretty well. Since slots within a
+		// group are filled in order, then if there have been no
+		// deletions, a match will allow skipping past all empty slots
+		// at once.
+		//
+		// Note: it is tempting to cache the group match result in the
+		// iterator to use across Next calls. However because entries
+		// may be deleted between calls later calls would still need to
+		// double-check the control value.
+
+		var groupMatch bitset
+		for it.entryIdx <= it.tab.groups.entryMask {
 			entryIdx := (it.entryIdx + it.entryOffset) & it.tab.groups.entryMask
 			slotIdx := uintptr(entryIdx & (abi.SwissMapGroupSlots - 1))

@@ -738,13 +898,32 @@ func (it *Iter) Next() {
 				it.group = it.tab.groups.group(it.typ, groupIdx)
 			}

-			// TODO(prattmic): Skip over groups that are composed of only empty
-			// or deleted slots using matchEmptyOrDeleted() and counting the
-			// number of bits set.
+			if groupMatch == 0 {
+				groupMatch = it.group.ctrls().matchFull()

-			if (it.group.ctrls().get(slotIdx) & ctrlEmpty) == ctrlEmpty {
-				// Empty or deleted.
-				continue
+				if slotIdx != 0 {
+					// Starting in the middle of the group.
+					// Ignore earlier groups.
+					groupMatch = groupMatch.removeBelow(slotIdx)
+				}
+
+				// Skip over groups that are composed of only empty or
+				// deleted slots.
+				if groupMatch == 0 {
+					// Jump past remaining slots in this
+					// group.
+					it.entryIdx += abi.SwissMapGroupSlots - uint64(slotIdx)
+					continue
+				}
+
+				i := groupMatch.first()
+				it.entryIdx += uint64(i - slotIdx)
+				if it.entryIdx > it.tab.groups.entryMask {
+					// Past the end of this table's iteration.
+					continue
+				}
+				entryIdx += uint64(i - slotIdx)
+				slotIdx = i
 			}

 			key := it.group.key(it.typ, slotIdx)
@@ -766,40 +945,23 @@ func (it *Iter) Next() {
 			grown := it.tab.index == -1
 			var elem unsafe.Pointer
 			if grown {
-				var ok bool
-				newKey, newElem, ok := it.m.getWithKey(it.typ, key)
+				newKey, newElem, ok := it.grownKeyElem(key, slotIdx)
 				if !ok {
-					// Key has likely been deleted, and
-					// should be skipped.
-					//
-					// One exception is keys that don't
-					// compare equal to themselves (e.g.,
-					// NaN). These keys cannot be looked
-					// up, so getWithKey will fail even if
-					// the key exists.
-					//
-					// However, we are in luck because such
-					// keys cannot be updated and they
-					// cannot be deleted except with clear.
-					// Thus if no clear has occurted, the
-					// key/elem must still exist exactly as
-					// in the old groups, so we can return
-					// them from there.
-					//
-					// TODO(prattmic): Consider checking
-					// clearSeq early. If a clear occurred,
-					// Next could always return
-					// immediately, as iteration doesn't
-					// need to return anything added after
-					// clear.
-					if it.clearSeq == it.m.clearSeq && !it.typ.Key.Equal(key, key) {
-						elem = it.group.elem(it.typ, slotIdx)
-						if it.typ.IndirectElem() {
-							elem = *((*unsafe.Pointer)(elem))
-						}
-					} else {
+					// This entry doesn't exist anymore.
+					// Continue to the next one.
+					groupMatch = groupMatch.removeFirst()
+					if groupMatch == 0 {
+						// No more entries in this
+						// group. Continue to next
+						// group.
+						it.entryIdx += abi.SwissMapGroupSlots - uint64(slotIdx)
 						continue
 					}
+
+					// Next full slot.
+					i := groupMatch.first()
+					it.entryIdx += uint64(i - slotIdx)
+					continue
 				} else {
 					key = newKey
 					elem = newElem
@@ -811,43 +973,25 @@ func (it *Iter) Next() {
 				}
 			}

-			it.entryIdx++
+			// Jump ahead to the next full slot or next group.
+			groupMatch = groupMatch.removeFirst()
+			if groupMatch == 0 {
+				// No more entries in
+				// this group. Continue
+				// to next group.
+				it.entryIdx += abi.SwissMapGroupSlots - uint64(slotIdx)
+			} else {
+				// Next full slot.
+				i := groupMatch.first()
+				it.entryIdx += uint64(i - slotIdx)
+			}
+
 			it.key = key
 			it.elem = elem
 			return
 		}

-		// Skip other entries in the directory that refer to the same
-		// logical table. There are two cases of this:
-		//
-		// Consider this directory:
-		//
-		// - 0: *t1
-		// - 1: *t1
-		// - 2: *t2a
-		// - 3: *t2b
-		//
-		// At some point, the directory grew to accomodate a split of
-		// t2. t1 did not split, so entries 0 and 1 both point to t1.
-		// t2 did split, so the two halves were installed in entries 2
-		// and 3.
-		//
-		// If dirIdx is 0 and it.tab is t1, then we should skip past
-		// entry 1 to avoid repeating t1.
-		//
-		// If dirIdx is 2 and it.tab is t2 (pre-split), then we should
-		// skip past entry 3 because our pre-split t2 already covers
-		// all keys from t2a and t2b (except for new insertions, which
-		// iteration need not return).
-		//
-		// We can achieve both of these by using to difference between
-		// the directory and table depth to compute how many entries
-		// the table covers.
-		entries := 1 << (it.m.globalDepth - it.tab.localDepth)
-		it.dirIdx += entries
-		it.tab = nil
-		it.group = groupReference{}
-		it.entryIdx = 0
+		// Continue to next table.
 	}

 	it.key = nil
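
The `grownKeyElem` comment relies on the fact that keys which do not compare equal to themselves (e.g., NaN) cannot be looked up, updated, or deleted individually. That behaviour is visible from ordinary Go code. The sketch below uses only the public language and `math` package; the `nanDemo` helper is hypothetical, and `clear` requires Go 1.21 or later.

```go
package main

import (
	"fmt"
	"math"
)

// nanDemo (hypothetical helper) exercises a map with NaN keys and reports
// what each operation observed.
func nanDemo() (afterInsert int, found bool, afterDelete, iterated, afterClear int) {
	m := map[float64]string{}
	nan := math.NaN()

	m[nan] = "a"
	m[nan] = "b" // NaN != NaN, so this adds a second entry rather than updating
	afterInsert = len(m)

	_, found = m[nan] // lookup can never match a NaN key

	delete(m, nan) // no-op for the same reason
	afterDelete = len(m)

	for range m { // iteration still returns both entries
		iterated++
	}

	clear(m) // the only way to remove NaN-keyed entries
	afterClear = len(m)
	return
}

func main() {
	fmt.Println(nanDemo()) // 2 false 2 2 0
}
```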
