tfra.dynamic_embedding.CuckooHashTable

View source on GitHub




Class CuckooHashTable

A generic mutable hash table implementation.

Data can be inserted by calling the insert method and removed by calling the remove method. It does not support initialization via the init method.

Example usage:

table = tfra.dynamic_embedding.CuckooHashTable(key_dtype=tf.string,
                                               value_dtype=tf.int64,
                                               default_value=-1,
                                               device=['/GPU:0'])
sess.run(table.insert(keys, values))
out = table.lookup(query_keys)
print(out.eval())

__init__

View source

__init__(
    key_dtype,
    value_dtype,
    default_value,
    name='CuckooHashTable',
    checkpoint=True,
    init_size=0,
    config=None
)

Creates an empty CuckooHashTable object.

Creates a table whose key and value types are specified by key_dtype and value_dtype, respectively.

Args:

  • key_dtype: the type of the key tensors.
  • value_dtype: the type of the value tensors.
  • default_value: The value to use if a key is missing in the table.
  • name: A name for the operation (optional).
  • checkpoint: if True, the contents of the table are saved to and restored from checkpoints. If shared_name is empty for a checkpointed table, it is shared using the table node name.
  • init_size: initial size for the Variable. The initial size of each hash table is int(init_size / N), where N is the number of devices.

Returns:

A CuckooHashTable object.

Raises:

  • ValueError: If checkpoint is True and no name was specified.
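The init_size rule above, int(init_size / N), can be checked with a small pure-Python sketch. The helper name per_device_init_size is hypothetical and is not part of tfra; it only reproduces the arithmetic stated in the argument description.

```python
def per_device_init_size(init_size: int, num_devices: int) -> int:
    """Initial size of each per-device table, following int(init_size / N).

    Hypothetical helper for illustration only; not part of the tfra API.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    # Same expression as in the documentation: truncation toward zero.
    return int(init_size / num_devices)


# With init_size=1000 spread over 3 devices, each table starts at 333.
print(per_device_init_size(1000, 3))
```

Because the division truncates, the per-device sizes may sum to slightly less than init_size when init_size is not a multiple of N.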

Important update!!

We have updated the underlying implementation of CuckooHashTable. The original CPU table remains unchanged, but the GPU table now uses the HKV implementation instead of nvhash. To keep the interface consistent, HKV's init_capacity and max_capacity are both set to the init_size value you pass in. As a result, the GPU hash table does not resize automatically, and its final capacity equals init_size. HKV's max_hbm_for_values parameter is set to a sufficiently large number to ensure that all your data is stored in the GPU table. In addition, HKV requires a GPU with compute capability 8.0 or above. For more details, refer to the HKV documentation.
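Since the GPU table no longer grows, it may help to sanity-check a planned configuration before creating the table. The sketch below is pure Python and only restates the note above; the names hkv_capacity_settings, gpu_table_supported, and fits_in_table are hypothetical helpers, not tfra or HKV APIs.

```python
from typing import Dict, Tuple

# Minimum GPU compute capability stated in the note above.
MIN_COMPUTE_CAPABILITY: Tuple[int, int] = (8, 0)


def hkv_capacity_settings(init_size: int) -> Dict[str, int]:
    """Capacities as described above: both are set to init_size."""
    return {"init_capacity": init_size, "max_capacity": init_size}


def gpu_table_supported(compute_capability: Tuple[int, int]) -> bool:
    """True if a (major, minor) capability meets the 8.0 requirement."""
    return compute_capability >= MIN_COMPUTE_CAPABILITY


def fits_in_table(expected_keys: int, init_size: int) -> bool:
    """Planning check: the table will not resize beyond init_size."""
    return expected_keys <= hkv_capacity_settings(init_size)["max_capacity"]


print(hkv_capacity_settings(1 << 20))
print(gpu_table_supported((8, 6)), gpu_table_supported((7, 5)))
print(fits_in_table(2_000_000, 1 << 20))
```

In practice this suggests choosing init_size from the expected number of keys rather than leaving it at the default when the table is placed on a GPU.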

Properties

key_dtype

The table key dtype.

name

The name of the table.

resource_handle

Returns the resource handle associated with this Resource.

value_dtype

The table value dtype.

Methods

__getitem__

__getitem__(keys)

Looks up keys in a table, outputs the corresponding values.

accum

View source

accum(
    keys,
    values_or_deltas,
    exists,
    name=None
)

Associates keys with values.

Args:

  • keys: Keys to accumulate. Can be a tensor of any shape. Must match the table's key type.
  • values_or_deltas: Values or deltas to be associated with keys. Must be a tensor of the same shape as keys and match the table's value type.
  • exists: A bool tensor indicating whether each key already exists. Must be a tensor of the same shape as keys.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys or values do not match the table data types.

clear

View source

clear(name=None)

Clears all keys and values in the table.

Args:

  • name: A name for the operation (optional).

Returns:

The created Operation.

export

View source

export(name=None)

Returns tensors of all keys and values in the table.

Args:

  • name: A name for the operation (optional).

Returns:

A pair of tensors, with the first tensor containing all keys and the second tensor containing all values in the table.

insert

View source

insert(
    keys,
    values,
    name=None
)

Associates keys with values.

Args:

  • keys: Keys to insert. Can be a tensor of any shape. Must match the table's key type.
  • values: Values to be associated with keys. Must be a tensor of the same shape as keys and match the table's value type.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys or values do not match the table data types.

lookup

View source

lookup(
    keys,
    dynamic_default_values=None,
    return_exists=False,
    name=None
)

Looks up keys in a table, outputs the corresponding values.

The default_value is used for keys not present in the table.

Args:

  • keys: Keys to look up. Can be a tensor of any shape. Must match the table's key_dtype.
  • dynamic_default_values: The values to use if a key is missing in the table. If None (the default), the static default_value (self._default_value) is used.
  • return_exists: if True, also returns an additional Tensor that indicates whether each key exists in the table.
  • name: A name for the operation (optional).

Returns:

A tensor containing the values in the same shape as keys using the table's value type.

  • exists: A bool Tensor of the same shape as keys that indicates whether each key exists in the table. Only returned if return_exists is True.

Raises:

  • TypeError: when keys do not match the table data types.
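The lookup behavior described above can be modeled with a plain Python dict. This is a minimal sketch of the documented semantics, not the library implementation: lookup_model is a hypothetical name, and it assumes that dynamic_default_values supplies one fallback per key, aligned with keys.

```python
from typing import Dict, Hashable, List, Optional, Sequence


def lookup_model(
    table: Dict[Hashable, object],
    keys: Sequence[Hashable],
    default_value: object,
    dynamic_default_values: Optional[Sequence[object]] = None,
    return_exists: bool = False,
):
    """Dict-based model of lookup(); hypothetical, for illustration only."""
    values: List[object] = []
    exists: List[bool] = []
    for i, key in enumerate(keys):
        found = key in table
        exists.append(found)
        if found:
            values.append(table[key])
        elif dynamic_default_values is not None:
            # Assumption: one dynamic default per key, aligned by position.
            values.append(dynamic_default_values[i])
        else:
            # Fall back to the table's static default_value.
            values.append(default_value)
    return (values, exists) if return_exists else values


table = {"a": 1, "b": 2}
print(lookup_model(table, ["a", "z"], default_value=-1))
print(lookup_model(table, ["a", "z"], default_value=-1, return_exists=True))
```

Note that the output always has one entry per requested key, so missing keys are reported through the default value (and the exists flags), never by shortening the result.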

remove

View source

remove(
    keys,
    name=None
)

Removes keys and their associated values from the table.

If a key is not present in the table, it is silently ignored.

Args:

  • keys: Keys to remove. Can be a tensor of any shape. Must match the table's key type.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys do not match the table data types.
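The "silently ignored" rule for missing keys can be illustrated with a dict-based model. As before, this is a sketch of the documented behavior under the assumption that a dict is an adequate stand-in for the table; remove_model is a hypothetical name, not a tfra function.

```python
from typing import Dict, Hashable, Iterable


def remove_model(table: Dict[Hashable, object], keys: Iterable[Hashable]) -> None:
    """Dict-based model of remove(); hypothetical, for illustration only."""
    for key in keys:
        # pop with a default never raises, so absent keys are ignored.
        table.pop(key, None)


table = {"a": 1, "b": 2, "c": 3}
remove_model(table, ["a", "missing"])
print(table)       # the entry for "a" is gone; "missing" caused no error
print(len(table))  # plays the role of size() in this model
```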

size

View source

size(name=None)

Computes the number of elements in this table.

Args:

  • name: A name for the operation (optional).

Returns:

A scalar tensor containing the number of elements in this table.