tfra.dynamic_embedding.Variable

View source on GitHub




Class Variable

A distributed version of HashTable (modeled on lookup_ops.MutableHashTable).

It is designed to dynamically store the sparse weights (parameters) of DLRMs.

__init__

View source

__init__(
    key_dtype=dtypes.int64,
    value_dtype=dtypes.float32,
    dim=1,
    devices=None,
    partitioner=default_partition_fn,
    shared_name=None,
    name='DynamicEmbedding_Variable',
    initializer=None,
    trainable=True,
    checkpoint=True,
    init_size=0,
    kv_creator=None,
    restrict_policy=None,
    bp_v2=False
)

Creates an empty Variable object.

Creates a group of tables placed on the devices specified by devices; TensorFlow's own device placement mechanism is ignored. The types of the keys and values are specified by key_dtype and value_dtype, respectively. The environment variable 'TF_HASHTABLE_INIT_SIZE' can be used to set the initial size of each table, which can help reduce the number of rehashes. The default initial table size is 8,192.

Args:

  • key_dtype: the type of the key tensors.
  • value_dtype: the type of the value tensors.
  • dim: the length of the value array for each key. On GPUs, dim should be less than or equal to 200.
  • devices: the list of devices holding the tables. One table is created on each device. By default, devices is ['/CPU:0'], or ['/GPU:0'] when a GPU is available.
  • partitioner: the partition function for keys; it returns the partition index for each key.

Example partition function:

def default_partition_fn(keys, shard_num):
  return tf.cast(keys % shard_num, dtype=tf.int32)
  • shared_name: Not used.
  • name: A name for the operation (optional).
  • initializer: The value to use if a key is missing in the hash table. It can be a Python number, a NumPy array, or a tf.initializer instance. If initializer is None (the default), 0 is used.
  • trainable: Bool. If True, the variable is treated as trainable. Default is True.
  • checkpoint: If True, the contents of the SparseVariable are saved to and restored from checkpoints. If shared_name is empty for a checkpointed table, it is shared using the table node name.
  • init_size: the initial size of the Variable. The initial size of each hash table is int(init_size / N), where N is the number of devices.
  • restrict_policy: a policy specifying the rule used to restrict the size of the variable. If the variable is updated by an optimizer during training, the sparse slot variables in the optimizer are restricted as well.
  • bp_v2: By default (bp_v2=False), the optimizer updates dynamic embedding values by setting (key, value) after optimizer.apply_gradients. If one key is used by multiple workers at the same time, only one of the updates survives and the others are overwritten. With bp_v2=True, the optimizer updates parameters by adding a delta instead of setting the value, which resolves the race condition among workers during backpropagation in large-scale distributed asynchronous training.
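To make the interaction between devices, partitioner, and init_size concrete, here is a minimal plain-Python sketch. It applies the same modulo rule as default_partition_fn shown above, but without TensorFlow; the helper names partition_keys and per_table_init_size are hypothetical and not part of the library.

```python
def partition_keys(keys, shard_num):
    """Group keys by shard index using the modulo rule (key % shard_num)."""
    shards = [[] for _ in range(shard_num)]
    for key in keys:
        shards[key % shard_num].append(key)
    return shards


def per_table_init_size(init_size, num_devices):
    """Initial size of each table, following int(init_size / N)."""
    return int(init_size / num_devices)


# Illustrative device list: one table would be created per entry.
devices = ["/GPU:0", "/GPU:1", "/GPU:2"]
shards = partition_keys([10, 11, 12, 13, 14, 15], len(devices))
for device, shard in zip(devices, shards):
    print(device, shard)
print(per_table_init_size(10000, len(devices)))
```

Each key lands on exactly one table, so the choice of partitioner determines how evenly the keys are spread across devices.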

Returns:

A Variable object.

Properties

restrict_policy

tables

trainable_store

Methods

accum

View source

accum(
    keys,
    old_values,
    new_values,
    exists,
    name=None
)

Inserts keys with values if they do not exist; otherwise accumulates the delta new_values - old_values onto the existing values of keys. This API helps relieve the stale gradient problem in asynchronous training.

Args:

  • keys: Keys to insert. Can be a tensor of any shape. Must match the table's key type.
  • old_values: old values associated with keys. Must be a tensor of arrays with the same shape as keys and must match the table's value type.
  • new_values: new values to be associated with keys. Must be a tensor of arrays with the same shape as keys and must match the table's value type.
  • exists: A bool tensor indicating whether each key already exists. Must be a tensor of the same shape as keys.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys or values do not match the table data types.
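The difference between setting a value and accumulating a delta can be illustrated with a toy model in which a plain dict stands in for the table. This is only a sketch of the semantics described above, not the library's implementation; set_update and accum_update are hypothetical names, and scalar values are used for brevity.

```python
def set_update(table, key, new_value):
    """Overwrite the stored value (the set-style update)."""
    table[key] = new_value


def accum_update(table, key, old_value, new_value, exists):
    """Insert when the key is new, otherwise add the delta new_value - old_value."""
    if exists:
        table[key] += new_value - old_value
    else:
        table[key] = new_value


# Two workers both read 1.0 for key 7, then each writes back its own result.
stale = 1.0
worker_results = [0.5, 0.75]

table_set = {7: 1.0}
for new in worker_results:
    set_update(table_set, 7, new)

table_accum = {7: 1.0}
for new in worker_results:
    accum_update(table_accum, 7, stale, new, exists=True)

# A key that is not yet present is simply inserted.
accum_update(table_accum, 8, 0.0, 0.25, exists=False)

print(table_set)    # only the last write survives
print(table_accum)  # both deltas are applied to key 7
```

With set-style updates the first worker's contribution is lost, whereas accumulating deltas keeps both.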

clear

View source

clear(name=None)

Clears all keys and values in the table.

Args:

  • name: A name for the operation (optional).

Returns:

The created Operation.

export

View source

export(name=None)

Returns tensors of all keys and values in the table.

Args:

  • name: A name for the operation (optional).

Returns:

A pair of tensors with the first tensor containing all keys and the second tensor containing all values in the table.

get_slot_variables

View source

get_slot_variables(optimizer)

Gets slot variables from optimizer. If the Variable is trained by optimizer, returns the variables in the optimizer's slots; otherwise returns an empty list.

Args:

  • optimizer: An optimizer under tf.keras.optimizers or tf.compat.v1.train.

Returns:

List of slot Variables in optimizer.

get_trainable_by_name

View source

get_trainable_by_name(name)

Get the trainable shadow variable when using eager execution.

Example:

import tensorflow as tf
from tensorflow_recommenders_addons import dynamic_embedding as de

init = tf.keras.initializers.RandomNormal()
params = de.get_variable('foo', dim=4, initializer=init)
optimizer = tf.keras.optimizers.Adam(1E-3)
optimizer = de.DynamicEmbeddingOptimizer(optimizer)

# Example batch of ids (illustrative values; not part of the original example).
ids = tf.constant([[1, 2], [3, 4]], dtype=tf.int64)

@tf.function
def loss_fn(ids):
  # The name given here is the one later passed to get_trainable_by_name.
  emb = de.embedding_lookup(params, ids, name='user_embedding')
  emb = tf.math.reduce_sum(emb, axis=1)
  loss = tf.reduce_mean(emb)
  return loss

for i in range(10):
  optimizer.minimize(lambda: loss_fn(ids),
                     var_list=[params.get_trainable_by_name('user_embedding')])

Args:

  • name: str. Name used to get the trainable shadow to the Variable.

Returns:

A ShadowVariable object corresponding to the given name.

Raises:

  • RuntimeError: if not in eager mode.

lookup

View source

lookup(
    keys,
    return_exists=False,
    name=None
)

Looks up keys in a Variable and outputs the corresponding values.

The value given by initializer is used for keys not present in the table.

Args:

  • keys: Keys to look up. Can be a tensor of any shape. Must match the table's key_dtype.
  • return_exists: if True, an additional Tensor is returned indicating whether each key exists in the table.
  • name: A name for the operation (optional).

Returns:

A tensor containing the values in the same shape as keys using the table's value type.

  • exists: A bool Tensor of the same shape as keys indicating whether each key exists in the table. Only provided if return_exists is True.
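The shape contract and the return_exists flag can be sketched with nested Python lists standing in for tensors and a dict standing in for the table. This is a toy model of the described behavior, not the library code; the default parameter is a stand-in for whatever value the table supplies for absent keys.

```python
def lookup(table, keys, default, return_exists=False):
    """Look up keys given as an int or an arbitrarily nested list of ints."""

    def walk(node, leaf_fn):
        # Recurse through nested lists so the output mirrors the shape of keys.
        if isinstance(node, list):
            return [walk(child, leaf_fn) for child in node]
        return leaf_fn(node)

    values = walk(keys, lambda k: table.get(k, default))
    if return_exists:
        exists = walk(keys, lambda k: k in table)
        return values, exists
    return values


table = {1: [0.1, 0.2], 2: [0.3, 0.4]}  # every value has length dim = 2
keys = [[1, 5], [2, 1]]                 # key 5 is not in the table
values, exists = lookup(table, keys, default=[0.0, 0.0], return_exists=True)
print(values)
print(exists)
```

In this model each key is replaced by its value array, so the result follows the layout of keys with one extra trailing axis of length dim.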

remove

View source

remove(
    keys,
    name=None
)

Removes keys and their associated values from the variable.

If a key is not present in the table, it is silently ignored.

Args:

  • keys: Keys to remove. Can be a tensor of any shape. Must match the table's key type.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys do not match the table data types.
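The "silently ignored" behavior for absent keys can be modeled in a few lines, with a dict standing in for the table. This is a sketch of the semantics only, not the library's implementation.

```python
def remove(table, keys):
    """Delete each key if present; absent keys are skipped without error."""
    for key in keys:
        table.pop(key, None)


table = {1: [0.1], 2: [0.2], 3: [0.3]}
remove(table, [2, 99])  # 99 is not in the table, so it is ignored
print(sorted(table))
```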

restrict

View source

restrict(
    num_reserved,
    **kwargs
)

Restricts the size of the variable, including the features that reside in commensal slots and the policy status. The restriction rule follows the setting in restrict_policy.

Args:

  • num_reserved: int. Number of features remaining after restriction.
  • **kwargs: keyword arguments passed to restrict_policy.apply_restriction.

Returns:

An operation that restricts the size of the variable. Returns None if the restrict policy is not set.
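As a rough illustration of what a restriction does, the sketch below trims a dict-based table down to num_reserved keys. The eviction rule used here (keep the most recently updated keys) is an assumption chosen for the example; the actual rule is whatever restrict_policy defines. The last_updated mapping is a hypothetical stand-in for the policy status.

```python
def restrict(table, num_reserved, last_updated=None):
    """Trim table to at most num_reserved keys and return the evicted keys.

    last_updated maps key -> timestamp and plays the role of the policy
    status. Keeping the newest keys is an assumed rule for illustration.
    """
    if last_updated is None:
        return None  # mirrors "None if the restrict policy is not set"
    if len(table) <= num_reserved:
        return []
    # Oldest keys first; everything beyond the newest num_reserved is evicted.
    by_age = sorted(table, key=lambda k: last_updated[k])
    evicted = by_age[: len(table) - num_reserved]
    for key in evicted:
        del table[key]
        del last_updated[key]  # the policy status is trimmed as well
    return evicted


table = {1: [0.1], 2: [0.2], 3: [0.3], 4: [0.4]}
last_updated = {1: 100, 2: 400, 3: 200, 4: 300}
evicted = restrict(table, num_reserved=2, last_updated=last_updated)
print(evicted, sorted(table))
```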

size

View source

size(
    index=None,
    name=None
)

Compute the number of elements in the index-th table of this Variable.

If index is None, the total size of the Variable is returned.

Args:

  • index: The index of the table (optional).
  • name: A name for the operation (optional).

Returns:

A scalar tensor containing the number of elements in this Variable.
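The relationship between per-table size and total size can be modeled with a list of dicts, one per device. This is a toy sketch of the described behavior, not the library code.

```python
def size(tables, index=None):
    """Number of keys in tables[index], or across all tables when index is None."""
    if index is None:
        return sum(len(t) for t in tables)
    return len(tables[index])


# Two tables, as if keys had been split across two devices by key % 2.
tables = [{0: 1.0, 2: 1.0}, {1: 1.0}]
print(size(tables, index=0), size(tables, index=1), size(tables))
```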

upsert

View source

upsert(
    keys,
    values,
    name=None
)

Inserts or updates keys with values.

If a key already exists, its value is updated.

Args:

  • keys: Keys to insert. Can be a tensor of any shape. Must match the table's key type.
  • values: Values to be associated with keys. Must be a tensor of arrays with the same shape as keys and must match the table's value type.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys or values do not match the table data types.
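The insert-or-overwrite semantics, together with the kind of type checking described under Raises, can be sketched with a dict-based toy model. The checks below are illustrative assumptions (integer keys, value arrays of length dim), not the library's actual validation.

```python
def upsert(table, keys, values, dim):
    """Insert each (key, value) pair, overwriting the value of existing keys."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same leading length")
    for key, value in zip(keys, values):
        if not isinstance(key, int):
            raise TypeError(f"key {key!r} does not match the integer key type")
        if len(value) != dim:
            raise ValueError(f"value for key {key} must have length {dim}")
        table[key] = [float(v) for v in value]


table = {}
upsert(table, [1, 2], [[0.1, 0.2], [0.3, 0.4]], dim=2)
upsert(table, [2, 3], [[9.0, 9.0], [0.5, 0.6]], dim=2)  # key 2 is overwritten
print(table)
```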