Roadmap for RaggedTensor? #20073
Hi @swamidass,

Thank you for the report. We currently do not have a clear roadmap for full ragged support. Part of the issue is that this is specific to TensorFlow. Note that we do have partial support. For instance:

```python
import tensorflow as tf
import keras

x1 = tf.ragged.constant([[1, 2], [3], [4, 5, 6]])
x2 = tf.ragged.constant([[7], [8, 9, 10], [11, 12]])

model1 = keras.Sequential([keras.layers.Concatenate(axis=0)])
print(model1([x1, x2]))
print(model1.predict([x1, x2]))

input1 = keras.Input(shape=(None,))
input2 = keras.Input(shape=(None,))
output = keras.layers.Concatenate(axis=0)([input1, input2])
model2 = keras.Model(inputs=[input1, input2], outputs=output)
print(model2([x1, x2]))
print(model2.predict([x1, x2]))
```

Given this, I'm curious to hear what roadblocks you're hitting. Thanks!
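For reference, the same ragged concatenation can also be performed directly with TensorFlow ops, outside of any Keras model (a minimal sketch, assuming TensorFlow is installed):

```python
import tensorflow as tf

# Ragged tensors with rows of different lengths.
x1 = tf.ragged.constant([[1, 2], [3], [4, 5, 6]])
x2 = tf.ragged.constant([[7], [8, 9, 10], [11, 12]])

# Concatenating along axis 0 stacks the rows of x2 after the rows of x1;
# the result is still a RaggedTensor.
out = tf.concat([x1, x2], axis=0)
print(out.to_list())  # [[1, 2], [3], [4, 5, 6], [7], [8, 9, 10], [11, 12]]
```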
@hertschuh we can't. E.g. I have a custom layer called `ToDense` which makes...
You don't need to know whether it is ragged or sparse or dense at functional model creation time. You can simply do:

```python
class ToDense(keras.Layer):
    def call(self, x):
        if isinstance(x, tf.RaggedTensor):
            return x.to_tensor()
        else:
            # This densifies sparse tensors.
            return keras.ops.convert_to_tensor(x, sparse=False)
```

This will work as a workaround for now. Eventually, we want to be able to replace this with:

```python
return keras.ops.convert_to_tensor(x, sparse=False, ragged=False)
```
@hertschuh, you are wrong. I need to know. Here is an example: https://colab.research.google.com/drive/1oplwYOYZEiytHlClcwt3ucWeJ5M-uA46#scrollTo=Q_5CCb17RgmI As you can see, in the functional model the inputs are...
Just in general:
With all that, if you rewrite your code this way, it works:

```python
import tensorflow as tf
import keras


class ToDense(keras.layers.Layer):
    """Layer that makes padding and masking of composite tensors effortless.

    The layer takes a RaggedTensor or a SparseTensor and converts it to a
    uniform tensor by right-padding it or filling in missing values.

    Arguments:
        pad_value: A value used to pad and fill in the missing values. Should
            be a meaningless value for the input data.
        mask: A Boolean value representing whether to mask the padded values.
            If True, no downstream Masking layer or Embedding layer with
            mask_zero=True should be added. Defaults to False.

    Input shape:
        Any ragged or sparse tensor is accepted, but it requires the type of
        input to be specified via the Input or InputLayer from the Keras API.

    Output shape:
        The output is a uniform tensor having the same shape, in case of a
        ragged input, or the same dense shape, in case of a sparse input.
    """

    def __init__(self, pad_value, mask=False, **kwargs):
        super().__init__(**kwargs)
        self.pad_value = pad_value
        self.mask = mask

    def call(self, inputs):
        print("call", isinstance(inputs, tf.RaggedTensor))
        print("call", type(inputs))
        if isinstance(inputs, tf.RaggedTensor):
            outputs = inputs.to_tensor(default_value=self.pad_value)
        elif isinstance(inputs, tf.SparseTensor):
            outputs = tf.sparse.to_dense(inputs, default_value=self.pad_value)
        else:
            outputs = inputs
        return outputs

    def compute_mask(self, inputs, mask=None):
        print("mask", isinstance(inputs, tf.RaggedTensor))
        print("mask", type(inputs))
        if not self.mask:
            return None
        if isinstance(inputs, tf.RaggedTensor):
            mask = tf.ones_like(inputs.flat_values, "bool")
            mask = inputs.with_flat_values(mask)
            mask = mask.to_tensor(False)
        elif isinstance(inputs, tf.SparseTensor):
            mask = tf.ones_like(inputs.values, "bool")
            mask = inputs.with_values(mask)
            mask = tf.sparse.to_dense(mask, default_value=False)
        else:
            mask = keras.ops.ones_like(inputs, "bool")
            mask = keras.ops.any(mask, axis=-1)
        return mask

    def compute_output_shape(self, input_shape):
        return input_shape

    def get_config(self):
        config = super().get_config()
        config.update({"pad_value": self.pad_value, "mask": self.mask})
        return config
```
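The two conversion branches in `call` above can be checked in isolation with plain TensorFlow eager ops (a minimal sketch using `pad_value=0`, independent of the layer itself):

```python
import tensorflow as tf

pad_value = 0

# Ragged branch: rows of different lengths are right-padded with pad_value.
ragged = tf.ragged.constant([[1, 2], [3], [4, 5, 6]])
dense_from_ragged = ragged.to_tensor(default_value=pad_value)
print(dense_from_ragged.numpy().tolist())  # [[1, 2, 0], [3, 0, 0], [4, 5, 6]]

# Sparse branch: unspecified positions are filled with pad_value.
sparse = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 2]], values=[7, 8], dense_shape=[2, 3]
)
dense_from_sparse = tf.sparse.to_dense(sparse, default_value=pad_value)
print(dense_from_sparse.numpy().tolist())  # [[7, 0, 0], [0, 0, 8]]
```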
@hertschuh, all you're saying is good in theory. But in practice...

But if I don't want to support other backends right now (usually because keras.ops is missing required ops, e.g. #20046), it should be possible to write a layer for a single (TF) backend. What should we all do if our models written and trained for Keras V2 used RaggedTensors?

But we need to choose the right op...

As I wrote here keras-team/tf-keras#202, the current public API is not perfect. I'm tired of importing some things from the public API and others from private. I just gave up and always import from keras.src in my libraries. All I want to say in this issue called "Roadmap for RaggedTensor": WE NEED RAGGED TENSORS.
Yes, you can. In the code above, you can replace the...
Taking a step back: why are you trying to migrate to Keras V3, if you're not trying to support multiple backends?
There are 3 reasons:
So, the pipeline that works fine in Keras 2:
I don't know how to do this in Keras 3. Is there a way I am missing? There are several new errors that arise in this pipeline in Keras 3:
Keras v3 does not work with ragged inputs. What is the roadmap for including this feature?
Per: #18414