
Add Roberta converter #2124

Open
omkar-334 wants to merge 4 commits into master
Conversation

@omkar-334 commented Mar 4, 2025

A few doubts -

  1. The model outputs from Keras and Hugging Face are not similar at all.
from transformers import RobertaTokenizer, TFRobertaModel
import keras_hub

hf_model = TFRobertaModel.from_pretrained("roberta-base", output_hidden_states=True)
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# Keras model converted from the same checkpoint (loaded here in bfloat16, as discussed below)
model = keras_hub.models.RobertaBackbone.from_preset("hf://FacebookAI/roberta-base", dtype="bfloat16")

text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="tf", padding=True, truncation=True)
hf_output = hf_model(**inputs).last_hidden_state

keras_inputs = {
    "token_ids": inputs["input_ids"].numpy(),  # token IDs
    "padding_mask": inputs["attention_mask"].numpy(),  # padding mask
}
keras_output = model(keras_inputs)

(Output comparison screenshot)

  2. Hugging Face's RoBERTa uses 514 position embeddings (512 positions + 2 extra tokens), whereas Keras only expects 512 (see the sketch after this list).

  3. Tokenizer comparison (screenshot)
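For reference on the 514 vs. 512 mismatch, here is a minimal sketch of how the Hugging Face side derives position ids, based on my reading of transformers' create_position_ids_from_input_ids helper (treat the details as assumptions): non-padding tokens are numbered starting from padding_idx + 1 = 2, so a 512-token sequence needs indices up to 513 and hence a 514-row embedding table, while the Keras backbone, as I understand it, indexes positions from 0.

import tensorflow as tf

def create_position_ids(input_ids, padding_idx=1):
    # Non-pad tokens get positions padding_idx + 1, padding_idx + 2, ...
    # and pad tokens keep position padding_idx (mirrors the HF helper).
    mask = tf.cast(tf.not_equal(input_ids, padding_idx), tf.int32)
    incremental_indices = tf.cumsum(mask, axis=1) * mask
    return incremental_indices + padding_idx

# RoBERTa special ids: <s>=0, pad=1, </s>=2; the other ids are placeholders.
ids = tf.constant([[0, 100, 200, 300, 2, 1, 1]])
print(create_position_ids(ids))  # -> [[2 3 4 5 6 1 1]], i.e. offset by 2 relative to 0-based positions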


google-cla bot commented Mar 4, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up-to-date status, view the checks section at the bottom of the pull request.

@omkar-334 (Author)

Here's the link to the testing colab - https://colab.research.google.com/github/omkar-334/keras-scripts/blob/main/RoBERTa_converter.ipynb

Also,
RoBERTa doesn't have segment embeddings or a pooled output, but the Hugging Face model includes a constant segment (token-type) embedding, of shape (1, 512) I think. It also includes a pooler layer for downstream tasks, which the Keras implementation doesn't have.
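One quick way to confirm which weights exist only on the Hugging Face side is to inspect the checkpoint directly. A rough sketch, using the PyTorch RobertaModel only because its attribute names are easy to print (treat the exact attribute paths and the hf_pt_model name as assumptions, not part of this PR):

from transformers import RobertaModel

hf_pt_model = RobertaModel.from_pretrained("roberta-base")

# Token-type (segment) embedding table: RoBERTa never varies the segment id,
# so this is a single constant row of width hidden_dim.
print(hf_pt_model.embeddings.token_type_embeddings.weight.shape)

# Position embedding table: 514 rows rather than 512 (see the offset discussion above).
print(hf_pt_model.embeddings.position_embeddings.weight.shape)

# Pooler used by downstream classification heads; absent from the Keras backbone.
print(hf_pt_model.pooler)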

@JyotinderSingh (Collaborator)

Hi @omkar-334, thanks for this PR.
Regarding the mismatched logits, it looks like you're loading the Hugging Face model in float32

hf_model = TFRobertaModel.from_pretrained("roberta-base")

while the Keras model is being loaded in bfloat16.

model = keras_hub.models.RobertaBackbone.from_preset("hf://FacebookAI/roberta-base", dtype="bfloat16")

It might be worth loading both in the same precision when verifying the logits.
(screenshot attached)
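For example, loading both checkpoints in float32 before comparing might look like this (a minimal sketch; the "hf://FacebookAI/roberta-base" preset string is the one already used in this PR's notebook):

import keras_hub
from transformers import TFRobertaModel

# Both models in float32, so any remaining difference is not a precision artifact.
hf_model = TFRobertaModel.from_pretrained("roberta-base")  # float32 by default
keras_model = keras_hub.models.RobertaBackbone.from_preset(
    "hf://FacebookAI/roberta-base", dtype="float32"
)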

@JyotinderSingh (Collaborator)

I did try to run your notebook by loading in both sets of weights as float32, but the results still don't seem to match.

Hugging Face output
 tf.Tensor(
[[[-0.06098365  0.1249077  -0.01024082 ... -0.05549879 -0.05278065
   -0.02032274]
  [-0.33764896  0.20138153  0.07472473 ...  0.16684803  0.02431546
   -0.13936469]
  [-0.02943649  0.23096977  0.18173131 ... -0.14693598 -0.05403079
   -0.02496235]
  ...
  [-0.11631897  0.2576879   0.0894694  ... -0.01494528  0.07766235
    0.03402137]
  [-0.07787547  0.2642327   0.44699728 ... -0.7686613   0.02006039
    0.07307038]
  [-0.0507166   0.14344664 -0.03572293 ... -0.10117416 -0.05277743
   -0.05274259]]], shape=(1, 8, 768), dtype=float32)

Keras output
 tf.Tensor(
[[[-7.23304749e-02  1.11076608e-01 -7.59335235e-04 ... -9.13275555e-02
   -4.67573255e-02 -2.74974313e-02]
  [-1.94556322e-02  7.97019601e-02  1.06528938e-01 ... -2.88743407e-01
   -1.66224763e-02  5.16433269e-02]
  [-7.21454322e-02  1.10889256e-01 -5.06145880e-04 ... -9.06197801e-02
   -4.66767214e-02 -2.69957650e-02]
  ...
  [-3.83148864e-02  1.94189698e-01  2.10571475e-03 ...  6.51391894e-02
   -4.42184880e-03  4.94358130e-02]
  [ 2.18277685e-02  1.65410444e-01  3.22254300e-01 ... -5.11629343e-01
    3.71083468e-02  8.14208537e-02]
  [-6.67822510e-02  1.25953302e-01 -1.97500065e-02 ... -1.38220027e-01
   -4.70701084e-02 -5.49893379e-02]]], shape=(1, 8, 768), dtype=float32)
AssertionError: 
Not equal to tolerance rtol=1e-07, atol=1e-05
...
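The tolerance message above is the kind produced by NumPy's allclose-style assertions. A minimal sketch of such a check, assuming hf_output and keras_output come from the snippets earlier in the thread:

import numpy as np
import keras

# Element-wise comparison of the two [batch, seq_len, hidden] activations.
np.testing.assert_allclose(
    hf_output.numpy(),
    keras.ops.convert_to_numpy(keras_output),
    rtol=1e-7,
    atol=1e-5,
)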
