Video Swin model adds to kerashub #1981

Open
wants to merge 24 commits into master

Conversation

kernel-loophole

#1755
@divyashreepathihalli


google-cla bot commented Nov 11, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@kernel-loophole
Author

Hi @divyashreepathihalli! I need some guidance on how to test this locally. Could you help me with the steps?

@kernel-loophole
Author

@divyashreepathihalli Can you run the tests again?

@kernel-loophole changed the title from "Video Swin model added to kerashub" to "Video Swin model adds to kerashub" Nov 13, 2024
Member
@mattdangerw left a comment

Higher-level feedback for a start:

  • Remove all "aliases"; that's a CV pattern we did not continue. All arch configurations are stored in JSON files that we upload to Kaggle/HF. The "alias" is just the preset name you want to use.
  • Follow the task setup we have in KerasHub. I am guessing for this model we want to add a VideoClassifier task that we can model heavily on ImageClassifier (a rough skeleton is sketched after this comment).
  • Remove all instances of keras_cv wherever they are.
  • Remove the presets stuff until presets are actually uploaded.

This will need a much more substantial rewrite to match the abstractions of KerasHub, rather than a more direct copy.
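
For reference, a minimal sketch of what such a VideoClassifier task could look like, assuming it mirrors KerasHub's ImageClassifier pattern; the class name, pooling options, and export path below are illustrative assumptions, not a final API:

import keras

from keras_hub.src.api_export import keras_hub_export
from keras_hub.src.models.task import Task


@keras_hub_export("keras_hub.models.VideoClassifier")
class VideoClassifier(Task):
    """Hypothetical video classification task, mirroring `ImageClassifier`."""

    def __init__(
        self,
        backbone,
        num_classes,
        preprocessor=None,
        pooling="avg",
        activation=None,
        **kwargs,
    ):
        # === Layers ===
        self.backbone = backbone
        self.preprocessor = preprocessor
        if pooling == "avg":
            self.pooler = keras.layers.GlobalAveragePooling3D()
        else:
            self.pooler = keras.layers.GlobalMaxPooling3D()
        self.output_dense = keras.layers.Dense(
            num_classes, activation=activation, name="predictions"
        )

        # === Functional model ===
        inputs = self.backbone.input
        x = self.backbone(inputs)
        x = self.pooler(x)
        outputs = self.output_dense(x)
        super().__init__(inputs=inputs, outputs=outputs, **kwargs)

        # === Config ===
        self.num_classes = num_classes
        self.pooling = pooling
        self.activation = activation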

@kernel-loophole
Author

Thank you for the feedback. I will remove the aliases and replace them with the appropriate configurations stored in JSON files for Kaggle/HF. I'll follow the KerasHub task setup and create the VideoClassifier task based on the ImageClassifier model.

@divyashreepathihalli
Collaborator

@kernel-loophole are you still working on this? Please let us know once you are done addressing the comments.

@kernel-loophole
Author

@divyashreepathihalli Yes, I was a bit busy; I will try to add this over the weekend.

@kernel-loophole
Author

@mattdangerw I updated all the CV patterns; the configurations are now in JSON files. It would be great if you could review and let me know your feedback. If you could provide an example of the task setup, that would also be great.

Collaborator
@divyashreepathihalli left a comment

Hi @kernel-loophole, the PR needs more updates to match the KerasHub model implementation style. Please follow this folder as an example - https://github.com/keras-team/keras-hub/tree/master/keras_hub/src/models/sam

input_tensor (KerasTensor, optional): Output of
`keras.layers.Input()`) to use as video input for the model.
Defaults to `None`.
include_rescaling (bool, optional): Whether to rescale the inputs. If
Collaborator

Type hints are specified in this format:
arg_name: type. Short description.
Please refer to other model implementations, e.g. https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/vit/vit_backbone.py
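
For example, the two arguments quoted above rewritten in that format (a sketch only; the wording is kept from the snippet, and the truncated `include_rescaling` description is left incomplete):

"""Args:
    input_tensor: `KerasTensor`. Output of `keras.layers.Input()` to use as
        video input for the model. Defaults to `None`.
    include_rescaling: bool. Whether to rescale the inputs.
"""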

Author

Should I format all `arg_name` entries according to that?

Collaborator

yes plz

@kernel-loophole
Author

kernel-loophole commented Jan 30, 2025

@divyashreepathihalli Thanks for the review; I will update that shortly.

x = input_spec

# if include_rescaling:
Author

Changing the default value of the scaling can change the model behavior.

@divyashreepathihalli added the kokoro:force-run (Runs Tests on GPU) label Mar 20, 2025
@kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label Mar 20, 2025
@divyashreepathihalli
Collaborator

divyashreepathihalli commented Mar 20, 2025

@kernel-loophole Sorry about the delayed response. Can you please sign the CLA? That would help trigger the CPU tests!

Collaborator
@divyashreepathihalli left a comment

Hi @kernel-loophole, I did an initial pass - I did not look into the layer implementations much. They need to be implemented as layers, not models.
In general, please refer to other model implementations to get an idea of the design patterns and test routines. The implementation-style comments are applicable throughout the code.

return copy.deepcopy(backbone_presets)

@classproperty
def presets_with_weights(cls):
Collaborator

Remove this - it is not followed in the KerasHub style; please refer to other ported models as a reference.

return config


class VideoSwinPatchingAndEmbedding(keras.Model):
Collaborator

This should be implemented as a layer.
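
A rough sketch of the change being asked for - subclassing `keras.layers.Layer` instead of `keras.Model`. The constructor arguments and the `Conv3D` projection below are assumptions based on the usual Video Swin patch-embedding design, not the PR's exact code:

import keras


class VideoSwinPatchingAndEmbedding(keras.layers.Layer):
    """Splits a video into non-overlapping 3D patches and embeds them."""

    def __init__(self, patch_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.patch_size = patch_size
        self.embed_dim = embed_dim

    def build(self, input_shape):
        # A strided Conv3D both partitions the video into patches and
        # projects each patch to `embed_dim` channels.
        self.proj = keras.layers.Conv3D(
            self.embed_dim,
            kernel_size=self.patch_size,
            strides=self.patch_size,
            name="proj",
        )
        self.proj.build(input_shape)
        self.built = True

    def call(self, x):
        return self.proj(x)

    def get_config(self):
        config = super().get_config()
        config.update(
            {"patch_size": self.patch_size, "embed_dim": self.embed_dim}
        )
        return config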

norm_layer : keras.layers. Normalization layer.
Default: LayerNormalization

References:
Collaborator

Add the reference as part of the description.

return config


class VideoSwinBasicLayer(keras.Model):
Collaborator

rename to VideoSwinTransformerLayer

Collaborator

and it should subclass a layer and not a model

drop_rate=0.0,
attn_drop_rate=0.0,
drop_path_rate=0.2,
depths=[2, 2, 6, 2],
Collaborator

Do not provide default values for args that change the model architecture.
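
In other words, architecture-defining arguments should be required, while regularization-style knobs may keep defaults - roughly like this (the signature below is illustrative, not the final one):

from keras_hub.src.models.backbone import Backbone


class VideoSwinBackbone(Backbone):
    def __init__(
        self,
        embed_dim,          # architecture-defining: required
        depths,             # architecture-defining: required
        num_heads,          # architecture-defining: required
        drop_rate=0.0,      # regularization knob: a default is acceptable
        attn_drop_rate=0.0,
        drop_path_rate=0.2,
        **kwargs,
    ):
        ...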



class TestVideoSwinPatchingAndEmbedding(TestCase):
def test_patch_embedding_compute_output_shape(self):
Collaborator

Use standardized test routines for layers - refer to other models and layers.
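
With KerasHub's shared test helpers, the custom shape check above could be reduced to something like the following. This assumes the `TestCase.run_layer_test` helper used by other layer tests; the import path for the layer and the exact parameter names may differ:

import numpy as np

from keras_hub.src.tests.test_case import TestCase

# Hypothetical import path for the layer under test.
from keras_hub.src.models.video_swin.video_swin_layers import (
    VideoSwinPatchingAndEmbedding,
)


class VideoSwinPatchingAndEmbeddingTest(TestCase):
    def test_layer_behaviors(self):
        self.run_layer_test(
            cls=VideoSwinPatchingAndEmbedding,
            init_kwargs={"patch_size": (2, 4, 4), "embed_dim": 96},
            input_data=np.random.uniform(size=(2, 8, 32, 32, 3)),
            expected_output_shape=(2, 4, 8, 8, 96),
        )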

Author
@kernel-loophole left a comment

Updated the changes.

qk_scale : float. Override default qk scale of head_dim ** -0.5 if set.
Default to None.
dropout_rate : float. Float between 0 and 1. Fraction of the input units to drop.
Default: 0.
Author

did not get this

x = input_spec

# if include_rescaling:
# # Use common rescaling strategy across keras_cv
Author

Remove `include_rescaling` - this has been moved into the image converter, as discussed before.
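
For reference, in the current KerasHub pattern rescaling is configured on the image-converter side rather than as a backbone flag. A rough sketch, assuming the base `ImageConverter` accepts `image_size` and `scale` as it does for other models; the `VideoSwinImageConverter` name is hypothetical and handling of 5D video input is not verified here:

from keras_hub.src.layers.preprocessing.image_converter import ImageConverter


class VideoSwinImageConverter(ImageConverter):
    # Hypothetical converter class; the point is that rescaling (e.g. 1/255)
    # lives here via `scale`/`offset`, not as an `include_rescaling` flag on
    # the backbone.
    pass


converter = VideoSwinImageConverter(image_size=(224, 224), scale=1.0 / 255.0)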
