Add support for componentjs #4138

Open

NucleonGodX wants to merge 4 commits into aboutcode-org:develop from NucleonGodX:componentjs

NucleonGodX commented Feb 9, 2025

Fixes #4107

Tasks

Reviewed contribution guidelines
PR is descriptively titled 📑 and links the original issue above 🔗
Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
Run tests locally to check for errors.
Commits are in a uniquely-named feature branch and have no merge conflicts 📁

Signed-off-by: NucleonGodX [email protected]

NucleonGodX changed the title ~~Add support for componentjs #4107~~ Add support for componentjs

NucleonGodX and others added 2 commits

February 9, 2025 23:53


          Add support for componentjs aboutcode-org#4107

efa154e

Signed-off-by: NucleonGodX <[email protected]>


          code cleanup Signed-off-by: NucleonGodX <[email protected]>

14ba0ae

Signed-off-by: NucleonGodX <[email protected]>

NucleonGodX force-pushed the componentjs branch from 0c13beb to 14ba0ae Compare

February 9, 2025 18:23

NucleonGodX added 2 commits

February 10, 2025 21:22


          add header and code cleanup

d75e86d

Signed-off-by: NucleonGodX <[email protected]>


          tests updated

Signed-off-by: NucleonGodX <[email protected]>

AyanSinhaMahapatra requested changes

Member

AyanSinhaMahapatra left a comment

@NucleonGodX Thanks a lot for the PR, this is looking great as a start.
I have suggested some more improvements here for your consideration; please address these.
A couple of general comments:

also implement package assembly
review and check whether we are missing anything from the spec
we probably don't need all the staticmethods
make sure the tests pass, and write tests using expectation files

Apologies for the late review 😅

src/packagedcode/componentjs.py

+                  """
+                  datasource_id = "component_json_metadata"
+                  path_patterns = ("*component.json",)
+                  default_package_type = "library"

Member

AyanSinhaMahapatra Apr 4, 2025

Suggested change

      
-                 default_package_type = "library"
+                 default_package_type = "generic"

src/packagedcode/componentjs.py

+                      return dependencies
+                  @classmethod
+                  def _extract_license_statement(cls, data):

Member

AyanSinhaMahapatra Apr 4, 2025

We don't need to process or normalize this separately. This is common across ecosystems, so it is generally handled when the PackageData is created; see https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/models.py#L782

src/packagedcode/componentjs.py

+                      )
+                      if namespace and name:
+                          package_data['purl'] = PackageURL(

Member

AyanSinhaMahapatra Apr 4, 2025

You do not need to populate the purl field explicitly. It is populated in a more general way, from the other values, when the PackageData object is created; see https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/models.py#L302.

src/packagedcode/componentjs.py

+                          name=name,
+                          namespace=namespace,
+                          version=data.get('version'),
+                          description=data.get('description', ''),

Member

AyanSinhaMahapatra Apr 4, 2025

Suggested change

      
-                       description=data.get('description', ''),
+                       description=data.get('description'),

When we don't have a value, we always want the default to be None (or whatever default the model defines for this attribute).
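As a minimal sketch of the point about defaults, the snippet below uses a plain dict in place of the parsed component.json. `build_fields` is a hypothetical helper written for illustration only; it is not part of scancode-toolkit.

```python
def build_fields(data):
    """Collect optional manifest fields, leaving missing ones as None."""
    return {
        # dict.get() without a fallback returns None for a missing key,
        # so no '' or [] placeholder leaks into the package data.
        "description": data.get("description"),
        "keywords": data.get("keywords"),
    }


# Illustrative manifest with no description and no keywords.
manifest = {"name": "chai"}
fields = build_fields(manifest)
```

With this shape, a missing field stays None and whatever default the model defines can apply, instead of an empty string or empty list chosen by the parser.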

src/packagedcode/componentjs.py

+                          version=data.get('version'),
+                          description=data.get('description', ''),
+                          homepage_url=cls._extract_homepage(data),
+                          keywords=data.get('keywords', []),

Member

AyanSinhaMahapatra Apr 4, 2025

Suggested change

      
-                       keywords=data.get('keywords', []),
+                       keywords=data.get('keywords'),

src/packagedcode/componentjs.py

+                              dependencies.append(
+                                  models.DependentPackage(
+                                      purl=purl,
+                                      scope='runtime',

Member

AyanSinhaMahapatra Apr 4, 2025

Suggested change

      
-                                   scope='runtime',
+                                   scope='dependencies',

Here the scope is specific to the manifest type.
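One way to read this suggestion, sketched with plain dicts rather than the real `models.DependentPackage`: the scope reuses the name of the manifest key the entries came from. `iter_dependencies` and the sample manifest are illustrative assumptions, not code from the PR.

```python
def iter_dependencies(data, scope="dependencies"):
    """Yield one plain dict per entry found under the manifest key `scope`."""
    # The scope recorded on each dependency is the manifest key itself,
    # not a generic label such as 'runtime'.
    for name, requirement in (data.get(scope) or {}).items():
        yield {"name": name, "requirement": requirement, "scope": scope}


# Made-up sample data for illustration.
manifest = {"dependencies": {"chaijs/assertion-error": "1.0.0"}}
deps = list(iter_dependencies(manifest))
```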

src/packagedcode/componentjs.py

+                      with open(location, "r", encoding="utf-8") as f:
+                          data = json.load(f)
+                      name = data.get('name') or data.get('repo', '').split('/')[-1]

Member

AyanSinhaMahapatra Apr 4, 2025

Is name not a required attribute here? Is repo always formatted like "repo": "chaijs/chai"?
From https://github.com/componentjs/spec/blob/master/component.json/specifications.md#name it seems that name is required. The same comment applies to the namespace processing.
Please go through the full spec carefully.
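A minimal sketch of a stricter approach, assuming name is read only from the name field and the namespace is taken from repo only when it has exactly the "owner/project" shape. Both helpers are hypothetical, and whether this matches the spec in every case still needs to be checked against the spec itself.

```python
def split_repo(repo):
    """Return (owner, project) for an 'owner/project' string, else (None, None)."""
    if not isinstance(repo, str):
        return None, None
    parts = repo.split("/")
    # Accept exactly two non-empty segments; anything else is not trusted.
    if len(parts) != 2 or not all(parts):
        return None, None
    return parts[0], parts[1]


def get_namespace_and_name(data):
    """Read name from its own field; take the namespace from a well-formed repo."""
    namespace, _project = split_repo(data.get("repo"))
    return namespace, data.get("name")


result = get_namespace_and_name({"name": "chai", "repo": "chaijs/chai"})
```

The sketch deliberately does not fall back to guessing a name from repo, so a manifest without a name yields None rather than a derived value.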

tests/packagedcode/data/componentjs/chai/component.json

+                    , "testing"
+                    , "chai"
+                  ]
+                , "main": "index.js"

Member

AyanSinhaMahapatra Apr 4, 2025

You need to add an assemble function and assign_packages_to_resources_* functions to process these properly. These are used to process top-level packages and to assign files to these package objects (to resolve which files are part of a package). By default this is handled by the base implementations at https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/models.py#L1137, but whenever the data is specific we need to override them with explicit functions. This populates the for_packages attribute of resources and does a couple of other things.

Then also add tests with directories and files to cover this.

See examples of this in other datafile handlers, such as the simple assembly in https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/packagedcode/conda.py#L42
Please ask questions if you need help with this, as it can be more complex.
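To make the idea of assigning files to a package concrete, here is a toy model that does not use scancode-toolkit at all. `ToyResource`, `assign_to_parent_tree`, and the package identifier are invented for illustration; the real handlers work on codebase resources through the methods defined in models.py.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class ToyResource:
    """Stand-in for a scanned file: a path plus the packages it belongs to."""
    path: str
    for_packages: list = field(default_factory=list)


def assign_to_parent_tree(package_uid, manifest_path, resources):
    """Tag every resource under the manifest's directory with package_uid."""
    root = PurePosixPath(manifest_path).parent
    for resource in resources:
        path = PurePosixPath(resource.path)
        # A resource belongs to the package if it sits below the manifest's directory.
        if path == root or root in path.parents:
            resource.for_packages.append(package_uid)


resources = [
    ToyResource("chai/component.json"),
    ToyResource("chai/index.js"),
    ToyResource("other/readme.txt"),
]
assign_to_parent_tree("pkg:generic/chai", "chai/component.json", resources)
```

After the call, the two files under chai/ carry the package identifier in for_packages and the unrelated file does not, which is the kind of result a directory-based test would assert.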

tests/packagedcode/test_componentjs.py

+                      test_file = self.get_test_loc('componentjs/jszip/component.json')
+                      result_packages = list(componentjs.ComponentJSONMetadataHandler.parse(test_file))
+                      expected_packages = [
+                          models.PackageData(

Member

AyanSinhaMahapatra Apr 4, 2025

Test data like this should be in expectation JSON files; see line 247 of scancode-toolkit/tests/packagedcode/test_npm.py (at 4b57a7f):

self.check_packages_data(packages, expected_loc, regen=REGEN_TEST_FIXTURES)

There, check_packages_data supports this in an easy way, much like full JSON scans. Otherwise we have to add and maintain the expected data manually, which isn't great :P
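The general pattern behind expectation files can be sketched in a few lines. This is not the implementation of check_packages_data; `check_against_expected`, the file name, and the sample data are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def check_against_expected(results, expected_path, regen=False):
    """Compare results with a JSON expectation file, rewriting it when regen is set."""
    expected_path = Path(expected_path)
    if regen:
        # Regeneration overwrites the stored expectation with the current results.
        expected_path.write_text(json.dumps(results, indent=2, sort_keys=True))
    expected = json.loads(expected_path.read_text())
    assert results == expected, f"results differ from {expected_path}"


with tempfile.TemporaryDirectory() as tmp:
    expected_file = Path(tmp) / "component.json.expected"
    parsed = [{"name": "jszip", "type": "generic"}]
    check_against_expected(parsed, expected_file, regen=True)  # first run writes the file
    check_against_expected(parsed, expected_file)  # later runs only compare
```

The expected data then lives in a file that can be regenerated and reviewed as a diff, instead of being hand-written inside each test.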

Member

AyanSinhaMahapatra Apr 4, 2025

This applies to all the tests below.

tests/packagedcode/test_componentjs.py

+                      check_json_scan(expected_file, result_file, regen=REGEN_TEST_FIXTURES)
+                  def test_parse_jszip_component_json(self):
+                      test_file = self.get_test_loc('componentjs/jszip/component.json')

Member

AyanSinhaMahapatra Apr 4, 2025

You don't need to add two tests, one with a full scan using --package and another one here. See the comment below on how to test single manifests. Full scans are useful to test both package manifest parsing and package assembly together.

Labels

None yet