Issue #99 proposal #101

thirtytwobits · 2024-03-30T05:53:13Z

This is a draft PR to review a proposed change to the public APIs for pydsdl.

After playing with a Nunavut integration I realized the previous design was deficient.

This is non-breaking change that adds an optional named tuple to the read_namespace method. This named tuple allows specifying a set of filtering rules to apply when discovering DSDL types from files under a given namespace filetree. The intent is to connect the proposed TypeFilterEngine to the _construct_dsdl_definitions_from_namespace method after it creates a DSDLDefinition to apply the initial filtering of the target type. From there I need to figure out how to mark that selected type and its transitive dependencies as either weakly-excluded or strongly-included and detect any conflict where user provided rules conflict as processing proceeds. Any advice as to where I could best work with a tree-representation ahead of any expensive processing would be appreciated.

The new proposal uses a visitor pattern that is more powerful and will allow Nunavut to fully describe all file dependencies closing OpenCyphal/nunavut#58 and allowing the #99 filter to be implemented in nunavut.

pavel-kirienko · 2024-04-02T18:33:46Z

I want to postpone reviewing this until this discussion is concluded: https://forum.opencyphal.org/t/proposal-for-a-type-manifest-specification/2107/6

One point that is probably invariant to the conclusion is that I would like to bring PyDSDL into the context of processing DSDL manifests, since manifests are closely related to the problem domain of PyDSDL itself. PyDSDL shouldn't be involved in parsing the manifests but it should probably accept some abstract representation of the manifest at the input (optionally, of course, as we don't want it to be a breaking change).

Also, I would welcome a more functional-style design, if there are such options on the table.

thirtytwobits · 2024-04-02T21:16:00Z

This is separable from the manifest format. pydsdl needs to support the concepts of target types instead of target namespaces and transitive closures for those target types across namespaces. For example: if

# TARGET                       |   DEPENDENCY               
root_ns_b/d/FType.1.0 -> root_ns_a/b/GType.1.0
root_ns_b/d/HType.1.0 -> root_ns_b/b/FType.1.0
root_ns_a/b/CType.1.0 -> root_ns_b/d/EType.1.0

The current API requires two calls, one with a target NS of root_ns_b that produces one tree and then another root_ns_a target NS that produces a second tree one has to merge with the previous tree. I guess I could design a new API but the old one cannot be retrofitted to support this in a single call.

As for "functional" can you explain? The dependencies are found by parser events so it seems natural to support dependency discovery events.

pavel-kirienko · 2024-04-03T10:02:59Z

This is separable from the manifest format. pydsdl needs to support the concepts of target types instead of target namespaces and transitive closures for those target types across namespaces.

I agree.

What I was trying to communicate is that exposing a generic visitor-based API works but the utility of this solution is not very high because it requires the user to implement much of the manifest processing manually. It will be implemented in Nunavut and then again in other software involved in the processing of DSDL definitions; I would like to avoid such duplication. Hence, I am trying to say that the concerns related to the selection of data types based on user-provided requirements should be addressed within PyDSDL instead of being pushed up the stack via the abstract visitors.

I guess I could design a new API but the old one cannot be retrofitted to support this in a single call.

Can we explore the idea of adding a new top-level entry point beside read_namespace that accepts an explicit list of types of interest -- closely following the manifest model (note that I am not talking about the manifest format here) -- and the set of root namespace directories to look for them in?

The existing read_namespace may or may not be marked as deprecated.

This is a draft PR to review a proposed change to the public APIs for pydsdl. This is non-breaking change that adds a new public method and datatype ## New Method read_files - a file-oriented entry point to the front end. This takes a list of target DSDL files allowing the user to maintain an explicit list instead of depending on globular filesystem discovery. ## New Datatype DsdlFile – We define and publish a new abstract data type that encapsulates all the information collected about and from a given dsdl file. This allows a backend to associate datatypes with files for the purposes of managing dependencies (e.g. this would allow the creation of .d files by a back end).

thirtytwobits · 2024-04-08T16:56:37Z

I've updated the proposal. Python docs aren't fixed up nor is there testing. This is just the first thwack to see if we're aligned. Please review _namespace.py::read_files (again, I didn't fix the docs sorry). Example usage would be:

target_types = [
    Path("/path/to/root/taxonomy/animalia/chordata/mammalia/artiodactyla/suidae/sus/sus_domesticus.1.0.dsdl"),
    Path("/another_path/to/roots/taxonomy/animalia/chordata/mammalia/artiodactyla/suidae/sus/sus_scrofa.1.0.dsdl"),
    Path("/path/to/root/plantae/tracheophytes/gymnospermae/pinophyta/pinopsida/cupressales/cupressaceae/sequoia/sequoia_sempervirens.1.0.dsdl")
]
target_dsdl, dependency_dsdl = pydsdl.read_files(target_types, ["animalia", "plantae"], path_to_public_types)

emrainey · 2024-04-10T14:07:40Z

I'm a little confused by the example, am I getting the same set of files (or types or ??) in target_dsdl that I put into target_types (which are just file paths, not types)? It essentially discovers all the dependency_dsdl that I have amongst the provided namespaces and roots paths for that particular set of target_dsdl?

thirtytwobits · 2024-04-10T16:34:24Z

I'm a little confused by the example, am I getting the same set of files (or types or ??) in target_dsdl that I put into target_types (which are just file paths, not types)? It essentially discovers all the dependency_dsdl that I have amongst the provided namespaces and roots paths for that particular set of target_dsdl?

Right. you get a tuple where the target_dsdl "files" are abstractions that provide the CompositeType as a getter associated with the metadata about the file used to create that type. This is 1:1 for the target types passed into the method. The second member of the tuple is the same but contains all dependent types discovered using the lookup paths and needed by the transitive closure of the target set.

pavel-kirienko

Nice.

I would like to avoid exposing anything but CompositeType to the client; we can add new properties to that class for the missing context information. By adding an entity that claims to deduce information from the file path we would create an obstacle for the evolution of the DSDL language because eventually certain properties may be moved from the path into the file itself; in addition, it adds a pesky failure mode where the information may be deduced differently by the parser itself and the entity that processes the file path, which is actually recognized in the doc comments you added (where you talk about the possible divergence between the two).

All error types must inherit from either InternalError or InvalidDefinitionError; only the former is of relevance for this changeset.

Why do we need to return the dependent types as a separate entity?

Some of the visitors introduced with the previous iteration appear to be unnecessary and should be removed.

I would prefer to avoid the inclusion of the DSDL submodule in this changeset and focus on this later on. The CI already tests PyDSDL against the standard uavcan namespace here, but this approach admittedly has drawbacks:

pydsdl/.github/workflows/test-and-release.yml

Line 63 in 6a6828b

python -c "import pydsdl; pydsdl.read_namespace('.dsdl-test/uavcan', [])"

Perhaps it would be wiser to move it into a separate nox session because adding a submodule just for the sake of sourcing test data is undesirable. Let us please look into this later.

thirtytwobits · 2024-04-16T16:30:30Z

I would like to avoid exposing anything but CompositeType to the client; we can add new properties to that class for the missing context information.

Please look at the latest proposal. It only exposes CompositeType but adds properties to it.

pavel-kirienko

I like the overall idea but the implementation appears to be too heavy and convoluted; I shared a few ideas how to simplify it (mostly by dialing down on the OOP a little)

pavel-kirienko · 2024-05-06T11:37:21Z

pydsdl/_dsdl.py

+        raise NotImplementedError()
+
+
+class DsdlFileBuildable(DsdlFile):


The current name breaks the naming consistency. It should be BuildableDSDLFile or, in this specific case, I would name it ReadableDSDLFile, because it has read() not build()

pavel-kirienko · 2024-05-06T11:47:39Z

pydsdl/_serializable/_composite.py

        """
        return self._source_file_path

+    @property
+    def source_file_path_to_root(self) -> Path:


Suggested change

def source_file_path_to_root(self) -> Path:

def root_namespace_path(self) -> Path:

pavel-kirienko · 2024-05-06T11:52:06Z

pydsdl/_serializable/_composite.py

+        For synthesized types such as service request/response sections, this property is the path to the service type
+        since request and response types are defined within the service type's dsdl file.


Suggested change

For synthesized types such as service request/response sections, this property is the path to the service type

since request and response types are defined within the service type's dsdl file.

For service request/response types this property is the path to the service type

since request and response types are defined within the service type's DSDL file.

pavel-kirienko · 2024-05-06T12:00:26Z

pydsdl/_data_type_builder.py

@@ -198,6 +198,7 @@ def resolve_versioned_data_type(self, name: str, version: _serializable.Version)
        del name
        found = list(filter(lambda d: d.full_name == full_name and d.version == version, self._lookup_definitions))
        if not found:
+


Suggested change

pavel-kirienko · 2024-05-06T12:10:37Z

pydsdl/_dsdl.py

+"""Invoked when the frontend encounters a print directive or needs to output a generic diagnostic."""
+
+
+class DsdlFile(ABC):


In this particular project "DSDL" is spelled as DSDL in definitions, not Dsdl, so we should follow that for consistency.

pavel-kirienko · 2024-05-06T15:24:42Z

pydsdl/_dsdl.py

+def normalize_paths_argument(
+    namespaces_or_namespace: Union[None, Path, str, Iterable[Union[Path, str]]],
+    type_cast: Callable[[Iterable], PathListT],
+) -> PathListT:


I am looking at the usages and it seems like this can be simplified such that type_cast is removed (always perform something roughly similar to {Path(p) for p in namespaces_or_namespace}) and the return type can be a concrete list[Path] or set[Path]. Or am I missing something?

pavel-kirienko · 2024-05-06T15:27:40Z

pydsdl/_dsdl.py

+
+from abc import ABC, abstractmethod
+from pathlib import Path
+from typing import Callable, Iterable, List, Optional, Set, Tuple, TypeVar, Union


Suggested change

from typing import Callable, Iterable, List, Optional, Set, Tuple, TypeVar, Union

from typing import Callable, Iterable, Optional, TypeVar

Deprecated items should not be used in new code; see PEP 585

Union is not needed anymore because it can be written in shorthand as A | B

Optional arguably might not be needed as well because it is representable as T | None

pavel-kirienko · 2024-05-06T15:33:53Z

pydsdl/_error.py

+class DependentFileError(RuntimeError):
+    """
+    Raised by the DefinitionVisitor when it encounters a dependent dsdl file that is not allowed by the user.
+    """


This is used for control flow, not for error reporting, so it shouldn't be here. If it were to be kept here it would have to implement FrontendError.

pavel-kirienko · 2024-05-06T15:37:27Z

pydsdl/_dsdl.py

+class DefinitionVisitor(ABC):
+    """
+    An interface that allows visitors to index dependent types of a target DSDL file. This allows visitors to
+    build a closure of dependent types while parsing a set of target DSDL files.
+    """
+
+    @abstractmethod
+    def on_discover_lookup_dependent_file(self, target_dsdl_file: DsdlFile, dependent_type: DsdlFile) -> None:
+        """
+        Called by the parser after if finds a dependent type but before it parses a file in a lookup namespace.
+        :param DsdlFile target_dsdl_file: The target DSDL file that has dependencies the parser is searching for.
+        :param DsdlFile dependent_type: The dependency of target_dsdl_file file the parser is about to parse.
+        :raises DependentFileError: If the dependent file is not allowed by the visitor.
+        """
+        raise NotImplementedError()


I would suggest replacing this with a callback function and avoid using exceptions for control flow, as we can communicate the fact that the dependent file is not allowed by returning a tag type:

@dataclasses.dataclass(frozen=True) # ok this is optional class DependentFileNotAllowed: pass # some context may be added later #... def on_discover_lookup_dependent_file(self, target_dsdl_file: DsdlFile, dependent_type: DsdlFile) -> None | DependentFileNotAllowed:

pavel-kirienko · 2024-05-06T15:49:39Z

pydsdl/_namespace_reader.py

+        return union_files
+
+
+class Closure(DefinitionVisitor):


I think this approach might be too stateful and OOP-heavy to be considered Pythonic. Continuing the idea of converting the DefinitionVisitor into a callback, this would also become a function:

@dataclasses.dataclass(frozen=True) class Result: direct: set[DSDLFile] transitive: set[DSDLFile] @property def all(self) -> set[DSDLFile]: return self.direct | self.transitive def read_definitions(target_definitions: SortedFileList, lookup_definitions: SortedFileList, allow_unregulated_fixed_port_id: bool, print_output_handler: PrintOutputHandler | None) -> Result: ...

The FilesContainer are an implementation detail of this module so they should not be visible nor returned

oof. Okay. I think I need to start over then. Let me try again in a new pull-request.

…Cyphal/pydsdl/pull/101\#discussion_r1590898374 and replace a few deprecated typing entities with modern alternatives

thirtytwobits force-pushed the main branch 2 times, most recently from 7cbe690 to 4f651a6 Compare March 31, 2024 20:46

thirtytwobits force-pushed the main branch from aa5b3ff to fe124f7 Compare April 8, 2024 16:44

Issue OpenCyphal#99 (modified) proposal

5ad9875

pavel-kirienko reviewed Apr 16, 2024

View reviewed changes

pavel-kirienko requested changes May 6, 2024

View reviewed changes

thirtytwobits closed this May 6, 2024

pavel-kirienko mentioned this pull request Jun 7, 2024

Adding support for file-based reads and externalized type dependencies #103

Merged

pavel-kirienko referenced this pull request in thirtytwobits/pydsdl Jun 28, 2024

Uphold the naming consistency as commented at https://github.com/Open…

c32dae1

…Cyphal/pydsdl/pull/101\#discussion_r1590898374 and replace a few deprecated typing entities with modern alternatives

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue #99 proposal #101

Issue #99 proposal #101

thirtytwobits commented Mar 30, 2024 •

edited

Loading

pavel-kirienko commented Apr 2, 2024 •

edited

Loading

thirtytwobits commented Apr 2, 2024

pavel-kirienko commented Apr 3, 2024

thirtytwobits commented Apr 8, 2024

emrainey commented Apr 10, 2024

thirtytwobits commented Apr 10, 2024

pavel-kirienko left a comment

thirtytwobits commented Apr 16, 2024

pavel-kirienko left a comment

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

pavel-kirienko May 6, 2024

thirtytwobits May 6, 2024

		raise NotImplementedError()


		class DsdlFileBuildable(DsdlFile):

	def source_file_path_to_root(self) -> Path:
	def root_namespace_path(self) -> Path:

		For synthesized types such as service request/response sections, this property is the path to the service type
		since request and response types are defined within the service type's dsdl file.

		"""Invoked when the frontend encounters a print directive or needs to output a generic diagnostic."""


		class DsdlFile(ABC):

	from typing import Callable, Iterable, List, Optional, Set, Tuple, TypeVar, Union
	from typing import Callable, Iterable, Optional, TypeVar

Issue #99 proposal #101

Issue #99 proposal #101

Conversation

thirtytwobits commented Mar 30, 2024 • edited Loading

pavel-kirienko commented Apr 2, 2024 • edited Loading

thirtytwobits commented Apr 2, 2024

pavel-kirienko commented Apr 3, 2024

thirtytwobits commented Apr 8, 2024

emrainey commented Apr 10, 2024

thirtytwobits commented Apr 10, 2024

pavel-kirienko left a comment

Choose a reason for hiding this comment

thirtytwobits commented Apr 16, 2024

pavel-kirienko left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thirtytwobits commented Mar 30, 2024 •

edited

Loading

pavel-kirienko commented Apr 2, 2024 •

edited

Loading