
DM-48944 DREAM should send files to LFA #12

Merged: 6 commits merged into develop from tickets/DM-48944 on Feb 19, 2025

Conversation

@bbrondel (Contributor)

This patch adds functionality to the DREAM CSC to use the getNewDataProducts command from DREAM. For each new data product, it sets up a key on the LFA and uploads the file. If the upload to the LFA fails, it saves the file in the /tmp directory.

Note that this creates significant extra disk storage requirements for the DREAM CSC. Given how rapidly DREAM generates data, it may not be worth the effort to save data on the local disk as a fallback; it may be better to simply accept lost data whenever the LFA is unavailable.

@tribeiro (Member) left a comment:

This looks good but I have some comments I hope you will consider before I can approve the PR.

Large File Annex S3 instance, for example "cp", "tuc" or "ls".
type: string
pattern: "^[a-z0-9][.a-z0-9]*[a-z0-9]$"
url_root:
@tribeiro (Member):

Would this be any different from the hostname above? I imagine the data service will be running on the same machine that runs the DREAM controller, no?

@bbrondel (Contributor, Author):

Definitely, the hostname will match the host configuration item. What I'm not sure about is (1) whether the HTTP server will be http or https, and (2) which port the HTTP server would be expected to run on.

@tribeiro (Member):

Right, but I am pretty sure those are all things we control and can specify. For example, we can say that the port will be port+1 (increment the communication port by one) and that it is not going to be https.
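
For illustration, a sketch of how the CSC could derive the data-service URL under that convention (the config field names host and port, and the URL itself, are assumptions, not part of this PR):

    # Sketch: derive the data service URL from the existing configuration,
    # assuming plain HTTP on the communication port incremented by one.
    dream_url = f"http://{self.config.host}:{self.config.port + 1}"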

@@ -142,6 +158,7 @@ async def connect(self) -> None:
        self.weather_and_status_loop_task = asyncio.ensure_future(
            self.weather_and_status_loop()
        )
        self.data_product_loop_task = asyncio.ensure_future(self.data_product_loop())
@tribeiro (Member):

The recommended way to schedule background tasks is to use create_task instead of ensure_future. I noticed that there are other uses of ensure_future here; you might want to update those as well.
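
A minimal sketch of the suggested change, using the task names from the diff above:

    # create_task is the documented way to schedule a coroutine as a
    # background task; it requires a running event loop, which connect()
    # already guarantees.
    self.weather_and_status_loop_task = asyncio.create_task(
        self.weather_and_status_loop()
    )
    self.data_product_loop_task = asyncio.create_task(self.data_product_loop())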

        try:
            await self.weather_and_status_loop_task
            await self.data_product_loop_task
@tribeiro (Member):

Note that since self.weather_and_status_loop_task was also cancelled, you will never reach the second await here. You might want to do a gather instead, and since you are basically expecting the tasks to be cancelled, you could use return_exceptions=True and ignore the results, something like this:

await asyncio.gather(self.weather_and_status_loop_task, self.data_product_loop_task, return_exceptions=True)

You can even remove the try/except in this case.
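
Put together, the shutdown path could look something like this (a sketch; it assumes both tasks were created during connect):

    self.weather_and_status_loop_task.cancel()
    self.data_product_loop_task.cancel()
    # With return_exceptions=True the CancelledErrors are returned rather
    # than raised, so both tasks are always awaited and no try/except is
    # needed.
    await asyncio.gather(
        self.weather_and_status_loop_task,
        self.data_product_loop_task,
        return_exceptions=True,
    )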


Parameters
----------
data_product: DataProduct
@tribeiro (Member):

Please, reformat this to fit the numpy docs style, e.g.:

data_product : `DataProduct`
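
For reference, the full entry in numpydoc form could read (the description line is illustrative):

    Parameters
    ----------
    data_product : `DataProduct`
        The new data product to retrieve and upload.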

        try:
            await self.upload_data_product(data_product)
        except Exception:
            self.log.exception("Upload data product failed")
@tribeiro (Member):

So, if this raises an exception, you are logging it and just ignoring it. I think we might have to take a more proactive measure here, like sending the CSC to fault or something.
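
A minimal sketch of that, assuming the CSC inherits from salobj.BaseCsc and using a hypothetical error code:

    try:
        await self.upload_data_product(data_product)
    except Exception:
        self.log.exception("Upload data product failed")
        # The error code here is hypothetical; use the CSC's real error
        # code enumeration.
        self.fault(code=1, report="Upload data product failed.")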

        async with httpx.AsyncClient() as client:
            async with client.stream("GET", dream_url) as response:
                if response.status_code != 200:
                    self.log.error(
@tribeiro (Member):

Should this be an exception instead?
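
If so, httpx can do this directly; a sketch:

    async with client.stream("GET", dream_url) as response:
        # Raises httpx.HTTPStatusError on 4xx/5xx responses instead of
        # logging and carrying on.
        response.raise_for_status()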

                    )

            # Second attempt: Fresh request and save locally
            async with client.stream("GET", dream_url) as response:
@tribeiro (Member):

Do you really need to retrieve the data again? You already retrieved it above, so why not fall back to saving it to local disk right after the failure above?

You could do something like:

                try:
                    # First attempt: Save to S3
                    await self.save_to_s3(response, key)
                    return  # Success!
                except Exception:
                    self.log.exception(
                        f"Could not upload {key} to S3; trying to save to local disk."
                    )
                    await self.save_to_local_disk(response, key)

Also, I am a bit wary of all these potential errors being ignored (after being logged). If you are afraid some transient errors might happen and want to make it resilient to that, it is probably fine. However, ignoring all errors like this is probably too extreme. I would rather have the CSC go to Fault.

            )
            self.log.info(f"Successfully uploaded {key} to S3.")
        except Exception:
            self.log.exception(f"Failed to upload {key} to S3.")
@tribeiro (Member):

I have the impression you are logging the same exceptions multiple times here. You might want to remove the try/except here and let the exception propagate to be handled at another layer.

            self.log.info(f"Saved {key} to local disk at {filepath}")
        except Exception:
            self.log.exception("Could not save the file to local disk.")
            raise
@tribeiro (Member):

Same as above. Consider just removing the try/except clause here and handling the exception at another layer.

            generator="dream",
            date=data_product.start,
            other=other,
            suffix=".fits",
@tribeiro (Member):

Is this right? Is the data always going to be FITS? I have the impression that in the mock below you are handling txt files. This should probably be extracted from the data_product, no?

@bbrondel (Contributor, Author):

I contacted Sjoerd to ask about this but I haven't heard back from him yet.

@bbrondel (Contributor, Author):

Changed this line to suffix=pathlib.Path(data_product.filename).suffix so we should now be good in any case.
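
For example (filenames illustrative):

    import pathlib

    pathlib.Path("exposure_0001.fits").suffix  # ".fits"
    pathlib.Path("cloud_status.txt").suffix    # ".txt"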


        if not self.s3bucket:
            raise RuntimeError("S3 bucket not configured")

        filepath = pathlib.Path("/tmp") / self.s3bucket.name / key
@tribeiro (Member):

Are you sure you want to write this data to /tmp? Data here can be wiped out by the OS at any time. I think you probably want to write it somewhere else, maybe even to a configurable directory. This would allow us to, for instance, mount an NFS drive to be used as a backup.

If you really want to write to temporary storage, you might want to look into Python's tempfile module, though I am pretty sure that is not really what we want to do here.
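
For illustration, a sketch of the configurable-directory variant (the config field name local_backup_dir is an assumption):

    # Write the fallback copy under a configured directory (e.g. an NFS
    # mount) instead of /tmp, preserving the bucket-name/key layout.
    backup_root = pathlib.Path(self.config.local_backup_dir)
    filepath = backup_root / self.s3bucket.name / key
    filepath.parent.mkdir(parents=True, exist_ok=True)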

@bbrondel (Contributor, Author):

Changed to a configurable directory.

            async for chunk in response.aiter_bytes():
                tmp_file.write(chunk)

        await self.evt_largeFileObjectAvailable.set_write(
@tribeiro (Member):

So, this event is reserved for when we write things to the S3 bucket, so in this case don't publish it.

@tribeiro (Member) left a comment:

Looks good! Thanks for the updates!

@bbrondel merged commit c68e2f5 into develop on Feb 19, 2025 (2 checks passed).
@bbrondel deleted the tickets/DM-48944 branch on February 19, 2025 at 20:52.