test(robot): add uninstallation with backups and system backup test cases #2274

yangchiu · 2025-01-15T23:52:37Z

Which issue(s) this PR fixes:

Issue longhorn/longhorn#10148

What this PR does / why we need it:

add uninstallation with backups and system backup test cases

Special notes for your reviewer:

Additional documentation or context

Summary by CodeRabbit

New Features
- Added support for creating backups without waiting for completion.
- Introduced system backup and restore functionality.
- Enhanced volume data checksum tracking.
- Added new keywords for backup and volume management.
- Added cleanup keywords for system backups and restores.
Bug Fixes
- Improved error handling in various utility functions.
- Added retry mechanisms for backup and system backup operations.
Documentation
- Updated logging and error messages for better traceability.
Tests
- Added new test cases for backup, system backup, and uninstallation scenarios.

coderabbitai · 2025-01-15T23:52:47Z

Walkthrough

This pull request introduces comprehensive enhancements to Longhorn's end-to-end (e2e) testing framework, focusing on system backup, backup management, and uninstallation processes. The changes span multiple files across the test infrastructure, introducing new keywords, methods, and test cases. Key additions include asynchronous backup creation, system backup functionality, improved volume management, and robust error handling mechanisms. The modifications aim to strengthen the testing coverage for critical Longhorn operations, particularly around backup and uninstallation scenarios.

Changes

File	Change Summary
`e2e/keywords/backup.resource`	Added new keyword for creating backup without waiting for completion
`e2e/keywords/common.resource`	Added cleanup keywords for system backups and restores
`e2e/keywords/system_backup.resource`	Introduced system backup and restore keywords
`e2e/keywords/variables.resource`	Added `@{volume_checksums}` variable
`e2e/keywords/volume.resource`	Enhanced volume management with checksum handling and new wait keywords
`e2e/libs/backup/*`	Updated backup classes with new methods for async backup and cleanup
`e2e/libs/system_backup/*`	Created new system backup infrastructure with base, CRD, and REST implementations
`e2e/tests/regression/`	Added new test cases for system backup and uninstallation scenarios

Assessment against linked issues

Objective	Addressed	Explanation
Automate e2e test cases for uninstallation	✅	Added comprehensive test cases in `test_backup.robot` and `test_system_backup.robot` covering uninstallation scenarios

Possibly related PRs

test(robot): volume should not reattach after node eviction #2164: The main PR introduces a new keyword for creating a backup without waiting for completion, which relates to the changes in the volume.resource file that enhance volume management functionality, including restoration processes.
test(robot): fix pull backup created by another longhorn system test case for v2 volumes #2198: The main PR's focus on backup creation and verification aligns with the modifications in the backup.resource file, which updates the backup verification keyword to include a source volume ID.
feat(backup): fix test cases for multiple backup stores #2202: The main PR's enhancements to backup verification and management are relevant to the changes in the backupstore.py file, which improve error handling and retry mechanisms for backup operations.

Suggested reviewers

roger-ryao
chriscchien

Poem

🐰 Backup Bunny's Ballad 🥕
In Longhorn's meadow, tests now dance,
Async backups take their stance,
System restores hop with glee,
Uninstall tests set kernels free!

Thump thump goes the testing beat! 🎉

Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 6

🔭 Outside diff range comments (1)

e2e/libs/k8s/k8s.py (1)
Line range hint 106-126: Replace assert False with proper exception handling and improve error message.

The job completion check logic is good, but there are two issues:

Using assert False is not recommended as assertions can be disabled.

The error message could be more descriptive.

Apply this diff to improve error handling:
-    assert False, f"Job not complete: {get_pod_logs(namespace, job_label)}"
+    raise RuntimeError(
+        f"Job with label {job_label} in namespace {namespace} did not complete within "
+        f"{LONGHORN_UNINSTALL_TIMEOUT} retries. Pod logs: {get_pod_logs(namespace, job_label)}"
+    )
🧰 Tools

🪛 Ruff (0.8.2)

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🧹 Nitpick comments (11)

e2e/libs/system_backup/base.py (1)
1-11: LGTM! Well-structured abstract base class.

Good use of ABC to define the contract for backup operations. The abstract methods clearly define the required interface for subclasses.

Consider removing the return NotImplemented statements as they are redundant when using @abstractmethod. The decorator already ensures that these methods must be implemented by subclasses.
    @abstractmethod
    def create(self, backup_name):
-        return NotImplemented
+        pass

    @abstractmethod
    def restore(self, backup_name):
-        return NotImplemented
+        pass
e2e/keywords/system_backup.resource (1)

1-6: LGTM! Well-structured Robot Framework resource file.

Good organization with clear library imports and documentation.

Consider enhancing the documentation to include:

Description of the keywords

Expected parameters

Example usage
e2e/libs/keywords/system_backup_keywords.py (1)
9-19: Add docstrings and error handling.

The methods lack documentation and error handling.

Consider adding:

Class and method docstrings

Error handling for failed operations

Parameter validation

Example implementation:
def create_system_backup(self, backup_name):
    """Create a system backup with the given name.
    
    Args:
        backup_name (str): Name for the system backup
        
    Raises:
        ValueError: If backup_name is None or empty
        SystemBackupError: If backup creation fails
    """
    if not backup_name:
        raise ValueError("backup_name cannot be empty")
    try:
        self.system_backup.create(backup_name)
    except Exception as e:
        raise SystemBackupError(f"Failed to create system backup: {e}")
e2e/libs/system_backup/system_backup.py (1)
8-8: Consider making the strategy configurable.

The strategy is hardcoded as REST. Consider making it configurable through environment variables or constructor parameters for better flexibility.
-    _strategy = LonghornOperationStrategy.REST
+    def __init__(self, strategy=LonghornOperationStrategy.REST):
+        self._strategy = strategy
e2e/tests/regression/test_system_backup.robot (1)
2-2: Fix typo in documentation.

There's a typo in the documentation.
-Documentation    Systen Backup Test Cases
+Documentation    System Backup Test Cases
🧰 Tools

🪛 GitHub Check: codespell

[failure] 2-2:
Systen ==> System
e2e/libs/backup/crd.py (1)
58-76: Enhance error handling in cleanup_backups method.

The error handling could be improved in the following ways:

The exception message is not being raised properly

The error logging could be more detailed
 def cleanup_backups(self):
     backups = self.obj_api.list_namespaced_custom_object("longhorn.io",
                                                         "v1beta2",
                                                         "longhorn-system",
                                                         "backups")
     for backup in backups['items']:
         logging(f"Deleting backup {backup['metadata']['name']}")
         try:
             self.obj_api.delete_namespaced_custom_object("longhorn.io",
                                                         "v1beta2",
                                                         "longhorn-system",
                                                         "backups",
                                                         backup['metadata']['name'])
         except Exception as e:
             if e.reason != "Not Found":
-                Exception(f"Deleting backup failed: {e}")
+                logging(f"Error deleting backup {backup['metadata']['name']}: {e}")
+                raise Exception(f"Failed to delete backup {backup['metadata']['name']}: {e}")
e2e/libs/system_backup/rest.py (2)
56-62: Improve loop variable naming in retry blocks.

The loop control variables i are unused in the retry blocks. Consider using _ for unused variables.
-        for i in range(self.retry_count):
+        for _ in range(self.retry_count):
             system_backups = get_longhorn_client().list_system_backup()
             if len(system_backups) == 0:
                 deleted = True
                 break
             time.sleep(self.retry_interval)

-        for i in range(self.retry_count):
+        for _ in range(self.retry_count):
             system_restores = get_longhorn_client().list_system_restore()
             if len(system_restores) == 0:
                 deleted = True
                 break
             time.sleep(self.retry_interval)
Also applies to: 74-80

🧰 Tools

🪛 Ruff (0.8.2)

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

13-28: Consider adding timeout handling in create method.

The retry mechanism could benefit from a timeout check to prevent infinite waiting in case of system issues.
     def create(self, backup_name):
         logging(f"Creating system backup {backup_name}")
         get_longhorn_client().create_system_backup(Name=backup_name)
         ready = False
+        start_time = time.time()
+        timeout = self.retry_count * self.retry_interval
         for i in range(self.retry_count):
             logging(f"Waiting for system backup {backup_name} to be ready ... ({i})")
             try:
                 system_backup = get_longhorn_client().by_id_system_backup(backup_name)
                 if system_backup.state == "Ready":
                     ready = True
                     break
             except Exception as e:
                 logging(f"Waiting for system backup {backup_name} to be ready failed: {e}")
+            if time.time() - start_time > timeout:
+                logging(f"Timeout waiting for system backup {backup_name}")
+                break
             time.sleep(self.retry_interval)
         assert ready, f"Waiting for system backup {backup_name} to be ready failed: {system_backup}"
e2e/tests/regression/test_backup.robot (1)
111-135: Consider adding verification steps after uninstall.

The test case could benefit from additional verification steps:

Verify the state of backups after uninstall

Verify the state of volumes after reinstall

Consider adding these steps after line 134:
    And Verify all backups are preserved
    And Verify volumes are removed
    And Verify backup store is accessible
🧰 Tools

🪛 GitHub Check: codespell

[failure] 124-124:
bakups ==> backups

[failure] 124-124:
avaiable ==> available

🪛 GitHub Actions: Codespell

[error] 124-124: Spelling error: 'bakups' should be 'backups'
e2e/libs/backup/rest.py (1)
221-223: Consider adding error handling to cleanup_backups.

The method delegates to CRD's cleanup_backups but doesn't handle potential exceptions.

Consider applying this diff:
     def cleanup_backups(self):
-        return CRD().cleanup_backups()
+        try:
+            return CRD().cleanup_backups()
+        except Exception as e:
+            logging(f"Failed to cleanup backups: {e}")
+            raise
e2e/libs/utility/utility.py (1)
233-238: Enhance error handling granularity.

The current error handling returns an empty string for all error cases, which could mask different types of issues (e.g., CR not found vs annotation not found). Consider differentiating between error cases.
 def get_annotation_value(group, version, namespace, plural, name, annotation_key):
     try:
         cr = get_cr(group, version, namespace, plural, name)
+        if not cr:
+            logging(f"Custom resource {plural} {name} not found in {namespace}")
+            return None
+        if 'annotations' not in cr['metadata']:
+            logging(f"No annotations found in {plural} {name} in {namespace}")
+            return None
         return cr['metadata']['annotations'].get(annotation_key)
     except Exception as e:
         logging(f"Failed to get annotation {annotation_key} from {plural} {name} in {namespace}: {e}")
-        return ""
+        return None

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3480ad and 76cbb85.

📒 Files selected for processing (27)

e2e/keywords/backup.resource (1 hunks)
e2e/keywords/common.resource (2 hunks)
e2e/keywords/system_backup.resource (1 hunks)
e2e/keywords/variables.resource (1 hunks)
e2e/keywords/volume.resource (6 hunks)
e2e/libs/backup/backup.py (2 hunks)
e2e/libs/backup/base.py (2 hunks)
e2e/libs/backup/crd.py (1 hunks)
e2e/libs/backup/rest.py (4 hunks)
e2e/libs/backupstore/base.py (3 hunks)
e2e/libs/k8s/k8s.py (3 hunks)
e2e/libs/keywords/backup_keywords.py (2 hunks)
e2e/libs/keywords/system_backup_keywords.py (1 hunks)
e2e/libs/keywords/volume_keywords.py (4 hunks)
e2e/libs/system_backup/__init__.py (1 hunks)
e2e/libs/system_backup/base.py (1 hunks)
e2e/libs/system_backup/crd.py (1 hunks)
e2e/libs/system_backup/rest.py (1 hunks)
e2e/libs/system_backup/system_backup.py (1 hunks)
e2e/libs/utility/constant.py (1 hunks)
e2e/libs/utility/utility.py (1 hunks)
e2e/libs/volume/base.py (2 hunks)
e2e/libs/volume/crd.py (4 hunks)
e2e/libs/volume/rest.py (2 hunks)
e2e/libs/volume/volume.py (2 hunks)
e2e/tests/regression/test_backup.robot (2 hunks)
e2e/tests/regression/test_system_backup.robot (1 hunks)

✅ Files skipped from review due to trivial changes (1)

e2e/keywords/variables.resource

🧰 Additional context used

🪛 GitHub Check: codespell

e2e/tests/regression/test_backup.robot

[failure] 124-124:
bakups ==> backups

[failure] 124-124:
avaiable ==> available

e2e/tests/regression/test_system_backup.robot

[failure] 2-2:
Systen ==> System

🪛 GitHub Actions: Codespell

e2e/tests/regression/test_backup.robot

[error] 124-124: Spelling error: 'bakups' should be 'backups'

🪛 Ruff (0.8.2)

e2e/libs/k8s/k8s.py

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/system_backup/rest.py

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

e2e/libs/volume/crd.py

258-258: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/backupstore/base.py

129-129: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/system_backup/__init__.py

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: Build images
GitHub Check: Summary

🔇 Additional comments (33)

e2e/keywords/system_backup.resource (1)

8-14: LGTM! Clear and consistent keyword implementation.

Keywords follow good practices:

Consistent naming convention

Proper use of helper functions

Clear separation of concerns

e2e/libs/system_backup/system_backup.py (2)

10-15: LGTM! Clean strategy initialization.

The constructor properly initializes the appropriate strategy based on the configuration.

16-26: LGTM! Well-structured delegation methods.

The methods follow good design principles:

Clean delegation to the strategy implementation

Consistent method signatures

Clear separation of concerns

e2e/libs/utility/constant.py (1)

26-27: LGTM! Well-documented timeout constant.

The timeout value matches the activeDeadlineSeconds in the uninstall configuration, and the comment provides good traceability.

e2e/tests/regression/test_system_backup.robot (2)

17-37: LGTM! Comprehensive backup and restore test.

The test case follows best practices:

Clear Given-When-Then structure

Proper volume lifecycle management

Data integrity validation

38-56: LGTM! Well-structured uninstallation test.

The test case effectively validates:

Both v1 and v2 data engines

System backup before uninstallation

Complete cleanup (CRD removal)

Successful reinstallation

e2e/libs/keywords/backup_keywords.py (2)

12-13: LGTM! Enhanced backup creation with async support.

Good addition of the wait parameter with a safe default value of True for backward compatibility.

52-53: LGTM! Comprehensive cleanup implementation.

The cleanup method now properly handles all backup types: system backups, backup volumes, and individual backups.

e2e/libs/backup/backup.py (2)

18-19: LGTM: Method signature update maintains strategy pattern.

The addition of the wait parameter to the create method properly delegates to the strategy-specific implementation while maintaining the class's role in the strategy pattern.

70-71: LGTM: Clean delegation of backup cleanup.

The new cleanup_backups method follows the established pattern of delegating to the strategy-specific implementation.

e2e/keywords/common.resource (2)

23-23: LGTM: Library import for system backup functionality.

The addition of the system backup keywords library aligns with the new system backup functionality.

61-62: LGTM: Comprehensive cleanup sequence.

The addition of system backup and restore cleanup steps in the cleanup sequence ensures thorough resource cleanup.

e2e/keywords/backup.resource (1)

12-15: LGTM: Well-structured asynchronous backup creation keyword.

The new keyword follows the established pattern and clearly indicates its asynchronous nature through both its name and implementation.

e2e/libs/backup/base.py (2)

13-14: LGTM: Abstract method signature properly updated.

The addition of the wait parameter to the abstract create method ensures consistent interface across all implementations.

108-110: LGTM: New abstract method follows established patterns.

The new cleanup_backups abstract method maintains consistency with other abstract methods in the class.

e2e/libs/backup/crd.py (1)

10-54: Consider implementing the NotImplemented methods.

The class has multiple methods marked as NotImplemented. Consider implementing these methods or adding TODO comments with implementation plans.

Let's check which of these methods are required by the test cases:

e2e/libs/volume/base.py (1)

83-86: LGTM! Clean and consistent method naming.

The changes follow a consistent naming pattern and maintain backward compatibility.

Also applies to: 104-104

e2e/libs/backupstore/base.py (1)

44-44: LGTM: Good initialization of retry parameters.

The initialization of retry parameters using the utility function ensures consistent retry behavior across the codebase.

e2e/libs/volume/volume.py (2)

57-59: LGTM: Well-structured method delegation.

The new method follows the established pattern of delegating to the underlying volume implementation.

90-91: LGTM: Good naming consistency improvement.

The rename from completed to to_complete improves naming consistency with other similar methods in the class.
e2e/libs/backup/rest.py (1)

Line range hint 23-52: LGTM: Good implementation of asynchronous backup creation.

The addition of the wait parameter provides flexibility in backup creation, allowing for both synchronous and asynchronous operations. The implementation correctly handles both cases.

Let's verify that all callers of this method have been updated to handle the new parameter:

✅ Verification successful

All callers properly handle the new wait parameter

The codebase shows a well-structured implementation where the wait parameter is consistently handled through all layers, from the high-level API (with a sensible default of wait=True) down to the actual implementation.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for calls to the create method
rg -A 2 "create$.*volume_name.*backup_id" --type python
Length of output: 89

Script:
#!/bin/bash
# First, let's find the class definition
rg -B 2 "def create.*volume_name.*backup_id" 

# Then find any calls to create method
rg "create\([^)]*volume_name[^)]*backup_id[^)]*$"
Length of output: 1579
e2e/libs/keywords/volume_keywords.py (4)

106-106: LGTM! Improved data validation.

The functions now properly return values for checksum validation, enhancing data integrity checks.

Also applies to: 114-114

266-269: LGTM! Added volume creation completion check.

New method properly waits for volume creation to complete before proceeding.

306-308: LGTM! Improved method naming consistency.

Renamed method to follow a more consistent naming pattern (to_complete vs completed).

323-325: LGTM! Added volume deletion completion check.

New method properly waits for volume deletion to complete.

e2e/libs/volume/rest.py (1)

94-94: LGTM! Consistent method naming.

Method renamed to follow the established naming pattern.

e2e/keywords/volume.resource (3)

98-99: LGTM! Enhanced data integrity validation.

Added proper checksum tracking for volume data operations, improving data integrity validation.

Also applies to: 103-104, 108-109, 113-114

128-131: LGTM! Added volume lifecycle management.

New keywords properly handle volume creation and deletion completion checks.

Also applies to: 148-151

334-336: LGTM! Improved data validation.

Enhanced data validation by comparing actual and expected checksums.

e2e/libs/system_backup/__init__.py (1)

1-1: LGTM! Package-level import looks good

The import statement follows the common Python pattern of re-exporting symbols at the package level for easier access by other modules.

🧰 Tools

🪛 Ruff (0.8.2)

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)

e2e/libs/volume/crd.py (3)

379-406: LGTM! Well-structured restoration verification.

The implementation properly handles the two-phase verification process for volume restoration, with appropriate error handling and logging. The method name change to to_complete (from completed) follows better naming conventions.

454-456: LGTM! Good defensive programming.

The addition of the volume state verification before writing data prevents potential issues that could arise from writing to an unattached volume.

474-476: LGTM! Consistent with write_random_data changes.

The addition of the volume state verification maintains consistency with the changes in write_random_data and prevents potential issues.

coderabbitai · 2025-01-15T23:56:55Z

e2e/libs/system_backup/crd.py

+class CRD(Base):
+
+    def __init__(self):
+        pass


⚠️ Potential issue

Implementation incomplete: Required methods and initialization missing.

The CRD class is missing several critical components:

The __init__ method should initialize self.obj_api with client.CustomObjectsApi()

Abstract methods create and restore from the Base class must be implemented

According to the PR context, several other methods like get, list, delete, etc. should be present

Please implement the required initialization:

def __init__(self): - pass + super().__init__() + from kubernetes import client + self.obj_api = client.CustomObjectsApi()

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2025-01-15T23:56:55Z

e2e/libs/keywords/system_backup_keywords.py

+class system_backup_keywords:
+
+    def __init__(self):
+        self.system_backup = SystemBackup()
+


🛠️ Refactor suggestion

Rename class to follow Python naming conventions.

The class name should be in PascalCase as per PEP 8.

-class system_backup_keywords: +class SystemBackupKeywords:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class system_backup_keywords:

def __init__(self):

self.system_backup = SystemBackup()

class SystemBackupKeywords:

def __init__(self):

self.system_backup = SystemBackup()

coderabbitai · 2025-01-15T23:56:55Z

e2e/tests/regression/test_backup.robot

+    # create failed backups by resetting backupstore and create bakups without avaiable backup targets
+    And Reset Backupstore
+    And Create backup 0 for volume 0 without waiting for completion
+    And Create backup 1 for volume 1 without waiting for completion
+    And Set Backupstore


⚠️ Potential issue

Fix spelling errors in comments.

There are spelling errors in the comment:

"bakups" should be "backups"

"avaiable" should be "available"

- # create failed backups by resetting backupstore and create bakups without avaiable backup targets + # create failed backups by resetting backupstore and create backups without available backup targets

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# create failed backups by resetting backupstore and create bakups without avaiable backup targets

And Reset Backupstore

And Create backup 0 for volume 0 without waiting for completion

And Create backup 1 for volume 1 without waiting for completion

And Set Backupstore

# create failed backups by resetting backupstore and create backups without available backup targets

And Reset Backupstore

And Create backup 0 for volume 0 without waiting for completion

And Create backup 1 for volume 1 without waiting for completion

And Set Backupstore

🧰 Tools

🪛 GitHub Check: codespell

[failure] 124-124:
bakups ==> backups

[failure] 124-124:
avaiable ==> available

🪛 GitHub Actions: Codespell

[error] 124-124: Spelling error: 'bakups' should be 'backups'

coderabbitai · 2025-01-15T23:56:56Z

e2e/libs/backupstore/base.py

+        for i in range(self.retry_count):
+            logging(f"Updating backupstore to {url}${credential_secret} ... ({i})")
+            try:
+                backup_target.backupTargetUpdate(name=self.DEFAULT_BACKUPTARGET,
+                                                 backupTargetURL=url,
+                                                 credentialSecret=credential_secret,
+                                                 pollInterval=str(poll_interval))
+                return
+            except Exception as e:
+                logging(f"Failed to update backupstore to {url}${credential_secret}: {e}")
+            time.sleep(self.retry_interval)
+        assert False, f"Failed to update backupstore to {url}${credential_secret}"


⚠️ Potential issue

Replace assert False with proper exception handling.

The retry mechanism is well-implemented, but using assert False for error handling is not recommended as assertions can be disabled with Python's -O flag.

Apply this diff to improve error handling:

- assert False, f"Failed to update backupstore to {url}${credential_secret}" + raise RuntimeError(f"Failed to update backupstore to {url}${credential_secret}")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for i in range(self.retry_count):

logging(f"Updating backupstore to {url}${credential_secret} ... ({i})")

try:

backup_target.backupTargetUpdate(name=self.DEFAULT_BACKUPTARGET,

backupTargetURL=url,

credentialSecret=credential_secret,

pollInterval=str(poll_interval))

return

except Exception as e:

logging(f"Failed to update backupstore to {url}${credential_secret}: {e}")

time.sleep(self.retry_interval)

assert False, f"Failed to update backupstore to {url}${credential_secret}"

for i in range(self.retry_count):

logging(f"Updating backupstore to {url}${credential_secret} ... ({i})")

try:

backup_target.backupTargetUpdate(name=self.DEFAULT_BACKUPTARGET,

backupTargetURL=url,

credentialSecret=credential_secret,

pollInterval=str(poll_interval))

return

except Exception as e:

logging(f"Failed to update backupstore to {url}${credential_secret}: {e}")

time.sleep(self.retry_interval)

raise RuntimeError(f"Failed to update backupstore to {url}${credential_secret}")

🧰 Tools

🪛 Ruff (0.8.2)

129-129: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

coderabbitai · 2025-01-15T23:56:56Z

e2e/libs/volume/rest.py

+    def wait_for_volume_to_be_created(self, volume_name):
+        return NotImplemented
+


💡 Codebase verification

REST implementation needed for volume creation check

The REST implementation should follow the CRD pattern with proper retry mechanism and error handling:

def wait_for_volume_to_be_created(self, volume_name): for i in range(self.retry_count): logging(f"Waiting for volume {volume_name} to be created ... ({i})") try: self.get(volume_name) return except Exception as e: print(f"Failed to wait for volume {volume_name} to be created: {e}") time.sleep(self.retry_interval) assert False, f"Failed to wait for volume {volume_name} to be created"

🔗 Analysis chain

Implementation needed for volume creation check.

The method is currently marked as NotImplemented. Please provide an implementation to maintain consistency with the CRD implementation.

Run this script to find the CRD implementation for reference:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Find the CRD implementation of wait_for_volume_to_be_created ast-grep --pattern 'def wait_for_volume_to_be_created($$$)'

Length of output: 61

Script:

#!/bin/bash # Find volume-related files fd volume # Search for volume creation waiting patterns rg -i "wait.*volume.*creat" -A 5 # Search specifically in volume directories for similar implementations rg "def wait.*volume" -A 5

Length of output: 74835

coderabbitai · 2025-01-15T23:56:56Z

e2e/libs/volume/crd.py

+    def wait_for_volume_to_be_created(self, volume_name):
+        for i in range(self.retry_count):
+            logging(f"Waiting for volume {volume_name} to be created ... ({i})")
+            try:
+                self.get(volume_name)
+                return
+            except Exception as e:
+                print(f"Failed to wait for volume {volume_name} to be created: {e}")
+            time.sleep(self.retry_interval)
+        assert False, f"Failed to wait for volume {volume_name} to be created"


⚠️ Potential issue

Improve error handling and logging consistency.

The implementation has the following issues:

Inconsistent error logging: Uses print instead of the logging function.

Use of assert False which can be stripped by Python -O flag.

Apply this diff to fix the issues:

def wait_for_volume_to_be_created(self, volume_name): for i in range(self.retry_count): logging(f"Waiting for volume {volume_name} to be created ... ({i})") try: self.get(volume_name) return except Exception as e: - print(f"Failed to wait for volume {volume_name} to be created: {e}") + logging(f"Failed to wait for volume {volume_name} to be created: {e}") time.sleep(self.retry_interval) - assert False, f"Failed to wait for volume {volume_name} to be created" + raise AssertionError(f"Failed to wait for volume {volume_name} to be created")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def wait_for_volume_to_be_created(self, volume_name):

for i in range(self.retry_count):

logging(f"Waiting for volume {volume_name} to be created ... ({i})")

try:

self.get(volume_name)

return

except Exception as e:

print(f"Failed to wait for volume {volume_name} to be created: {e}")

time.sleep(self.retry_interval)

assert False, f"Failed to wait for volume {volume_name} to be created"

def wait_for_volume_to_be_created(self, volume_name):

for i in range(self.retry_count):

logging(f"Waiting for volume {volume_name} to be created ... ({i})")

try:

self.get(volume_name)

return

except Exception as e:

logging(f"Failed to wait for volume {volume_name} to be created: {e}")

time.sleep(self.retry_interval)

raise AssertionError(f"Failed to wait for volume {volume_name} to be created")

🧰 Tools

🪛 Ruff (0.8.2)

258-258: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

e2e/libs/volume/crd.py (1)
249-259: ⚠️ Potential issue

Improve error handling and logging consistency.

The implementation has the following issues:

Inconsistent error logging: Uses print instead of the logging function.

Use of assert False which can be stripped by Python -O flag.

Apply this diff to fix the issues:
     def wait_for_volume_to_be_created(self, volume_name):
         for i in range(self.retry_count):
             logging(f"Waiting for volume {volume_name} to be created ... ({i})")
             try:
                 self.get(volume_name)
                 return
             except Exception as e:
-                print(f"Failed to wait for volume {volume_name} to be created: {e}")
+                logging(f"Failed to wait for volume {volume_name} to be created: {e}")
             time.sleep(self.retry_interval)
-        assert False, f"Failed to wait for volume {volume_name} to be created"
+        raise AssertionError(f"Failed to wait for volume {volume_name} to be created")
🧰 Tools

🪛 Ruff (0.8.2)

258-258: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🧹 Nitpick comments (4)

e2e/libs/system_backup/rest.py (2)
13-27: Consider enhancing error handling and state validation.

While the implementation is functional, consider these improvements for robustness:

Handle error states (e.g., "Error", "Failed") explicitly

Add timeout handling beyond retry count

Log the actual state when not "Ready"
 def create(self, backup_name):
     logging(f"Creating system backup {backup_name}")
     get_longhorn_client().create_system_backup(Name=backup_name)
     ready = False
     for i in range(self.retry_count):
         logging(f"Waiting for system backup {backup_name} to be ready ... ({i})")
         try:
             system_backup = get_longhorn_client().by_id_system_backup(backup_name)
-            if system_backup.state == "Ready":
+            if system_backup.state in ["Error", "Failed"]:
+                raise RuntimeError(f"System backup {backup_name} failed: {system_backup.error}")
+            elif system_backup.state == "Ready":
                 ready = True
                 break
+            else:
+                logging(f"Current state: {system_backup.state}")
         except Exception as e:
             logging(f"Waiting for system backup {backup_name} to be ready failed: {e}")
         time.sleep(self.retry_interval)
     assert ready, f"Waiting for system backup {backup_name} to be ready failed: {system_backup}"
1-80: Overall implementation is solid with room for improvement.

The code successfully implements system backup and restore functionality with appropriate retry mechanisms and logging. Consider the suggested refactoring to improve maintainability and robustness:

Extract common retry logic

Consolidate cleanup methods

Enhance error handling

Address static analysis warnings

The current implementation is functional and can be merged, with these improvements tracked as follow-up tasks.

🧰 Tools

🪛 Ruff (0.8.2)

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)
e2e/tests/regression/test_system_backup.robot (1)
38-56: Consider adding validation for system backup completion.

While the test structure is good, consider adding explicit validation after system backup creation to ensure it completed successfully before proceeding with uninstallation.

Add these steps after line 51:
    And Create system backup 0
+    And Wait for system backup 0 completion
+    And Verify system backup 0 exists
e2e/libs/backup/rest.py (1)
221-223: Consider adding error handling in cleanup_backups.

While the delegation to CRD().cleanup_backups() is clean, consider adding error handling to catch and log potential failures.
 def cleanup_backups(self):
-    return CRD().cleanup_backups()
+    try:
+        return CRD().cleanup_backups()
+    except Exception as e:
+        logging(f"Failed to cleanup backups: {e}")
+        raise

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76cbb85 and eb7fc95.

📒 Files selected for processing (27)

e2e/keywords/backup.resource (1 hunks)
e2e/keywords/common.resource (2 hunks)
e2e/keywords/system_backup.resource (1 hunks)
e2e/keywords/variables.resource (1 hunks)
e2e/keywords/volume.resource (6 hunks)
e2e/libs/backup/backup.py (2 hunks)
e2e/libs/backup/base.py (2 hunks)
e2e/libs/backup/crd.py (1 hunks)
e2e/libs/backup/rest.py (4 hunks)
e2e/libs/backupstore/base.py (3 hunks)
e2e/libs/k8s/k8s.py (3 hunks)
e2e/libs/keywords/backup_keywords.py (2 hunks)
e2e/libs/keywords/system_backup_keywords.py (1 hunks)
e2e/libs/keywords/volume_keywords.py (4 hunks)
e2e/libs/system_backup/__init__.py (1 hunks)
e2e/libs/system_backup/base.py (1 hunks)
e2e/libs/system_backup/crd.py (1 hunks)
e2e/libs/system_backup/rest.py (1 hunks)
e2e/libs/system_backup/system_backup.py (1 hunks)
e2e/libs/utility/constant.py (1 hunks)
e2e/libs/utility/utility.py (1 hunks)
e2e/libs/volume/base.py (2 hunks)
e2e/libs/volume/crd.py (4 hunks)
e2e/libs/volume/rest.py (2 hunks)
e2e/libs/volume/volume.py (2 hunks)
e2e/tests/regression/test_backup.robot (2 hunks)
e2e/tests/regression/test_system_backup.robot (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (17)

e2e/libs/utility/constant.py
e2e/libs/system_backup/crd.py
e2e/keywords/variables.resource
e2e/keywords/backup.resource
e2e/keywords/system_backup.resource
e2e/keywords/common.resource
e2e/tests/regression/test_backup.robot
e2e/libs/backup/base.py
e2e/libs/utility/utility.py
e2e/libs/volume/base.py
e2e/libs/system_backup/base.py
e2e/libs/volume/rest.py
e2e/libs/backup/backup.py
e2e/libs/volume/volume.py
e2e/libs/keywords/volume_keywords.py
e2e/libs/system_backup/system_backup.py
e2e/libs/keywords/system_backup_keywords.py

🧰 Additional context used

🪛 Ruff (0.8.2)

e2e/libs/volume/crd.py

258-258: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/backupstore/base.py

129-129: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/k8s/k8s.py

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/system_backup/__init__.py

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)

e2e/libs/system_backup/rest.py

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: Build images
GitHub Check: Summary

🔇 Additional comments (19)

e2e/libs/system_backup/rest.py (2)

1-8: LGTM! Well-structured imports and class definition.

The imports are logically organized and the class inheritance from Base is appropriate for the system backup functionality.

10-11: LGTM! Clean constructor implementation.

Good practice using the utility function for retry configuration.

e2e/keywords/volume.resource (4)

98-99: LGTM! Improved data integrity verification.

The changes correctly capture and store the checksum for later verification.

128-131: LGTM! Added volume creation wait keyword.

The keyword correctly waits for volume creation and follows the established pattern.

148-151: LGTM! Added volume deletion wait keyword.

The keyword correctly waits for volume deletion and follows the established pattern.

334-336: LGTM! Added checksum verification.

The code correctly verifies data integrity by comparing actual and expected checksums.

e2e/libs/volume/crd.py (2)

379-406: LGTM! Well-structured volume restoration wait logic.

The implementation:

Correctly waits for engine restoration

Properly waits for backup update

Appropriately checks restore required status

454-456: LGTM! Added volume state validation.

The check ensures the volume is in "attached" state before writing data, preventing potential errors.

e2e/libs/keywords/backup_keywords.py (2)

12-13: LGTM! Good addition of asynchronous backup creation support.

The new wait parameter allows for more flexible testing scenarios by supporting both synchronous and asynchronous backup creation.

53-53: LGTM! Enhanced cleanup with backup removal.

Good addition of cleanup_backups() call to ensure thorough cleanup of all backup resources.

e2e/libs/system_backup/__init__.py (1)

1-1: LGTM! Appropriate package-level import.

The SystemBackup import in init.py follows the Python package convention of exposing classes at the package level. The static analysis warning about unused import can be safely ignored as this import is meant for external module usage.

🧰 Tools

🪛 Ruff (0.8.2)

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)

e2e/tests/regression/test_system_backup.robot (1)

17-37: LGTM! Well-structured system backup and restore test.

The test follows BDD style and has good coverage of the backup and restore workflow. The validation steps are comprehensive, checking both volume health and data integrity.
e2e/libs/backup/crd.py (1)

10-54: Consider implementing high-priority methods first.

Several methods are marked as NotImplemented. Consider prioritizing the implementation of critical methods needed for the system backup test cases, such as:

create

get

wait_for_backup_completed

restore

Let's check which methods are actually used in the test cases:

✅ Verification successful

The suggested prioritization of critical methods is correct.

The codebase analysis confirms that the following methods are actively used in test cases and should be prioritized:

create: Used in backup keywords for creating backups

get: Used in backup operations and verification

wait_for_backup_completed: Used in backup completion verification

restore: Used in backup restoration tests
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Find usage of CRD class methods in test files
rg -l "CRD" | xargs rg "def \w+\(" | cut -d: -f2 | sort | uniq
Length of output: 10719

Script:
#!/bin/bash
# Find actual usage of the specific CRD methods in test files
echo "=== Searching for CRD method calls ==="
for method in "create" "get" "wait_for_backup_completed" "restore" "get_backup_volume" "delete" "list" "list_all"; do
    echo "Looking for .$method calls:"
    rg "\.\s*$method\s*\(" --type py
done
Length of output: 40893
e2e/libs/backupstore/base.py (2)

44-44: LGTM: Good initialization of retry parameters.

The retry parameters are properly initialized using the utility function.

118-129: ⚠️ Potential issue

Replace assert False with proper exception handling.

The retry mechanism is well-implemented, but using assert False for error handling is not recommended as assertions can be disabled with Python's -O flag.

Apply this diff to improve error handling:
-        assert False, f"Failed to update backupstore to {url}${credential_secret}"
+        raise RuntimeError(f"Failed to update backupstore to {url}${credential_secret}")
Likely invalid or redundant comment.

🧰 Tools

🪛 Ruff (0.8.2)

129-129: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)
e2e/libs/k8s/k8s.py (1)

Line range hint 106-120: LGTM: Improved job completion logging.

Good enhancement of logging for job completion status, which will help with debugging uninstallation issues.

🧰 Tools

🪛 Ruff (0.8.2)

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/backup/rest.py (3)

23-32: LGTM: Good addition of wait parameter.

The wait parameter provides flexibility in backup creation, allowing for both synchronous and asynchronous operations.

39-42: LGTM: Enhanced logging clarity.

The logging messages now provide clearer information about the expected backup name.

182-183: LGTM: Improved backup volume deletion logging.

Added informative logging for backup volume deletion progress.

coderabbitai · 2025-01-16T00:09:27Z

e2e/libs/system_backup/rest.py

+    def restore(self, backup_name):
+        restore_name = generate_name_random("system-restore-")
+        logging(f"Creating system restore {restore_name} from system backup {backup_name}")
+        get_longhorn_client().create_system_restore(Name=restore_name, SystemBackup=backup_name)
+        completed = False
+        for i in range(self.retry_count):
+            logging(f"Waiting for system restore {restore_name} to be completed ... ({i})")
+            try:
+                system_restore = get_longhorn_client().by_id_system_restore(restore_name)
+                if system_restore.state == "Completed":
+                    completed = True
+                    break
+            except Exception as e:
+                logging(f"Waiting for system restore {restore_name} to be completed failed: {e}")
+            time.sleep(self.retry_interval)
+        assert completed, f"Waiting for system restore {restore_name} to be completed failed: {system_restore}"


🛠️ Refactor suggestion

Consider extracting common retry logic and enhancing error handling.

The restore method has similar patterns to the create method. Consider:

Extracting the retry logic into a common utility method

Enhancing error state handling similar to the create method

Example of extracted retry logic:

def _wait_for_state(self, resource_type, resource_name, desired_state, get_resource_func): """ Generic wait function for resource state """ success = False for _ in range(self.retry_count): try: resource = get_resource_func(resource_name) if resource.state in ["Error", "Failed"]: raise RuntimeError(f"{resource_type} {resource_name} failed: {resource.error}") elif resource.state == desired_state: success = True break else: logging(f"Current state: {resource.state}") except Exception as e: logging(f"Waiting for {resource_type} {resource_name} failed: {e}") time.sleep(self.retry_interval) assert success, f"Waiting for {resource_type} {resource_name} failed: State not reached"

coderabbitai · 2025-01-16T00:09:27Z

e2e/libs/system_backup/rest.py

+    def cleanup_system_backups(self):
+        system_backups = get_longhorn_client().list_system_backup()
+        for system_backup in system_backups:
+            logging(f"Deleting system backup {system_backup['name']}")
+            try:
+                get_longhorn_client().delete(system_backup)
+            except Exception as e:
+                logging(f"Deleting system backup {system_backup['name']} failed: {e}")
+
+        deleted = False
+        for i in range(self.retry_count):
+            system_backups = get_longhorn_client().list_system_backup()
+            if len(system_backups) == 0:
+                deleted = True
+                break
+            time.sleep(self.retry_interval)
+        assert deleted, f"Cleaning up system backups failed: {system_backups}"
+
+    def cleanup_system_restores(self):
+        system_restores = get_longhorn_client().list_system_restore()
+        for system_restore in system_restores:
+            logging(f"Deleting system restore {system_restore['name']}")
+            try:
+                get_longhorn_client().delete(system_restore)
+            except Exception as e:
+                logging(f"Deleting system restore {system_restore['name']} failed: {e}")
+
+        deleted = False
+        for i in range(self.retry_count):
+            system_restores = get_longhorn_client().list_system_restore()
+            if len(system_restores) == 0:
+                deleted = True
+                break
+            time.sleep(self.retry_interval)
+        assert deleted, f"Cleaning up system restores failed: {system_restores}"


🛠️ Refactor suggestion

Consolidate cleanup methods and address static analysis warnings.

Consider these improvements:

Consolidate the duplicate cleanup logic into a single method

Rename unused loop variables from i to _ as suggested by static analysis

Add error state handling for cleanup operations

Example consolidated cleanup method:

def _cleanup_resources(self, resource_type, list_func, resource_name_key='name'): """ Generic cleanup function for system resources """ resources = list_func() for resource in resources: name = resource[resource_name_key] logging(f"Deleting {resource_type} {name}") try: get_longhorn_client().delete(resource) except Exception as e: logging(f"Deleting {resource_type} {name} failed: {e}") deleted = False for _ in range(self.retry_count): resources = list_func() if len(resources) == 0: deleted = True break time.sleep(self.retry_interval) assert deleted, f"Cleaning up {resource_type}s failed: {resources}" def cleanup_system_backups(self): self._cleanup_resources('system backup', get_longhorn_client().list_system_backup) def cleanup_system_restores(self): self._cleanup_resources('system restore', get_longhorn_client().list_system_restore)

🧰 Tools

🪛 Ruff (0.8.2)

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

coderabbitai · 2025-01-16T00:09:27Z

e2e/libs/backup/crd.py

+    def cleanup_backups(self):
+        # Use k8s api to delete all backup especially backup in error state
+        # Because backup in error state does not have backup volume
+        backups = self.obj_api.list_namespaced_custom_object("longhorn.io",
+                                                             "v1beta2",
+                                                             "longhorn-system",
+                                                             "backups")
+        for backup in backups['items']:
+            logging(f"Deleting backup {backup['metadata']['name']}")
+            try:
+                self.obj_api.delete_namespaced_custom_object("longhorn.io",
+                                                             "v1beta2",
+                                                             "longhorn-system",
+                                                             "backups",
+                                                             backup['metadata']['name'])
+            except Exception as e:
+                if e.reason != "Not Found":
+                    Exception(f"Deleting backup failed: {e}")
+


⚠️ Potential issue

Fix error handling in cleanup_backups method.

The current error handling has two issues:

The exception is not properly raised (using Exception() instead of raise)

The error message could be more descriptive

Apply this diff to improve error handling:

except Exception as e: if e.reason != "Not Found": - Exception(f"Deleting backup failed: {e}") + raise Exception( + f"Failed to delete backup {backup['metadata']['name']}: {e}" + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def cleanup_backups(self):

# Use k8s api to delete all backup especially backup in error state

# Because backup in error state does not have backup volume

backups = self.obj_api.list_namespaced_custom_object("longhorn.io",

"v1beta2",

"longhorn-system",

"backups")

for backup in backups['items']:

logging(f"Deleting backup {backup['metadata']['name']}")

try:

self.obj_api.delete_namespaced_custom_object("longhorn.io",

"v1beta2",

"longhorn-system",

"backups",

backup['metadata']['name'])

except Exception as e:

if e.reason != "Not Found":

Exception(f"Deleting backup failed: {e}")

def cleanup_backups(self):

# Use k8s api to delete all backup especially backup in error state

# Because backup in error state does not have backup volume

backups = self.obj_api.list_namespaced_custom_object("longhorn.io",

"v1beta2",

"longhorn-system",

"backups")

for backup in backups['items']:

logging(f"Deleting backup {backup['metadata']['name']}")

try:

self.obj_api.delete_namespaced_custom_object("longhorn.io",

"v1beta2",

"longhorn-system",

"backups",

backup['metadata']['name'])

except Exception as e:

if e.reason != "Not Found":

raise Exception(

f"Failed to delete backup {backup['metadata']['name']}: {e}"

)

coderabbitai · 2025-01-16T00:09:27Z

e2e/libs/k8s/k8s.py

                return

        logging(f"Waiting for job with label {job_label} complete, retry ({i}) ...")
        time.sleep(retry_interval)

-    assert False, 'Job not complete'
+    assert False, f"Job not complete: {get_pod_logs(namespace, job_label)}"


⚠️ Potential issue

Replace assert False with proper exception handling.

Similar to the previous file, assertions should be replaced with proper exceptions.

Apply this diff to improve error handling:

- assert False, f"Job not complete: {get_pod_logs(namespace, job_label)}" + raise RuntimeError(f"Job not complete: {get_pod_logs(namespace, job_label)}")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert False, f"Job not complete: {get_pod_logs(namespace, job_label)}"

raise RuntimeError(f"Job not complete: {get_pod_logs(namespace, job_label)}")

🧰 Tools

🪛 Ruff (0.8.2)

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

…ases Signed-off-by: Yang Chiu <[email protected]>

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

e2e/libs/volume/crd.py (1)
249-259: 🛠️ Refactor suggestion

Replace assert False with raise AssertionError().

The implementation correctly implements a retry mechanism for waiting until a volume is created. However, the use of assert False should be replaced as it can be stripped by Python's -O flag.

Apply this diff to fix the issue:
-        assert False, f"Failed to wait for volume {volume_name} to be created"
+        raise AssertionError(f"Failed to wait for volume {volume_name} to be created")
🧰 Tools

🪛 Ruff (0.8.2)

258-258: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)
e2e/libs/k8s/k8s.py (1)
126-126: ⚠️ Potential issue

Replace assert False with proper exception handling.

Similar to the previous file, assertions should be replaced with proper exceptions.
-    assert False, f"Job not complete: {get_pod_logs(namespace, job_label)}"
+    raise RuntimeError(f"Job not complete: {get_pod_logs(namespace, job_label)}")
🧰 Tools

🪛 Ruff (0.8.2)

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🧹 Nitpick comments (3)

e2e/libs/system_backup/__init__.py (1)
1-1: Consider using absolute import path.

While the current relative import works, consider using an absolute import path for better clarity and to avoid potential circular import issues:
-from system_backup.system_backup import SystemBackup
+from longhorn_tests.e2e.libs.system_backup.system_backup import SystemBackup
🧰 Tools

🪛 Ruff (0.8.2)

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)
e2e/libs/backup/rest.py (2)
Line range hint 39-47: Enhance error message for lastBackup assertion

The assertion message could be more descriptive to help debug test failures.
-        assert volume.lastBackup == backup.name, \
-            f"expect volume lastBackup is {backup.name}, but it's {volume.lastBackup}"
+        assert volume.lastBackup == backup.name, \
+            f"Volume {volume_name}'s lastBackup expected to be '{backup.name}', but got '{volume.lastBackup}'"
221-223: LGTM: Added backup cleanup method

The new method provides a clean interface for backup cleanup operations, which is essential for testing uninstallation scenarios with backups.

Consider adding a docstring to describe the method's purpose and behavior.
     def cleanup_backups(self):
+        """
+        Cleans up all backup resources by delegating to the CRD implementation.
+        This is particularly useful for testing uninstallation scenarios.
+        """
         return CRD().cleanup_backups()

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb7fc95 and 9c7686c.

📒 Files selected for processing (27)

e2e/keywords/backup.resource (1 hunks)
e2e/keywords/common.resource (2 hunks)
e2e/keywords/system_backup.resource (1 hunks)
e2e/keywords/variables.resource (1 hunks)
e2e/keywords/volume.resource (6 hunks)
e2e/libs/backup/backup.py (2 hunks)
e2e/libs/backup/base.py (2 hunks)
e2e/libs/backup/crd.py (1 hunks)
e2e/libs/backup/rest.py (4 hunks)
e2e/libs/backupstore/base.py (3 hunks)
e2e/libs/k8s/k8s.py (3 hunks)
e2e/libs/keywords/backup_keywords.py (2 hunks)
e2e/libs/keywords/system_backup_keywords.py (1 hunks)
e2e/libs/keywords/volume_keywords.py (4 hunks)
e2e/libs/system_backup/__init__.py (1 hunks)
e2e/libs/system_backup/base.py (1 hunks)
e2e/libs/system_backup/crd.py (1 hunks)
e2e/libs/system_backup/rest.py (1 hunks)
e2e/libs/system_backup/system_backup.py (1 hunks)
e2e/libs/utility/constant.py (1 hunks)
e2e/libs/utility/utility.py (1 hunks)
e2e/libs/volume/base.py (2 hunks)
e2e/libs/volume/crd.py (4 hunks)
e2e/libs/volume/rest.py (2 hunks)
e2e/libs/volume/volume.py (2 hunks)
e2e/tests/regression/test_backup.robot (2 hunks)
e2e/tests/regression/test_system_backup.robot (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (18)

e2e/libs/utility/constant.py
e2e/keywords/variables.resource
e2e/keywords/system_backup.resource
e2e/libs/system_backup/base.py
e2e/keywords/common.resource
e2e/libs/utility/utility.py
e2e/libs/backup/base.py
e2e/libs/keywords/backup_keywords.py
e2e/libs/system_backup/crd.py
e2e/libs/volume/volume.py
e2e/libs/volume/rest.py
e2e/tests/regression/test_system_backup.robot
e2e/libs/backup/backup.py
e2e/libs/system_backup/system_backup.py
e2e/libs/keywords/system_backup_keywords.py
e2e/libs/keywords/volume_keywords.py
e2e/libs/volume/base.py
e2e/libs/backup/crd.py

🧰 Additional context used

🪛 Ruff (0.8.2)

e2e/libs/k8s/k8s.py

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/volume/crd.py

258-258: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/backupstore/base.py

129-129: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/system_backup/__init__.py

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)

e2e/libs/system_backup/rest.py

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: Build images
GitHub Check: Summary

🔇 Additional comments (20)

e2e/libs/system_backup/__init__.py (1)

1-1: Verify the SystemBackup class usage.

The import looks good for exposing the SystemBackup class at package level. Let's verify its usage in the codebase:

✅ Verification successful

SystemBackup class import and usage verified successfully

The class is properly exposed at package level and is actively used in the test automation framework, particularly in the keywords module and is well-documented.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that SystemBackup is imported and used in other modules
# Expected: Find imports and usage of SystemBackup in keywords and test files

echo "Checking imports of SystemBackup:"
rg "from.*system_backup.*import.*SystemBackup" -g '!system_backup/__init__.py'

echo -e "\nChecking usage of SystemBackup class:"
rg "SystemBackup\(" -g '!system_backup/__init__.py'
Length of output: 895
🧰 Tools

🪛 Ruff (0.8.2)

1-1: system_backup.system_backup.SystemBackup imported but unused

(F401)
e2e/libs/backup/rest.py (3)

4-4: LGTM: Import added for new cleanup functionality

The CRD import is appropriately placed and necessary for the new cleanup_backups method.

23-32: LGTM: Added async backup creation support

The addition of the wait parameter with early return logic enables asynchronous backup creation, which is useful for testing scenarios where immediate return is needed.

182-183: LGTM: Improved logging for backup volume deletion

Enhanced logging with iteration counter improves observability during backup volume deletion.

e2e/keywords/volume.resource (6)

98-99: LGTM!

The implementation correctly captures and stores the checksum for data integrity verification.

103-104: LGTM!

The implementation maintains consistency with the previous segment in handling checksums.

108-109: LGTM!

The implementation maintains consistency in checksum handling across all data writing keywords.

Also applies to: 113-114

128-131: LGTM!

The new keyword enhances the test framework by ensuring volume creation is complete before proceeding.

148-151: LGTM!

The new keyword enhances the test framework by ensuring volume deletion is complete before proceeding.

217-220: LGTM!

The renaming improves readability and follows better naming conventions.

Also applies to: 408-408

e2e/libs/volume/crd.py (3)

379-407: LGTM!

The implementation correctly verifies backup restoration by checking:

Engine's last restored backup matches the expected backup

Volume's last backup is updated

Volume's restore required status is set to False (for non-standby volumes)

455-457: LGTM!

The addition of the state check prevents potential race conditions by ensuring the volume is in "attached" state before writing data.

475-477: LGTM!

The implementation maintains consistency with the previous segment by adding the same state check.

e2e/keywords/backup.resource (1)

12-14: LGTM! Clean implementation of async backup creation.

The new keyword follows the same pattern as the existing synchronous backup creation, with the only difference being the wait=False parameter. This maintains consistency while adding the needed asynchronous capability.
e2e/libs/system_backup/rest.py (2)

46-80: Consolidate duplicate cleanup logic.

The cleanup methods for system backups and restores share similar logic. Consider extracting the common cleanup logic into a reusable method.
+    def _cleanup_resources(self, resource_type, list_func, resource_name_key='name'):
+        """Generic cleanup function for system resources"""
+        resources = list_func()
+        for resource in resources:
+            name = resource[resource_name_key]
+            logging(f"Deleting {resource_type} {name}")
+            try:
+                get_longhorn_client().delete(resource)
+            except Exception as e:
+                logging(f"Deleting {resource_type} {name} failed: {e}")
+
+        deleted = False
+        for _ in range(self.retry_count):
+            resources = list_func()
+            if len(resources) == 0:
+                deleted = True
+                break
+            time.sleep(self.retry_interval)
+        assert deleted, f"Cleaning up {resource_type}s failed: {resources}"
+
     def cleanup_system_backups(self):
-        system_backups = get_longhorn_client().list_system_backup()
-        for system_backup in system_backups:
-            logging(f"Deleting system backup {system_backup['name']}")
-            try:
-                get_longhorn_client().delete(system_backup)
-            except Exception as e:
-                logging(f"Deleting system backup {system_backup['name']} failed: {e}")
-
-        deleted = False
-        for i in range(self.retry_count):
-            system_backups = get_longhorn_client().list_system_backup()
-            if len(system_backups) == 0:
-                deleted = True
-                break
-            time.sleep(self.retry_interval)
-        assert deleted, f"Cleaning up system backups failed: {system_backups}"
+        self._cleanup_resources('system backup', get_longhorn_client().list_system_backup)

     def cleanup_system_restores(self):
-        system_restores = get_longhorn_client().list_system_restore()
-        for system_restore in system_restores:
-            logging(f"Deleting system restore {system_restore['name']}")
-            try:
-                get_longhorn_client().delete(system_restore)
-            except Exception as e:
-                logging(f"Deleting system restore {system_restore['name']} failed: {e}")
-
-        deleted = False
-        for i in range(self.retry_count):
-            system_restores = get_longhorn_client().list_system_restore()
-            if len(system_restores) == 0:
-                deleted = True
-                break
-            time.sleep(self.retry_interval)
-        assert deleted, f"Cleaning up system restores failed: {system_restores}"
+        self._cleanup_resources('system restore', get_longhorn_client().list_system_restore)
🧰 Tools

🪛 Ruff (0.8.2)

56-56: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

74-74: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

29-44: 🛠️ Refactor suggestion

Add error state handling in restore method.

The restore method should handle error states similar to the create method. Consider checking for error/failed states in the system restore.
 def restore(self, backup_name):
     restore_name = generate_name_random("system-restore-")
     logging(f"Creating system restore {restore_name} from system backup {backup_name}")
     get_longhorn_client().create_system_restore(Name=restore_name, SystemBackup=backup_name)
     completed = False
     for i in range(self.retry_count):
         logging(f"Waiting for system restore {restore_name} to be completed ... ({i})")
         try:
             system_restore = get_longhorn_client().by_id_system_restore(restore_name)
+            if system_restore.state in ["Error", "Failed"]:
+                raise RuntimeError(f"System restore {restore_name} failed: {system_restore.error}")
             if system_restore.state == "Completed":
                 completed = True
                 break
Likely invalid or redundant comment.
e2e/tests/regression/test_backup.robot (2)

124-124: Fix spelling errors in comment.

Correct the spelling errors in the comment:

"bakups" should be "backups"

"avaiable" should be "available"

111-135: LGTM! Comprehensive test coverage for uninstallation with backups.

The test case effectively covers:

Both v1 and v2 data engine backups

Failed backup scenarios

Proper cleanup and reinstallation
e2e/libs/backupstore/base.py (1)

118-129: ⚠️ Potential issue

Replace assert False with proper exception handling.

Using assert False is not recommended as assertions can be disabled with Python's -O flag.
-        assert False, f"Failed to update backupstore to {url}${credential_secret}"
+        raise RuntimeError(f"Failed to update backupstore to {url}${credential_secret}")
Likely invalid or redundant comment.

🧰 Tools

🪛 Ruff (0.8.2)

129-129: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)
e2e/libs/k8s/k8s.py (1)

Line range hint 106-120: LGTM! Improved job completion monitoring.

The changes enhance debugging capabilities by:

Using appropriate timeout for uninstallation

Adding detailed logging of job completion status

🧰 Tools

🪛 Ruff (0.8.2)

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

yangchiu self-assigned this Jan 15, 2025

yangchiu requested a review from a team as a code owner January 15, 2025 23:52

coderabbitai bot reviewed Jan 15, 2025

View reviewed changes

yangchiu force-pushed the robot-system-backup branch from 76cbb85 to eb7fc95 Compare January 16, 2025 00:06

coderabbitai bot reviewed Jan 16, 2025

View reviewed changes

yangchiu force-pushed the robot-system-backup branch from eb7fc95 to c221dc7 Compare January 16, 2025 00:09

test(robot): add uninstallation with backups and system backup test c…

9c7686c

…ases Signed-off-by: Yang Chiu <[email protected]>

yangchiu force-pushed the robot-system-backup branch from c221dc7 to 9c7686c Compare January 16, 2025 02:06

coderabbitai bot reviewed Jan 16, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(robot): add uninstallation with backups and system backup test cases #2274

test(robot): add uninstallation with backups and system backup test cases #2274

yangchiu commented Jan 15, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 15, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Jan 15, 2025

coderabbitai bot Jan 15, 2025

coderabbitai bot Jan 15, 2025

coderabbitai bot Jan 15, 2025

coderabbitai bot Jan 15, 2025

coderabbitai bot Jan 15, 2025

coderabbitai bot left a comment

coderabbitai bot Jan 16, 2025

coderabbitai bot Jan 16, 2025

coderabbitai bot Jan 16, 2025

coderabbitai bot Jan 16, 2025

coderabbitai bot left a comment

		def wait_for_volume_to_be_created(self, volume_name):
		return NotImplemented

	assert False, f"Job not complete: {get_pod_logs(namespace, job_label)}"
	raise RuntimeError(f"Job not complete: {get_pod_logs(namespace, job_label)}")

test(robot): add uninstallation with backups and system backup test cases #2274

Are you sure you want to change the base?

test(robot): add uninstallation with backups and system backup test cases #2274

Conversation

yangchiu commented Jan 15, 2025 • edited by coderabbitai bot Loading

Which issue(s) this PR fixes:

What this PR does / why we need it:

Special notes for your reviewer:

Additional documentation or context

Summary by CodeRabbit

coderabbitai bot commented Jan 15, 2025 • edited Loading

Walkthrough

Changes

Assessment against linked issues

Possibly related PRs

Suggested reviewers

Poem

Finishing Touches

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 15, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

coderabbitai bot Jan 16, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

yangchiu commented Jan 15, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 15, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)