Skip to content

.pr_agent_auto_best_practices

root edited this page Mar 27, 2025 · 11 revisions

Pattern 1: Add null safety checks and error handling for potential None values or missing attributes before performing operations on them to prevent runtime errors.

Example code before:

response = response.strip()
score_int = int(suggestion['score'])

Example code after:

if response:
    response = response.strip()
score_int = int(suggestion.get('score', 0))
Relevant past accepted suggestions:
Suggestion 1:

Initialize auth attribute properly

The method doesn't initialize self.auth when neither 'app' nor 'user' deployment types are used, but then tries to check if self.auth exists. This could lead to an AttributeError when an unsupported deployment type is provided.

pr_agent/git_providers/github_provider.py [759-783]

 def _get_github_client(self):
     self.deployment_type = get_settings().get("GITHUB.DEPLOYMENT_TYPE", "user")
+    self.auth = None
     if self.deployment_type == 'app':
         try:
             private_key = get_settings().github.private_key
             app_id = get_settings().github.app_id
         except AttributeError as e:
             raise ValueError("GitHub app credentials are required when using GitHub app deployment") from e
         if not hasattr(self, 'installation_id'):
             raise ValueError("GitHub app installation ID is required when using GitHub app deployment")
-        auth = AppAuthentication(app_id=app_id, private_key=private_key,
+        self.auth = AppAuthentication(app_id=app_id, private_key=private_key,
                                  installation_id=self.installation_id)
-        self.auth = auth
     elif self.deployment_type == 'user':
         try:
             token = get_settings().github.user_token
         except AttributeError as e:
             raise ValueError(
                 "GitHub token is required when using user deployment. See: "
                 "https://github.com/Codium-ai/pr-agent#method-2-run-from-source") from e
         self.auth = Auth.Token(token)
     if self.auth:
         return Github(auth=self.auth, base_url=self.base_url)
     else:
         raise ValueError("Could not authenticate to GitHub")

Suggestion 2:

Handle missing score_why values safely

Ensure that suggestion['score_why'] is always defined and non-empty to prevent potential runtime errors when accessing it in the new code.

pr_agent/tools/pr_code_suggestions.py [798]

-pr_body += f"__\n\nWhy: {suggestion['score_why']}\n\n"
+pr_body += f"__\n\nWhy: {suggestion.get('score_why', 'No explanation provided')}\n\n"

Suggestion 3:

Handle missing or invalid score values

Add a fallback mechanism for suggestion['score'] to handle cases where it might be missing or not convertible to an integer.

pr_agent/tools/pr_code_suggestions.py [804]

-score_int = int(suggestion['score'])
+score_int = int(suggestion.get('score', 0))

Suggestion 4:

Handle missing score values safely

Handle potential None or non-integer score values before conversion to prevent runtime errors

pr_agent/tools/pr_code_suggestions.py [804-808]

-score_int = int(suggestion['score'])
-score_str = f"{score_int}"
-if get_settings().pr_code_suggestions.new_score_mechanism:
-    score_str = self.get_score_str(score_int)
+score = suggestion.get('score')
+if score is not None:
+    score_int = int(score)
+    score_str = f"{score_int}"
+    if get_settings().pr_code_suggestions.new_score_mechanism:
+        score_str = self.get_score_str(score_int)
+else:
+    score_str = "N/A"

Suggestion 5:

Add null safety check

Add a null check for user_message_only_models before using it in the any() function to prevent potential NoneType errors if the setting is not defined.

pr_agent/algo/ai_handlers/litellm_ai_handler.py [201-202]

-user_message_only_models = get_settings().config.user_message_only_models
+user_message_only_models = get_settings().config.user_message_only_models or []
 if user_message_only_models and any(entry in model for entry in user_message_only_models):

Suggestion 6:

Add null check before string operations to prevent potential runtime errors

Add error handling for the case when response is empty or None before attempting string operations. The current implementation could raise AttributeError if response is None.

pr_agent/tools/pr_update_changelog.py [115-117]

+if not response:
+    return ""
 response = response.strip()
 if response.startswith("```"):
     response_lines = response.splitlines()

Suggestion 7:

Add graceful fallback when configuration settings are missing

Add error handling for the case when the model configuration is missing. Currently, if the config doesn't have the required model setting, it will raise an unhandled attribute error.

pr_agent/algo/pr_processing.py [357-360]

 if model_type == ModelType.WEAK:
-    model = get_settings().config.model_week
+    model = getattr(get_settings().config, 'model_week', get_settings().config.model)
 else:
     model = get_settings().config.model

Suggestion 8:

Add error handling for file operations to improve robustness

Consider adding error handling when opening and reading files to gracefully handle potential IOErrors or other exceptions that may occur during file operations.

pr_agent/tools/pr_help_message.py [102-105]

 for file in md_files:
-    with open(file, 'r') as f:
-        file_path = str(file).replace(str(docs_path), '')
-        docs_prompt += f"==file name:==\n\n{file_path}\n\n==file content:==\n\n{f.read().strip()}\n=========\n\n"
+    try:
+        with open(file, 'r') as f:
+            file_path = str(file).relative_to(docs_path)
+            docs_prompt += f"==file name:==\n\n{file_path}\n\n==file content:==\n\n{f.read().strip()}\n=========\n\n"
+    except IOError as e:
+        get_logger().error(f"Error reading file {file}: {e}")

Suggestion 9:

Add input validation to prevent potential runtime errors caused by invalid input

Consider adding input validation for the repo_path parameter to ensure it's not empty or None before using it in the f-string, preventing potential runtime errors.

pr_agent/tools/ticket_pr_compliance_check.py [53-54]

-if issue_number.isdigit() and len(issue_number) < 5:
+if issue_number.isdigit() and len(issue_number) < 5 and repo_path:
     github_tickets.add(f'https://github.com/{repo_path}/issues/{issue_number}')

Pattern 2: Use consistent and descriptive variable/function names that clearly indicate their purpose, and follow naming conventions throughout the codebase.

Example code before:

skip_draft_mr = get_settings().get("GITLAB.SKIP_DRAFT_MR", False)
def should_skip_mr(draft: bool, mr_url: str):

Example code after:

should_skip_draft_mrs = get_settings().get("GITLAB.SKIP_DRAFT_MR", False)
def should_skip_draft_mr(draft: bool, mr_url: str):
Relevant past accepted suggestions:
Suggestion 1:

Use a more descriptive variable name to improve code readability

Consider using a more descriptive variable name for skip_draft_mr. A name like should_skip_draft_mrs would make the boolean's purpose clearer.

pr_agent/servers/gitlab_webhook.py [127]

-skip_draft_mr = get_settings().get("GITLAB.SKIP_DRAFT_MR", False)
+should_skip_draft_mrs = get_settings().get("GITLAB.SKIP_DRAFT_MR", False)
 

Suggestion 2:

Rename function to better reflect its specific purpose

Consider using a more descriptive name for the should_skip_mr function, such as should_skip_draft_mr, to better reflect its specific purpose of checking if a draft merge request should be skipped.

pr_agent/servers/gitlab_webhook.py [78-83]

-def should_skip_mr(draft: bool, mr_url: str):
+def should_skip_draft_mr(draft: bool, mr_url: str):
     skip_draft_mr = get_settings().get("GITLAB.SKIP_DRAFT_MR", False)
     if draft and skip_draft_mr:
         get_logger().info(f"Skipping draft MR: {mr_url}")
         return True
     return False
 

Suggestion 3:

Rename configuration option for clarity

Consider using a more specific name for the skip_types configuration option to better reflect its purpose in the context of patch extension.

pr_agent/settings/configuration.toml [23]

-skip_types =[".md",".txt"]
+patch_extension_skip_types = [".md", ".txt"]
 

Pattern 3: Extract repeated code patterns into separate functions to reduce duplication and improve maintainability.

Example code before:

if PATCH_EXTRA_LINES_BEFORE > MAX_EXTRA_LINES:
    PATCH_EXTRA_LINES_BEFORE = MAX_EXTRA_LINES
    get_logger().warning(f"patch_extra_lines_before was {PATCH_EXTRA_LINES_BEFORE}, capping to {MAX_EXTRA_LINES}")
if PATCH_EXTRA_LINES_AFTER > MAX_EXTRA_LINES:
    PATCH_EXTRA_LINES_AFTER = MAX_EXTRA_LINES
    get_logger().warning(f"patch_extra_lines_after was {PATCH_EXTRA_LINES_AFTER}, capping to {MAX_EXTRA_LINES}")

Example code after:

def cap_and_log_extra_lines(value, direction):
    if value > MAX_EXTRA_LINES:
        get_logger().warning(f"patch_extra_lines_{direction} was {value}, capping to {MAX_EXTRA_LINES}")
        return MAX_EXTRA_LINES
    return value

PATCH_EXTRA_LINES_BEFORE = cap_and_log_extra_lines(PATCH_EXTRA_LINES_BEFORE, "before")
PATCH_EXTRA_LINES_AFTER = cap_and_log_extra_lines(PATCH_EXTRA_LINES_AFTER, "after")
Relevant past accepted suggestions:
Suggestion 1:

Extract capping and logging logic into a separate function

Consider extracting the logic for capping and logging the extra lines into a separate function to reduce code duplication.

pr_agent/algo/pr_processing.py [41-46]

-if PATCH_EXTRA_LINES_BEFORE > MAX_EXTRA_LINES:
-    PATCH_EXTRA_LINES_BEFORE = MAX_EXTRA_LINES
-    get_logger().warning(f"patch_extra_lines_before was {PATCH_EXTRA_LINES_BEFORE}, capping to {MAX_EXTRA_LINES}")
-if PATCH_EXTRA_LINES_AFTER > MAX_EXTRA_LINES:
-    PATCH_EXTRA_LINES_AFTER = MAX_EXTRA_LINES
-    get_logger().warning(f"patch_extra_lines_after was {PATCH_EXTRA_LINES_AFTER}, capping to {MAX_EXTRA_LINES}")
+PATCH_EXTRA_LINES_BEFORE = cap_and_log_extra_lines(PATCH_EXTRA_LINES_BEFORE, "before")
+PATCH_EXTRA_LINES_AFTER = cap_and_log_extra_lines(PATCH_EXTRA_LINES_AFTER, "after")
 
+def cap_and_log_extra_lines(value, direction):
+    if value > MAX_EXTRA_LINES:
+        get_logger().warning(f"patch_extra_lines_{direction} was {value}, capping to {MAX_EXTRA_LINES}")
+        return MAX_EXTRA_LINES
+    return value
+

Suggestion 2:

Extract draft MR skipping logic into a separate function to improve code organization

Consider extracting the logic for checking if a merge request is a draft and should be skipped into a separate function. This will improve code readability and reduce duplication.

pr_agent/servers/gitlab_webhook.py [133-135]

-if draft and skip_draft_mr:
-    get_logger().info(f"Skipping draft MR: {url}")
-    return JSONResponse(status_code=status.HTTP_200_OK, content=jsonable_encoder({"message": "success"}))
+if should_skip_draft_mr(draft, skip_draft_mr, url):
+    return SUCCESS_RESPONSE
 

Suggestion 3:

Extract repeated code into a separate function to reduce duplication

Consider extracting the repeated JSONResponse creation into a separate function to reduce code duplication and improve maintainability.

pr_agent/servers/gitlab_webhook.py [139-143]

+def create_success_response():
+    return JSONResponse(status_code=status.HTTP_200_OK, content=jsonable_encoder({"message": "success"}))
+
 if should_skip_mr(draft, url):
-    return JSONResponse(status_code=status.HTTP_200_OK, content=jsonable_encoder({"message": "success"}))
+    return create_success_response()
 
 get_logger().info(f"New merge request: {url}")
 await _perform_commands_gitlab("pr_commands", PRAgent(), url, log_context)
 

Pattern 4: Use more efficient data structures like dictionaries or sets instead of lists when performing frequent lookups or when dealing with unique items.

Example code before:

files_to_exclude = ['EXAMPLE_BEST_PRACTICE.md', 'compression_strategy.md', '/docs/overview/index.md']
if file_name in files_to_exclude:
    continue

Example code after:

files_to_exclude = {'EXAMPLE_BEST_PRACTICE.md', 'compression_strategy.md', '/docs/overview/index.md'}
if file_name in files_to_exclude:
    continue
Relevant past accepted suggestions:
Suggestion 1:

Use a set for faster lookup of excluded files

Consider using a set instead of a list for files_to_exclude to improve lookup performance, especially if the list of excluded files grows larger.

pr_agent/tools/pr_help_message.py [90]

-files_to_exclude = ['EXAMPLE_BEST_PRACTICE.md', 'compression_strategy.md', '/docs/overview/index.md']
+files_to_exclude = {'EXAMPLE_BEST_PRACTICE.md', 'compression_strategy.md', '/docs/overview/index.md'}

Suggestion 2:

Use a set instead of a list to handle duplicates and improve performance

Consider using a set instead of a list for github_tickets to automatically handle duplicates and improve performance when checking for existing entries.

pr_agent/tools/ticket_pr_compliance_check.py [35-46]

-github_tickets = []
+github_tickets = set()
 try:
     # Pattern to match full GitHub issue URLs and shorthand notations like owner/repo#issue_number or https://github.com/owner/repo/issues/issue_number
     pattern = r'(https://github[^/]+/[^/]+/[^/]+/issues/\d+)|(\b(\w+)/(\w+)#(\d+)\b)'
 
     matches = re.findall(pattern, pr_description)
     for match in matches:
         if match[0]:  # Full URL match
-            github_tickets.append(match[0])
+            github_tickets.add(match[0])
         else:  # Shorthand notation match
             owner, repo, issue_number = match[2], match[3], match[4]
-            github_tickets.append(f'https://github.com/{owner}/{repo}/issues/{issue_number}')
+            github_tickets.add(f'https://github.com/{owner}/{repo}/issues/{issue_number}')

Suggestion 3:

Refactor the mock method to use a dictionary for improved efficiency and maintainability

Consider using a more efficient data structure, such as a dictionary, for the mock_get_content_of_file method. This would improve the readability and maintainability of the code, especially as the number of test cases grows.

tests/unittest/test_bitbucket_provider.py [25-40]

 def mock_get_content_of_file(self, project_key, repository_slug, filename, at=None, markup=None):
-    if at == '9c1cffdd9f276074bfb6fb3b70fbee62d298b058':
-        return 'file\nwith\nsome\nlines\nto\nemulate\na\nreal\nfile\n'
-    elif at == '2a1165446bdf991caf114d01f7c88d84ae7399cf':
-        return 'file\nwith\nmultiple \nlines\nto\nemulate\na\nfake\nfile\n'
-    elif at == 'f617708826cdd0b40abb5245eda71630192a17e3':
-        return 'file\nwith\nmultiple \nlines\nto\nemulate\na\nreal\nfile\n'
-    elif at == 'cb68a3027d6dda065a7692ebf2c90bed1bcdec28':
-        return 'file\nwith\nsome\nchanges\nto\nemulate\na\nreal\nfile\n'
-    elif at == '1905dcf16c0aac6ac24f7ab617ad09c73dc1d23b':
-        return 'file\nwith\nsome\nlines\nto\nemulate\na\nfake\ntest\n'
-    elif at == 'ae4eca7f222c96d396927d48ab7538e2ee13ca63':
-        return 'readme\nwithout\nsome\nlines\nto\nsimulate\na\nreal\nfile'
-    elif at == '548f8ba15abc30875a082156314426806c3f4d97':
-        return 'file\nwith\nsome\nlines\nto\nemulate\na\nreal\nfile'
-    return ''
+    content_map = {
+        '9c1cffdd9f276074bfb6fb3b70fbee62d298b058': 'file\nwith\nsome\nlines\nto\nemulate\na\nreal\nfile\n',
+        '2a1165446bdf991caf114d01f7c88d84ae7399cf': 'file\nwith\nmultiple \nlines\nto\nemulate\na\nfake\nfile\n',
+        'f617708826cdd0b40abb5245eda71630192a17e3': 'file\nwith\nmultiple \nlines\nto\nemulate\na\nreal\nfile\n',
+        'cb68a3027d6dda065a7692ebf2c90bed1bcdec28': 'file\nwith\nsome\nchanges\nto\nemulate\na\nreal\nfile\n',
+        '1905dcf16c0aac6ac24f7ab617ad09c73dc1d23b': 'file\nwith\nsome\nlines\nto\nemulate\na\nfake\ntest\n',
+        'ae4eca7f222c96d396927d48ab7538e2ee13ca63': 'readme\nwithout\nsome\nlines\nto\nsimulate\na\nreal\nfile',
+        '548f8ba15abc30875a082156314426806c3f4d97': 'file\nwith\nsome\nlines\nto\nemulate\na\nreal\nfile'
+    }
+    return content_map.get(at, '')
 

Pattern 5: Fix incorrect boolean logic or conditional expressions that could lead to unexpected behavior or bugs.

Example code before:

if data['changes']['title']['previous'].find('Draft:') != -1 and not data['changes']['title']['current'].find('Draft:') != -1:
    return True

Example code after:

if data['changes']['title']['previous'].find('Draft:') != -1 and data['changes']['title']['current'].find('Draft:') == -1:
    return True
Relevant past accepted suggestions:
Suggestion 1:

Fix incorrect draft-ready state detection

The condition in the draft-to-ready check for older GitLab versions is incorrect. The current logic with double negation is confusing and may lead to false positives.

pr_agent/servers/gitlab_webhook.py [94-95]

-if data['changes']['title']['previous'].find('Draft:') != -1 and not data['changes']['title']['current'].find('Draft:') != -1:
+if data['changes']['title']['previous'].find('Draft:') != -1 and data['changes']['title']['current'].find('Draft:') == -1:
     return True

Suggestion 2:

Fix incorrect temperature handling logic

The condition for handling temperature settings should be inverted - if a model doesn't support temperature or is a custom reasoning model, temperature should be excluded, not included.

pr_agent/algo/ai_handlers/litellm_ai_handler.py [230-231]

-if model not in self.no_support_temperature_models or get_settings().config.custom_reasoning_model:
+if model not in self.no_support_temperature_models and not get_settings().config.custom_reasoning_model:
     kwargs["temperature"] = temperature

Suggestion 3:

Simplify the boolean logic in the function to improve readability

The should_skip_draft_mr function could be simplified by returning the boolean expression directly, without using an if-else statement.

pr_agent/servers/gitlab_webhook.py [83-88]

 def should_skip_draft_mr(draft: bool, mr_url: str):
     skip_draft_mr = get_settings().get("GITLAB.SKIP_DRAFT_MR", False)
-    if draft and skip_draft_mr:
+    should_skip = draft and skip_draft_mr
+    if should_skip:
         get_logger().info(f"Skipping draft MR: {mr_url}")
-        return True
-    return False
+    return should_skip
 

[Auto-generated best practices - 2025-03-27]

Clone this wiki locally