3205 reparse command refactor #3361

Merged
merged 24 commits into from
Feb 19, 2025

@raftmsohani raftmsohani commented Dec 11, 2024

Summary of Changes

Pull request closes #3205

Moved all the utility functions to utils.py, moved the frontend reparse command into a separate function, and cleaned up the clean_and_reparse management command.

How to Test

Follow the steps below:

task up
  1. Open http://localhost:3000/ and sign in.
  2. Go to the admin page.
  3. Make sure you have the OFA admin role (required to see the reparse command in admin) and go to the datafiles list.
  4. Run the reparse admin command and watch the logs.

Deliverables

More details on how the deliverables herein are assessed are included here.

Deliverable 1: Accepted Features

Checklist of ACs:

  • reparsing from DAC consistent with actions before/after moving utility functions
  • lfrohlich and/or adpennington confirmed that ACs are met.

Deliverable 2: Tested Code

  • Are all areas of code introduced in this PR meaningfully tested?
    • If this PR introduces backend code changes, are they meaningfully tested?
    • If this PR introduces frontend code changes, are they meaningfully tested?
  • Are code coverage minimums met?
    • Frontend coverage: [insert coverage %] (see CodeCov Report comment in PR)
    • Backend coverage: [insert coverage %] (see CodeCov Report comment in PR)

Deliverable 3: Properly Styled Code

  • Are backend code style checks passing on CircleCI?
  • Are frontend code style checks passing on CircleCI?
  • Are code maintainability principles being followed?

Deliverable 4: Accessible

  • Does this PR complete the epic?
  • Are links included to any other gov-approved PRs associated with epic?
  • Does PR include documentation for Raft's a11y review?
  • Did automated and manual testing with iamjolly and ttran-hub using Accessibility Insights reveal any errors introduced in this PR?

Deliverable 5: Deployed

  • Was the code successfully deployed via automated CircleCI process to development on Cloud.gov?

Deliverable 6: Documented

  • Does this PR provide background for why coding decisions were made?
  • If this PR introduces backend code, is that code easy to understand and sufficiently documented, both inline and overall?
  • If this PR introduces frontend code, is that code easy to understand and sufficiently documented, both inline and overall?
  • If this PR introduces dependencies, are their licenses documented?
  • Can reviewer explain and take ownership of these elements presented in this code review?

Deliverable 7: Secure

  • Does the OWASP Scan pass on CircleCI?
  • Do manual code review and manual testing detect any new security issues?
  • If new issues detected, is investigation and/or remediation plan documented?

Deliverable 8: User Research

Research product(s) clearly articulate(s):

  • the purpose of the research
  • methods used to conduct the research
  • who participated in the research
  • what was tested and how
  • impact of research on TDP
  • (if applicable) final design mockups produced for TDP development

@raftmsohani raftmsohani self-assigned this Dec 13, 2024
@raftmsohani raftmsohani added the raft review This issue is ready for raft review label Dec 19, 2024
@@ -1,6 +1,4 @@
# Base Docker compose for all environments
version: "3.4"
Author

This is deprecated.

@jtimpe jtimpe left a comment

Working as expected! One minor organization comment

delete_associated_models,
count_total_num_records,
calculate_timeout,
handle_datafiles,

Does it make sense to just have all these functions in the reparse.py file?

I like the idea of that as well

Author

The idea was to keep the reparse file cleaner while also letting us reuse some of these functions if needed. However, it might make sense to move some of them into reparse; for example, handle_datafiles is specific to reparse.

)

is_sequential = assert_sequential_execution(log_context)
should_exit(not is_sequential)

Should we update this to raise an exception/return an error to the user so that they know why the reparse didn't happen? Even writing to the console would be a start.

Also, can we move this to the very beginning of the function to avoid unnecessary computation if we aren't sequential?
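
The two suggestions above (report the failure to the user, and check before doing any other work) could be sketched roughly as follows. This is only an illustration: `SequentialExecutionError`, the `another_reparse_running` flag, and the simplified `assert_sequential_execution` body are assumptions standing in for whatever the project actually uses.

```python
class SequentialExecutionError(Exception):
    """Raised when a reparse is requested while another one is still running."""


def assert_sequential_execution(log_context, another_reparse_running):
    # Stand-in for the real check; the boolean flag replaces the project's own lookup.
    return not another_reparse_running


def reparse(selected_file_ids, another_reparse_running=False):
    log_context = {"file_ids": selected_file_ids}
    # Guard first, so no counting or timeout work happens for a rejected request.
    if not assert_sequential_execution(log_context, another_reparse_running):
        raise SequentialExecutionError(
            f"Sequential execution required for selected file ids: {selected_file_ids}"
        )
    # ... the expensive work (counting records, computing a timeout) would start here ...
    return f"reparsing {len(selected_file_ids)} file(s)"
```

Raising instead of exiting silently lets the caller (an admin action, for instance) surface the message to the user rather than leaving them to guess why nothing happened.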

fiscal_quarter = None
fiscal_year = None
all_reparse = False
new_indices = False

In a future ticket, could we deduce these fields from the selected datafiles?

Author

I agree; I will leave a TODO.
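
For the future ticket, one possible shape for deducing these fields is sketched below. `SelectedFile` and its `year` and `quarter` attributes are hypothetical stand-ins for the real datafile records; the rule that a field stays `None` when the selected files disagree is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectedFile:
    # Hypothetical stand-in for a datafile record.
    year: int
    quarter: str


def deduce_fiscal_period(files):
    """Return (fiscal_year, fiscal_quarter); a field is None when the files disagree."""
    years = {f.year for f in files}
    quarters = {f.quarter for f in files}
    # Only report a value when every selected file shares it.
    fiscal_year = years.pop() if len(years) == 1 else None
    fiscal_quarter = quarters.pop() if len(quarters) == 1 else None
    return fiscal_year, fiscal_quarter
```

With files from the same year but different quarters, this would return the year and leave the quarter as `None`.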

codecov bot commented Dec 26, 2024

Codecov Report

Attention: Patch coverage is 60.14235% with 112 lines in your changes missing coverage. Please review.

Project coverage is 90.66%. Comparing base (f8353dd) to head (df2230b).
Report is 2 commits behind head on develop.

Files with missing lines                                Patch %   Lines
tdrs-backend/tdpservice/search_indexes/reparse.py       30.48%    57 Missing ⚠️
tdrs-backend/tdpservice/search_indexes/utils.py         68.15%    45 Missing and 5 partials ⚠️
...h_indexes/management/commands/clean_and_reparse.py   90.00%    4 Missing ⚠️
tdrs-backend/tdpservice/data_files/tasks.py             50.00%    1 Missing ⚠️

@@             Coverage Diff             @@
##           develop    #3361      +/-   ##
===========================================
- Coverage    91.15%   90.66%   -0.50%     
===========================================
  Files          308      310       +2     
  Lines         8832     8899      +67     
  Branches       670      674       +4     
===========================================
+ Hits          8051     8068      +17     
- Misses         654      704      +50     
  Partials       127      127              
Flag           Coverage Δ
dev-backend    90.44% <60.14%> (-0.57%) ⬇️
dev-frontend   92.18% <ø> (ø)

Files with missing lines                                Coverage Δ
tdrs-backend/tdpservice/data_files/tasks.py             73.91% <50.00%> (ø)
...h_indexes/management/commands/clean_and_reparse.py   87.83% <90.00%> (+14.66%) ⬆️
tdrs-backend/tdpservice/search_indexes/utils.py         68.15% <68.15%> (ø)
tdrs-backend/tdpservice/search_indexes/reparse.py       30.48% <30.48%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2bdc38...df2230b. Read the comment docs.

return True


def should_exit(condition):
Author

This should not be needed after the management command is removed.

@raftmsohani raftmsohani requested a review from elipe17 January 2, 2025 21:17
@raftmsohani raftmsohani requested review from ADPennington and removed request for elipe17 January 7, 2025 16:39
@raftmsohani
Author

backup filename is not associated with reparse model id. some concerns about backups being overwritten.

@ADPennington Fixed the association with the meta model id and also added a datetime at the end of the filename, which gives the backup a timestamp. I can remove the datetime if preferred.

@raftmsohani
Author

@elipe17 @andrew-jameson @jtimpe: I re-requested your review because I changed how the backup filename is assigned. Previously, some backup filenames did not get the meta model id, so None was added to the filename instead.
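
A minimal sketch of how a `None` can end up in a filename, and one way to guard against it, is shown below. The helper `backup_name_for` is hypothetical and not taken from the PR; only the `_rpv` suffix mirrors the diff under review.

```python
def backup_name_for(base_name, meta_pk):
    """Append the reparse meta id to a backup name, refusing an unset id."""
    if meta_pk is None:
        # An unsaved model has no primary key yet; formatting it would
        # silently produce the literal text "None" in the filename.
        raise ValueError("meta model has no primary key yet; save it before naming the backup")
    return f"{base_name}_rpv{meta_pk}"


# Without the guard, an f-string happily formats None into the name:
unguarded = f"backup_rpv{None}"
```

Calling the helper only after the model has been saved keeps each backup tied to exactly one reparse meta record.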

raise Exception(f"Sequential execution required for selected file ids: {selected_file_ids}")
meta_model.save()
# Backup the Postgres DB
backup_file_name += f"_rpv{meta_model.pk}_{datetime.datetime.now().strftime('%d-%M-%Y-%H%M%s')}.pg"

There is an issue with the strftime format. I believe it is using minutes for both the minutes and the month. Also, can you separate the hours, minutes, and seconds with colons?

Author

Fixed the minutes/month issue. Colons can't be used because this string becomes part of the filename, so I separated the fields with '-' instead.
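
For reference, the difference between the format codes can be sketched as below. The exact format string that was finally merged is not shown in this thread, so the pattern here is an assumption; `backup_timestamp` is a hypothetical helper.

```python
import datetime


def backup_timestamp(now):
    # %m is the zero-padded month; %M is minutes. %S is seconds.
    # Lowercase %s is not among Python's documented directives and is platform-dependent.
    # Hyphens are used instead of colons so the result is safe inside a filename.
    return now.strftime("%d-%m-%Y-%H-%M-%S")


fixed = datetime.datetime(2025, 2, 5, 10, 57, 23)
print(backup_timestamp(fixed))      # 05-02-2025-10-57-23
print(fixed.strftime("%d-%M-%Y"))   # 05-57-2025 (minutes where the month should be)
```

Passing the time in as an argument, rather than calling `datetime.datetime.now()` inside the helper, also makes the filename logic easy to test with a fixed value.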

@elipe17 elipe17 left a comment

Quick change needed for backup filename

@raftmsohani raftmsohani requested a review from elipe17 February 5, 2025 16:15
@raftmsohani raftmsohani removed the request for review from andrew-jameson February 10, 2025 15:22
@raftmsohani raftmsohani added QASP Review and removed raft review This issue is ready for raft review labels Feb 10, 2025
@lhuxraft

Blocked on QASP review until review of #3440 completes

@ADPennington ADPennington added the Deploy with CircleCI-qasp Deploy to https://tdp-frontend-qasp.app.cloud.gov through CircleCI label Feb 18, 2025
Collaborator

@ADPennington ADPennington left a comment

LGTM 🚀 thanks @raftmsohani

@ADPennington ADPennington added Ready to Merge and removed QASP Review Deploy with CircleCI-qasp Deploy to https://tdp-frontend-qasp.app.cloud.gov through CircleCI labels Feb 19, 2025
@raftmsohani raftmsohani merged commit 9f452a3 into develop Feb 19, 2025
17 checks passed
@raftmsohani raftmsohani deleted the 3205-reparse-command-refactor branch February 19, 2025 15:14
@ADPennington ADPennington mentioned this pull request Mar 6, 2025
30 tasks
Development

Successfully merging this pull request may close these issues.

Re-parse command refactor
6 participants