APPEALS-26986: Stuck jobs bulk fix multiple jobs #19101

Closed
wants to merge 4 commits into from

Conversation

gcd253
Contributor

@gcd253 gcd253 commented Aug 1, 2023

Co-authored-by: Ron Wabukenda
Co-authored-by: AdamShawBAH

Resolves Jira Issue Title
https://jira.devops.va.gov/browse/APPEALS-26955
https://jira.devops.va.gov/browse/APPEALS-27036
https://jira.devops.va.gov/browse/APPEALS-27067
https://jira.devops.va.gov/browse/APPEALS-27061

Description

This pull request creates the StuckJobsFix class within the Warroom Module. The aim of this class is to handle the clearing of errors on multiple stuck jobs.

This PR also creates the BulkFixStuckJob job, which triggers four methods in the StuckJobsFix class:

  1. dta_sc_creation_failed_fix
  2. claim_date_dt_fix
  3. claim_not_established_fix
  4. sc_dta_for_appeal_fix

In the future, we will be able to add more jobs to clear error records caused by stuck jobs by simply adding another method inside the StuckJobsFix class and BulkFixStuckJob.
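
As a rough illustration of the structure described above, the following is a minimal plain-Ruby sketch, not the code from this PR. The four method names and the class names come from the description; the `FIXES` constant, the `log` accessor, the `perform` signature, and the method bodies are assumptions made only so the example runs on its own outside Rails.

```ruby
# Hypothetical sketch: the real classes live inside the Caseflow Rails app
# and operate on database records. Here each fix only records that it ran.
module Warroom
  class StuckJobsFix
    # Method names taken from the PR description.
    FIXES = %i[
      dta_sc_creation_failed_fix
      claim_date_dt_fix
      claim_not_established_fix
      sc_dta_for_appeal_fix
    ].freeze

    # Assumption: an in-memory log standing in for real logging.
    attr_reader :log

    def initialize
      @log = []
    end

    # Placeholder bodies. A real fix would find the records stuck on one
    # specific error and clear that error.
    FIXES.each do |fix_name|
      define_method(fix_name) { @log << fix_name }
    end
  end
end

# Stand-in for the job class; the real one would be a Rails job.
class BulkFixStuckJob
  # Runs every fix in order and returns the fixer for inspection.
  def perform(fixer = Warroom::StuckJobsFix.new)
    Warroom::StuckJobsFix::FIXES.each { |fix_name| fixer.public_send(fix_name) }
    fixer
  end
end
```

Under this layout, adding a new fix means adding one method to StuckJobsFix and one entry to the list the job iterates over, which matches the extension path the description outlines.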

Acceptance Criteria

  • Code compiles correctly

Testing Plan

  1. Go to Jira Issue/Test Plan Link or list them below
  • For feature branches merging into master: Was this deployed to UAT?

Frontend

User Facing Changes

  • Screenshots of UI changes added to PR & Original Issue
BEFORE AFTER

Storybook Story

For Frontend (Presentation) Components

  • Add a Storybook file alongside the component file (e.g. create MyComponent.stories.js alongside MyComponent.jsx)
  • Give it a title that reflects the component's location within the overall Caseflow hierarchy
  • Write a separate story (within the same file) for each discrete variation of the component

Backend

Database Changes

Only for Schema Changes

  • Add typical timestamps (created_at, updated_at) for new tables
  • Update column comments; include a "PII" prefix to indicate definite or potential PII data content
  • Have your migration classes inherit from Caseflow::Migration, especially when adding indexes (use add_safe_index) (see Writing DB migrations)
  • Verify that migrate:rollback works as desired (change supported functions)
  • Perform query profiling (eyeball Rails log, check bullet and fasterer output)
  • For queries using raw SQL, was an explain plan run by the System Team?
  • Add appropriate indexes (especially for foreign keys, polymorphic columns, unique constraints, and Rails scopes)
  • Run make check-fks; add any missing foreign keys or add to config/initializers/immigrant.rb (see Record associations and Foreign Keys)
  • Add belongs_to for associations to enable the schema diagrams to be automatically updated
  • Document any non-obvious semantics or logic useful for interpreting database data at Caseflow Data Model and Dictionary

Integrations: Adding endpoints for external APIs

  • Check that Caseflow's external API code for the endpoint matches the code in the relevant integration repo
    • Request: Service name, method name, input field names
    • Response: Check expected data structure
    • Check that calls are wrapped in MetricService record block
  • Check that all configuration is coming from ENV variables
    • Listed all new ENV variables in description
    • Worked with or notified System Team that new ENV variables need to be set
  • Update Fakes
  • For feature branches: Was this tested in Caseflow UAT?

Best practices

Code Documentation Updates

  • Add or update code comments at the top of the class, module, and/or component.

Tests

Test Coverage

Did you include any test coverage for your code? Check below:

  • RSpec
  • Jest
  • Other

Code Climate

Does your code add any new Code Climate offenses? If so, why?

  • No new code climate issues added

Monitoring, Logging, Auditing, Error, and Exception Handling Checklist

Monitoring

  • Are performance metrics (e.g., response time, throughput) being tracked?
  • Are key application components monitored (e.g., database, cache, queues)?
  • Is there a system in place for setting up alerts based on performance thresholds?

Logging

  • Are logs being produced at appropriate log levels (debug, info, warn, error, fatal)?
  • Are logs structured (e.g., using log tags) for easier querying and analysis?
  • Are sensitive data (e.g., passwords, tokens) redacted or omitted from logs?
  • Is log retention and rotation configured correctly?
  • Are logs being forwarded to a centralized logging system if needed?

Auditing

  • Are user actions being logged for audit purposes?
  • Are changes to critical data being tracked?
  • Are logs being securely stored and protected from tampering or exposing protected data?

Error Handling

  • Are errors being caught and handled gracefully?
  • Are appropriate error messages being displayed to users?
  • Are critical errors being reported to an error tracking system (e.g., Sentry, ELK)?
  • Are unhandled exceptions being caught at the application level?

Exception Handling

  • Are custom exceptions defined and used where appropriate?
  • Is exception handling consistent throughout the codebase?
  • Are exceptions logged with relevant context and stack trace information?
  • Are exceptions being grouped and categorized for easier analysis and resolution?

@codeclimate

codeclimate bot commented Aug 1, 2023

Code Climate has analyzed commit 3d2486b and detected 8 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 3
Style 5

View more on Code Climate.

@nkutub
Contributor

nkutub commented Aug 1, 2023

@ronwabVa, overall, good structure and error handling. Here are a couple of thoughts:

Using generic classes for multiple concerns is not a good idea:

  • As things grow, it gets really complicated.
  • The class is already too long to keep track of what is what.
  • My recommendation is to create a separate class for each error fix.
  • When you do split these tasks into separate classes, I would recommend creating a base class to reduce code duplication.

Using a single job to run multiple isolated pieces of work could create issues:

  • The frequency at which each error needs to be fixed may not be the same as that of its peers.
  • A long-running job that errors or gets stuck may stop the next activity from running.
  • Overall, I don't think we should worry about automation with jobs at this time; we should get reporting in first, then look at what type of automation is needed, if any.
  • The future may be an admin report in the app where the support team can review and manually trigger a cleanup when needed.
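
One possible shape for the base-class suggestion above, as a minimal plain-Ruby sketch rather than a proposal for the actual implementation. All names here (`StuckRecord`, `StuckJobFixBase`, `ClaimDateDtFix`, `ClaimNotEstablishedFix`, `error_text`) and the error strings are hypothetical; a Struct stands in for whatever records carry the error in Caseflow.

```ruby
# Hypothetical stand-in for a database record with an error column.
StuckRecord = Struct.new(:id, :error)

# Shared workflow for every fix: select the records whose error matches,
# clear the error, and report how many records were cleared.
class StuckJobFixBase
  def initialize(records)
    @records = records
  end

  def call
    matching = @records.select { |record| record.error.to_s.include?(error_text) }
    matching.each { |record| record.error = nil }
    matching.length
  end

  private

  # Each subclass names the one error it is responsible for.
  def error_text
    raise NotImplementedError, "#{self.class} must define error_text"
  end
end

# One small class per error fix. The error strings are illustrative only.
class ClaimDateDtFix < StuckJobFixBase
  private

  def error_text
    "ClaimDateDt"
  end
end

class ClaimNotEstablishedFix < StuckJobFixBase
  private

  def error_text
    "Claim not established"
  end
end
```

With this split, each fix can be run on its own schedule or triggered manually, and a failure in one class does not sit in the same call path as the others.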

@gcd253 gcd253 closed this Aug 31, 2023
@andrecolinone andrecolinone deleted the dooley/APPEALS-26986 branch November 27, 2023 14:53
3 participants