
Milestone 3


Incremental Construction

Our goals for this milestone are as follows:

  • Implement Architecture Tactics for Availability
    • Add error handling (e.g. try & catch blocks) to prevent faults
    • Add logging for each of our microservices to detect faults
    • Add DB replication to recover from faults
  • Pick a unique design pattern for each of the three microservices
  • Create State Model diagrams
  • Create Activity Model diagrams
  • Fix minor bugs in our microservices

In addition to the goals above, we also plan to create a front end to improve the overall usability and accessibility of our microservices. Rather than requiring users to use Postman or manually craft HTTP requests, we will provide a website in which they can perform all of the same actions that they would through our APIs.

As for our next milestone, we plan to do the following:

  • Deploy front-end to the cloud (i.e. have a proper website rather than having to run it locally)
  • Improve front-end styling for all components
  • Add triage form functionality
  • Fix minor bugs in our microservices
  • Add features (e.g., password reset) if time allows

Architecture Tactics - Availability

Checklist for Availability

The checklist below is organized by tactics group. For each tactics question, we record whether the system supports the tactic (Support? Y/N), the associated risk, the design decisions and their locations, and the rationale and assumptions behind them.

Detect Faults

Question: Does the system use ping/echo to detect failure of a component or connection, or network congestion?
Support? N
Risk: Without this tactic, we will not be able to know the speed of communication between each service/component of the system. In the worst case, if a service takes too long to respond, a user trying to use it will become frustrated because everything will be stuck on loading with no clear indication of what is happening.
Design decisions and locations: There should be a system monitor in place that checks and displays the ping of each component/service every second. If a service takes too long to respond to a user, an error message should be sent to notify both the user and the system administrator of the problem.
Rationale and assumptions: This allows constant checking of the communication speed between each service/component of the system, letting us quickly detect problems with component connections or network congestion. We also assume that the ping displayed for each service is always correct and that any anomalies in the displayed ping will be reported, manually or autonomously.

Question: Does the system use a component to monitor the state of health of other parts of the system? A system monitor can detect failure or congestion in the network or other shared resources, such as from a denial-of-service attack.
Support? Y
Risk: Without a monitor in the system, faults are hard to discover and trace in their early stages; we usually only find them once they are significantly impacting the use of the software, at which point we know something is wrong simply by using it.
Design decisions and locations: All microservices are deployed on the Digital Ocean platform, so system and process logs can be easily viewed and analyzed with Digital Ocean's built-in features. Log types including the activity log, build log, deploy log, and runtime log can not only be viewed easily but also analyzed quickly through Digital Ocean's log forwarding feature, which lets us forward logs to external log management providers.
Rationale and assumptions: Keeping track of software state is like keeping track of a person's health: it is important, and you never know when an issue will occur. A logging function gives us an easy way to track the state of the program, quickly detect faults in their early stages, and easily trace faults when they occur. This ensures the program runs smoothly and, if a fault does occur, that we can address it quickly. This design assumes that the logging function itself is always working properly.

Question: Does the system use a heartbeat—a periodic message exchange between a system monitor and a process—to detect failure of a component or connection, or network congestion?
Support? N
Risk: Without this, if a process fails we may not be notified quickly, potentially leaving the faulty service running for an extended period and causing increasing damage to the whole system.
Design decisions and locations: Each microservice, while running, should send its stats periodically to one main monitor service/program.
Rationale and assumptions: This allows autonomous and continuous monitoring of the state of each microservice, ensuring no fault can go undetected for long. This assumes that the main monitor service/program always functions properly or is maintained periodically so that no fault occurs in that service either.

Question: Does the system use a timestamp to detect incorrect sequences of events in distributed systems?
Support? N
Risk: If the sequence of data sent from one service to another is scrambled, data may be delivered to incorrect locations, causing issues ranging from faulty information stored in the database to faulty end behavior of the system.
Design decisions and locations: Sequence numbers should be created and associated with the data sent across the system, and the system should check the sequence number on both the sending and receiving end to ensure data arrive in the correct order.
Rationale and assumptions: Correct data sent in the wrong sequence can easily result in faulty behavior, so it is necessary to ensure the correct sending and receiving order of data transmitted within the system. This design assumes that the process assigning sequence numbers to data is always functioning properly.

Question: Does the system use voting to check that replicated components are producing the same results? The replicated components may be identical replicas, functionally redundant, or analytically redundant.
Support? N
Risk: In our system, the user account information used across multiple services is not verified each time before it is used. If a fault corrupts that information on one service, this can lead to discrepancies across all services.
Design decisions and locations: Each time user account information is about to be used, it should be checked against a database that stores all such information, to verify its integrity.
Rationale and assumptions: This way, every use of user account information is preceded by an integrity check for discrepancies, giving extra accuracy to the information used and processed by each service. Note this assumes that the information stored in the information database is always correct.

Question: Does the system use exception detection to detect a system condition that alters the normal flow of execution (e.g., system exception, parameter fence, parameter typing, timeout)?
Support? Y
Risk: One service can wait for a response from another service for an abnormal amount of time without producing any error message.
Design decisions and locations: There should be set values for the expected wait time and the maximum wait time allowed for a service waiting on a response from another service. Once the maximum allowed wait time is exceeded, the waiting service needs to raise an exception (a sketch of this timeout check follows this group).
Rationale and assumptions: This lets each service check for anomalies in other services by timing their responses, providing a quick and easy way to detect faults. Note this assumes that the service waiting for responses is itself in working order and functioning properly.

Question: Can the system do a self-test to test itself for correct operation?
Support? N
Risk: Without a self-test feature, a service or the whole system cannot detect a fault while the damage it causes is still minimal or ignorable. Autonomous maintenance of the whole system is also not viable, so manual labor has to be involved.
Design decisions and locations: There needs to be a full self-check of the whole system scheduled at a set frequency (e.g., every month). A quick self-check should also be initiated in the background when a system fault is detected, either for the service where the fault occurred or for the whole system.
Rationale and assumptions: Regular maintenance of the whole system is always good, and the quick self-check triggered after a fault is detected helps fully scan the system for other potential faults. Note this assumes that the self-check feature runs independently from the main program and services and is not affected by faults occurring in them.

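As a rough illustration of the timeout check described for exception detection above, the following Python sketch wraps a call to another service with an assumed maximum wait time. The URL, the threshold values, and the ServiceTimeoutError class are illustrative assumptions, not the project's actual code.

```python
# Illustrative sketch of timeout-based exception detection between services.
# The timeout values and the exception class are assumptions.
import requests

EXPECTED_WAIT_SECONDS = 2   # expected response time (assumed)
MAX_WAIT_SECONDS = 10       # maximum allowed wait before raising (assumed)


class ServiceTimeoutError(Exception):
    """Raised when a dependent service exceeds its maximum allowed wait time."""


def call_with_timeout(url: str) -> dict:
    try:
        response = requests.get(url, timeout=MAX_WAIT_SECONDS)
        response.raise_for_status()
    except requests.Timeout as exc:
        # The dependent service exceeded MAX_WAIT_SECONDS: surface it as a fault.
        raise ServiceTimeoutError(
            f"{url} did not respond within {MAX_WAIT_SECONDS}s") from exc
    if response.elapsed.total_seconds() > EXPECTED_WAIT_SECONDS:
        # Slow but successful responses are only flagged, not treated as faults.
        print(f"warning: {url} took {response.elapsed.total_seconds():.1f}s to respond")
    return response.json()
```
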
Recover from Faults (Preparation and Repair)

Question: Does the system employ redundant spares? Is a component's role as active versus spare fixed, or does it change in the presence of a fault? What is the switchover mechanism? What is the trigger for a switchover? How long does it take for a spare to assume its duties?
Support? Y
Risk: Redundant spares provide an immediate backup to the working system in case a fault occurs in the system or one of its components. Without them, the system is always at risk of a fault leaving it unable to function until a fix is implemented.
Design decisions and locations: Since our microservices are deployed on Digital Ocean, they include Digital Ocean's daily backups and automated failover. Automated failover is an active redundant-spare feature that automatically switches to standby replica services and databases whenever the primary ones are unavailable due to faulty behavior or maintenance. A component's role changes when a fault occurs, as the replica assumes the role of the primary component; manual failback is still an option if we want to designate the repaired component as primary again. The switchover from primary to replica is fully automatic and happens without warning so that downtime is kept to a minimum. The time it takes for a replica server to assume its duties and become fully functional is almost instant, since data are backed up automatically on a regular basis and the replica server is always on standby.
Rationale and assumptions: The daily backups and automatic failover free us from worrying that our services will break or our data will be corrupted with no way to recover. Because it is an active redundant-spare feature, everything is up to date even when the replica services/data take over, ensuring minimal downtime. Note this assumes that the Digital Ocean platform itself always functions properly.

Question: Does the system employ exception handling to deal with faults? Typically, the handling involves either reporting, correcting, or masking the fault.
Support? Y
Risk: If a component detects an error and raises an exception without a properly implemented exception handler, the system will simply crash. This is frustrating for both the user and the maintainer of the program, since we are not told what is wrong with the program, only that 'something is wrong'. This significantly hinders finding and fixing the fault and degrades the user experience.
Design decisions and locations: Whenever an exception is raised, the system first needs to report it to the user or the maintainer of the program, outlining which component raised the exception and the details of the exception. The exception report should also contain information relevant to resolving the exception, which can then be used by the software or by the user/maintainer to solve it.
Rationale and assumptions: An exception report like the one outlined here can greatly speed up fixing faults, since the time originally spent discovering the details of the fault can now be spent fixing it. This assumes that the exception report provided by the system is accurate and comprehensive, without faults or errors of its own.

Question: Does the system employ rollback, so that it can revert to a previously saved good state (the "rollback line") in the event of a fault?
Support? N
Risk: If a serious fault occurs and significantly hinders the normal operation of the whole system, users are forced to use the program in a poor state until the fault is fixed, which can ruin the user experience and may even cause users to stop using the program.
Design decisions and locations: A backup state of the whole system should be saved periodically so that, in the event of a serious fault, the system can roll back to the backed-up state and continue operating normally while developers work on the fault. Before a state is backed up, a quick check needs to confirm that everything is in good condition.
Rationale and assumptions: This exposes users to the faulty program for the minimum amount of time and lets developers fix the fault without tight time constraints. This assumes that the backed-up state stays in good condition and remains compatible with new information and data once the rollback is finished.

Question: Can the system perform in-service software upgrades to executable code images in a non-service-affecting manner?
Support? N
Risk: Without this, every time a fault is fixed the software needs to be updated, which stops the program from functioning until the update is done. Users who want to use the program during this period cannot. And if a fault happens while the software is being updated, or the update takes too long, then given that this program is built for medical purposes, users'/patients' health might be affected as well, since they cannot get any help through the software while it is updating.
Design decisions and locations: In-service software upgrades should apply to all services. When small changes or fixes are applied to a specific feature or service, the downtime for that service needs to be non-existent or minimized, depending on the scale and type of change. Other services not targeted by the upgrade should not be affected and should continue functioning as intended.
Rationale and assumptions: With this in place, users/patients who are logged in and undergoing the virtual triage would not be affected by an upgrade to the user account auth service. And if a small change is applied to the GUI of the login page, users currently on that page could still log in as normal and would only see the upgraded GUI the next time they open it. This assumes each service has a built-in mechanism that correctly decides whether the service should continue running or shut down when an upgrade is initiated, based on the scale and nature of the upgrade.

Question: Does the system systematically retry in cases where the component or connection failure may be transient?
Support? N
Risk: Connection issues may happen only momentarily between services, or between the user client and the main server. If we spend resources investigating a timeout error or connection exception at every sign of a fault, system resources can become strained and slow down other processes.
Design decisions and locations: Set a rule for when services raise exceptions. For example, in the case of a timeout where one service takes too long to respond to another, the waiting service should only raise an exception to the system after three consecutive timeouts, or when it is experiencing continuous timeouts (five or more within ten requests). After an initial timeout, the service should request a response again until the threshold for raising the exception is reached; only then does it stop requesting and raise the exception to the system (a sketch of this rule follows this group).
Rationale and assumptions: This keeps the system from spending most of its resources checking for tiny or momentary faults, freeing those resources for more important parts of the program. This assumes that the service requesting the response is itself functioning properly.

Question: Can the system simply ignore faulty behavior (e.g., ignore messages when it is determined that those messages are spurious)?
Support? N
Risk: If the system cannot ignore faulty messages or data sent from faulty services/components, those messages or data can make their way into the database and remain there even after the faulty component is fixed, creating a long-lasting negative impact on the whole system.
Design decisions and locations: Whenever a fault is discovered or reported, the system should do the following: if the fault originates from a specific service or component, any data that is or was received from that service needs to be identified and marked with a special flag. If it is later found that the faulty service sent corrupted data to other components, that data will either be adjusted or completely erased, depending on the specific fault.
Rationale and assumptions: This lets us correctly identify faulty data within the system without jeopardizing program operation by completely ignoring all data transmitted from a service once it is determined to be faulty. This assumes that the faulty service is still capable of sending data across the system and is not completely non-functional.

Question: Does the system have a policy of degradation when resources are compromised, maintaining the most critical system functions in the presence of component failures, and dropping less critical functions?
Support? N
Risk: Without this, system functions will simply shut down in random order when a critical fault happens: users undergoing triage may suddenly be disconnected from everything while another user casually browsing the waitlist with no urgency can still see it just fine. This is clearly not desirable.
Design decisions and locations: In the current program, if an emergency happens and not all services can run, the triage service takes priority and gets all the resources it requires; then the waitlist service takes what it needs; then the authentication service.
Rationale and assumptions: This way, when a fault occurs and not all services can be active, users/patients currently undergoing triage or those who need to go to the emergency room can still be processed. This assumes that a user/patient who has not yet logged in to the program is not in an emergency.

Question: Does the system have consistent policies and mechanisms for reconfiguration after failures, reassigning responsibilities to the resources left functioning, while maintaining as much functionality as possible?
Support? N
Risk: In the absence of such a policy, services that are still functioning may get little to no resources while services that are not functioning may be getting most or all of the system's resources, which is clearly not desirable.
Design decisions and locations: Each service needs an internal mechanism that releases resources when prompted by the system, and the system as a whole needs a mechanism that allocates resources based on whether a service/component is functioning and on its priority level, i.e., how critical it is.
Rationale and assumptions: This lets functioning, more critical services get all the resources they need when a fault happens, ensuring no system resources go to waste while the program is operating. This assumes that faulty services can still release the resources they are using when they are not functioning properly.

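The retry rule proposed above (escalate only after three consecutive timeouts, or five timeouts within the last ten requests) could look roughly like the following Python sketch. The thresholds and function name come from the proposal and are assumptions, not existing code.

```python
# Illustrative sketch of the proposed retry rule for transient failures.
from collections import deque

import requests

CONSECUTIVE_LIMIT = 3   # assumed: raise after 3 consecutive timeouts
WINDOW_SIZE = 10        # assumed: look at the last 10 requests
WINDOW_LIMIT = 5        # assumed: raise after 5 timeouts within the window


def request_with_retry(url: str, timeout: float = 5.0) -> requests.Response:
    consecutive_timeouts = 0
    recent_timeouts = deque(maxlen=WINDOW_SIZE)  # True = request timed out
    while True:
        try:
            response = requests.get(url, timeout=timeout)
            consecutive_timeouts = 0
            recent_timeouts.append(False)
            return response
        except requests.Timeout:
            consecutive_timeouts += 1
            recent_timeouts.append(True)
            if (consecutive_timeouts >= CONSECUTIVE_LIMIT
                    or sum(recent_timeouts) >= WINDOW_LIMIT):
                # Threshold reached: stop retrying and raise to the system.
                raise
```
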
Recover from Faults (Reintroduction)

Question: Can the system operate a previously failed or in-service upgraded component in a "shadow mode" for a predefined time prior to reverting the component back to an active role?
Support? N
Risk: If a fixed service/component is released live immediately, there is still a possibility of a fault occurring, since the fixed service is interacting with other live services and data for the first time. If a fault does happen, the whole system may be disrupted once again, more fixes will need to be introduced, and the user experience suffers because a supposedly fixed component does not seem to be fixed.
Design decisions and locations: After a fix is applied to a service, it enters a 'shadow mode' for a set amount of time (based on active users and tasks within the system). The system can then run mock user tests, feed the service live user data, and examine how it processes and responds to that data. If no faulty behavior has been found by the time the set 'shadow mode' period ends, the service is fully released live.
Rationale and assumptions: This ensures that fixed services/components are checked thoroughly before being released for public use and minimizes disruptions to active users of the program. This assumes the system is capable of checking whether the service is behaving normally and of stopping the service, and any faulty data it produces, from reaching other components/services if a fault happens.

Question: If the system uses active or passive redundancy, does it also employ state resynchronization to send state information from active components to standby components?
Support? Y
Risk: Without state resynchronization, the replicated service/process will have outdated information, which can be quite detrimental. For example, a user might update their password, but the next time they try to log in the new password will not work, because the main Authentication service has been replaced by a replica in which the new password was never recorded.
Design decisions and locations: Since the services are hosted on Digital Ocean, an active replication feature is included automatically, which means state information is automatically synchronized between active components and standby components.
Rationale and assumptions: It is almost always good to have the most up-to-date information, and active replication with resynchronization gives our software not only backup functionality that is ready at all times but also up-to-date data to go with it. The software can function normally even when faults occur, and users can use the program without hiccups. This assumes that all information sent from the active service to the replica is always correct; if faulty data is sent to the replica, that data will need to be traced and adjusted.

Question: Does the system employ escalating restart to recover from faults by varying the granularity of the component(s) restarted and minimizing the level of service affected?
Support? N
Risk: Without escalating restart, a full restart is initiated every time a fix is introduced into the system. This can be extremely disruptive to users currently using the system, since the services they are using might not even be affected by the fault the fix targets.
Design decisions and locations: There should be three restart protocols built into the system. First, a low-level restart where only the processes associated with the faulty service are restarted. Second, a medium-level restart where the majority of system components/services are restarted but a few critical (non-faulty) components are kept running; for example, patients on the ER waitlist can still view the waitlist queue, but people who have not yet logged in cannot do so until the restart completes. Third, a full restart where everything is restarted and the software is unusable until the restart is complete.
Rationale and assumptions: These restart levels minimize disruption to the software whenever a new fix is introduced, while keeping critical functions of the system available to users as much as possible. Note this assumes the system can correctly identify the proper restart protocol to initiate whenever a fix needs to be introduced.

Question: Can message processing and routing portions of the system employ nonstop forwarding, where functionality is split into supervisory and data planes?
Support? N
Risk: This design helps in the sense that even if one thing breaks or a fault occurs somewhere in the system, not all parts of the system are affected; some parts can keep functioning normally while the faulty part is fixed. If everything is bundled together and one thing breaks, the others break too.
Design decisions and locations: Take, for example, the process of sending patient information from one service to another. One component could handle the sending while another handles validation. Whenever data is sent, it is transmitted immediately through the sending component, but the receiving service only receives the data without accepting it; the data is accepted and used only once the validation component confirms it is correct.
Rationale and assumptions: With this design, when data takes a long time to verify, the transfer speed is not significantly impacted, since validation and sending run simultaneously, giving fast data transfer times. Note this assumes that data which has been received but not yet validated cannot affect the receiving service in any way; for example, it might be held temporarily in a separate but fast storage space.

Prevent Faults

Question: Can the system remove components from service, temporarily placing a system component in an out-of-service state for the purpose of preempting potential system failures?
Support? N
Risk: Faulty components that are not stopped in time can damage the system more and more the longer they keep running, potentially corrupting the database or even causing system-wide failure.
Design decisions and locations: The system should be able to evaluate faults occurring in a service. If a fault is determined to be minimal and ignorable, the system only records it and does nothing else. If the fault is determined to be intermediate or critical and can potentially damage the system further the longer it is left running, the service where the fault occurred is placed in an out-of-service state while the fault is tracked down and fixed.
Rationale and assumptions: We cannot disable a service just because a fault has occurred; there should be a metric for when a service needs to be disabled and when it can be left running. Although it is always good to try fixing a fault in its early stages, if we only fix faults once they have caused system-wide errors, they may become significantly harder to fix and a lot of clean-up will need to be done on the system.

Question: Does the system employ transactions—bundling state updates so that asynchronous messages exchanged between distributed components are atomic, consistent, isolated, and durable?
Support? N
Risk: Without the transactions tactic, race conditions are likely. Two doctors could be modifying the waitlist status of a patient at the same time with no clear state update on either end, and the program may get stuck in an infinite loop because it does not know which process's command to follow.
Design decisions and locations: A coordinator function needs to be implemented. Whenever a process wants to change some data, the coordinator locks that specific data for a period of time (dictated by the average reasonable time needed to update the data in question, excluding outliers, multiplied by 3) so that only that process can modify it. After the process finishes modifying the data, or after a long time passes with no sign that the process intends to modify it, the lock is released and other processes can modify that data again (a sketch of such a coordinator follows this group).
Rationale and assumptions: This design easily resolves data-modification conflicts between two processes, two users, and so on. The extended modification period ensures that people with reading or writing difficulties have adequate time to modify the information they want to. Note this design assumes that enough coordinator functions exist to lock all possible data in the system.

Question: Does the system use a predictive model to monitor the state of health of a component to ensure that the system is operating within nominal parameters? When conditions are detected that are predictive of likely future faults, the model initiates corrective action.
Support? N
Risk: A predictive model lets us see trends in when a fault is likely to occur within the program and the kind of behavior the program or each of its services exhibits before a fault. Without it, we have no effective means to prevent faults from happening or to improve the program based on how faults have historically occurred.
Design decisions and locations: The system needs a predictive model for each microservice that keeps track of its ping, active user count, and amount of data processed. It should be able to produce a monthly/yearly report on the state of each service outlining its health as well as the cause and likelihood of a fault occurring. A warning message also needs to be produced every time the model predicts that an intermediate or critical fault is going to happen.
Rationale and assumptions: Ping, active user count, and the amount of data processed by a service are easy to track and directly affect the state of each service and the whole program, so a model tracking them helps keep the system healthy and supports fault prevention. This assumes the predictive model records and uses correct data and can differentiate the types of fault that are likely to occur.

For our program, we have chosen three architectural tactics. For fault detection, we will use the Monitor tactic. To recover from faults, specifically in the preparation-and-repair area, we will use the Redundant Spare tactic. Lastly, to prevent faults, we will use the Exception Prevention tactic.

Monitor

All microservices are deployed on the Digital Ocean platform, which means system and process logs can be easily viewed and analyzed with the help of Digital Ocean's built-in features. Log types including the activity log, build log, deploy log, and runtime log can not only be viewed easily but also analyzed quickly through Digital Ocean's log forwarding feature, which lets us forward logs to external log management providers. This allows us to be notified of faulty processes in our system easily and gives us quick insight into the details of an issue, including how to fix it, ensuring the highest level of availability.
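As a minimal sketch of what per-service logging could look like, assuming each Flask microservice writes structured lines to stdout so that Digital Ocean's runtime log picks them up, the service name and log format below are assumptions:

```python
# Minimal per-service logging sketch: one log line per handled request.
import logging
import sys

from flask import Flask, request

app = Flask("waitlist-service")  # assumed service name

logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("waitlist-service")


@app.after_request
def log_request(response):
    # Record method, path, and status code for every request the service handles.
    logger.info("%s %s -> %s", request.method, request.path, response.status_code)
    return response
```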

Redundant Spare

Since our microservices are deployed on Digital Ocean, they include Digital Ocean's daily backups and automated failover. Automated failover is an active redundant-spare feature that automatically switches to standby replica services and databases whenever the primary ones are unavailable due to faulty behaviour or maintenance. A component's role changes when a fault occurs, as the replica assumes the role of the primary component; manual failback is still an option if we want to designate the repaired component as primary again. The switchover from primary to replica is fully automatic and happens without warning so that downtime is kept to a minimum. The time it takes for a replica server to assume its duties and become fully functional is almost instant, since data are backed up automatically on a regular basis and the replica server is always on standby. The daily backups and automatic failover free us from worrying that our services will break or our data will be corrupted with no way to recover. And because it is an active redundant-spare feature, everything is up to date even when the replica services/data take over, ensuring minimal downtime for our system. The best part is that this function is fully managed by Digital Ocean, meaning we do not have to maintain it ourselves if faults occur in the feature itself.

Exception Prevention

We will be using Axios's built-in error handling to catch errors ranging from invalid login information to invalid input parameters on the triage forms and more. Exceptions and errors are caught by the try block and dealt with in the catch block to prevent faults from occurring in the system. This is an easy and effective way to prevent users from logging in with invalid credentials and to stop invalid data from corrupting the database, ensuring that the program runs smoothly with the intended inputs and data.
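On the service side, a complementary and purely illustrative Flask sketch of the same idea might validate triage input and return a clear error instead of letting invalid data reach the database. The route, the field names, and the evaluate_symptoms() helper below are assumptions, not our actual Triage code.

```python
# Illustrative server-side counterpart to the front-end try/catch handling.
from flask import Flask, jsonify, request

app = Flask(__name__)

REQUIRED_FIELDS = ("name", "symptoms", "severity")  # assumed form fields


def evaluate_symptoms(data: dict) -> str:
    # Stand-in for the real triage logic; raises ValueError on bad severity.
    return "Visit In-person" if int(data["severity"]) >= 4 else "Take meds"


@app.route("/triage", methods=["POST"])
def submit_triage():
    data = request.get_json(silent=True) or {}
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing:
        # Reject the request up front rather than failing deeper in the service.
        return jsonify({"error": f"missing fields: {', '.join(missing)}"}), 400
    try:
        recommendation = evaluate_symptoms(data)
    except ValueError as exc:
        return jsonify({"error": str(exc)}), 422
    return jsonify({"recommendation": recommendation}), 200
```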

Test Plan

Monitor

Login Function

  • Enter Login page, check if logs match program behavior.
  • Login with correct username and password, then check if logs match program behavior.
  • Login with correct username and incorrect password, then check if logs match program behavior.
  • Login with incorrect username and incorrect password, then check if logs match program behavior.
  • Exit Login page, check if logs match program behavior.

Triage Function

  • Enter Triage page, check that logs match program behavior.
  • Enter valid input into the triage form. Make sure to test enough inputs that all variations of responses are produced by the service (e.g., Take meds, Visit In-person, Contact Hotline), and check that logs match program behavior.
  • Enter invalid input into the triage form, check if logs match program behavior.
  • Try to progress with empty form, check if logs match program behavior.
  • Exit Triage page, check that logs match program behavior.

Waitlist Function

  • Login as Patient or Doctor, check if logs match program behavior.
  • Perform all tasks accessible by a doctor account, check if logs match program behavior.
  • Perform all tasks accessible by a Patient account, check if logs match program behavior.
  • While the program is running with an active non-empty waitlist, check if logs match program behavior. Make sure to check all other variations of the waitlist as well (empty, full, inactive, etc.).
  • Exit as Doctor or Patient, make sure logs match program behavior.

Redundant Spare

For All Functions

Overload each service with mock users and data, to the point where the service cannot handle it anymore and stops working, and then:

  • Check if the replica service takes over and whether it contains the mock data. See if the amount of data retained is acceptable.
  • Check how long it takes for the mock user to be able to use the program normally again.
  • Check if we can still manually swap the current primary component (previously the replica) with the current replica component (previously the primary).

Under normal load, check if:

  • Data is backed up daily as advertised.
  • The replica component replaces the primary component only when the primary component is unavailable.

Exception Prevention

Login Function

  • Attempt to login with the correct username and an incorrect password, and see if the program responds in the intended way (asks the user to retry instead of crashing, etc.).
  • Attempt to login with incorrect username and password, see if the program responds in the intended way.

Triage Function

  • Enter invalid data paired with valid data into the Triage Form, and see if the program responds correctly (rejects the form instead of crashing or getting stuck on loading, etc.).
  • Enter invalid data into the Triage Form, and see if the program responds correctly (a sketch of automating these login and triage checks follows this list).
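These checks could also be automated. The following pytest-style sketch assumes a locally running service, a JSON /login endpoint, and a /triage endpoint; the base URL, routes, and payload shapes are all assumptions about the deployment, not confirmed details.

```python
# Illustrative automation of the exception-prevention checks above.
import requests

BASE_URL = "http://localhost:5000"  # assumed local deployment


def test_login_rejects_bad_password():
    response = requests.post(
        f"{BASE_URL}/login",
        json={"username": "testuser", "password": "wrong-password"},
    )
    # The service should refuse the credentials, not crash with a 5xx error.
    assert response.status_code in (400, 401)


def test_triage_rejects_incomplete_form():
    response = requests.post(f"{BASE_URL}/triage", json={"name": "testuser"})
    assert response.status_code in (400, 422)
```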

Design Patterns

Proxy

Proxy is a structural design pattern that allows us to provide a substitute or placeholder for another object - it controls access to the original object, allowing us to perform something either before or after the request gets through to the original object [1]. There are a few different types of proxies; the one used in our project is the protection proxy [2]. This design pattern can be seen throughout all of our microservices, though it is most prevalent in our Authentication microservice.

The protection proxy design pattern is used to control access to resources based on access rights. We implemented it in two different ways. First, through Flask - this framework provides an extensive set of tools, one of them being flask-jwt-extended. This tool is set up so that users must sign in to the Authentication microservice to retrieve a token, and this token unlocks access to various resources. The second implementation can be found in our database. When creating privileged accounts for medical staff, application admins set a special flag. This flag allows our microservices to display information based on the user's role - e.g., in the Waitlist microservice, users with a Patient role can only see how many people are ahead of them, whereas medical staff can see who is in the queue, their information, and the reason they are there.
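A minimal sketch of this protection-proxy setup with flask-jwt-extended might look like the following; the route names, the role claim key, and the placeholder secret are assumptions rather than our actual Authentication code.

```python
# Illustrative protection proxy: the JWT check guards access to the resource,
# and a role claim decides how much of it the caller may see.
from flask import Flask, jsonify
from flask_jwt_extended import (
    JWTManager, create_access_token, get_jwt, jwt_required,
)

app = Flask(__name__)
app.config["JWT_SECRET_KEY"] = "change-me"  # placeholder secret
jwt = JWTManager(app)


@app.route("/login", methods=["POST"])
def login():
    # In the real service the identity and role flag would come from the database.
    token = create_access_token(identity="doctor42",
                                additional_claims={"role": "staff"})
    return jsonify(access_token=token)


@app.route("/waitlist", methods=["GET"])
@jwt_required()  # the proxy: no valid token, no access to the resource
def waitlist():
    if get_jwt().get("role") == "staff":
        return jsonify(view="full queue with patient details")
    return jsonify(view="number of patients ahead of you only")
```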

Command

Command is a behavioural design pattern in which an object is used to encapsulate all the information needed to perform an action or trigger an event at a later time [3, 4]. This design pattern can be found in our Triage microservice. The Triage microservice provides users with a form - users must submit this form to determine whether they should be added to the waitlist to visit the emergency department. The front end performs all of the basic error checking, meaning the submission button only becomes available once all necessary information is present in the form. This basic error checking prevents database write errors and ensures that all API calls to the Triage microservice carry all the information necessary - allowing our microservice to correctly determine the next steps for a patient.
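As a generic illustration of the Command idea (not our actual Triage code), a submission can be packaged as an object that is validated and executed later; all class, field, and threshold names below are hypothetical.

```python
# Generic Command illustration: a triage submission packaged as an object
# that an invoker can validate, queue, and execute later.
from dataclasses import dataclass


@dataclass
class SubmitTriageCommand:
    patient_id: str
    symptoms: list[str]
    severity: int  # assumed 1-5 scale

    def is_complete(self) -> bool:
        # Mirrors the front-end check: the command only runs with all fields set.
        return bool(self.patient_id and self.symptoms and 1 <= self.severity <= 5)

    def execute(self, waitlist: list[str]) -> str:
        if not self.is_complete():
            raise ValueError("incomplete triage form")
        if self.severity >= 4:
            waitlist.append(self.patient_id)
            return "Visit In-person"
        return "Take meds"


# A queue of pending commands that an invoker could run later.
pending = [SubmitTriageCommand("p-001", ["fever"], 5)]
waitlist: list[str] = []
for command in pending:
    print(command.execute(waitlist))
```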

State

State is a behavioural design pattern that lets an object alter its behaviour when its internal state changes. The State pattern suggests that we should create new classes for all possible states of an object and extract all state-specific behaviours into these classes. Instead of implementing all behaviours on its own, the original object, called context, stores a reference to one of the state objects that represents its current state and delegates all the state-related work to that object [5].

This design can be found in our Waitlist microservice. In this microservice, we have three different functions, get_waitlist(), enter_waitlist(), and remove_from_waitlist() - each of them is used for a state-specific behaviour. Because we're using the Flask framework, the context can be determined by how the API is called - get_waitlist(), enter_waitlist(), and remove_from_waitlist() are triggered by the GET, POST, and DELETE methods, respectively.
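A minimal sketch of this dispatch, assuming a single /waitlist route and an in-memory list standing in for the database, might look like this:

```python
# Illustrative sketch: the HTTP method acts as the context that selects the
# state-specific behaviour. The route and payload shape are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)
_waitlist: list[str] = []  # stand-in for the real database


def get_waitlist():
    return jsonify(waitlist=_waitlist)


def enter_waitlist():
    patient = request.get_json(force=True)["patient_id"]
    _waitlist.append(patient)
    return jsonify(position=len(_waitlist)), 201


def remove_from_waitlist():
    patient = request.get_json(force=True)["patient_id"]
    _waitlist.remove(patient)
    return jsonify(removed=patient)


@app.route("/waitlist", methods=["GET", "POST", "DELETE"])
def waitlist():
    # Each method maps to one of the state-specific handlers described above.
    handlers = {"GET": get_waitlist, "POST": enter_waitlist,
                "DELETE": remove_from_waitlist}
    return handlers[request.method]()
```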

State Models

State Model 1 - Authentication

[State model diagram: Authentication]


The Authentication State Model shows the different states the Authentication Microservice can be in. First, the Authentication Portal is opened. Next, the service waits for login data to be submitted by the user. Once the login data is submitted, the service checks whether the credentials are valid. If they are invalid, it keeps waiting for valid data; if they are valid, the user is logged into the application.


State Model 2 - Triage

[State model diagram: Triage]


The Triage State Model shows the different states the Triage Microservice can be in. First, you log in to your account. The next step is to collect user-inputted symptoms. Once the symptoms have been gathered, the Triage Microservice evaluates them, assesses severity, and provides a recommendation to the user.


State Model 3 - Waitlist

[State model diagram: Waitlist]


The Waitlist State Model shows the different states the Waitlist Microservice can be in. First, you log in to your account. The next state either enters you into the waitlist and displays your waitlist position or, if you are already in the waitlist, simply displays your position.


Activity Models

Activity Model 1 - Register

[Activity diagram: Register]


The Register Activity Diagram shows how the registration function works, from submitting registration information, to validating registration information, to creating the patient account, to logging into the patient account, to displaying the account to the patient.


Activity Model 2 - Login

[Activity diagram: Login]


The Login Activity Diagram shows how the login function works, from submitting login information, to validating login information, to logging into the patient account, to displaying the account to the patient.


Activity Model 3 - Triage

[Activity diagram: Triage]


The Triage Activity Diagram shows how the Triage function works, from submitting triage information, to validating triage information, to returning the triage result, to displaying the triage result to the patient.


Activity Model 4 - Enter Waitlist

[Activity diagram: Enter Waitlist]


The EnterWaitlist Activity Diagram shows how the EnterWaitlist function works, from requesting to enter the waitlist, to validating the request, to entering the patient into the waitlist, to returning the waitlist information, to displaying the waitlist information to the patient.


Activity Model 5 - View Waitlist

[Activity diagram: View Waitlist]


The ViewWaitlist Activity Diagram shows how the ViewWaitlist function works, from requesting to view the waitlist, to validating the request, to returning the waitlist information, to displaying the waitlist information to the patient.


Contribution Summary

Rodrigo:

  • Migrate microservices to Digital Ocean
  • Connect auth and waitlist front-end to microservices
  • Brainstorm architectural tactics
  • Complete incremental construction section
  • Create and complete design patterns section

Ethan:

  • Created front-end UI for homepage, login, waitlist and triage
  • Created State machine diagram for authorization process

Amann:

  • Created Protected routes
  • UI
  • Created authentication provider using the jwt token developed.
  • Brainstorm architectural tactics

Jacob:

  • Created all of the Activity Diagrams
  • Created the Triage and Waitlist State Machine Diagrams
  • Added textual descriptions of all Activity Diagrams and State Machine Diagrams

Tim:

  • Brainstorm architectural tactics
  • Create and complete Architectural Tactics - Availability section