Add in-progress 304 rock updates to mgr:openfd_objects #221

eduard-bagdasaryan · 2023-07-27T13:30:29Z

No description provided.

src/MemObject.h

src/stat.cc

This reverts commit 09e8c31.

eduard-bagdasaryan · 2023-07-31T20:34:07Z

src/MemObject.h

@@ -167,6 +167,13 @@ class MemObject

    SwapOut swapout;

+    /// whether the object is being updated on disk
+    // TODO: expand the scope to common swapouts as well
+    Lock swapoutInterest;


I checked separately that sizeof(swapoutInterest) is 16 (vs 4, if without virtual destructor), i.e., it keeps vtable pointer. That means that we should consider moving to a boolean variable instead (implying other related problems fixed first).

That means that we should consider moving to a boolean variable instead

I see several other alternatives to consider. I will enumerate all the current candidates for completeness sake:

Use boolean swapoutInterest (as you have suggested in this change request). Creates a problem for future update-during-swapout support. Implies prohibiting (at compile time) Ipc::StoreMapUpdate copying. AFAICT, that copying prohibition is not possible without losing features or serious refactoring!

Remove virtual table from Lock. I have not checked the details, but I doubt that is easy to do while preserving existing class hierarchy safety: Even if RefCount is the only child that needs a virtual destructor, prohibiting (at compile time) deletion of Lock pointers in all contexts may be impossible because Lock code can always delete a pointer to a Lock object.

Add a simple "Level" counter (that Lock will reuse). This is the simplest solution that I can see:

class Lock { ... private: mutable Level level; };

class MemObject: ... { ... Level swapoutInterest; };

Storing a pointer to the header updating job/state inside MemObject. This is similar to what this PR has implemented in the very beginning. The advantage of this design is that cache manager can (eventually) report details specific to the ongoing header updates. I have criticized this option earlier, in part, because it bloats MemObject (which is used for all non-SMP memory cache entries), but one could argue that we should not be responsible for MemObject design mistakes -- the class should be split into several classes so that idle memory cache objects do not have to keep state relevant to active transactions. Until that split is implemented, perhaps one has a right to add stuff to the "active transaction" part of MemObject, even if doing so bloats non-SMP memory caches. Moreover, all the solutions mentioned above increase MemObject memory footprint anyway.

Register header updating jobs in some new global std::set. The cache manager reporting code will use that global to report those updates (in addition to iterating through all MemObjects). If MemObject will be split as discussed in item 3, then there will be nothing wrong with iterating through all active objects, but it may take us a very long time to get there. Meanwhile, this solution avoids memory cache bloat present in earlier items. Perhaps we can even generalize this to report all AsyncJobs? That would be useful in triage of a variety of problems! Squid already reports all event handlers and all active requests. Why not all jobs?

Use existing MemObject::abortCallback as a "Store writing interest" signal. StoreEntry::registerAbortCallback() is meant to be used by StoreEntry writers. The header updating job is a StoreEntry writer. It is probably a bug that it does not registerAbortCallback() -- the entry may be aborted by a 3rd party while we are updating it, right? If we can start using this interface, we will fix that bug and improve reporting. Can we find a way to trigger that bug in the current code (i.e. to trigger a call to StoreEntry::abort() while a header upgrade is in progress)? Perhaps Store::Controller::syncCollapsed() is able to trigger such StoreEntry aborts?

Use existing Transients::hasWriter(StoreEntry) check to detect header updates. This solution requires creating or marking the corresponding "writing" Transients entry (and iterating through all transients when preparing the cache manager report). Similar to the above item, this is probably something correct SMP header updating code should do anyway!

Overall, I like the last two items because they feel like potentially important bug fixes, but I am worried that it would be difficult to implement either of them. I love the general usefulness and simplicity of the "report all jobs" variation in item 4. I suggest adding a mgr:jobs report that will ask each AsyncJob object to report its state, possibly using the existing status() method. I would do that in a new/dedicated PR/issue.

rousskov · 2023-08-03T13:56:19Z

src/MemObject.h

@@ -167,6 +167,13 @@ class MemObject

    SwapOut swapout;

+    /// whether the object is being updated on disk
+    // TODO: expand the scope to common swapouts as well
+    Lock swapoutInterest;


That means that we should consider moving to a boolean variable instead

I see several other alternatives to consider. I will enumerate all the current candidates for completeness sake:

Use boolean swapoutInterest (as you have suggested in this change request). Creates a problem for future update-during-swapout support. Implies prohibiting (at compile time) Ipc::StoreMapUpdate copying. AFAICT, that copying prohibition is not possible without losing features or serious refactoring!

Remove virtual table from Lock. I have not checked the details, but I doubt that is easy to do while preserving existing class hierarchy safety: Even if RefCount is the only child that needs a virtual destructor, prohibiting (at compile time) deletion of Lock pointers in all contexts may be impossible because Lock code can always delete a pointer to a Lock object.

Add a simple "Level" counter (that Lock will reuse). This is the simplest solution that I can see:

class Lock { ... private: mutable Level level; };

class MemObject: ... { ... Level swapoutInterest; };

Storing a pointer to the header updating job/state inside MemObject. This is similar to what this PR has implemented in the very beginning. The advantage of this design is that cache manager can (eventually) report details specific to the ongoing header updates. I have criticized this option earlier, in part, because it bloats MemObject (which is used for all non-SMP memory cache entries), but one could argue that we should not be responsible for MemObject design mistakes -- the class should be split into several classes so that idle memory cache objects do not have to keep state relevant to active transactions. Until that split is implemented, perhaps one has a right to add stuff to the "active transaction" part of MemObject, even if doing so bloats non-SMP memory caches. Moreover, all the solutions mentioned above increase MemObject memory footprint anyway.

Register header updating jobs in some new global std::set. The cache manager reporting code will use that global to report those updates (in addition to iterating through all MemObjects). If MemObject will be split as discussed in item 3, then there will be nothing wrong with iterating through all active objects, but it may take us a very long time to get there. Meanwhile, this solution avoids memory cache bloat present in earlier items. Perhaps we can even generalize this to report all AsyncJobs? That would be useful in triage of a variety of problems! Squid already reports all event handlers and all active requests. Why not all jobs?

Use existing MemObject::abortCallback as a "Store writing interest" signal. StoreEntry::registerAbortCallback() is meant to be used by StoreEntry writers. The header updating job is a StoreEntry writer. It is probably a bug that it does not registerAbortCallback() -- the entry may be aborted by a 3rd party while we are updating it, right? If we can start using this interface, we will fix that bug and improve reporting. Can we find a way to trigger that bug in the current code (i.e. to trigger a call to StoreEntry::abort() while a header upgrade is in progress)? Perhaps Store::Controller::syncCollapsed() is able to trigger such StoreEntry aborts?

Use existing Transients::hasWriter(StoreEntry) check to detect header updates. This solution requires creating or marking the corresponding "writing" Transients entry (and iterating through all transients when preparing the cache manager report). Similar to the above item, this is probably something correct SMP header updating code should do anyway!

Overall, I like the last two items because they feel like potentially important bug fixes, but I am worried that it would be difficult to implement either of them. I love the general usefulness and simplicity of the "report all jobs" variation in item 4. I suggest adding a mgr:jobs report that will ask each AsyncJob object to report its state, possibly using the existing status() method. I would do that in a new/dedicated PR/issue.

Add in-progress 304 rock updates to mgr:openfd_objects

09e8c31

rousskov requested changes Jul 27, 2023

View reviewed changes

src/MemObject.h Outdated Show resolved Hide resolved

src/stat.cc Outdated Show resolved Hide resolved

eduard-bagdasaryan added 2 commits July 31, 2023 15:10

Revert "Add in-progress 304 rock updates to mgr:openfd_objects"

6007085

This reverts commit 09e8c31.

Reworked with MemObject::swapoutInterest

81b6d9f

eduard-bagdasaryan commented Jul 31, 2023

View reviewed changes

rousskov requested changes Aug 3, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add in-progress 304 rock updates to mgr:openfd_objects #221

Add in-progress 304 rock updates to mgr:openfd_objects #221

eduard-bagdasaryan commented Jul 27, 2023

eduard-bagdasaryan Jul 31, 2023

rousskov Aug 3, 2023

rousskov Aug 3, 2023

Add in-progress 304 rock updates to mgr:openfd_objects #221

Are you sure you want to change the base?

Add in-progress 304 rock updates to mgr:openfd_objects #221

Conversation

eduard-bagdasaryan commented Jul 27, 2023

eduard-bagdasaryan Jul 31, 2023

Choose a reason for hiding this comment

rousskov Aug 3, 2023

Choose a reason for hiding this comment

rousskov Aug 3, 2023

Choose a reason for hiding this comment