Only check for solr alias changes on a timer
The implementation of `solr_cloud-connection` is purposefully chatty and caches nothing, because during the admin cycle you want to see your changes reflected immediately and the number of operations is generally small.

My use of uncached values (checking the name of the collection underlying an alias) on what turned out to be basically every call was disastrous.

This PR creates an instance of [Concurrent::TimerTask](https://ruby-concurrency.github.io/concurrent-ruby/master/Concurrent/TimerTask.html) that runs the check,
 and updates if needed, every 20 seconds.

 Notable changes are:
   * All the logic about updates is moved into `config/initializers/solr_admin_cache.rb`
   * The formerly-recursive method of determining the underlying collection
     name (because it's legal, in general, to have aliases that point to other
     aliases) has been removed, since we just don't need it.
   * The cached values are stored in `Concurrent::Atom` instances in
     the Services module with everything else
   * `load_local_config.rb` basically just calls `#value` on the
     Services values.
   * The footer now shows, ridiculously, the time down to the second.
     This is purely to make testing easier, because seeing a change in
     the footer date is an easy way to know the change has been made.
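
The pattern described in the list above (a cached value that is refreshed only when the underlying collection name changes, checked on a timer) can be sketched with the standard library alone. This is not the commit's code: `ValueHolder` is a simplified stand-in for `Concurrent::Atom`, `start_poller` is a stand-in for `Concurrent::TimerTask`, and `FakeAlias`, `CollectionCache`, and the `"data for ..."` string are hypothetical names and placeholders invented for illustration.

```ruby
# Thread-safe holder for one value; a simplified stand-in for Concurrent::Atom.
class ValueHolder
  def initialize(initial)
    @lock = Mutex.new
    @value = initial
  end

  def value
    @lock.synchronize { @value }
  end

  def reset(new_value)
    @lock.synchronize { @value = new_value }
  end
end

# Hypothetical alias object: it only knows the name of the collection behind it.
FakeAlias = Struct.new(:underlying_name)

class CollectionCache
  attr_reader :name, :derived

  def initialize
    @name = ValueHolder.new(:no_underlying_name_yet)
    @derived = ValueHolder.new(:no_derived_data_yet)
  end

  # Recompute the expensive data only when the underlying name has changed.
  # Returns true if a refresh happened, false otherwise.
  def refresh_if_changed(solr_alias)
    actual = solr_alias.underlying_name
    return false if actual == @name.value

    @derived.reset("data for #{actual}") # placeholder for the expensive Solr queries
    @name.reset(actual)
    true
  end
end

# Background poller; a stand-in for Concurrent::TimerTask.
def start_poller(cache, solr_alias, interval:)
  Thread.new do
    loop do
      sleep interval
      cache.refresh_if_changed(solr_alias)
    end
  end
end

cache = CollectionCache.new
solr_alias = FakeAlias.new("collection_v1")

cache.refresh_if_changed(solr_alias) # populate once, synchronously, at startup
poller = start_poller(cache, solr_alias, interval: 20)

puts cache.derived.value # readers only ever touch the cached value
```

Readers never talk to Solr in this sketch; they pay for one mutex-protected read, and the cost of the alias check is bounded by the polling interval rather than by request volume.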
billdueber committed Sep 24, 2024
1 parent 436321b commit fdb8aa1
Showing 6 changed files with 50 additions and 23 deletions.
2 changes: 2 additions & 0 deletions Gemfile
@@ -21,6 +21,8 @@ gem "aws-sdk-s3", "~> 1.160"
gem "content_disposition", "~> 1.0"
gem "uppy-s3_multipart", "~> 1.2"

gem "concurrent-ruby"

# CORS
gem 'rack-cors'

1 change: 1 addition & 0 deletions Gemfile.lock
@@ -455,6 +455,7 @@ DEPENDENCIES
bundler (~> 2.4.22)
canister
coffee-rails (~> 4.2)
concurrent-ruby
content_disposition (~> 1.0)
date_named_file
debase
2 changes: 1 addition & 1 deletion app/views/shared/_footer.html.erb
@@ -39,7 +39,7 @@
Data last refreshed <%= if Dromedary::Services[:looks_like_first_upload]
"never"
else
Dromedary.collection_creation_date.strftime("%A, %B %-e, %Y")
Dromedary.collection_creation_date_string
end
%>.
</p>
35 changes: 35 additions & 0 deletions config/initializers/solr_admin_cache.rb
@@ -0,0 +1,35 @@
S = Dromedary::Services

# Set up some places to work
S.register(:hyp_to_bibid) { Concurrent::Atom.new(:no_hyp_to_bib_id_yet) }
S.register(:collection_creation_date) { Concurrent::Atom.new(:no_creation_date_yet) }
S.register(:underlying_collection_name) {Concurrent::Atom.new(:no_underlying_name_yet) }


# Update the underlying concurrent variables.
# If the collection name underlying the (presumed) alias we're working with changes,
# update the collection-specific data and reset our understanding of the
# current collection name.
def update_timeout_variables
Rails.logger.warn "################# CHECK FOR UPDATE ########################"
collection = S[:solr_current_collection]
actual_current_underlying_collection_name = collection.collection.name
expected_underlying_collection_name = S[:underlying_collection_name].value
if actual_current_underlying_collection_name != expected_underlying_collection_name
Rails.logger.warn "################# PERFORMING UPDATE ########################"
S[:hyp_to_bibid].reset MedInstaller::HypToBibId.get_from_solr(collection: collection)
S[:collection_creation_date].reset Dromedary.compute_collection_creation_date(actual_current_underlying_collection_name)
S[:underlying_collection_name].reset actual_current_underlying_collection_name
end
end

# The timer, with `run_now`, is supposed to run immediately, but I keep getting not-set-yet
# errors, so we'll run it once manually on startup.
update_timeout_variables

# Run the update method every 20 seconds
collection_timer = Concurrent::TimerTask.new(execution_interval: 20, run_now: true) do
update_timeout_variables
end
# Need to call #execute to actually fire up the timer
collection_timer.execute
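
A likely explanation for the not-set-yet errors mentioned in the comment above: a `Concurrent::TimerTask` runs its block on a background thread, so even with `run_now: true` the call to `#execute` returns before the first run has finished, and anything that reads the cached values in the meantime still sees the placeholder symbols. The sketch below reproduces that ordering with a plain `Thread` and a `Queue` as stand-ins; the variable names and the `Time.at(0)` value are illustrative only.

```ruby
gate = Queue.new
cached_date = :no_creation_date_yet

# Stand-in for the timer's first run: it happens on another thread,
# and only once that thread actually gets to do its work.
worker = Thread.new do
  gate.pop                 # simulates the delay before the background run completes
  cached_date = Time.at(0) # placeholder for the real creation date
end

early_read = cached_date   # read before the background run has finished
# early_read is still a Symbol here, so calling strftime on it would raise NoMethodError

gate << :go                # let the background run proceed
worker.join
late_read = cached_date    # now a Time, safe to format

puts early_read.inspect
puts late_read.respond_to?(:strftime)
```

Calling the update method once synchronously before starting the timer, as the initializer does, guarantees the values are populated before the first reader arrives.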
31 changes: 10 additions & 21 deletions config/load_local_config.rb
@@ -7,7 +7,7 @@
module Dromedary
class << self
def logger
MedInstaller::Logger::LOGGER
Rails.logger || MedInstaller::Logger::LOGGER
end

# For whatever historical reasons, this uses the Ettin gem to load
@@ -26,34 +26,23 @@ def config
ENV["RAILS_ENV"]
else
"development"
end
end
@config = Dromedary::Services
end

def hyp_to_bibid(collection: Dromedary::Services[:solr_current_collection])
logger.info "Trying to get hyp_to_bibid for collection #{collection}"
current_real_collection_name = underlying_real_collection_name(coll: collection)
logger.info "Real collection name identified as #{current_real_collection_name}"
if @recorded_real_collection_name != current_real_collection_name
@hyp_to_bibid = MedInstaller::HypToBibId.get_from_solr(collection: collection)
@recorded_real_collection_name = current_real_collection_name
@collection_creation_date = nil
end
@hyp_to_bibid

end

# @param coll [SolrCloud::Alias]
def underlying_real_collection_name(coll: Dromedary::Services[:solr_current_collection])
return coll.name unless coll.alias?
underlying_real_collection_name(coll: coll.collection)
def hyp_to_bibid
Dromedary::Services[:hyp_to_bibid].value
end

def collection_creation_date(coll: Dromedary::Services[:solr_current_collection])
return @collection_creation_date if defined?(@collection_creation_date) && !@collection_creation_date.nil?

real_collection_name = underlying_real_collection_name(coll: coll)
@collection_creation_date = compute_collection_creation_date(real_collection_name)
def collection_creation_date
Dromedary::Services[:collection_creation_date].value
end

def collection_creation_date_string
collection_creation_date.strftime("%A, %B %-e, %Y at %H:%M:%S")
end

def compute_collection_creation_date(coll)
2 changes: 1 addition & 1 deletion indexer/main_indexing_rules.rb
@@ -19,7 +19,7 @@
provide "solr_writer.basic_auth_password", Dromedary::Services[:solr_password]
end

hyp_to_bibid = Dromedary.hyp_to_bibid(collection: Dromedary::Services[:solr_collection_to_index_into])
hyp_to_bibid = MedInstaller::HypToBibId.get_from_solr(collection: Dromedary::Services[:solr_collection_to_index_into])
bibset = MiddleEnglishDictionary::Collection::BibSet.new(filename: settings["bibfile"])

# Do a terrible disservice to traject and monkeypatch it to take