Adds Rake tasks to easily load NYU data locally #352

Merged: 2 commits, Aug 5, 2024
18 changes: 18 additions & 0 deletions README.md
@@ -88,3 +88,21 @@ Note: You'll know this step is necessary if an individual spec fails with an err
```plaintext
Blacklight::Exceptions::ECONNREFUSED: Connection refused - Unable to connect to Solr instance using #<RSolr::Client:0x0000000117d5e1a8 @uri=#<URI::HTTP http://127.0.0.1:8983/solr/sdr-core-test/>
```

### Loading NYU Data Locally

First, start up the Rails app if it's not already running:

```bash
$ bundle exec rake sdr:server
```

This will ensure Solr is up and running.

Then, in another terminal, run the following Rake task. Note that it deletes every document in the local Solr index before reindexing:

```bash
$ bundle exec rake sdr:load_nyu_data
```

Once the task completes, open <http://localhost:3000/?search_field=all_fields>; the search results should now include the NYU records.
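
To confirm the load without opening the browser, one option is to ask Solr for its document count. The following is a minimal Ruby sketch, assuming the core exposes the standard `select` handler with JSON output. `solr_num_found` and `solr_document_count` are hypothetical helper names that are not part of this PR, and the core URL is an assumption you must supply yourself (for example, whatever `Settings.SOLR_URL` resolves to locally).

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper (not part of the PR): pull the total hit count
# out of the body of a Solr JSON select response.
def solr_num_found(body)
  JSON.parse(body).fetch('response').fetch('numFound')
end

# Hypothetical helper (not part of the PR): ask a Solr core how many
# documents it holds. rows=0 requests the count without any documents.
# core_url is an assumption; pass the URL of your local core.
def solr_document_count(core_url)
  base = core_url.end_with?('/') ? core_url : "#{core_url}/"
  uri = URI.join(base, 'select')
  uri.query = URI.encode_www_form(q: '*:*', rows: 0, wt: 'json')
  solr_num_found(Net::HTTP.get(uri))
end
```

Calling `solr_document_count` with your local core URL before and after running `sdr:load_nyu_data` should show the count change if the index step succeeded.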
20 changes: 20 additions & 0 deletions lib/tasks/sdr.rake
@@ -78,4 +78,24 @@ namespace :sdr do
      end
    end
  end

  desc 'Clone and index NYU data for local development'
  task load_nyu_data: :environment do
    exit unless Rails.env.development?

    puts 'Removing existing edu.nyu repo...'
    FileUtils.rm_rf('tmp/opengeometadata/edu.nyu')

    puts 'Cloning edu.nyu repository...'
    system 'bundle exec sdr-cli clone --repo=edu.nyu'

    puts 'Deleting Solr index...'
    Blacklight.default_index.connection.delete_by_query '*:*'
    Blacklight.default_index.connection.commit

    puts 'Indexing edu.nyu Aardvark files...'
    system "bundle exec sdr-cli index --directory=\"tmp/opengeometadata/edu.nyu/metadata-aardvark/**/*.json\" --solr_url=\"#{Settings.SOLR_URL}\""

    puts 'Done!'
  end
end
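
The task as merged does not check the return value of either `system` call, so if the clone fails, the Solr index is still wiped and the index step runs against a missing directory. One possible guard is sketched below. `run_step` is a hypothetical helper name, not part of this PR; it relies only on the documented behavior that `Kernel#system` returns `true` on a zero exit status and `false` or `nil` otherwise.

```ruby
# Hypothetical helper (not part of the PR): announce a step, run the
# command, and raise if it exits non-zero or cannot be started.
def run_step(description, *command)
  puts description
  ok = system(*command)
  raise "Command failed: #{command.join(' ')}" unless ok

  true
end
```

Inside the task, a call such as `run_step('Cloning edu.nyu repository...', 'bundle exec sdr-cli clone --repo=edu.nyu')` placed before the delete would stop the run before the existing index is removed.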