PR to Study Prod Updates #1164

Open
wants to merge 387 commits into
base: prod_20241218
Changes from 250 commits
387 commits
3c7fab9
change DeltaUrl.delete to to_delete
CarsonDavis Nov 16, 2024
a88f3f6
updated patterns to refresh url lists
CarsonDavis Nov 16, 2024
d7208f7
update console logs for fetch_and_replace_full_text
CarsonDavis Nov 16, 2024
75cde3b
add properties for delta_urls_count and included_urls_count
CarsonDavis Nov 16, 2024
868798b
update collection admin to show number of curated and delta urls
CarsonDavis Nov 16, 2024
1dbbdfb
added html for delta and curated columns in collection list
Nov 16, 2024
07f4608
Merge branch '1051-backend-model-changes-on-cosmos-to-hold-new-incomi…
Nov 16, 2024
503f7f5
Commented out delta and curated count on home page. 'delta url column…
Nov 18, 2024
86764a8
update the fetch_and_replace_full_text to migrate_dump_to_delta on im…
CarsonDavis Nov 18, 2024
6fba669
Merge branch '1051-backend-model-changes-on-cosmos-to-hold-new-incomi…
CarsonDavis Nov 18, 2024
2ffb5c2
update collection html to display delta url count
CarsonDavis Nov 19, 2024
c83e1c6
update pattern application to not create excess deltaurls
CarsonDavis Nov 19, 2024
ee5bf6b
Fixed delta url filter bug
Nov 19, 2024
7a02ec5
Merge branch '1051-backend-model-changes-on-cosmos-to-hold-new-incomi…
Nov 19, 2024
936bddc
temporarily enable non-integration with slack
CarsonDavis Nov 19, 2024
715a571
Merge branch '1051-backend-model-changes-on-cosmos-to-hold-new-incomi…
Nov 19, 2024
6d7148f
add readme instructions for tmux
CarsonDavis Nov 19, 2024
8243abe
Merge branch '1051-backend-model-changes-on-cosmos-to-hold-new-incomi…
CarsonDavis Nov 19, 2024
f442d2d
improve delta pattern tests
CarsonDavis Nov 19, 2024
2b593b9
remove print statements in collection promotion
CarsonDavis Nov 19, 2024
a3bc1bd
improve fulltext import test
CarsonDavis Nov 19, 2024
0f57932
remove deprecated tasks.py code
CarsonDavis Nov 19, 2024
7a33d3e
add initial curatedurl apis
CarsonDavis Nov 19, 2024
a5a408c
add TDAMM_TAG_CHOICES to collection_choice_fields
Kirandawadi Nov 20, 2024
e8b0cf0
delete is_tdamm switch functionality
Kirandawadi Nov 20, 2024
b65c693
add test cases for two column tags functionality
Kirandawadi Nov 15, 2024
0707500
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 15, 2024
1a2e792
merge 1051-backend-model-changes-on-cosmos-to-hold-new-incoming-urls-…
Kirandawadi Nov 20, 2024
0f0d407
Merge pull request #1073 from NASA-IMPACT/add_per_indicator_thresholding
CarsonDavis Nov 20, 2024
6112e9a
Merge branch 'dev' into 1051-backend-model-changes-on-cosmos-to-hold-…
CarsonDavis Nov 20, 2024
e45eeeb
refactor code for DeltaUrl model
Kirandawadi Nov 21, 2024
c2f164c
add cmr processing code and tests
CarsonDavis Nov 21, 2024
b955e67
move cmr processing into create_ej_dump
CarsonDavis Nov 21, 2024
94a2588
refactor ej processing and add readme
CarsonDavis Nov 21, 2024
ae0ff18
clarify code to only allow passage of ej classifications
CarsonDavis Nov 21, 2024
08124c4
remove references to default
CarsonDavis Nov 21, 2024
c2468ad
update config to include climate change threshold
CarsonDavis Nov 21, 2024
a6fe59c
update limitations to use weaknesses data
CarsonDavis Nov 21, 2024
6ee7333
update intended use to reference path names
CarsonDavis Nov 21, 2024
3c975fc
improve resolution processing code
CarsonDavis Nov 21, 2024
6a1906f
add spatial resolution tests and import error handling
CarsonDavis Nov 21, 2024
ff1484b
add tdamm_tag field to new serializers
Kirandawadi Nov 21, 2024
403c71d
update threshold processing to better handle not ej cases
CarsonDavis Nov 21, 2024
1e27e28
improve format and dataset name handling
CarsonDavis Nov 21, 2024
70ccb2c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 21, 2024
a976c9f
Fixes #1097
Nov 21, 2024
6634576
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 21, 2024
05ec13b
Updated_#1097
Nov 21, 2024
df4a1de
Fixes_Issue__#1097
Nov 21, 2024
03ce0e6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 21, 2024
1ea5168
Merge pull request #1090 from NASA-IMPACT/1051-backend-model-changes-…
CarsonDavis Nov 21, 2024
89756bf
Include Api tests #1097
Nov 21, 2024
280e491
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 21, 2024
3c11143
Include_Api_tests #1097
Nov 21, 2024
5d39c26
update processing of projects to the format shortname - longname
CarsonDavis Nov 21, 2024
73ad90e
add cmr example for testing
CarsonDavis Nov 21, 2024
e370c75
add initial readme explaining the pattern system
CarsonDavis Nov 21, 2024
eafb2b5
add initial exclude tests
CarsonDavis Nov 23, 2024
6518c47
remove destination_server and add datasource
CarsonDavis Nov 23, 2024
888c53b
add readme explaining EJ api behavior
CarsonDavis Nov 23, 2024
1a5ae32
update query to explicitly handle 'combined' parameter
CarsonDavis Nov 23, 2024
9a20863
add api tests for EJ
CarsonDavis Nov 23, 2024
a630969
finalize delta exclude tests
CarsonDavis Nov 23, 2024
095d7a4
refactor apply logic
CarsonDavis Nov 24, 2024
63ef2b4
create an InclusionPatternBase and inherit from it
CarsonDavis Nov 24, 2024
034ce84
refactor include processing to override excludes
CarsonDavis Nov 24, 2024
bf67004
consolidate pattern readmes
CarsonDavis Nov 24, 2024
73b266a
add lifecycle readme
CarsonDavis Nov 24, 2024
d81534d
fix tests and refactor related_names for patterns
CarsonDavis Nov 24, 2024
808c1ed
fix related name reference in serializers
CarsonDavis Nov 24, 2024
85ff6e5
add management command to deduplicate urls
CarsonDavis Nov 25, 2024
edff281
correct error in migrate_urls_and_patterns
CarsonDavis Nov 25, 2024
c1b9fc4
remove print statements from promotion code
CarsonDavis Nov 25, 2024
754fc93
Added test_migration command
Nov 25, 2024
8056b22
add readme on pattern resolution
CarsonDavis Nov 25, 2024
a1bd63e
add code to remove duplicate patterns
CarsonDavis Nov 25, 2024
2895627
celeryworker_updates
Nov 25, 2024
5f8e7e1
latest
Nov 25, 2024
4747f59
latest
Nov 25, 2024
437cd65
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 25, 2024
41dd9e5
latest
Nov 25, 2024
be86389
latest_
Nov 25, 2024
3e0f399
move all Title logic into delta_patterns.py
CarsonDavis Nov 25, 2024
b378388
implement smallest set rule application for titles
CarsonDavis Nov 25, 2024
d87b355
implement pattern specificity across entire class
CarsonDavis Nov 25, 2024
3bb1acc
adjust ResolvedTitle imports
CarsonDavis Nov 26, 2024
0fcf2ea
handle network errors and arbitrary errors during title resolution
CarsonDavis Nov 26, 2024
77b1ec3
Fixes #1096
Nov 26, 2024
60fdac1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 26, 2024
4c7834f
add an overview readme to introduce the system
CarsonDavis Nov 26, 2024
e285697
Merge pull request #1109 from NASA-IMPACT/1105-improve-pattern-applic…
CarsonDavis Nov 26, 2024
e2650e5
update the pattern resolution readme
CarsonDavis Nov 26, 2024
8c19323
add note about distinct
CarsonDavis Nov 26, 2024
3393078
Merge branch 'dev' into branch_#1097
CarsonDavis Nov 26, 2024
c4c1bc1
improved doc strings and the errors
CarsonDavis Nov 26, 2024
5a9308c
Continue in case of missing records
Dec 2, 2024
cba90fa
prints for each processed and updated batch
Dec 2, 2024
5306d32
Merge pull request #1108 from NASA-IMPACT/1107-ej-integrate-original-…
CarsonDavis Dec 3, 2024
08e6070
improve cmr tests
CarsonDavis Dec 3, 2024
2e6cc06
Merge pull request #1102 from NASA-IMPACT/update_cmr_mappings
CarsonDavis Dec 3, 2024
1b23afa
API tests
Dec 3, 2024
a6c6d69
Added test command
Dec 3, 2024
a598171
improve processing of temporal extent to include ranges as well as si…
CarsonDavis Dec 3, 2024
8f0a3ee
Affected Delta URLs header added
Dec 3, 2024
be9dee4
rename fstring validator to validate_fstring
CarsonDavis Dec 3, 2024
caf0840
add tests for title resolution
CarsonDavis Dec 3, 2024
81a9ff9
update DeltaTitleErrors to not create duplicates
CarsonDavis Dec 3, 2024
82462f7
add tiebreaker logic to pattern specificity
CarsonDavis Dec 3, 2024
983e316
Merge pull request #1118 from NASA-IMPACT/1115-improve-title-processi…
CarsonDavis Dec 3, 2024
a739fd8
Merge pull request #1117 from NASA-IMPACT/3003-affected-urls
CarsonDavis Dec 3, 2024
57ec0c0
Merge pull request #1114 from NASA-IMPACT/3034-cosmos-api-test-cases
CarsonDavis Dec 3, 2024
49dda0c
add readme not about running management commands
CarsonDavis Dec 3, 2024
334f138
minor improvements to ej processing readme
CarsonDavis Dec 3, 2024
52b26a0
update process_ej_dump to use new model choice values
CarsonDavis Dec 3, 2024
d7620ad
Merge branch 'dev' into branch_#1097
CarsonDavis Dec 3, 2024
b91405d
refactor sinequa_api wrapper, test suites, and full_text import
CarsonDavis Dec 4, 2024
e54f94b
Merge pull request #1104 from NASA-IMPACT/branch_#1097
CarsonDavis Dec 4, 2024
5a34a3e
Updated page title to URLs
Dec 4, 2024
30601c9
Updated subtitle to collection name
Dec 4, 2024
0726e5f
Updated serializer to include the to_delete field
Dec 4, 2024
903b219
Merge pull request #1120 from NASA-IMPACT/3002-subtitle-updates
CarsonDavis Dec 4, 2024
b5beeb3
Refresh page on workflow status change
Dec 5, 2024
219cd1c
Merge pull request #1124 from NASA-IMPACT/2895-trigger-frontend-refre…
CarsonDavis Dec 5, 2024
f35a7de
Merge branch 'dev' into 1093-write-tests-for-two-column-tags-function…
Kirandawadi Dec 5, 2024
1308ebc
update template worker counts to 3
CarsonDavis Dec 5, 2024
1b70b52
Fix incorrect import
Kirandawadi Dec 5, 2024
e0ac9e4
merge migrations conflict
Kirandawadi Dec 5, 2024
74d02d7
add initial reindexing statuses
CarsonDavis Dec 5, 2024
8f87aef
Update sde_collections/models/candidate_url.py
CarsonDavis Dec 5, 2024
de20d47
add a TDAMM field Not TDAMM
CarsonDavis Dec 5, 2024
547401c
add tdamm tags to the api serializer tests
CarsonDavis Dec 5, 2024
b001d04
Merge pull request #1103 from NASA-IMPACT/1093-write-tests-for-two-co…
CarsonDavis Dec 6, 2024
2a6e64e
Merge branch 'dev' into 1055-add-additional-workflow-statuses-on-cosm…
CarsonDavis Dec 6, 2024
a41faca
add initial status documentation
CarsonDavis Dec 6, 2024
b8bdc4c
add initial migration to add reindexing statuses
CarsonDavis Dec 6, 2024
ad82236
add migration logic to set statuses
CarsonDavis Dec 6, 2024
1f70b32
add initial frontend fixes
CarsonDavis Dec 6, 2024
55c25f6
to_delete column visible on url page
Dec 6, 2024
986f416
fix the display of the reindexing pane and annotate html file for cla…
CarsonDavis Dec 6, 2024
1d90663
refactor collection list to use column names instead of indices
CarsonDavis Dec 6, 2024
04ba82c
switch to text insertion instead of html in collection list
CarsonDavis Dec 6, 2024
11a366f
Merge pull request #1125 from NASA-IMPACT/1055-add-additional-workflo…
CarsonDavis Dec 6, 2024
18bdaa4
Merge pull request #1121 from NASA-IMPACT/3007-view-deleted-urls-unde…
CarsonDavis Dec 6, 2024
ee7229c
change worker count in new_collection_template
CarsonDavis Dec 6, 2024
1b71c2d
update worker count in default scraper
CarsonDavis Dec 6, 2024
2b811b6
add automatic batch size reduction to sinequa_api
CarsonDavis Dec 7, 2024
2c0d361
add postgres to django dockerfile for better backup system
CarsonDavis Dec 9, 2024
97f756a
add initial commands for database restore and backup
CarsonDavis Dec 9, 2024
e59366a
refactor database_backup and include compression
CarsonDavis Dec 9, 2024
48c66b8
improve logic in get_backup_filename
CarsonDavis Dec 9, 2024
bbb0d4a
refactor and add tests for database restores
CarsonDavis Dec 10, 2024
a9e63bb
add database backup and restore information to main readme
CarsonDavis Dec 10, 2024
c889c15
Merge pull request #1127 from NASA-IMPACT/1126-managepy-command-for-d…
CarsonDavis Dec 10, 2024
7cd8c4d
Updated database restore command
Dec 10, 2024
b65e278
remove count calculations defined as properties
bishwaspraveen Dec 10, 2024
69950af
code to calculate the URL counts within the admin
bishwaspraveen Dec 10, 2024
09d7438
update all backup docstrings and readme with clearer volume mount info
CarsonDavis Dec 10, 2024
626285d
Merge pull request #1131 from NASA-IMPACT/3055-optmize-the-retrieval-…
CarsonDavis Dec 10, 2024
febee1b
Merge pull request #1130 from NASA-IMPACT/database_import_bug_fixes
CarsonDavis Dec 10, 2024
e934a9c
refactor status change logic and add indexing complete status
CarsonDavis Dec 10, 2024
623834e
initial addition of frontend status change
CarsonDavis Dec 10, 2024
5ca98a4
shorten display names for reindexing statuses
CarsonDavis Dec 10, 2024
b5c1c42
fix modal bug and add workflow status
CarsonDavis Dec 11, 2024
8de43c7
register reindexing_status in the field tracker and add curation prom…
CarsonDavis Dec 11, 2024
a8a3378
rename readme to pattern overview
CarsonDavis Dec 11, 2024
4e288ac
add readme for status triggers
CarsonDavis Dec 11, 2024
9f9654f
add overview readme to link all the documentation
CarsonDavis Dec 11, 2024
5efa64d
add initial tests for workflow status triggers
CarsonDavis Dec 11, 2024
85574aa
update fulltext tests to break signals
CarsonDavis Dec 11, 2024
99cfb9f
Merge pull request #1134 from NASA-IMPACT/1133-refactor-indexing-stat…
CarsonDavis Dec 11, 2024
f231ba8
Updated dockerignore and gitignore
Dec 11, 2024
2ea0494
Fixes #1112
Dec 11, 2024
1cd038b
add ignores for more venvs and backup files
CarsonDavis Dec 11, 2024
2208bdc
Merge pull request #1135 from NASA-IMPACT/update-dockerignore-gitignore
CarsonDavis Dec 11, 2024
018c2b7
change production docker volume permissions
CarsonDavis Dec 11, 2024
178b5bc
add promotion tests for overlapping title patterns and title changes
CarsonDavis Dec 11, 2024
0b83d17
prevent promotion from copying id's over to curatedurls
CarsonDavis Dec 11, 2024
430a1c1
add additional promotion tests
CarsonDavis Dec 11, 2024
93392df
add clarification about pattern behavior to the lifecycle readme
CarsonDavis Dec 11, 2024
6b73771
Merge pull request #1140 from NASA-IMPACT/1139-resolve-id-conflict-wh…
bishwaspraveen Dec 11, 2024
5ad6f4c
add explanatory commentary to the lifecycle readme
CarsonDavis Dec 12, 2024
6f528b8
Updated title pane to Delta URLs
Dec 12, 2024
56c61f6
Merge pull request #1141 from NASA-IMPACT/fix-homepage-pane
CarsonDavis Dec 12, 2024
5f21ace
write a draft testing guide
CarsonDavis Dec 12, 2024
04071e9
change to an open link for testing doc
CarsonDavis Dec 12, 2024
a0a80e9
rename manual testing readme
CarsonDavis Dec 12, 2024
434c9b0
Filters fixed
Dec 13, 2024
eddad84
refactor readme for unapply logic
CarsonDavis Dec 13, 2024
6a52eaf
update promotion code to treat empty stings and null values as meanin…
CarsonDavis Dec 13, 2024
ec471b0
correct examples 5 and 6 in unapply logic readme
CarsonDavis Dec 13, 2024
393402c
update Field modifier unapply to handle pattern overlaps
CarsonDavis Dec 13, 2024
48e0ac9
add dedicated test suite for field modifier unapply
CarsonDavis Dec 13, 2024
cf86eee
add tests for title pattern unapply
CarsonDavis Dec 13, 2024
85a65d0
update unapply logic for title patterns to include overlapping patterns
CarsonDavis Dec 13, 2024
aef7cdb
Merge pull request #1146 from NASA-IMPACT/1142-test-field-modified-un…
CarsonDavis Dec 13, 2024
c316960
Merge pull request #1145 from NASA-IMPACT/1144-fix-url-page-filters
CarsonDavis Dec 13, 2024
0e33fe5
add new field to reindexing statuses
CarsonDavis Dec 13, 2024
6651bae
Merge pull request #1148 from NASA-IMPACT/1147-add-new-status-for-re-…
CarsonDavis Dec 13, 2024
5922d01
Add documentation for PairedFieldDescriptor implementation
Kirandawadi Dec 16, 2024
3c8b985
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 16, 2024
633cc15
Conditional anchor updated for 0 Delta URLs
Dec 16, 2024
3e828b2
Javascript updated
Dec 18, 2024
0064403
Updated collection.py
Dec 18, 2024
a081574
Removed console log statements
Dec 18, 2024
60b9afd
Merge pull request #1161 from NASA-IMPACT/1154-delta-urls-with-a-coun…
CarsonDavis Dec 18, 2024
a1e29f9
fixed paging on excludes and includes tabs
bishwaspraveen Dec 18, 2024
26a8f99
Merge pull request #1163 from NASA-IMPACT/1153-pagination-limited-to-…
CarsonDavis Dec 18, 2024
9acfb14
remove assignment of reindexing finished status in migration file
CarsonDavis Dec 18, 2024
6bc1767
Merge branch 'dev' of github.com:NASA-IMPACT/COSMOS into dev
CarsonDavis Dec 18, 2024
7fcf45a
change name to COSMOS
CarsonDavis Dec 18, 2024
c8cb5cc
change name from SDE Indexing Helper to COSMOS
CarsonDavis Dec 18, 2024
01b9a1e
remove SVG SDE Indexing Helper logo and replace with simple COSMOS text
CarsonDavis Dec 18, 2024
d8250ee
remove <br> blocking buttons
CarsonDavis Dec 18, 2024
de941ce
test adding long timeout to traefik
CarsonDavis Dec 18, 2024
fb7a809
simplify timeout handling in traefik
CarsonDavis Dec 18, 2024
094e6b0
try a timeout of 300s
CarsonDavis Dec 18, 2024
5164e40
add timeout to the services layer
CarsonDavis Dec 18, 2024
b44a303
create a dedicated section for serversTransport
CarsonDavis Dec 18, 2024
c4fadd0
move the timeout to the services section
CarsonDavis Dec 18, 2024
7f68c9e
revert timeouts in traefik
CarsonDavis Dec 18, 2024
31b82d7
add long timeout middleware
CarsonDavis Dec 18, 2024
884e2c2
use timeouts instead of forwarding timeouts
CarsonDavis Dec 18, 2024
a34e780
remove middleware timeouts and add transport timeouts
CarsonDavis Dec 18, 2024
30bfbc3
update the gunicorn timeout to match the traefik
CarsonDavis Dec 18, 2024
c84972e
temporarily increase timeout to 10 minutes
CarsonDavis Dec 18, 2024
21a921a
column addition work in progress
Dec 19, 2024
5bd6847
rename to api_tests to test_sinequa_api
CarsonDavis Dec 19, 2024
9a83b9d
rename test_apis to test_url_apis
CarsonDavis Dec 19, 2024
0ad2dc7
rename test_views to test_ej_api
CarsonDavis Dec 19, 2024
fcc70eb
Merge pull request #1162 from NASA-IMPACT/1150-status-button-color-ma…
CarsonDavis Dec 19, 2024
9ebdae5
Merge pull request #1160 from NASA-IMPACT/1159-add-documentation-for-…
CarsonDavis Dec 19, 2024
e51ce45
add release notes file
CarsonDavis Dec 19, 2024
8df561a
Merge branch 'dev' of github.com:NASA-IMPACT/COSMOS into dev
CarsonDavis Dec 19, 2024
c7e74b2
Allow individual URL inclusion to override multi-URL excludes
Kirandawadi Dec 19, 2024
456c763
Modify serializers
Kirandawadi Dec 19, 2024
f8c2329
correct the count of ej spreadsheet values
CarsonDavis Dec 19, 2024
7348a76
update the code for deleting duplicates and migrating collections
CarsonDavis Dec 20, 2024
4294dfd
Added affected curated urls count on url pattern pages
Dec 24, 2024
bb7dde7
Merge branch 'dev' of https://github.com/NASA-IMPACT/COSMOS into 3000…
Dec 24, 2024
4053928
Match type pattern input added
Dec 28, 2024
9f6ed26
HTML updated
Dec 28, 2024
9635cf3
Rectified curated url link condition
Dec 31, 2024
e632d74
Curated URL button CSS condition
Dec 31, 2024
5ac8432
Fixed count bug in views
Dec 31, 2024
b507e37
Merge pull request #1172 from NASA-IMPACT/1157-specify-pattern-match-…
CarsonDavis Jan 7, 2025
2660973
Merge pull request #1170 from NASA-IMPACT/3000-add-curated-urls-colum…
CarsonDavis Jan 7, 2025
909c9a6
Removed curated_url_count from DeltaURLSerializer
Jan 8, 2025
efd3fd6
Merge pull request #1169 from NASA-IMPACT/1116-add-a-column-to-displa…
bishwaspraveen Jan 8, 2025
4083d7d
Merge pull request #1136 from NASA-IMPACT/1112-uniform-handling-of-er…
bishwaspraveen Jan 8, 2025
99a72f4
Merge pull request #1167 from NASA-IMPACT/1156-update-exclude-checkma…
bishwaspraveen Jan 8, 2025
10 changes: 9 additions & 1 deletion .dockerignore
@@ -7,5 +7,13 @@
.pre-commit-config.yaml
.readthedocs.yml
.travis.yml
venv
.git

# ignore local python environments
venv
.venv

# prevent large backup files from being copied into the image
/backups
*.sql
*.gz
12 changes: 10 additions & 2 deletions .envs/.local/.django
@@ -33,9 +33,17 @@ SINEQUA_CONFIGS_REPO_WEBAPP_PR_BRANCH='dummy_branch'
# Slack Webhook
# ------------------------------------------------------------------------------
SLACK_WEBHOOK_URL=''
LRM_USER=''
LRM_PASSWORD=''

#Server Credentials
#--------------------------------------------------------------------------------
LRM_DEV_USER=''
LRM_DEV_PASSWORD=''
XLI_USER=''
XLI_PASSWORD=''
LRM_QA_USER=''
LRM_QA_PASSWORD=''

#Server Tokens
#--------------------------------------------------------------------------------
LRM_DEV_TOKEN=''
XLI_TOKEN=''
9 changes: 4 additions & 5 deletions .gitignore
@@ -292,8 +292,7 @@ config_generation/config.py
# Model's inference files
Document_Classifier_inference/model.pt

# Database backup
backup.json

# Prod backup
prod_backup-20240423.json
# Ignore Database Backup files
/backups
*.sql
*.gz
118 changes: 92 additions & 26 deletions README.md
@@ -18,7 +18,6 @@ $ docker-compose -f local.yml build
```bash
$ docker-compose -f local.yml up
```

### Non-Docker Local Setup

If you prefer to run the project without Docker, follow these steps:
@@ -69,57 +68,103 @@ $ docker-compose -f local.yml run --rm django python manage.py createsuperuser
#### Creating Additional Users

Create additional users through the admin interface (/admin).
## Database Backup and Restore

COSMOS provides dedicated management commands for backing up and restoring your PostgreSQL database. These commands handle both compressed and uncompressed backups and work in both local and production Docker environments.

### Loading Fixtures
### Backup Directory Structure

To load collections:
All backups are stored in the `/backups` directory at the root of your project. This directory is mounted as a volume in both local and production Docker configurations, making it easy to manage backups across different environments.

- Local development: `./backups/`
- Production server: `/path/to/project/backups/`

If the directory doesn't exist, create it:
```bash
$ docker-compose -f local.yml run --rm django python manage.py loaddata sde_collections/fixtures/collections.json
mkdir backups
```

### Manually Creating and Loading a ContentTypeless Backup
Navigate to the server running prod, then to the project folder. Run the following command to create a backup:
### Creating a Database Backup

To create a backup of your database:

```bash
docker-compose -f production.yml run --rm --user root django python manage.py dumpdata --natural-foreign --natural-primary --exclude=contenttypes --exclude=auth.Permission --indent 2 --output /app/backups/prod_backup-20240812.json
# Create a compressed backup (recommended)
docker-compose -f local.yml run --rm django python manage.py database_backup

# Create an uncompressed backup
docker-compose -f local.yml run --rm django python manage.py database_backup --no-compress

# Specify custom output location within backups directory
docker-compose -f local.yml run --rm django python manage.py database_backup --output my_custom_backup.sql
```
This will have saved the backup in a folder outside of the docker container. Now you can copy it to your local machine.

The backup command will automatically:
- Detect your server environment (Production/Staging/Local)
- Use database credentials from your environment settings
- Generate a dated filename if no output path is specified
- Save the backup to the mounted `/backups` directory
- Compress the backup by default (can be disabled with --no-compress)
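
The naming and compression behaviour listed above can be sketched as follows. This is a minimal illustration, not the code of the `database_backup` command: the function names `sketch_backup_filename` and `compress_file` and the exact filename layout are assumptions made for the example.

```python
import gzip
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def sketch_backup_filename(environment: str, compress: bool = True,
                           today: Optional[date] = None) -> str:
    """Build a dated filename (hypothetical layout: <env>_backup_<YYYYMMDD>.sql[.gz])."""
    today = today or date.today()
    suffix = ".sql.gz" if compress else ".sql"
    return f"{environment.lower()}_backup_{today:%Y%m%d}{suffix}"


def compress_file(source: Path) -> Path:
    """Write a gzip-compressed copy of `source` next to it and return its path."""
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as raw, gzip.open(target, "wb") as packed:
        shutil.copyfileobj(raw, packed)
    return target
```

Streaming with `shutil.copyfileobj` keeps memory use flat, which matters for multi-gigabyte dumps.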

### Restoring from a Database Backup

To restore your database, the backup file must be in the `/backups` directory. Then run the following command:

```bash
mv ~/prod_backup-20240812.json <project_path>/prod_backup-20240812.json
scp sde:/home/ec2-user/sde_indexing_helper/backups/prod_backup-20240812.json prod_backup-20240812.json
# Restore from a backup (handles both .sql and .sql.gz files)
docker-compose -f local.yml run --rm django python manage.py database_restore backups/backup_file_name.sql.gz
```

Finally, load the backup into your local database:
The restore command will:
- Automatically detect if the backup is compressed (.gz)
- Terminate existing database connections
- Drop and recreate the database
- Restore all data from the backup
- Handle all database credentials from your environment settings
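
The restore steps above can be sketched roughly as below. This is an illustration under stated assumptions, not the `database_restore` implementation: the helper names are hypothetical, compression is inferred only from the `.gz` suffix, and the SQL shows one common way to terminate connections and recreate a PostgreSQL database.

```python
import gzip
import re
from pathlib import Path
from typing import IO, List


def is_compressed(backup: Path) -> bool:
    """Treat a backup as gzip-compressed when its name ends in '.gz'."""
    return backup.suffix == ".gz"


def open_backup(backup: Path) -> IO[bytes]:
    """Open a backup for reading, decompressing transparently when needed."""
    if is_compressed(backup):
        return gzip.open(backup, "rb")
    return open(backup, "rb")


def reset_statements(db_name: str) -> List[str]:
    """Return SQL that drops other connections, then drops and recreates the database."""
    # Only plain identifiers are accepted, so the name can be embedded safely.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", db_name):
        raise ValueError(f"unexpected database name: {db_name!r}")
    return [
        "SELECT pg_terminate_backend(pid) FROM pg_stat_activity "
        f"WHERE datname = '{db_name}' AND pid <> pg_backend_pid();",
        f'DROP DATABASE IF EXISTS "{db_name}";',
        f'CREATE DATABASE "{db_name}";',
    ]
```

These statements would have to be run while connected to a different database (for example `postgres`), because a database cannot be dropped from a session that is connected to it.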

### Working with Remote Servers

When working with production or staging servers:

1. First, SSH into the appropriate server:
```bash
docker-compose -f local.yml run --rm django python manage.py loaddata prod_backup-20240812.json
# For production
ssh user@production-server
cd /path/to/project
```

### Loading the Database from an Arbitrary Backup
2. Create a backup on the remote server:
```bash
docker-compose -f production.yml run --rm django python manage.py database_backup
```

1. Build the project and run the necessary containers (as documented above).
2. Clear out content types using the Django shell:
3. Copy the backup from the remote server's backup directory to your local machine:
```bash
scp user@remote-server:/path/to/project/backups/backup_name.sql.gz ./backups/
```

4. Restore locally:
```bash
$ docker-compose -f local.yml run --rm django python manage.py shell
>>> from django.contrib.contenttypes.models import ContentType
>>> ContentType.objects.all().delete()
>>> exit()
docker-compose -f local.yml run --rm django python manage.py database_restore backups/backup_name.sql.gz
```

3. Load your backup database:
### Alternative Methods

While the database_backup and database_restore commands are the recommended approach, you can also use Django's built-in fixtures for smaller datasets:

```bash
$ docker cp /path/to/your/backup.json container_name:/path/inside/container/backup.json
$ docker-compose -f local.yml run --rm django python manage.py loaddata /path/inside/the/container/backup.json
$ docker-compose -f local.yml run --rm django python manage.py migrate
# Create a backup excluding content types
docker-compose -f production.yml run --rm django python manage.py dumpdata \
--natural-foreign --natural-primary \
--exclude=contenttypes --exclude=auth.Permission \
--indent 2 \
--output backups/prod_backup-$(date +%Y%m%d).json

# Restore from a fixture
docker-compose -f local.yml run --rm django python manage.py loaddata backups/backup_name.json
```
### Restoring the Database from a SQL Dump
If the JSON file is particularly large (>1.5GB), Docker might struggle with this method. In such cases, you can use SQL dump and restore commands as an alternative, as described [here](./SQLDumpRestoration.md).


Note: For large databases (>1.5GB), the database_backup and database_restore commands are strongly recommended over JSON fixtures as they handle large datasets more efficiently.

## Additional Commands

@@ -208,3 +253,24 @@ Eventually, job creation will be done seamlessly by the webapp. Until then, edit
- JavaScript: `/sde_indexing_helper/static/js`
- CSS: `/sde_indexing_helper/static/css`
- Images: `/sde_indexing_helper/static/images`


## Running Long Scripts on the Server
```shell
tmux new -s docker_django
```
Once you are inside the session, you can run `dmshell` or, for example, a management command:

```shell
docker-compose -f production.yml run --rm django python manage.py deduplicate_urls
```

Later, reattach to the session to get back in:
```shell
tmux attach -t docker_django
```

To delete the session:
```shell
tmux kill-session -t docker_django
```
109 changes: 108 additions & 1 deletion SQLDumpRestoration.md
@@ -82,4 +82,111 @@ docker-compose -f local.yml up
docker-compose -f local.yml run --rm django python manage.py createsuperuser
```

8. Log in to the SDE Indexing Helper frontend to ensure that all data has been correctly populated in the UI.
8. Log in to the COSMOS frontend to ensure that all data has been correctly populated in the UI.



# making the backup

```bash
ssh sde
cat .envs/.production/.postgres
```

find the values for the variables:
POSTGRES_HOST=sde-indexing-helper-db.c3cr2yyh5zt0.us-east-1.rds.amazonaws.com
POSTGRES_PORT=5432
POSTGRES_DB=postgres
POSTGRES_USER=postgres
POSTGRES_PASSWORD=this_is_A_web_application_built_in_2023

```bash
docker ps
```

b3fefa2c19fb

note here that you need to put the
```bash
docker exec -t your_postgres_container_id pg_dump -U your_postgres_user -d your_database_name > backup.sql
```
```bash
docker exec -t container_id pg_dump -h host -U user -d database -W > prod_backup.sql
```

docker exec -t b3fefa2c19fb env PGPASSWORD="this_is_A_web_application_built_in_2023" pg_dump -h sde-indexing-helper-db.c3cr2yyh5zt0.us-east-1.rds.amazonaws.com -U postgres -d postgres > prod_backup.sql
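
The manual steps above (read the `POSTGRES_*` values, then pass them to `pg_dump`) could be scripted along these lines. This is a sketch only: `parse_env_file` and `pg_dump_command` are hypothetical helpers, and the env file is assumed to contain simple `KEY=VALUE` lines.

```python
from typing import Dict, List, Tuple


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def pg_dump_command(settings: Dict[str, str],
                    output_path: str) -> Tuple[List[str], Dict[str, str]]:
    """Return the pg_dump argument list and the extra environment it needs."""
    args = [
        "pg_dump",
        "-h", settings["POSTGRES_HOST"],
        "-p", settings["POSTGRES_PORT"],
        "-U", settings["POSTGRES_USER"],
        "-d", settings["POSTGRES_DB"],
        "-f", output_path,
    ]
    # The password travels through PGPASSWORD so it never appears in the argument list.
    env = {"PGPASSWORD": settings["POSTGRES_PASSWORD"]}
    return args, env
```

The returned pair can be handed to `subprocess.run(args, env={**os.environ, **env}, check=True)`, which avoids both shell quoting and pasting the password into the command line.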

# move the backup to local
go back to local computer and scp the file

```bash
scp sde:/home/ec2-user/sde_indexing_helper/prod_backup.sql .
```
scp prod_backup.sql sde_staging:/home/ec2-user/sde-indexing-helper
if you have trouble transferring the file, you can use rsync:
rsync -avzP prod_backup.sql sde_staging:/home/ec2-user/sde-indexing-helper/

# restoring the backup
bring down the local containers
```bash
docker-compose -f local.yml down
docker-compose -f local.yml up postgres
docker ps
```

find the container id

c11d7bae2e56

find the local variables from
cat .envs/.production/.postgres
POSTGRES_HOST=sde-indexing-helper-staging-db.c3cr2yyh5zt0.us-east-1.rds.amazonaws.com
POSTGRES_PORT=5432
POSTGRES_DB=sde_staging
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres


```bash
docker exec -it <container id> bash
```
docker exec -it c11d7bae2e56 bash

## run the database commands you need


psql -U <POSTGRES_USER> -d <POSTGRES_DB>
psql -U postgres -d sde_staging
or, if you are on one of the servers:
psql -h sde-indexing-helper-staging-db.c3cr2yyh5zt0.us-east-1.rds.amazonaws.com -U postgres -d postgres

\c postgres
DROP DATABASE sde_staging;
CREATE DATABASE sde_staging;

# load the backup into the database

```bash
docker cp prod_backup.sql c11d7bae2e56:/
docker exec -it c11d7bae2e56 bash
```

```bash
psql -U <POSTGRES_USER> -d <POSTGRES_DB> -f backup.sql
```
psql -U VnUvMKBSdkoFIETgLongnxYHrYVJKufn -d sde_indexing_helper -f prod_backup.sql

psql -h sde-indexing-helper-staging-db.c3cr2yyh5zt0.us-east-1.rds.amazonaws.com -U postgres -d postgres -f prod_backup.sql
pg_restore -h sde-indexing-helper-staging-db.c3cr2yyh5zt0.us-east-1.rds.amazonaws.com -U postgres -d postgres prod_backup.sql
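
Which of the two commands above applies depends on the dump format: `pg_dump` without a format flag writes plain SQL, which is loaded with `psql -f`, while `pg_restore` expects an archive format. A minimal sketch for telling them apart, assuming only the custom format needs detecting (its files begin with the bytes `PGDMP`); the helper names are hypothetical:

```python
from pathlib import Path

# Custom-format archives written by `pg_dump -Fc` start with this marker.
CUSTOM_FORMAT_MAGIC = b"PGDMP"


def restore_tool_for_header(header: bytes) -> str:
    """Pick 'pg_restore' for custom-format archives and 'psql' for anything else."""
    return "pg_restore" if header.startswith(CUSTOM_FORMAT_MAGIC) else "psql"


def restore_tool_for_file(dump: Path) -> str:
    """Read the first bytes of `dump` and choose the matching restore tool."""
    with open(dump, "rb") as handle:
        return restore_tool_for_header(handle.read(len(CUSTOM_FORMAT_MAGIC)))
```

The sketch does not recognise tar or directory archives, which are also restored with `pg_restore`.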



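docker-compose -f local.yml down

docker-compose -f local.yml up --build

docker-compose -f local.yml run --rm django python manage.py migrate

docker-compose -f local.yml down

docker-compose -f local.yml up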
7 changes: 7 additions & 0 deletions compose/local/django/Dockerfile
@@ -38,13 +38,20 @@ WORKDIR ${APP_HOME}

# Install required system dependencies
RUN apt-get update && apt-get install --no-install-recommends -y \
wget \
gnupg \
# psycopg2 dependencies
libpq-dev \
# Translations dependencies
gettext \
# pycurl dependencies
libcurl4-openssl-dev \
libssl-dev \
# PostgreSQL 15
&& sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt bullseye-pgdg main" > /etc/apt/sources.list.d/pgdg.list' \
&& wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
&& apt-get update \
&& apt-get install -y postgresql-15 postgresql-client-15 \
# cleaning up unused files
&& apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
&& rm -rf /var/lib/apt/lists/*