
NC | NSFS | Add stat to bucket_namespace_cache #8527

Merged
merged 1 commit into from
Nov 21, 2024

Conversation

shirady
Contributor

@shirady shirady commented Nov 13, 2024

Explain the changes

  1. Create config option NC_ENABLE_BUCKET_NS_CACHE_STAT_VALIDATION to enable/disable this addition.
  2. Add in read_bucket_sdk_info inside bucketspace FS the property stat, and add a method stat_bucket that would only be implemented in FS (as we stat config files; in NB with DB we don't need this).
  3. Edit _validate_bucket_namespace so that, before checking validity by time, in the bucketspace FS case we compare the stat and return false (invalidate) if the config file was changed.
  4. In nc_coretest, create the functions start_nsfs_process and stop_nsfs_process based on the existing implementation, to allow us to restart NSFS.
  5. Rename the function check_bucket_config to check_stat_bucket_storage_path so that it is less confusing.
  6. Edit the printings of "Could not log bucket operation" to include details of where they were printed.
  7. Add a basic test of running with a couple of forks (based on this comment).
  8. Change the default number of forks in the Ceph NSFS tests in the CI to 2 (see comment).
  9. Rename the file test_bucketspace.js to test_nsfs_integration.js and the nc_coretest log files from src/logfile to nsfs_integration_test_log.
  10. Small changes in s3_bucket_logging.js regarding bucket deletion: bucket existed and bucket that did not exist (see comment) - see NC | NSFS | Bucket Logging After Delete Bucket #8540.

Note: the changes are based on the suggestion written as a comment.

Issues: Fixed #8391

  1. Currently, the bucket configuration is saved in a file, but when using the bucket_namespace_cache it expires after 1 minute, see noobaa-core/config.js lines 637 to 638:

        // Object SDK bucket cache expiration time
        config.OBJECT_SDK_BUCKET_CACHE_EXPIRY_MS = 60000;

     We want to invalidate a bucket config not only by the time criterion.
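For illustration, a purely time-based freshness check driven by an expiry like OBJECT_SDK_BUCKET_CACHE_EXPIRY_MS could look like this. This is a minimal sketch with illustrative names (`is_entry_fresh`, `loaded_at`), not the noobaa implementation:

```javascript
// Minimal sketch of time-only cache freshness (illustrative names).
const CACHE_EXPIRY_MS = 60000; // mirrors config.OBJECT_SDK_BUCKET_CACHE_EXPIRY_MS

function is_entry_fresh(entry, now = Date.now()) {
    // With only this check, an edited bucket config file keeps being served
    // from the cache until the expiry elapses - the gap this PR closes by
    // also comparing the config file's stat.
    return now - entry.loaded_at < CACHE_EXPIRY_MS;
}
```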

GAPs:

  1. We would probably want to do the same on account_cache and account_id_cache, but currently they have lower priority than the bucket_namespace_cache.
  2. This PR applies only to the NC deployment; we may want something similar in Containerized (from what I understand, we don't invalidate the items in the cache after every "make_changes" bucket operation), see MCGI-274.

Testing Instructions:

Automatic Test:

Please run: sudo NC_CORETEST=true node ./node_modules/mocha/bin/mocha ./src/test/unit_tests/test_nc_with_a_couple_of_forks.js
Note: if you want to check the setup changing and the restart please change this line: coretest.setup(setup_options); to coretest.setup({});

Manual Test:

  1. Create an account with the CLI: sudo node src/cmd/manage_nsfs account add --name <account-name> --new_buckets_path /tmp/nsfs_root1 --access_key <access-key> --secret_key <secret-key> --uid <uid> --gid <gid>
    Note: before creating the account, you need to give permissions to the new_buckets_path: chmod 777 /tmp/nsfs_root1.
  2. Start the NSFS server with: sudo node src/cmd/nsfs --debug 5
    Note: I changed config.NSFS_CHECK_BUCKET_BOUNDARIES = false; //SDSD because I'm using /tmp/ and not /private/tmp/.
  3. Create the alias for the S3 service: alias nc-user-1-s3='AWS_ACCESS_KEY_ID=<access-key> AWS_SECRET_ACCESS_KEY=<secret-key> aws --no-verify-ssl --endpoint-url https://localhost:6443'.
  4. Check the connection to the endpoint and try to list the buckets (should be empty): nc-user-1-s3 s3 ls; echo $?
  5. Add a bucket to the account using the AWS CLI: nc-user-1-s3 s3 mb s3://check-bucket (check-bucket is the bucket name in this example)
  6. Create the content for the body: echo hello > mykey.txt
  7. Put the object: nc-user-1-s3 s3api put-object --bucket check-bucket --key mykey --body mykey.txt.

Test changing the bucket config file and its effect on the cache behavior:
8. Add the line console.log('SDSD config file was changed'); after if (!same_stat) { // config file of bucket was changed and restart the server (Ctrl+C, then rerun with sudo node src/cmd/nsfs --debug 5).
9. List the objects in the bucket (twice - we don't expect to see the added printings): nc-user-1-s3 s3 ls s3://check-bucket.
10. Edit the bucket config file, for example: sudo node src/cmd/manage_nsfs bucket update --name check-bucket --fs_backend GPFS (it was '' and we change it to GPFS).
11. List the objects in the bucket: nc-user-1-s3 s3 ls s3://check-bucket (we expect to see the added printings in the logs).
12. Change config.NC_ENABLE_BUCKET_NS_CACHE_STAT_VALIDATION = false; and restart the server (Ctrl+C, then rerun with sudo node src/cmd/nsfs --debug 5).
13. List the objects in the bucket: nc-user-1-s3 s3 ls s3://check-bucket
14. Edit the bucket config file, for example: sudo node src/cmd/manage_nsfs bucket update --name check-bucket --fs_backend '' (it was GPFS and we change it to '').
15. List the objects in the bucket: nc-user-1-s3 s3 ls s3://check-bucket (we don't expect to see the added printings in the logs).

Test bucket create and delete (due to initial failure in Ceph test test_bucket_create_delete):
16. Create a bucket: nc-user-1-s3 s3 mb s3://bucket-del-01
17. Delete the created bucket immediately: nc-user-1-s3 s3 rb s3://bucket-del-01 (you would see the printing "_validate_bucket_namespace got an error [Error: No such file or directory] { code: 'ENOENT', context: 'Stat _path=/etc/noobaa.conf.d/buckets/bucket-del-01.json ' } ignoring...")

  • Doc added/updated
  • Tests added

@shirady shirady force-pushed the nsfs-nc-bucket-namespace-with-stat branch from 9670d02 to bcda0bc Compare November 14, 2024 11:44
@shirady shirady force-pushed the nsfs-nc-bucket-namespace-with-stat branch from bcda0bc to 578cbce Compare November 17, 2024 08:13
@shirady
Contributor Author

shirady commented Nov 17, 2024

Hi,
@romayalon @jackyalbo
After adding these changes we see printings of "Could not log bucket operation" in the "nsfs-ceph-s3-tests" action (see logs, for example).

~ grep 'Could not log bucket operation' ./Downloads/0_nsfs-ceph-s3-tests.txt | wc -l
     260

As I understand it, this happens because after every test we delete the bucket, so the item in the cache is now invalid, and we fail in get_bucket_by_name during the read_bucket_sdk_info that happens in send_bucket_op_logs.

@pull-request-size pull-request-size bot added size/L and removed size/M labels Nov 17, 2024
@shirady shirady force-pushed the nsfs-nc-bucket-namespace-with-stat branch from e6d8cde to 701a905 Compare November 19, 2024 07:28
@shirady shirady requested a review from romayalon November 19, 2024 08:10
@jackyalbo
Contributor

Hi, @romayalon @jackyalbo After adding these changes we see printings of "Could not log bucket operation" in the "nsfs-ceph-s3-tests" action (see logs, for example).

~ grep 'Could not log bucket operation' ./Downloads/0_nsfs-ceph-s3-tests.txt | wc -l
     260

As I understand it, this happens because after every test we delete the bucket, so the item in the cache is now invalid, and we fail in get_bucket_by_name during the read_bucket_sdk_info that happens in send_bucket_op_logs.

We discussed it, and it seems that after the operation finishes, in the case of delete_bucket, the bucket can't be read anymore, so the log message for this op can't be written. I offered to take the reading from the cache out of the send_bucket_op_logs function and to read the bucket info only once for every op. But as it needs to be validated and tested, @shirady will open a gap for this issue.
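The suggestion above (resolving the bucket info once per operation instead of re-reading it from the cache inside the logging call) could be sketched like this. The names here are hypothetical, not the noobaa API:

```javascript
// Hedged sketch of "read the bucket info only once for every op"
// (hypothetical names: handle_bucket_op, send_bucket_op_logs).
// The bucket info is captured up front, while the config file still
// exists, so delete_bucket can still be logged after the file is gone.
async function handle_bucket_op(op, bucket_name, get_bucket_by_name) {
    // Read once, at the start of the operation.
    const bucket_info = await get_bucket_by_name(bucket_name);
    const result = await op(bucket_info);
    // Logging reuses the captured info instead of re-reading the
    // (possibly deleted) config through the cache.
    send_bucket_op_logs(bucket_info, result);
    return result;
}

function send_bucket_op_logs(bucket_info, result) {
    console.log(`bucket op on ${bucket_info.name}: ${result}`);
}
```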

@shirady
Contributor Author

shirady commented Nov 19, 2024

Regarding comment1 and comment2:
I changed the printing of "Could not log bucket operation" so it will be easier to detect where it came from, and now we can see (in the logs):

➜ ~ grep 'Could not log bucket operation' ./Downloads/logs_31062666344/0_nsfs-ceph-s3-tests.txt  | wc -l
     260
➜  ~ grep 'Could not log bucket operation (after operation delete_bucket)' ./Downloads/logs_31062666344/0_nsfs-ceph-s3-tests.txt  | wc -l
     255
➜  ~ grep 'Could not log bucket operation (after handle_error)' ./Downloads/logs_31062666344/0_nsfs-ceph-s3-tests.txt  | wc -l
       5

@shirady shirady force-pushed the nsfs-nc-bucket-namespace-with-stat branch from ecb3129 to 7195892 Compare November 19, 2024 15:34
@shirady shirady requested a review from romayalon November 19, 2024 16:20
@romayalon
Contributor

@shirady can you rename test_bucketspace.js to test_nsfs_integration.js and src/logfile to nsfs_integration_test_log?

@shirady
Contributor Author

shirady commented Nov 20, 2024


After adding a small change in the file src/endpoint/s3/s3_bucket_logging.js and increasing the number of forks, you can see in the "nsfs-ceph-s3-tests" logs (link):

  1. The number of forks is 2:
~ grep -i forks /Users/shiradymnik/Downloads/logs_31118484677/0_nsfs-ceph-s3-tests.txt
2024-11-20T13:22:07.1211102Z + main@./src/test/system_tests/ceph_s3_tests/run_ceph_nsfs_test_on_test_container.sh:33 config='{"ALLOW_HTTP":true, "ENDPOINT_FORKS":2}'
2024-11-20T13:22:07.1212987Z + main@./src/test/system_tests/ceph_s3_tests/run_ceph_nsfs_test_on_test_container.sh:34 echo '{"ALLOW_HTTP":true, "ENDPOINT_FORKS":2}'
2024-11-20T13:22:07.2997467Z nsfs: config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:07.2998430Z nsfs: merged config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:08.8729005Z nsfs: setting up ... { fs_root: '', http_port: 80, https_port: 443, https_port_sts: -1, https_port_iam: -1, metrics_port: 7004, backend: '', forks: 2, access_key: undefined, secret_key: undefined, uid: 0, gid: 0, nsfs_config_root: '/etc/noobaa.conf.d' }
2024-11-20T13:22:09.1440036Z nsfs: config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:09.1440725Z nsfs: merged config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:09.1469722Z nsfs: config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:09.1470321Z nsfs: merged config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:10.9775549Z nsfs: setting up ... { fs_root: '', http_port: 80, https_port: 443, https_port_sts: -1, https_port_iam: -1, metrics_port: 7004, backend: '', forks: 2, access_key: undefined, secret_key: undefined, uid: 0, gid: 0, nsfs_config_root: '/etc/noobaa.conf.d' }
2024-11-20T13:22:10.9923585Z nsfs: setting up ... { fs_root: '', http_port: 80, https_port: 443, https_port_sts: -1, https_port_iam: -1, metrics_port: 7004, backend: '', forks: 2, access_key: undefined, secret_key: undefined, uid: 0, gid: 0, nsfs_config_root: '/etc/noobaa.conf.d' }
2024-11-20T13:22:17.2997377Z nsfs: config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:17.2998127Z nsfs: merged config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:21.6182381Z nsfs: config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
2024-11-20T13:22:21.6183023Z nsfs: merged config.json= { ALLOW_HTTP: true, ENDPOINT_FORKS: 2 }
  2. No more printings of "Could not log bucket operation"
~ grep -i 'Could not log bucket operation' /Users/shiradymnik/Downloads/logs_31118484677/0_nsfs-ceph-s3-tests.txt
➜  ~

@shirady shirady requested a review from alphaprinz November 21, 2024 06:30
Successfully merging this pull request may close these issues.

NC | NSFS | bucket_namespace_cache Is Not Updated Between Forks