git cinnabar fsck fails on searchfox indexer's gecko-dev checkout with Sha1 mismatch for file browser/config/version.txt #274

Closed
asutherland opened this issue Jun 20, 2021 · 4 comments

Comments

@asutherland

Basically reposting the details of https://bugzilla.mozilla.org/show_bug.cgi?id=1716167#c4 cc @staktrace

Here's the cinnabar version dump:

0.5.8a
module-hash: ce3b0259e8dc6f943528e175597a3152f5cf1a24
helper-hash: b8923f9ad6cd68864e7cf0eec5e88a00aafcda84

Here's the log excerpt, and the fast import is uploaded to bugzilla at https://bugzilla.mozilla.org/attachment.cgi?id=9227980

Checking 235 changeset heads
Loading 651889 manifests
Checking 365 manifest heads
Checking 22346 files
fatal: Missing data
fast-import: dumping crash report to git/.git/fast_import_crash_14440
Checking 22447 files
Traceback (most recent call last):
  File "/home/ubuntu/git-cinnabar/cinnabar/util.py", line 999, in run
    retcode = func(args)
  File "/home/ubuntu/git-cinnabar/cinnabar/cmd/fsck.py", line 376, in fsck
    return fsck_quick(args.force)
  File "/home/ubuntu/git-cinnabar/cinnabar/cmd/fsck.py", line 302, in fsck_quick
    if not GitHgHelper.check_file(hg_file, *hg_fileparents):
  File "/home/ubuntu/git-cinnabar/cinnabar/helper.py", line 311, in check_file
    with self.query(b'check-file', hg_sha1, *parents) as stdout:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/ubuntu/git-cinnabar/cinnabar/helper.py", line 198, in query
    wrapper.write(b'%s %s\n' % (name, b' '.join(args)))
IOError: [Errno 32] Broken pipe
Sha1 mismatch for file browser/config/version.txt
  revision eb289cb40f790c0a612b2db711ce3c336f97a840
  with parent 00705a34f53c9c6808691de3686bdd1c71c5b6f3

@asutherland changed the title from "git cinnabar fsck fails on searchfox indexer's gecko-dev checkout" to "git cinnabar fsck fails on searchfox indexer's gecko-dev checkout with Sha1 mismatch for file browser/config/version.txt" on Jun 20, 2021

@glandium
Owner

Can you create a git bundle of refs/cinnabar/metadata and put it somewhere I can download it?
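
One way to produce such a bundle is sketched below. It is a minimal example, not the maintainer's procedure: it wraps the standard `git bundle create` command, and the function names and the output file name are hypothetical.

```python
import subprocess


def bundle_command(output_path, ref="refs/cinnabar/metadata"):
    """Build the argv for `git bundle create <output_path> <ref>`."""
    return ["git", "bundle", "create", output_path, ref]


def create_metadata_bundle(repo_dir, output_path="cinnabar-metadata.bundle"):
    """Run the bundle command inside repo_dir (hypothetical helper).

    Raises subprocess.CalledProcessError if git exits with an error.
    """
    subprocess.run(bundle_command(output_path), cwd=repo_dir, check=True)
```

Calling `create_metadata_bundle("git")` from the directory that contains the checkout would write the bundle inside that checkout, assuming the ref exists there.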

@asutherland
Author

I've also invited you to the people.mozilla.org searchfox LDAP group so that you can work directly on one of the VMs if you want. The instructions are at https://github.com/mozsearch/mozsearch/blob/master/docs/aws.md#setting-up-aws-locally and mozilla-central should always be checked out on web-servers with tags like 'cfile: config1.json', 'channel: release1'. The most recent web-server shouldn't be used, to avoid impacting the site, but the older ones are just backups. The git repo can be found at ~/index/mozilla-central/git on those machines. The caveat is that the web-servers are not very powerful and are backed by S3, which means I/O isn't amazing. The indexers do use SSDs, though, and can be manually triggered (ideally use the "dev" channel). If you see one alive, you can just poke around on it too: the data lives in /mnt/index-scratch/mozilla-central/ until indexing completes, at which point it moves onto /index (S3), with symlinks created as part of the migration.

@glandium
Owner

Ok, so this is a side effect of how grafting worked in the past, combined with #249 and the fact that gecko-dev is missing the esr10 branch. Adding esr10 to gecko-dev would fix the problem. Alternatively, you can add a remote for esr10 to your repo, like you have for the other esr branches, and run git remote update esr10; fsck will work after that. Now that gecko-dev uses git-cinnabar, even if gecko-dev adds esr10 later, you won't end up with different git sha1s.
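
The local workaround described above could be scripted roughly as follows. This is a sketch under assumptions: the thread does not give the esr10 remote URL, so the `hg::` URL below is a guess modeled on how git-cinnabar remotes are usually written, and the function names are hypothetical.

```python
import subprocess

# Assumption: the thread does not state the esr10 repository URL.
ESR10_URL = "hg::https://hg.mozilla.org/releases/mozilla-esr10"


def esr10_fix_commands(remote="esr10", url=ESR10_URL):
    """Return the git commands for the workaround, in order."""
    return [
        ["git", "remote", "add", remote, url],   # add the missing esr10 remote
        ["git", "remote", "update", remote],     # fetch it through git-cinnabar
        ["git", "cinnabar", "fsck"],             # re-run the check that failed
    ]


def apply_fix(repo_dir):
    """Run each command in repo_dir, stopping at the first failure."""
    for cmd in esr10_fix_commands():
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Keeping the command list separate from the runner makes it easy to print and review the commands before executing them against an 8.3G repository.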

https://s3.us-west-2.amazonaws.com/searchfox.repositories/gecko.tar (8.3G) is the full "git" directory that searchfox downloads in each config1 indexer run, updates, and then re-uploads.

Note that you could save a lot in the size of that tar if you didn't include the checkout, which you obviously can do manually after downloading the contents of the .git directory.
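
A minimal sketch of that suggestion, assuming the repository is an ordinary non-bare checkout: archive only the `.git` directory and leave the working tree out. The function name is hypothetical and this is not the actual searchfox upload script.

```python
import os
import tarfile


def archive_git_dir(repo_dir, tar_path):
    """Write a tar containing only <repo>/.git, skipping the checkout."""
    repo_name = os.path.basename(os.path.normpath(repo_dir))
    git_dir = os.path.join(repo_dir, ".git")
    with tarfile.open(tar_path, "w") as tar:
        # Store entries as "<repo>/.git/..." so extraction recreates the layout.
        tar.add(git_dir, arcname=os.path.join(repo_name, ".git"))
```

After extracting such an archive, the working tree could be rebuilt from the `.git` contents, for example with `git reset --hard` inside the extracted directory.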

@asutherland
Author

Thanks very much for the analysis and solution!

> Note that you could save a lot in the size of that tar if you didn't include the checkout, which you obviously can do manually after downloading the contents of the .git directory.

Yeah, that's an inefficiency we could optimize in https://github.com/mozsearch/mozsearch-mozilla/blob/master/mozilla-central/upload
