Upgrade to ElasticSearch 6 #5609
Slightly blocked by travis-ci/apt-source-safelist#379 - not a hard block though, we can work around it.
Added an upstream patch: travis-ci/apt-source-safelist#385
@bqbn are there any ops requirements with regard to timing, i.e. when this has to happen on our end?
We haven't started any work on the ES6 upgrade yet. We can probably target Q1/Q2 2019 from an ops perspective. We'd certainly want to do this after we are done with the UTF8mb4 upgrade (https://bugzilla.mozilla.org/show_bug.cgi?id=1479111), which is slated to happen early Q1.
Going to give this a try on Travis to see where we are. I think we want to do this soon, but after Python 3 and possibly after Django 2.2 too.
When are those going to happen, at the earliest? Also, it looks like ops may not be able to start working on this in Q1 after all.
Python 3: we were thinking this week or the next. Sadly we missed the tag, so it's probably going to be next week. Maybe we can target early Q2 for Elasticsearch 6? Not sure how much work there is to do yet, on either the dev or ops side.
Good news: there aren't a lot of changes to make, so my branch is almost done - just a couple of tests to adjust, but the code is ready.
Bad news: the removal of mapping types hurts us. My branch refactors the stats code to deal with this, but it requires splitting the stats index into 2 separate indexes, each containing a single type (one with update counts, one with download counts). This means we need to come up with a transition plan. A brutal approach would be a full reindex of stats on a new cluster with new code, while serving the pages from the old cluster with old code (how long does that take?). An alternative would be for ops to figure out a way to move the mappings to the new indexes without requiring a full reindex, but I'm not sure what's possible here.
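(For illustration, the split described above could look like the sketch below: one ES 6 index per former mapping type, since ES 6 no longer allows multiple types in one index. Index/type names and field definitions here are hypothetical, not necessarily what the branch actually uses.)

```python
# Sketch: splitting a multi-type ES 5 stats index into two
# single-type ES 6 indexes. All names/fields are illustrative.

def single_type_index_body(type_name, properties):
    """Build an ES 6 index-creation body containing exactly one
    mapping type, as required once multiple types per index are gone."""
    return {
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},
        "mappings": {type_name: {"properties": properties}},
    }

# One index per former type instead of two types in one index.
stats_fields = {
    "addon": {"type": "long"},
    "date": {"type": "date"},
    "count": {"type": "long"},
}
update_counts = single_type_index_body("update_counts", stats_fields)
download_counts = single_type_index_body("download_counts", stats_fields)

# Each body would then be sent with its own index-creation request,
# e.g. PUT /stats_update_counts and PUT /stats_download_counts.
```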
There will be a new cluster for sure. Because the current cluster is running v5, we have not planned to upgrade it in place from v5 to v6, but to build a new cluster that runs v6. Last time, when we upgraded from v1 to v5, we created a new cluster that ran v5, and ran
Not sure how to do this yet. But it sounds like it requires data migration from the current cluster (v5) to the new cluster (v6). Nevertheless, we'd prefer the way we did it last time, i.e. by running
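(For reference, one standard way to migrate documents between two clusters without shared storage is Elasticsearch's reindex-from-remote API, where the new cluster pulls data over HTTP from the old one. This is only a sketch of the request body - the host and index names are placeholders, and the thread doesn't confirm this is the mechanism ops used.)

```python
import json

def reindex_from_remote_body(remote_host, source_index, dest_index):
    """Build the body for POST /_reindex, run against the NEW (v6)
    cluster, pulling documents over HTTP from the old (v5) cluster.
    The remote host must be whitelisted via the
    reindex.remote.whitelist setting on the v6 cluster."""
    return {
        "source": {
            "remote": {"host": remote_host},
            "index": source_index,
        },
        "dest": {"index": dest_index},
    }

# Placeholder host and index names:
body = reindex_from_remote_body(
    "http://old-es5.example.com:9200", "addons", "addons")
print(json.dumps(body, indent=2))
```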
OK, that works. I didn't know how much time it would take. Worth noting that ES6 is (mostly) compatible with ES5 clusters as long as you don't create a new index, but I'm not sure it buys us anything here.
@muffinresearch @diox @bqbn I just found this issue and wondered what its priority is, especially since support for the latest ES 5 version ended in March last year. Looking at https://www.elastic.co/support/eol, upgrading to ES6 will only be a stepping stone to eventually landing on ES7, since even ES6 won't be supported for much longer either.
I'd like to do it as soon as possible, we just need to find the time. We'd need to refresh my branch, but it was almost good to go. If we wait for https://github.com/orgs/mozilla/projects/116 it might greatly simplify the migration: we might end up removing all individual stats storage on our side, and that would mean not having to care about the mapping vs. types issue I mentioned earlier (that would leave only add-ons in our ES cluster).
How does late March sound to everyone? I also need to do some research and testing to see how different the new version is configuration-wise. Also, version-wise, how about we upgrade to v6.8.x this time? I mainly don't want to skip a version at this point.
We've built a v6.8.6 ES cluster in the -dev environment. Any suggestions on how we proceed to switch to that cluster? Hopefully there is a way for us to run the new version in -dev and -stage for a week or two before upgrading -prod. :)
I was hoping the new stats storage would be ready in time, but that looks unlikely at this point, since we haven't started the work on it. Because we need a new data structure, it's going to be difficult to have it on dev & stage for more than a week - keeping the code compatible with both data structures is not trivial, and I don't think we have the developer time to do it. My plan was:
(*) requires doing what you were talking about in a comment above
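(As an aside on cutover mechanics: a common way to make the final switch atomic within a cluster is to have the application query through an index alias and flip it in a single `_aliases` call. This is a sketch under that assumption - the alias and index names are hypothetical, and the thread doesn't say this is the approach the deployment plan uses.)

```python
def swap_alias_actions(alias, old_index, new_index):
    """Build the body for POST /_aliases. Both actions are applied
    atomically, so queries never see a moment without the alias."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

# Hypothetical names: the app always queries "addons", which is
# flipped from the old index to the freshly reindexed one.
actions = swap_alias_actions("addons", "addons_old", "addons_new")
```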
OK, the plan is about the same as what I have in mind, except that it has a tighter schedule. I think it's still reasonable though, and we should try it. One question I have: is it correct that only the cron jobs trigger writes to the ES cluster? For example, if I disable the cron jobs on the admin instance, is it right that no other tasks will write to the ES cluster? Meanwhile, I wrote a draft deployment plan based on our last upgrade: https://docs.google.com/document/d/1pP7KK6RWXBKTjLkKh5NnfrUaHbS1mAGonOij-8zGJ8U/edit?usp=sharing @diox @EnTeQuAk can you take a look and comment as needed? Thanks.
No, writes to the ES cluster are done by celery tasks, which can be triggered by a bunch of things, like someone saving an add-on for instance. Shutting down the cron helps though, as it disables add-on auto-approvals (a source of add-on changes) and disables stats indexing (which only happens once a day anyway).
Because I haven't had much time to work on this yet, and the stats project has started getting some traction, we're back to waiting on it for a while to see if that helps us. Being able to get rid of the stats-related indexes would greatly simplify the migration (I suspect it would make updating my branch and fixing tests almost trivial) and give us the ability to roll back if things go wrong.
@diox this requires some regression testing on search, I presume?
Yes. I already did some when it landed yesterday and everything seemed to work. One area that needs special attention is statistics.
Search seems to work as before, and add-on stats (i.e. users, ratings - if available) are still present in search results and on the add-on detail page. @diox the statistics dashboard shows a continuous loading indicator for each add-on I've verified with on -dev - see example. Do you think the upgrade might have caused this?
@AlexandraMoga did those add-ons have working statistics before? Stats on dev come from actual requests from users who have their Firefox configured to hit dev's versioncheck, so very few add-ons will have working stats (and those that do will have very low numbers).
@diox as far as I remember, even if an add-on had no stats, the graphs would still load, although empty. Now they are all stuck in a loading state.
This was caused by #7615, which has been fixed by #7630. You can now see it working on add-ons that do have some data, like https://addons-dev.allizom.org/en-US/firefox/addon/awesome-screenshot-plus-/statistics/usage/?last=7
Yep, stats are showing up now. Here's another add-on on -dev that has some actual stats to show: https://addons-dev.allizom.org/en-US/firefox/addon/view-page-archive-cache/statistics/?last=90
Just an FYI, the following deprecation warnings are logged by the new ES cluster:
Something to consider for the next upgrade. :) |