Test sharding / parallelization #439

Open
vojtajina opened this issue Mar 29, 2013 · 43 comments

@vojtajina
Contributor

Allow splitting a single test suite into multiple chunks, so that it can be executed in multiple browsers and therefore on multiple machines, or at least using multiple cores.

@shteou

shteou commented Jul 10, 2013

Hey, I saw Issue #412 and it's exactly the sort of functionality I am looking for.

Has there been any movement on this?
If there aren't any plans, do you have any high level thoughts on how you would like this to be implemented?

Cheers.

@vojtajina
Contributor Author

I definitely want to do this; it's hard to say when...

We need a way to group files.
It should be dynamic (not something developers have to do by hand), so that it's possible to scale (easily change the number of file groups - the number of browsers we execute in parallel).
When using a dependency management system (e.g. Closure's goog.require/goog.provide; see karma-closure), this will be much easier (because we can figure out the dependency graph and therefore only load the files that are really needed).
Probably "label" test files and then split them (assuming each test file has the same number of tests; later we can add some sort of cost/labeling).

The web server has to understand these groups.
When serving context.html, the web server has to know which files belong to each group and which browser (group) is requesting.
Probably an additional argument on the "execute" message to the client (browser); the client, when refreshing the iframe, then uses this "groupId" as a query param, e.g. context.html?group=2.
This grouping should probably be done in fileList (or whenever the file list changes; similar to what karma-closure does now; we should make the "file_list_modified" event async).
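
To make the query-param idea concrete, here is a minimal sketch of what a context.html middleware could do; "groupedFiles" (a map of groupId to the files included for that group) and "renderContext" are hypothetical names, not existing Karma APIs:

var url = require('url');

// Hypothetical middleware: pick the included files for the requesting
// browser's group, based on the ?group= query param.
function createContextMiddleware(groupedFiles, renderContext) {
  return function(request, response, next) {
    var parsed = url.parse(request.url, true);
    if (parsed.pathname !== '/context.html') {
      return next();
    }
    // Fall back to group 0 when no group is requested.
    var groupId = parseInt(parsed.query.group, 10) || 0;
    var included = groupedFiles[groupId] || [];
    response.end(renderContext(included));
  };
}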

Currently the resolved files object looks something like:

{
  served: [
    // File objects (metadata, where the file is stored, timestamps, etc.)
  ],
  included: [
    // File objects
  ]
}

After this change, it would be:

{
  served: [
    // File objects (I'm thinking we might make this a map rather than an array)
  ],
  included: {
    0: [
      // files included in the group=0 browser
    ],
    1: [
      // files included in the group=1 browser
    ]
  }
}

@vojtajina
Contributor Author

@shteou Would you be interested in working on this? I would definitely help you...

@vojtajina
Contributor Author

Also, does it make sense?

I mentioned karma-closure a couple of times; that is this plugin: https://github.com/karma-runner/karma-closure
It is very "hacky" (the way it hooks into Karma; if you had multiple plugins like this, it would end up really badly ;-)), but it does interesting things - it basically analyzes the dependencies and can therefore generate the list of included files based on those dependencies. So before this "resolving" we would group the test files and karma-closure would resolve each group separately...

@EtaiG

EtaiG commented Dec 27, 2014

I'd just like to note that we (Wix) tried this out. danyshaanan created a test case for this.
He enabled karma to split up the loaded tests and run them in several child processes.

It didn't work out too well, since many of our tests require loading a large part of our code, and therefore the setup time and loading of all the packages for each child process was too costly. Depending on the number of child processes, it usually ended up running slower, though it did come close to running in the same amount of time.

We tried it out when we were at around 2,000 tests. We now have over 3,500 tests in this project, so it might be worth revisiting this.

If anyone else is working on this or has another angle for this, we are also more than happy to help.

@jbnicolai

@EtaiG I have not started working on this, but as my project now exceeds 3,000 tests, it's becoming something I want to invest some time into as well.

@danyshaanan

A note about dynamic creation of groups (@vojtajina) - we should be aware of how this affects tests that happen to have side effects. Imagine tests B, C, and D, and a naive alphabetical division of tests into two groups - {B,C} and {D}. Let's say C has a devastating side effect and I'm adding test A, hence changing the grouping to {A,B} and {C,D}. Now D will fail, just because the grouping changed.

Of course, tests shouldn't have side effects, but this case is bound to happen, and might be very confusing to users.

@EtaiG

EtaiG commented Jan 11, 2015

I think we can ignore this case and let people who encounter it deal with their own problems.
We can also expose an API that allows the consumer to decide the grouping.

@scriby

scriby commented Mar 30, 2015

+1

We have a large number of tests at work, and sharding would be very beneficial. As you said, there shouldn't be side effects between tests, and for anyone who doesn't want to remove the side effects, I'd say they just don't get to run their tests in parallel :)

As long as the sharding is opt-in I think the confusion should be manageable.

@LFDM

LFDM commented Apr 20, 2015

Hey @EtaiG and @danyshaanan, very interested in this experiment you mentioned. Is this code accessible somewhere? I'd very much like to experiment with this a bit - maybe your work could give me a headstart!

@danyshaanan

@LFDM We have nothing to share at the moment, but I'm just about to rewrite a smarter version of it in the coming couple of weeks. I'll try to do so in a way that I'll be able to share. Feel free to ping me about this in a week or so if I haven't posted anything by then.

@AlanFoster

👍 Sounds like a great idea to me; would love to see any progress updates on this @danyshaanan!

@park9140

👍 this would be great.

@ghost

ghost commented Sep 8, 2015

@danyshaanan! any news?

@danyshaanan

@aaichlmayr:
Yeah, bad ones - it didn't seem to work out that well. This feature's spec is non-trivial, so the implementation was not as clean as I would hope, and the benefits were not convincing enough to go ahead with it, so we scrapped the plan.

@ghost

ghost commented Sep 9, 2015

Thanks for the info

@booleanbetrayal

As long as the sharding is opt-in I think the confusion should be manageable.

Totally agree. Think it's fine to launch as an experimental feature with this requirement. Would love to see this land and would be happy to help bug-hunt, etc.

@FezVrasta

Hi, has the feature been shipped already?

@navels

navels commented Apr 11, 2016

👍 anyone making headway on this?

@presidenten

So is this a feature yet?

@presidenten

I don't get it? Did I do something wrong? What did I miss? Why the thumbs down?

@FezVrasta

If you reply to an issue, all the subscribed people get an email and a notification. If you just want to add a +1 to the issue, do so by adding a thumbs-up reaction to the first post (or to the one with the most upvotes); this way you don't flood the whole list of subscribers.

@Florian-R

@dignifiedquire Could you lock this one like #1320 with a help:wanted label? Thanks!

@pauldraper

pauldraper commented Dec 13, 2016

When using a dependency management system this will be much easier and simplified

True, but it's a hack to get these systems to work in Karma in the first place, right? I'm tempted to put that consideration aside for now.

I agree with the suggestions made by @vojtajina #439 (comment) (even if they are three and a half years old :)

I'm thinking:

module.exports = function(config) {
  config.set({
    files: [
      'lib/lib1.js',
      'lib/lib2.js',
      {pattern: 'other/**/*.js', included: false},
      {pattern: 'app/**/*.js', grouped: true},
      {pattern: 'more-app/**/*.js', grouped: true},
    ],
  });
};

And then the resolved object would be

{
  served: [
    'other/file1.js',
    'other/files/file2.js'
  ],
  included: {
    common: [
      'lib/lib1.js',
      'lib/lib2.js'
    ],
    groups: {
      0: [
        'app/file1.js',
        'app/file2.js'
      ],
      1: [
        'app/file3.js',
        'more-app/file.js'
      ]
    }
  }
}

We could reuse concurrency, though a default of Infinity is bad -- most commonly we want to run as many tests as we have cores.

We'd probably want a groups config. I could divide my code into 10 groups, and run with concurrency 3 until they are all done. As @EtaiG pointed out, there is a balance between fine-grained scheduling for better utilization, and overhead of loading common files.
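
For illustration, a config shape along those lines (a sketch only; neither "groups" nor "grouped" exists in Karma today):

module.exports = function(config) {
  config.set({
    // Hypothetical option: split the grouped files into 10 shards...
    groups: 10,
    // ...and run at most 3 browsers at a time until all shards are done.
    concurrency: 3,
  });
};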

@habermeier

I'd hate to have to group tests by hand. What if the system used the regular configuration as a starting point, and built up and refined an optimal parallel test plan over time? Along the way it might be able to discover any dependency chains (which shouldn't be there, but might be). It could flag those as "todo" items for developers, but could work around them should it discover them. Whatever it does, it'd be good for it to handle changes in the test code gracefully, so it would not have to recompute the whole plan when a single test is added (or removed).

I'm sure the computational complexity would be enormous for getting at the very best configuration, but maybe some rough heuristics would get us reasonably close.
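
One concrete heuristic in that direction: record per-file timings from previous runs and greedily assign the costliest files first to the least-loaded shard (classic longest-processing-time scheduling). A sketch under those assumptions; nothing here exists in Karma:

// fileCosts: map of file path -> measured duration from a previous run.
function planShards(fileCosts, numShards) {
  var shards = [];
  for (var i = 0; i < numShards; i++) {
    shards.push({files: [], cost: 0});
  }
  Object.keys(fileCosts)
    .sort(function(a, b) { return fileCosts[b] - fileCosts[a]; })
    .forEach(function(file) {
      // Put each file on the shard with the smallest total cost so far.
      var target = shards.reduce(function(min, shard) {
        return shard.cost < min.cost ? shard : min;
      });
      target.files.push(file);
      target.cost += fileCosts[file];
    });
  return shards;
}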

@FezVrasta

Really, just copy what Jest does. It's fine.

@pauldraper

pauldraper commented Jan 2, 2017

I'd hate to have to group tests by hand.

Not sure if I understand this right, but I wasn't suggesting that.

{pattern: 'app/**/*.js', grouped: true},
{pattern: 'more-app/**/*.js', grouped: true}, 

grouped just means "these are the files that are eligible for sharding", as opposed to the common library files that are included in all the test groups. Karma then generates a number of arbitrary groups automatically from those locations. In the example, the generated groups were app/file1.js, app/file2.js and app/file3.js, more-app/file.js.

I suggest a groups config option for the number of groups. It can be tuned to weigh scheduling efficiency against startup overhead.
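
A sketch of what that automatic generation could look like; the helper is hypothetical, and round-robin is just the simplest possible splitting strategy:

// Distribute the files matched by `grouped: true` patterns into N groups.
function generateGroups(groupableFiles, numGroups) {
  var groups = [];
  for (var i = 0; i < numGroups; i++) {
    groups.push([]);
  }
  groupableFiles.forEach(function(file, index) {
    groups[index % numGroups].push(file);
  });
  return groups;
}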

@brandonros

Is this dead in the water?

@vikerman

vikerman commented Jun 6, 2017

Hello - Here is my proposal for running tests in parallel.

This is a very simple sharding strategy, but it should provide a speedup just by using multiple processors on the machine. This is meant mostly for local development and not so much for CI runs (where remote CI setup costs far outweigh the speed gains of parallelization).

karma.js changes:

  • The root Karma URL can take in 3 extra URL parameters - shardId, shardIndex, totalShards
  • These are passed on to the context iframe

Jasmine (or Mocha, etc.) adapter changes:

  • In the context page, the Jasmine (or Mocha) adapter can process the shardIndex and totalShards parameters
  • The adapter passes the shardId, shardIndex and totalShards in the "info" object it uses while connecting back to the Karma server (see the section below on how it's used)
  • The adapter walks the suite/spec tree and collects all leaf specs in an array
  • The adapter uses a very simple sharding strategy - it runs the subset of tests
    [(shardIndex/totalShards * totalTests) -> ((shardIndex + 1)/totalShards * totalTests)]
    (a sketch of this slice follows below)
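
As referenced above, a minimal sketch of that slice computation ("allSpecs" being the flattened leaf-spec array the adapter collected; the function name is illustrative):

// Each shard takes a contiguous slice of the flattened spec list. Every
// shard must walk the suite tree in the same order for this to be stable.
function specsForShard(allSpecs, shardIndex, totalShards) {
  var start = Math.floor(shardIndex / totalShards * allSpecs.length);
  var end = Math.floor((shardIndex + 1) / totalShards * allSpecs.length);
  return allSpecs.slice(start, end);
}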

Karma server changes:

  • The server now has logic to wait for all shards to connect before starting a test run
  • This is so that test execution doesn't start immediately when the browser instance corresponding to the first shard connects, and then run once more when the rest of the shards connect
  • The server uses "shardId" and "totalShards" to determine whether enough sharded browsers have connected with the same "shardId" (a sketch of this follows below)
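
A sketch of that waiting logic (hypothetical names and bookkeeping; Karma's real browser-registration flow differs):

var connectedShards = {}; // shardId -> number of connected browsers

function onBrowserRegister(info, startRun) {
  if (!info.shardId) {
    return startRun(); // unsharded browser: keep the old behavior
  }
  connectedShards[info.shardId] = (connectedShards[info.shardId] || 0) + 1;
  if (connectedShards[info.shardId] === info.totalShards) {
    startRun(); // all shards with this shardId are now connected
  }
}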

Chrome (or other launcher) changes:

  • "ChromeSharded" (and "ChromeHeadlessSharded", etc.) is a new type of launcher that launches Chrome with N different tabs - each with the same shardId and totalShards and the appropriate shardIndex
  • Default number of shards = number of processors on the machine
  • Overridden by a launcher flag / environment variable

Reporter changes: (I need to flesh this out more. Any ideas welcome here)

  • No changes in the initial version - for local runs the reporter output doesn't matter much? Print errors from any of the shards on the console
  • Ideally the reporter can collate all results from all the shards into a single report

@brandonros

I'm going to try to get this done this weekend. I don't know anything about this project or the codebase, but I think a ton of people would be saved a ton of time if I can figure something out. Maybe different tabs running on different ports? Could get hairy...

@littleninja

@brandonros rock on! I'm in a similar boat--haven't contributed to the project before but would also be interested in helping with a sharding feature. Would you be willing to create a fork to get the ball rolling? I don't know--do people usually create a "WIP - xyz feature" pull request to help rally effort?

@brandonros

brandonros commented Aug 3, 2017

So, here's what happens at a high level:

  1. your package calls Karma with a configuration describing a framework like Jasmine

  2. A browser is opened and pointed to the local Karma server, which serves up some HTML/JS listing your files to test, and an iframe, then hands it off to the framework

I actually think it would make more sense to inject into Jasmine, because I was able to boil down exactly where the tests are executed. However, I ran into an issue where it didn't like that I was trying to execute different suites at the same time.

So, I came up with multiple tabs:

http://localhost:9876/?shardIndex=0&numShards=4
http://localhost:9876/?shardIndex=1&numShards=4
http://localhost:9876/?shardIndex=2&numShards=4
http://localhost:9876/?shardIndex=3&numShards=4

at https://github.com/brandonros/karma/commit/40cf8eb79be7af9892448a16da5b6578cd3dd983

It is still very early. All it does is allow you to chunk the test files (I hardcoded that they have to contain Spec) across multiple tabs. I'll try to update if I make any worthwhile progress on a more complete package.

Edit: I just tested it and it doesn't really work. Karma kind of falls apart as soon as you start serving different tabs different tests. I'm not sure the architecture of Karma really supports parallel/concurrent testing. Even if I were able to work through the bugs and make the multi-tab approach work, I'd need an event and logic to go with it to wait until all browsers are idle.

Edit 2: Something already existed for gulp, but I am still stuck on Grunt, so I came up with this. Hopefully it will help somebody as a rough draft for their Gruntfile.js. The improvement really isn't that great because loading all of the files in 2, 3, 4, 8 tabs isn't the best. I am going to try WebWorkers next.

// Note: assumes this lives in a Gruntfile where grunt, gruntConfig,
// filterKarmaConfig, debugPrep and noBrowsers are defined elsewhere.
var glob = require('glob');

// Split an array into chunks of `size` elements.
// (The original relied on a non-standard Array.prototype.chunk helper.)
function chunk(array, size) {
  var chunks = [];
  for (var i = 0; i < array.length; i += size) {
    chunks.push(array.slice(i, i + size));
  }
  return chunks;
}

function setupConcurrentTestTasks() {
  var shards = Array.apply(null, {length: 4});
  var specFiles = glob.sync('source/js/**/*Spec.js');
  var chunkSize = Math.ceil(specFiles.length / shards.length);
  var chunkedSpecFiles = chunk(specFiles, chunkSize);

  shards.forEach(function(shard, index) {
    // Generate a per-shard Karma config for running against source files.
    gruntConfig.copy['karmaSource-' + index] = {
      src: 'karma.conf.tpl.js',
      dest: 'target/karma-' + index + '.conf.source.js',
      options: {
        process: function(content) {
          var sourceFiles = [
            'source/js/**/*.module.js',
            'source/js/app.js',
            '<%= ngtemplates.utils.dest %>',
            '<%= ngtemplates.components.dest %>',
            '<%= ngtemplates.framework.dest %>',
            'source/js/**/!(*Spec).js'
          ].concat(chunkedSpecFiles[index]);

          return filterKarmaConfig(sourceFiles, content);
        }
      }
    };

    // Generate a per-shard Karma config for running against the dist bundle.
    gruntConfig.copy['karmaDist-' + index] = {
      src: 'karma.conf.tpl.js',
      dest: 'target/karma-' + index + '.conf.dist.js',
      options: {
        process: function(content) {
          var distFile = [
            'target/dist/js/<%= pkg.name %>-<%= pkg.version %>.min.js'
          ].concat(chunkedSpecFiles[index]);

          return filterKarmaConfig(distFile, content);
        }
      }
    };

    gruntConfig.karma['debug-' + index] = {
      configFile: 'target/karma-' + index + '.conf.source.js',
      options: {
        preprocessors: debugPrep,
        singleRun: grunt.option('singleRun'),
        browsers: noBrowsers ? [] : ['<%= karmaBrowser %>']
      }
    };

    gruntConfig.karma['dist-' + index] = {
      configFile: 'target/karma-' + index + '.conf.dist.js',
      preprocessors: {},
      reporters: ['progress', 'junit', 'threshold'],
      options: {
        browsers: noBrowsers ? [] : ['<%= karmaBrowser %>']
      }
    };

    // Register the per-shard tasks and add them to the concurrent groups.
    gruntConfig.concurrent.source.tasks.push('karma-source-' + index);
    gruntConfig.concurrent.dist.tasks.push('karma-dist-' + index);

    grunt.registerTask('karma-source-' + index, [
      'copy:karmaSource-' + index,
      'wiredep:test',
      'karma:debug-' + index
    ]);

    grunt.registerTask('karma-dist-' + index, [
      'copy:karmaDist-' + index,
      'wiredep:target',
      'wiredep:test',
      'karma:dist-' + index
    ]);
  });
}

grunt.registerTask('testSourceConcurrently', [
  'clean',
  'writeConfig',
  'copy:test',
  'copy:stage',
  'ngtemplates',
  'concurrent:source'
]);

grunt.registerTask('testDistConcurrently', [
  'compile',
  'concurrent:dist'
]);

@rschuft

rschuft commented Oct 12, 2017

I created a plugin that automatically shards tests across the listed browsers (e.g. if you want two sets, you list two browsers: browsers: ['ChromeHeadless', 'ChromeHeadless']). It doesn't address one of the concerns listed in this thread: running tests at the same time; it forces concurrency: 1. It does, however, fix the memory problems of having too many specs loading in a single browser, and it works correctly with coverage reporting.

karma-sharding

UPDATE: Version 4.0.0 of karma-sharding supports parallel browser execution and no longer forces concurrency to 1.

@joeljeske

I have also created a plugin, karma-parallel, similar to the one that @rschuft made, but a bit different. It supports running specs in different browser instances by splitting up the commonly used describe blocks, as opposed to splitting up the spec files themselves. This allows you to run tests in parallel even after using a bundler such as karma-webpack or karma-browserify. It also supports testing in multiple types of browsers.

karma-parallel
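
To illustrate the describe-splitting technique (a simplified sketch, not karma-parallel's actual implementation; the round-robin assignment and hardcoded shard values are assumptions):

(function(global) {
  var originalDescribe = global.describe;
  var shardIndex = 0;  // in reality, injected per browser instance
  var totalShards = 4; // in reality, injected per browser instance
  var suiteCount = 0;
  var depth = 0;

  global.describe = function(name, fn) {
    // Nested suites always run; only top-level suites are sharded.
    if (depth > 0) {
      return originalDescribe(name, fn);
    }
    if (suiteCount++ % totalShards !== shardIndex) {
      return; // this top-level suite belongs to another shard
    }
    depth++;
    try {
      // Jasmine/Mocha invoke the suite body synchronously, so depth
      // tracking correctly distinguishes nested describes.
      return originalDescribe(name, fn);
    } finally {
      depth--;
    }
  };
})(window);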

@guilhermejcgois

@joeljeske have you tested this with an Angular project using ng-cli? We have one, and your plugin seems interesting to us; we'll give it a try.

@joeljeske

I do not use an ng-cli project on a regular basis, but I have done basic testing on one and it works just fine. Please log any issues if you run into something.

One note: it is not yet tested with coverage reporting. It would likely be best to disable coverage reporting when using karma-parallel. Coverage reporting is a future feature of the plugin.

@rschuft

rschuft commented Jan 24, 2018

The karma-sharding plugin doesn't play well with the ng-cli because of the way webpack bundles the specs together before the middleware engages with it. Hopefully @joeljeske can bypass that limitation with his approach.

@joeljeske

@guilhermejcgois, the latest release of karma-parallel is tested and compliant with ng-cli projects. Code coverage support was just introduced. Would love to hear feedback on your experience with it.

@intellix

intellix commented May 31, 2018

My results from karma-parallel on MBP 15" Early 2013 (8 cores): joeljeske/karma-parallel#1 (comment)

1x --------------
real 1m25.659s
user 1m42.960s
sys 0m10.470s

2x --------------
real 1m15.996s
user 2m1.312s
sys 0m13.601s

3x --------------
real 1m11.746s
user 2m23.417s
sys 0m18.218s

4x --------------
real 1m13.970s
user 2m36.800s
sys 0m19.698s

7x just timed out

I was expecting at least a 2x perf improvement, but it doesn't seem to really make a difference. Is anyone doing sharding/parallelism and actually seeing positive results? Would be nice to see.

@wbhob

wbhob commented May 10, 2019

I'm having an issue where, when one shard disconnects, it simply stops that group of tests and moves on to the next – then, it returns exit code 0. Any thoughts on what may be causing this?

Karma ^3.1.4
Jasmine ^2.99
karma-parallel ^0.3.1

@rschuft

rschuft commented May 10, 2019 via email

@wbhob

wbhob commented May 10, 2019

I’ve tried Chrome and ChromeHeadless

@juanmendes

juanmendes commented Aug 12, 2019

@guilhermejcgois, the latest release of karma-parallel is tested and compliant with ng-cli projects. Code coverage support was just introduced. Would love to hear feedback on your experience with it.

@joeljeske We use karma-parallel and we're getting false positives because one of the shards isn't running all the tests. There are test problems, but karma-parallel hides them because it doesn't fail the tests. After a browser disconnects, typically a shard is restarted from the beginning. However, sometimes they restart and run just one of the tests. It definitely has to do with the size of the project. We're running 3,500 tests, where over 200 of them are ng component DOM tests. I've created an issue here: joeljeske/karma-parallel#42
