Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using xdist to run pytest in parallel #6620

Merged
merged 22 commits into from
Dec 1, 2024
Merged

Using xdist to run pytest in parallel #6620

merged 22 commits into from
Dec 1, 2024

Conversation

unkcpz
Copy link
Member

@unkcpz unkcpz commented Nov 19, 2024

Running tests in parallel using pytest-xdist reduce the time of ci-codes (2 cores).

  • ci-code / tests (3.9): ~28m -> ~15m

  • ci-code / tests (3.12): ~20m -> 12m

  • ci-code / presto: ~11m -> 6m

  • Fix all failed tests, see if it possible.

  • Bring the inconsistent test back to run.

@unkcpz unkcpz marked this pull request as draft November 19, 2024 23:44
Copy link

codecov bot commented Nov 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.89%. Comparing base (ef60b66) to head (f7d3072).
Report is 147 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6620      +/-   ##
==========================================
+ Coverage   77.51%   77.89%   +0.39%     
==========================================
  Files         560      567       +7     
  Lines       41444    42180     +736     
==========================================
+ Hits        32120    32852     +732     
- Misses       9324     9328       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@unkcpz unkcpz marked this pull request as ready for review November 20, 2024 07:40
@unkcpz unkcpz requested a review from danielhollas November 20, 2024 07:43
Copy link
Collaborator

@danielhollas danielhollas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very cool, thanks for doing this!

The PR itself looks good, but I have running into errors when I try to run locally. I'll need to investigate a bit.

requirements/requirements-py-3.12.txt Show resolved Hide resolved
)
def test_all_node_fields(group, name, data_regression):
@pytest.fixture
def available_entry_points():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is much better. Maybe the name of the fixture should be more specific, since you're only returning node and data entry points. Perhaps node_and_data_entry_points?

@unkcpz
Copy link
Member Author

unkcpz commented Nov 21, 2024

Run with my 16 cores, I found some issues that storage is accessed at the same time.
The problem is caused in tests/tools/archive/test_simple.py::test_base_data_nodes didn't clean the storage before the test that cause the shared storage has unsealed nodes.

    def test_load_backend_if_not_loaded_load_once(manager, monkeypatch):
        """Test :meth:`aiida.cmdline.utils.decorators.load_backend_if_not_loaded` calls ``get_profile_storage`` once."""
        mocked = mock.Mock()
    
        # This test assumes the ``load_backend_if_not_loaded`` uses ``get_profile_storage`` to load the profile, so we need
        # to first check that this is the case. If this changes, this first test will fail alerting that it needs to be
        # adapted.
        with monkeypatch.context() as context:
            context.setattr(manager.__class__, 'get_profile_storage', mocked)
            load_backend_if_not_loaded()
>           assert mocked.call_count == 1
E           AssertionError: assert 0 == 1
E            +  where 0 = <Mock id='125376138318368'>.call_count

tests/cmdline/utils/test_decorators.py:82: AssertionError

The typical error of not using tmp_path fixture is:

FAILED tests/transports/test_all_plugins.py::test_put_get_empty_string_file[core.ssh] - OSError: Error during mkdir of 'tmp_try' from folder '/tmp', maybe you don't have the permissions to do it, or the dire
ctory already exists? (Failure)

More tests randomly failed, should be fixed

ERROR tests/tools/archive/orm/test_groups.py::test_nodes_in_group - sqlalchemy.orm.exc.DetachedInstanceError: Instance <DbUser at 0x73ad11c478f0> is not bound to a Session; attribute refresh operation cannot
 proceed (Background on this error at: https://sqlalche.me/e/20...
ERROR tests/cmdline/commands/test_status.py::test_status_no_profile - TimeoutError: disconnect after 10 seconds
FAILED tests/cmdline/commands/test_rabbitmq.py::test_revive - AssertionError: Traceback (most recent call last):
FAILED tests/orm/test_querybuilder.py::TestBasic::test_tuples - AssertionError: assert 11 == 1
FAILED tests/tools/archive/orm/test_authinfo.py::test_create_all_no_authinfo - aiida.tools.archive.exceptions.ExportValidationError: All ProcessNodes must be sealed before they can be exported. Node(s) with PK(s): 61 is/are not se
aled.
FAILED tests/tools/archive/orm/test_authinfo.py::test_create_all_with_authinfo - aiida.tools.archive.exceptions.ExportValidationError: All ProcessNodes must be sealed before they can be exported. Node(s) with PK(s): 61 is/are not 
sealed.

@unkcpz
Copy link
Member Author

unkcpz commented Nov 29, 2024

Hi @danielhollas, for tests of creating an archive, it randomly failed because it tries to use the shared DB where may have unsealed data nodes.
I think it would be possible to use SQlite backend that having a DB only for a test here, correct?

@pytest.mark.usefixtures('aiida_localhost')
def test_create_all_no_authinfo(tmp_path):
    """Test archive creation that does not include authinfo."""
    filename1 = tmp_path / 'export1.aiida'
    create_archive(None, filename=filename1, include_authinfos=False)
    with get_format().open(filename1, 'r') as archive:
        assert archive.querybuilder().append(orm.AuthInfo).count() == 0


@pytest.mark.usefixtures('aiida_localhost')
def test_create_all_with_authinfo(tmp_path):
    """Test archive creation that does include authinfo."""
    filename1 = tmp_path / 'export1.aiida'
    create_archive(None, filename=filename1, include_authinfos=True)
    with get_format().open(filename1, 'r') as archive:
        assert archive.querybuilder().append(orm.AuthInfo).count() == 1

EDIT: never mind, I think I can just put the aiida_profile_clean fixture. The storage is per test and will reset with the fixture.

@unkcpz unkcpz requested a review from danielhollas November 29, 2024 15:46
@unkcpz
Copy link
Member Author

unkcpz commented Nov 29, 2024

I test with both my -n 16 in my laptop and -n 32 in my workstation multiple times. All test passed 😎

@@ -353,6 +353,7 @@ def test_graph_node_identifiers(self, node_id_type, monkeypatch, file_regression
# The order of certain output lines can be randomly ordered so we split the file in lines, sort, and then join
# them into a single string again. The node identifiers generated by the engine are of the form ``N{pk}`` and
# they also clearly vary, so they are replaced with the ``NODE`` placeholder.
string = '\n'.join(sorted(graph.graphviz.source.strip().split('\n')))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the sort happened first, then the order may depend on the pk number and the test fail with different string where lines order changed. I sort the lines after replace the pk number.

@unkcpz
Copy link
Member Author

unkcpz commented Nov 29, 2024

In my workstation, it failed now sometime with SSH banner, but I think it is because the SSH connection is too many and probably I have some sshd limitation setup on the machine. But I think it is fine.

@danielhollas
Copy link
Collaborator

This looks great, thanks so much for figuring out all the failing tests. 💪

Can you please post the exact SSH error that you are getting? I'd like to try out locally as well.

If you decide to merge this now, please create an issue as a follow up for the SSH problem. I guess we're unlikely to hit it in CI, but would be good to resolve it.

@unkcpz unkcpz merged commit 090dc1c into aiidateam:main Dec 1, 2024
41 of 50 checks passed
@unkcpz unkcpz deleted the xdist branch December 1, 2024 01:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants