Add getter and setter methods for compile_backend across accelerators. #5299

vshekhawat-hlab · 2024-03-19T10:02:52Z

Add getter and setter methods for compile_backend across accelerators, which provide a mechanism to retrieve the compile backend. These APIs handle user-defined backend selection and raise a ValueError with informative error messages for unsupported backends.

vshekhawat-hlab · 2024-03-19T10:08:20Z

Hi @loadams ,

Added support for retrieving the preferred compile backend from the accelerator. However, I'm unsure about the preferred backend for each accelerator. Therefore, I've set the default to 'inductor'. Could you please review this?

tjruwase · 2024-03-19T13:53:02Z

@vshekhawat-hlab, thanks for adding this API. I think it is a great start to addressing the generalization issue raised by @delock. However, there is a bit more work required to make this consistent with CompileConfig (DeepSpeed/deepspeed/runtime/compiler.py, line 56 in 3dd3d51):

class CompileConfig(DeepSpeedConfigModel):

Also, I think there are a few questions that need resolving. I will post them below as separate comments to reduce clutter.

tjruwase · 2024-03-19T14:03:24Z

Since users can specify compile_backend (string or Callable) through either ds_config or set_backend(), which value takes precedence between user specification and preferred/default backend in accelerator class?

I think user specification should have higher precedence.

tjruwase · 2024-03-19T14:08:15Z

Assuming user specification has higher precedence, how can users select the accelerator preferred backend without having to read the code? I see two options:

1. Users can implicitly select the accelerator preferred backend by leaving backend undefined in ds_config, and by not using the set_backend() API, or
2. Alternatively, users can explicitly use a special backend name, e.g., "accelerator".
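
The precedence rule and the two selection options above can be sketched as follows. This is only an illustration, not code from the PR: `resolve_backend` and `ACCELERATOR_SENTINEL` are hypothetical names, and the sentinel value is simply the "accelerator" example from the second option.

```python
from typing import Callable, Optional, Union

Backend = Union[str, Callable]

# Hypothetical sentinel, taken from the "accelerator" example in option 2.
ACCELERATOR_SENTINEL = "accelerator"


def resolve_backend(user_backend: Optional[Backend], preferred_backend: Backend) -> Backend:
    """Pick the backend to compile with, giving the user's choice precedence."""
    # Option 1: nothing was specified, so fall back to the accelerator's preference.
    if user_backend is None:
        return preferred_backend
    # Option 2: the user asked for the accelerator's preference by its special name.
    if user_backend == ACCELERATOR_SENTINEL:
        return preferred_backend
    # Any other explicit choice (string or callable) wins over the default.
    return user_backend
```

Both options can coexist in one function, so choosing between them is mostly a question of how explicit the configuration should be.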

tjruwase · 2024-03-19T14:09:40Z

What is the expected behavior if the user specifies a backend that is not supported by the accelerator? I see three options:

1. Fail immediately with a meaningful error message.
2. Disable compilation with a warning message.
3. Switch to the accelerator preferred backend with a warning message.
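
To make the trade-off between the three options concrete, here is a minimal sketch that implements each one as a policy. The function name `select_backend`, the `policy` parameter, and its values are assumptions made for illustration; none of them appear in the PR.

```python
import warnings
from typing import Optional, Sequence


def select_backend(requested: str,
                   supported: Sequence[str],
                   preferred: str,
                   policy: str = "fail") -> Optional[str]:
    """Return the backend to use, or None when compilation is disabled."""
    if requested in supported:
        return requested
    if policy == "fail":
        # Option 1: stop right away and tell the user what is available.
        raise ValueError(f"{requested} is not supported. Supported backends are {list(supported)}")
    if policy == "disable":
        # Option 2: keep running, but without compilation.
        warnings.warn(f"{requested} is not supported; compilation is disabled.")
        return None
    if policy == "fallback":
        # Option 3: keep compiling, but with the accelerator's preferred backend.
        warnings.warn(f"{requested} is not supported; using {preferred} instead.")
        return preferred
    raise ValueError(f"Unknown policy: {policy}")
```

Only the first policy guarantees that a run never silently uses a backend other than the one the user asked for.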

tjruwase · 2024-03-19T14:10:18Z

@delock, @tohtana, @umchand, @vshekhawat-hlab, @loadams, I will appreciate your thoughts on the above questions. Thanks!

vshekhawat-hlab · 2024-03-20T05:00:25Z

Thanks @tjruwase for the review.

My views on the above:
Query 1) Yes, I think user specification should get higher precedence.
Query 3) I think option 1, since the user passed a backend precisely because they expect to run with it. However, option 3 is also a good option.

I have updated get_compile_backend; let me know your views on this.

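import torch._dynamo  # also binds the name torch

def get_compile_backend(self, backend=None):
    # Backends currently registered with TorchDynamo.
    supported_backends = torch._dynamo.list_backends()
    if backend is None:
        # No user choice: return the accelerator's preferred backend
        # (preferred_backend is assumed to be defined by each accelerator).
        return preferred_backend
    elif backend in supported_backends:
        return backend
    else:
        raise ValueError(f"{backend} not supported by {self.device_name()}. Supported Backends are {supported_backends}")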

loadams · 2024-03-20T15:53:32Z

I agree with this. I think the best option, if the user specifies a backend that is not supported, is to fail immediately with a meaningful error message.

And I think the accelerator default should be implicit; if users override it in ds_config or via set_backend(), their choice would then take precedence.

delock · 2024-03-21T06:40:02Z

Agree that if the user explicitly specifies a backend and it is not supported, then it should fail immediately. Also, treating an undefined backend as a request for the default value is sound behavior.

Refactored the get_compile_backend method to handle user backend selection more efficiently. If no backend is specified, it now defaults to accelerator preferred backend. Additionally, improved error handling for unsupported backends by providing informative error messages.

vshekhawat-hlab · 2024-03-28T11:45:28Z

@tjruwase , @loadams , @delock ,

Can you please review the change? Is the current change okay?

loadams · 2024-04-15T20:36:36Z

Thanks @vshekhawat-hlab, we will work on reviewing this.

tjruwase · 2024-04-22T19:25:37Z

@vshekhawat-hlab, apologies for the delay. Based on all the feedback, the main change needed in my mind is to split into a get_compile_backend() and set_compile_backend(), so that the get_ API can be side effect-free.

get_compile_backend() should return the current backend of the accelerator.
Each accelerator can initialize the current backend as appropriate, probably to the default/preferred value.
set_compile_backend() allows users to modify the backend. It should perform the necessary sanity checks, including those performed by the current get_ API.

@vshekhawat-hlab, @tohtana, @umchand, @delock, @loadams, I will appreciate your thoughts on the above.

loadams · 2024-04-23T00:18:42Z

I like that approach, I think it makes the most sense to match existing APIs and provide default behavior and a normal way to override it.

vshekhawat-hlab · 2024-04-23T04:11:10Z

Hi @tjruwase,

I agree with your comment.

A couple of questions:

1. In the set API, the user has to pass a backend, so a None argument can't be allowed here. Right?
2. Do you see any more checks that can be added in the set API?

Code snippet with the get and set APIs:

    def get_compile_backend(self):
        return self.compile_backend

    def set_compile_backend(self, backend):
        supported_backends = torch._dynamo.list_backends()
        if backend in supported_backends:
            self.compile_backend = backend
        else:
            raise ValueError(
                f"{backend} not supported by {self.device_name()}. Supported Backends are {supported_backends}")
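
For readers who want to try the get/set split without a DeepSpeed checkout, here is a self-contained sketch of the same idea. It is not the merged implementation: `SketchAccelerator` is a hypothetical stand-in class, and a fixed list replaces the call to `torch._dynamo.list_backends()` so that the example runs without PyTorch.

```python
from typing import List, Optional


class SketchAccelerator:
    """Stand-in for an accelerator class; not DeepSpeed's actual code."""

    def __init__(self, supported_backends: Optional[List[str]] = None):
        # Assumption: 'inductor' is the default, as chosen earlier in this thread.
        self._compile_backend = "inductor"
        # Assumption: the real code would query torch._dynamo.list_backends() instead.
        self._supported = supported_backends or ["inductor", "eager"]

    def device_name(self) -> str:
        return "sketch"

    def get_compile_backend(self) -> str:
        # Side-effect free: only reports the current backend.
        return self._compile_backend

    def set_compile_backend(self, backend: str) -> None:
        # Validate before mutating, so a failed call leaves the old backend in place.
        if backend not in self._supported:
            raise ValueError(
                f"{backend} not supported by {self.device_name()}. "
                f"Supported Backends are {self._supported}")
        self._compile_backend = backend
```

With a list of string names, `None` is never a member, so a `None` argument falls through to the same `ValueError` without needing a separate check.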

tjruwase · 2024-04-23T13:49:03Z

In set API, user has to pass backend, None args can't be allowed here. Right?

Yes, it makes sense to reject None backend.

tjruwase · 2024-04-23T13:51:52Z

@vshekhawat-hlab, the proposed snippet looks good to me. I suspect that individual accelerators might later apply optimizations such as caching the supported backends list in the constructor. But your current proposal is a simple and clean start. Thanks for helping.
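
The caching idea mentioned above could look roughly like the sketch below. It is an assumption about how an accelerator might do it, not something this PR implements: `CachingAccelerator` and its `list_backends` parameter are hypothetical, and the callable is injected so the example runs without PyTorch.

```python
from typing import Callable, List


class CachingAccelerator:
    """Hypothetical accelerator that looks up supported backends only once."""

    def __init__(self, list_backends: Callable[[], List[str]], default: str = "inductor"):
        # Query the backend list a single time, in the constructor.
        self._supported = frozenset(list_backends())
        self._compile_backend = default

    def get_compile_backend(self) -> str:
        return self._compile_backend

    def set_compile_backend(self, backend: str) -> None:
        # Membership is checked against the cached set, not a fresh query.
        if backend not in self._supported:
            raise ValueError(
                f"{backend} not supported. Supported Backends are {sorted(self._supported)}")
        self._compile_backend = backend
```

One caveat with this design is that a cached list would not see backends registered after the accelerator object is constructed.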

vshekhawat-hlab · 2024-04-24T05:54:50Z

Updated the PR as discussed. Please review.

Added get_compile_backend API to accelrators to avoid accelrator checks in compile tests.

d272298

vshekhawat-hlab requested review from mrwyattii, tjruwase and loadams as code owners March 19, 2024 10:02

vshekhawat-hlab changed the title ~~Added get_compile_backend API in accelrators.~~ Added get_compile_backend API in accelerator. Mar 19, 2024

tjruwase mentioned this pull request Mar 19, 2024

Supporting custom backend for all accelerator. #5298

Closed

vshekhawat-hlab and others added 3 commits March 21, 2024 11:28

Merge branch 'master' into vshekhawat/accelerator_compile_backend

6b5065b

Merge branch 'master' into vshekhawat/accelerator_compile_backend

15357c4

vshekhawat-hlab and others added 2 commits April 3, 2024 14:41

Merge branch 'microsoft:master' into vshekhawat/accelerator_compile_backend

e8781d8

Merge branch 'master' into vshekhawat/accelerator_compile_backend

d92138b

loadams requested review from tohtana and umchand April 15, 2024 20:37

Merge branch 'master' into vshekhawat/accelerator_compile_backend

7e5d718

vshekhawat-hlab added 2 commits April 24, 2024 08:38

Add getter and setter methods for compile_backend across accelerators.

057f1c3

Merge branch 'master' into vshekhawat/accelerator_compile_backend

660e6e2

vshekhawat-hlab changed the title ~~Added get_compile_backend API in accelerator.~~ Add getter and setter methods for compile_backend across accelerators. Apr 24, 2024

Fix pre-commit checks for accelrators.

9335035

tjruwase approved these changes Apr 24, 2024

loadams approved these changes Apr 24, 2024

loadams added this pull request to the merge queue Apr 24, 2024

Merged via the queue into microsoft:master with commit fa8458b Apr 24, 2024
15 checks passed
