Adding synchronize to cupy implementations #335

adarshyoga · 2024-02-12T19:39:17Z

Have you provided a meaningful PR description?

Multiple successive cupy kernel invocations are queued to the same cuda stream, but are executed asynchronously with host code. cuda.stream.synchronize calls are need to ensure that the kernel code completes execution before the data is accessed by the host. This PR adds these synchronize calls to the kernels.

Report from dpbench execution generated after this change is below.

Summary of current implementation

input_size benchmark problem_preset cupy
0 20MB black_scholes S Success
1 8KB gpairs S Success
2 1MB l2_norm S Success
3 8MB pairwise_distance S Success
4 1MB pca S Success
5 7MB rambo S Success

Summary of current implementation

input_size benchmark problem_preset cupy
0 5GB black_scholes M 97.05ms
1 512KB gpairs M 60.02ms
2 8GB l2_norm M 493.05ms
3 8GB pairwise_distance M 399.43ms
4 1GB pca M 133.48ms
5 1GB rambo M 60.23ms

… accurately

oleksandr-pavlyk · 2024-02-13T22:12:04Z

FYI, here is how CuPy's own benchmark does it https://github.com/cupy/cupy/blob/main/examples/gemm/utils.py#L12-L22:

def benchmark(func, args, n_run):
    times = []
    for _ in range(n_run):
        start = cp.cuda.Event()
        end = cp.cuda.Event()
        start.record()
        func(*args)
        end.record()
        end.synchronize()
        times.append(cp.cuda.get_elapsed_time(start, end))  # milliseconds
    return times

Or https://github.com/cupy/cupy/blob/main/examples/cg/cg.py#L10-L17:

@contextlib.contextmanager
def timer(message):
    cupy.cuda.Stream.null.synchronize()
    start = time.time()
    yield
    cupy.cuda.Stream.null.synchronize()
    end = time.time()
    print('%s:  %f sec' % (message, end - start))

But what is being proposed in this PR works just the same.

oleksandr-pavlyk

LGTM! I like the change to use cp instead of np in cupy implementation files.

adding synchronize to cupy implementations to capture execution times…

ba7c7a6

… accurately

adarshyoga self-assigned this Feb 12, 2024

adarshyoga enabled auto-merge February 13, 2024 07:16

diptorupd requested a review from oleksandr-pavlyk February 13, 2024 21:58

oleksandr-pavlyk approved these changes Feb 13, 2024

View reviewed changes

adarshyoga merged commit 0646f1e into main Feb 13, 2024
70 of 71 checks passed

adarshyoga deleted the cupy_sync_add branch February 13, 2024 22:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding synchronize to cupy implementations #335

Adding synchronize to cupy implementations #335

adarshyoga commented Feb 12, 2024 •

edited

Loading

oleksandr-pavlyk commented Feb 13, 2024

oleksandr-pavlyk left a comment

Adding synchronize to cupy implementations #335

Adding synchronize to cupy implementations #335

Conversation

adarshyoga commented Feb 12, 2024 • edited Loading

Summary of current implementation

Summary of current implementation

oleksandr-pavlyk commented Feb 13, 2024

oleksandr-pavlyk left a comment

Choose a reason for hiding this comment

adarshyoga commented Feb 12, 2024 •

edited

Loading