Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changes to support CUDA build w/o GPU in runtime #13674

Open
wants to merge 14 commits into
base: development
Choose a base branch
from

Conversation

gilpazintel
Copy link

No description provided.

@gilpazintel gilpazintel reopened this Jan 12, 2025
@gilpazintel gilpazintel requested a review from Nir-Az January 13, 2025 16:32
@gilpazintel gilpazintel marked this pull request as ready for review January 13, 2025 16:33
#endif

namespace librealsense
{
void unpack_z16_y8_from_sr300_inzi( uint8_t * const dest[], const uint8_t * source, int width, int height, int actual_size)
void unpack_z16_y8_from_sr300_inzi(uint8_t* const dest[], const uint8_t* source, int width, int height, int actual_size)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try to avoid unneeded code changes, I saw it in several files.
Please go over formatting changes line and revert to only see the new actual changes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will do. these changes are following invoking the coding formatting I got from you. but will revert the changes which are not part of the fix. NP

in += count;
}
#endif
#ifndef RS2_USE_CUDA
for (int i = 0; i < count; ++i) *out_ir++ = *in++ >> 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we want this code if no Cuda is found?

Copy link
Author

@gilpazintel gilpazintel Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we want to run this code if CUDA is selected AND nvidia GPU available - fixed the code.

in += count;
}
#endif
#ifndef RS2_USE_CUDA
for (int i = 0; i < count; ++i) *out_ir++ = *in++ << 6;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Copy link
Author

@gilpazintel gilpazintel Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we want to run this code if CUDA is selected AND nvidia GPU available - fixed the code.

@@ -23,6 +23,7 @@
#include "proc/sse/sse-pointcloud.h"
#endif
#include "proc/neon/neon-pointcloud.h"
#include "rsutils/rsutilgpu.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change the file path to
#include <rsutils/accelerators/gpu.h>

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

rscuda::split_frame_y16_y16_from_y12i_cuda(dest, count, reinterpret_cast<const rscuda::y12i_pixel_mipi*>(source));
}
#endif
#ifndef RS2_USE_CUDA
split_frame(dest, count, reinterpret_cast<const y12i_pixel_mipi*>(source),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want this when CUDa is not available right?
This is pure MIPI (Jetson code) so we need to make sure we run on Jetson with D457 too.
We also need specifically to check Y16 depth format is working

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made a check on vtgjetson1, the Y16 is working fine.

@@ -27,12 +32,12 @@ namespace librealsense
y12i_to_y16y16_mipi::y12i_to_y16y16_mipi(int left_idx, int right_idx)
: y12i_to_y16y16_mipi("Y12I to Y16L Y16R Transform", left_idx, right_idx) {}

y12i_to_y16y16_mipi::y12i_to_y16y16_mipi(const char * name, int left_idx, int right_idx)
y12i_to_y16y16_mipi::y12i_to_y16y16_mipi(const char* name, int left_idx, int right_idx)
Copy link
Collaborator

@Nir-Az Nir-Az Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert changes at lines 35,37,40...

rscuda::split_frame_y16_y16_from_y12i_cuda(dest, count, reinterpret_cast<const rscuda::y12i_pixel*>(source));
}
#endif
#ifndef RS2_USE_CUDA
split_frame(dest, count, reinterpret_cast<const y12i_pixel*>(source),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this right?
You can try to check if Y16 is working in USB and see the result

Copy link
Author

@gilpazintel gilpazintel Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed the code. can you explain about the USB? how to check it? thanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check on a linux machine with / without cuda that Y16 format works

#endif
}

y12i_to_y16y16::y12i_to_y16y16(int left_idx, int right_idx)
: y12i_to_y16y16("Y12I to Y16L Y16R Transform", left_idx, right_idx) {}

y12i_to_y16y16::y12i_to_y16y16(const char * name, int left_idx, int right_idx)
y12i_to_y16y16::y12i_to_y16y16(const char* name, int left_idx, int right_idx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert please..also line 31,34.. please go over all the PR

rscuda::split_frame_y8_y8_from_y8i_cuda(dest, count, reinterpret_cast<const y8i_pixel*>(source));
}
#endif
#ifndef RS2_USE_CUDA
split_frame(dest, count, reinterpret_cast<const y8i_pixel*>(source),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same
Check Y8 IR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed code. Y8 is working on vtgjetson1 (still having issue with vtgjetson4 - CUDA driver version is insufficient for CUDA runtime version_ - WIP

@@ -0,0 +1,18 @@
// License: Apache 2.0. See LICENSE file in root directory.
// Copyright(c) 2023 Intel Corporation. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2025

#pragma once

#ifdef __cplusplus
extern "C" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the extern "c" ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a must, removed.

@@ -0,0 +1,43 @@
// License: Apache 2.0. See LICENSE file in root directory.
// Copyright(c) 2023 Intel Corporation. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2025

{
cudaGetDeviceCount(&gpuDeviceCount);
retVal = gpuDeviceCount > 0;
if (retVal == false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to get this log only once and not on every check right?
I would call this function at initialization process and log it and here only return true/false

Copy link
Author

@gilpazintel gilpazintel Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code modified. we will log only once (when gpuDeviceCout == -1). once it is initialized to 0 (as no GPU or due to an error) or >0, we won't log anymore.

@gilpazintel gilpazintel reopened this Jan 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants