Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: gpuminer initialization #1384

Merged
merged 3 commits into from
Jan 16, 2025

Conversation

brianp
Copy link
Collaborator

@brianp brianp commented Jan 15, 2025

Description

The GPU miner hardware monitor was being initialized too early causing the app never to recognize the gpu miner later on. Fix the initialization sequence, and only append the gpu miner data to telemetry on demand instead of on telemetry initialization.

Motivation and Context

GPU Miners were being left undetected on systems that previously detected GPU.

Fixes #1378

How Has This Been Tested?

Manually

@brianp brianp requested a review from MCozhusheck January 15, 2025 11:17
@brianp
Copy link
Collaborator Author

brianp commented Jan 15, 2025

@MCozhusheck can you also test the event sending? We want to make sure events are sent on the early startup steps without GPU miner information, but are also send with GPU information later on.

The GPU miner hardware monitor was being initialized too early causing
the app never to recognize the gpu miner later on. Fix the
initialization sequence, and only append the gpu miner data to telemetry
on demand instead of on telemetry initialization.
@brianp brianp force-pushed the fix/gpu-miner-telemetry-sequence branch from 6397126 to 39c3553 Compare January 15, 2025 11:23
@MCozhusheck
Copy link
Collaborator

Event before GPU detection lacks field cpu_name

{
    "app_id": "nBG9^pIoE%$HiIz4LwUJ",
    "cpu_name": "Unknown",
    "created_at": {
        "nanos_since_epoch": 399415213,
        "secs_since_epoch": 1736951393
    },
    "event_name": "checking-latest-version-gpuminer",
    "event_value": {
        "percentage": 20,
        "service": "gpuminer"
    },
    "gpu_name": "NVIDIA GeForce RTX 2060",
    "os": "Linux",
    "user_id": "v3,G7Z49mgdwdA2knf8rvWTWk3yP57G,0.8.42,47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU",
    "version": "0.8.42"
}

Same case for evet sent after GPU detection

{
  "app_id": "nBG9^pIoE%$HiIz4LwUJ",
  "cpu_name": "Unknown",
  "created_at": {
    "nanos_since_epoch": 903781686,
    "secs_since_epoch": 1736951393
  },
  "event_name": "waiting-for-minotari-node-to-start",
  "event_value": {
    "percentage": 35,
    "service": "minotari_node"
  },
  "gpu_name": "NVIDIA GeForce RTX 2060",
  "os": "Linux",
  "user_id": "v3,G7Z49mgdwdA2knf8rvWTWk3yP57G,0.8.42,47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU",
  "version": "0.8.42"
}

I think CPU hardware detection needs same treatment as GPU where each event sent we ask hardware status. I am not sure which change made this but I can see my events from 2 days ago with correct cpu_name

{
  "app_id": "9tBm6bKcWBV4PbCudiws",
  "cpu_name": "Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz",
  "created_at": {
    "nanos_since_epoch": 917006039,
    "secs_since_epoch": 1736764883
  },
  "event_name": "waiting-for-minotari-node-to-start",
  "event_value": {
    "percentage": 35,
    "service": "minotari_node"
  },
  "gpu_name": "NVIDIA GeForce RTX 2060",
  "os": "Linux",
  "user_id": "v3,G7Z49mgdwdA2knf8rvWTWk3yP57G,0.8.42,47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU",
  "version": "0.8.42"
}

@MCozhusheck
Copy link
Collaborator

I think this should be at the beginning of setup_inner for hardware status to give CPU and GPU name

HardwareStatusMonitor::current().initialize().await?;

@brianp
Copy link
Collaborator Author

brianp commented Jan 15, 2025

It's interesting that the "Event before GPU detection" successfully has the GPU name?

I suppose this wasn't the first run. So GPU had already successfully been detected.

@brianp
Copy link
Collaborator Author

brianp commented Jan 15, 2025

I think this should be at the beginning of setup_inner for hardware status to give CPU and GPU name
HardwareStatusMonitor::current().initialize().await?;

The problem with moving it before the binary downloaders, is that if the GPU hasn't previously been detected (such as first run) then it won't get detected until next restart. For the entire app.

@brianp brianp closed this Jan 15, 2025
@brianp brianp deleted the fix/gpu-miner-telemetry-sequence branch January 15, 2025 16:02
@brianp brianp restored the fix/gpu-miner-telemetry-sequence branch January 15, 2025 16:02
@brianp brianp reopened this Jan 15, 2025
@brianp
Copy link
Collaborator Author

brianp commented Jan 15, 2025

Updated cpu to be ondemand as well.

I'm okay if they aren't present as metrics during the initial syncing.

@MCozhusheck
Copy link
Collaborator

Now the `cpu_name' have appeared back in the events.

{
  "app_id": "9$D%!%Nnebgqy6M$lVlB",
  "cpu_name": "Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz",
  "created_at": {
    "nanos_since_epoch": 500875082,
    "secs_since_epoch": 1737017543
  },
  "event_name": "waiting-for-minotari-node-to-start",
  "event_value": {
    "percentage": 35,
    "service": "minotari_node"
  },
  "gpu_name": "NVIDIA GeForce RTX 2060",
  "os": "Linux",
  "user_id": "v3,G7Z49mgdwdA2knf8rvWTWk3yP57G,0.8.42,47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU",
  "version": "0.8.42"
}

@brianp brianp merged commit 1b17298 into tari-project:main Jan 16, 2025
8 checks passed
@brianp brianp deleted the fix/gpu-miner-telemetry-sequence branch January 16, 2025 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[0.8.42] Can't detect GPU Miner (locally, and on beta builds)
2 participants