Performance inefficiency with dictionary loading and single-process limitation on powerful GPUs, workarounds, side effects #118

Open
FFAMax opened this issue Dec 22, 2024 · 0 comments


FFAMax commented Dec 22, 2024

I’ve noticed some inefficiencies in hashcat when working with powerful GPUs or multi-GPU setups:

  • Dictionary Loading Overhead: With large or numerous dictionaries, a significant share of each run is spent loading the dictionary into memory. The GPU idles for extended periods while the next dictionary loads, so utilization is suboptimal. On high-performance GPUs, the time spent computing is much smaller than the time spent waiting for the dictionary to load.

  • Single-Process Limitation: Hashcat allows only one instance per user at a time, which makes it impossible to run multiple processes in parallel to balance GPU usage. For example, while one process idles (loading a dictionary), another could be actively computing, making better use of the GPU.

  • Workaround and Its Issues: One workaround is to run multiple hashcat processes from different user accounts. This improves GPU utilization by overlapping dictionary loading with computation, but it introduces other problems:

    • Remote systems can detect the parallel sessions as suspicious or abnormal behavior and block the client.

Suggestion:

  • Improving how dictionaries are managed and loaded could significantly enhance performance on high-end setups. Preloading dictionaries, or loading the next dictionary asynchronously while computation continues, would be worth exploring. Additionally, allowing multiple processes per user, or implementing internal multi-threading to overlap loading and computation, could address these limitations.
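The asynchronous-loading idea above can be sketched as a producer-consumer prefetcher: a background thread reads the next wordlist from disk while the caller computes on the current one. This is a hypothetical illustration in Python, not hashcat code (hashcat itself is C); `prefetch_wordlists` and the `crack` step in the usage note are names invented for this sketch.

```python
import queue
import threading


def prefetch_wordlists(paths, buffer_size=2):
    """Yield wordlists one at a time, loading the next one on a
    background thread so the compute loop never waits on disk I/O."""
    q = queue.Queue(maxsize=buffer_size)  # bounds memory use

    def loader():
        for path in paths:
            with open(path, encoding="utf-8", errors="ignore") as f:
                words = f.read().splitlines()  # the slow, I/O-bound step
            q.put(words)                       # blocks if buffer is full
        q.put(None)                            # sentinel: no more wordlists

    threading.Thread(target=loader, daemon=True).start()

    while (words := q.get()) is not None:
        yield words  # the GPU/compute step runs while loader() reads ahead
```

Usage: `for batch in prefetch_wordlists(["a.txt", "b.txt"]): crack(batch)`. With `buffer_size=2` the loader stays at most two wordlists ahead, which overlaps I/O with computation without unbounded memory growth.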

Postscript:
For example, in rented computational setups or large rigs, these inefficiencies waste resources and money. Instead of achieving full utilization, the hardware often spends roughly half its time idle because the system cannot be kept fully loaded. This effectively wastes part of the user’s investment, since the output achieved does not match the cost.

This issue becomes even more critical in setups with specific use cases. Consider scenarios where computational rigs are part of energy-efficient homes. Here, the heat generated by computations is repurposed to warm living spaces. If the system idles for significant periods due to these inefficiencies, the installation’s purpose is undermined, making it ineffective and forcing users to abandon such applications.

Another scenario involves volunteers using rented commercial resources. These users are willing to spend their own funds on powerful GPU rigs but find that much of the time their equipment is idle due to frequent dictionary reloads. This reduces the return on investment and discourages participation in such projects.

Addressing the above issues would unlock the potential for more efficient and sustainable setups while making it viable for users to fully utilize their resources.
