Quadratic Unconstrained Binary Optimization (QUBO) refers to the problem of finding the binary variable assignment $x \in \{0, 1\}^n$ that minimizes the cost $c = x^\top Q x$, where $Q \in \mathbb{R}^{n \times n}$ is a matrix of real-valued weights, subject to no constraints.
The solvers for QUBO will be benchmarked using Maximum Independent Set (MIS) workloads. Given an undirected graph $G = (V, E)$, an independent set is a subset of vertices in which no two vertices are connected by an edge; the MIS problem is to find an independent set of maximum size.
The MIS problem has a natural QUBO formulation: for each node $i \in V$, a binary variable $x_i \in \{0, 1\}$ indicates whether the node is included in the independent set. Each included node lowers the cost by one, while each edge whose endpoints are both included incurs a penalty; with the standard penalty weight of 2 (any weight greater than 1 preserves the equivalence), the cost is
$$
c = -\sum_{i \in V} x_i + 2 \sum_{(i, j) \in E} x_i x_j\ .
$$
The MIS problem is NP-hard, and therefore any solver system approximates a solution. The cost of the QUBO formulation ($c = x^\top Q x$) is used to measure solution quality against a target optimality, rather than requiring the true optimum.
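For illustration, a minimal sketch of this formulation in Python (function names are hypothetical, and nodes are assumed to be labeled 0 to n-1):

```python
import networkx as nx
import numpy as np

def mis_qubo_matrix(graph: nx.Graph, penalty: float = 2.0) -> np.ndarray:
    """Build a QUBO matrix Q for MIS, so that cost(x) = x^T Q x.

    Diagonal entries of -1 reward including a node; symmetric off-diagonal
    entries add `penalty` to the cost whenever two adjacent nodes are both
    included in the set.
    """
    n = graph.number_of_nodes()
    Q = np.zeros((n, n))
    np.fill_diagonal(Q, -1.0)          # -sum_i x_i
    for i, j in graph.edges():
        Q[i, j] += penalty / 2.0       # split across (i, j) and (j, i) so that
        Q[j, i] += penalty / 2.0       # x^T Q x adds `penalty` per violated edge
    return Q

def qubo_cost(Q: np.ndarray, x: np.ndarray) -> float:
    """Evaluate the QUBO cost c = x^T Q x for a binary assignment x."""
    return float(x @ Q @ x)
```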
The benchmark's workload complexity is defined so that it can grow over time, as neuromorphic systems mature and are able to support larger problems. Workloads are parameterized as follows:
- Number of nodes, spaced in a pseudo-geometric progression: 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, ...
- Density of edges: 1%, 5%, 10%, 25%
- Problem seeds: seeds 0, 1, 2, 3, 4 are allowed for tuning. For official results, NeuroBench will announce five evaluation seeds at submission time. Unofficial results may use seeds generated randomly at runtime.
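The official generator ships with the benchmark (see below); purely to illustrate the parameter grid, a random workload of the matching form could be sampled with networkx (the names here are illustrative, not the official interface):

```python
import networkx as nx

# Illustrative parameter grid; the generator in the QUBO sub-directory
# of the repo is the authoritative front end for submissions.
NUM_NODES = [10, 25, 50, 100, 250, 500, 1000, 2500, 5000]
DENSITIES = [0.01, 0.05, 0.10, 0.25]
TUNING_SEEDS = [0, 1, 2, 3, 4]

def generate_workload(num_nodes: int, density: float, seed: int) -> nx.Graph:
    """Sample an undirected graph in which each possible edge is
    included independently with probability `density`."""
    return nx.gnp_random_graph(num_nodes, density, seed=seed)
```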
Each workload will be associated with a target optimality, which is the minimum cost found using a conventional solver algorithm.
Small QUBO workloads with fewer than 50 nodes will be solved to global optimality, corresponding to the true maximum independent set.
Larger workloads cannot reasonably be solved to global optimality. The D-Wave Tabu CPU sampler will be used with 100 reads and 50 restarts, and the QUBO solution with the best cost found (the best-known solution, BKS) sets the target optimality for the tuning workload seeds. For evaluation workload seeds, the same method will be used to set the target optimality. NeuroBench will provide target optimalities for workloads of up to 5000 nodes. Submissions are encouraged to continue scaling up the workload size along the pattern to demonstrate the capacity of their systems. The first group to tackle workloads of an unprecedented size should provide the benchmark solutions via a pull request.
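A sketch of how such a target might be computed, assuming the dwave-tabu package's TabuSampler (the keyword names follow its documented interface but should be checked against the version in use; the official scripts below are authoritative):

```python
import dimod
from tabu import TabuSampler  # dwave-tabu package

def best_known_cost(Q) -> float:
    """Estimate the target optimality (BKS cost) for a QUBO matrix Q,
    using 100 reads and 50 restarts as described above."""
    n = len(Q)
    qubo = {(i, j): Q[i][j] for i in range(n) for j in range(n) if Q[i][j] != 0}
    bqm = dimod.BinaryQuadraticModel.from_qubo(qubo)
    sampleset = TabuSampler().sample(bqm, num_reads=100, num_restarts=50)
    return sampleset.first.energy  # lowest cost found across all reads
```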
The dataset workload generator and scripts for the D-Wave Tabu sampler to compute the target optimalities are available in the QUBO sub-directory of this repo.
This code is expected to be used as the front-end data generator for all submissions. Submitters may take the general graph descriptions and modify their structure (e.g., load them into custom data structures); this preprocessing is not included in the reported results. The code also supports arbitrary workload generation, which is expected to be used to measure the largest supported workload size of the system (see Task and Metrics).
Based on the BKS found for each workload, the BKS-Gap optimality score of the solution found by the SUT is defined as
$$
\text{BKS-Gap} = \frac{c - c_{\mathrm{target}}}{c_{\mathrm{target}}}\ ,
$$
where $c$ is the QUBO cost of the solution found by the SUT and $c_{\mathrm{target}}$ is the target optimality (BKS cost) for the workload.
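In code this is a one-line helper (the name is hypothetical):

```python
def bks_gap(cost: float, c_target: float) -> float:
    """BKS-Gap of a solution cost relative to the target optimality;
    a solution matching the best-known solution scores 0."""
    return (cost - c_target) / c_target
```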
The QUBO task can be solved under two task scenarios, using the same workload generator as described above. The first scenario is fixed-timeout (FT), and the second scenario is time-to-solution (TTS).
Given each workload, the benchmark should report the BKS-Gap of the solution found by the SUT after a fixed runtime. The time begins after the graph has been loaded into the SUT.
Timeouts spread across orders of magnitude should be reported, in order to characterize how solution quality improves with the runtime budget.
Importantly, the QUBO solver needs a module to measure the cost of its solutions. This module should be considered part of the SUT, so its power must be included in the benchmarking results.
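Schematically, an FT run might look as follows, under the assumption of a hypothetical `solver.step()` SUT interface and the `qubo_cost` helper sketched earlier:

```python
import time

def run_fixed_timeout(solver, Q, timeout_s: float) -> float:
    """Schematic FT measurement: the graph is already loaded into the SUT,
    so timing starts here. The cost-tracking call is the module that counts
    as part of the SUT, and its power is included in the results."""
    start = time.perf_counter()
    best_cost = float("inf")
    while time.perf_counter() - start < timeout_s:
        x = solver.step()                            # one solver iteration
        best_cost = min(best_cost, qubo_cost(Q, x))  # on-SUT cost module
    return best_cost  # converted to a BKS-Gap for reporting
```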
Given each workload, the benchmark should report the SUT latency and energy to meet BKS-Gap thresholds of 0.1, 0.05, and 0.01. If the SUT fails to reach a threshold, no result is reported for it.
For latency, the time begins when the workload graph has been loaded into the system, and ends when the solver has detected that the score threshold has been met.
Energy measurements identify the energy required to reach each threshold, calculated as the average power consumed since the start of the workload (after loading into the system), multiplied by the latency to reach the score threshold.
As in the FT scenario, the solver needs to include a module that tracks the cost of candidate solutions in order to detect when each threshold has been reached. The computational demand and energy of this module must be included in the benchmarking results.
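Correspondingly, a schematic TTS run, reusing the hypothetical `solver.step()` interface and the `qubo_cost` and `bks_gap` helpers sketched above:

```python
import time

def run_time_to_solution(solver, Q, c_target: float,
                         thresholds=(0.1, 0.05, 0.01),
                         max_runtime_s: float = 3600.0) -> dict:
    """Schematic TTS measurement: latency to first meet each BKS-Gap
    threshold. Thresholds not met within `max_runtime_s` are absent from
    the result (no value reported). Energy would be reported separately
    as average power multiplied by each latency."""
    start = time.perf_counter()
    remaining = sorted(thresholds, reverse=True)  # loosest threshold first
    latencies = {}
    while remaining and time.perf_counter() - start < max_runtime_s:
        x = solver.step()                          # one solver iteration
        gap = bks_gap(qubo_cost(Q, x), c_target)   # on-SUT cost tracking
        while remaining and gap <= remaining[0]:
            latencies[remaining[0]] = time.perf_counter() - start
            remaining.pop(0)
    return latencies
```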
For both task scenarios, the power measurement should include all computational components, memory, and power domains which are active during the workload processing.
Ultimately, as neuromorphic systems are not yet mature, a single power measurement method cannot be applied to all submissions. It is the responsibility of the submitter to faithfully capture all active processing components for their system and fairly outline their methodology in the report. Official submissions are subject to potential audits, during which an auditor will inspect the methodology and request additions or revisions to the results and report if necessary.
No other constraints are placed on the algorithmic implementation or the hardware setup of the solver system. The system may process the workload using any internal representation, but it must produce the binary solution vector $x$, whose QUBO cost is used for scoring.
Algorithm and system parameter tuning is restricted, see Tuning below.
Separately from the benchmark dataset, the submission should also report the largest supported workload size for the given system, at both 1% edge density and 100% edge density (fully connected). The results are the greatest numbers of nodes for which the system can load and process a workload at each density.
Submitters should only use problem seeds 0, 1, 2, 3, 4 for tuning the parameters of their system and algorithm. Submitters may use workloads with a different number of nodes and/or density than those present in the dataset, though only the values included in the dataset should be reported as official results.
Submitters are not allowed to tune their system and algorithmic parameters individually for each workload. Any parameter must either be a constant value across all workloads or a function of the workload's number of nodes and edge density. The function must not depend on the random seeds used to create the workload or to run the solver, i.e., it must produce the same algorithmic parameters for all problem seeds at a given number of nodes and density. The same function must further be applicable to arbitrary workload sizes and edge densities.
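As an example of what is permitted, a tuning function might scale hypothetical solver parameters with workload size and density, while remaining independent of any seed:

```python
def solver_params(num_nodes: int, density: float) -> dict:
    """Permitted parameter schedule: a pure function of (num_nodes, density),
    identical for all problem seeds and defined for arbitrary workload sizes.
    The parameter names are hypothetical."""
    return {
        "iterations": 200 * num_nodes,          # budget grows with problem size
        "penalty_weight": 2.0,                  # constant across all workloads
        "noise_scale": 0.1 / (1.0 + density),   # smooth function of density
    }
```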