# Surrogate Model

## 1. Introduction

The `obsidian.surrogates` submodule is a key component of the Obsidian Bayesian optimization library. It provides a collection of surrogate models that approximate the objective function during optimization. These surrogate models are essential for exploring the parameter space efficiently and for making informed decisions about which points to evaluate next.

## 2. Available Surrogate Models

The `obsidian.surrogates` submodule offers several types of surrogate models:

1. **Gaussian Process (GP)**: The default surrogate model, suitable for most optimization tasks.
2. **Mixed Gaussian Process (MixedGP)**: A GP model that handles mixed continuous and categorical input spaces.
3. **Deep Kernel Learning GP (DKL)**: A GP model with a neural network feature extractor.
4. **Flat GP**: A GP model with non-informative or no prior distributions.
5. **Prior GP**: A GP model with custom prior distributions.
6. **Multi-Task GP (MTGP)**: A GP model for multi-output optimization.
7. **Deep Neural Network (DNN)**: A dropout neural network model.

## 3. How to Use Surrogate Models

To use a surrogate model in your optimization process, you typically don't need to interact with it directly: the Obsidian optimizer handles the creation and management of the surrogate model. However, if you need to create a surrogate model manually, you can do so with the `SurrogateBoTorch` class:

```python
from obsidian.surrogates import SurrogateBoTorch
from obsidian.parameters import ParamSpace

# Define your parameter space
param_space = ParamSpace([...])  # Define your parameters here

# Create a surrogate model (default is GP)
surrogate = SurrogateBoTorch(model_type='GP')

# Fit the model to your data
surrogate.fit(X, y)

# Make predictions
mean, std = surrogate.predict(X_new)
```

## 4. Customization Options

### 4.1 Model Selection

You can choose a different surrogate model by specifying the `model_type` parameter when creating a `SurrogateBoTorch` instance. The available options, with a selection sketch after the list, are:

- `'GP'`: Standard Gaussian Process
- `'MixedGP'`: Mixed-input Gaussian Process
- `'DKL'`: Deep Kernel Learning GP
- `'GPflat'`: Flat (non-informative prior) GP
- `'GPprior'`: Custom prior GP
- `'MTGP'`: Multi-Task GP
- `'DNN'`: Dropout Neural Network
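
For example, to use the deep kernel learning surrogate instead of the default GP (a minimal sketch; `X` and `y` are training data as in Section 3):

```python
from obsidian.surrogates import SurrogateBoTorch

# Swap the default GP for a deep kernel learning surrogate
surrogate = SurrogateBoTorch(model_type='DKL')
surrogate.fit(X, y)
```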

### 4.2 Hyperparameters

You can pass custom hyperparameters to the surrogate model using the `hps` parameter:

```python
# 'your_custom_param' and `value` are placeholders; the valid keys depend on the model type
surrogate = SurrogateBoTorch(model_type='GP', hps={'your_custom_param': value})
```

### 4.3 Custom GP Models

The submodule provides several custom GP implementations, selected via the `model_type` strings shown above (see the sketch after this list):

- `PriorGP`: A GP with custom prior distributions
- `FlatGP`: A GP with non-informative or no prior distributions
- `DKLGP`: A GP with a neural network feature extractor
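
Based on the mapping in Section 4.1, these classes correspond to `model_type` values as in this minimal sketch (`X` and `y` as in Section 3):

```python
from obsidian.surrogates import SurrogateBoTorch

# 'GPprior' -> PriorGP, 'GPflat' -> FlatGP, 'DKL' -> DKLGP
surrogate = SurrogateBoTorch(model_type='GPprior')
surrogate.fit(X, y)
```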

### 4.4 Custom Neural Network Model

The `DNN` class provides a customizable dropout neural network model. You can adjust parameters such as the dropout probability, hidden-layer width, and number of hidden layers; see the example in Section 5.3.

## 5. Examples

### 5.1 Using a standard GP surrogate

```python
from obsidian.surrogates import SurrogateBoTorch
from obsidian.parameters import ParamSpace
import pandas as pd

# Define your parameter space
param_space = ParamSpace([...])  # Define your parameters here

# Assume X and y are your input features and target variables
X = pd.DataFrame(...)
y = pd.Series(...)

surrogate = SurrogateBoTorch(model_type='GP')
surrogate.fit(X, y)

# Make predictions
X_new = pd.DataFrame(...)
mean, std = surrogate.predict(X_new)
```

### 5.2 Using a Mixed GP for categorical and continuous variables

```python
surrogate = SurrogateBoTorch(model_type='MixedGP')
# cat_dims should be a list of indices for categorical variables in your input space
surrogate.fit(X, y, cat_dims=[0, 2])  # Assuming columns 0 and 2 are categorical
```

### 5.3 Using a DNN surrogate

```python
# The 'hps' parameter allows you to customize the DNN architecture
surrogate = SurrogateBoTorch(
    model_type='DNN',
    hps={'p_dropout': 0.1, 'h_width': 32, 'h_layers': 3},
)
surrogate.fit(X, y)
```

## 6. Advanced Usage

### 6.1 Saving and Loading Models

You can save and load surrogate models using the `save_state()` and `load_state()` methods:

```python
# Save model state
state = surrogate.save_state()

# Load model state
loaded_surrogate = SurrogateBoTorch.load_state(state)
```
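
If `save_state()` returns a JSON-serializable dictionary (an assumption worth verifying against the docstrings), the state can also be persisted to disk:

```python
import json

# Write the saved state to a file (hypothetical filename)
with open('surrogate_state.json', 'w') as f:
    json.dump(state, f)

# Read it back and restore the surrogate
with open('surrogate_state.json') as f:
    state = json.load(f)
loaded_surrogate = SurrogateBoTorch.load_state(state)
```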

### 6.2 Model Evaluation

You can evaluate the performance of a surrogate model using the `score()` method:

```python
loss, r2_score = surrogate.score(X_test, y_test)
```

This concludes the user guide for the `obsidian.surrogates` submodule. For more detailed information, please refer to the source code and docstrings in the individual files.
# Acquisition Function

## 1. Introduction

The `obsidian.acquisition` submodule is a crucial component of the Obsidian Bayesian optimization library. It provides acquisition functions that guide the optimization process by determining which points in the parameter space should be evaluated next. These acquisition functions balance exploration of uncertain areas against exploitation of promising regions, which is key to efficient optimization.

## 2. Key Components

The acquisition submodule includes several acquisition functions, both standard and custom implementations:

### 2.1 Standard Acquisition Functions

- Expected Improvement (EI)
- Probability of Improvement (PI)
- Upper Confidence Bound (UCB)
- Noisy Expected Improvement (NEI)
- Expected Hypervolume Improvement (EHVI)
- Noisy Expected Hypervolume Improvement (NEHVI)

### 2.2 Custom Acquisition Functions

- qMean: Optimizes for the maximum value of the posterior mean
- qSpaceFill: Optimizes for the maximum of the minimum distance between a candidate point and the training data
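
As a minimal sketch, these can be requested by name in the `suggest` call of a configured `BayesianOptimizer` named `optimizer` (see Section 3.1). The strings `'Mean'` and `'SF'` are assumed aliases for qMean and qSpaceFill here; check the submodule's docstrings for the exact names:

```python
# Assumed acquisition aliases: 'Mean' -> qMean, 'SF' -> qSpaceFill
X_suggest, eval_suggest = optimizer.suggest(acquisition=['Mean'])
X_suggest, eval_suggest = optimizer.suggest(acquisition=['SF'])
```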

## 3. Understanding Acquisition Functions

### 3.1 Expected Improvement (EI)

EI calculates the expected amount by which we will improve upon the current best observed value.

Mathematical formulation:
```
EI(x) = E[max(f(x) - f(x+), 0)]
```
where f(x+) is the current best observed value.

Example usage:
```python
from obsidian.optimizer import BayesianOptimizer

optimizer = BayesianOptimizer(X_space=param_space)
X_suggest, eval_suggest = optimizer.suggest(acquisition=['EI'])
```

### 3.2 Upper Confidence Bound (UCB)

UCB balances exploration and exploitation by selecting points with high predicted values or high uncertainty.

Mathematical formulation:
```
UCB(x) = μ(x) + β * σ(x)
```
where μ(x) is the predicted mean, σ(x) is the predicted standard deviation, and β is a parameter that controls the exploration-exploitation trade-off.

Example usage:
```python
X_suggest, eval_suggest = optimizer.suggest(acquisition=[{'UCB': {'beta': 2.0}}])
```

### 3.3 Noisy Expected Improvement (NEI)

NEI is a variant of EI that accounts for noise in the observations, making it more suitable for real-world problems with measurement uncertainty.

Example usage:
```python
X_suggest, eval_suggest = optimizer.suggest(acquisition=['NEI'])
```

## 4. Advanced Usage

### 4.1 Multi-Objective Optimization

For multi-objective optimization problems, you can use specialized acquisition functions:

```python
X_suggest, eval_suggest = optimizer.suggest(acquisition=['NEHVI'])
```

### 4.2 Customizing Acquisition Functions

Some acquisition functions accept parameters that customize their behavior. These can be specified in the `suggest` method:

```python
X_suggest, eval_suggest = optimizer.suggest(
    acquisition=[{'EI': {'inflate': 0.01}}]
)
```

### 4.3 Custom Acquisition Functions

If you need to implement a custom acquisition function, you can extend the `MCAcquisitionFunction` class from BoTorch:

```python
from botorch.acquisition import MCAcquisitionFunction
from botorch.utils.transforms import t_batch_mode_transform


class CustomAcquisition(MCAcquisitionFunction):
    @t_batch_mode_transform()
    def forward(self, X):
        # X arrives with shape (batch, q, d) after the t-batch transform
        posterior = self.model.posterior(X)
        mean = posterior.mean            # shape (batch, q, m)
        std = posterior.variance.sqrt()
        # Example custom logic: posterior mean plus a small exploration
        # bonus, reduced to one value per t-batch as BoTorch expects
        return (mean + 0.1 * std).sum(dim=-1).max(dim=-1).values
```

## 5. Comparing Acquisition Functions

Different acquisition functions have different strengths:

- EI and PI are good for exploiting known good regions but may underexplore.
- UCB provides a tunable exploration-exploitation trade-off.
- NEI and NEHVI are robust to noisy observations.
- qMean is purely exploitative and can be useful in the final stages of optimization.
- qSpaceFill is purely explorative and can be useful for initial space exploration.

## 6. Best Practices

1. Choose acquisition functions based on your problem characteristics (e.g., noise level, number of objectives).
2. For noisy problems, consider noise-aware acquisition functions like NEI or NEHVI.
3. Experiment with different acquisition functions to find the best performance for your specific problem.
4. When using UCB, carefully tune the beta parameter to balance exploration and exploitation.
5. For multi-objective problems, EHVI and NEHVI are often good choices.
6. Consider using a sequence of acquisition functions, starting with more exploratory ones and moving to more exploitative ones as the optimization progresses (see the sketch after this list).
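
A minimal staging sketch for practice 6, assuming a configured `optimizer` as in Section 3.1 and the assumed aliases `'SF'` and `'Mean'` from Section 2.2:

```python
n_iterations = 20
for i in range(n_iterations):
    if i < 5:
        acquisition = ['SF']    # early: pure space-filling exploration (assumed alias)
    elif i < 15:
        acquisition = ['NEI']   # middle: noise-aware improvement
    else:
        acquisition = ['Mean']  # late: purely exploitative (assumed alias)
    X_suggest, eval_suggest = optimizer.suggest(acquisition=acquisition)
    # Evaluate X_suggest experimentally and refit the optimizer on the
    # augmented dataset before the next iteration
```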

## 7. Common Pitfalls

1. Using EI or PI on noisy problems, which can lead to over-exploitation of noisy observations.
2. Setting UCB's beta parameter too high (over-exploration) or too low (over-exploitation).
3. Using single-objective acquisition functions for multi-objective problems.
4. Not accounting for constraints when selecting acquisition functions.

This concludes the user guide for the `obsidian.acquisition` submodule. For more detailed information, please refer to the source code and docstrings in the individual files.