
Tuning search space should not be large #61

@RaminNateghi

Description


Recently, I ran an experiment using the Rastrigin function to see how well Ray works as we increase the dimensionality of the search space. Rastrigin, f(x) = 10n + Σᵢ (xᵢ² - 10 cos(2πxᵢ)), has a lot of local minima, but its global minimum is 0 at the origin [Fig. 1].
The results of the experiment show that as the dimensionality of the search space increases, the results deteriorate. This raises my concern about whether Ray is able to find the best config when there are a lot of tunable hyperparameters. Maybe we need to define some trimming tools to prune the search space and reduce its dimensionality, or we could hold some hyperparameters constant, especially those we think we might not get any benefit from tuning (see the sketch after Fig. 1).

[Fig. 1: 3-D surface plot of the 2-D Rastrigin function]
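As a rough sketch of the "hold some hyperparameters constant" idea: in Ray Tune, a param_space entry can be a plain constant instead of a search distribution, so low-impact variables can be pinned while the rest stay tunable. A minimal example (the split between tuned and pinned variables is hypothetical, and the pinned value 0.0 just happens to be Rastrigin's known optimum):

from ray import tune

d = 8
n_tuned = 3  # hypothetical: tune only the first 3 dimensions

# The first n_tuned variables get a search distribution; the rest are pinned
# to the constant 0.0, shrinking the effective search space from 8-D to 3-D.
search_space = {
    f"var_{i}": tune.quniform(-2, 2, 0.05) if i < n_tuned else 0.0
    for i in range(d)
}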

Code:

from ray import tune
from ray.tune.schedulers import PopulationBasedTraining
import matplotlib.pyplot as plt
import numpy as np

# Rastrigin function, used as the Tune trainable.
# f(x) = 10*n + sum_i(x_i^2 - 10*cos(2*pi*x_i)); global minimum f(0) = 0.
def rastrigin(config):
    x = list(config.values())
    n = len(x)
    score = 10 * n + sum(xi**2 - 10 * np.cos(2 * np.pi * xi) for xi in x)
    return {"score": score}

# plot the 2-D Rastrigin function in 3-D
# Note: its global minimum is at the origin.
x = np.linspace(-5.12, 5.12, 100)
y = np.linspace(-5.12, 5.12, 100)
X, Y = np.meshgrid(x, y)
Z = rastrigin({"a":X, "b":Y})

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z['score'], cmap='viridis')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()


# run a Ray experiment to find the global minimum of the n-D Rastrigin function
max_dim = 10
scores = []
for d in range(1, max_dim):

    search_space = {f"var_{i}": tune.quniform(-2, 2, 0.05) for i in range(d)}

    # Note: PBT mutates trials across training iterations, but the trainable
    # above reports only a single result per trial.
    scheduler = PopulationBasedTraining(
        time_attr="training_iteration",
        hyperparam_mutations=search_space,
        metric="score",
        mode="min",  # we are minimizing the Rastrigin score
    )

    tuner = tune.Tuner(
        rastrigin,
        param_space=search_space,
        tune_config=tune.TuneConfig(
            num_samples=50,
            scheduler=scheduler,
        ),
    )

    results = tuner.fit()
    scores.append(results.get_best_result(metric="score", mode="min").metrics["score"])

The resulting loss value vs. the dimensionality of the search space:
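For reference, a minimal sketch of how that plot can be reproduced from the scores list collected in the loop above (assumes the experiment code has already run):

plt.figure()
plt.plot(range(1, max_dim), scores, marker='o')
plt.xlabel('dimensionality of search space')
plt.ylabel('best score found')
plt.show()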
