Keras Cortex Tutorial
This tutorial demonstrates how to use KerasCortex, a meta-estimator that uses Large Language Models (LLMs) to iteratively improve neural network architectures through self-reflection.
Overview
KerasCortex wraps a base Keras estimator and uses an LLM (via DSPy) to suggest modifications to its build_model method. At each step it reasons about what to change, applies the suggested code, re-fits the model, evaluates it on validation data, and keeps the best-performing architecture.
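Conceptually, the loop is: propose a new build_model, re-fit, score on the validation split, and keep the candidate only if it improves. The sketch below is illustrative pseudologic rather than the actual KerasCortex implementation; propose stands in for the DSPy-driven LLM call and evaluate for re-fitting the candidate and scoring it.

from typing import Callable

def reflection_loop(
    initial_code: str,
    propose: Callable[[str, list], str],  # (best_code, history) -> new build_model source
    evaluate: Callable[[str], float],     # candidate source -> validation score (e.g. R^2)
    n_iterations: int = 5,
) -> tuple[str, float]:
    best_code = initial_code
    best_score = evaluate(best_code)              # score of the unmodified architecture
    history = [(best_code, best_score)]
    for _ in range(n_iterations):
        candidate = propose(best_code, history)   # LLM reasons over the code and past results
        score = evaluate(candidate)               # re-fit and score on validation data
        history.append((candidate, score))
        if score > best_score:                    # keep only suggestions that actually improve
            best_code, best_score = candidate, score
    return best_code, best_score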
Prerequisites
To run the full tutorial, you'll need:
- An OpenAI API key (or access to another LLM via DSPy)
- A JAX, TensorFlow, or PyTorch backend for Keras
- The dspy library for LLM orchestration
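If you're starting from a clean environment, an install along these lines should cover the pieces used below (the package names are assumptions based on the public PyPI names at the time of writing; adjust to your setup and preferred backend):

%pip install centimators dspy python-dotenv jax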
!!! Warning
    This module is a work in progress and is not yet ready for production use. The approach is highly experimental and can easily overfit to the validation data it optimizes against.
In [1]:
%load_ext dotenv
%dotenv # ensure OPENAI_API_KEY is set in .env
In [2]:
import os

os.environ["KERAS_BACKEND"] = "jax"  # must be set before Keras is imported
os.environ["OPENAI_API_KEY"] = "sk-proj-..."  # or rely on the key loaded from .env above
import polars as pl
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from centimators.model_estimators import MLPRegressor
from centimators.keras_cortex import KerasCortex
In [3]:
X, y = make_regression(
    n_samples=10000,
    n_features=100,
    noise=0.1,
    random_state=42,
)
X = pl.DataFrame(X)
y = pl.Series(y)

# train / val / test split (60 / 20 / 20)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=42
)
print(X_train.shape, X_val.shape, X_test.shape)
(6000, 100) (2000, 100) (2000, 100)
In [4]:
base_mlp = MLPRegressor(
    hidden_units=(64, 32),
    dropout_rate=0.1,
)

cortex = KerasCortex(
    base_estimator=base_mlp, n_iterations=6, lm="openai/gpt-4o-mini", verbose=True
)
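Here lm is a model-identifier string that KerasCortex forwards to DSPy. If you need more control over the language model (temperature, token limits, a different provider), you can construct the DSPy client yourself. The snippet below is a sketch and assumes KerasCortex also accepts a pre-built dspy.LM instance, which the fitted repr further down suggests.

import dspy

# Sketch: configure the language model explicitly via DSPy
# (assumes KerasCortex accepts a dspy.LM instance as well as a model-name string).
lm = dspy.LM("openai/gpt-4o-mini", temperature=0.7, max_tokens=4000)

cortex = KerasCortex(
    base_estimator=base_mlp,
    n_iterations=6,
    lm=lm,
    verbose=True,
)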
In [5]:
cortex.fit(
    X_train,
    y_train,
    validation_data=(X_val, y_val),
    epochs=10,
    batch_size=516,
)
WARNING:2025-05-23 02:23:28,521:jax._src.xla_bridge:909: An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.
--- Initial fit ---
Epoch losses (1-10): 38910.3633, 37584.1914, 38222.5898, 38282.4961, 37073.2383, 36388.0781, 36612.1328, 34292.5352, 33463.0625, 31842.3281
Baseline validation score: 0.1919

--- Iteration 1 ---
Reasoning: To improve the validation metrics, particularly the R2 score, we can consider several modifications to the model architecture and training process. First, we can experiment with adding more layers or increasing the number of units in the existing layers to allow the model to learn more complex patterns. Additionally, we can adjust the activation function to a non-linear one like 'relu' or 'swish' which often performs better in practice. Furthermore, we can implement batch normalization after each dense layer to stabilize and accelerate training. Lastly, we can consider using a different optimizer like 'Adam' with a learning rate schedule to improve convergence.

Suggested code:

def build_model(self):
    """Construct a simple MLP with the configured hyper-parameters."""
    inputs = layers.Input(shape=(self._n_features_in_,), name="features")
    x = inputs
    for units in self.hidden_units:
        x = layers.Dense(units, activation='relu')(x)
        if self.dropout_rate > 0:
            x = layers.Dropout(self.dropout_rate)(x)
        x = layers.BatchNormalization()(x)  # Adding batch normalization
    outputs = layers.Dense(self.output_units, activation="linear")(x)
    self.model = models.Model(inputs=inputs, outputs=outputs, name="mlp_regressor")
    self.model.compile(
        optimizer='adam',  # Using Adam optimizer
        loss=self.loss_function,
        metrics=self.metrics,
    )
    return self

Epoch losses (1-10): 37700.0586, 37608.2930, 37138.8477, 37910.6211, 37415.6562, 36536.8359, 35623.8125, 35256.6914, 34580.7734, 34898.6445
No improvement (0.1016 <= 0.1919), keeping best code.

--- Iteration 2 ---
Reasoning: To improve the validation metrics, particularly the R2 score, we can consider several modifications to the model architecture and training process. The current model uses a simple MLP structure with ReLU activations and dropout, which may not be sufficient for capturing complex patterns in the data.
1. **Activation Function**: Experimenting with different activation functions, such as Leaky ReLU or ELU, can help mitigate issues with dying neurons and improve learning.
2. **Layer Configuration**: Adding more layers or increasing the number of units in existing layers can enhance the model's capacity to learn complex relationships.
3. **Regularization**: Adjusting dropout rates or adding L2 regularization can help prevent overfitting, which is crucial for improving validation metrics.
4. **Learning Rate**: Fine-tuning the learning rate of the optimizer can lead to better convergence during training.
Given these considerations, I suggest modifying the activation function to Leaky ReLU, increasing the number of units in the hidden layers, and adjusting the dropout rate if necessary.

Suggested code:

def build_model(self):
    """Construct a simple MLP with the configured hyper-parameters."""
    inputs = layers.Input(shape=(self._n_features_in_,), name="features")
    x = inputs
    for units in self.hidden_units:
        x = layers.Dense(units, activation='leaky_relu')(x)  # Changed to Leaky ReLU
        if self.dropout_rate > 0:
            x = layers.Dropout(self.dropout_rate)(x)
        x = layers.BatchNormalization()(x)  # Adding batch normalization
    outputs = layers.Dense(self.output_units, activation="linear")(x)
    self.model = models.Model(inputs=inputs, outputs=outputs, name="mlp_regressor")
    self.model.compile(
        optimizer='adam',  # Using Adam optimizer
        loss=self.loss_function,
        metrics=self.metrics,
    )
    return self

Epoch losses (1-10): 38861.8711, 37264.9961, 37546.3281, 37600.1172, 36725.0469, 36916.5859, 35638.0977, 35082.1836, 34401.2461, 33381.0312
No improvement (0.1285 <= 0.1919), keeping best code.

--- Iteration 3 ---
Reasoning: To improve the validation metrics (R2), we can consider several modifications to the model architecture and training process. The current model uses Leaky ReLU activation, which is a good choice, but we can experiment with adding more complexity to the model by increasing the number of layers or units. Additionally, we can adjust the dropout rate to prevent overfitting, especially if the model is too complex for the dataset. We can also consider using a different optimizer or adjusting the learning rate for better convergence. Finally, we can add more batch normalization layers to stabilize the learning process.

Suggested code:

def build_model(self):
    """Construct a simple MLP with the configured hyper-parameters."""
    inputs = layers.Input(shape=(self._n_features_in_,), name="features")
    x = inputs
    for units in self.hidden_units:
        x = layers.Dense(units, activation='leaky_relu')(x)  # Keeping Leaky ReLU
        if self.dropout_rate > 0:
            x = layers.Dropout(self.dropout_rate)(x)
        x = layers.BatchNormalization()(x)  # Adding batch normalization
    # Adding an additional Dense layer for more complexity
    x = layers.Dense(64, activation='leaky_relu')(x)  # Additional layer
    if self.dropout_rate > 0:
        x = layers.Dropout(self.dropout_rate)(x)
    outputs = layers.Dense(self.output_units, activation="linear")(x)
    self.model = models.Model(inputs=inputs, outputs=outputs, name="mlp_regressor")
    self.model.compile(
        optimizer='adam',  # Using Adam optimizer
        loss=self.loss_function,
        metrics=self.metrics,
    )
    return self

Epoch losses (1-10): 38137.1836, 38227.8203, 38262.1445, 37551.5586, 35559.9570, 33512.2930, 30176.9766, 26692.1738, 23776.6172, 20573.6172
Improvement! New validation score: 0.6164 > 0.1919

--- Iteration 4 ---
Reasoning: To improve the validation metrics (R2), we can consider several modifications to the model architecture and training process. The current model uses Leaky ReLU activations, which is good, but we can experiment with adding more complexity and regularization. The addition of more layers or units can help the model learn better representations. Additionally, we can adjust the learning rate of the Adam optimizer to ensure that the model converges more effectively. Implementing early stopping during training can also help prevent overfitting, which is crucial for improving validation metrics.

Suggested code:

def build_model(self):
    """Construct a simple MLP with the configured hyper-parameters."""
    inputs = layers.Input(shape=(self._n_features_in_,), name="features")
    x = inputs
    for units in self.hidden_units:
        x = layers.Dense(units, activation='leaky_relu')(x)  # Keeping Leaky ReLU
        if self.dropout_rate > 0:
            x = layers.Dropout(self.dropout_rate)(x)
        x = layers.BatchNormalization()(x)  # Adding batch normalization
    # Adding an additional Dense layer for more complexity
    x = layers.Dense(128, activation='leaky_relu')(x)  # Increased units for additional layer
    if self.dropout_rate > 0:
        x = layers.Dropout(self.dropout_rate)(x)
    outputs = layers.Dense(self.output_units, activation="linear")(x)
    self.model = models.Model(inputs=inputs, outputs=outputs, name="mlp_regressor")
    self.model.compile(
        optimizer='adam',  # Using Adam optimizer
        loss=self.loss_function,
        metrics=self.metrics,
    )
    return self

Epoch losses (1-10): 40391.3164, 38161.0781, 37445.9609, 36245.1328, 34253.0195, 30194.2812, 26387.5801, 22368.8633, 16535.9961, 12131.1436
Improvement! New validation score: 0.8433 > 0.6164

--- Iteration 5 ---
Reasoning: To improve validation metrics, particularly R2, we can consider several modifications to the model architecture and training process. The current model has a relatively simple structure with a few dense layers and dropout for regularization. However, the performance log indicates that the model's complexity may not be sufficient to capture the underlying patterns in the data.
1. **Increase Model Complexity**: Adding more layers or increasing the number of units in existing layers can help the model learn more complex representations.
2. **Adjust Activation Functions**: While Leaky ReLU is a good choice, experimenting with other activation functions like 'relu' or 'swish' might yield better results.
3. **Learning Rate Adjustment**: Fine-tuning the learning rate of the Adam optimizer can significantly impact convergence and performance.
4. **Regularization Techniques**: Besides dropout, we could also consider L2 regularization on the dense layers to prevent overfitting.
Given these considerations, I suggest increasing the number of units in the additional dense layer and experimenting with a different activation function.

Suggested code:

def build_model(self):
    """Construct a simple MLP with the configured hyper-parameters."""
    inputs = layers.Input(shape=(self._n_features_in_,), name="features")
    x = inputs
    for units in self.hidden_units:
        x = layers.Dense(units, activation='relu')(x)  # Changed to ReLU for potential better performance
        if self.dropout_rate > 0:
            x = layers.Dropout(self.dropout_rate)(x)
        x = layers.BatchNormalization()(x)  # Adding batch normalization
    # Adding an additional Dense layer with increased units for more complexity
    x = layers.Dense(256, activation='relu')(x)  # Increased units for additional layer
    if self.dropout_rate > 0:
        x = layers.Dropout(self.dropout_rate)(x)
    outputs = layers.Dense(self.output_units, activation="linear")(x)
    self.model = models.Model(inputs=inputs, outputs=outputs, name="mlp_regressor")
    self.model.compile(
        optimizer='adam',  # Using Adam optimizer
        loss=self.loss_function,
        metrics=self.metrics,
    )
    return self

Epoch losses (1-10): 38065.1797, 37779.0117, 37103.6758, 36028.3477, 33991.7109, 30745.7578, 24856.4180, 18043.0879, 12032.8086, 7566.1025
Improvement! New validation score: 0.8788 > 0.8433

--- Iteration 6 ---
Reasoning: To improve the validation metrics (R2), we can consider several modifications to the model architecture. The current model has a relatively high number of units in the last dense layer (256), which may lead to overfitting, especially if the dataset is not large enough. Reducing the complexity of the model by decreasing the number of units in the last layer can help generalize better. Additionally, we can introduce a more advanced activation function like 'swish' or 'gelu' which may provide better performance than 'leaky_relu' or 'relu'. We can also consider adding more dropout layers or increasing the dropout rate to further prevent overfitting.

Suggested code:

def build_model(self):
    """Construct a simple MLP with the configured hyper-parameters."""
    inputs = layers.Input(shape=(self._n_features_in_,), name="features")
    x = inputs
    for units in self.hidden_units:
        x = layers.Dense(units, activation='swish')(x)  # Changed to Swish activation
        if self.dropout_rate > 0:
            x = layers.Dropout(self.dropout_rate)(x)
        x = layers.BatchNormalization()(x)  # Adding batch normalization
    # Reducing the number of units in the additional Dense layer
    x = layers.Dense(128, activation='swish')(x)  # Decreased units for additional layer
    if self.dropout_rate > 0:
        x = layers.Dropout(self.dropout_rate)(x)
    outputs = layers.Dense(self.output_units, activation="linear")(x)
    self.model = models.Model(inputs=inputs, outputs=outputs, name="mlp_regressor")
    self.model.compile(
        optimizer='adam',  # Using Adam optimizer
        loss=self.loss_function,
        metrics=self.metrics,
    )
    return self

Epoch losses (1-10): 38398.5625, 38273.7734, 37329.9453, 37087.1641, 34895.7227, 31744.3477, 28108.6406, 24307.8398, 20269.2207, 16637.2168
No improvement (0.6662 <= 0.8788), keeping best code.
Out[5]:
KerasCortex(base_estimator=MLPRegressor(output_units=1, optimizer=<class 'keras.src.optimizers.adam.Adam'>, learning_rate=0.001, loss_function='mse', metrics=['mse'], model=None, distribution_strategy=None, hidden_units=(64, 32), activation='relu', dropout_rate=0.1), lm=<dspy.clients.lm.LM object at 0x7f9c4414daf0>, n_iterations=6, verbose=True)
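Finally, it's worth checking how the best architecture the cortex found generalizes beyond the validation split it was optimizing against. Below is a minimal sketch, assuming KerasCortex delegates predict to the best re-fitted estimator in the usual scikit-learn fashion:

from sklearn.metrics import r2_score

# Score the best model found by the cortex on the held-out test set.
y_pred = cortex.predict(X_test)
print("Test R2:", r2_score(y_test, y_pred))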