Evaluators
NumerBlox offers evaluators for both Numerai Classic and Numerai Signals.
Common Metrics
For both NumeraiClassicEvaluator and NumeraiSignalsEvaluator you can set a custom metrics_list containing all metrics you want to compute.
By default, the metrics are ["mean_std_sharpe", "apy", "max_drawdown", "calmar_ratio"].
All valid metrics for metrics_list are:
- "mean_std_sharpe" -> Mean, standard deviation and Sharpe ratio based on Corrv2 (Numerai Correlation).
- "apy" -> Annual Percentage Yield.
- "max_drawdown" -> Max drawdown.
- "calmar_ratio" -> Calmar Ratio.
- "autocorrelation" -> Autocorrelation (1st order).
- "max_feature_exposure" -> Max feature exposure.
- "smart_sharpe" -> Smart Sharpe.
- "legacy_mean_std_sharpe" -> Mean, standard deviation and Sharpe ratio based on legacy model contribution.
- "fn_mean_std_sharpe" -> Feature Neutral mean, standard deviation and Sharpe ratio (can take some time to compute).
- "tb200_mean_std_sharpe" -> Mean, standard deviation and Sharpe ratio based on TB200.
- "tb500_mean_std_sharpe" -> Mean, standard deviation and Sharpe ratio based on TB500.
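To make the first few metric names concrete, here is a minimal sketch of how a mean, standard deviation, Sharpe ratio and max drawdown can be derived from a list of per-era scores. It is not NumerBlox's implementation: the function names are hypothetical, the population standard deviation is an assumption, and the drawdown is taken over a simple additive cumulative curve, which may differ from what the library computes.

```python
import statistics

def mean_std_sharpe(per_era_scores):
    """Mean, standard deviation and Sharpe ratio of per-era scores."""
    mean = statistics.fmean(per_era_scores)
    # Assumption: population standard deviation (ddof=0).
    std = statistics.pstdev(per_era_scores)
    sharpe = mean / std
    return mean, std, sharpe

def max_drawdown(per_era_scores):
    """Largest peak-to-trough drop of the cumulative (summed) score curve."""
    cumulative, peak, worst = 0.0, 0.0, 0.0
    for score in per_era_scores:
        cumulative += score
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst

# Hypothetical per-era correlation scores, for illustration only.
scores = [0.02, 0.01, -0.03, 0.04]
mean, std, sharpe = mean_std_sharpe(scores)
drawdown = max_drawdown(scores)
```

With these illustrative scores the cumulative curve peaks after the second era and gives all of it back in the third, which is the drop that max_drawdown reports.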
The following metrics only work if benchmark_cols are defined in full_evaluation:
- "mc_mean_std_sharpe" -> Mean, standard deviation and Sharpe ratio based on model contribution.
- "corr_with" -> Correlation with benchmark predictions.
- "ex_diss_pearson" (alias "ex_diss") -> Exposure Dissimilarity to benchmark predictions using Pearson correlation.
- "ex_diss_spearman" -> Exposure Dissimilarity to benchmark predictions using Spearman correlation. Slower than "ex_diss_pearson".
- "churn" -> Churn is a statistic describing how the alpha scores of a signal change over time.
- "tb200_churn" -> Churn based on TB200.
- "tb500_churn" -> Churn based on TB500.
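As a rough illustration of the churn idea, the sketch below scores how much a signal changes between two consecutive periods as one minus the Pearson correlation of the two prediction vectors. This definition is an assumption made for illustration; the exact formula NumerBlox and Numerai use (for example a rank-based correlation, or averaging over eras) may differ, and the helper names are hypothetical.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def churn(prev_preds, curr_preds):
    """Assumed definition: 1 - correlation between consecutive periods.

    Both inputs must cover the same tickers in the same order.
    """
    return 1.0 - pearson(prev_preds, curr_preds)

# Identical predictions do not change at all; reversed ones change maximally.
no_change = churn([0.1, 0.5, 0.9], [0.1, 0.5, 0.9])
full_flip = churn([0.1, 0.5, 0.9], [0.9, 0.5, 0.1])
```

Under this assumed definition, values near 0 indicate a stable signal and values near 2 indicate a signal that inverts from one period to the next.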
Numerai Classic specific metrics
NumeraiClassicEvaluator can also compute FNCv3. To compute it, add "fncv3_mean_std_sharpe" to the metrics_list.
from numerblox.evaluation import NumeraiClassicEvaluator, FAST_METRICS
# Validation DataFrame to compute metrics on
# Should have at least era_col, pred_cols and target_col columns.
val_df = ...
evaluator = NumeraiClassicEvaluator(era_col="era", metrics_list=FAST_METRICS)
metrics = evaluator.full_evaluation(
    val_df,
    pred_cols=["prediction"],
    target_col="target",
    benchmark_cols=["benchmark1", "benchmark2"],
)
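For experimenting with the call above without real Numerai data, a toy validation DataFrame with the required columns can be built as follows. This is only a sketch: the era labels, the number of rows, and the five-bucket target values are assumptions, and because the values are random the resulting metrics carry no meaning.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n_eras, rows_per_era = 4, 50
n_rows = n_eras * rows_per_era

# Hypothetical era labels "0001".."0004", each repeated rows_per_era times.
era_labels = [f"{i:04d}" for i in range(1, n_eras + 1)]

val_df = pd.DataFrame({
    "era": np.repeat(era_labels, rows_per_era),
    # Random predictions and benchmark predictions in [0, 1).
    "prediction": rng.uniform(0, 1, n_rows),
    "benchmark1": rng.uniform(0, 1, n_rows),
    "benchmark2": rng.uniform(0, 1, n_rows),
    # Assumption: targets bucketed into five values.
    "target": rng.choice([0.0, 0.25, 0.5, 0.75, 1.0], n_rows),
})
```

The column names match the era_col, pred_cols, target_col and benchmark_cols arguments used in the example above.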
Numerai Signals specific metrics
NumeraiSignalsEvaluator offers Numerai Signals diagnostics scores. This is a special operation: it calls Numerai servers and needs additional authentication, so it is not included in full_evaluation.
Example of how to get diagnostic scores for Numerai Signals:
import pandas as pd
from numerblox.misc import Key
from numerblox.evaluation import NumeraiSignalsEvaluator

evaluator = NumeraiSignalsEvaluator()
# A Numerai Signals model name you use.
model_name = "MY_MODEL"
# NumerBlox Key for accessing the Numerai API (placeholder credentials)
key = Key(pub_id="Hello", secret_key="World")
# DataFrame with validation data containing prediction, date, ticker and data_type columns
val_df = pd.DataFrame()
evaluator.get_neutralized_corr(val_df, model_name=model_name, key=key, corr_col="validationRic")
# Returns a Pandas DataFrame with validationRic.
Custom functions
In addition to the default metrics, evaluators can be augmented with custom metrics. This can be done by defining a dictionary of functions and arguments.
The custom function dictionary should have the following structure:
{
    "func1":  # Metric name
    {
        "func": custom_function,  # Function to call
        "args": {  # General arguments (can be any type)
            "dataf": "dataf",
            "some_arg": "some_arg",
        },
        "local_args": ["dataf"]  # List of local variables to use/resolve
    },
    "func2":
    {
        "func": custom_function2,
        "args": {
            "dataf": "dataf",
            "some_arg": "some_arg",
        },
        "local_args": ["dataf"]
    },
    (...)
}
- The main keys (func1 and func2 in the example) will be the metric key names for the output evaluation DataFrame.
- The func key should be a function that takes the arguments defined in args as keyword arguments. func should be a callable function or class (i.e. a class that implements __call__).
- The args key should be a dictionary with arguments to pass to func. The values of the dictionary can be any type. Arguments that you want resolved as local variables should be defined as strings (see the local_args explanation).
- The local_args key should be a list of strings that refer to variables that exist locally in the evaluation_one_col function. These names are resolved to the corresponding local variables before being passed to func. This allows you to use evaluation_one_col variables like dataf, pred_col, target_col, col_stats, mean, per_era_numerai_corrs, etc.
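The resolution rules above can be pictured with a small standalone sketch. It is not the code NumerBlox runs internally: resolve_and_call and the local_vars dictionary are hypothetical stand-ins for what happens inside evaluation_one_col, and plain Python lists replace the DataFrame so the example runs without any dependencies.

```python
def resolve_and_call(spec, local_vars):
    """Call spec["func"] after swapping local_args entries for local values."""
    kwargs = dict(spec["args"])
    for name in spec.get("local_args", []):
        # The string stored under `name` is the local variable to look up.
        kwargs[name] = local_vars[kwargs[name]]
    return spec["func"](**kwargs)

def residuals(dataf, target_col, pred_col, val):
    """Mean of (target - prediction + val) over all rows."""
    diffs = [t - p + val for t, p in zip(dataf[target_col], dataf[pred_col])]
    return sum(diffs) / len(diffs)

spec = {
    "func": residuals,
    "args": {
        "dataf": "dataf",            # resolved: listed in local_args
        "pred_col": "pred_col",      # resolved: listed in local_args
        "target_col": "target_col",  # resolved: listed in local_args
        "val": 0.0001,               # static: passed through unchanged
    },
    "local_args": ["dataf", "pred_col", "target_col"],
}

# Hypothetical stand-in for the variables available during evaluation.
local_vars = {
    "dataf": {"prediction": [0.2, 0.4], "target": [0.25, 0.5]},
    "pred_col": "prediction",
    "target_col": "target",
}

result = resolve_and_call(spec, local_vars)
```

Only the names listed in local_args are looked up; every other entry in args, such as val here, reaches the function exactly as written.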
Example of how to use custom functions in NumeraiClassicEvaluator:
import numpy as np
from numerblox.evaluation import NumeraiClassicEvaluator

def residuals(dataf, target_col, pred_col, val: float):
    """ Simple dummy func: mean of residuals. """
    return np.mean(dataf[target_col] - dataf[pred_col] + val)

custom_functions = {
    "residuals": {
        # Callable function
        "func": residuals,
        "args": {
            # Strings referring to local variables
            "dataf": "dataf",
            "pred_col": "pred_col",
            "target_col": "target_col",
            # Static argument
            "val": 0.0001,
        },
        # List of local variables to use/resolve
        "local_args": ["dataf", "pred_col", "target_col"]
    },
}
evaluator = NumeraiClassicEvaluator(custom_functions=custom_functions)
# In the evaluator, residuals(dataf=dataf, pred_col="prediction", target_col="target", val=0.0001) is called.
metrics = evaluator.full_evaluation(
    val_df,
    pred_cols=["prediction"],
    target_col="target",
)
# metrics will contain a "residuals" column.