Postprocessing

Prediction manipulation.

Overview

The postprocessing procedure is similar to preprocessing. Preprocessors manipulate and/or add feature columns, while postprocessors manipulate and/or add prediction columns.

Every postprocessor should inherit from BasePostProcessor, take a NumerFrame as input, and return a NumerFrame as output. A postprocessor adds or manipulates one or more prediction columns, each prefixed with prediction.

0. BasePostProcessor

Some functionality is particular to postprocessors but does not belong in the generic Processor base class. This functionality is implemented in BasePostProcessor.


source

BasePostProcessor

 BasePostProcessor (final_col_name:str)

Base class for postprocessing objects.

Postprocessors manipulate or introduce new prediction columns in a NumerFrame.
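To make the contract concrete, here is a minimal self-contained sketch of a postprocessor following the described interface. The class and column names are illustrative, not part of the library; in practice you would inherit from BasePostProcessor and operate on a NumerFrame rather than a plain DataFrame.

```python
import pandas as pd

# Illustrative sketch of the postprocessor contract; the real base class
# ships with this library and the names below are hypothetical.
class SketchPostProcessor:
    def __init__(self, final_col_name: str):
        # Convention from the docs: prediction columns start with "prediction".
        assert final_col_name.startswith("prediction")
        self.final_col_name = final_col_name

    def transform(self, dataf: pd.DataFrame) -> pd.DataFrame:
        # Add or manipulate one or more prediction columns, then return the frame.
        dataf[self.final_col_name] = dataf.filter(like="prediction").mean(axis=1)
        return dataf

df = pd.DataFrame({"prediction_A": [0.1, 0.9], "prediction_B": [0.3, 0.7]})
out = SketchPostProcessor("prediction_mean").transform(df)
```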

1. Common postprocessing steps

We invite the Numerai community to develop new postprocessors so that everyone can benefit from new insights and research. This section implements commonly used postprocessing for Numerai.

1.0. Tournament agnostic

Postprocessing that works for both Numerai Classic and Numerai Signals.

1.0.1. Standardization

Standardizing is an essential step in order to reliably combine Numerai predictions. It is a default postprocessor for ModelPipeline.
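To illustrate what uniform standardization can look like, the sketch below applies a per-era rank transform that maps a prediction column into the range [0…1]. This approach is an assumption for illustration purposes only; the library's Standardizer may differ in its details.

```python
import pandas as pd

# Illustrative sketch: per-era rank transform to [0, 1]. Not the library code.
def uniform_standardize(dataf: pd.DataFrame, cols: list, era_col: str = "era") -> pd.DataFrame:
    dataf = dataf.copy()
    for col in cols:
        # rank(pct=True) yields uniformly spaced values in (0, 1] within each era.
        dataf[col] = dataf.groupby(era_col)[col].rank(pct=True)
    return dataf

df = pd.DataFrame({
    "era": ["era1"] * 4,
    "prediction": [10.0, -3.0, 7.0, 0.5],  # raw scores on an arbitrary scale
})
out = uniform_standardize(df, cols=["prediction"])
```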


source

Standardizer

 Standardizer (cols:list=None)

Uniform standardization of prediction columns. All columns should contain only values in the range [0…1].

:param cols: All prediction columns that should be standardized. Use all prediction columns by default.

# Random DataFrame
test_features = [f"prediction_{l}" for l in "ABCDE"]
df = pd.DataFrame(np.random.uniform(size=(100, 5)), columns=test_features)
df["target"] = np.random.normal(size=100)
df["date"] = [0, 1, 2, 3] * 25
test_dataf = NumerFrame(df)
std = Standardizer()
std.transform(test_dataf).get_prediction_data.head(2)
✅ Finished step Standardizer. Output shape=(100, 7). Time taken for step: 0:00:00.004771. ✅
prediction_A prediction_B prediction_C prediction_D prediction_E
0 0.12 0.40 0.04 0.72 0.04
1 0.92 0.48 0.96 0.92 0.72

1.0.2. Ensembling

Multiple prediction columns can be ensembled in several ways. We provide the most common use cases here.

1.0.2.1. Simple Mean


source

MeanEnsembler

 MeanEnsembler (final_col_name:str, cols:list=None,
                standardize:bool=False)

Take simple mean of multiple cols and store in new col.

:param final_col_name: Name of new averaged column. final_col_name should start with “prediction”.

:param cols: Column names to average.

:param standardize: Whether to standardize by era before averaging. Highly recommended as columns that are averaged may have different distributions.
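To see why standardizing before averaging matters, consider two prediction columns on very different scales: a raw mean would be dominated by the larger-scaled column. The sketch below (illustrative, not the library code) ranks per era before averaging, which puts both columns on equal footing.

```python
import pandas as pd

# Two prediction columns on wildly different scales.
df = pd.DataFrame({
    "era": ["era1"] * 4,
    "prediction_a": [0.1, 0.4, 0.2, 0.3],        # already in [0, 1]
    "prediction_b": [900.0, 100.0, 300.0, 50.0],  # raw scores on another scale
})
cols = ["prediction_a", "prediction_b"]
# Rank per era so both columns land in (0, 1], then take the simple mean.
ranked = df.groupby("era")[cols].rank(pct=True)
df["prediction_mean"] = ranked.mean(axis=1)
```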

1.0.2.2. Donate’s formula

This weighted-averaging method is mostly suitable when you have multiple models trained under a time series cross-validation scheme. The first models are trained on less data, so we give them a lower weight than the later models.

Source: Yirun Zhang in his winning solution for the Jane Street 2021 Kaggle competition. Based on a paper by Donate et al.


source

DonateWeightedEnsembler

 DonateWeightedEnsembler (final_col_name:str, cols:list=None)

Weighted average as per Donate et al.’s formula Paper Link: https://doi.org/10.1016/j.neucom.2012.02.053 Code source: https://www.kaggle.com/gogo827jz/jane-street-supervised-autoencoder-mlp

Weightings for 5 folds: [0.0625, 0.0625, 0.125, 0.25, 0.5]

:param cols: Prediction columns to ensemble. Uses all prediction columns by default.

:param final_col_name: New column name for ensembled values.
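The fold weights follow a simple halving pattern: each successive fold gets double the weight of the previous one, and the weights sum to 1. A sketch of the formula (the function name is ours, not the library's):

```python
# Donate et al. weighting: fold i (1-indexed, i >= 2) gets 1 / 2**(n - i + 1),
# and the first fold gets 1 / 2**(n - 1) so that the weights sum to 1.
def donate_weights(n_folds: int) -> list:
    weights = [1 / 2 ** (n_folds - 1)]
    weights += [1 / 2 ** (n_folds - i + 1) for i in range(2, n_folds + 1)]
    return weights
```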

#| include: false
# Random DataFrame
test_features = [f"prediction_{l}" for l in "ABCDE"]
df = pd.DataFrame(np.random.uniform(size=(100, 5)), columns=test_features)
df["target"] = np.random.normal(size=100)
df["era"] = range(100)
test_dataf = NumerFrame(df)

For 5 folds, the weightings are [0.0625, 0.0625, 0.125, 0.25, 0.5].

w_5_fold = [0.0625, 0.0625, 0.125, 0.25, 0.5]
donate = DonateWeightedEnsembler(
    cols=test_dataf.prediction_cols, final_col_name="prediction"
)
ensembled = donate(test_dataf).get_prediction_data
# Use np.isclose rather than exact float equality for the weighted sum.
assert np.isclose(
    ensembled["prediction"][0],
    np.sum([w * elem for w, elem in zip(w_5_fold, ensembled[test_features].iloc[0])]),
)
ensembled.head(2)
🍲 Ensembled '['prediction_A', 'prediction_B', 'prediction_C', 'prediction_D', 'prediction_E']' with 
DonateWeightedEnsembler and saved in 'prediction' 🍲
✅ Finished step DonateWeightedEnsembler. Output shape=(100, 8). Time taken for step: 0:00:00.004307. ✅
prediction_A prediction_B prediction_C prediction_D prediction_E prediction
0 0.809680 0.821740 0.673158 0.130708 0.946340 0.691955
1 0.665325 0.402088 0.454365 0.820944 0.091936 0.374713

1.0.2.3. Geometric Mean

Take the geometric mean of multiple prediction columns, i.e. the n-th root of the product of the values.

More info on the geometric mean:
- Wikipedia
- Investopedia
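The geometric mean can also be computed directly as the exponential of the mean of logs, which is numerically more stable than multiplying many small values. A short numpy sketch (illustrative column names, not the library implementation):

```python
import numpy as np
import pandas as pd

# Geometric mean = exp(mean(log(x))): equivalent to the n-th root of the product.
df = pd.DataFrame({
    "prediction_a": [0.2, 0.5],
    "prediction_b": [0.8, 0.5],
})
geo = np.exp(np.log(df).mean(axis=1))
```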


source

GeometricMeanEnsembler

 GeometricMeanEnsembler (final_col_name:str, cols:list=None)

Calculate the weighted Geometric mean.

:param cols: Prediction columns to ensemble. Uses all prediction columns by default.

:param final_col_name: New column name for ensembled values.

geo_mean = GeometricMeanEnsembler(final_col_name="prediction_geo")
ensembled = geo_mean(test_dataf).get_prediction_data
ensembled.head(2)
🍲 Ensembled '['prediction_A', 'prediction_B', 'prediction_C', 'prediction_D', 'prediction_E']' with 
GeometricMeanEnsembler and saved in 'prediction_geo' 🍲
✅ Finished step GeometricMeanEnsembler. Output shape=(100, 9). Time taken for step: 0:00:00.031692. ✅
prediction_A prediction_B prediction_C prediction_D prediction_E prediction prediction_geo
0 0.809680 0.821740 0.673158 0.130708 0.946340 0.691955 0.560664
1 0.665325 0.402088 0.454365 0.820944 0.091936 0.374713 0.391302

1.0.3. Neutralization and penalization

1.0.3.1. Feature Neutralization

Classic feature neutralization (subtracting linear model from scores).

New column name for neutralized values will be {pred_name}_neutralized_{PROPORTION}. pred_name should start with 'prediction'.

Optionally, you can run feature neutralization on the GPU using cupy by setting cuda=True. Make sure you have cupy installed with the correct CUDA Toolkit version. More information: docs.cupy.dev/en/stable/install.html

Detailed explanation of Feature Neutralization by Katsu1110
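Conceptually, classic neutralization fits a least-squares linear model of the scores on the features and subtracts a proportion of that linear component. A self-contained numpy sketch of the idea (not the library implementation; the rescaling step is a common convention, stated here as an assumption):

```python
import numpy as np

# Sketch of classic neutralization: remove `proportion` of the linear
# component of `scores` that is explained by `features`.
def neutralize(scores: np.ndarray, features: np.ndarray, proportion: float = 0.5) -> np.ndarray:
    # Least-squares fit of scores on features, then its fitted values.
    exposures = features @ np.linalg.lstsq(features, scores, rcond=None)[0]
    neutralized = scores - proportion * exposures
    # Rescale to unit standard deviation, as is commonly done afterwards.
    return neutralized / neutralized.std()

rng = np.random.default_rng(0)
features = rng.uniform(size=(100, 10))
scores = rng.uniform(size=100)
out = neutralize(scores, features, proportion=1.0)
```

With proportion=1.0 the result is fully orthogonal to the features; smaller proportions remove only part of the linear exposure.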


source

FeatureNeutralizer

 FeatureNeutralizer (feature_names:list=None, pred_name:str='prediction',
                     proportion:float=0.5, suffix:str=None, cuda=False)

Classic feature neutralization by subtracting linear model.

:param feature_names: List of column names to neutralize against. Uses all feature columns by default.

:param pred_name: Prediction column to neutralize.

:param proportion: Number in range [0…1] indicating how much to neutralize.

:param suffix: Optional suffix that is added to new column name.

:param cuda: Whether to run neutralization on the GPU.

Make sure you have CuPy installed when setting cuda to True.

Installation docs: docs.cupy.dev/en/stable/install.html

testv1_dataf = create_numerframe("test_assets/mini_numerai_version_2_data.parquet")
testv1_dataf.loc[:, "prediction"] = np.random.uniform(size=len(testv1_dataf))
testv1_dataf.head(2)
id era data_type feature_intelligence1 feature_intelligence2 feature_intelligence3 feature_intelligence4 feature_intelligence5 feature_intelligence6 feature_intelligence7 ... feature_wisdom39 feature_wisdom40 feature_wisdom41 feature_wisdom42 feature_wisdom43 feature_wisdom44 feature_wisdom45 feature_wisdom46 target prediction
0 n000315175b67977 era1 train 0.0 0.5 0.25 0.00 0.5 0.25 0.25 ... 1.0 0.75 0.5 0.75 0.50 1.0 0.50 0.75 0.50 0.612091
1 n0014af834a96cdd era1 train 0.0 0.0 0.00 0.25 0.5 0.00 0.00 ... 1.0 0.00 0.0 0.75 0.25 0.0 0.25 1.00 0.25 0.636917

2 rows × 315 columns

ft = FeatureNeutralizer(
    feature_names=test_dataf.feature_cols, pred_name="prediction", proportion=0.8
)
new_dataf = ft.transform(test_dataf)
🤖 Neutralized 'prediction' with proportion '0.8' 🤖
New neutralized column = 'prediction_neutralized_0.8'.
✅ Finished step FeatureNeutralizer. Output shape=(100, 10). Time taken for step: 0:00:00.455949. ✅
assert "prediction_neutralized_0.8" in new_dataf.prediction_cols
assert 0.0 in new_dataf.get_prediction_data["prediction_neutralized_0.8"]
assert 1.0 in new_dataf.get_prediction_data["prediction_neutralized_0.8"]

Generated columns and data can be easily retrieved from the NumerFrame.

new_dataf.prediction_cols
['prediction_A',
 'prediction_B',
 'prediction_C',
 'prediction_D',
 'prediction_E',
 'prediction',
 'prediction_geo',
 'prediction_neutralized_0.8']
new_dataf.get_prediction_data.head(3)
prediction_A prediction_B prediction_C prediction_D prediction_E prediction prediction_geo prediction_neutralized_0.8
0 0.809680 0.821740 0.673158 0.130708 0.946340 0.691955 0.560664 NaN
1 0.665325 0.402088 0.454365 0.820944 0.091936 0.374713 0.391302 NaN
2 0.404700 0.519101 0.104269 0.781825 0.263947 0.398201 0.339651 NaN

1.0.3.2. Feature Penalization


source

FeaturePenalizer

 FeaturePenalizer (max_exposure:float, feature_names:list=None,
                   pred_name:str='prediction', suffix:str=None)

Feature penalization with TensorFlow.

Source (by jrb): https://github.com/jonrtaylor/twitch/blob/master/FE_Clipping_Script.ipynb

Source of first PyTorch implementation (by Michael Oliver / mdo): https://forum.numer.ai/t/model-diagnostics-feature-exposure/899/12

:param feature_names: List of column names to reduce feature exposure. Uses all feature columns by default.

:param pred_name: Prediction column to neutralize.

:param max_exposure: Number in range [0…1] indicating how much to reduce max feature exposure to.
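Feature exposure here means the correlation between a prediction column and each individual feature; max_exposure bounds the largest of these. The sketch below simply measures that quantity (names and data are illustrative, and this is not the penalization algorithm itself):

```python
import numpy as np
import pandas as pd

# Measure the largest absolute correlation between the prediction and any feature.
def max_feature_exposure(dataf: pd.DataFrame, feature_cols: list, pred_col: str) -> float:
    corrs = dataf[feature_cols].corrwith(dataf[pred_col])
    return corrs.abs().max()

rng = np.random.default_rng(42)
feature_cols = [f"feature_{i}" for i in range(5)]
df = pd.DataFrame(rng.uniform(size=(200, 5)), columns=feature_cols)
# A prediction that leans heavily on feature_0 will show high exposure to it.
df["prediction"] = 0.7 * df["feature_0"] + 0.3 * rng.uniform(size=200)
exposure = max_feature_exposure(df, feature_cols, "prediction")
```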

1.1. Numerai Classic

Postprocessing steps that are specific to Numerai Classic.

# 1.1.
# No Numerai Classic specific postprocessors implemented yet.

1.2. Numerai Signals

Postprocessors that are specific to Numerai Signals.

# 1.2.
# No Numerai Signals specific postprocessors implemented yet.

2. Custom PostProcessors

As with preprocessors, there is an almost unlimited number of ways to postprocess data. We (once again) invite the Numerai community to develop Numerai Classic and Signals postprocessors.

A new Postprocessor should inherit from BasePostProcessor and implement a transform method. The transform method should take a NumerFrame as input and return a NumerFrame object as output. A template for this is given below.

To enable fancy logging output, add the @display_processor_info decorator to the transform method.


source

AwesomePostProcessor

 AwesomePostProcessor (final_col_name:str, *args, **kwargs)

TEMPLATE - Do some awesome postprocessing.

:param final_col_name: Column name to store manipulated or ensembled predictions in.
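To make the template concrete, below is a hedged, self-contained example of a custom postprocessing step. It uses plain pandas so it runs on its own; in real use you would inherit from BasePostProcessor, take and return a NumerFrame, and optionally add the @display_processor_info decorator to transform. The class and parameter names here are illustrative.

```python
import pandas as pd

# Example custom step: clip a prediction column into [low, high] and store
# the result under a new prediction-prefixed column name.
class ClipPostProcessor:
    def __init__(self, final_col_name: str, pred_name: str = "prediction",
                 low: float = 0.01, high: float = 0.99):
        # Follow the documented convention for prediction column names.
        assert final_col_name.startswith("prediction")
        self.final_col_name = final_col_name
        self.pred_name = pred_name
        self.low, self.high = low, high

    def transform(self, dataf: pd.DataFrame) -> pd.DataFrame:
        dataf[self.final_col_name] = dataf[self.pred_name].clip(self.low, self.high)
        return dataf

df = pd.DataFrame({"prediction": [0.0, 0.5, 1.0]})
out = ClipPostProcessor("prediction_clipped").transform(df)
```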