Full API Reference
This section provides a detailed reference to all objects defined in NumerBlox.
BaseDownloader
Bases: BaseIO
Abstract base class for downloaders.
:param directory_path: Base folder to download files to.
Source code in numerblox/download.py
__call__(*args, **kwargs)
The most common use case will be to get weekly inference data. So calling the class itself returns inference data.
Source code in numerblox/download.py
download_live_data(*args, **kwargs)
abstractmethod
Download minimal amount of files needed for weekly inference.
Source code in numerblox/download.py
download_training_data(*args, **kwargs)
abstractmethod
Download all necessary files needed for training.
Source code in numerblox/download.py
BaseIO
Bases: ABC
Basic functionality for IO (downloading and uploading).
:param directory_path: Base folder for IO. Will be created if it does not exist.
Source code in numerblox/download.py
get_all_files: list
property
Return all paths of contents in directory.
is_empty: bool
property
Check if directory is empty.
download_directory_from_gcs(bucket_name, gcs_path)
Copy full directory from GCS bucket to local environment. :param gcs_path: Name of directory on GCS bucket.
Source code in numerblox/download.py
download_file_from_gcs(bucket_name, gcs_path)
Get file from GCS bucket and download to local directory. :param gcs_path: Path to file on GCS bucket.
Source code in numerblox/download.py
remove_base_directory()
Remove directory with all contents.
Source code in numerblox/download.py
upload_directory_to_gcs(bucket_name, gcs_path)
Upload full base directory to GCS bucket. :param gcs_path: Name of directory on GCS bucket.
Source code in numerblox/download.py
upload_file_to_gcs(bucket_name, gcs_path, local_path)
Upload file to some GCS bucket. :param gcs_path: Path to file on GCS bucket.
Source code in numerblox/download.py
EODDownloader
Bases: BaseDownloader
Download data from EOD historical data.
More info: https://eodhistoricaldata.com/
Make sure you have the underlying Python package installed: pip install eod.
:param directory_path: Base folder to download files to.
:param key: Valid EOD client key.
:param tickers: List of valid EOD tickers (Bloomberg ticker format).
:param frequency: Choose from [d, w, m].
Daily data by default.
Source code in numerblox/download.py
download_live_data()
Download one year of data for defined tickers.
Source code in numerblox/download.py
download_training_data(start=None)
Download the full date range available. start: Starting date in %Y-%m-%d format.
Source code in numerblox/download.py
generate_full_dataf(start)
Collect all price data for a list of EOD ticker symbols (Bloomberg tickers). start: Starting date in %Y-%m-%d format.
Source code in numerblox/download.py
generate_stock_dataf(ticker, start)
Generate price DataFrame for a single ticker. ticker: EOD ticker symbol (Bloomberg ticker). For example, Apple stock = AAPL.US. start: Starting date in %Y-%m-%d format.
Source code in numerblox/download.py
get_numerframe_data(start)
Get NumerFrame data from some starting date. start: Starting date in %Y-%m-%d format.
Source code in numerblox/download.py
KaggleDownloader
Bases: BaseDownloader
Download financial data from Kaggle.
For authentication, make sure you have a directory called .kaggle in your home directory containing a kaggle.json file. kaggle.json should have the following structure:
{"username": USERNAME, "key": KAGGLE_API_KEY}
More info on authentication: github.com/Kaggle/kaggle-api#api-credentials
More info on the Kaggle Python API: kaggle.com/donkeys/kaggle-python-api
:param directory_path: Base folder to download files to.
Source code in numerblox/download.py
download_live_data(kaggle_dataset_path)
Download arbitrary Kaggle dataset. :param kaggle_dataset_path: Path on Kaggle (URL slug on kaggle.com/)
Source code in numerblox/download.py
download_training_data(kaggle_dataset_path)
Download arbitrary Kaggle dataset. :param kaggle_dataset_path: Path on Kaggle (URL slug on kaggle.com/)
Source code in numerblox/download.py
NumeraiClassicDownloader
Bases: BaseDownloader
WARNING: Versions 1-3 (legacy data) are deprecated. Only version 4+ is supported.
Downloading from NumerAPI for Numerai Classic data.
:param directory_path: Base folder to download files to.
All kwargs will be passed to NumerAPI initialization.
Source code in numerblox/download.py
download_example_data(subfolder='', version='4.3', round_num=None)
Download all example prediction data in specified folder for given version.
:param subfolder: Specify folder to create folder within base directory root. Saves in base directory root by default.
:param version: Numerai dataset version. '4' = April 2022 dataset, '4.1' = Sunshine dataset, '4.2' = Rain dataset, '4.3' (default) = Midnight dataset.
:param round_num: Numerai tournament round number. Downloads latest round by default.
Source code in numerblox/download.py
download_live_data(subfolder='', version='4.3', round_num=None)
Download all live data in specified folder for given version (i.e. minimal data needed for inference).
:param subfolder: Specify folder to create folder within directory root. Saves in directory root by default.
:param version: Numerai dataset version. '4' = April 2022 dataset, '4.1' = Sunshine dataset, '4.2' = Rain dataset, '4.3' (default) = Midnight dataset.
:param round_num: Numerai tournament round number. Downloads latest round by default.
Source code in numerblox/download.py
download_meta_model_preds(subfolder='', filename='v4.3/meta_model.parquet')
Download Meta model predictions through NumerAPI. :param subfolder: Specify folder to create folder within base directory root. Saves in base directory root by default. :param filename: name for meta model predictions file. :return: Meta model predictions as DataFrame.
Source code in numerblox/download.py
download_single_dataset(filename, dest_path, round_num=None)
Download one of the available datasets through NumerAPI.
:param filename: Name as listed in NumerAPI (Check NumerAPI().list_datasets() for full overview) :param dest_path: Full path where file will be saved. :param round_num: Numerai tournament round number. Downloads latest round by default.
Source code in numerblox/download.py
download_training_data(subfolder='', version='4.3')
Get Numerai Classic training and validation data.
:param subfolder: Specify folder to create folder within base directory root. Saves in base directory root by default.
:param version: Numerai dataset version. '4' = April 2022 dataset, '4.1' = Sunshine dataset, '4.2' = Rain dataset, '4.3' (default) = Midnight dataset.
Source code in numerblox/download.py
get_classic_features(subfolder='', filename='v4.3/features.json', *args, **kwargs)
Download feature overview (stats and feature sets) through NumerAPI and load as dict. :param subfolder: Specify folder to create folder within base directory root. Saves in base directory root by default. :param filename: Name for feature overview file. *args, **kwargs will be passed to the JSON loader. :return: Feature overview dict.
Source code in numerblox/download.py
NumeraiSignalsDownloader
Bases: BaseDownloader
Support for Numerai Signals v1 parquet data.
Downloading from SignalsAPI for Numerai Signals data.
:param directory_path: Base folder to download files to.
All kwargs will be passed to SignalsAPI initialization.
Source code in numerblox/download.py
download_example_data(subfolder='', version='1.0')
Download all example prediction data in specified folder for given version.
:param subfolder: Specify folder to create folder within base directory root. Saves in base directory root by default. :param version: Numerai dataset version. Currently only v1.0 is supported.
Source code in numerblox/download.py
download_live_data(subfolder='', version='1.0')
Download all live data in specified folder (i.e. minimal data needed for inference).
:param subfolder: Specify folder to create folder within directory root. Saves in directory root by default. :param version: Numerai dataset version. Currently only v1.0 is supported.
Source code in numerblox/download.py
download_single_dataset(filename, dest_path)
Download one of the available datasets through SignalsAPI.
:param filename: Name as listed in NumerAPI (Check NumerAPI().list_datasets() for full overview) :param dest_path: Full path where file will be saved.
Source code in numerblox/download.py
download_training_data(subfolder='', version='1.0')
Get Numerai Signals training and validation data. :param subfolder: Specify folder to create folder within base directory root. Saves in base directory root by default. :param version: Numerai Signals dataset version. Currently only v1.0 is supported.
Source code in numerblox/download.py
NumerFrame
Bases: DataFrame
Data structure which extends Pandas DataFrames and allows for additional Numerai specific functionality.
Source code in numerblox/numerframe.py
get_aux_data: NumerFrame
property
All columns that are not features, targets or predictions.
get_dates_from_era_col: pd.Series
property
Column of all dates from era column.
get_era_data: NumerFrame
property
Column of all eras.
get_eras_from_date_col: pd.Series
property
Column of all eras from date column.
get_feature_data: NumerFrame
property
All columns for which name starts with 'feature'.
get_fncv3_feature_data: NumerFrame
property
Get FNCv3 features.
get_medium_feature_data: NumerFrame
property
Medium subset of the Numerai dataset for v4.2 data.
get_prediction_aux_data: NumerFrame
property
All predictions columns and aux columns (for ensembling, etc.).
get_prediction_data: NumerFrame
property
All columns for which name starts with 'prediction'.
get_single_target_data: NumerFrame
property
Column with name 'target' (Main Numerai target column).
get_small_feature_data: NumerFrame
property
Small subset of the Numerai dataset for v4.2 data.
get_target_data: NumerFrame
property
All columns for which name starts with 'target'.
get_unique_eras: List[str]
property
Get all unique eras in the data.
get_v2_equivalent_feature_data: NumerFrame
property
Features equivalent to the deprecated v2 Numerai data. For v4.2 data.
get_v3_equivalent_feature_data: NumerFrame
property
Features equivalent to the deprecated v3 Numerai data. For v4.2 data.
__init_meta_attrs()
Dynamically track column groups.
Source code in numerblox/numerframe.py
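The dynamic tracking boils down to prefix matching on column names, as the properties above suggest. A simplified stdlib sketch (the helper is illustrative, not the actual implementation):

```python
def infer_column_groups(columns):
    # Columns starting with 'feature', 'target' or 'prediction' form their
    # own groups; everything else (e.g. 'era', 'id') is auxiliary data.
    groups = {"feature": [], "target": [], "prediction": [], "aux": []}
    for col in columns:
        for prefix in ("feature", "target", "prediction"):
            if col.startswith(prefix):
                groups[prefix].append(col)
                break
        else:
            groups["aux"].append(col)
    return groups
```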
__set_era_col()
Each NumerFrame should have an era column to benefit from all functionality.
Source code in numerblox/numerframe.py
get_column_selection(cols)
Return NumerFrame from selection of columns.
Source code in numerblox/numerframe.py
get_date_from_era(era)
staticmethod
Get the date from a specific era. :param era: Era number for which to get date. Should be an integer which is at least 1. :return: Datetime object representing the date of the given era.
Source code in numerblox/numerframe.py
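Since Numerai Classic eras are weekly, the two conversions are offsets from the era-1 date. A sketch under the assumption of a hypothetical epoch (the real constant lives in numerblox/numerframe.py):

```python
from datetime import datetime, timedelta

ERA_ONE = datetime(2003, 1, 3)  # assumed epoch, for illustration only

def date_from_era(era: int) -> datetime:
    # Era must be an integer that is at least 1.
    if era < 1:
        raise ValueError("era should be at least 1")
    return ERA_ONE + timedelta(weeks=era - 1)

def era_from_date(date_object: datetime) -> int:
    return (date_object - ERA_ONE).days // 7 + 1
```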
get_date_range(start_date, end_date)
Get all eras between two dates. :param start_date: Starting date (inclusive). :param end_date: Ending date (inclusive). :return: NumerFrame with all eras between start_date and end_date.
Source code in numerblox/numerframe.py
get_era_batch(eras, convert_to_tf=False, aemlp_batch=False, features=None, targets=None, *args, **kwargs)
Get feature target pair batch of 1 or multiple eras.
:param eras: Selection of era names that should be present in era_col.
:param convert_to_tf: Convert to tf.Tensor.
:param aemlp_batch: Specific target batch for autoencoder training. The y output will contain three components: features, targets and targets.
:param features: List of features to select. All by default
:param targets: List of targets to select. All by default.
*args, **kwargs are passed to initialization of the Tensor.
Source code in numerblox/numerframe.py
get_era_from_date(date_object)
staticmethod
Get the era number from a specific date. :param date_object: Pandas Timestamp object for which to get era. :return: Era number.
Source code in numerblox/numerframe.py
get_era_range(start_era, end_era)
Get all eras between two era numbers. :param start_era: Era number to start from (inclusive). :param end_era: Era number to end with (inclusive). :return: NumerFrame with all eras between start_era and end_era.
Source code in numerblox/numerframe.py
get_feature_group(group)
Get feature group based on name or list of names.
Source code in numerblox/numerframe.py
get_feature_target_pair(multi_target=False)
Get split of feature and target columns. :param multi_target: Returns only 'target' column by default. Returns all target columns when set to True.
Source code in numerblox/numerframe.py
get_last_n_eras(n)
Get data for the last n eras. Make sure eras are sorted in the way you prefer. :param n: Number of eras to select. :return: NumerFrame with last n eras.
Source code in numerblox/numerframe.py
get_pattern_data(pattern)
Get columns based on pattern (for example '_20' to get all 20-day Numerai targets). :param pattern: A 'like' pattern (pattern in column_name == True)
Source code in numerblox/numerframe.py
create_numerframe(file_path, columns=None, *args, **kwargs)
Convenience function to initialize a NumerFrame. Supports the most common file formats for Pandas DataFrames
(.csv, .parquet, .xls, .pkl, etc.). For more details check https://pandas.pydata.org/docs/reference/io.html
:param file_path: Relative or absolute path to data file.
:param columns: Which columns to read (All by default).
*args, **kwargs will be passed to the Pandas loading function.
Source code in numerblox/numerframe.py
BasePreProcessor
Bases: BaseEstimator, TransformerMixin
Common functionality for preprocessors and postprocessors.
Source code in numerblox/preprocessing/base.py
GroupStatsPreProcessor
Bases: BasePreProcessor
WARNING: Only supported for v4.2 (Rain) data. The Rain dataset (re)introduced feature groups.
Note that this class only works with pd.DataFrame input. When using it in a Pipeline, make sure the Pandas output API is set (.set_output(transform="pandas")).
Calculates group statistics for all data groups.
:param groups: Groups to create features for. All groups by default.
Source code in numerblox/preprocessing/classic.py
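A pure-Python sketch of the per-group statistics idea: mean and standard deviation over each feature group per row. The group layout and output names are assumptions, not the real v4.2 groups:

```python
from statistics import mean, stdev

def group_stats(row: dict, groups: dict) -> dict:
    """Compute mean/std features for each named group of columns."""
    out = {}
    for name, cols in groups.items():
        vals = [row[c] for c in cols]
        out[f"feature_{name}_mean"] = mean(vals)
        out[f"feature_{name}_std"] = stdev(vals)
    return out
```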
get_feature_names_out(input_features=None)
Return feature names.
Source code in numerblox/preprocessing/classic.py
transform(X)
Check validity and add group features.
Source code in numerblox/preprocessing/classic.py
DifferencePreProcessor
Bases: BasePreProcessor
Add difference features based on given windows. Run LagPreProcessor first.
Usage in a Pipeline works only with the Pandas output API. Run .set_output(transform="pandas") on your pipeline first.
:param windows: All lag windows to process for all features.
:param feature_names: All features for which you want to create differences. All features that also have lags by default.
:param pct_change: Method to calculate differences. If True, will calculate differences with a percentage change. Otherwise calculates a simple difference. Defaults to False
:param abs_diff: Whether to also calculate the absolute value of all differences. Defaults to True
Source code in numerblox/preprocessing/signals.py
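The core computation can be sketched as follows: for each window, subtract (or percentage-change against) the lagged value produced by LagPreProcessor. A simplified, list-based illustration:

```python
def difference(series, window, pct_change=False):
    # None where there is not enough history for the given lag window.
    out = []
    for i, value in enumerate(series):
        if i < window:
            out.append(None)
            continue
        prev = series[i - window]
        out.append((value - prev) / prev if pct_change else value - prev)
    return out
```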
transform(X)
Create difference feature from lag features. :param X: DataFrame with lag features. NOTE: Make sure only lag features are present in the DataFrame.
Source code in numerblox/preprocessing/signals.py
EraQuantileProcessor
Bases: BasePreProcessor
Transform features into quantiles by era. :param num_quantiles: Number of quantiles to use for quantile transformation. :param random_state: Random state for QuantileTransformer. :param cpu_cores: Number of CPU cores to use for parallel processing.
Source code in numerblox/preprocessing/signals.py
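Conceptually this is a rank transform computed separately within each era. A simplified sketch for one era (the real class uses scikit-learn's QuantileTransformer and parallelizes over eras):

```python
def era_rank_quantile(values):
    # Map one era's feature values to [0, 1] by rank.
    order = sorted(range(len(values)), key=values.__getitem__)
    quantiles = [0.0] * len(values)
    for rank, idx in enumerate(order):
        quantiles[idx] = rank / (len(values) - 1)
    return quantiles
```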
get_feature_names_out(input_features=None)
Return feature names.
Source code in numerblox/preprocessing/signals.py
transform(X, era_series=None)
Quantile all features by era. :param X: Array or DataFrame containing features to be quantiled. :param era_series: Series containing era information. :return: Quantiled features.
Source code in numerblox/preprocessing/signals.py
HLOCVAdjuster
Bases: BasePreProcessor
Adjust HLOCV data for splits and dividends based on ratio of unadjusted and adjusted close prices. NOTE: This step only works with DataFrame input. Usage in intermediate steps of a scikit-learn Pipeline works with the Pandas set_output API. i.e. pipeline.set_output(transform="pandas").
Source code in numerblox/preprocessing/signals.py
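The adjustment itself is a single ratio applied per row; a hedged sketch of that idea (column handling is simplified here — the real transformer operates on whole DataFrames):

```python
def adjust_row(high, low, open_, close, volume, adjusted_close):
    # Ratio of adjusted to unadjusted close; prices scale with it,
    # volume scales inversely.
    ratio = adjusted_close / close
    return {
        "open": open_ * ratio,
        "high": high * ratio,
        "low": low * ratio,
        "close": adjusted_close,
        "volume": volume / ratio,
    }
```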
transform(X)
Adjust open, high, low, close and volume for splits and dividends. :param X: DataFrame with columns: [high, low, open, close, volume] (HLOCV) :return: Array with adjusted HLOCV columns
Source code in numerblox/preprocessing/signals.py
KatsuFeatureGenerator
Bases: BasePreProcessor
Effective feature engineering setup based on Katsu's starter notebook. Based on source by Katsu1110: https://www.kaggle.com/code1110/numeraisignals-starter-for-beginners
:param windows: Time intervals to apply for the window features:
- Percentage rate of change
- Volatility
- Moving average gap
:param ticker_col: Column with tickers to iterate over.
:param close_col: Column name where you have closing price stored. :param num_cores: Number of cores to use for multiprocessing.
:param verbose: Print additional information.
Source code in numerblox/preprocessing/signals.py
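The three window features can be sketched in plain Python for a single ticker's closing prices (a simplified, single-window illustration, not the full generator):

```python
from statistics import mean, pstdev

def window_features(close, window):
    """Simplified versions of the three window features listed above."""
    roc = close[-1] / close[-1 - window] - 1          # percentage rate of change
    returns = [close[i] / close[i - 1] - 1 for i in range(1, len(close))]
    volatility = pstdev(returns[-window:])            # volatility of returns
    ma_gap = close[-1] / mean(close[-window:]) - 1    # moving-average gap
    return {"roc": roc, "volatility": volatility, "ma_gap": ma_gap}
```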
__ema1(series, span)
staticmethod
Exponential moving average
Source code in numerblox/preprocessing/signals.py
feature_engineering(dataf)
Feature engineering for single ticker.
Source code in numerblox/preprocessing/signals.py
get_feature_names_out(input_features=None)
Return feature names.
Source code in numerblox/preprocessing/signals.py
transform(dataf)
Multiprocessing feature engineering.
:param dataf: DataFrame with columns: [ticker, date, open, high, low, close, volume]
Source code in numerblox/preprocessing/signals.py
LagPreProcessor
Bases: BasePreProcessor
Add lag features based on given windows.
:param windows: All lag windows to process for all features.
[5, 10, 15, 20] by default (4 weeks lookback)
Source code in numerblox/preprocessing/signals.py
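The lagging itself is straightforward; a list-based sketch with the default windows (illustrative only):

```python
def add_lags(series, windows=(5, 10, 15, 20)):
    # For each window w, the value w steps back; None where history is too short.
    return {
        f"lag_{w}": [series[i - w] if i >= w else None for i in range(len(series))]
        for w in windows
    }
```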
get_feature_names_out(input_features=None)
Return feature names.
Source code in numerblox/preprocessing/signals.py
MinimumDataFilter
Bases: BasePreProcessor
Filter dates and tickers based on minimum data requirements. NOTE: This step only works with DataFrame input.
:param min_samples_date: Minimum number of samples per date. Defaults to 200. :param min_samples_ticker: Minimum number of samples per ticker. Defaults to 1200. :param blacklist_tickers: List of tickers to exclude from the dataset. Defaults to None. :param date_col: Column name for date. Defaults to "date". :param ticker_col: Column name for ticker. Defaults to "bloomberg_ticker".
Source code in numerblox/preprocessing/signals.py
transform(X)
Filter dates and tickers based on minimum data requirements. :param X: DataFrame with columns: [ticker_col, date_col, open, high, low, close, volume] (HLOCV) :return: Array with filtered DataFrame
Source code in numerblox/preprocessing/signals.py
PandasTaFeatureGenerator
Bases: BasePreProcessor
Generate features with pandas-ta.
https://github.com/twopirllc/pandas-ta
Usage in a Pipeline works only with the Pandas output API. Run .set_output(transform="pandas") on your pipeline first.
:param strategy: Valid Pandas Ta strategy.
For more information on creating a strategy, see:
https://github.com/twopirllc/pandas-ta#pandas-ta-strategy
By default, a strategy with RSI(14) and RSI(60) is used.
:param ticker_col: Column name for grouping by tickers.
:param num_cores: Number of cores to use for multiprocessing.
By default, all available cores are used.
Source code in numerblox/preprocessing/signals.py
add_features(ticker_df)
The TA strategy is applied to the DataFrame here. :param ticker_df: DataFrame for a single ticker. :return: DataFrame with features added.
Source code in numerblox/preprocessing/signals.py
transform(X)
Main feature generation method.
:param X: DataFrame with columns: [ticker, date, open, high, low, close, volume]
:return: PandasTA features
Source code in numerblox/preprocessing/signals.py
ReduceMemoryProcessor
Bases: BasePreProcessor
Reduce memory usage as much as possible.
Credits to kainsama and others for writing about memory usage reduction for Numerai data: https://forum.numer.ai/t/reducing-memory/313
:param deep_mem_inspect: Introspect the data deeply by interrogating object dtypes. Yields a more accurate representation of memory usage if you have complex object columns. :param verbose: Print memory usage before and after optimization.
Source code in numerblox/preprocessing/signals.py
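The main trick behind memory reduction is downcasting numeric columns to the narrowest dtype that still fits the data. A sketch of the dtype choice for integers (bounds are the standard NumPy ones; the helper itself is illustrative):

```python
def smallest_int_dtype(values):
    """Pick the narrowest signed integer dtype that can hold all values."""
    lo, hi = min(values), max(values)
    for dtype, (dmin, dmax) in [
        ("int8", (-2**7, 2**7 - 1)),
        ("int16", (-2**15, 2**15 - 1)),
        ("int32", (-2**31, 2**31 - 1)),
    ]:
        if dmin <= lo and hi <= dmax:
            return dtype
    return "int64"
```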
get_feature_names_out(input_features=None)
Return feature names.
Source code in numerblox/preprocessing/signals.py
TickerMapper
Bases: BasePreProcessor
Map ticker from one format to another.
:param ticker_col: Column used for mapping. Must already be present in the input data.
:param target_ticker_format: Format to map tickers to. Must be present in the ticker map.
For default mapper supported ticker formats are: ['ticker', 'bloomberg_ticker', 'yahoo']
:param mapper_path: Path to CSV file containing at least ticker_col and target_ticker_format columns.
Can be either a web link or a local path. Numerai Signals mapping by default.
Source code in numerblox/preprocessing/signals.py
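Mapping reduces to a lookup built from the mapper CSV. A self-contained sketch with a tiny hypothetical mapping table (the real default map is the Numerai Signals one):

```python
import csv
import io

# Tiny hypothetical mapping table for illustration.
MAPPER_CSV = "ticker,bloomberg_ticker,yahoo\nAAPL,AAPL US,AAPL\nMSFT,MSFT US,MSFT\n"

def map_tickers(tickers, ticker_col, target_ticker_format, mapper_csv=MAPPER_CSV):
    rows = csv.DictReader(io.StringIO(mapper_csv))
    lookup = {row[ticker_col]: row[target_ticker_format] for row in rows}
    # Unknown tickers map to None rather than raising.
    return [lookup.get(t) for t in tickers]
```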
transform(X)
Transform ticker column. :param X: Ticker column :return tickers: Mapped tickers
Source code in numerblox/preprocessing/signals.py
BaseTargetProcessor
Bases: BaseEstimator, TransformerMixin
Common functionality for preprocessors and postprocessors.
Source code in numerblox/targets.py
BayesianGMMTargetProcessor
Bases: BaseTargetProcessor
Generate synthetic (fake) target using a Bayesian Gaussian Mixture model.
Based on Michael Oliver's GitHub Gist implementation:
https://gist.github.com/the-moliver/dcdd2862dc2c78dda600f1b449071c93
:param n_components: Number of components for fitting Bayesian Gaussian Mixture Model.
Source code in numerblox/targets.py
fit(X, y, era_series)
Fit Bayesian Gaussian Mixture model on coefficients and normalize. :param X: DataFrame containing features. :param y: Series containing real target. :param era_series: Series containing era information.
Source code in numerblox/targets.py
get_feature_names_out(input_features=None)
Return feature names.
Source code in numerblox/targets.py
transform(X, era_series)
Main method for generating fake target. :param X: DataFrame containing features. :param era_series: Series containing era information.
Source code in numerblox/targets.py
SignalsTargetProcessor
Bases: BaseTargetProcessor
Engineer targets for Numerai Signals.
More information on how Numerai Signals targets are implemented:
https://forum.numer.ai/t/decoding-the-signals-target/2501
:param price_col: Column from which target will be derived.
:param windows: Timeframes to use for engineering targets. 10 and 20-day by default.
:param bins: Binning used to create group targets. Nomi binning by default.
:param labels: Scaling for binned target. Must be same length as resulting bins (bins-1). Numerai labels by default.
Source code in numerblox/targets.py
get_feature_names_out(input_features=None)
Return feature names of Signals targets.
Source code in numerblox/targets.py
CrossValEstimator
Bases: BaseEstimator, TransformerMixin
Split your data into multiple folds and fit an estimator on each fold.
For transforms, predictions are concatenated into a 2D array.
:param cv: Cross validation object that follows scikit-learn conventions.
:param estimator: Estimator to fit on each fold.
:param evaluation_func: Custom evaluation logic that is executed on validation data for each fold. Must accept y_true and y_pred as input.
For example, evaluation_func can handle logging metrics for each fold.
Anything that evaluation_func returns is stored in self.eval_results_.
:param predict_func: Name of the function that will be used for prediction.
Must be one of 'predict', 'predict_proba', 'predict_log_proba'.
For example, XGBRegressor has 'predict' and 'predict_proba' functions.
:param verbose: Whether to print progress.
Source code in numerblox/meta.py
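The fold mechanics described above can be sketched in plain Python (an illustrative sketch, not the NumerBlox implementation; `MeanEstimator` and `kfold_indices` are toy stand-ins for a real model and a scikit-learn CV object):

```python
# Sketch of the CrossValEstimator idea: fit one estimator per fold,
# then stack each fitted estimator's predictions into a 2D array on transform.
class MeanEstimator:
    """Toy stand-in for a real model: predicts the mean of its training targets."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]

def kfold_indices(n, n_splits):
    """Yield (train_idx, val_idx) pairs, mimicking a scikit-learn CV object."""
    fold_size = n // n_splits
    for k in range(n_splits):
        val = list(range(k * fold_size, (k + 1) * fold_size))
        train = [i for i in range(n) if i not in val]
        yield train, val

def fit_per_fold(X, y, n_splits=2):
    """Fit a fresh estimator on each fold's training split."""
    estimators = []
    for train, _ in kfold_indices(len(X), n_splits):
        est = MeanEstimator().fit([X[i] for i in train], [y[i] for i in train])
        estimators.append(est)
    return estimators

def transform_concat(estimators, X):
    """Concatenate per-fold predictions column-wise into a 2D array (list of rows)."""
    cols = [est.predict(X) for est in estimators]
    return [list(row) for row in zip(*cols)]
```

Each row of the transform output has one column per fitted fold model, which is the shape downstream ensembling steps consume.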
__sklearn_is_fitted__()
Check fitted status.
Source code in numerblox/meta.py
fit(X, y, **kwargs)
Use cross validation object to fit estimators.
Source code in numerblox/meta.py
transform(X, model_idxs=None, **kwargs)
Use cross validation object to transform estimators.
:param X: Input data for inference.
:param y: Target data for inference.
:param model_idxs: List of indices of models to use for inference. By default, all fitted models are used.
:param kwargs: Additional arguments to pass to the estimator's predict function.
Source code in numerblox/meta.py
MetaEstimator
Bases: BaseEstimator, TransformerMixin, MetaEstimatorMixin
Helper for NumeraiPipeline and NumeraiFeatureUnion to use a model as a transformer.
:param estimator: Underlying estimator like XGBoost, Catboost, scikit-learn, etc.
:param predict_func: Name of the function that will be used for prediction. Must be one of 'predict', 'predict_proba', 'predict_log_proba'. For example, XGBRegressor has 'predict' and 'predict_proba' functions.
:param model_type: "regressor" or "classifier". Used to determine if the estimator is multi output.
Source code in numerblox/meta.py
fit(X, y, **kwargs)
Fit underlying estimator and set attributes.
Source code in numerblox/meta.py
predict(X, **kwargs)
Used when a MetaEstimator happens to be the last step in the pipeline. Has the same behavior as transform.
Source code in numerblox/meta.py
transform(X, **kwargs)
Apply the predict_func on the fitted estimator.
Output shape is (X.shape[0],) if the estimator is not multi output, else (X.shape[0], y.shape[1]).
All additional kwargs are passed to the underlying estimator's predict function.
Source code in numerblox/meta.py
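The wrapping trick itself is small; a hedged sketch of the idea (not the NumerBlox code, and `PredictTransformer` is an illustrative name, using a toy estimator instead of XGBoost):

```python
class PredictTransformer:
    """Sketch of the MetaEstimator idea: expose a model's predictions as
    transform output so the model can sit mid-pipeline. predict_func selects
    which prediction method to call by name."""
    def __init__(self, estimator, predict_func="predict"):
        self.estimator = estimator
        self.predict_func = predict_func

    def fit(self, X, y):
        self.estimator.fit(X, y)
        return self

    def transform(self, X):
        # Look up e.g. 'predict', 'predict_proba' or 'predict_log_proba' by name.
        return getattr(self.estimator, self.predict_func)(X)

    # If this happens to be the last pipeline step, predict mirrors transform.
    predict = transform
```

Because transform just delegates to the chosen prediction method, any later pipeline step receives model predictions as its input features.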
MetaPipeline
Bases: Pipeline
Pipeline which turns all estimators into transformers by wrapping them in MetaEstimator. This allows pipeline steps after models. For example, a FeatureNeutralizer after an XGBRegressor.
:param steps: List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an instance of BaseNeutralizer.
:param memory: Used to cache the fitted transformers of the pipeline.
:param verbose: If True, the time elapsed while fitting each step will be printed as it is completed.
:param predict_func: Name of the function that will be used for prediction.
Source code in numerblox/meta.py
wrap_estimators_as_transformers(steps)
Converts all estimator steps (except the last step) into transformers by wrapping them in MetaEstimator.
:param steps: List of (name, transform) tuples specifying the pipeline steps.
:return: Modified steps with all estimators wrapped as transformers.
Source code in numerblox/meta.py
make_meta_pipeline(*steps, memory=None, verbose=False)
Convenience function for creating a MetaPipeline.
:param steps: List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an instance of BaseNeutralizer.
:param memory: Used to cache the fitted transformers of the pipeline.
:param verbose: If True, the time elapsed while fitting each step will be printed as it is completed.
Source code in numerblox/meta.py
BaseNeutralizer
Bases: BaseEstimator, TransformerMixin
Base class for neutralization so it is compatible with scikit-learn.
:param new_col_name: Name of new neutralized column.
Source code in numerblox/neutralizers.py
fit_transform(X, features, era_series=None)
Convenience function for scikit-learn compatibility. Needed because fit and transform accept different arguments here.
Source code in numerblox/neutralizers.py
get_feature_names_out(input_features=None)
Get feature names for neutralized output.
:param input_features: Optional list of input feature names.
:return: List of feature names for neutralized output.
Source code in numerblox/neutralizers.py
predict(X, features, era_series=None)
Convenience function for scikit-learn compatibility.
Source code in numerblox/neutralizers.py
FeatureNeutralizer
Bases: BaseNeutralizer
Classic feature neutralization by subtracting a linear model.
:param pred_name: Name of prediction column. For creating the new column name.
:param proportion: Number in range [0...1] indicating how much to neutralize.
:param suffix: Optional suffix that is added to new column name.
:param num_cores: Number of cores to use for parallel processing. By default, all CPU cores are used.
Source code in numerblox/neutralizers.py
neutralize(dataf, columns, by, proportion)
Neutralize on CPU.
:param dataf: DataFrame with features and predictions.
:param columns: List of prediction column names.
:param by: List of feature column names.
:param proportion: Proportion to neutralize.
:return: Neutralized predictions.
Source code in numerblox/neutralizers.py
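For a single feature, the neutralization above reduces to subtracting a scaled least-squares projection. A pure-Python sketch (illustrative; the real method operates on DataFrames with many feature columns):

```python
def neutralize_single(preds, feature, proportion=1.0):
    """Sketch of classic feature neutralization for one feature column:
    subtract proportion * (linear projection of preds onto feature)."""
    # Least-squares slope of preds regressed on feature (no intercept).
    beta = sum(f * p for f, p in zip(feature, preds)) / sum(f * f for f in feature)
    return [p - proportion * beta * f for p, f in zip(preds, feature)]
```

With proportion=1.0 the result is exactly orthogonal to the feature; intermediate proportions only partially remove the exposure.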
normalize(dataf)
staticmethod
Normalize predictions.
1. Rank predictions.
2. Normalize ranks.
3. Gaussianize ranks.
:param dataf: DataFrame with predictions.
:return: Gaussianized rank predictions.
Source code in numerblox/neutralizers.py
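The three steps map onto a few lines of standard-library Python. A hedged sketch (the exact ranking offset and tie handling NumerBlox uses may differ):

```python
from statistics import NormalDist

def gaussianize(preds):
    """1. Rank predictions, 2. normalize ranks into (0, 1),
    3. map through the inverse normal CDF so the output is
    standard-normal shaped."""
    order = sorted(range(len(preds)), key=lambda i: preds[i])
    ranks = [0.0] * len(preds)
    for rank, i in enumerate(order):
        ranks[i] = (rank + 0.5) / len(preds)  # keep ranks strictly inside (0, 1)
    return [NormalDist().inv_cdf(r) for r in ranks]
```

The half-step offset keeps ranks away from 0 and 1, where the inverse CDF diverges.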
normalize_and_neutralize(dataf, columns, by, proportion)
Gaussianize predictions and neutralize with one combination of prediction and proportion.
:param dataf: DataFrame with features and predictions.
:param columns: List of prediction column names.
:param by: List of feature column names.
:param proportion: Proportion to neutralize.
:return: Neutralized predictions DataFrame.
Source code in numerblox/neutralizers.py
transform(X, features, era_series=None)
Main transform function.
:param X: Input predictions to neutralize.
:param features: DataFrame with features for neutralization.
:param era_series: Series with era labels for each row in features. Features, era_series and the prediction column must all have the same length.
:return: Neutralized predictions NumPy array.
Source code in numerblox/neutralizers.py
BasePenalizer
Bases: BaseEstimator, TransformerMixin
Base class for penalization so it is compatible with scikit-learn.
:param new_col_name: Name of new neutralized column.
Source code in numerblox/penalizers.py
fit_transform(X, features, era_series)
Convenience function for scikit-learn compatibility. Needed because fit and transform accept different arguments here.
Source code in numerblox/penalizers.py
get_feature_names_out(input_features=None)
Get feature names for neutralized output.
:param input_features: Optional list of input feature names.
:return: List of feature names for neutralized output.
Source code in numerblox/penalizers.py
predict(X, features, era_series)
Convenience function for scikit-learn compatibility.
Source code in numerblox/penalizers.py
FeaturePenalizer
Bases: BasePenalizer
Feature penalization with TensorFlow.
Source (by jrb): https://github.com/jonrtaylor/twitch/blob/master/FE_Clipping_Script.ipynb
Source of first PyTorch implementation (by Michael Oliver / mdo): https://forum.numer.ai/t/model-diagnostics-feature-exposure/899/12
:param max_exposure: Number in range [0...1] indicating how much to reduce max feature exposure to.
:param pred_name: Prediction column name. Used for new column name.
:param suffix: Optional suffix that is added to new column name.
Source code in numerblox/penalizers.py
transform(X, features, era_series)
Main transform method.
:param X: Input predictions to neutralize.
:param features: DataFrame with features for neutralization.
:param era_series: Series with era labels for each row in features. Features, eras and the prediction column must all have the same length.
:return: Penalized predictions.
Source code in numerblox/penalizers.py
BasePredictionLoader
Bases: BaseEstimator, TransformerMixin
Shared functionality for all Prediction Loaders.
Source code in numerblox/prediction_loaders.py
get_feature_names_out(input_features=None)
abstractmethod
Return feature names.
Source code in numerblox/prediction_loaders.py
transform(X=None, y=None)
abstractmethod
Return Predictions generated by model.
Source code in numerblox/prediction_loaders.py
ExamplePredictions
Bases: BasePredictionLoader
Load example predictions.
:param file_name: File to download from NumerAPI. By default this is example predictions for v4.2 data: 'v4.2/live_example_preds.parquet'. Example predictions in previous versions:
- v4.2 validation examples -> "v4.2/validation_example_preds.parquet"
- v4.2 live benchmark models -> "v4.2/live_benchmark_models.parquet"
- v4.2 validation benchmark models -> "v4.2/validation_benchmark_models.parquet"
:param round_num: Optional round number. Downloads most recent round by default.
:param keep_files: Whether to keep downloaded files. By default, files are deleted after the predictions are loaded.
Source code in numerblox/prediction_loaders.py
transform(X=None, y=None)
Return example predictions.
Source code in numerblox/prediction_loaders.py
BaseEvaluator
Evaluation functionality that is relevant for both Numerai Classic and Numerai Signals.
Metrics include:
- Mean, Standard Deviation and Sharpe (Corrv2) for era returns.
- Max drawdown.
- Annual Percentage Yield (APY).
- Correlation with benchmark predictions.
- Max feature exposure: https://forum.numer.ai/t/model-diagnostics-feature-exposure/899.
- Feature Neutral Mean, Standard Deviation and Sharpe: https://docs.numer.ai/tournament/feature-neutral-correlation.
- Smart Sharpe.
- Exposure Dissimilarity: https://forum.numer.ai/t/true-contribution-details/5128/4.
- Autocorrelation (1st order).
- Calmar Ratio.
- Performance vs. Benchmark predictions.
- Mean, Standard Deviation and Sharpe for TB200 (buy top 200 stocks and sell bottom 200 stocks).
- Mean, Standard Deviation and Sharpe for TB500 (buy top 500 stocks and sell bottom 500 stocks).
:param metrics_list: List of metrics to calculate. Default: FAST_METRICS.
:param era_col: Column name pointing to eras. Most commonly "era" for Numerai Classic and "date" for Numerai Signals.
:param custom_functions: Additional functions called in evaluation. Check out the NumerBlox docs on evaluation for more info on using custom functions.
:param show_detailed_progress_bar: Show detailed progress bar for evaluation of each prediction column.
Note that we calculate the sample standard deviation with ddof=0. It may differ slightly from the standard Pandas calculation, but is consistent with how NumPy computes standard deviation. More info: https://stackoverflow.com/questions/24984178/different-std-in-pandas-vs-numpy
Source code in numerblox/evaluation.py
apy(era_corrs, stake_compounding_lag=4)
staticmethod
Annual percentage yield.
:param era_corrs: Correlation scores by era.
:param stake_compounding_lag: Compounding lag for Numerai rounds (4 for Numerai Classic).
Source code in numerblox/evaluation.py
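Ignoring the stake compounding lag, the annualization can be sketched as follows (an illustrative formulation assuming 52 eras per year; not necessarily the exact NumerBlox computation):

```python
def simple_apy(era_corrs, eras_per_year=52):
    """Compound per-era returns, then annualize. The real apy() additionally
    shifts compounding by stake_compounding_lag rounds."""
    growth = 1.0
    for corr in era_corrs:
        growth *= 1.0 + corr  # treat each era correlation as a period return
    # Annualize the total growth over len(era_corrs) eras.
    return growth ** (eras_per_year / len(era_corrs)) - 1.0
```

A flat 1% correlation per era for a full year compounds to roughly 68% under this formulation.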
autocorr1(era_corrs)
1st order autocorrelation. :param era_corrs: Correlation scores by era.
Source code in numerblox/evaluation.py
autocorr_penalty(era_corrs)
Adjusting factor for autocorrelation. Used in Smart Sharpe. :param era_corrs: Correlation scores by era.
Source code in numerblox/evaluation.py
contributive_correlation(dataf, pred_col, target_col, other_col)
Calculate the contributive correlation of the given predictions wrt the given meta model. See: https://docs.numer.ai/numerai-tournament/scoring/meta-model-contribution-mmc-and-bmc
Uses Numerai's official scoring function for contribution under the hood. See: https://github.com/numerai/numerai-tools/blob/master/numerai_tools/scoring.py
Calculate contributive correlation by:
1. tie-kept ranking each prediction and the meta model
2. gaussianizing each prediction and the meta model
3. orthogonalizing each prediction wrt the meta model
3.5. scaling the targets to buckets [-2, -1, 0, 1, 2]
4. computing the dot product of the orthogonalized predictions and the targets, then normalizing by the length of the target (equivalent to covariance)
:param dataf: DataFrame containing era_col, pred_col, target_col and other_col.
:param pred_col: Prediction column to calculate MMC for.
:param target_col: Target column to calculate MMC against. Make sure the range of targets is [0, 1] (inclusive). If the function is called from full_evaluation, this is guaranteed because of the checks.
:param other_col: Meta model column containing predictions to neutralize against.
:return: A 1D NumPy array of contributive correlations by era.
Source code in numerblox/evaluation.py
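The numbered steps can be sketched in pure Python for a single era (hedged: NumerBlox delegates to numerai-tools, and tie handling is simplified here):

```python
from statistics import NormalDist

def gauss_rank(xs):
    """Rank -> (0, 1) -> inverse normal CDF (steps 1-2, ties ignored)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    for rank, i in enumerate(order):
        out[i] = NormalDist().inv_cdf((rank + 0.5) / len(xs))
    return out

def contribution(preds, meta, targets):
    """One-era sketch of contributive correlation for [0, 1] targets."""
    p, m = gauss_rank(preds), gauss_rank(meta)
    # Step 3: orthogonalize predictions wrt the meta model.
    beta = sum(a * b for a, b in zip(p, m)) / sum(b * b for b in m)
    resid = [a - beta * b for a, b in zip(p, m)]
    # Step 3.5: scale [0, 1] targets to the [-2, 2] bucket range.
    scaled = [4.0 * t - 2.0 for t in targets]
    # Step 4: dot product normalized by length (covariance-like score).
    return sum(r * t for r, t in zip(resid, scaled)) / len(targets)
```

If the prediction equals the meta model, the residual is all zeros, so the contribution is 0.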
cross_correlation(dataf, pred_col, other_col)
Corrv2 correlation with other predictions (like another model, example predictions or meta model predictions).
:param dataf: DataFrame containing both pred_col and other_col.
:param pred_col: Main prediction column.
:param other_col: Other prediction column to calculate correlation with pred_col.
:return: Correlation between Corrv2's of pred_col and other_col.
Source code in numerblox/evaluation.py
evaluation_one_col(dataf, feature_cols, pred_col, target_col, benchmark_cols=None)
Perform evaluation for one prediction column against given target and benchmark column(s).
Source code in numerblox/evaluation.py
exposure_dissimilarity(dataf, pred_col, other_col, corr_method='pearson')
Model pattern of feature exposure to another column. See TC details forum post: https://forum.numer.ai/t/true-contribution-details/5128/4
:param dataf: DataFrame containing both pred_col and other_col.
:param pred_col: Main prediction column.
:param other_col: Other prediction column to calculate exposure dissimilarity against.
:param corr_method: Correlation method to use for calculating feature exposures. Should be one of ['pearson', 'kendall', 'spearman']. Default: 'pearson'.
Source code in numerblox/evaluation.py
feature_neutral_mean_std_sharpe(dataf, pred_col, target_col, feature_names)
Feature neutralized mean, standard deviation and Sharpe performance. More info: https://docs.numer.ai/tournament/feature-neutral-correlation
Source code in numerblox/evaluation.py
full_evaluation(dataf, pred_cols, target_col='target', benchmark_cols=None)
Perform evaluation for each prediction column in pred_cols. By default only the "prediction" column is evaluated. Evaluation is done against the given target and benchmark prediction columns.
:param dataf: DataFrame containing era_col, pred_cols, target_col and optional benchmark_cols.
:param pred_cols: List of prediction columns to calculate evaluation metrics for.
:param target_col: Target column to evaluate against.
:param benchmark_cols: Optional list of benchmark columns to calculate evaluation metrics for.
Source code in numerblox/evaluation.py
get_feature_exposures_corrv2(dataf, pred_col, feature_list, cpu_cores=-1)
Calculate feature exposures for each era using 'Numerai Corr'. Results will be similar to get_feature_exposures() but more accurate. This method will take longer to compute.
:param dataf: DataFrame containing predictions, features, and eras.
:param pred_col: Prediction column to calculate feature exposures for.
:param feature_list: List of feature columns in X.
:param cpu_cores: Number of CPU cores to use for parallelization. Default: -1 (all cores).
:return: DataFrame with Corrv2 feature exposures by era for each feature.
Source code in numerblox/evaluation.py
get_feature_exposures_pearson(dataf, pred_col, feature_list, cpu_cores=-1)
Calculate feature exposures for each era using Pearson correlation.
:param dataf: DataFrame containing predictions, features, and eras.
:param pred_col: Prediction column to calculate feature exposures for.
:param feature_list: List of feature columns in X.
:param cpu_cores: Number of CPU cores to use for parallelization.
:return: DataFrame with Pearson feature exposures by era for each feature.
Source code in numerblox/evaluation.py
legacy_contribution(dataf, pred_col, target_col, other_col)
Legacy contribution mean, standard deviation and Sharpe ratio. More info: https://forum.numer.ai/t/mmc2-announcement/93
:param dataf: DataFrame containing era_col, pred_col, target_col and other_col.
:param pred_col: Prediction column to calculate MMC for.
:param target_col: Target column to calculate MMC against.
:param other_col: Meta model column containing predictions to neutralize against.
:return: List of legacy contribution scores by era.
Source code in numerblox/evaluation.py
max_drawdown(era_corrs)
staticmethod
Maximum drawdown of the per era correlation scores.
Source code in numerblox/evaluation.py
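A sketch of the drawdown computation over compounded era correlations (hedged: the exact cumulation NumerBlox uses may differ; this version compounds (1 + corr)):

```python
def max_drawdown_sketch(era_corrs):
    """Largest peak-to-trough drop of the compounded era correlation curve,
    returned as a non-positive fraction (0.0 means no drawdown)."""
    cumulative, peak, worst = 1.0, 1.0, 0.0
    for corr in era_corrs:
        cumulative *= 1.0 + corr
        peak = max(peak, cumulative)          # running high-water mark
        worst = min(worst, cumulative / peak - 1.0)
    return worst
```

For the series [0.1, -0.5, 0.2] the curve peaks at 1.1 and bottoms at 0.55, giving a -50% drawdown.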
max_feature_exposure(dataf, feature_cols, pred_col)
Maximum exposure over all features.
Source code in numerblox/evaluation.py
mean_std_sharpe(era_corrs)
Average, standard deviation and Sharpe ratio for correlations per era.
Source code in numerblox/evaluation.py
numerai_corr(dataf, pred_col, target_col)
Computes 'Numerai Corr' aka 'Corrv2'. More info: https://forum.numer.ai/t/target-cyrus-new-primary-target/6303
Assumes original target col as input (i.e. in [0, 1] range).
Source code in numerblox/evaluation.py
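A hedged sketch of the Corrv2 recipe as described in the linked forum post: gaussianize prediction ranks, center the [0, 1] target, accentuate both toward the tails with a power of 1.5, then take the Pearson correlation (tie handling and other details are simplified; this is not the exact NumerBlox implementation):

```python
from statistics import NormalDist

def _pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def numerai_corr_sketch(preds, targets):
    """One-era Corrv2 sketch; targets assumed in [0, 1]."""
    # Rank predictions into (0, 1), then gaussianize.
    order = sorted(range(len(preds)), key=lambda i: preds[i])
    gauss = [0.0] * len(preds)
    for rank, i in enumerate(order):
        gauss[i] = NormalDist().inv_cdf((rank + 0.5) / len(preds))
    # Accentuate tails with |x|^1.5 while keeping sign; center targets at 0.
    p15 = [abs(g) ** 1.5 * (1 if g >= 0 else -1) for g in gauss]
    t15 = [abs(t - 0.5) ** 1.5 * (1 if t >= 0.5 else -1) for t in targets]
    return _pearson(p15, t15)
```

The power of 1.5 weights the extreme predictions and targets more heavily, which is the point of Corrv2 versus plain correlation.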
per_era_corrs(dataf, pred_col, target_col)
Correlation between prediction and target for each era.
Source code in numerblox/evaluation.py
per_era_numerai_corrs(dataf, pred_col, target_col)
Numerai Corr between prediction and target for each era.
Source code in numerblox/evaluation.py
plot_correlations(dataf, pred_cols, corr_cols=None, target_col='target', roll_mean=20)
Plot per era correlations over time.
:param dataf: DataFrame that contains at least all pred_cols, target_col and corr_cols.
:param pred_cols: List of prediction columns to calculate per era correlations for and plot.
:param corr_cols: Per era correlations already prepared to include in the plot. Optional, for when you already have per era correlations prepared in your input dataf.
:param target_col: Target column name to compute per era correlations against.
:param roll_mean: How many eras should be averaged to compute a rolling score.
Source code in numerblox/evaluation.py
smart_sharpe(era_corrs)
Sharpe adjusted for autocorrelation. :param era_corrs: Correlation scores by era
Source code in numerblox/evaluation.py
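A combined sketch of autocorr1, autocorr_penalty and smart_sharpe following the forum formulation (hedged: the exact penalty coefficients in NumerBlox may differ; the standard deviation uses ddof=0, matching the note in the BaseEvaluator description):

```python
def autocorr1_sketch(xs):
    """First-order autocorrelation: Pearson corr of the series with its lag-1 shift."""
    a, b = xs[:-1], xs[1:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def smart_sharpe_sketch(era_corrs):
    """Sharpe ratio with the std inflated by an autocorrelation penalty."""
    n = len(era_corrs)
    mean = sum(era_corrs) / n
    std = (sum((x - mean) ** 2 for x in era_corrs) / n) ** 0.5  # ddof=0
    rho = autocorr1_sketch(era_corrs)
    # Penalty grows with positive serial correlation of era scores.
    penalty = (1 + 2 * sum(((n - i) / n) * rho ** i for i in range(1, n))) ** 0.5
    return mean / (std * penalty)
```

When era scores are serially uncorrelated, rho is near 0, the penalty is near 1, and Smart Sharpe reduces to the plain Sharpe ratio.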
tbx_mean_std_sharpe(dataf, pred_col, target_col, tb=200)
Calculate mean, standard deviation and Sharpe ratio when we focus on the x top and x bottom predictions.
:param tb: How many of the top and bottom predictions to focus on. TB200 and TB500 are the most common situations.
Source code in numerblox/evaluation.py
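The TB idea for a single era can be sketched as: keep only the tb highest and tb lowest predictions, then correlate with the target (illustrative; the evaluator computes this per era and reports mean, std and Sharpe across eras):

```python
def tb_correlation(preds, targets, tb=2):
    """Pearson correlation restricted to the tb top and tb bottom predictions."""
    order = sorted(range(len(preds)), key=lambda i: preds[i])
    keep = order[:tb] + order[-tb:]            # most extreme predictions only
    xs = [preds[i] for i in keep]
    ys = [targets[i] for i in keep]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)
```

Restricting to the extremes mimics a long-short book that only trades the strongest signals.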
NumeraiClassicEvaluator
Bases: BaseEvaluator
Evaluator for all metrics that are relevant in Numerai Classic.
Source code in numerblox/evaluation.py
NumeraiSignalsEvaluator
Bases: BaseEvaluator
Evaluator for all metrics that are relevant in Numerai Signals.
Source code in numerblox/evaluation.py
__await_diagnostics(api, model_id, diagnostics_id, timeout_min, interval_sec=15)
staticmethod
Wait for diagnostics to be uploaded. Try every 'interval_sec' seconds until 'timeout_min' minutes have passed.
Source code in numerblox/evaluation.py
get_diagnostics(val_dataf, model_name, key, timeout_min=2, col='validationFncV4')
Retrieve neutralized validation correlations by era.
Calculated on Numerai servers.
:param val_dataf: A DataFrame containing prediction, date, ticker and data_type columns. The data_type column should contain 'validation' instances.
:param model_name: Any model name for which you have authentication credentials.
:param key: Key object to authenticate upload of diagnostics.
:param timeout_min: How many minutes to wait on diagnostics computing on Numerai servers before timing out. 2 minutes by default.
:param col: Which column to return. Should be one of ['validationCorrV4', 'validationFncV4', 'validationIcV2', 'validationRic']. If None, all columns will be returned.
:return: Pandas Series with era as index and neutralized validation correlations (validationCorr).
Source code in numerblox/evaluation.py
BaseSubmitter
Bases: BaseIO
Basic functionality for submitting to Numerai. Uses numerapi under the hood. More info: https://numerapi.readthedocs.io/
:param directory_path: Directory to store and read submissions from.
:param api: NumerAPI or SignalsAPI.
:param max_retries: Maximum number of retries for uploading predictions to Numerai.
:param sleep_time: Time to sleep between upload retries.
:param fail_silently: Whether to skip uploading to Numerai without raising an error. Useful if you are uploading many models in a loop and want to skip models that fail to upload.
Source code in numerblox/submission.py
get_model_mapping: dict
property
Mapping between raw model names and model IDs.
__call__(dataf, model_name, file_name='submission.csv', cols='prediction', *args, **kwargs)
The most common use case will be to create a CSV and submit it immediately after that. full_submission handles this.
Source code in numerblox/submission.py
combine_csvs(csv_paths, aux_cols, era_col=None, pred_col='prediction')
Read in csv files and combine all predictions with a rank mean.
Multi-target predictions will be averaged out.
:param csv_paths: List of full paths to .csv prediction files.
:param aux_cols: ['id'] for Numerai Classic.
['ticker', 'last_friday', 'data_type'], for example, with Numerai Signals.
:param era_col: Column indicating era ('era' or 'last_friday').
Will be used for grouping the rank mean if given. The groupby is skipped if no era_col is provided.
:param pred_col: 'prediction' for Numerai Classic and 'signal' for Numerai Signals.
Source code in numerblox/submission.py
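The rank-mean step can be sketched in pandas. This is a simplified illustration, not the library's combine_csvs: the `rank_mean_combine` name is hypothetical and the sketch assumes the CSVs have already been merged on the aux_cols into one DataFrame with one prediction column per model. Ranks are percentile ranks, computed per era when an era column is given.

```python
import pandas as pd

def rank_mean_combine(dataf, pred_cols, era_col=None):
    """Rank each model's predictions (percentile rank, optionally per era),
    then average the ranks across models into a single prediction column."""
    preds = dataf[pred_cols]
    if era_col is not None:
        # Group by era so ranks are comparable within each era.
        ranked = preds.groupby(dataf[era_col]).rank(pct=True)
    else:
        ranked = preds.rank(pct=True)
    return ranked.mean(axis=1).rename("prediction")
```

Ranking before averaging makes models with different output scales comparable, which is why a rank mean is used rather than a plain mean.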
full_submission(dataf, model_name, cols, file_name='submission.csv', *args, **kwargs)
Save DataFrame to csv and upload predictions through API.
:param dataf: Main DataFrame containing the 'cols' columns.
:param model_name: Lowercase Numerai model name.
:param file_name: Path to save the CSV to, relative to the base directory.
:param cols: Columns to be saved in the submission file.
1 prediction column for Numerai Classic.
At least 1 prediction column and 1 ticker column for Numerai Signals.
*args, **kwargs are passed to the numerapi API.
For example, the 'version' argument in Numerai Classic submissions.
Source code in numerblox/submission.py
save_csv(dataf, file_name, cols, *args, **kwargs)
abstractmethod
For Numerai Classic: Save index column + 'cols' (targets) to CSV. For Numerai Signals: Save ticker, date, data_type and signal columns to CSV.
Source code in numerblox/submission.py
upload_predictions(file_name, model_name, *args, **kwargs)
Upload CSV file to Numerai for given model name.
:param file_name: File name/path relative to directory_path.
:param model_name: Lowercase raw model name (For example, 'integration_test').
Source code in numerblox/submission.py
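The max_retries / sleep_time / fail_silently behaviour described for submitters can be sketched as a generic retry wrapper. The `upload_with_retries` name and the `upload` callable (standing in for the numerapi upload call) are assumptions for illustration, not the library's internals.

```python
import time

def upload_with_retries(upload, max_retries=2, sleep_time=10, fail_silently=False, sleep=time.sleep):
    """Try 'upload' up to 'max_retries' times, sleeping 'sleep_time' seconds
    between attempts. With fail_silently=True, a final failure is reported
    and swallowed instead of raised."""
    for attempt in range(1, max_retries + 1):
        try:
            return upload()
        except Exception as exc:
            if attempt == max_retries:
                if fail_silently:
                    print(f"Upload failed after {max_retries} attempts: {exc}")
                    return None
                raise
            sleep(sleep_time)
```

fail_silently is useful when submitting many models in a loop: one failed model then does not abort the remaining submissions.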
NumerBaySubmitter
Bases: BaseSubmitter
Submit to NumerBay to fulfill sale orders, in addition to submission to Numerai.
:param tournament_submitter: Base tournament submitter (NumeraiClassicSubmitter or NumeraiSignalsSubmitter). This submitter will use the same directory path.
:param upload_to_numerai: Whether to also submit to Numerai using the tournament submitter. Defaults to True; set to False to only upload to NumerBay.
:param numerbay_username: NumerBay username.
:param numerbay_password: NumerBay password.
Source code in numerblox/submission.py
__call__(dataf, model_name, numerbay_product_full_name, file_name='submission.csv', cols='prediction', *args, **kwargs)
The most common use case will be to create a CSV and submit it immediately after that. full_submission handles this.
Source code in numerblox/submission.py
full_submission(dataf, model_name, cols, numerbay_product_full_name, file_name='submission.csv', *args, **kwargs)
Save DataFrame to csv and upload predictions through API.
:param dataf: Main DataFrame containing the 'cols' columns.
:param model_name: Lowercase Numerai model name.
:param numerbay_product_full_name: NumerBay product full name in the format of [category]-[product name], e.g. 'numerai-predictions-numerbay'.
:param file_name: Path to save the CSV to, relative to the base directory.
:param cols: Columns to be saved in the submission file.
1 prediction column for Numerai Classic.
At least 1 prediction column and 1 ticker column for Numerai Signals.
*args, **kwargs are passed to the numerapi API.
For example, the 'version' argument in Numerai Classic submissions.
Source code in numerblox/submission.py
upload_predictions(file_name, model_name, numerbay_product_full_name, *args, **kwargs)
Upload CSV file to NumerBay (and Numerai if 'upload_to_numerai' is True) for given model name and NumerBay product full name.
:param file_name: File name/path relative to directory_path.
:param model_name: Lowercase raw model name (For example, 'integration_test').
:param numerbay_product_full_name: NumerBay product full name in the format of [category]-[product name], e.g. 'numerai-predictions-numerbay'.
Source code in numerblox/submission.py
NumeraiClassicSubmitter
Bases: BaseSubmitter
Submit for Numerai Classic.
:param directory_path: Base directory to save and read prediction files from.
:param key: Key object containing valid credentials for Numerai Classic.
:param max_retries: Maximum number of retries for uploading predictions to Numerai.
:param sleep_time: Time to sleep between upload retries.
:param fail_silently: Whether to skip uploading to Numerai without raising an error. Useful if you are uploading many models in a loop and want to skip models that fail to upload.
*args, **kwargs will be passed to NumerAPI initialization.
Source code in numerblox/submission.py
save_csv(dataf, file_name='submission.csv', cols='prediction', *args, **kwargs)
:param dataf: DataFrame which should have at least the following columns:
1. id (as index column)
2. cols (for example, 'prediction_mymodel'). Will be saved in the 'prediction' column.
:param file_name: .csv file path.
:param cols: Prediction column name. For example, 'prediction' or 'prediction_mymodel'.
Source code in numerblox/submission.py
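The save step described above amounts to renaming the chosen prediction column to 'prediction' and writing it out under the id index. A minimal sketch in pandas, assuming a single prediction column; the `save_classic_csv` name is hypothetical, not the library's method:

```python
import pandas as pd

def save_classic_csv(dataf, file_name, col):
    """Save the id index plus one prediction column ('col') to CSV,
    renamed to 'prediction' as Numerai Classic expects."""
    sub = dataf[[col]].rename(columns={col: "prediction"})
    sub.index.name = "id"
    sub.to_csv(file_name)
    return sub
```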
NumeraiSignalsSubmitter
Bases: BaseSubmitter
Submit for Numerai Signals.
:param directory_path: Base directory to save and read prediction files from.
:param key: Key object containing valid credentials for Numerai Signals.
:param max_retries: Maximum number of retries for uploading predictions to Numerai.
:param sleep_time: Time to sleep between upload retries.
:param fail_silently: Whether to skip uploading to Numerai without raising an error. Useful if you are uploading many models in a loop and want to skip models that fail to upload.
*args, **kwargs will be passed to SignalsAPI initialization.
Source code in numerblox/submission.py
save_csv(dataf, cols, file_name='submission.csv', *args, **kwargs)
:param dataf: DataFrame which should have at least the following columns:
1. One of the supported ticker formats (cusip, sedol, ticker, numerai_ticker or bloomberg_ticker)
2. signal (values between 0 and 1, exclusive)
Additional columns if you include validation data (optional):
3. date (YYYY-MM-DD format date indication)
4. data_type ('val' and 'live' partitions)
:param cols: All columns that are saved in the CSV. cols should contain at least 1 ticker column and a 'signal' column. For example: ['bloomberg_ticker', 'signal'].
:param file_name: .csv file path.
Source code in numerblox/submission.py
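A minimal sketch of the Signals save step: keep only the requested columns and check that signal values are strictly between 0 and 1, as the docstring above requires. The `save_signals_csv` name and the range check are illustrative assumptions, not the library's exact implementation:

```python
import pandas as pd

def save_signals_csv(dataf, file_name, cols=("bloomberg_ticker", "signal")):
    """Save the ticker and signal columns (plus optional date/data_type
    columns listed in 'cols') to CSV, validating the signal range first."""
    sub = dataf[list(cols)]
    # Signals must lie strictly between 0 and 1 (exclusive on both ends).
    if not sub["signal"].between(0, 1, inclusive="neither").all():
        raise ValueError("signal values must be strictly between 0 and 1.")
    sub.to_csv(file_name, index=False)
    return sub
```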