Installation & Quick Start
Installation
Quick Start
centimators transformers are dataframe-agnostic, powered by narwhals. You can use the same transformer (like RankTransformer) seamlessly with both Pandas and Polars DataFrames. This transformer calculates the normalized rank of features within each date group.
First, let's define some common data:
import pandas as pd
import polars as pl
# Create sample OHLCV data for two stocks over four trading days
data = {
'date': ['2021-01-01', '2021-01-01', '2021-01-02', '2021-01-02',
'2021-01-03', '2021-01-03', '2021-01-04', '2021-01-04'],
'ticker': ['AAPL', 'MSFT', 'AAPL', 'MSFT', 'AAPL', 'MSFT', 'AAPL', 'MSFT'],
'open': [150.0, 280.0, 151.0, 282.0, 152.0, 283.0, 153.0, 284.0], # Opening prices
'high': [152.0, 282.0, 153.0, 284.0, 154.0, 285.0, 155.0, 286.0], # Daily highs
'low': [149.0, 278.0, 150.0, 280.0, 151.0, 281.0, 152.0, 282.0], # Daily lows
'close': [151.0, 281.0, 152.0, 283.0, 153.0, 284.0, 154.0, 285.0], # Closing prices
'volume': [1000000, 800000, 1200000, 900000, 1100000, 850000, 1050000, 820000] # Trading volume
}
# Create both Pandas and Polars DataFrames
df_pd = pd.DataFrame(data)
df_pl = pl.DataFrame(data)
# Define the OHLCV features we want to transform
feature_cols = ['volume', 'close']
Now, let's use the transformer:
from centimators.feature_transformers import RankTransformer
transformer = RankTransformer(feature_names=feature_cols)
result_pd = transformer.fit_transform(df_pd[feature_cols], date_series=df_pd['date'])
result_pl = transformer.fit_transform(df_pl[feature_cols], date_series=df_pl['date'])
Both result_pd (from Pandas) and result_pl (from Polars) will contain the same transformed data in their native DataFrame formats. You may find significant performance gains using Polars for certain operations.
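To make "normalized rank within each date group" concrete, here is a minimal sketch that computes a comparable quantity with plain pandas. It is not the centimators implementation: it assumes the normalization is rank divided by group size (what pandas `rank(pct=True)` does), and the `_rank` column suffix is a hypothetical naming choice for illustration.

```python
import pandas as pd

# A two-day subset of the sample data above
df = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"],
    "ticker": ["AAPL", "MSFT", "AAPL", "MSFT"],
    "volume": [1000000, 800000, 1200000, 900000],
    "close": [151.0, 281.0, 152.0, 283.0],
})
feature_cols = ["volume", "close"]

# Rank each feature among the rows sharing a date.
# pct=True divides each rank by the group size, so values fall in (0, 1].
ranked = df.groupby("date")[feature_cols].rank(pct=True)

# Hypothetical suffix, used here only to distinguish ranks from raw values
ranked = ranked.add_suffix("_rank")

print(pd.concat([df[["date", "ticker"]], ranked], axis=1))
```

With two tickers per date, the lower value in each group gets 0.5 and the higher gets 1.0. AAPL has the higher volume and MSFT the higher close on both days, so the two rank columns mirror each other. The exact scaling that RankTransformer applies may differ from this assumption.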