A blazingly fast package for quantitative finance analysis using Polars

matthewgson/quantpolars

QuantPolars

A Python package for quantitative finance analysis using Polars, providing blazingly fast tools for data summarization and option pricing.

Installation

pip3 install git+https://github.com/matthewgson/quantpolars.git

Requirements: Python 3.8+, Polars

Data Summary Function (sm)

Generate comprehensive summary statistics for all columns in your DataFrame with a single function call. Returns a Polars DataFrame with summary statistics that can be optionally converted to styled GT tables.

Features

  • Blazingly Fast: Single-pass computation using Polars expressions
  • Type-Aware: Different statistics based on data type (numeric, date, categorical)
  • Missing Data: Includes percentage of missing values for each column
  • Simple API: Returns DataFrame directly, convert to GT styling when needed
  • Styled Output: Optional Great Tables formatting for beautiful HTML tables
  • LazyFrame Support: Works with both eager and lazy evaluation

Basic Usage

    import polars as pl
    from datetime import date
    from quantpolars import sm

    # Create sample data
    df = pl.DataFrame({
        'revenue': [1000, 2500, 1800, 3200, 2900, None, 2100, 1750],
        'profit_margin': [0.15, 0.22, 0.18, 0.25, 0.20, 0.17, 0.19, 0.16],
        'transaction_date': [
            date(2024, 1, 15), date(2024, 2, 20), date(2024, 3, 10), date(2024, 4, 5),
            date(2024, 5, 12), date(2024, 6, 8), date(2024, 7, 22), None
        ],
        'customer_segment': ['Enterprise', 'SMB', 'Enterprise', 'SMB',
                             'Enterprise', 'SMB', 'Enterprise', 'SMB'],
        'active': [True, True, False, True, False, True, True, False]
    })

    print("Sample Data:")
    df

    # Generate summary statistics
    summary = sm(df)
    print("Summary Statistics with % Missing:")
    summary  # This is now a Polars DataFrame directly

Output:

    shape: (5, 16)
    ┌──────────────────┬─────────────┬──────┬─────────────┬───┬────────┬────────┬────────┬──────────┐
    │ variable         ┆ type        ┆ nobs ┆ pct_missing ┆ … ┆ p75    ┆ p95    ┆ p99    ┆ n_unique │
    │ ---              ┆ ---         ┆ ---  ┆ ---         ┆   ┆ ---    ┆ ---    ┆ ---    ┆ ---      │
    │ str              ┆ str         ┆ i64  ┆ f64         ┆   ┆ f64    ┆ f64    ┆ f64    ┆ i64      │
    ╞══════════════════╪═════════════╪══════╪═════════════╪═══╪════════╪════════╪════════╪══════════╡
    │ transaction_date ┆ date        ┆ 7    ┆ 12.5        ┆ … ┆ null   ┆ null   ┆ null   ┆ 7        │
    │ customer_segment ┆ categorical ┆ 8    ┆ 0.0         ┆ … ┆ null   ┆ null   ┆ null   ┆ 2        │
    │ active           ┆ categorical ┆ 8    ┆ 0.0         ┆ … ┆ null   ┆ null   ┆ null   ┆ 2        │
    │ revenue          ┆ numeric     ┆ 7    ┆ 12.5        ┆ … ┆ 2900.0 ┆ 3200.0 ┆ 3200.0 ┆ 7        │
    │ profit_margin    ┆ numeric     ┆ 8    ┆ 0.0         ┆ … ┆ 0.2    ┆ 0.25   ┆ 0.25   ┆ 8        │
    └──────────────────┴─────────────┴──────┴─────────────┴───┴────────┴────────┴────────┴──────────┘

Column Reference

| Column      | Description                                     |
|-------------|-------------------------------------------------|
| variable    | Column name                                     |
| type        | Data type category (numeric, date, categorical) |
| nobs        | Number of non-null observations                 |
| pct_missing | Percentage of missing values                    |
| mean        | Mean value (numeric columns only)               |
| sd          | Standard deviation (numeric columns only)       |
| min         | Minimum value (numeric and date columns only)   |
| max         | Maximum value (numeric and date columns only)   |
| p1-p99      | Percentiles (numeric columns only)              |
| n_unique    | Number of unique values                         |
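
To make these definitions concrete, here is a minimal plain-Python sketch that computes a few of the columns by hand for the `revenue` values from the sample data. It is not how `sm` is implemented: the helper name `describe_numeric` is hypothetical, and using the sample standard deviation (ddof = 1, which is also the default of Polars' `std`) is an assumption about what `sd` reports.

```python
import statistics

def describe_numeric(values):
    """Hand-computed nobs, pct_missing, mean, sd, min, max for one column."""
    # None stands in for a missing (null) value.
    present = [v for v in values if v is not None]
    return {
        "nobs": len(present),  # non-null observations only
        "pct_missing": 100.0 * (len(values) - len(present)) / len(values),
        "mean": statistics.mean(present),
        "sd": statistics.stdev(present),  # sample sd (ddof=1): an assumption
        "min": min(present),
        "max": max(present),
    }

revenue = [1000, 2500, 1800, 3200, 2900, None, 2100, 1750]
stats = describe_numeric(revenue)
print(stats["nobs"], stats["pct_missing"])  # prints: 7 12.5
```

One of the eight revenue values is missing, which gives the 7 observations and 12.5% missing shown for `revenue` in the output above.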

Styled Output

For beautiful formatted tables with proper date formatting:

    from quantpolars import to_gt  # Requires: pip3 install great-tables

    styled_summary = to_gt(summary)  # Convert DataFrame to styled GT table
    styled_summary  # In Jupyter, displays as formatted HTML table

Rendered Output Example: The to_gt() function returns a Great Tables (GT) object that renders as a formatted HTML table in Jupyter notebooks with:

  • Table Header: "Data Summary Statistics" with subtitle showing variable count
  • Formatted Numbers: Statistics rounded to 2 decimal places
  • Percentage Formatting: Missing values shown as percentages (e.g., "12.5%")
  • Date Formatting: Min/max dates formatted as MM/DD/YYYY (e.g., "1/1/2023")
  • Professional Styling: Clean borders, alternating row colors, proper alignment
  • Column Labels: User-friendly names ("Std Dev" instead of "sd", "N Obs" instead of "nobs")

Example of what the styled table displays:

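| Variable         | Type        | N Obs | % Missing | Mean     | Std Dev | Min          | Max          | 1%       | 5%       | 25%      | 50%      | 75%      | 95%      | 99%      | N Unique |
|------------------|-------------|-------|-----------|----------|---------|--------------|--------------|----------|----------|----------|----------|----------|----------|----------|----------|
| transaction_date | date        | 7     | 12.5%     |          |         | Jan 15, 2024 | Jul 22, 2024 |          |          |          |          |          |          |          | 7        |
| customer_segment | categorical | 8     | 0.0%      |          |         |              |              |          |          |          |          |          |          |          | 2        |
| active           | categorical | 8     | 0.0%      |          |         |              |              |          |          |          |          |          |          |          | 2        |
| revenue          | numeric     | 7     | 12.5%     | 2,225.00 | 716.02  | 1,000.00     | 3,200.00     | 1,000.00 | 1,000.00 | 1,800.00 | 2,100.00 | 2,900.00 | 3,200.00 | 3,200.00 | 7        |
| profit_margin    | numeric     | 8     | 0.0%      | 0.19     | 0.03    | 0.15         | 0.25         | 0.15     | 0.15     | 0.17     | 0.19     | 0.22     | 0.25     | 0.25     | 8        |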

Data Type Handling

  • Numeric: Full statistics including percentiles
  • Date: Min/max dates only (percentiles not supported by Polars)
  • Categorical: Unique counts only
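
The three categories can be illustrated with a small plain-Python sketch. This is only an assumed model of the dispatch, not the package's code (which works on Polars dtypes rather than Python values), and `classify_column` is a hypothetical helper. Booleans are mapped to categorical because the `active` column is reported that way in the example output.

```python
from datetime import date

def classify_column(values):
    """Return 'numeric', 'date', or 'categorical' for a list of Python values."""
    present = [v for v in values if v is not None]  # ignore missing values
    # bool is a subclass of int in Python, so it must be checked first;
    # booleans are treated as categorical, like `active` in the example.
    if all(isinstance(v, bool) for v in present):
        return "categorical"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "numeric"
    if all(isinstance(v, date) for v in present):
        return "date"
    # Strings and anything mixed fall back to categorical.
    return "categorical"

print(classify_column([1000, 2500, None]))         # prints: numeric
print(classify_column([date(2024, 1, 15), None]))  # prints: date
print(classify_column([True, False]))              # prints: categorical
```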

Out-of-Core Example

    import polars as pl
    import quantpolars as qp

    # Batch price 1M options
    df = pl.scan_csv("options_data.csv")  # Out-of-core
    df = df.with_columns(
        price=qp.black_scholes(df, 'S', 'K', 'T', 'r', 'sigma', 'call')['price']
    )

Features

  • Data Summary Tools: Out-of-core data summarization for big data
  • Option Pricing: Black-Scholes, Cox-Ross-Rubinstein (CRR), Barone-Adesi-Whaley (BAW) models
  • Implied Volatility: Calculation of implied volatility
  • Greeks: Delta, Gamma, Theta, Vega, Rho calculators
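
For readers unfamiliar with the underlying math, the sketch below shows the textbook Black-Scholes price for a European option on a non-dividend-paying asset, the call delta, and implied volatility recovered by bisection. It is scalar, plain Python for illustration only: the function names `bs_price`, `call_delta`, and `implied_vol` are hypothetical and are not the quantpolars API, and bisection is just one possible root-finding choice, not necessarily what the package uses.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _d1(S, K, T, r, sigma):
    return (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))

def bs_price(S, K, T, r, sigma, kind="call"):
    """Black-Scholes price of a European option (no dividends assumed)."""
    d1 = _d1(S, K, T, r, sigma)
    d2 = d1 - sigma * math.sqrt(T)
    if kind == "call":
        return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    return K * math.exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)

def call_delta(S, K, T, r, sigma):
    """Delta of a European call: N(d1)."""
    return norm_cdf(_d1(S, K, T, r, sigma))

def implied_vol(price, S, K, T, r, kind="call", lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection on sigma; the price is increasing in sigma, so this converges."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_price(S, K, T, r, mid, kind) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

call = bs_price(100.0, 100.0, 1.0, 0.05, 0.2, "call")
put = bs_price(100.0, 100.0, 1.0, 0.05, 0.2, "put")
iv = implied_vol(call, 100.0, 100.0, 1.0, 0.05, "call")
print(round(call, 4))  # prints: 10.4506
```

Feeding the computed call price back into `implied_vol` recovers the input volatility of 0.2, which is a quick sanity check for any pricing and implied-volatility pair.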

Key Optimizations

  • Vectorized DataFrame API: Functions operate on Polars DataFrames for batch processing of multiple options
  • Fast Norm CDF Approximation: Implemented Abramowitz & Stegun approximation using Polars expressions
  • Lazy Evaluation: All operations are lazy, enabling out-of-core processing for big data
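
As an illustration of the normal CDF approximation mentioned above, here is a scalar plain-Python sketch of Abramowitz & Stegun formula 26.2.17, whose absolute error is below 7.5e-8. Which A&S formula the package actually uses is an assumption here, and `norm_cdf_as` is a hypothetical name. The formula needs only arithmetic and one exponential, which is what makes it easy to express as column expressions.

```python
import math

# Constants of Abramowitz & Stegun formula 26.2.17
P = 0.2316419
B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

def norm_cdf_as(x):
    """Approximate standard normal CDF (assumed formula: A&S 26.2.17)."""
    ax = abs(x)  # the formula is stated for x >= 0; use symmetry otherwise
    t = 1.0 / (1.0 + P * ax)
    # Horner evaluation of b1*t + b2*t^2 + ... + b5*t^5
    poly = 0.0
    for b in reversed(B):
        poly = (poly + b) * t
    pdf = math.exp(-0.5 * ax * ax) / math.sqrt(2.0 * math.pi)
    upper = 1.0 - pdf * poly
    return upper if x >= 0 else 1.0 - upper

def norm_cdf_exact(x):
    """Reference value via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

max_err = max(abs(norm_cdf_as(x) - norm_cdf_exact(x))
              for x in (-3.0, -1.0, 0.0, 0.5, 1.96, 3.0))
print(max_err < 1e-7)  # prints: True
```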

Updated API

The functions now work on Polars DataFrames, allowing for:

  • Batch Processing: Price thousands of options in a single operation
  • Big Data Ready: Handles datasets larger than memory with Polars' streaming
  • Extreme Speed: Vectorized operations on columnar data

Performance Benefits

  • No Loops: All vectorized in Polars/Rust
  • Memory Efficient: Columnar storage and lazy evaluation
  • Scalable: Handles billions of rows with minimal memory
  • Parallel: Automatic parallelization where possible
