Quant Finance Hybrid Project — Requirements Elicitation + Mathematical Specification (C++ Core, Python Research Layer)
Project codename: quant-engine
Goal: Build a reproducible, testable, performance-aware pricing library for European options under Black–Scholes, with Monte Carlo (MC) as the primary numerical method, and an optional PDE/FEM track for cross-validation and CSE-grade numerical analysis.
1. Stakeholders and Intended Use
1.1 Primary stakeholder
- You (developer/researcher): want a sustainable workflow: derivations → implementation → verification → benchmarking → reporting.
1.2 Secondary stakeholders (implicit)
- Course staff / graders: expect numerics rigor (convergence, stability, error bars), clean code structure, and evidence of understanding.
- Recruiters / interviewers (optional): expect engineering maturity (modular design, tests, profiling) and mathematical correctness.
1.3 Use cases
- Price a European call/put via Monte Carlo with confidence intervals.
- Compute Greeks (Delta, Vega) with stable estimators.
- Reduce variance via antithetic + control variates; quantify variance reduction.
- Benchmark performance (single-thread vs multi-thread; C++ vs Python wrapper).
- (Optional) Solve Black–Scholes PDE numerically (FD or FEM) and compare to MC and analytic pricing.
2. Scope and Non-Scope
2.1 In-scope (baseline)
- Black–Scholes risk-neutral dynamics
- European options (call/put)
- MC pricing with error estimation
- Variance reduction: antithetic variates, control variates
- Greeks: Delta, Vega via pathwise derivative
- Deterministic validation against closed-form Black–Scholes formula
- Clean C++ library with a small CLI runner and Python experiment scripts
2.2 In-scope (advanced extensions)
- Quasi-MC (Sobol) optional
- OpenMP parallel MC
- Calibration (implied vol) optional
- PDE solver (FD or FEM/Galerkin) optional for cross-validation
2.3 Explicit non-scope (for now)
- Exotic options requiring full path simulation (barrier/asian) unless you later extend
- Jump-diffusion, local vol, Heston calibration (future phases)
- Full market data infrastructure (DB ingestion, tick-level)
3. Deliverables
3.1 Code deliverables
- libqengine (C++): pricing + Greeks + variance reduction
- qengine_cli (C++): command-line program to run experiments
- pyqengine (Python layer): plotting + experiment orchestration (optional binding via pybind11)
- Test suite (unit + numerical regression tests)
- Benchmark scripts + reproducible configs
3.2 Documentation deliverables
- A technical report in Markdown/LaTeX including:
  - model assumptions
  - derivations
  - estimator definitions
  - error/variance analysis
  - convergence plots
  - performance benchmarks
  - limitations + future work
4. Mathematical Model Specification
4.1 Risk-neutral model (Black–Scholes)
Under the risk-neutral measure \mathbb{Q},
dS_t = r S_t\,dt + \sigma S_t\,dW_t,
where:
- S_t > 0 is the asset price
- r is the continuously-compounded risk-free rate
- \sigma > 0 is volatility
- W_t is standard Brownian motion.
4.2 Terminal distribution (exact sampling)
Apply Itô’s lemma to \log S_t:
d(\log S_t) = \left(r - \tfrac12\sigma^2\right)dt + \sigma\,dW_t.
Integrating from 0 to T yields:
S_T = S_0 \exp\!\left(\left(r - \tfrac12\sigma^2\right)T + \sigma\sqrt{T}\,Z\right),\quad Z\sim \mathcal{N}(0,1).
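A minimal Python sketch of the exact sampler, checked against the identity \mathbb{E}[S_T]=S_0 e^{rT}. The name terminal_price follows the unit-test list in Section 12; sample_mean_terminal, the argument order, and the use of Python's random.Random (standing in for the C++ RNG) are assumptions made for illustration.

```python
import math
import random


def terminal_price(s0, r, sigma, t, z):
    """Exact Black-Scholes terminal price S_T for one standard normal draw z."""
    return s0 * math.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)


def sample_mean_terminal(s0, r, sigma, t, n, seed=0):
    """Return (sample mean of S_T, standard error of that mean)."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        st = terminal_price(s0, r, sigma, t, rng.gauss(0.0, 1.0))
        total += st
        total_sq += st * st
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)  # unbiased sample variance
    return mean, math.sqrt(var / n)
```

Because the sampler is exact, there is no time-discretization bias: the only error in the sample mean is statistical.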
4.3 Payoffs
- European call: h(S_T)=(S_T-K)^+ = \max(S_T-K,0)
- European put: h(S_T)=(K-S_T)^+ = \max(K-S_T,0)
4.4 Pricing equation (risk-neutral valuation)
V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[h(S_T)].
5. Numerical Methods Specification (Primary Track: Monte Carlo)
5.1 Basic Monte Carlo estimator
Define discounted payoff random variable:
X = e^{-rT}h(S_T).
With i.i.d. samples Z_i\sim\mathcal{N}(0,1), compute S_T^{(i)} and X_i, then:
\widehat{V}_N = \frac{1}{N}\sum_{i=1}^N X_i.
Properties
- Unbiased: \mathbb{E}[\widehat{V}_N]=V_0
- Variance: \mathrm{Var}(\widehat{V}_N)=\mathrm{Var}(X)/N
- CLT: \sqrt{N}(\widehat{V}_N-V_0)\Rightarrow \mathcal{N}(0,\mathrm{Var}(X))
5.2 Standard error and confidence interval
Estimate variance with sample variance:
\widehat{\mathrm{Var}}(X)=\frac{1}{N-1}\sum_{i=1}^N (X_i-\overline{X})^2.
Standard error of the mean:
\mathrm{SE}(\widehat{V}_N)=\sqrt{\widehat{\mathrm{Var}}(X)/N}.
Approximate 95% confidence interval:
\widehat{V}_N \pm 1.96\,\mathrm{SE}(\widehat{V}_N).
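The estimator, standard error, and 95% interval from 5.1 and 5.2 can be prototyped as below. This is a sketch, not the library API: the function name mc_price and its signature are hypothetical, and the running mean and variance use Welford's update so that no sample array is stored.

```python
import math
import random


def mc_price(s0, k, r, sigma, t, n, seed=0, call=True):
    """Plain Monte Carlo price of a European option with exact terminal sampling.

    Returns (price, standard_error, (ci_low, ci_high)).
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    mean = 0.0  # running mean of the discounted payoffs X_i
    m2 = 0.0    # running sum of squared deviations (Welford)
    for i in range(1, n + 1):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        payoff = max(st - k, 0.0) if call else max(k - st, 0.0)
        x = disc * payoff
        delta = x - mean
        mean += delta / i
        m2 += delta * (x - mean)
    var = m2 / (n - 1)       # sample variance with the 1/(N-1) normalization
    se = math.sqrt(var / n)  # standard error of the mean
    return mean, se, (mean - 1.96 * se, mean + 1.96 * se)
```

For S_0 = K = 100, r = 0.05, \sigma = 0.2, T = 1 the closed-form call price is about 10.4506, which gives a convenient target for checking the interval.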
5.3 Variance reduction requirements
5.3.1 Antithetic variates
For each Z, also use -Z. Define:
\widehat{V}^{\text{anti}}_N=\frac{1}{N}\sum_{i=1}^N \frac{X(Z_i)+X(-Z_i)}{2}.
Expected outcome: lower variance for monotone payoffs (calls/puts) due to negative correlation.
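A hedged sketch of the antithetic estimator for a call. The function name and the returned variance-reduction factor are illustrative choices; the factor compares plain MC using 2N draws against N antithetic pairs, so both sides cost the same number of payoff evaluations.

```python
import math
import random


def antithetic_call(s0, k, r, sigma, t, n_pairs, seed=0):
    """Antithetic MC for a European call.

    Returns (price, se, vr_factor), where vr_factor > 1 means the antithetic
    estimator has lower variance than plain MC at equal cost.
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)

    def x(z):
        return disc * max(s0 * math.exp(drift + vol * z) - k, 0.0)

    singles, pairs = [], []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        x_plus, x_minus = x(z), x(-z)
        singles.append(x_plus)
        pairs.append(0.5 * (x_plus + x_minus))  # one sample per pair

    def mean_var(values):
        m = sum(values) / len(values)
        return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)

    _, var_single = mean_var(singles)
    price, var_pair = mean_var(pairs)
    se = math.sqrt(var_pair / n_pairs)
    # Var(plain, 2N draws) / Var(antithetic, N pairs)
    vr_factor = (var_single / 2.0) / var_pair
    return price, se, vr_factor
```

The standard error is computed from the pair averages, not from the 2N individual payoffs, because the two members of a pair are not independent.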
5.3.2 Control variates (with known expectation)
Choose a control variable Y correlated with X and with known mean.
A standard choice:
Y=e^{-rT}S_T,\qquad \mathbb{E}[Y]=S_0.
Estimator:
\widehat{V}^{\text{cv}}=\frac{1}{N}\sum_{i=1}^N \left(X_i - \beta (Y_i-\mathbb{E}[Y])\right).
Optimal coefficient:
\beta^*=\frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(Y)}.
In practice estimate \beta^* from samples.
Acceptance criterion: demonstrate variance reduction factor empirically, e.g.
\frac{\widehat{\mathrm{Var}}(\widehat{V}_N)}{\widehat{\mathrm{Var}}(\widehat{V}^{\text{cv}}_N)} > 1.
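The control-variate estimator with Y = e^{-rT}S_T can be sketched as follows. The function name and return tuple are hypothetical. Note one assumption: \beta^* is estimated from the same samples used for the price, which introduces a small O(1/N) bias that this sketch ignores.

```python
import math
import random


def control_variate_call(s0, k, r, sigma, t, n, seed=0):
    """Call price with control variate Y = exp(-rT) * S_T, E[Y] = S_0.

    Returns (price_plain, price_cv, se_cv, beta, vr_factor).
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    xs, ys = [], []
    for _ in range(n):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        xs.append(disc * max(st - k, 0.0))
        ys.append(disc * st)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    beta = cov_xy / var_y  # sample estimate of the optimal coefficient
    adjusted = [x - beta * (y - s0) for x, y in zip(xs, ys)]
    price_cv = sum(adjusted) / n
    var_cv = sum((a - price_cv) ** 2 for a in adjusted) / (n - 1)
    return mx, price_cv, math.sqrt(var_cv / n), beta, var_x / var_cv
```

The last return value is the empirical variance reduction factor from the acceptance criterion; in theory it equals 1/(1-\rho_{XY}^2).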
5.4 Greeks (pathwise derivative)
Let V(S_0,\sigma,r,T)=e^{-rT}\mathbb{E}[h(S_T)].
5.4.1 Delta
Since S_T = S_0 \exp(\cdots), we have:
\frac{\partial S_T}{\partial S_0}=\frac{S_T}{S_0}.
For a call, h'(S_T)=\mathbf{1}_{\{S_T>K\}} almost everywhere.
Thus:
\Delta = \frac{\partial V}{\partial S_0}
= e^{-rT}\mathbb{E}\left[\mathbf{1}_{\{S_T>K\}}\frac{S_T}{S_0}\right].
MC estimator:
\widehat{\Delta}_N=\frac{e^{-rT}}{N}\sum_{i=1}^N \mathbf{1}_{\{S_T^{(i)}>K\}}\frac{S_T^{(i)}}{S_0}.
5.4.2 Vega
Differentiate:
\log S_T = \log S_0 + (r-\tfrac12\sigma^2)T + \sigma\sqrt{T}Z.
So:
\frac{\partial \log S_T}{\partial \sigma} = -\sigma T + \sqrt{T}Z,
\qquad
\frac{\partial S_T}{\partial \sigma}=S_T(-\sigma T + \sqrt{T}Z).
Hence for a call:
\text{Vega} = e^{-rT}\mathbb{E}\left[\mathbf{1}_{\{S_T>K\}}\,S_T(-\sigma T+\sqrt{T}Z)\right].
Greeks acceptance criterion: compare MC Greeks against analytic Greeks (Black–Scholes closed form) within statistical error.
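A sketch of the pathwise Delta and Vega estimators for a call, each with its own standard error. The function name and return shape are assumptions for illustration only.

```python
import math
import random


def pathwise_greeks_call(s0, k, r, sigma, t, n, seed=0):
    """Pathwise Delta and Vega of a European call.

    Returns ((delta, se_delta), (vega, se_vega)).
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    sqrt_t = math.sqrt(t)
    disc = math.exp(-r * t)
    d_sum = d_sq = v_sum = v_sq = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        st = s0 * math.exp(drift + sigma * sqrt_t * z)
        if st > k:  # indicator 1{S_T > K}
            d = disc * st / s0
            v = disc * st * (-sigma * t + sqrt_t * z)
        else:
            d = v = 0.0
        d_sum += d
        d_sq += d * d
        v_sum += v
        v_sq += v * v

    def summarize(total, total_sq):
        mean = total / n
        var = (total_sq - n * mean * mean) / (n - 1)
        return mean, math.sqrt(var / n)

    return summarize(d_sum, d_sq), summarize(v_sum, v_sq)
```

For a put the same derivation gives the indicator \mathbf{1}_{\{S_T<K\}} with a minus sign in front of both estimators.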
6. Analytical Reference (Black–Scholes Closed-Form)
To validate MC, implement closed form pricing.
Define:
d_1=\frac{\ln(S_0/K)+(r+\tfrac12\sigma^2)T}{\sigma\sqrt{T}},\quad
d_2=d_1-\sigma\sqrt{T}.
Call:
C = S_0\Phi(d_1) - K e^{-rT}\Phi(d_2).
Put:
P = K e^{-rT}\Phi(-d_2) - S_0\Phi(-d_1).
Where \Phi is standard normal CDF.
Acceptance criterion: MC estimates converge to analytic price; error decreases ~ O(N^{-1/2}).
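The closed-form reference is short enough to prototype directly. In this sketch the names norm_cdf and bs_price are hypothetical, and \Phi is computed with math.erfc, mirroring the std::erfc option mentioned in the implementation plan.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bs_price(s0, k, r, sigma, t, call=True):
    """Black-Scholes price of a European call or put."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    if call:
        return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)
    return k * math.exp(-r * t) * norm_cdf(-d2) - s0 * norm_cdf(-d1)
```

Using erfc rather than 1 - erf keeps the tails accurate when d_1 or d_2 is large in magnitude.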
7. Optional PDE Track (for CSE/Numerics Alignment)
7.1 Black–Scholes PDE
Option price V(t,S) satisfies:
\frac{\partial V}{\partial t}
+\frac12\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}
+rS\frac{\partial V}{\partial S}
-rV=0,\quad V(T,S)=h(S).
7.2 Weak form (Galerkin idea)
Choose test functions w(S) and integrate over domain S\in[S_{\min},S_{\max}] (truncation required).
A typical weak form after multiplying by w and integrating (with integration by parts on the second derivative term) yields a bilinear form involving:
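- mass term: \int V w \, dS
- diffusion term: \int \tfrac12\sigma^2 S^2 V_S w_S \, dS
- convection term: \int (r-\sigma^2) S V_S w \, dS (the -\sigma^2 part comes from differentiating S^2 during integration by parts)
- reaction term: \int r V w \, dS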
Time discretization (e.g. backward Euler / Crank–Nicolson) gives linear systems per timestep.
Why include this: deterministic solver for 1D that lets you cross-check MC and show FEM competence.
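As a worked version of the integration by parts, assuming test functions w that vanish at S_{\min} and S_{\max} (so the boundary term drops), the steps are:

```latex
\begin{aligned}
0 &= \int_{S_{\min}}^{S_{\max}}
     \left( V_t + \tfrac12\sigma^2 S^2 V_{SS} + r S V_S - r V \right) w \, dS, \\
\int \tfrac12\sigma^2 S^2 V_{SS}\, w \, dS
  &= \left[ \tfrac12\sigma^2 S^2 V_S\, w \right]_{S_{\min}}^{S_{\max}}
   - \int \tfrac12\sigma^2 S^2 V_S\, w_S \, dS
   - \int \sigma^2 S\, V_S\, w \, dS, \\
0 &= \int V_t\, w \, dS
   - \int \tfrac12\sigma^2 S^2 V_S\, w_S \, dS
   + \int (r - \sigma^2)\, S\, V_S\, w \, dS
   - \int r V w \, dS .
\end{aligned}
```

The second line uses \partial_S(\tfrac12\sigma^2 S^2 w) = \sigma^2 S w + \tfrac12\sigma^2 S^2 w_S, which is why the convection coefficient in the weak form is r - \sigma^2 and not r.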
8. Functional Requirements (FR)
FR-1 Pricing API
Provide an API to compute:
- European call/put price under Black–Scholes
- methods: MC, MC+anti, MC+cv, MC+anti+cv
Inputs:
- S_0, K, r, \sigma, T
- number of samples N
- RNG seed
Outputs:
- price estimate
- standard error
- confidence interval
- runtime metrics
FR-2 Greeks API
Compute:
- Delta and Vega (at minimum)
Outputs:
- estimate + standard error
FR-3 Analytic reference
Compute closed-form Black–Scholes price and Greeks (Delta, Vega) for validation.
FR-4 Experiment runner
CLI to run:
- convergence experiment over N grid (e.g. 1e3…1e7)
- variance reduction comparison
- parameter sweeps over K, \sigma, T
FR-5 Reproducibility
- Deterministic results under fixed seed
- Logs include git commit hash (optional), compiler flags, CPU info (optional), params
9. Non-Functional Requirements (NFR)
NFR-1 Performance
- Baseline MC engine should handle N \ge 10^7 samples in reasonable time on VPS hardware (exact thresholds depend on CPU).
- Vectorization-friendly and allocation-free inner loop.
NFR-2 Numerical correctness
- Pass regression tests comparing to analytic price within a tolerance tied to SE, e.g. |\widehat{V}_N - V_{\text{BS}}| \le 3\,\mathrm{SE} for large N.
NFR-3 Maintainability
- Clear module boundaries: model vs payoff vs engine vs stats.
- Minimal header coupling.
- Consistent naming, docs, and unit tests.
NFR-4 Portability
- Build with CMake on Linux; optional macOS.
- No reliance on proprietary libraries.
10. System Architecture Requirements
10.1 C++ core modules
- models/
  - black_scholes.hpp/.cpp
- payoffs/
  - payoff.hpp
  - call_payoff.hpp/.cpp
  - put_payoff.hpp/.cpp
- mc/
  - mc_engine.hpp (templated or type-erased interface)
  - variance_reduction.hpp (anti/cv)
  - stats.hpp (mean, variance, CI)
- analytics/
  - bs_closed_form.hpp/.cpp (price + Greeks)
- cli/
  - main.cpp experiment runner
10.2 Python layer (optional)
- experiments/
  - convergence.py
  - variance_reduction.py
  - plots.py
Binding options:
- pybind11 wrapper around core functions OR
- run CLI and parse CSV output (simpler and still professional).
11. Step-by-Step Implementation Plan (What to implement, in order)
Step 1 — Deterministic foundation (no variance reduction yet)
Implement
- Black–Scholes terminal sampler: ST(Z)
- Payoffs: call/put
- MC estimator: price
- Stats: mean, sample variance, SE, CI
Verification
- sanity: S_T sample mean close to S_0 e^{rT}
- price monotonicity:
  - call increases with S_0, decreases with K
  - put decreases with S_0, increases with K
Step 2 — Closed-form Black–Scholes
Implement
- \Phi (normal CDF) using std::erfc / std::erf
- call/put analytic price
- analytic Delta/Vega (optional now, required later)
Verification
- compare MC price to analytic for increasing N
- show empirical 1/\sqrt{N} error trend on log-log plot (Python)
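Before plotting, the 1/\sqrt{N} trend can be checked numerically by fitting a log-log slope. The sketch below fits the slope of the standard error against N, since the absolute error of a single run is itself noisy. The function names, the N grid, and the parameter values are illustrative assumptions.

```python
import math
import random


def mc_call_se(s0, k, r, sigma, t, n, seed):
    """Plain MC call price and its standard error."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    total = total_sq = 0.0
    for _ in range(n):
        x = disc * max(s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0)) - k, 0.0)
        total += x
        total_sq += x * x
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(var / n)


def loglog_slope(xs, ys):
    """Least-squares slope of log(ys) against log(xs)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return num / sum((a - mx) ** 2 for a in lx)


def se_convergence(n_grid=(1_000, 10_000, 100_000), seed=0):
    """Fitted slope of SE versus N; the theoretical value is -0.5."""
    ses = [mc_call_se(100.0, 100.0, 0.05, 0.2, 1.0, n, seed + i)[1]
           for i, n in enumerate(n_grid)]
    return loglog_slope(n_grid, ses)
```

A fitted slope near -0.5 is the numerical counterpart of the straight line expected on the log-log plot.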
Step 3 — Antithetic variates
Implement
- paired sampling: evaluate Z and -Z per iteration
- update stats on paired average payoff
Verification
- variance reduction factor > 1 for calls/puts
Step 4 — Control variates
Implement
- compute Y_i = e^{-rT}S_T^{(i)}, with known mean S_0
- estimate \beta^* via sample cov/var
- output both raw and CV prices + variances
Verification
- variance reduction factor consistent across parameters
- show dependence on correlation \rho_{XY}
Step 5 — Greeks (pathwise)
Implement
- Delta estimator for call/put
- Vega estimator for call/put
- SE for Greeks (same sample variance approach)
Verification
- compare to analytic Greeks (Black–Scholes) within SE bands
Step 6 — Performance engineering
Implement
- avoid per-iteration allocations
- optional OpenMP parallelization
- benchmark scaling with threads
Verification
- measure throughput paths/sec
- show speedup and parallel efficiency
Step 7 (Optional) — PDE Solver (FD or FEM)
Implement
- domain truncation S\in[S_{\min},S_{\max}]
- boundary conditions (e.g. call: V(t,0)=0, V(t,S_{\max})\approx S_{\max}-K e^{-r(T-t)})
- time stepping and spatial discretization (Galerkin/FEM or FD)
Verification
- PDE solution matches analytic at t=0
- compare PDE vs MC vs analytic
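As one possible starting point for the FD variant of this step, here is a backward-Euler finite-difference sketch for a call on a uniform grid S_j = j\,\Delta S, using the boundary conditions listed above with S_{\min}=0. The function name, the grid sizes, and the choice S_{\max}=300 are assumptions for illustration; a Crank–Nicolson or FEM version would differ in the matrix assembly.

```python
import math


def fd_call_price(s0, k, r, sigma, t, s_max=300.0, m=300, n_steps=300):
    """Backward-Euler finite-difference price of a European call at S = s0."""
    ds = s_max / m
    dt = t / n_steps
    # Terminal payoff on the grid S_j = j * ds, j = 0..m
    v = [max(j * ds - k, 0.0) for j in range(m + 1)]
    # Tridiagonal system (I - dt * L) V_new = V_old for interior nodes 1..m-1
    lower = [-0.5 * dt * (sigma**2 * j**2 - r * j) for j in range(1, m)]
    diag = [1.0 + dt * (sigma**2 * j**2 + r) for j in range(1, m)]
    upper = [-0.5 * dt * (sigma**2 * j**2 + r * j) for j in range(1, m)]
    size = m - 1
    for step in range(1, n_steps + 1):
        tau = step * dt                        # time to maturity after this step
        v_lo = 0.0                             # V(t, 0) = 0
        v_hi = s_max - k * math.exp(-r * tau)  # V(t, S_max) ~ S_max - K e^{-r(T-t)}
        rhs = v[1:m]
        rhs[0] -= lower[0] * v_lo
        rhs[-1] -= upper[-1] * v_hi
        # Thomas algorithm: forward elimination, then back substitution
        cp = [0.0] * size
        dp = [0.0] * size
        cp[0] = upper[0] / diag[0]
        dp[0] = rhs[0] / diag[0]
        for i in range(1, size):
            denom = diag[i] - lower[i] * cp[i - 1]
            cp[i] = upper[i] / denom
            dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
        x = [0.0] * size
        x[-1] = dp[-1]
        for i in range(size - 2, -1, -1):
            x[i] = dp[i] - cp[i] * x[i + 1]
        v = [v_lo] + x + [v_hi]
    # Linear interpolation of the t = 0 solution at s0
    j = min(int(s0 / ds), m - 1)
    w = (s0 - j * ds) / ds
    return (1.0 - w) * v[j] + w * v[j + 1]
```

Backward Euler is first order in time and unconditionally stable, which makes it a forgiving baseline before moving to Crank–Nicolson.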
12. Testing Strategy
12.1 Unit tests (fast)
- terminal_price(Z) correctness for fixed Z
- payoff correctness at boundary cases
- normal CDF sanity checks
12.2 Numerical regression tests (slower)
- Fix seed and moderate N (e.g. 1e6) and verify price within tolerance window
- Verify CI covers analytic price most of the time (statistical test across runs)
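The coverage test in the last bullet can be sketched as follows: repeat the MC run with different seeds and count how often the 95% interval contains the analytic price. The function names, run count, and sample size are hypothetical; with a finite number of runs the observed fraction will fluctuate around 0.95.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bs_call(s0, k, r, sigma, t):
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def ci_coverage(s0, k, r, sigma, t, n, runs, base_seed=0):
    """Fraction of independent runs whose 95% CI contains the analytic price."""
    target = bs_call(s0, k, r, sigma, t)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    hits = 0
    for run in range(runs):
        rng = random.Random(base_seed + run)  # one seed per run
        total = total_sq = 0.0
        for _ in range(n):
            x = disc * max(s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0)) - k, 0.0)
            total += x
            total_sq += x * x
        mean = total / n
        var = (total_sq - n * mean * mean) / (n - 1)
        half_width = 1.96 * math.sqrt(var / n)
        if abs(mean - target) <= half_width:
            hits += 1
    return hits / runs
```

A regression test should assert a band around 0.95, not an exact value, because the coverage fraction is itself a random quantity.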
12.3 Property tests (optional)
- invariants like put-call parity:
C - P = S_0 - K e^{-rT}.
Use analytic or MC estimates to test parity (MC will have noise).
13. Risk Register (What can go wrong)
- RNG issues: poor seeding or non-determinism in parallel runs.
- Estimator bugs: incorrect discounting, wrong variance formula, wrong control mean.
- Numerical instability: CDF implementation errors for extreme d_1, d_2.
- Performance pitfalls: unnecessary allocations, virtual dispatch in hot loop.
- PDE truncation error: wrong boundary conditions dominate solution.
Mitigation: keep a known parameter set where analytic results are stable; build regression tests early.
14. Acceptance Criteria (Definition of Done)
Baseline completion:
1. MC pricing returns price + SE + CI.
2. Analytic Black–Scholes price implemented.
3. Convergence plot shows MC error ~ N^{-1/2}.
4. Antithetic and control variate implemented with demonstrated variance reduction.
5. Delta and Vega implemented and validated vs analytic Greeks.
6. CLI runs experiments and outputs CSV/JSON for Python plotting.
Advanced completion (optional):
7. OpenMP scaling plot + efficiency.
8. PDE solver cross-checks analytic solution.
9. Python binding or orchestration layer produces publication-quality plots.
15. Implementation Notes (Engineering choices)
15.1 RNG
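- Use std::mt19937_64 + std::normal_distribution<double>. The engine's output sequence is fixed by the C++ standard, but the algorithm inside std::normal_distribution is implementation-defined, so fixed-seed results are reproducible only for a given standard library.
- For parallel runs: per-thread RNG with deterministic seed schedule.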
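The seed-schedule idea can be illustrated in Python before committing to a C++ design. In this sketch each worker gets its own generator seeded from the string "base_seed:worker_id", and partial sums are combined in worker-id order so the result does not depend on which worker finishes first. The function names and the string-seed convention are assumptions; distinct seeds give reproducible streams but are not a formal guarantee of independence between them.

```python
import math
import random


def worker_partial_sum(base_seed, worker_id, n_paths, s0, k, r, sigma, t):
    """Sum of discounted call payoffs from one worker's own RNG stream."""
    rng = random.Random(f"{base_seed}:{worker_id}")  # per-worker seed (assumed scheme)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    total = 0.0
    for _ in range(n_paths):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += disc * max(st - k, 0.0)
    return total


def price_with_workers(base_seed, n_workers, paths_per_worker, order=None,
                       s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0):
    """Price from per-worker streams; `order` simulates the completion order."""
    ids = list(range(n_workers)) if order is None else list(order)
    partial = {w: worker_partial_sum(base_seed, w, paths_per_worker,
                                     s0, k, r, sigma, t) for w in ids}
    # Reduce in fixed worker-id order so floating-point summation is reproducible
    total = sum(partial[w] for w in range(n_workers))
    return total / (n_workers * paths_per_worker)
```

Fixing the reduction order matters because floating-point addition is not associative: the same partial sums added in a different order can differ in the last bits.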
15.2 Hot loop design
- Prefer templates or function objects to avoid virtual calls inside the MC loop.
- Keep payoff polymorphism at the boundary (setup), not inside the innermost loop.
15.3 Data output
- Output CSV with columns:
- method, N, price, se, ci_low, ci_high, runtime_ms, seed, params...
This makes Python plotting trivial.
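On the Python side, reading such a file needs only the standard csv module. The sketch below covers the fixed columns listed above and leaves out the trailing parameter columns, whose names are not specified here. The helper names write_results and read_results are hypothetical, and the writer exists only to make the round trip testable.

```python
import csv
import io

# Fixed columns from the list above; parameter columns are omitted in this sketch.
COLUMNS = ["method", "N", "price", "se", "ci_low", "ci_high", "runtime_ms", "seed"]
FLOAT_COLUMNS = ("price", "se", "ci_low", "ci_high", "runtime_ms")


def write_results(rows, stream):
    """Write result dicts as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


def read_results(stream):
    """Parse CSV rows, converting numeric columns to int or float."""
    parsed = []
    for row in csv.DictReader(stream):
        record = {"method": row["method"], "N": int(row["N"]), "seed": int(row["seed"])}
        record.update({c: float(row[c]) for c in FLOAT_COLUMNS})
        parsed.append(record)
    return parsed
```

A round trip through io.StringIO with a placeholder row is a quick way to confirm that the header and type conversions agree between writer and reader.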
16. Suggested Reading Map (aligned with requirements)
- Black–Scholes theory / derivations: Shreve, Stochastic Calculus for Finance II
- Monte Carlo + variance reduction + Greeks: Glasserman, Monte Carlo Methods in Financial Engineering
- C++ architecture for pricing libraries: Joshi, C++ Design Patterns and Derivatives Pricing
- PDE + FEM track: any numerical PDE/FEM text; treat BS PDE as a parabolic convection–diffusion–reaction equation.
Appendix A — Analytic Greeks (for validation)
Call Delta
\Delta_{\text{call}}=\Phi(d_1).
Put Delta
\Delta_{\text{put}}=\Phi(d_1)-1.
Vega (call = put)
\text{Vega}= S_0 \phi(d_1)\sqrt{T},
where \phi is standard normal PDF:
\phi(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}.
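These formulas can be cross-checked against central finite differences of the closed-form price, which catches sign and factor mistakes early. All function names below are hypothetical, and the step sizes in the check are arbitrary small values.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1(s0, k, r, sigma, t):
    return (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))


def bs_call(s0, k, r, sigma, t):
    d_1 = d1(s0, k, r, sigma, t)
    d_2 = d_1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d_1) - k * math.exp(-r * t) * norm_cdf(d_2)


def call_delta(s0, k, r, sigma, t):
    return norm_cdf(d1(s0, k, r, sigma, t))


def put_delta(s0, k, r, sigma, t):
    return norm_cdf(d1(s0, k, r, sigma, t)) - 1.0


def vega(s0, k, r, sigma, t):
    """Vega is the same for calls and puts."""
    return s0 * norm_pdf(d1(s0, k, r, sigma, t)) * math.sqrt(t)


def central_diff(f, x, h):
    """Second-order central finite difference of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

For example, central_diff(lambda s: bs_call(s, 100.0, 0.05, 0.2, 1.0), 100.0, 1e-3) should agree with call_delta(100.0, 100.0, 0.05, 0.2, 1.0) to several decimal places.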
Appendix B — Put-Call Parity
C - P = S_0 - K e^{-rT}.
Use this as an additional correctness check.
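With MC estimates the parity check is statistical. A sketch, assuming the call and put are priced from the same draws (common random numbers), so that each sample of C - P equals e^{-rT}(S_T - K); the function name and return tuple are illustrative.

```python
import math
import random


def mc_parity_gap(s0, k, r, sigma, t, n, seed=0):
    """MC estimate of C - P from common random numbers.

    Returns (estimate, se, exact) with exact = S_0 - K * exp(-rT).
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    total = total_sq = 0.0
    for _ in range(n):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        call = disc * max(st - k, 0.0)
        put = disc * max(k - st, 0.0)
        diff = call - put  # per-path value of C - P
        total += diff
        total_sq += diff * diff
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(var / n), s0 - k * disc
```

The estimate should fall within a few standard errors of the exact value; a persistent gap larger than that usually points to a discounting or payoff bug.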