# Quant Finance Hybrid Project — Requirements Elicitation + Mathematical Specification (C++ Core, Python Research Layer) **Project codename:** `quant-engine` **Goal:** Build a reproducible, testable, performance-aware pricing library for European options under Black–Scholes, with Monte Carlo (MC) as the primary numerical method, and an optional PDE/FEM track for cross-validation and CSE-grade numerical analysis. --- ## 1. Stakeholders and Intended Use ### 1.1 Primary stakeholder - **You (developer/researcher):** want a sustainable workflow: derivations → implementation → verification → benchmarking → reporting. ### 1.2 Secondary stakeholders (implicit) - **Course staff / graders:** expect numerics rigor (convergence, stability, error bars), clean code structure, and evidence of understanding. - **Recruiters / interviewers (optional):** expect engineering maturity (modular design, tests, profiling) and mathematical correctness. ### 1.3 Use cases 1. Price a European call/put via Monte Carlo with confidence intervals. 2. Compute Greeks (Delta, Vega) with stable estimators. 3. Reduce variance via antithetic + control variates; quantify variance reduction. 4. Benchmark performance (single-thread vs multi-thread; C++ vs Python wrapper). 5. (Optional) Solve Black–Scholes PDE numerically (FD or FEM) and compare to MC and analytic pricing. --- ## 2. Scope and Non-Scope ### 2.1 In-scope (baseline) - Black–Scholes risk-neutral dynamics - European options (call/put) - MC pricing with error estimation - Variance reduction: antithetic variates, control variates - Greeks: Delta, Vega via pathwise derivative - Deterministic validation against closed-form Black–Scholes formula - Clean C++ library with a small CLI runner and Python experiment scripts ### 2.2 In-scope (advanced extensions) - Quasi-MC (Sobol) optional - OpenMP parallel MC - Calibration (implied vol) optional - PDE solver (FD or FEM/Galerkin) optional for cross-validation ### 2.3 Explicit non-scope (for now) - Exotic options requiring full path simulation (barrier/asian) **unless** you later extend - Jump-diffusion, local vol, Heston calibration (future phases) - Full market data infrastructure (DB ingestion, tick-level) --- ## 3. Deliverables ### 3.1 Code deliverables - `libqengine` (C++): pricing + Greeks + variance reduction - `qengine_cli` (C++): command-line program to run experiments - `pyqengine` (Python layer): plotting + experiment orchestration (optional binding via pybind11) - Test suite (unit + numerical regression tests) - Benchmark scripts + reproducible configs ### 3.2 Documentation deliverables - A technical report in Markdown/LaTeX including: - model assumptions - derivations - estimator definitions - error/variance analysis - convergence plots - performance benchmarks - limitations + future work --- ## 4. Mathematical Model Specification ### 4.1 Risk-neutral model (Black–Scholes) Under the risk-neutral measure $\mathbb{Q}$, $$ dS_t = r S_t\,dt + \sigma S_t\,dW_t, $$ where: - $S_t > 0$ is the asset price - $r$ is the continuously-compounded risk-free rate - $\sigma > 0$ is volatility - $W_t$ is standard Brownian motion. ### 4.2 Terminal distribution (exact sampling) Apply Itô’s lemma to $\log S_t$: $$ d(\log S_t) = \left(r - \tfrac12\sigma^2\right)dt + \sigma\,dW_t. $$ Integrating from $0$ to $T$ yields: $$ S_T = S_0 \exp\!\left(\left(r - \tfrac12\sigma^2\right)T + \sigma\sqrt{T}\,Z\right),\quad Z\sim \mathcal{N}(0,1). $$ ### 4.3 Payoffs - European call: $h(S_T)=(S_T-K)^+ = \max(S_T-K,0)$ - European put: $h(S_T)=(K-S_T)^+ = \max(K-S_T,0)$ ### 4.4 Pricing equation (risk-neutral valuation) $$ V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[h(S_T)]. $$ --- ## 5. Numerical Methods Specification (Primary Track: Monte Carlo) ### 5.1 Basic Monte Carlo estimator Define discounted payoff random variable: $$ X = e^{-rT}h(S_T). $$ With i.i.d. samples $Z_i\sim\mathcal{N}(0,1)$, compute $S_T^{(i)}$ and $X_i$, then: $$ \widehat{V}_N = \frac{1}{N}\sum_{i=1}^N X_i. $$ **Properties** - Unbiased: $\mathbb{E}[\widehat{V}_N]=V_0$ - Variance: $\mathrm{Var}(\widehat{V}_N)=\mathrm{Var}(X)/N$ - CLT: $\sqrt{N}(\widehat{V}_N-V_0)\Rightarrow \mathcal{N}(0,\mathrm{Var}(X))$ ### 5.2 Standard error and confidence interval Estimate variance with sample variance: $$ \widehat{\mathrm{Var}}(X)=\frac{1}{N-1}\sum_{i=1}^N (X_i-\overline{X})^2. $$ Standard error of the mean: $$ \mathrm{SE}(\widehat{V}_N)=\sqrt{\widehat{\mathrm{Var}}(X)/N}. $$ Approximate 95% confidence interval: $$ \widehat{V}_N \pm 1.96\,\mathrm{SE}(\widehat{V}_N). $$ ### 5.3 Variance reduction requirements #### 5.3.1 Antithetic variates For each $Z$, also use $-Z$. Define: $$ \widehat{V}^{\text{anti}}_N=\frac{1}{N}\sum_{i=1}^N \frac{X(Z_i)+X(-Z_i)}{2}. $$ **Expected outcome:** lower variance for monotone payoffs (calls/puts) due to negative correlation. #### 5.3.2 Control variates (with known expectation) Choose a control variable $Y$ correlated with $X$ and with known mean. A standard choice: $$ Y=e^{-rT}S_T,\qquad \mathbb{E}[Y]=S_0. $$ Estimator: $$ \widehat{V}^{\text{cv}}=\frac{1}{N}\sum_{i=1}^N \left(X_i - \beta (Y_i-\mathbb{E}[Y])\right). $$ Optimal coefficient: $$ \beta^*=\frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(Y)}. $$ In practice estimate $\beta^*$ from samples. **Acceptance criterion:** demonstrate variance reduction factor empirically, e.g. $$ \frac{\widehat{\mathrm{Var}}(\widehat{V}_N)}{\widehat{\mathrm{Var}}(\widehat{V}^{\text{cv}}_N)} > 1. $$ ### 5.4 Greeks (pathwise derivative) Let $V(S_0,\sigma,r,T)=e^{-rT}\mathbb{E}[h(S_T)]$. #### 5.4.1 Delta Since $S_T = S_0 \exp(\cdots)$, we have: $$ \frac{\partial S_T}{\partial S_0}=\frac{S_T}{S_0}. $$ For a call, $h'(S_T)=\mathbf{1}_{\{S_T>K\}}$ almost everywhere. Thus: $$ \Delta = \frac{\partial V}{\partial S_0} = e^{-rT}\mathbb{E}\left[\mathbf{1}_{\{S_T>K\}}\frac{S_T}{S_0}\right]. $$ MC estimator: $$ \widehat{\Delta}_N=\frac{e^{-rT}}{N}\sum_{i=1}^N \mathbf{1}_{\{S_T^{(i)}>K\}}\frac{S_T^{(i)}}{S_0}. $$ #### 5.4.2 Vega Differentiate: $$ \log S_T = \log S_0 + (r-\tfrac12\sigma^2)T + \sigma\sqrt{T}Z. $$ So: $$ \frac{\partial \log S_T}{\partial \sigma} = -\sigma T + \sqrt{T}Z, \qquad \frac{\partial S_T}{\partial \sigma}=S_T(-\sigma T + \sqrt{T}Z). $$ Hence for a call: $$ \text{Vega} = e^{-rT}\mathbb{E}\left[\mathbf{1}_{\{S_T>K\}}\,S_T(-\sigma T+\sqrt{T}Z)\right]. $$ **Greeks acceptance criterion:** compare MC Greeks against analytic Greeks (Black–Scholes closed form) within statistical error. --- ## 6. Analytical Reference (Black–Scholes Closed-Form) To validate MC, implement closed form pricing. Define: $$ d_1=\frac{\ln(S_0/K)+(r+\tfrac12\sigma^2)T}{\sigma\sqrt{T}},\quad d_2=d_1-\sigma\sqrt{T}. $$ Call: $$ C = S_0\Phi(d_1) - K e^{-rT}\Phi(d_2). $$ Put: $$ P = K e^{-rT}\Phi(-d_2) - S_0\Phi(-d_1). $$ Where $\Phi$ is standard normal CDF. **Acceptance criterion:** MC estimates converge to analytic price; error decreases ~ $O(N^{-1/2})$. --- ## 7. Optional PDE Track (for CSE/Numerics Alignment) ### 7.1 Black–Scholes PDE Option price $V(t,S)$ satisfies: $$ \frac{\partial V}{\partial t} +\frac12\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} +rS\frac{\partial V}{\partial S} -rV=0,\quad V(T,S)=h(S). $$ ### 7.2 Weak form (Galerkin idea) Choose test functions $w(S)$ and integrate over domain $S\in[S_{\min},S_{\max}]$ (truncation required). A typical weak form after multiplying by $w$ and integrating (with integration by parts on the second derivative term) yields a bilinear form involving: - mass term $\int V w \, dS$ - diffusion term $\int \sigma^2 S^2 V_S w_S \, dS$ - convection term $\int r S V_S w \, dS$ - reaction term $\int r V w \, dS$ Time discretization (e.g. backward Euler / Crank–Nicolson) gives linear systems per timestep. **Why include this:** deterministic solver for 1D that lets you cross-check MC and show FEM competence. --- ## 8. Functional Requirements (FR) ### FR-1 Pricing API Provide an API to compute: - European call/put price under Black–Scholes - methods: `MC`, `MC+anti`, `MC+cv`, `MC+anti+cv` Inputs: - $S_0,K,r,\sigma,T$ - number of samples $N$ - RNG seed Outputs: - price estimate - standard error - confidence interval - runtime metrics ### FR-2 Greeks API Compute: - Delta and Vega (at minimum) Outputs: - estimate + standard error ### FR-3 Analytic reference Compute closed-form Black–Scholes price and Greeks (Delta, Vega) for validation. ### FR-4 Experiment runner CLI to run: - convergence experiment over $N$ grid (e.g. 1e3…1e7) - variance reduction comparison - parameter sweeps over $K$, $\sigma$, $T$ ### FR-5 Reproducibility - Deterministic results under fixed seed - Logs include git commit hash (optional), compiler flags, CPU info (optional), params --- ## 9. Non-Functional Requirements (NFR) ### NFR-1 Performance - Baseline MC engine should handle $N \ge 10^7$ samples in reasonable time on VPS hardware (exact thresholds depend on CPU). - Vectorization-friendly and allocation-free inner loop. ### NFR-2 Numerical correctness - Pass regression tests comparing to analytic price within tolerance tied to SE: - e.g. $|\widehat{V}_N - V_{\text{BS}}| \le 3\,\mathrm{SE}$ for large $N$. ### NFR-3 Maintainability - Clear module boundaries: model vs payoff vs engine vs stats. - Minimal header coupling. - Consistent naming, docs, and unit tests. ### NFR-4 Portability - Build with CMake on Linux; optional macOS. - No reliance on proprietary libraries. --- ## 10. System Architecture Requirements ### 10.1 C++ core modules - `models/` - `black_scholes.hpp/.cpp` - `payoffs/` - `payoff.hpp` - `call_payoff.hpp/.cpp` - `put_payoff.hpp/.cpp` - `mc/` - `mc_engine.hpp` (templated or type-erased interface) - `variance_reduction.hpp` (anti/cv) - `stats.hpp` (mean, variance, CI) - `analytics/` - `bs_closed_form.hpp/.cpp` (price + Greeks) - `cli/` - `main.cpp` experiment runner ### 10.2 Python layer (optional) - `experiments/` - `convergence.py` - `variance_reduction.py` - `plots.py` Binding options: - pybind11 wrapper around core functions OR - run CLI and parse CSV output (simpler and still professional). --- ## 11. Step-by-Step Implementation Plan (What to implement, in order) ### Step 1 — Deterministic foundation (no variance reduction yet) **Implement** - Black–Scholes terminal sampler: `ST(Z)` - Payoffs: call/put - MC estimator: price - Stats: mean, sample variance, SE, CI **Verification** - sanity: $S_T$ sample mean close to $S_0 e^{rT}$ - price monotonicity: - call increases with $S_0$, decreases with $K$ - put decreases with $S_0$, increases with $K$ ### Step 2 — Closed-form Black–Scholes **Implement** - $\Phi$ (normal CDF) using `std::erfc`/`std::erf` - call/put analytic price - analytic Delta/Vega (optional now, required later) **Verification** - compare MC price to analytic for increasing $N$ - show empirical $1/\sqrt{N}$ error trend on log-log plot (Python) ### Step 3 — Antithetic variates **Implement** - paired sampling: evaluate $Z$ and $-Z$ per iteration - update stats on paired average payoff **Verification** - variance reduction factor > 1 for calls/puts ### Step 4 — Control variates **Implement** - compute $Y_i = e^{-rT}S_T^{(i)}$, with known mean $S_0$ - estimate $\beta^*$ via sample cov/var - output both raw and CV prices + variances **Verification** - variance reduction factor consistent across parameters - show dependence on correlation $\rho_{XY}$ ### Step 5 — Greeks (pathwise) **Implement** - Delta estimator for call/put - Vega estimator for call/put - SE for Greeks (same sample variance approach) **Verification** - compare to analytic Greeks (Black–Scholes) within SE bands ### Step 6 — Performance engineering **Implement** - avoid per-iteration allocations - optional OpenMP parallelization - benchmark scaling with threads **Verification** - measure throughput paths/sec - show speedup and parallel efficiency ### Step 7 (Optional) — PDE Solver (FD or FEM) **Implement** - domain truncation $S\in[S_{\min},S_{\max}]$ - boundary conditions (e.g. call: $V(t,0)=0$, $V(t,S_{\max})\approx S_{\max}-K e^{-r(T-t)}$) - time stepping and spatial discretization (Galerkin/FEM or FD) **Verification** - PDE solution matches analytic at $t=0$ - compare PDE vs MC vs analytic --- ## 12. Testing Strategy ### 12.1 Unit tests (fast) - `terminal_price(Z)` correctness for fixed Z - payoff correctness at boundary cases - normal CDF sanity checks ### 12.2 Numerical regression tests (slower) - Fix seed and moderate N (e.g. 1e6) and verify price within tolerance window - Verify CI covers analytic price most of the time (statistical test across runs) ### 12.3 Property tests (optional) - invariants like put-call parity: $$ C - P = S_0 - K e^{-rT}. $$ Use analytic or MC estimates to test parity (MC will have noise). --- ## 13. Risk Register (What can go wrong) - **RNG issues:** poor seeding or non-determinism in parallel runs. - **Estimator bugs:** incorrect discounting, wrong variance formula, wrong control mean. - **Numerical instability:** CDF implementation errors for extreme $d_1,d_2$. - **Performance pitfalls:** unnecessary allocations, virtual dispatch in hot loop. - **PDE truncation error:** wrong boundary conditions dominate solution. Mitigation: keep a known parameter set where analytic results are stable; build regression tests early. --- ## 14. Acceptance Criteria (Definition of Done) Baseline completion: 1. MC pricing returns price + SE + CI. 2. Analytic Black–Scholes price implemented. 3. Convergence plot shows MC error ~ $N^{-1/2}$. 4. Antithetic and control variate implemented with demonstrated variance reduction. 5. Delta and Vega implemented and validated vs analytic Greeks. 6. CLI runs experiments and outputs CSV/JSON for Python plotting. Advanced completion (optional): 7. OpenMP scaling plot + efficiency. 8. PDE solver cross-checks analytic solution. 9. Python binding or orchestration layer produces publication-quality plots. --- ## 15. Implementation Notes (Engineering choices) ### 15.1 RNG - Use `std::mt19937_64` + `std::normal_distribution` - For parallel runs: per-thread RNG with deterministic seed schedule. ### 15.2 Hot loop design - Prefer templates or function objects to avoid virtual calls inside the MC loop. - Keep payoff polymorphism at the boundary (setup), not inside the innermost loop. ### 15.3 Data output - Output CSV with columns: - method, N, price, se, ci_low, ci_high, runtime_ms, seed, params... This makes Python plotting trivial. --- ## 16. Suggested Reading Map (aligned with requirements) - **Black–Scholes theory / derivations:** Shreve, *Stochastic Calculus for Finance II* - **Monte Carlo + variance reduction + Greeks:** Glasserman, *Monte Carlo Methods in Financial Engineering* - **C++ architecture for pricing libraries:** Joshi, *C++ Design Patterns and Derivatives Pricing* - **PDE + FEM track:** any numerical PDE/FEM text; treat BS PDE as a parabolic convection–diffusion–reaction equation. --- ## Appendix A — Analytic Greeks (for validation) ### Call Delta $$ \Delta_{\text{call}}=\Phi(d_1). $$ ### Put Delta $$ \Delta_{\text{put}}=\Phi(d_1)-1. $$ ### Vega (call = put) $$ \text{Vega}= S_0 \phi(d_1)\sqrt{T}, $$ where $\phi$ is standard normal PDF: $$ \phi(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}. $$ --- ## Appendix B — Put-Call Parity $$ C - P = S_0 - K e^{-rT}. $$ Use this as an additional correctness check. ---