Borrowing Predictive Strength: Hierarchical Bayes across Managers, Funds, and Deals

10/12/2025

Belief is a team sport. A deal lives inside a fund, which lives inside a manager. Hierarchical Bayes turns that nesting into math so thin data can lean on thicker neighbors without collapsing into one-size-fits-all averages. This post explains the interactives with equations and step-by-step reading guides.

TL;DR

  • Model deal returns inside funds, and funds inside managers.
  • Use partial pooling to shrink noisy estimates toward manager and global anchors.
  • Score predictions on held-out deals to confirm that strength sharing pays.

Quick Bayesian reminder

Updating beliefs is multiplication in disguise: $p(r \mid y) \propto p(y \mid r)\,p(r)$.

  • The prior $p(r)$ encodes what you believed before any evidence.
  • The likelihood $p(y \mid r)$ says how probable the observed data are if $r$ were true.
  • The posterior $p(r \mid y)$ blends the two by comparative uncertainty.

In a hierarchy, priors are built from higher levels, so information flows up and down.
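
To make the update concrete, here is a minimal sketch in Python of the conjugate Normal case the rest of the post relies on; the numbers are illustrative, not taken from the interactives.

```python
import numpy as np

# Conjugate Normal update for a mean: prior N(mu0, tau2), per-deal noise N(0, sigma2).
# Illustrative numbers only.
mu0, tau2 = 2.0, 4.0           # prior mean and variance for the return (%)
sigma2 = 9.0                   # per-deal noise variance
y = np.array([5.1, 3.4, 6.2])  # observed deal returns (%)

n = len(y)
post_var = 1.0 / (1.0 / tau2 + n / sigma2)              # precisions add
post_mean = post_var * (mu0 / tau2 + y.sum() / sigma2)  # precision-weighted blend
print(f"posterior mean {post_mean:.2f}, sd {post_var**0.5:.2f}")
```

The posterior mean lands between the prior mean and the sample mean, weighted by relative precision; the hierarchy below applies this same blend at every level.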


Model in one picture

We use a three-level Normal model:

  • Deal: $r_{d,f,m} \sim \mathcal{N}(\mu_{f,m}, \sigma_d^2)$
  • Fund: $\mu_{f,m} \sim \mathcal{N}(\mu_m, \tau_f^2)$
  • Manager: $\mu_m \sim \mathcal{N}(\mu_0, \tau_m^2)$

Interpretation:

  • $\sigma_d$ is deal noise.
  • $\tau_f$ is dispersion across funds within a manager.
  • $\tau_m$ is dispersion across managers.
  • $\mu_0$ is the global sector anchor.

We simulate a full hierarchy and let you control its knobs. These parameters act both as data-generators and as priors the model uses to share strength.
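
A minimal simulation of that generative story might look like the following; the knob values are hypothetical stand-ins for the sliders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical knob settings mirroring the interactive's sliders.
M, F, D = 6, 6, 12                     # managers, funds per manager, deals per fund
mu0, tau_m, tau_f, sigma_d = 1.2, 2.0, 2.0, 6.0

mu_m = rng.normal(mu0, tau_m, size=M)                     # manager means
mu_f = rng.normal(mu_m[:, None], tau_f, size=(M, F))      # fund means within managers
r = rng.normal(mu_f[..., None], sigma_d, size=(M, F, D))  # deal returns
print(r.shape)  # (managers, funds, deals)
```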

Interactive - Hierarchy controls

[Interactive widget: sliders for the counts of managers, funds per manager, and deals per fund, the scales $\tau_m$, $\tau_f$, $\sigma_d$, and the train fraction. Example readout: hier log score ≈ 1.392, vs no-pool Δ ≈ 0.024, vs complete Δ ≈ 0.077, holdout N = 252.]

These parameters generate a synthetic hierarchy and serve as priors for pooling. Bigger $\tau$ values imply more real dispersion to respect; a bigger $\sigma_d$ implies noisier deals to shrink.

What to change and what happens:

  • Increase managers, funds per manager, or deals per fund to thicken the dataset. Scores should stabilize.
  • Increase $\tau_m$ or $\tau_f$ to encode more true dispersion. Pooling should respect real differences more.
  • Increase $\sigma_d$ to make deals noisier. Shrinkage should increase.
  • Change the train fraction to alter how much data is held out for scoring.

Where the variance actually lives

For a fresh deal drawn from the hierarchy, the unconditional variance decomposes as

$$\mathrm{Var}(r_{\text{new}}) \approx \tau_m^2 + \tau_f^2 + \sigma_d^2.$$

  • If $\tau_m$ dominates, managers differ a lot.
  • If $\tau_f$ dominates, funds differ inside each manager.
  • If $\sigma_d$ dominates, deals are noisy even within a fund (see the check after this list).
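
The shares are just each component over the total. A quick check, with scales chosen so the shares reproduce the example readout in the chart below:

```python
# Variance shares for a fresh deal; scales chosen to reproduce the
# example readout below (13.5% / 8.6% / 77.8%).
tau_m, tau_f, sigma_d = 2.5, 2.0, 6.0
comps = {"manager": tau_m**2, "fund": tau_f**2, "deal": sigma_d**2}
total = sum(comps.values())
print({k: round(100 * v / total, 1) for k, v in comps.items()})
# {'manager': 13.5, 'fund': 8.6, 'deal': 77.8}
```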

Interactive - Variance decomposition

[Chart: "Where uncertainty comes from", the variance decomposition for a fresh deal as percent of total, split into manager ($\tau_m^2$), fund ($\tau_f^2$), and deal ($\sigma_d^2$) components. Example readout: 13.5% / 8.6% / 77.8%.]

How to read the chart:

  • Bars show the percent share of total variance attributed to manager, fund, and deal components. They sum to 100 percent.
  • If the manager bar grows while others shrink, cross-manager dispersion is the main thing to model. Expect bigger gains from manager-level pooling.

Shrinkage: raw to posterior at the fund level

Given a fund with $n_f$ training deals and sample mean $\bar{y}_f$, the posterior mean for the fund-level parameter shrinks toward its manager:

$$\hat{\mu}_f = w_f\,\bar{y}_f + (1 - w_f)\,\hat{\mu}_m, \qquad w_f = \frac{n_f/\sigma_d^2}{n_f/\sigma_d^2 + 1/\tau_f^2}.$$

The manager posterior aggregates fund evidence with its own prior:

$$\mathrm{Var}(\mu_m \mid \text{data}) = \left[\frac{1}{\tau_m^2} + \sum_f \frac{1}{\tau_f^2 + \sigma_d^2/n_f}\right]^{-1},$$

$$\hat{\mu}_m = \mathrm{Var}(\mu_m \mid \text{data}) \left[\frac{\mu_0}{\tau_m^2} + \sum_f \frac{\bar{y}_f}{\tau_f^2 + \sigma_d^2/n_f}\right].$$

Smaller $n_f$ or larger $\sigma_d$ makes $w_f$ smaller, so the fund pulls harder toward the manager anchor.
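
Those two formulas are a few lines of code. A sketch under the known-variance assumptions of this post (function names are my own):

```python
import numpy as np

def manager_posterior(ybar_fs, n_fs, mu0, sigma_d, tau_f, tau_m):
    """Manager posterior mean and variance, aggregating its funds' evidence."""
    prec = 1.0 / (tau_f**2 + sigma_d**2 / np.asarray(n_fs))  # per-fund precision
    var_m = 1.0 / (1.0 / tau_m**2 + prec.sum())
    mean_m = var_m * (mu0 / tau_m**2 + (prec * np.asarray(ybar_fs)).sum())
    return mean_m, var_m

def fund_posterior(ybar_f, n_f, mu_m_hat, sigma_d, tau_f):
    """Partial-pooling posterior mean for one fund, plus its weight w_f."""
    w = (n_f / sigma_d**2) / (n_f / sigma_d**2 + 1.0 / tau_f**2)
    return w * ybar_f + (1.0 - w) * mu_m_hat, w

# Illustrative: with n_f = 9, sigma_d = 6, tau_f = 2, the weight is exactly 0.5,
# matching the weights in the scorecard table later in the post.
mu_m_hat, _ = manager_posterior([9.34, 0.51], [9, 9], 1.2, 6.0, 2.0, 2.0)
print(fund_posterior(9.34, 9, mu_m_hat, 6.0, 2.0))
```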

Interactive - Shrinkage ladder

[Chart: cross-level shrinkage ladder. Each row is a fund (labeled Manager i / Fund j); markers show the raw mean, the hierarchical posterior, and the manager anchor on a quarterly-return (%) axis, sorted by largest shrink. Example readout: avg |shrink| ≈ 1.08 pp.]

Data-poor or high-volatility funds move the most. The triangle shows the manager anchor each fund leans toward.

How to read the chart:

  • Each row is a fund. The circle is the raw mean $\bar{y}_f$. The square is the shrunken posterior $\hat{\mu}_f$. The triangle is its manager anchor $\hat{\mu}_m$.
  • The line between circle and square is the amount of shrinkage. Long lines are data-poor or high-volatility funds.
  • Sorting emphasizes the largest movers. If many long lines all point toward their manager, partial pooling is doing work.

What to try:

  • Increase $\sigma_d$ or decrease deals per fund; lines should lengthen.
  • Increase $\tau_f$; funds get more autonomy, so lines shorten.

Predict a brand-new fund under a known manager

For a new fund under manager $m$ with expected $n_{\text{new}}$ deals, the predictive distribution of its average return $\bar{y}_{\text{new}}$ is

$$\bar{y}_{\text{new}} \mid \text{data} \sim \mathcal{N}\!\left(\hat{\mu}_m,\; \tau_f^2 + \frac{\sigma_d^2}{n_{\text{new}}} + \mathrm{Var}(\mu_m \mid \text{data})\right).$$

Contrast with the global-only baseline that ignores manager identity:

$$\bar{y}_{\text{new}} \mid \text{global} \sim \mathcal{N}\!\left(\hat{\mu}_0,\; \frac{\sigma_d^2}{n_{\text{new}}}\right),$$

where $\hat{\mu}_0$ is the global training mean. Tail odds follow from the Normal cdf, e.g. $P(\bar{y}_{\text{new}} < 0) = \Phi\!\left(-\frac{\mu}{\sigma}\right)$.
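
Putting both predictives side by side takes a few lines; the posterior summaries here are assumed values in the spirit of the widget's badges, not fitted outputs.

```python
import numpy as np
from scipy.stats import norm

# Manager-informed vs global-only predictive for a new fund's average return.
mu_m_hat, var_mu_m = 3.92, 1.1**2   # manager posterior mean and variance (assumed)
mu0_hat = 1.2                       # global training mean (assumed)
tau_f, sigma_d, n_new = 2.0, 6.0, 12

sd_hier = np.sqrt(tau_f**2 + sigma_d**2 / n_new + var_mu_m)
sd_glob = np.sqrt(sigma_d**2 / n_new)

for name, mu, sd in [("hier", mu_m_hat, sd_hier), ("global", mu0_hat, sd_glob)]:
    print(f"{name}: mean {mu:.2f}%, sd {sd:.2f}%, P(<0) {norm.cdf(0, mu, sd):.1%}")
```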

Interactive - New fund predictor

[Chart: global-only vs manager-informed predictive densities for a brand-new fund average $\bar r_{\text{new}}$ (%). Example readout: manager-informed mean ≈ 3.92%, sd ≈ 3.33%; P(r̄_new < 0): hier 12.0% vs global 56.1%.]

Hierarchical prediction respects manager identity and fund dispersion. The global-only predictive is overconfidently narrow when cross-fund spread is real.

How to read the chart:

  • Two curves: manager-informed vs global-only. Means can differ; the manager-informed curve is often a bit wider because it admits cross-fund dispersion and manager uncertainty.
  • The badges report $\mathbb{E}[\bar{y}_{\text{new}}]$ and $\mathrm{sd}(\bar{y}_{\text{new}})$, plus $P(\bar{y}_{\text{new}} < 0)$ under each model.
  • If the manager has strong evidence, the manager-informed mean will move away from the global baseline and the sd may shrink.

What to try:

  • Increase $n_{\text{new}}$; both curves tighten by $\sigma_d^2/n_{\text{new}}$ but only the manager-informed curve keeps $\tau_f^2$ and $\mathrm{Var}(\mu_m \mid \text{data})$.
  • Pick a manager with many strong funds; the manager-informed curve should shift meaningfully relative to global.

Out-of-sample scoring: does strength sharing pay?

We hold out a subset of deals and compare three models:

  • No pool: predict each deal using only its fund training mean.
  • Complete pooling: predict every deal with the single global training mean.
  • Hierarchical: borrow strength across manager and fund structure.

We score with two proper scoring rules:

Average log predictive density

$$\text{LogScore} = \frac{1}{N}\sum_{i=1}^{N} \log p_i(r_i),$$

and the Brier score for the event $r < 0$:

$$\text{Brier} = \frac{1}{N}\sum_{i=1}^{N} \big(p_i - y_i\big)^2, \qquad p_i = P(r_i < 0), \quad y_i = \mathbf{1}\{r_i < 0\}.$$
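
Both scores are straightforward once each holdout deal has a Normal predictive; a sketch (the toy data and forecasts are mine, chosen to show how overconfidence is punished):

```python
import numpy as np
from scipy.stats import norm

def scores(r_test, mu_pred, sd_pred):
    """Avg log predictive density and Brier for the loss event r < 0,
    assuming a Normal predictive per holdout deal."""
    logscore = norm.logpdf(r_test, mu_pred, sd_pred).mean()
    p_loss = norm.cdf(0.0, mu_pred, sd_pred)
    brier = ((p_loss - (r_test < 0)) ** 2).mean()
    return logscore, brier

rng = np.random.default_rng(1)
r_test = rng.normal(1.0, 3.0, size=250)                      # toy holdout returns
print(scores(r_test, np.full(250, 1.0), np.full(250, 3.0)))  # well calibrated
print(scores(r_test, np.full(250, 3.0), np.full(250, 1.5)))  # sharp but off-center
```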

Interactive - Predictive scorecard

[Charts: avg log predictive density on the holdout (higher is better): no pool 1.368, complete 1.315, hierarchical 1.392; Brier score for the event r < 0 (lower is better): no pool 0.226, complete 0.255, hierarchical 0.221. Winner on both: Hierarchical. Holdout N = 252; hier vs complete: +0.077 log, +0.034 Brier improvement.]

Absolute view shows the raw metrics. Delta view re-expresses both metrics so higher is better: for log score, improvement = model - baseline; for Brier, improvement = baseline - model. The dotted line marks zero improvement.

Reading the scorecard

Two metrics, both proper scoring rules:

  1. Avg log predictive density (higher is better).
  2. Brier score for the event $r < 0$ (lower is better).

The menu lets you switch between Absolute values and Delta vs a baseline (Complete or No pool). In Delta view both charts are normalized so that higher is better:

  • Log score improvement = $\text{logscore}_{\text{model}} - \text{logscore}_{\text{baseline}}$.
  • Brier improvement = $\text{Brier}_{\text{baseline}} - \text{Brier}_{\text{model}}$.

The dotted horizontal line marks zero improvement.

What the log score measures

For each holdout deal with realized return $r_i$ and a predictive density $p_i(r)$, the contribution is $\log p_i(r_i)$. The chart displays the average:

$$\frac{1}{N}\sum_{i=1}^{N} \log p_i(r_i).$$

This rewards forecasts that put high probability mass near the truth, and it penalizes overconfidence. Reporting your true predictive distribution maximizes expected score.

What the Brier score is and why we use it

The Brier score evaluates probability forecasts for a binary event. Here the event is a loss: $r < 0$. For each prediction we compute the model's probability $p_i = P(r_i < 0)$ and the realized outcome $y_i = \mathbf{1}\{r_i < 0\}$. The Brier is the mean squared error of those probabilities:

$$\text{Brier} = \frac{1}{N}\sum_{i=1}^{N} (p_i - y_i)^2.$$

Why this matters here:

  • Action relevance. Many portfolio actions hinge on a tail event (loss vs no loss, breach vs no breach). Brier directly scores the quality of those event probabilities, not just point predictions.
  • Calibration sensitive. If your probabilities are systematically off (say you predict 30% loss but it happens 50% of the time), Brier will punish you in proportion to the miscalibration.
  • Proper rule. Like the log score, Brier is proper: your expected score is best when you report your true probability, so it discourages hedged or exaggerated forecasts.
  • Complement to log score. Log score cares about full density shape; Brier isolates the decision boundary. Seeing both gives a fuller picture of sharpness and calibration.

Interpretation tips:

  • Lower Brier is better in Absolute view.
  • In Delta view we flip it to an improvement scale so higher is better: $\text{Brier}_{\text{baseline}} - \text{Brier}_{\text{model}}$.
  • A Brier of 0.25 corresponds to flipping a fair coin for a balanced event; competent models on financial returns should target far lower than that for meaningful edges.

How to diagnose with the charts

  • If No pool beats Complete on both metrics, your cross-fund differences are real and large.
  • If Hierarchical beats both, pooled estimates are balancing variance (No pool) and bias (Complete).
  • If Hierarchical wins on log score but not on Brier, the densities may be sharp but miscalibrated near the $r < 0$ boundary; consider heavier tails or revisiting variance terms.

Implementation notes

  • Work in log-return units for additivity. Convert to percent only for display.
  • Treat $\sigma_d$, $\tau_f$, and $\tau_m$ as learnable scale parameters in production; here they are dials for pedagogy.
  • Posterior predictive variance must include parameter uncertainty. For a new deal, $\mathrm{Var}(r_{\text{new}} \mid \text{data}) \approx \sigma_d^2 + \mathrm{Var}(\mu_f \mid \text{data})$.
  • Unit test edge cases: empty funds, tiny $n_f$, and very large $\tau$ values.

Connecting the dots

  • The variance decomposition explains why partial pooling should help: when $\tau_m^2$ or $\tau_f^2$ is sizable, sharing information reduces estimation error.
  • The shrinkage ladder shows the micro-mechanism that creates the macro win in scores: data-poor funds move toward more reliable anchors, stabilizing out-of-sample predictions.
  • The new fund predictor translates the structure into actionable odds for underwriting or pacing, via $P(\bar{y}_{\text{new}} < 0)$ and $\mathbb{E}[\bar{y}_{\text{new}}]$.

Notes: multi-strategy managers and whether to share parameters

Some managers run multiple strategies—buyout, growth, credit, special opportunities. The question is how much information should flow across strategies.

Model extension. Add a strategy index $s \in \{1, \dots, S\}$ and let funds live inside $(m, s)$:

$$r_{d,f,m,s} \sim \mathcal{N}(\mu_{f,m,s}, \sigma_{d,s}^2), \qquad \mu_{f,m,s} \sim \mathcal{N}(\mu_{m,s}, \tau_{f,s}^2), \qquad \mu_{m,\cdot} \sim \mathcal{N}_S(\mu_{0,\cdot}, \Sigma_m).$$

  • $\sigma_{d,s}$ lets deal noise differ by strategy.
  • $\tau_{f,s}$ lets fund dispersion differ by strategy.
  • $\mu_{m,\cdot}$ is an $S$-vector of manager means across strategies.
  • $\Sigma_m = D\,R\,D$ with $D = \mathrm{diag}(\tau_{m,1}, \dots, \tau_{m,S})$ and $R$ a correlation matrix (e.g., LKJ prior). Correlations in $R$ encode how much a house style carries across strategies; see the construction sketch after this list.
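
A minimal construction of $\Sigma_m$ and a draw of one manager's strategy means; the scales and correlations are placeholders, not fitted values.

```python
import numpy as np

# Sigma_m = D R D: per-strategy scales on the diagonal, correlations in R.
tau_m_s = np.array([2.0, 1.5, 2.5])   # manager-level scales per strategy (assumed)
R = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])       # cross-strategy correlation matrix (assumed)
D = np.diag(tau_m_s)
Sigma_m = D @ R @ D

rng = np.random.default_rng(2)
mu0_s = np.array([1.0, 1.5, 0.8])     # per-strategy global anchors (assumed)
mu_m = rng.multivariate_normal(mu0_s, Sigma_m)  # one manager's strategy means
print(mu_m)
```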

When to keep shared parameters. Keep cross-strategy sharing if posteriors for off-diagonal correlations in $R$ are materially positive and concentrated. A practical rule:

  • If $\mathbb{E}[\rho_{s,s'} \mid \text{data}]$ is moderately large (0.2–0.6) and the 80% interval stays away from 0, keep sharing across $s$ and $s'$.
  • If $\rho_{s,s'}$ is near 0 with wide uncertainty, treat strategies as independent: set $R \approx I$ or separate the models.

What shrinks where. For a new fund in strategy $s$ under manager $m$:

$$\bar y_{\text{new}} \mid \text{data} \sim \mathcal{N}\!\left(\hat{\mu}_{m,s},\ \tau_{f,s}^2 + \frac{\sigma_{d,s}^2}{n_{\text{new}}} + \mathrm{Var}(\mu_{m,s} \mid \text{data})\right).$$

If $R$ has positive correlations, $\hat{\mu}_{m,s}$ benefits from evidence in the other strategies through the multivariate posterior. If strategies are truly distinct, the posterior naturally shuts down that pathway.

Diagnostics to add later.

  • Posterior of $R$ with uncertainty bands.
  • Per-strategy shrinkage weights $w_{f,s} = \dfrac{n_f/\sigma_{d,s}^2}{n_f/\sigma_{d,s}^2 + 1/\tau_{f,s}^2}$.
  • Out-of-sample scorecards stratified by strategy.

Notes: why hierarchical Bayes beats normalization and lasso here

People often normalize returns (demean by sector or vintage, z-score by volatility) or fit a lasso on flattened data. Both moves are helpful, but they do not solve the nesting or the uncertainty accounting.

Normalization limits. Demeaning and z-scoring remove average level and scale, but they do not adaptively shrink noisy group means. They also do not propagate parameter uncertainty into predictions.

Lasso limits. The lasso solves an optimization with a global penalty:

$$\min_{\beta}\ \sum_i (y_i - x_i^\top \beta)^2 + \lambda \lVert \beta \rVert_1,$$

which encourages some coefficients to be exactly zero. That is great for sparse feature selection, not for borrowing strength across nested groups whose reliability varies with sample size.

What the hierarchy does instead. The posterior for a fund mean is a data-weighted average:

$$\hat{\mu}_f = w_f\,\bar y_f + (1 - w_f)\,\hat{\mu}_m, \qquad w_f = \frac{n_f/\sigma_d^2}{n_f/\sigma_d^2 + 1/\tau_f^2}.$$

  • The weight $w_f$ depends on $n_f$ and $\sigma_d^2$: thin and noisy funds shrink more.
  • The manager posterior $\hat{\mu}_m$ itself is a shrinkage estimate that pools across funds, with uncertainty that flows down into $\mathrm{Var}(\mu_f \mid \text{data})$.

Ridge connection, lasso contrast. If you write a random effect $u_f \sim \mathcal{N}(0, \tau_f^2)$, the posterior mean of $u_f$ equals a ridge solution with penalty $\lambda = 1/\tau_f^2$ on the precision-scaled squared errors (equivalently $\sigma_d^2/\tau_f^2$ on the raw squared errors). Hierarchical Bayes learns $\tau_f$ from the data and adjusts shrinkage per group through $n_f$ and $\sigma_d^2$. Lasso has a fixed $\lambda$ that does not care whether a fund had 6 or 60 deals and does not yield full predictive distributions.
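
That equivalence is easy to verify numerically; a sketch under the known-variance assumptions above:

```python
import numpy as np

# Check: the posterior mean of u_f ~ N(0, tau_f^2) equals the ridge solution
# to sum((y_i - u)^2)/sigma_d^2 + u^2/tau_f^2 (illustrative data).
rng = np.random.default_rng(4)
sigma_d, tau_f, n = 6.0, 2.0, 9
y = rng.normal(3.0, sigma_d, size=n)

w = (n / sigma_d**2) / (n / sigma_d**2 + 1.0 / tau_f**2)  # shrinkage weight
u_bayes = w * y.mean()

# Closed-form minimizer of the penalized objective above.
u_ridge = (y.sum() / sigma_d**2) / (n / sigma_d**2 + 1.0 / tau_f**2)
print(np.isclose(u_bayes, u_ridge))  # True
```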

Prediction matters. For a new deal, the hierarchical predictive variance includes both deal noise and parameter uncertainty:

$$\mathrm{Var}(r_{\text{new}} \mid \text{data}) \approx \sigma_d^2 + \mathrm{Var}(\mu_f \mid \text{data}).$$

Normalization and lasso give you a point estimate plus residual variance, but they do not decompose or propagate group-level uncertainty in a principled way.


Notes: from hierarchy to systematic manager factors and a scorecard

The same scaffolding gives you cleaner style estimates and a defendable manager scorecard.

Hierarchical factor model. Let $X_t$ be a set of systematic factors (market, rates, credit, sector) and let $Z_{d,t}$ collect deal-level covariates. For returns indexed by time $t$:

$$r_{d,f,m,s,t} = \alpha_{m,s} + \beta_m^\top X_t + \gamma_f^\top Z_{d,t} + \varepsilon_{d,f,m,s,t},$$

with priors like

$$\alpha_{m,s} \sim \mathcal{N}(\mu_{0,s}, \tau_{\alpha,s}^2), \qquad \beta_m \sim \mathcal{N}(\beta_0, \mathrm{diag}(\tau_\beta^2)), \qquad \gamma_f \sim \mathcal{N}(0, \mathrm{diag}(\tau_\gamma^2)).$$

  • $\alpha_{m,s}$ is a manager-by-strategy intercept that shrinks to the strategy base rate.
  • $\beta_m$ are manager style loadings that shrink to a cross-manager mean.
  • $\gamma_f$ are fund idiosyncratic tilts that shrink to 0.

You can make the loadings dynamic with a random walk if you need time variation:

$$\beta_{m,t} = \beta_{m,t-1} + \eta_{m,t}, \qquad \eta_{m,t} \sim \mathcal{N}(0, Q).$$
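
As a sketch of what that time variation looks like (the dimensions, $Q$, and the starting style vector are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Random-walk manager loadings over T periods and K factors (illustrative only).
T, K = 40, 3
Q = np.diag([0.02, 0.01, 0.015])   # innovation covariance (assumed)
beta = np.zeros((T, K))
beta[0] = [0.9, 0.2, -0.1]         # initial style vector (assumed)
for t in range(1, T):
    beta[t] = beta[t - 1] + rng.multivariate_normal(np.zeros(K), Q)

# Per-period style drift, the stability metric used in the scorecard below.
drift = np.linalg.norm(np.diff(beta, axis=0), axis=1)
print(drift.mean())
```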

Systematic factors for a manager. The posterior of $\beta_m$ is a stable estimate of the manager's systematic tilts. You can:

  • Use $\mathbb{E}[\beta_m \mid \text{data}]$ as the manager's style vector.
  • Track $\beta_{m,t}$ over time for drift.
  • Form a manager factor-mimicking portfolio by regressing fund returns on $X_t$ with the hierarchical prior to stabilize exposures.

Manager scorecard blueprint. Build tiles from posterior and predictive objects:

  • Skill and uncertainty: $\mathbb{E}[\alpha_{m,\cdot}]$, $\mathrm{sd}(\alpha_{m,\cdot})$, and $P(\alpha_{m,\cdot} > 0)$.
  • Cross-fund dispersion: $\mathbb{E}[\tau_{f,\cdot}]$ and a league table of shrinkage weights $w_{f,\cdot}$.
  • Tail risk: $P(r < L)$ for a user threshold $L$ and predictive quantiles for next period.
  • Calibration: average log score and Brier on rolling holdouts.
  • Stability: change in $\beta_m$ over time, e.g., $\lVert \beta_{m,t} - \beta_{m,t-1} \rVert_2$.
  • House style: correlation matrix of $\mu_{m,s}$ across strategies to quantify internal consistency.

Why this is systematic. Shrinkage keeps exposures from overfitting thin histories, and the hierarchical prior aligns managers to a common yardstick. Scores are not just numbers; they are posterior quantities with uncertainty that you can audit and track.

Manager benchmark: a simple map from hierarchy to decisions

We benchmark a manager $m$ against a reference $b$.

  • Today: $b = \text{global}$ (the cross-manager base rate $\mu_0$).
  • Factor-ready: $b = \text{style-matched}$ once you estimate exposures.

1) Skill vs benchmark

Manager skill relative to a benchmark is the difference in pooled means:

$$\alpha_m(b) = \mu_m - \mu_b.$$

We report the posterior mean and a band, plus a credibility number:

$$\Pr[\alpha_m(b) > 0] = \Phi\!\left(\frac{\mu_m - \mu_b}{\mathrm{sd}(\mu_m)}\right).$$

  • Today, $\mu_b = \mu_0$ (global).
  • Factor-ready note: when styles are in play, set $\mu_m \leftarrow \alpha_{m,s} + \beta_m^\top \bar X$ and $\mu_b \leftarrow \mu_{0,s} + \beta_0^\top \bar X$ for your next-period factor view $\bar X$; a one-line computation of the credibility number follows this list.
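
In code the skill tile is two lines; the posterior summaries are illustrative values chosen to match the scorecard readouts below, not fitted outputs.

```python
from scipy.stats import norm

# Skill vs the global benchmark from posterior summaries (illustrative values
# chosen to match the scorecard readouts below).
mu_m, sd_mu_m = 3.92, 1.05   # manager posterior mean and sd (%)
mu_b = 1.20                  # global base rate mu_0 (%)

alpha = mu_m - mu_b
p_pos = norm.cdf(alpha / sd_mu_m)
print(f"alpha = {alpha:.2f}%, Pr[alpha > 0] = {p_pos:.1%}")  # alpha = 2.72%
```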

2) New‑fund odds (underwriting lens)

For a new fund average $\bar r_{\text{new}}$ with $n$ deals, the manager-informed predictive is

$$\bar r_{\text{new}} \mid \text{data} \sim \mathcal{N}\!\left(\mu_m,\; \tau_f^2 + \frac{\sigma_d^2}{n} + \mathrm{Var}(\mu_m)\right).$$

We show the density, shade the tail $r < L$, and display:

  • $\mathbb{E}[\bar r_{\text{new}}] = \mu_m$
  • $\mathrm{sd}(\bar r_{\text{new}}) = \sqrt{\tau_f^2 + \sigma_d^2/n + \mathrm{Var}(\mu_m)}$
  • $\Pr(\bar r_{\text{new}} < L) = \Phi\!\left(\frac{L - \mu_m}{\mathrm{sd}(\bar r_{\text{new}})}\right)$

Factor-ready note: replace $\mu_m$ as above, and optionally add $\bar X^\top \mathrm{Var}(\beta_m)\,\bar X$ to the variance if you want factor-view uncertainty.

3) Pooling anatomy (why the estimates are stable)

Funds shrink toward their manager with precision weights

$$w_f = \frac{n_f/\sigma_d^2}{n_f/\sigma_d^2 + 1/\tau_f^2}, \qquad \hat\mu_f = w_f\,\bar y_f + (1 - w_f)\,\mu_m.$$

A quick method-of-moments (MoM) estimate of within-manager dispersion helps sanity-check the prior:

$$\hat\tau_{f,\text{MoM}}^2 = \max\!\left\{0,\ \mathrm{Var}(\bar y_f) - \mathbb{E}\!\left[\frac{\sigma_d^2}{n_f}\right]\right\}.$$
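
Applied to the raw fund means in the scorecard table below (with $\sigma_d = 6$ and nine deals per fund, both assumptions), this reproduces the estimated $\tau_f$ tile:

```python
import numpy as np

def tau_f_mom(ybar_f, n_f, sigma_d):
    """Method-of-moments estimate of within-manager fund dispersion."""
    ybar_f, n_f = np.asarray(ybar_f), np.asarray(n_f)
    noise = (sigma_d**2 / n_f).mean()             # E[sigma_d^2 / n_f]
    return max(0.0, ybar_f.var(ddof=1) - noise)   # clipped at zero

ybar = [9.34, 4.46, 0.51, 3.86, 2.16, 6.67]  # raw fund means from the table below
n = [9, 9, 9, 9, 9, 9]                       # deals per fund (assumed)
print(np.sqrt(tau_f_mom(ybar, n, sigma_d=6.0)))  # ~2.45, vs prior tau_f = 2.00
```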

4) Calibration vs a baseline

We evaluate the hierarchical predictive on holdouts with two proper scores:

  • Avg log predictive density (higher is better): $\frac{1}{N}\sum_{i=1}^{N} \log p_i(r_i)$.

  • Brier for the event $r < L$ (lower is better): $\frac{1}{N}\sum_{i=1}^{N} (p_i - y_i)^2$, with $p_i = \Pr(r_i < L)$ and $y_i = \mathbf{1}\{r_i < L\}$.

The scorecard also shows Delta vs global for quick benchmarking:

$$\Delta \text{Log} = \text{Log}_m - \text{Log}_{\text{global}}, \qquad \Delta \text{Brier} = \text{Brier}_{\text{global}} - \text{Brier}_m.$$

Why Brier here? Many actions hinge on the loss event. Brier directly tests the quality of $\Pr(r < L)$, punishing miscalibration at the threshold even if the mean looks fine.

5) What the tiles mean at a glance

  • Skill tile: $\alpha_m(\text{global}) = \mu_m - \mu_0$ and $\Pr[\alpha_m > 0]$.
  • Dispersion tile: prior $\tau_f$ vs $\hat\tau_{f,\text{MoM}}$.
  • Underwriting tile: $\Pr(\bar r_{\text{new}} < L)$ with $n$ deals.
  • Calibration tiles: avg log and Brier, plus deltas vs global.
  • Shrinkage table: top $|\hat\mu_f - \bar y_f|$ with weights $w_f$.
Manager scorecard

[Panel: manager-informed predictive density for the new fund average, with threshold $L$ and the tail $\bar r_{\text{new}} < L$ shaded (here $L = 0$). Example readout: alpha ≈ 2.72% ± 1.05%, E[r̄_new] ≈ 3.92%, sd[r̄_new] ≈ 3.10%, P(r̄_new < L) ≈ 10.3%.]

[Panel: cross-fund dispersion $\tau_f$, prior vs manager-specific MoM estimate. Example readout: $\tau_f$ (prior) ≈ 2.00%, $\tau_f$ (estimated) ≈ 2.45%; log score ≈ 1.387, Brier(L) ≈ 0.144, holdout N = 42.]

Shrinkage table (example; shrink = |post − raw|):

Fund   | n | raw (%) | post (%) | weight w | shrink (pp)
Fund 1 | 9 |    9.34 |     6.63 |     0.50 |        2.71
Fund 3 | 9 |    0.51 |     2.21 |     0.50 |        1.71
Fund 6 | 9 |    6.67 |     5.29 |     0.50 |        1.38
Fund 5 | 9 |    2.16 |     3.04 |     0.50 |        0.88
Fund 2 | 9 |    4.46 |     4.19 |     0.50 |        0.27
Fund 4 | 9 |    3.86 |     3.89 |     0.50 |        0.03

Alpha is the manager mean minus the global base rate. The $\tau_f$ tile compares the spread of fund means within this manager to the measurement noise implied by $\sigma_d$ and the fund sample sizes. Calibration uses only holdout deals and the hierarchical predictive; the Brier event is $r < L$. Use the toggle to estimate $\tau_f$ per manager rather than relying on the prior.

Factor-ready checklist

To upgrade the benchmark from global to style‑matched:

  1. Estimate $\beta_m$ and optionally $\beta_{m,t}$ with hierarchical shrinkage.
  2. Swap means in formulas: $\mu_m \rightarrow \alpha_{m,s} + \beta_m^\top \bar X$, $\mu_0 \rightarrow \mu_{0,s} + \beta_0^\top \bar X$.
  3. Optionally add $\bar X^\top \mathrm{Var}(\beta_m)\,\bar X$ to the predictive variance.
  4. Reuse the same tiles and scores. Only the benchmark changed.