`Control(rng="R")` performance

Control(rng="R") drives the fast-S resample loop through pylmrob.r_set_seed plus r_sample_noreplace / r_subsample_nonsingular, so that subset draws are byte-identical to those of robustbase. This page benchmarks its wall-clock cost against the default PCG64 path.

This page is a companion to the auto-generated bench-report.md, which compares pylmrob (PCG64, default) against R itself.

Setup

pylmrob 0.5.16
Cython R RNG kernel enabled (pylmrob._core._r_rng)
Python 3.13.12, NumPy + scipy with system OpenBLAS, Linux x86_64
5 reps per case, median reported
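
The "5 reps, median" protocol can be sketched with a small stdlib-only harness. This is a minimal illustration, not the script that produced the numbers on this page; `median_wall_ms` and `toy_workload` are hypothetical names, and the workload is a stand-in for an actual fit.

```python
import statistics
import time


def median_wall_ms(fn, reps=5):
    """Run fn() `reps` times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(reps):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def toy_workload():
    # Hypothetical stand-in for a model fit; replace with the call under test.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(f"median over 5 reps: {median_wall_ms(toy_workload):.3f} ms")
```

Reporting the median rather than the mean keeps a single slow repetition (for example a cold cache on the first run) from dominating the figure.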

Wall-clock: PCG64 vs R-mode

| case | n | p | PCG64 (ms) | R-mode (ms) | R-mode / PCG64 |
|---|---|---|---|---|---|
| stackloss | 21 | 4 | 27.3 | 9.7 | 0.36x |
| synth_500_5 | 500 | 5 | 42.9 | 80.1 | 1.87x |
| synth_2000_10 | 2000 | 10 | 246.7 | 389.9 | 1.58x |
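
The last column is the R-mode time divided by the PCG64 time, so values below 1 mean R-mode is faster. The short snippet below recomputes it from the two timing columns of the table above; the variable names are illustrative only.

```python
# (case, PCG64 ms, R-mode ms), copied from the table above.
timings = [
    ("stackloss", 27.3, 9.7),
    ("synth_500_5", 42.9, 80.1),
    ("synth_2000_10", 246.7, 389.9),
]

# Ratio > 1 means R-mode is slower than the default PCG64 path.
ratios = {case: r_ms / pcg_ms for case, pcg_ms, r_ms in timings}

for case, ratio in ratios.items():
    print(f"{case}: {ratio:.2f}x")
```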

Control(rng="R") is faster than PCG64 on the small stackloss problem (0.36x), where fixed costs such as the formula parser and PCG64 BitGenerator spin-up dominate; those costs amortise better at scale. On the two larger problems it is ~1.6-1.9x slower. The slowdown comes from:

Forced n_workers=1 (R-mode is sequential; no thread-pool benefit).
Forced engine_c=False (the monolithic Cython engine owns its BitGenerator and can’t share R-mode state).
The R-style subset draw performs a full permutation per attempt (n unif_rand draws), whereas the default Cython path uses Floyd’s combination sampling, which needs only p draws per attempt.
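
The difference in draw counts from the last point can be illustrated with a pure-Python sketch. It is neither R's nor pylmrob's actual code and is not bit-compatible with either: `CountingUniform`, `permutation_draw`, and `floyd_draw` are hypothetical names, and Python's `random.Random` stands in for `unif_rand`. The only purpose is to show that a draw-and-swap permutation consumes n uniforms while Floyd's algorithm consumes p.

```python
import random


class CountingUniform:
    """Wraps random.Random and counts how many uniform draws are consumed."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self.draws = 0

    def unif(self):
        self.draws += 1
        return self._rng.random()


def permutation_draw(n, p, u):
    """Draw-and-swap permutation of range(n); keep the first p entries.

    Consumes n uniform draws regardless of p.
    """
    pool = list(range(n))
    out = []
    remaining = n
    for _ in range(n):
        j = int(remaining * u.unif())  # index into the live part of pool
        out.append(pool[j])
        remaining -= 1
        pool[j] = pool[remaining]  # move the last live element into the hole
    return out[:p]


def floyd_draw(n, p, u):
    """Floyd's algorithm: a uniform p-subset of range(n) from p draws."""
    chosen = set()
    for j in range(n - p, n):
        t = int((j + 1) * u.unif())  # uniform integer in [0, j]
        chosen.add(j if t in chosen else t)
    return sorted(chosen)


if __name__ == "__main__":
    n, p = 200, 5
    u_perm, u_floyd = CountingUniform(1), CountingUniform(1)
    permutation_draw(n, p, u_perm)
    floyd_draw(n, p, u_floyd)
    print(u_perm.draws, u_floyd.draws)  # n draws vs p draws
```

With n = 2000 and p = 10, as in the largest benchmark case, the permutation approach consumes 200 times as many uniforms per attempt as Floyd's algorithm.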

Microbench: draw helpers

The Cython R-RNG kernel (pylmrob._core._r_rng) gives substantial speedups over the pure-Python fallback for the standalone helpers:

| op | pure Python | Cython | speedup |
|---|---|---|---|
| `r_sample_noreplace(n=200, k=200)` × 500 | 133 ms | 1.0 ms | ~130x |
| `r_subsample_nonsingular(n=200, p=5)` × 500 | 162 ms | 1.5 ms | ~108x |

When to use `rng="R"`

Reproducing a fit from an R analysis where someone published set.seed(seed).
Generating reference data for cross-language validation.
Getting tighter agreement with R’s lmrob: rtol ~1.7e-5 on coefficients vs ~1e-3 for PCG64 on the basin-drift cases (see numerical-notes.md).

Not recommended for hot loops on large data: the default rng="PCG64" with threading is faster and statistically just as valid.
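
For readers who want to check agreement against an R fit at the tolerances quoted in this section, a relative-tolerance comparison can be sketched as follows. The helper names and the coefficient values are hypothetical and made up purely for illustration; the sketch assumes no reference coefficient is exactly zero.

```python
def max_rel_diff(estimate, reference):
    """Largest elementwise |estimate - reference| / |reference|."""
    return max(abs(e - r) / abs(r) for e, r in zip(estimate, reference))


def agrees(estimate, reference, rtol):
    """True if every coefficient matches the reference within rtol."""
    return max_rel_diff(estimate, reference) <= rtol


# Hypothetical coefficient vectors, not taken from any real fit.
r_coefs = [2.0, -0.5, 10.0]
py_coefs = [c * (1 + 1e-5) for c in r_coefs]  # off by ~1e-5 relative

print(agrees(py_coefs, r_coefs, rtol=1.7e-5))
print(agrees(py_coefs, r_coefs, rtol=1e-6))
```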

Remaining work

The residual rtol~1e-5 gap to R’s lmrob lives in the resample loop’s “associated scale” computation (R uses find_scale; pylmrob uses _mscale_generic in the Cython kernel). Porting refine_fast_s + find_scale to Cython end-to-end is tracked in plan.md §11.
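
For context, the "associated scale" is an M-scale: the value s that solves sum(rho(r_i / s)) / (n - p) = b. The sketch below is a generic textbook fixed-point iteration for that equation, not the code of find_scale or _mscale_generic. The bisquare constants c = 1.54764 and b = 0.5 (the usual 50%-breakdown choice), the starting value, and the stopping rule are all assumptions made for illustration. Small differences in exactly these details are the kind of thing that can leave a gap of the size described above.

```python
import statistics


def rho_bisquare(u, c=1.54764):
    """Tukey bisquare rho, normalised so that rho(u) = 1 for |u| >= c."""
    if abs(u) >= c:
        return 1.0
    t = 1.0 - (u / c) ** 2
    return 1.0 - t ** 3


def m_scale(resid, p=0, b=0.5, tol=1e-10, max_iter=1000):
    """Solve sum(rho(r_i / s)) / (n - p) = b for s by fixed-point iteration.

    Assumes the median absolute residual is non-zero.
    """
    n = len(resid)
    # Starting value: median absolute residual, rescaled for the normal.
    s = statistics.median(abs(r) for r in resid) / 0.6745
    for _ in range(max_iter):
        avg = sum(rho_bisquare(r / s) for r in resid) / (n - p)
        s_new = s * (avg / b) ** 0.5
        if abs(s_new - s) <= tol * s:  # relative change small enough
            return s_new
        s = s_new
    return s


if __name__ == "__main__":
    resid = [0.1, -0.5, 1.2, -2.0, 0.3, 3.5, -0.8, 0.05]
    print(m_scale(resid))
```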
