# Developing NMath
**TL;DR**: To rewrite R, one must implement `nmath`. `Nmath` is a monstrous library that provides distribution functions, and is the foundation of almost everything R does. It is also extremely difficult to follow, and even harder to translate into a modern language. It was not a fun time.
---
It was a bright, joyful day. The sun was shining. The birds were singing. I was working on my statistics homework and had the great idea to look at what statistics tools were available for this newfangled language I was dabbling in called Rust.
You see, I was a Computer Science major that had decided to double major in Data Science, so I had the mind of a programmer walking into the statistics world. I had started my programming journey in Python and loved its simplicity and robust library choices. School had started to teach me Java, but I decided that I wanted to use a language with more flexibility that didn't look like it came from the early 2000's (sorry not sorry!). Then I had been introduced to C/C++ and learned the power that came with a systems-level language. The control! The speed! The superpowers! The only problem was stupid errors, segfaults, and general confusion that surround any C/C++ project. At this point Rust had started to really take off (this was ca. 2017). I knew that this amazing language that was simple, flexible, safe, and powerful met everything that I had come to love about programming, so I wanted to invest in it's ecosystem.
The only problem was the statistics packages for Rust were non-existent, abandoned, or extremely young. When I see a need unfulfilled, I fill it.
"How hard could it be", I thought.
"It'll be fun", I thought.
"I might learn something", I thought.
**One** of those things were true.
---
Armed with my statistics textbooks, the internet, RStudio, and trusty old Rust I set out after a dream of making a robust, defacto statistics library for Rust that would be everything I dreamed of. It was *insanity*!
At first I tried to manually implement my own versions of linear modeling and plotting using textbooks and internet help. `Plotters` was a massively useful tool for plotting, and combined with the matrix mathematics provided by `nalgebra` I was able to get some really decent results, but found that the answers I was generating were different than R, not significantly so, but enough that I was uncomfortable. I wanted to have "correct" answers, and so I decided to use R's answers as the gold standard for what is correct. This meant I would have to create my library based off of R's source code.
I had to spend some time investigating how R works, and I discovered it's built in layers. Most people will use top-level functions like `lm`, `glm`, or `plot`, but as I discovered, these functions all depend upon other functions within R's `stats` package, which in turn depend on everything provided by `base`. This in turn depends upon a mess of code called `nmath` that sits at the core of R. This foundation layer provides, most notably, all of R's distribution functions like `dbinom`, `qnorm`, `rexp`, etc. Almost every function within R uses distribution functions in some way, so in order to reimplement R I would first have to reimplement `nmath` first.
| R Structure |
| :-----------------------------: |
| other packages (glm) |
| stats (lm), graphics (plot) |
| base (operators like +, -, %*%) |
| nmath (dbinom, qnorm, rexp) |
`Nmath` is a complex beast. It's is a combination of old translations of Fortran into C, countless edge-case handling snippets, and a tangle of C functions that all depend on each other. It is almost impossible to find a singular function that can be implemented without having another function already implemented, and those have *a ton* of lines of code. I knew this would be a complicated job, so I started snooping around for something to ease my future suffering and found the `c2rs` project., a program with the ability to translate C into Rust. It seemed like the perfect solution to my problem. It took some fiddling to get it working, but after discovering nmath could be compiled independently of the rest of R I was able to generate the translated code off of a simple `make` command (`c2rs` is amazing and I highly recommend it!).
The beginning block of the original code for `polygamma.c` looked like this:
```C
#include "nmath.h"
#ifdef MATHLIB_STANDALONE
#include <errno.h>
#endif
/* From R, currently only used for kode = 1, m = 1 : */
void dpsifn(double x, int n, int kode, int m, double *ans, int *nz, int *ierr)
{
const static double bvalues[] = { /* Bernoulli Numbers */
1.00000000000000000e+00,
-5.00000000000000000e-01,
1.66666666666666667e-01,
-3.33333333333333333e-02,
2.38095238095238095e-02,
-3.33333333333333333e-02,
7.57575757575757576e-02,
-2.53113553113553114e-01,
1.16666666666666667e+00,
-7.09215686274509804e+00,
5.49711779448621554e+01,
-5.29124242424242424e+02,
6.19212318840579710e+03,
-8.65802531135531136e+04,
1.42551716666666667e+06,
-2.72982310678160920e+07,
6.01580873900642368e+08,
-1.51163157670921569e+10,
4.29614643061166667e+11,
-1.37116552050883328e+13,
4.88332318973593167e+14,
-1.92965793419400681e+16
};
int i, j, k, nn, np, nx, fn;
double arg, den, eps, fx, rxsq,
s, t, ta, tk, tol, tols, tss, tst,
tt, t1, t2, xdmln, xdmy, xinc,
xm, xmin, xq;
double trm[23], trmr[n_max + 1];
*ierr = 0;
*nz = 0;
if (n < 0 || kode < 1 || kode > 2 || m < 1) {
*ierr = 1;
return;
}
if (x <= 0.) {
/* use Abramowitz & Stegun 6.4.7 "Reflection Formula", p.260
* psi(n, 1-x) + (-1)^(n+1) psi(n, x) = (-1)^n pi (d/dx)^n cot(pi*x) * psi(n, x) = (-1)^n psi(n, 1-x) - pi^{n+1} d_n(pi*x), where d_n(x) := (d/dx)^n cot(x) */ if (x == round(x)) {
/* non-positive integer : +Inf or NaN depends on n */
for(j=0; j < m; j++) /* k = j + n : */
ans[j] = ((j+n) % 2) ? ML_POSINF : ML_NAN;
return;
}
/* This could cancel badly */
dpsifn(1. - x, n, /*kode = */ 1, m, ans, nz, ierr);
/* ans[j] == (-1)^(k+1) / gamma(k+1) * psi(k, 1 - x)
* for j = 0:(m-1) , k = n + j */
/* For now: only work for n in {0,1,..,5} : */ if(n > 5) {
/* not yet implemented for x < 0 and n >= 6 */
*ierr = 4; return;
}
// tt := d_n(pi * x)
x *= M_PI; /* pi * x */
// t := pi^(n+1) * d_n(x) / gamma(n+1) t1 = t2 = s = 1.;
for(k=0, j=k-n; j < m; k++, j++, s = -s) {
/* k == n+j , s = (-1)^k */
t1 *= M_PI;/* t1 == pi^(k+1) */
if(k >= 2)
t2 *= k;/* t2 == k! == gamma(k+1) */
if(j >= 0) /* now using d_k(x) */
ans[j] = s*(ans[j] + t1/t2 * d_n_cot(x, k));
}
/* if (n == 0 && kode == 2) -- nonsense for x < 0 !
* ans[0] += log(x); */ return;
} /* x <= 0 */
/* else : x > 0 */ double xln = log(x);
if(kode == 1 && m == 1) {/* the R case --- for very large x: */
double lrg = 1/(2. * DBL_EPSILON);
if(n == 0 && x * xln > lrg) {
ans[0] = -xln;
return;
}
else if(n >= 1 && x > n * lrg) {
ans[0] = exp(-n * xln)/n; /* == x^-n / n == 1/(n * x^n) */
return;
}
}
nx = imin2(-Rf_i1mach(15), Rf_i1mach(16));/* = 1021 */
const double
r1m5 = Rf_d1mach(5), // = M_LOG10_2 = log10(2) = 0.30103..
r1m4 = Rf_d1mach(4) * 0.5, // = DBL_EPSILON * 0.5 = 2^-53 = 1.110223e-16
wdtol = fmax2(r1m4, 0.5e-18); /* = 2^-53 = 1.11e-16 */
/* elim = approximate exponential over and underflow limit */ double elim = 2.302 * (nx * r1m5 - 3.0);/* = 700.6174... */
const double
rln = fmin2(r1m5 * Rf_i1mach(14), 18.06);// = 0.30103 * 53 = 15.95.. ~= #{decimals}
double fln = fmax2(rln, 3.0) - 3.0; // = 12.95..
const double yint = 3.50 + 0.40 * fln, // = 8.6818..
slope = 0.21 + fln * (0.0006038 * fln + 0.008677); // = 0.4237..
int mm = m;
for(;;) {
nn = n + mm - 1;
fn = nn;
t = (fn + 1) * xln;
/* overflow and underflow test for small and large x */
if (fabs(t) > elim) {
if (t <= 0.0) {
*ierr = 2;
return;
}
}
else {
if (x < wdtol) {
ans[0] = R_pow_di(x, -n-1);
if (mm != 1) {
for(k = 1; k < mm ; k++)
ans[k] = ans[k-1] / x;
}
if (n == 0 && kode == 2)
ans[0] += xln;
return;
}
/* compute xmin and the number of terms of the series, fln+1 */
xm = yint + slope * fn;
int mx = (int)xm + 1;
xmin = mx;
if (n != 0) {
xm = -2.302 * rln - fmin2(0.0, xln);
arg = fmin2(0., xm / n);
eps = exp(arg);
xm = (fabs(arg) < 1.0e-3) ? -arg : 1. - eps;
fln = x * xm / eps;
xm = xmin - x;
if (xm > 7.0 && fln < 15.0)
break;
}
xdmy = x;
xdmln = xln;
xinc = 0.0;
if (x < xmin) {
nx = (int)x;
xinc = xmin - nx;
xdmy = x + xinc;
xdmln = log(xdmy);
}
/* generate w(n+mm-1, x) by the asymptotic expansion */
t = fn * xdmln;
t1 = xdmln + xdmln;
t2 = t + xdmln;
tk = fmax2(fabs(t), fmax2(fabs(t1), fabs(t2)));
if (tk <= elim) /* for all but large x */
goto L10;
}
nz++; /* nz := #{underflows} */
mm--;
ans[mm] = 0.;
if (mm == 0)
return;
} /* end{for()} */
nn = (int)fln + 1;
np = n + 1;
t1 = (n + 1) * xln;
t = exp(-t1);
s = t;
den = x;
for(i=1; i <= nn; i++) {
den += 1.;
trm[i] = pow(den, (double)-np);
s += trm[i];
}
ans[0] = s;
if (n == 0 && kode == 2)
ans[0] = s + xln;
if (mm != 1) { /* generate higher derivatives, j > n */
tol = wdtol / 5.0;
for(j = 1; j < mm; j++) {
t /= x;
s = t;
tols = t * tol;
den = x;
for(i=1; i <= nn; i++) {
den += 1.;
trm[i] /= den;
s += trm[i];
if (trm[i] < tols)
break;
}
ans[j] = s;
}
}
return;
L10:
tss = exp(-t);
tt = 0.5 / xdmy;
t1 = tt;
tst = wdtol * tt;
if (nn != 0)
t1 = tt + 1.0 / fn;
rxsq = 1.0 / (xdmy * xdmy);
ta = 0.5 * rxsq;
t = (fn + 1) * ta;
s = t * bvalues[2];
if (fabs(s) >= tst) {
tk = 2.0;
for(k = 4; k <= 22; k++) {
t = t * ((tk + fn + 1)/(tk + 1.0))*((tk + fn)/(tk + 2.0)) * rxsq;
trm[k] = t * bvalues[k-1];
if (fabs(trm[k]) < tst)
break;
s += trm[k];
tk += 2.;
}
}
s = (s + t1) * tss;
if (xinc != 0.0) {
/* backward recur from xdmy to x */
nx = (int)xinc;
np = nn + 1;
if (nx > n_max) {
*ierr = 3;
return;
}
else {
if (nn==0)
goto L20;
xm = xinc - 1.0;
fx = x + xm;
/* this loop should not be changed. fx is accurate when x is small */
for(i = 1; i <= nx; i++) {
trmr[i] = pow(fx, (double)-np);
s += trmr[i];
xm -= 1.;
fx = x + xm;
}
}
}
ans[mm-1] = s;
if (fn == 0)
goto L30;
/* generate lower derivatives, j < n+mm-1 */
for(j = 2; j <= mm; j++) {
fn--;
tss *= xdmy;
t1 = tt;
if (fn!=0)
t1 = tt + 1.0 / fn;
t = (fn + 1) * ta;
s = t * bvalues[2];
if (fabs(s) >= tst) {
tk = 4 + fn;
for(k=4; k <= 22; k++) {
trm[k] = trm[k] * (fn + 1) / tk;
if (fabs(trm[k]) < tst)
break;
s += trm[k];
tk += 2.;
}
}
s = (s + t1) * tss;
if (xinc != 0.0) {
if (fn == 0)
goto L20;
xm = xinc - 1.0;
fx = x + xm;
for(i=1 ; i<=nx ; i++) {
trmr[i] = trmr[i] * fx;
s += trmr[i];
xm -= 1.;
fx = x + xm;
}
}
ans[mm - j] = s;
if (fn == 0)
goto L30;
}
return;
L20:
for(i = 1; i <= nx; i++)
s += 1. / (x + (nx - i)); /* avoid disastrous cancellation, PR#13714 */
L30:
if (kode != 2) /* always */
ans[0] = s - xdmln;
else if (xdmy != x) {
xq = xdmy / x;
ans[0] = s - log(xq);
}
return;
} /* dpsifn() */
```
The generated code for the `polygamma.rs` snippet looked something like this:
```rust
use ::libc;
extern "C" {
#[no_mangle]
fn cos(_: libc::c_double) -> libc::c_double;
#[no_mangle]
fn sin(_: libc::c_double) -> libc::c_double;
#[no_mangle]
fn exp(_: libc::c_double) -> libc::c_double;
#[no_mangle]
fn log(_: libc::c_double) -> libc::c_double;
#[no_mangle]
fn pow(_: libc::c_double, _: libc::c_double) -> libc::c_double;
#[no_mangle]
fn fabs(_: libc::c_double) -> libc::c_double;
#[no_mangle]
fn round(_: libc::c_double) -> libc::c_double;
#[no_mangle]
fn R_pow_di(_: libc::c_double, _: libc::c_int) -> libc::c_double;
#[no_mangle]
fn Rf_i1mach(_: libc::c_int) -> libc::c_int;
#[no_mangle]
fn Rf_d1mach(_: libc::c_int) -> libc::c_double;
#[no_mangle]
static mut R_NaN: libc::c_double;
#[no_mangle]
static mut R_PosInf: libc::c_double;
#[no_mangle]
fn dcgettext(
__domainname: *const libc::c_char,
__msgid: *const libc::c_char,
__category: libc::c_int,
) -> *mut libc::c_char;
#[no_mangle]
fn Rf_warning(_: *const libc::c_char, _: ...);
/* General Support Functions */
#[no_mangle]
fn Rf_imin2(_: libc::c_int, _: libc::c_int) -> libc::c_int;
#[no_mangle]
fn Rf_fmax2(_: libc::c_double, _: libc::c_double) -> libc::c_double;
#[no_mangle]
fn Rf_fmin2(_: libc::c_double, _: libc::c_double) -> libc::c_double;
}
/* From R, currently only used for kode = 1, m = 1 : */
#[no_mangle]
pub unsafe extern "C" fn Rf_dpsifn(
mut x: libc::c_double,
mut n: libc::c_int,
mut kode: libc::c_int,
mut m: libc::c_int,
mut ans: *mut libc::c_double,
mut nz: *mut libc::c_int,
mut ierr: *mut libc::c_int,
) {
let mut current_block: u64; /* x <= 0 */
static mut bvalues: [libc::c_double; 22] = [
1.00000000000000000e+00f64,
-5.00000000000000000e-01f64,
1.66666666666666667e-01f64,
-3.33333333333333333e-02f64,
2.38095238095238095e-02f64,
-3.33333333333333333e-02f64,
7.57575757575757576e-02f64,
-2.53113553113553114e-01f64,
1.16666666666666667e+00f64,
-7.09215686274509804e+00f64,
5.49711779448621554e+01f64,
-5.29124242424242424e+02f64,
6.19212318840579710e+03f64,
-8.65802531135531136e+04f64,
1.42551716666666667e+06f64,
-2.72982310678160920e+07f64,
6.01580873900642368e+08f64,
-1.51163157670921569e+10f64,
4.29614643061166667e+11f64,
-1.37116552050883328e+13f64,
4.88332318973593167e+14f64,
-1.92965793419400681e+16f64,
];
let mut i: libc::c_int = 0;
let mut j: libc::c_int = 0;
let mut k: libc::c_int = 0;
let mut mm: libc::c_int = 0;
let mut mx: libc::c_int = 0;
let mut nn: libc::c_int = 0;
let mut np: libc::c_int = 0;
let mut nx: libc::c_int = 0;
let mut fn_0: libc::c_int = 0;
let mut arg: libc::c_double = 0.;
let mut den: libc::c_double = 0.;
let mut elim: libc::c_double = 0.;
let mut eps: libc::c_double = 0.;
let mut fln: libc::c_double = 0.;
let mut fx: libc::c_double = 0.;
let mut rln: libc::c_double = 0.;
let mut rxsq: libc::c_double = 0.;
let mut r1m4: libc::c_double = 0.;
let mut r1m5: libc::c_double = 0.;
let mut s: libc::c_double = 0.;
let mut slope: libc::c_double = 0.;
let mut t: libc::c_double = 0.;
let mut ta: libc::c_double = 0.;
let mut tk: libc::c_double = 0.;
let mut tol: libc::c_double = 0.;
let mut tols: libc::c_double = 0.;
let mut tss: libc::c_double = 0.;
let mut tst: libc::c_double = 0.;
let mut tt: libc::c_double = 0.;
let mut t1: libc::c_double = 0.;
let mut t2: libc::c_double = 0.;
let mut wdtol: libc::c_double = 0.;
let mut xdmln: libc::c_double = 0.;
let mut xdmy: libc::c_double = 0.;
let mut xinc: libc::c_double = 0.;
let mut xln: libc::c_double = 0.0f64;
let mut xm: libc::c_double = 0.;
let mut xmin: libc::c_double = 0.;
let mut xq: libc::c_double = 0.;
let mut yint: libc::c_double = 0.;
let mut trm: [libc::c_double; 23] = [0.; 23];
let mut trmr: [libc::c_double; 101] = [0.; 101];
*ierr = 0 as libc::c_int;
if n < 0 as libc::c_int
|| kode < 1 as libc::c_int
|| kode > 2 as libc::c_int
|| m < 1 as libc::c_int
{
*ierr = 1 as libc::c_int;
return;
}
if x <= 0.0f64 {
/* use Abramowitz & Stegun 6.4.7 "Reflection Formula"
* psi(k, x) = (-1)^k psi(k, 1-x) - pi^{n+1} (d/dx)^n cot(x) */ if x == round(x) {
/* non-positive integer : +Inf or NaN depends on n */
j = 0 as libc::c_int;
while j < m {
/* k = j + n : */
*ans.offset(j as isize) = if (j + n) % 2 as libc::c_int != 0 {
R_PosInf
} else {
R_NaN
};
j += 1
}
return;
}
/* This could cancel badly */
Rf_dpsifn(1.0f64 - x, n, 1 as libc::c_int, m, ans, nz, ierr);
/* ans[j] == (-1)^(k+1) / gamma(k+1) * psi(k, 1 - x)
* for j = 0:(m-1) , k = n + j */ /* Cheat for now: only work for m = 1, n in {0,1,2,3} : */ if m > 1 as libc::c_int || n > 3 as libc::c_int {
/* doesn't happen for digamma() .. pentagamma() */
/* not yet implemented */ *ierr = 4 as libc::c_int; /* pi * x */
return;
}
x *= 3.14159265358979323846f64;
if n == 0 as libc::c_int {
tt = cos(x) / sin(x)
} else if n == 1 as libc::c_int {
tt = -(1 as libc::c_int) as libc::c_double / R_pow_di(sin(x), 2 as libc::c_int)
} else if n == 2 as libc::c_int {
tt = 2 as libc::c_int as libc::c_double * cos(x) / R_pow_di(sin(x), 3 as libc::c_int)
} else if n == 3 as libc::c_int {
tt = -(2 as libc::c_int) as libc::c_double
* (2 as libc::c_int as libc::c_double * R_pow_di(cos(x), 2 as libc::c_int) + 1.0f64)
/ R_pow_di(sin(x), 4 as libc::c_int)
} else {
/* can not happen! */
tt = R_NaN
}
/* end cheat */
s = if n % 2 as libc::c_int != 0 {
-1.0f64
} else {
1.0f64
}; /* s = (-1)^n */
/* t := pi^(n+1) * d_n(x) / gamma(n+1) , where * d_n(x) := (d/dx)^n cot(x)*/ s = 1.0f64;
t2 = s;
t1 = t2;
k = 0 as libc::c_int;
j = k - n;
while j < m {
/* k == n+j , s = (-1)^k */
t1 *= 3.14159265358979323846f64; /* t1 == pi^(k+1) */
if k >= 2 as libc::c_int {
t2 *= k as libc::c_double
} /* t2 == k! == gamma(k+1) */
if j >= 0 as libc::c_int {
/* by cheat above, tt === d_k(x) */
*ans.offset(j as isize) = s * (*ans.offset(j as isize) + t1 / t2 * tt)
}
k += 1;
j += 1;
s = -s
}
if n == 0 as libc::c_int && kode == 2 as libc::c_int {
/* unused from R, but "wrong": xln === 0 :*/
*ans.offset(0 as libc::c_int as isize) += xln
}
return;
}
/* else : x > 0 */
*nz = 0 as libc::c_int;
xln = log(x);
if kode == 1 as libc::c_int && m == 1 as libc::c_int {
/* the R case --- for very large x: */
let mut lrg: libc::c_double =
1 as libc::c_int as libc::c_double / (2.0f64 * 2.2204460492503131e-16f64); /* == x^-n / n == 1/(n * x^n) */
if n == 0 as libc::c_int && x * xln > lrg {
*ans.offset(0 as libc::c_int as isize) = -xln; /* = 1021 */
return;
} else {
if n >= 1 as libc::c_int && x > n as libc::c_double * lrg {
*ans.offset(0 as libc::c_int as isize) =
exp(-n as libc::c_double * xln) / n as libc::c_double; /* 1.11e-16 */
return;
}
}
}
mm = m;
nx = Rf_imin2(-Rf_i1mach(15 as libc::c_int), Rf_i1mach(16 as libc::c_int));
r1m5 = Rf_d1mach(5 as libc::c_int);
r1m4 = Rf_d1mach(4 as libc::c_int) * 0.5f64;
wdtol = Rf_fmax2(r1m4, 0.5e-18f64);
/* elim = approximate exponential over and underflow limit */
elim = 2.302f64 * (nx as libc::c_double * r1m5 - 3.0f64); /* = 700.6174... */
loop {
nn = n + mm - 1 as libc::c_int; /* end{for()} */
fn_0 = nn;
t = (fn_0 + 1 as libc::c_int) as libc::c_double * xln;
/* overflow and underflow test for small and large x */
if fabs(t) > elim {
if t <= 0.0f64 {
*nz = 0 as libc::c_int;
*ierr = 2 as libc::c_int;
return;
}
} else {
if x < wdtol {
*ans.offset(0 as libc::c_int as isize) = R_pow_di(x, -n - 1 as libc::c_int);
if mm != 1 as libc::c_int {
k = 1 as libc::c_int;
while k < mm {
*ans.offset(k as isize) = *ans.offset((k - 1 as libc::c_int) as isize) / x;
k += 1
}
}
if n == 0 as libc::c_int && kode == 2 as libc::c_int {
*ans.offset(0 as libc::c_int as isize) += xln
}
return;
}
/* compute xmin and the number of terms of the series, fln+1 */
rln = r1m5 * Rf_i1mach(14 as libc::c_int) as libc::c_double;
rln = Rf_fmin2(rln, 18.06f64);
fln = Rf_fmax2(rln, 3.0f64) - 3.0f64;
yint = 3.50f64 + 0.40f64 * fln;
slope = 0.21f64 + fln * (0.0006038f64 * fln + 0.008677f64);
xm = yint + slope * fn_0 as libc::c_double;
mx = xm as libc::c_int + 1 as libc::c_int;
xmin = mx as libc::c_double;
if n != 0 as libc::c_int {
xm = -2.302f64 * rln - Rf_fmin2(0.0f64, xln);
arg = xm / n as libc::c_double;
arg = Rf_fmin2(0.0f64, arg);
eps = exp(arg);
xm = 1.0f64 - eps;
if fabs(arg) < 1.0e-3f64 {
xm = -arg
}
fln = x * xm / eps;
xm = xmin - x;
if xm > 7.0f64 && fln < 15.0f64 {
current_block = 11865390570819897086;
break;
}
}
xdmy = x;
xdmln = xln;
xinc = 0.0f64;
if x < xmin {
nx = x as libc::c_int;
xinc = xmin - nx as libc::c_double;
xdmy = x + xinc;
xdmln = log(xdmy)
}
/* generate w(n+mm-1, x) by the asymptotic expansion */
t = fn_0 as libc::c_double * xdmln; /* underflow */
t1 = xdmln + xdmln;
t2 = t + xdmln;
tk = Rf_fmax2(fabs(t), Rf_fmax2(fabs(t1), fabs(t2)));
if tk <= elim {
current_block = 4510563271777997454;
break;
}
}
nz = nz.offset(1);
mm -= 1;
*ans.offset(mm as isize) = 0.0f64;
if mm == 0 as libc::c_int {
return;
}
}
match current_block {
11865390570819897086 => {
nn = fln as libc::c_int + 1 as libc::c_int;
np = n + 1 as libc::c_int;
t1 = (n + 1 as libc::c_int) as libc::c_double * xln;
t = exp(-t1);
s = t;
den = x;
i = 1 as libc::c_int;
while i <= nn {
den += 1.0f64;
trm[i as usize] = pow(den, -np as libc::c_double);
s += trm[i as usize];
i += 1
}
*ans.offset(0 as libc::c_int as isize) = s;
if n == 0 as libc::c_int && kode == 2 as libc::c_int {
*ans.offset(0 as libc::c_int as isize) = s + xln
}
if mm != 1 as libc::c_int {
/* generate higher derivatives, j > n */
tol = wdtol / 5.0f64;
j = 1 as libc::c_int;
while j < mm {
t /= x;
s = t;
tols = t * tol;
den = x;
i = 1 as libc::c_int;
while i <= nn {
den += 1.0f64;
trm[i as usize] /= den;
s += trm[i as usize];
if trm[i as usize] < tols {
break;
}
i += 1
}
*ans.offset(j as isize) = s;
j += 1
}
}
return;
}
_ =>
/* for all but large x */
{
tss = exp(-t);
tt = 0.5f64 / xdmy;
t1 = tt;
tst = wdtol * tt;
if nn != 0 as libc::c_int {
t1 = tt + 1.0f64 / fn_0 as libc::c_double
}
rxsq = 1.0f64 / (xdmy * xdmy);
ta = 0.5f64 * rxsq;
t = (fn_0 + 1 as libc::c_int) as libc::c_double * ta;
s = t * bvalues[2 as libc::c_int as usize];
if fabs(s) >= tst {
tk = 2.0f64;
k = 4 as libc::c_int;
while k <= 22 as libc::c_int {
t = t
* ((tk + fn_0 as libc::c_double + 1 as libc::c_int as libc::c_double)
/ (tk + 1.0f64))
* ((tk + fn_0 as libc::c_double) / (tk + 2.0f64))
* rxsq;
trm[k as usize] = t * bvalues[(k - 1 as libc::c_int) as usize];
if fabs(trm[k as usize]) < tst {
break;
}
s += trm[k as usize];
tk += 2.0f64;
k += 1
}
}
s = (s + t1) * tss;
if xinc != 0.0f64 {
/* backward recur from xdmy to x */
nx = xinc as libc::c_int;
np = nn + 1 as libc::c_int;
if nx > 100 as libc::c_int {
*nz = 0 as libc::c_int;
*ierr = 3 as libc::c_int;
return;
} else if nn == 0 as libc::c_int {
current_block = 10489131089047693169;
} else {
xm = xinc - 1.0f64;
fx = x + xm;
/* this loop should not be changed. fx is accurate when x is small */
i = 1 as libc::c_int;
while i <= nx {
trmr[i as usize] = pow(fx, -np as libc::c_double);
s += trmr[i as usize];
xm -= 1.0f64;
fx = x + xm;
i += 1
}
current_block = 6893286596494697181;
}
} else {
current_block = 6893286596494697181;
}
match current_block {
6893286596494697181 => {
*ans.offset((mm - 1 as libc::c_int) as isize) = s;
if fn_0 == 0 as libc::c_int {
current_block = 9437230201039677552;
} else {
/* generate lower derivatives, j < n+mm-1 */
j = 2 as libc::c_int; /* avoid disastrous cancellation, PR#13714 */
loop {
if !(j <= mm) {
current_block = 5085513901186096404;
break;
}
fn_0 -= 1;
tss *= xdmy;
t1 = tt;
if fn_0 != 0 as libc::c_int {
t1 = tt + 1.0f64 / fn_0 as libc::c_double
}
t = (fn_0 + 1 as libc::c_int) as libc::c_double * ta;
s = t * bvalues[2 as libc::c_int as usize];
if fabs(s) >= tst {
tk = (4 as libc::c_int + fn_0) as libc::c_double;
k = 4 as libc::c_int;
while k <= 22 as libc::c_int {
trm[k as usize] = trm[k as usize]
* (fn_0 + 1 as libc::c_int) as libc::c_double
/ tk;
if fabs(trm[k as usize]) < tst {
break;
}
s += trm[k as usize];
tk += 2.0f64;
k += 1
}
}
s = (s + t1) * tss;
if xinc != 0.0f64 {
if fn_0 == 0 as libc::c_int {
current_block = 10489131089047693169;
break;
}
xm = xinc - 1.0f64;
fx = x + xm;
i = 1 as libc::c_int;
while i <= nx {
trmr[i as usize] = trmr[i as usize] * fx;
s += trmr[i as usize];
xm -= 1.0f64;
fx = x + xm;
i += 1
}
}
*ans.offset((mm - j) as isize) = s;
if fn_0 == 0 as libc::c_int {
current_block = 9437230201039677552;
break;
}
j += 1
}
match current_block {
9437230201039677552 => {}
10489131089047693169 => {}
_ => return,
}
}
}
_ => {}
}
match current_block {
10489131089047693169 => {
i = 1 as libc::c_int;
while i <= nx {
s += 1.0f64 / (x + (nx - i) as libc::c_double);
i += 1
}
}
_ => {}
}
if kode != 2 as libc::c_int {
/* always */
*ans.offset(0 as libc::c_int as isize) = s - xdmln
} else if xdmy != x {
xq = xdmy / x;
*ans.offset(0 as libc::c_int as isize) = s - log(xq)
}
return;
}
};
}
/* dpsifn() */
```
We go from one complicated set of code (in C) to another (in Rust) that is arguably *more* complicated. It didn't compile and it required quite a lot of unsafe, strictly typed, yucky, un-Rustaceous code. After looking at some of these files to figure out what `c2rs` had done to the code, I noticed that quite a bit of it was just extra pieces I could remove or parts I could translate standard Rust. Specific variable typing (like `libc::c_int`) could be translated into their R types (i.e. `i32`) or in many cases removed altogether. Most of the `extern` dependencies were actually calls into mathematical functions (like `cos`) that Rust provides via methods on its `f64` type, so those were easily translated, and the remaining calls could be turned into `use`'s. So, I built a program (lovingly and creatively called `refactor` because I'm great at names) to go through each of the generated files and remove the excess parts. In fairness, I probably spent just as much time on this whole process as I would have going line-by-line reimplementing the original source code instead.
After a great deal of fiddling, fixing, replacing, and editing the code later I was able to get a build that would actually ... well, build, which for Rust means that what I had written at least made sense on *some* level. The only problem was, the answers it was generating were not correct. After all of that work, I had failed to actually reimplement nmath. It was honestly quite devastating and I had to take a break for a few months.
---
![Inner Peace](../../posts-img/developing-nmath/inner-peace.jpg)
After I had rested and had a fresh set of eyes I came back and started to painstakingly create tests, compare results, and print debug every part of nmath, slowly bringing my code more and more inline with what R's nmath was told me I was supposed to have. Through this process I also was able to remove all the `unsafe` blocks as I wasn't doing any crazy C stuff, and that led to much greater stability
The hardest files were those that had `goto` statements, as Rust does not have a direct analog for a `goto`. Adding insult to injury, `c2rs` decided to translate these into very strange `loop`'s with `match` statements that matched based on a numbered block value (you can see this in the Rust code above at line 302). It wasn't until recently (a few weeks ago) I actually decided to fix these into a more rust-like function call setup, where each `goto` statement is actually split into its own function with a shared `state` variable that is passed through each function call. Prior to this these functions were just a haven for buggy behavior and unexpected responses.
After all the fixes polygamma looks like this:
```rust
use num_traits::{Float, FloatConst};
use strafe_consts::LOG10_2;
use strafe_type::{FloatConstraint, Positive64, PositiveInteger64, Real64};
use crate::traits::DPQ;
#[derive(Clone, Debug, Default)]
struct DPsiState {
i: usize,
j: isize,
k: usize,
nn: usize,
np: usize,
nx: usize,
_fn: usize,
arg: f64,
den: f64,
eps: f64,
fx: f64,
rxsq: f64,
s: f64,
t: f64,
ta: f64,
tk: f64,
tol: f64,
tols: f64,
tss: f64,
tst: f64,
tt: f64,
t1: f64,
t2: f64,
xdmln: f64,
xdmy: f64,
xinc: f64,
xm: f64,
xmin: f64,
xq: f64,
trm: Vec<f64>,
trmr: Vec<f64>,
exit: bool,
}
pub fn dpsi<R: Into<Real64>>(
x: R,
n: usize,
kode: i32,
m: usize,
ans: &mut [Real64],
nz: &mut i32,
ierr: &mut i32,
) {
let mut x = x.into().unwrap();
let mut state = DPsiState::default();
state.trm = vec![0.0; 23];
state.trmr = vec![0.0; n_max + 1];
let bvalues = [
/* Bernoulli Numbers */
1.00000000000000000e+00,
-5.00000000000000000e-01,
1.66666666666666667e-01,
-3.33333333333333333e-02,
2.38095238095238095e-02,
-3.33333333333333333e-02,
7.57575757575757576e-02,
-2.53113553113553114e-01,
1.16666666666666667e+00,
-7.09215686274509804e+00,
5.49711779448621554e+01,
-5.29124242424242424e+02,
6.19212318840579710e+03,
-8.65802531135531136e+04,
1.42551716666666667e+06,
-2.72982310678160920e+07,
6.01580873900642368e+08,
-1.51163157670921569e+10,
4.29614643061166667e+11,
-1.37116552050883328e+13,
4.88332318973593167e+14,
-1.92965793419400681e+16,
];
*ierr = 0;
*nz = 0;
if kode < 1 || kode > 2 || m < 1 {
*ierr = 1;
return;
}
if x <= 0.0 {
/* use Abramowitz & Stegun 6.4.7 "Reflection Formula", p.260
* psi(n, 1-x) + (-1)^(n+1) psi(n, x) = (-1)^n pi (d/dx)^n cot(pi*x) * psi(n, x) = (-1)^n psi(n, 1-x) - pi^{n+1} d_n(pi*x), where d_n(x) := (d/dx)^n cot(x) */ if x == x.round() {
/* non-positive integer : +Inf or NaN depends on n */
state.j = 0;
while state.j < m as isize {
/* k = j + n : */
ans[state.j as usize] = if (state.j as usize + n) % 2 != 0 {
f64::infinity().into()
} else {
f64::nan().into()
};
state.j += 1;
}
return;
}
/* This could cancel badly */
dpsi(1.0 - x, n, /*kode = */ 1, m, ans, nz, ierr);
/* ans[j] == (-1)^(k+1) / gamma(k+1) * psi(k, 1 - x)
* for j = 0:(m-1) , k = n + j */
/* For now: only work for n in {0,1,..,5} : */ if n > 5 {
/* not yet implemented for x < 0 and n >= 6 */
*ierr = 4;
return;
}
// tt := d_n(pi * x)
x *= f64::PI(); /* pi * x */
// t := pi^(n+1) * d_n(x) / gamma(n+1) state.t1 = 1.0;
state.t2 = 1.0;
state.s = 1.0;
state.k = 0;
state.j = state.k as isize - n as isize;
while state.j < m as isize {
/* k == n+j , s = (-1)^k */
state.t1 *= f64::PI(); /* t1 == pi^(k+1) */
if state.k >= 2 {
state.t2 *= state.k as f64; /* t2 == k! == gamma(k+1) */
}
if state.j >= 0 {
/* now using d_k(x) */
ans[state.j as usize] = (state.s
* (ans[state.j as usize].unwrap() + state.t1 / state.t2 * d_n_cot(x, state.k)))
.into();
}
state.k += 1;
state.j += 1;
state.s = -state.s;
}
/* if (n == 0 && kode == 2) -- nonsense for x < 0 !
* ans[0] += log(x); */ return;
} /* x <= 0 */
/* else : x > 0 */ let xln = x.ln();
if kode == 1 && m == 1 {
/* the R case --- for very large x: */
let lrg = 1.0 / (2.0 * f64::epsilon());
if n == 0 && x * xln > lrg {
ans[0] = (-xln).into();
return;
} else if n >= 1 && x > n as f64 * lrg {
ans[0] = ((-(n as f64) * xln).exp() / n as f64).into(); /* == x^-n / n == 1/(n * x^n) */
return;
}
}
state.nx = (-f64::MIN_EXP).min(f64::MAX_EXP) as usize; /* = 1021 */
let r1m5 = LOG10_2; // = M_LOG10_2 = log10(2) = 0.30103..
let r1m4 = f64::EPSILON * 0.5; // = DBL_EPSILON * 0.5 = 2^-53 = 1.110223e-16
let wdtol = r1m4.max(0.5e-18); /* = 2^-53 = 1.11e-16 */
/* elim = approximate exponential over and underflow limit */ let elim = 2.302 * (state.nx as f64 * r1m5 - 3.0); /* = 700.6174... */
let rln = (r1m5 * f64::MANTISSA_DIGITS as f64).min(18.06); // = 0.30103 * 53 = 15.95.. ~= #{decimals}
let mut fln = rln.max(3.0) - 3.0; // = 12.95..
let yint = 3.50 + 0.40 * fln; // = 8.6818..
let slope = 0.21 + fln * (0.0006038 * fln + 0.008677); // = 0.4237..
let mut mm = m;
loop {
state.nn = n + mm - 1;
state._fn = state.nn;
state.t = (state._fn + 1) as f64 * xln;
/* overflow and underflow test for small and large x */
if state.t.abs() > elim {
if state.t.abs() <= 0.0 {
*ierr = 2;
return;
}
} else {
if x < wdtol {
ans[0] = x.pow_di(-(n as isize) - 1).into();
if mm != 1 {
state.k = 1;
while state.k < mm {
ans[state.k] = (ans[state.k - 1].unwrap() / x).into();
state.k += 1;
}
}
if n == 0 && kode == 2 {
ans[0] = (ans[0].unwrap() + xln).into();
}
return;
}
/* compute xmin and the number of terms of the series, fln+1 */
state.xm = yint + slope * state._fn as f64;
let mx = state.xm as usize + 1;
state.xmin = mx as f64;
if n != 0 {
state.xm = -2.302 * rln - 0.0.min(xln);
state.arg = 0.0.min(state.xm / n as f64);
state.eps = state.arg.exp();
state.xm = if state.arg.abs() < 1.0e-3 {
-state.arg
} else {
1.0 - state.eps
};
fln = x * state.xm / state.eps;
state.xm = state.xmin - x;
if state.xm > 7.0 && fln < 15.0 {
break;
}
}
state.xdmy = x;
state.xdmln = xln;
state.xinc = 0.0;
if x < state.xmin {
state.nx = x as usize;
state.xinc = state.xmin - state.nx as f64;
state.xdmy = x + state.xinc;
state.xdmln = state.xdmy.ln();
}
/* generate w(n+mm-1, x) by the asymptotic expansion */
state.t = state._fn as f64 * state.xdmln;
state.t1 = state.xdmln + state.xdmln;
state.t2 = state.t + state.xdmln;
state.tk = state.t.abs().max(state.t1.abs().max(state.t2.abs()));
if state.tk <= elim {
/* for all but large x */
l10(&mut state, x, kode, ans, ierr, &bvalues, wdtol, mm);
if state.exit {
return;
}
}
}
*nz += 1; /* nz := #{underflows} */
mm -= 1;
ans[mm] = 0.0.into();
if mm == 0 {
return;
}
} /* end{for()} */
state.nn = fln as usize + 1;
state.np = n + 1;
state.t1 = (n + 1) as f64 * xln;
state.t = (-state.t1).exp();
state.s = state.t;
state.den = x;
state.i = 1;
while state.i <= state.nn {
state.den += 1.0;
state.trm[state.i] = state.den.powf(-(state.np as f64));
state.s += state.trm[state.i];
state.i += 1;
}
ans[0] = state.s.into();
if n == 0 && kode == 2 {
ans[0] = (state.s + xln).into();
}
if mm != 1 {
/* generate higher derivatives, j > n */
state.tol = wdtol / 5.0;
state.j = 1;
while state.j < mm as isize {
state.t /= x;
state.s = state.t;
state.tols = state.t * state.tol;
state.den = x;
state.i = 1;
while state.i <= state.nn {
state.den += 1.0;
state.trm[state.i] /= state.den;
state.s += state.trm[state.i];
if state.trm[state.i] < state.tols {
break;
}
state.i += 1;
}
ans[state.j as usize] = state.s.into();
state.j += 1;
}
}
}
fn l10(
state: &mut DPsiState,
x: f64,
kode: i32,
ans: &mut [Real64],
ierr: &mut i32,
bvalues: &[f64],
wdtol: f64,
mm: usize,
) {
state.tss = (-state.t).exp();
state.tt = 0.5 / state.xdmy;
state.t1 = state.tt;
state.tst = wdtol * state.tt;
if state.nn != 0 {
state.t1 = state.tt + 1.0 / state._fn as f64;
}
state.rxsq = 1.0 / (state.xdmy * state.xdmy);
state.ta = 0.5 * state.rxsq;
state.t = (state._fn + 1) as f64 * state.ta;
state.s = state.t * bvalues[2];
if state.s.abs() >= state.tst {
state.tk = 2.0;
state.k = 4;
while state.k <= 22 {
state.t = state.t
* ((state.tk + state._fn as f64 + 1.0) / (state.tk + 1.0))
* ((state.tk + state._fn as f64) / (state.tk + 2.0))
* state.rxsq;
state.trm[state.k] = state.t * bvalues[state.k - 1];
if (state.trm[state.k]).abs() < state.tst {
break;
}
state.s += state.trm[state.k];
state.tk += 2.0;
state.k += 1;
}
}
state.s = (state.s + state.t1) * state.tss;
if state.xinc != 0.0 {
/* backward recur from xdmy to x */
state.nx = state.xinc as usize;
state.np = state.nn + 1;
if state.nx > n_max {
*ierr = 3;
state.exit = true;
return;
} else {
if state.nn == 0 {
l20(state, x, kode, ans);
if state.exit {
return;
}
}
state.xm = state.xinc - 1.0;
state.fx = x + state.xm;
/* this loop should not be changed. fx is accurate when x is small */
state.i = 1;
while state.i <= state.nx {
state.trmr[state.i] = state.fx.powf(-(state.np as f64));
state.s += state.trmr[state.i];
state.xm -= 1.0;
state.fx = x + state.xm;
state.i += 1;
}
}
}
ans[mm - 1] = state.s.into();
if state._fn == 0 {
l30(state, x, kode, ans);
if state.exit {
return;
}
}
/* generate lower derivatives, j < n+mm-1 */
state.j = 2;
while state.j <= mm as isize {
state._fn -= 1;
state.tss *= state.xdmy;
state.t1 = state.tt;
if state._fn != 0 {
state.t1 = state.tt + 1.0 / state._fn as f64;
}
state.t = (state._fn + 1) as f64 * state.ta;
state.s = state.t * bvalues[2];
if state.s.abs() >= state.tst {
state.tk = (4 + state._fn) as f64;
state.k = 4;
while state.k <= 22 {
state.trm[state.k] = state.trm[state.k] * (state._fn + 1) as f64 / state.tk;
if state.trm[state.k].abs() < state.tst {
break;
}
state.s += state.trm[state.k];
state.tk += 2.0;
state.k += 1;
}
}
state.s = (state.s + state.t1) * state.tss;
if state.xinc != 0.0 {
if state._fn == 0 {
l20(state, x, kode, ans);
if state.exit {
return;
}
}
state.xm = state.xinc - 1.0;
state.fx = x + state.xm;
state.i = 1;
while state.i <= state.nx {
state.trmr[state.i] = state.trmr[state.i] * state.fx;
state.s += state.trmr[state.i];
state.xm -= 1.0;
state.fx = x + state.xm;
state.i += 1;
}
}
ans[(mm as isize - state.j) as usize] = state.s.into();
if state._fn == 0 {
l30(state, x, kode, ans);
if state.exit {
return;
}
}
state.j += 1;
}
state.exit = true;
}
fn l20(state: &mut DPsiState, x: f64, kode: i32, ans: &mut [Real64]) {
state.i = 1;
while state.i <= state.nx {
state.s += 1.0 / (x + (state.nx as f64 - state.i as f64)); /* avoid disastrous cancellation, PR#13714 */
state.i += 1
}
l30(state, x, kode, ans)
}
fn l30(state: &mut DPsiState, x: f64, kode: i32, ans: &mut [Real64]) {
if kode != 2 {
/* always */
ans[0] = (state.s - state.xdmln).into();
} else if state.xdmy != x {
state.xq = state.xdmy / x;
ans[0] = (state.s - state.xq.ln()).into();
}
state.exit = true;
}
```
Admittedly still hard to follow (for me at least), but way better than before, and something I can actually walk through with some debug printing!
---
The worst file to work on though, was `toms708.c`. This behemoth of a file was Fortran written in the 1990's and later translated into 2,300 lines of C that has a massive amount of poorly labelled alphabet-soup variables, cryptic function names (like `bup` and `bgrat`), `goto` statements up the whazoo, and the general screams of many of my nightmares all rolled into one neat little terror package. I fixed literally every single other file in `nmath` before even taking a look at `toms708`. I even updated the rest of the codebase to a newer version of R before looking at `toms708`. This thing scared me.
**And it should have scared me more.**
I spent weeks working on `toms708`, going line by line through everything, translating, doing the function-state trick on the `goto`'s, fixing it's error handling to be more like proper Rust, and using everything I learned about being a good Rust programmer, but it wasn't enough. After going through all 2,300 lines *it didn't compile*. And after I got it to compile *it crashed*. And after it crashed *it didn't give the correct answers*. And I quit for a few more months.
---
![Inner Peace](../../posts-img/developing-nmath/finally-inner-peace.gif)
(Sorry, just watched Kung Fu Panda 4 in the theaters and I've got it on the brain)
After coming back I started over *again* on `toms708` with another rewrite using the tricks I'd picked up and just letting some things go (like the not-so-great error handling system) and ... it just worked (minus a couple of minor typos and bugs). I have no idea what was wrong before, but it's not wrong anymore! Honestly, at this point I can't be bothered to look over it again because it'll probably give me nightmares. After a few hundred coverage tests and countless bugfixes I feel confident that my translation of nmath is *mostly* correct, and now that nmath is done I can use it to do *actual statistics*!!! Crazy!
---
So conclusion: `nmath` is a complicated monster made by people a whole lot smarter than me. I can't hold a candle to them. Trying to get this code translated taught me an incredible amount about C/C++, R, Rust, and coding in general. It was difficult, obtuse, and probably didn't need to be rewritten in Rust since Rust does have C FFI, but I do feel better knowing I can provide a pure, safe Rust crate with all the guarantees that come with that (regarding memory and the such).
---
> Author: Neek-sss
> URL: https://neek-sss.gitlab.io/yarsh-technologies-blog/posts/developing-nmath/