Strafe 0.1.1 Release

This is it, the first release of strafe, the Rust statistics library based on the R programming language! I’ve been working on this project since Fall of 2018, and it has finally reached a minimum viable product.

Included in This Release

This release focuses on foundational calculations (like StatisticalSlice and Distribution), a first pass at linear models (LeastSquaresRegression, RobustLinearRegression, GeneralizedLinearRegression, and RankFit, a rank-based alternative to linear regression), and a basic plotting system (based on plotters).

Here is a quick demonstration of what strafe is currently capable of:

Code:

use std::error::Error;

use polars::prelude::ChunkCompare;
use strafe_datasets::iris;
use strafe_plot::{
    model_plot::ModelPlot,
    prelude::{IntoDrawingArea, SVGBackend},
};
use strafe_tests::two_way::LeastSquaresRegressionBuilder;
use strafe_trait::ModelBuilder;
use strafe_type::ModelMatrix;

fn main() -> Result<(), Box<dyn Error>> {
    // Load the bundled iris dataset as a polars DataFrame and print it.
    let iris = iris()?;

    println!("{iris}");

    // Keep only the rows where Species is "setosa".
    let mask = iris.column("Species")?.equal("setosa")?;
    let setosa = iris.filter(&mask)?;

    // Predictor: sepal width, with null entries dropped.
    let x = setosa
        .column("Sepal.Width")?
        .f64()?
        .to_vec()
        .iter()
        .flatten()
        .cloned()
        .collect::<Vec<_>>();
    // Response: sepal length, with null entries dropped.
    let y = setosa
        .column("Sepal.Length")?
        .f64()?
        .to_vec()
        .iter()
        .flatten()
        .cloned()
        .collect::<Vec<_>>();

    // Fit Sepal.Length against Sepal.Width by least squares.
    let mut model = LeastSquaresRegressionBuilder::new()
        .with_x(&ModelMatrix::from(x))
        .with_y(&ModelMatrix::from(y))
        .build();

    // Print the model summary (residuals, coefficients, tests).
    println!("{model}");

    // Draw the fit plot to fit.svg using the default plot options.
    let root = SVGBackend::new("fit.svg", (1024, 768)).into_drawing_area();
    model.plot_fit(&root, &Default::default());

    Ok(())
}

Output:

shape: (150, 5)
┌──────────────┬─────────────┬──────────────┬─────────────┬───────────┐
│ Sepal.Length ┆ Sepal.Width ┆ Petal.Length ┆ Petal.Width ┆ Species   │
│ ---          ┆ ---         ┆ ---          ┆ ---         ┆ ---       │
│ f64          ┆ f64         ┆ f64          ┆ f64         ┆ str       │
╞══════════════╪═════════════╪══════════════╪═════════════╪═══════════╡
│ 5.1          ┆ 3.5         ┆ 1.4          ┆ 0.2         ┆ setosa    │
│ 4.9          ┆ 3.0         ┆ 1.4          ┆ 0.2         ┆ setosa    │
│ 4.7          ┆ 3.2         ┆ 1.3          ┆ 0.2         ┆ setosa    │
│ 4.6          ┆ 3.1         ┆ 1.5          ┆ 0.2         ┆ setosa    │
│ 5.0          ┆ 3.6         ┆ 1.4          ┆ 0.2         ┆ setosa    │
│ …            ┆ …           ┆ …            ┆ …           ┆ …         │
│ 6.7          ┆ 3.0         ┆ 5.2          ┆ 2.3         ┆ virginica │
│ 6.3          ┆ 2.5         ┆ 5.0          ┆ 1.9         ┆ virginica │
│ 6.5          ┆ 3.0         ┆ 5.2          ┆ 2.0         ┆ virginica │
│ 6.2          ┆ 3.4         ┆ 5.4          ┆ 2.3         ┆ virginica │
│ 5.9          ┆ 3.0         ┆ 5.1          ┆ 1.8         ┆ virginica │
└──────────────┴─────────────┴──────────────┴─────────────┴───────────┘
Residuals:
┌─────────┬──────────────┬────────┬──────────────┬─────────┐
│ Minimum │ 1st Quantile │ Median │ 3rd Quantile │ Maximum │
╞═════════╪══════════════╪════════╪══════════════╪═════════╡
│ -0.5247 │ -0.1628      │ 0.0216 │ 0.1383       │ 0.4442  │
└─────────┴──────────────┴────────┴──────────────┴─────────┘

Coefficients:
┌────┬──────────┬─────────────────────────┬─────────────────────────┬─────────┬────────────┐
│    │ Estimate │ Confidence Interval (L) │ Confidence Interval (U) │ T-Value │ P-Value    │
╞════╪══════════╪═════════════════════════╪═════════════════════════╪═════════╪════════════╡
│ x0 │ 2.6390   │ 1.5566                  │ 3.7213                  │ 8.5125  │ 3.7424e-11 │
│ x1 │ 0.6904   │ 0.3766                  │ 1.0043                  │ 7.6807  │ 6.7098e-10 │
└────┴──────────┴─────────────────────────┴─────────────────────────┴─────────┴────────────┘

Tests:
┌──────────────────────────────┬───────────┬────────────┬────────┐
│                              │ Statistic │ P-Value    │ Alpha  │
╞══════════════════════════════╪═══════════╪════════════╪════════╡
│ Multiple R-squared (Robust)  │ 0.9977    │ 0.0022     │ 0.1500 │
│ Significance of Regression   │ 58.993    │ 6.7098e-10 │ 0.05   │
│ Shapiro-Wilk Normal Residual │ 0.9805    │ 0.5733     │ 0.05   │
└──────────────────────────────┴───────────┴────────────┴────────┘
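
For readers wondering what the x0 and x1 rows in the Coefficients table represent: they appear to be the intercept and the slope of the fitted line. The sketch below computes those two numbers by hand for a single predictor using the textbook least-squares formulas. It is not strafe's implementation, and the helper name ols_fit is made up for illustration; it only uses the standard library.

```rust
/// Fit y = b0 + b1 * x by ordinary least squares (hypothetical helper,
/// not part of strafe). Returns None if the inputs are empty, differ in
/// length, or x has no variance (the slope is undefined in that case).
fn ols_fit(x: &[f64], y: &[f64]) -> Option<(f64, f64)> {
    if x.is_empty() || x.len() != y.len() {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;

    // Sums of cross-products and squares of the centered data.
    let mut sxy = 0.0;
    let mut sxx = 0.0;
    for (xi, yi) in x.iter().zip(y) {
        sxy += (xi - mean_x) * (yi - mean_y);
        sxx += (xi - mean_x).powi(2);
    }
    if sxx == 0.0 {
        return None;
    }

    let b1 = sxy / sxx; // slope
    let b0 = mean_y - b1 * mean_x; // intercept
    Some((b0, b1))
}

fn main() {
    // The first five setosa rows from the table printed above.
    let sepal_width = [3.5, 3.0, 3.2, 3.1, 3.6];
    let sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0];

    match ols_fit(&sepal_width, &sepal_length) {
        Some((b0, b1)) => println!("intercept = {b0:.4}, slope = {b1:.4}"),
        None => println!("x has no variance; the slope is undefined"),
    }
}
```

With only five rows the numbers will not match the table above, which is fit on all 50 setosa observations.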

Fit Plot

Note

You CANNOT build strafe on the current nightly compiler. The last nightly it builds on is nightly-2024-02-05. This is due to pathfinder-simd, an indirect dependency pulled in via plotters, which requires stdsimd, and that is no longer provided. I’m hoping it will be fixed and updated soon; otherwise I might have to switch away from plotters, but that remains to be seen.
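
One way to stay on that nightly is to pin it per project with a rust-toolchain.toml file at the crate root, which rustup picks up automatically. This is a suggestion on my part rather than something strafe ships with:

```toml
# rust-toolchain.toml (placed next to Cargo.toml)
[toolchain]
channel = "nightly-2024-02-05"
```

Running rustup override set nightly-2024-02-05 in the project directory should have the same effect without adding a file.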

Future Plans

  • Heatmaps
  • Barcharts
  • Stats functions
  • Mass functions
  • cars library
  • Tighter polars integration