Introduction
matten is a developer-experience-first multidimensional array (tensor) library
for Rust — the family car for small numerical and data-exploration
proof-of-concept work.
Maturity labels in this book — such as production-ready — describe stability within that scope, not performance or scale.
matten optimizes for time to a runnable PoC, not benchmark leadership.
This book is organized by reader:
- New users — philosophy and a quick start.
- Reference — the rules that shape the public API.
- Contributors — project layout, milestones, and process.
This documentation tracks the current 0.28 family, which moves the
matten-ndarray bridge to ndarray 0.17 (RFC-062), on top of the completed companion-maturity line: matten-ndarray is production-ready, and matten-mlprep and matten-data are production-ready candidates.
Philosophy
matten optimizes for time to a runnable PoC, not benchmark leadership.
- One primary type. You work through Tensor; no generic dtype parameters and no
  visible lifetimes in ordinary code.
- Predictable, readable failures. Convenience APIs panic with actionable
  messages; boundaries return Result.
- Start now, optimize later. When a prototype becomes performance-critical,
  hand matten’s flat data to a specialized crate such as ndarray, nalgebra, or candle.
matten is intentionally not a full dataframe engine, an ML framework, or a
GPU/sparse/distributed array library.
Quick start
#![allow(unused)]
fn main() {
use matten::Tensor;
let a = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
assert_eq!(a.shape(), &[2, 2]);
assert_eq!(a.ndim(), 2);
}
Install the lean core only:
matten = { version = "0.28", default-features = false }
Start here
This is the recommended learning path for matten.
Numeric tensors
If your data is already clean numeric values, follow these examples in order:
| Step | Example | What you learn |
|---|---|---|
| 1 | cargo run --example 00_quickstart | Create, add, reshape |
| 2 | cargo run --example 01_create_tensor | All construction APIs |
| 3 | cargo run --example 02_shape_and_size | Shape inspection |
| 4 | cargo run --example 04_elementwise_ops | Element-wise arithmetic |
| 5 | cargo run --example 06_broadcasting | NumPy-style broadcasting |
| 6 | cargo run --example 08_slicing_builder | Slice builder API |
| 7 | cargo run --example 22_matrix_multiplication | dot / matmul |
| 8 | cargo run --example 27_axis_reductions | Row/column reductions |
| 9 | cargo run --example 12_boundary_error_handling | Safe error handling |
After these nine examples you understand the numeric core.
Dynamic ingestion: messy data with dynamic
If your input has missing values, mixed types, or dirty CSV/JSON:
| Step | Example | What you learn |
|---|---|---|
| 1 | cargo run --example dynamic_00_quickstart --features dynamic,json,csv | Dynamic lifecycle |
| 2 | cargo run --example dynamic_02_missing_values --features dynamic,csv | Missing values |
| 3 | cargo run --example dynamic_05_dirty_csv_cleanup --features dynamic,csv | Dirty CSV |
| 4 | cargo run --example dynamic_07_on_ramp_summary --features dynamic | Full on-ramp |
| 5 | cargo run --example dynamic_06_numeric_policy --features dynamic | Conversion policy |
The lifecycle rule
Always follow this pattern with dynamic data:
messy input
→ ingest as dynamic tensor (from_json_dynamic / from_csv_dynamic)
→ inspect (schema_summary, numeric_mask, count_none)
→ clean (fill_none, forward_fill_none)
→ convert (try_numeric / try_numeric_with)
→ numeric tensor computation (&a + &b, matmul, sum_axis, …)
Never call arithmetic, reductions, or slicing on a dynamic tensor directly —
those APIs reject dynamic tensors with a clear message directing you to
try_numeric() first.
When to graduate from matten
matten is the family car: easy to start, honest about its limits. When you
need performance, static shapes, or advanced linear algebra, see
Migration to specialised libraries.
Examples index
All matten examples live in examples/. They are grouped by purpose.
Core examples (numeric Tensor)
These examples demonstrate the default matten API. No extra features required.
| File | What it shows |
|---|---|
| 00_quickstart.rs | First look: create, add, reshape |
| 01_create_tensor.rs | All construction APIs |
| 02_shape_and_size.rs | Shape inspection |
| 03_reshape_flatten.rs | Reshape and flatten |
| 04_elementwise_ops.rs | Element-wise arithmetic |
| 05_scalar_ops.rs | Scalar multiplication and division |
| 06_broadcasting.rs | NumPy-style broadcasting |
| 07_transpose_swap_axes.rs | Axis permutation |
| 08_slicing_builder.rs | Slice builder API (canonical) |
| 09_slice_str.rs | String slice API (convenience) |
| 10_json_roundtrip.rs | JSON serialization round-trip |
| 11_csv_numeric_loading.rs | Numeric CSV loading |
| 12_boundary_error_handling.rs | Handling errors at data boundaries |
| 13_resource_limits.rs | MattenLimits, try_zeros/try_ones/try_full |
| 14_concatenate_stack.rs | Shape composition: concatenate and stack (RFC-039) |
| 15_norm_trace_outer.rs | Linalg core-lite: norm, trace, outer (RFC-041) |
| 16_variance_std.rs | Statistics core-lite: var/std, var_axis/std_axis (RFC-040) |
Math examples
| File | What it shows |
|---|---|
| 20_dot_product.rs | Vector dot product |
| 21_matrix_vector_product.rs | Matrix × vector |
| 22_matrix_multiplication.rs | Matrix × matrix |
| 23_sum_mean.rs | Whole-tensor and axis reductions |
| 24_min_max.rs | Min and max with NaN policy |
| 25_normalize_vector.rs | L2 normalisation |
| 26_cosine_similarity.rs | Cosine similarity |
| 27_axis_reductions.rs | Axis reductions and NaN propagation |
| 28_column_statistics.rs | Per-column statistics workflow |
Applied problems (famous small math)
Recognizable small math / numerical-computing problems, used to show what a
Tensor can represent. These live in a fresh 30+ band so the core suite above
stays stable. Write-ups: Beginner applied math,
Matrix iteration,
Numerical methods, and ML-like.
| File | What it shows |
|---|---|
| 30_magic_square_checker.rs | Row/column/diagonal sums via get |
| 31_fibonacci_matrix_power.rs | Fibonacci via repeated matmul |
| 32_graph_path_counting.rs | Walk counting via adjacency-matrix powers |
| 33_markov_chain_weather.rs | Distribution over time via vector × matrix matmul |
| 34_tiny_pagerank.rs | PageRank power iteration via matrix × vector matmul |
| 35_linear_regression_gradient_descent.rs | Batch gradient descent via matmul + transpose |
| 36_heat_equation_1d.rs | Explicit finite-difference stencil as matmul iteration |
| 37_kmeans_small.rs | Lloyd’s k-means on a [points, features] data matrix |
| 38_nearest_neighbor_classification.rs | 1-NN classification over a labeled data matrix |
| 39_finite_difference_derivative.rs | Central-difference derivative on a linspace grid |
| 40_trapezoidal_integration.rs | Trapezoidal rule via linspace + elementwise + sum |
Vector distance and cosine similarity are already covered above — see
54_pairwise_distance.rs, 25_normalize_vector.rs, and 26_cosine_similarity.rs
rather than a duplicate in this band.
Practical numeric recipes (50_–56_)
Common data-processing patterns that combine multiple primitives. See Practical numeric recipes for the full write-up.
| File | What it shows |
|---|---|
| 50_rowwise_scoring.rs | Row-wise weighted scoring |
| 51_standardize_columns.rs | Column standardisation (z-score) |
| 52_minmax_scaling.rs | Min-max feature scaling |
| 53_gram_matrix.rs | Gram matrix (X × Xᵀ) |
| 54_pairwise_distance.rs | Pairwise Euclidean distances |
| 55_moving_average.rs | Simple moving average |
| 56_rolling_windows_basic.rs | Rolling window sum and max |
Dynamic examples (--features dynamic)
These require the dynamic feature for heterogeneous data ingestion. JSON and CSV are
equal on-ramps here: from_json_dynamic and from_csv_dynamic differ only in the input
format — both land messy data in a dynamic tensor that the same inspect → clean → convert
workflow turns into a numeric Tensor.
| File | Features | What it shows |
|---|---|---|
| dynamic_00_quickstart.rs | dynamic,json,csv | Dynamic lifecycle overview |
| dynamic_01_mixed_elements.rs | dynamic | Mixed Element types |
| dynamic_02_missing_values.rs | dynamic,csv | Missing value detection |
| dynamic_03_fill_none.rs | dynamic | Filling missing values |
| dynamic_04_numeric_coercion.rs | dynamic | Element-level coercion |
| dynamic_05_dirty_csv_cleanup.rs | dynamic,csv | Real-world CSV cleanup |
| dynamic_06_numeric_policy.rs | dynamic | NumericPolicy API |
| dynamic_07_on_ramp_summary.rs | dynamic | Complete on-ramp workflow |
| dynamic_08_json_ingestion.rs | dynamic,json | JSON ingestion (mixed/missing → clean f64) |
Companion crate examples
These live in each companion crate’s own examples/ directory, not in core
matten. See Companion crate examples for the write-up.
| Crate | Example | What it shows |
|---|---|---|
| matten-ndarray | from_arrayd, to_arrayd | ArrayD ↔ Tensor interop (copies, shape-preserving) |
| matten-mlprep | mlprep_standardize_columns, mlprep_minmax_scale, mlprep_add_bias_column, mlprep_train_test_split | Small deterministic preprocessing |
| matten-data | csv_to_tensor | CSV → clean → numeric Tensor (production-ready candidate) |
Running examples
# Numeric core (no features needed):
cargo run --example 00_quickstart
cargo run --example 27_axis_reductions
# Dynamic:
cargo run --example dynamic_06_numeric_policy --features dynamic
cargo run --example dynamic_07_on_ramp_summary --features dynamic,csv
cargo run --example dynamic_08_json_ingestion --features dynamic,json
Scope rule
Every example demonstrates accepted APIs only. Examples are not a back door for adding new mathematical operations, dataframe behavior, or ML scope.
Beginner applied math
A small set of recognizable math problems that show what a matten::Tensor can
represent and how short vector/matrix algorithms look in matten. They use only
the default numeric Tensor API — no extra features, no external crates, and small
hard-coded inputs with stable output.
These examples are teaching examples, not a production algorithm package. They sit
in a 30+ filename band so the established 00_–28_ suite stays untouched.
Examples
30_magic_square_checker.rs
Difficulty: Beginner. Checks whether a square matrix is a magic square — every
row, column, and both diagonals share one sum. Demonstrates 2-D Tensor::new,
shape, and element access with get(&[row, col]). Uses the classic 3×3 Lo Shu
square (magic constant 15).
cargo run --example 30_magic_square_checker
Source: 30_magic_square_checker.rs
31_fibonacci_matrix_power.rs
Difficulty: Beginner. Computes Fibonacci numbers from the identity
Q^n = [[F(n+1), F(n)], [F(n), F(n-1)]] with Q = [[1, 1], [1, 0]]. Demonstrates
repeated Tensor::matmul (recall that * is element-wise, never a matrix product)
and reading one element with get. A demonstration of the identity, not a
big-integer routine.
cargo run --example 31_fibonacci_matrix_power
Source: 31_fibonacci_matrix_power.rs
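The identity can also be checked without matten. The following is a minimal plain-Rust sketch, not the shipped example: `Mat2`, `matmul2`, and `fib` are hypothetical names, and fixed-size integer arrays stand in for a Tensor.

```rust
// Plain-Rust sketch of the Q-matrix identity; it does not use matten.
type Mat2 = [[u64; 2]; 2];

/// True matrix product of two 2x2 matrices (not an element-wise product).
fn matmul2(a: &Mat2, b: &Mat2) -> Mat2 {
    let mut out = [[0u64; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

/// F(n), read from the off-diagonal entry of Q^n.
/// Overflows u64 for large n; this is a demonstration, not a big-integer routine.
fn fib(n: u32) -> u64 {
    let q: Mat2 = [[1, 1], [1, 0]];
    let mut power: Mat2 = [[1, 0], [0, 1]]; // Q^0 is the identity
    for _ in 0..n {
        power = matmul2(&power, &q);
    }
    power[0][1]
}

fn main() {
    for n in 0..=10 {
        println!("F({n}) = {}", fib(n));
    }
}
```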
32_graph_path_counting.rs
Difficulty: Beginner. Counts walks in a directed graph using the fact that
(A^k)[i, j] is the number of walks of length k from node i to node j.
Demonstrates representing a graph as an adjacency Tensor and taking matrix powers
via matmul. Note the distinction between a walk (may repeat nodes/edges) and a
simple path (may not).
cargo run --example 32_graph_path_counting
Source: 32_graph_path_counting.rs
Already covered (cross-references)
Two classic beginner problems already ship as examples, so this band does not add duplicates:
- Vector distance: 54_pairwise_distance.rs (and 25_normalize_vector.rs).
- Cosine similarity: 26_cosine_similarity.rs.
What this is not
These examples do not imply that matten is a graph library, a number-theory
package, or an ML framework. They are single-file demonstrations of accepted APIs.
Matrix iteration
Intermediate examples built on repeated matrix/vector multiplication. They show
how an iterative process — a probability distribution evolving over time, or a
ranking settling to a fixed point — is just Tensor::matmul applied in a loop.
Like the rest of the applied band, these use only the default numeric Tensor API, small hard-coded inputs, and deterministic output. They are teaching examples, not a graph or probability library.
Examples
33_markov_chain_weather.rs
Difficulty: Intermediate. Models a two-state (Sunny / Rainy) weather process with
a row-stochastic transition matrix P. Each day applies v_next = v · P via
vector × matrix matmul, and the distribution converges to the stationary
π = [5/6, 1/6].
cargo run --example 33_markov_chain_weather
Source: 33_markov_chain_weather.rs
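The update v_next = v · P can be sketched in plain Rust. The transition matrix below is an assumption chosen so that the stationary distribution is [5/6, 1/6]; the shipped example may use different numbers. The names `P`, `step`, and `evolve` are hypothetical.

```rust
// Plain-Rust sketch; does not use matten. P is an assumed row-stochastic matrix.
// Row 0 = today Sunny, row 1 = today Rainy; each row sums to 1.
const P: [[f64; 2]; 2] = [[0.9, 0.1], [0.5, 0.5]];

/// One day: row vector times matrix, v_next = v · P.
fn step(v: [f64; 2], p: &[[f64; 2]; 2]) -> [f64; 2] {
    [
        v[0] * p[0][0] + v[1] * p[1][0],
        v[0] * p[0][1] + v[1] * p[1][1],
    ]
}

/// Apply `step` repeatedly, starting from `start`.
fn evolve(start: [f64; 2], p: &[[f64; 2]; 2], days: usize) -> [f64; 2] {
    let mut v = start;
    for _ in 0..days {
        v = step(v, p);
    }
    v
}

fn main() {
    // Start from a certainly rainy day and watch the distribution settle.
    let v = evolve([0.0, 1.0], &P, 50);
    println!("sunny = {:.6}, rainy = {:.6}", v[0], v[1]);
}
```

For this P the stationary vector satisfies π·P = π, which gives 0.1·π₀ = 0.5·π₁ and therefore π = [5/6, 1/6].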
34_tiny_pagerank.rs
Difficulty: Intermediate. Ranks the nodes of a tiny directed graph with PageRank.
A column-stochastic link matrix M is power-iterated with damping
(r_next[i] = (1 - d)/N + d·(M·r)[i]) using matrix × vector matmul; the
best-connected node wins, and the link-less node keeps only its teleport share.
cargo run --example 34_tiny_pagerank
Source: 34_tiny_pagerank.rs
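The damped update can be sketched in plain Rust. The three-node graph below is an assumption for illustration (edges 0→1, 1→0, 2→0, so node 2 has no inbound links); it is not the graph from the shipped example, and `link_matrix` and `pagerank` are hypothetical names.

```rust
// Plain-Rust sketch; does not use matten.

/// Column-stochastic link matrix for the assumed graph: m[i][j] is the
/// probability of moving from node j to node i. Every column sums to 1.
fn link_matrix() -> Vec<Vec<f64>> {
    vec![
        vec![0.0, 1.0, 1.0], // node 0 receives links from nodes 1 and 2
        vec![1.0, 0.0, 0.0], // node 1 receives a link from node 0
        vec![0.0, 0.0, 0.0], // node 2 receives no links
    ]
}

/// Power iteration: r_next[i] = (1 - d)/N + d * (M · r)[i].
fn pagerank(m: &[Vec<f64>], d: f64, iters: usize) -> Vec<f64> {
    let n = m.len();
    let mut r = vec![1.0 / n as f64; n];
    for _ in 0..iters {
        let mut next = vec![(1.0 - d) / n as f64; n]; // teleport share
        for i in 0..n {
            for j in 0..n {
                next[i] += d * m[i][j] * r[j];
            }
        }
        r = next;
    }
    r
}

fn main() {
    let r = pagerank(&link_matrix(), 0.85, 200);
    for (node, score) in r.iter().enumerate() {
        println!("node {node}: {score:.4}");
    }
}
```

With d = 0.85 and N = 3, the node without inbound links ends at exactly its teleport share, (1 - d)/N = 0.05.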
What this is not
These are single-file demonstrations of accepted APIs. They do not imply a graph framework, a probability toolkit, or a production PageRank implementation.
Numerical methods
Small numerical-method examples that demonstrate how iterative and sampled-grid
algorithms look in matten. They use only the default numeric Tensor API (plus the
RFC-038 comfort APIs), small hard-coded inputs, and deterministic output.
These are teaching examples, not a SciPy replacement.
Examples
35_linear_regression_gradient_descent.rs
Difficulty: Advanced-small. Fits y = w·x + b by batch gradient descent on
mean-squared error. The data is stacked into a design matrix with a bias column, so
predictions are X · θ and the gradient is (2/n)·Xᵀ·(ŷ - y) — one matmul for
each, with transpose forming Xᵀ once. Converges to the true line y = 2x + 1.
cargo run --example 35_linear_regression_gradient_descent
Source: 35_linear_regression_gradient_descent.rs
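The same gradient can be written out per sample in plain Rust. This sketch does not use matten; the data, learning rate, and step count are assumptions, and `fit_line` is a hypothetical name. The two accumulators correspond to the two rows of Xᵀ·(ŷ - y): the x column and the bias column.

```rust
// Plain-Rust sketch of batch gradient descent on mean-squared error.

/// Fit y = w*x + b. Returns (w, b).
fn fit_line(xs: &[f64], ys: &[f64], lr: f64, steps: usize) -> (f64, f64) {
    let n = xs.len() as f64;
    let (mut w, mut b) = (0.0, 0.0);
    for _ in 0..steps {
        let (mut grad_w, mut grad_b) = (0.0, 0.0);
        for (&x, &y) in xs.iter().zip(ys) {
            let residual = (w * x + b) - y; // ŷ - y for this sample
            grad_w += residual * x; // x column of Xᵀ·(ŷ - y)
            grad_b += residual; // bias column of Xᵀ·(ŷ - y)
        }
        w -= lr * (2.0 / n) * grad_w;
        b -= lr * (2.0 / n) * grad_b;
    }
    (w, b)
}

fn main() {
    // Noise-free samples of the true line y = 2x + 1.
    let xs = [0.0, 1.0, 2.0, 3.0, 4.0];
    let ys: Vec<f64> = xs.iter().map(|x| 2.0 * x + 1.0).collect();
    let (w, b) = fit_line(&xs, &ys, 0.05, 5000);
    println!("w = {w:.6}, b = {b:.6}");
}
```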
36_heat_equation_1d.rs
Difficulty: Advanced-small. Evolves the 1D heat equation on a rod with fixed-end
temperatures using the explicit (forward-Euler) finite-difference update. The stencil
is encoded as a tridiagonal matrix A (with identity rows at the boundaries), so each
time step is u_next = A · u. The profile converges to the steady-state straight line
between the boundary temperatures.
cargo run --example 36_heat_equation_1d
Source: 36_heat_equation_1d.rs
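A plain-Rust sketch of the stencil matrix follows. It does not use matten. The grid size, the diffusion number r = α·Δt/Δx² (which must be at most 0.5 for the explicit scheme to stay stable), and the boundary temperatures are assumptions; `stencil_matrix`, `matvec`, and `simulate` are hypothetical names.

```rust
// Plain-Rust sketch of u_next = A · u for the explicit heat update.

/// Tridiagonal update matrix with identity rows at both boundaries.
fn stencil_matrix(n: usize, r: f64) -> Vec<Vec<f64>> {
    assert!(n >= 2, "need at least the two boundary nodes");
    let mut a = vec![vec![0.0; n]; n];
    a[0][0] = 1.0; // fixed left temperature
    a[n - 1][n - 1] = 1.0; // fixed right temperature
    for i in 1..n - 1 {
        a[i][i - 1] = r;
        a[i][i] = 1.0 - 2.0 * r;
        a[i][i + 1] = r;
    }
    a
}

/// Dense matrix times vector.
fn matvec(a: &[Vec<f64>], u: &[f64]) -> Vec<f64> {
    a.iter()
        .map(|row| row.iter().zip(u).map(|(x, y)| x * y).sum::<f64>())
        .collect()
}

/// Run `steps` time steps on a rod that starts at 0.0 in the interior.
fn simulate(n: usize, r: f64, left: f64, right: f64, steps: usize) -> Vec<f64> {
    let a = stencil_matrix(n, r);
    let mut u = vec![0.0; n];
    u[0] = left;
    u[n - 1] = right;
    for _ in 0..steps {
        u = matvec(&a, &u);
    }
    u
}

fn main() {
    let u = simulate(5, 0.25, 100.0, 0.0, 500);
    println!("{u:?}");
}
```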
39_finite_difference_derivative.rs
Difficulty: Intermediate. Approximates the derivative of f(x) = x³ sampled on a
linspace grid using the central difference (f(x+h) − f(x−h)) / (2h). The grid and
the function values are Tensors (the latter via elementwise &x * &x). For a cubic
the central-difference error is exactly h², so the example shows the approximation
quality directly. It is a numerical approximation, not symbolic differentiation.
cargo run --example 39_finite_difference_derivative
Source: 39_finite_difference_derivative.rs
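The h² claim follows from expanding the cubes: ((x+h)³ − (x−h)³) / (2h) = (6x²h + 2h³) / (2h) = 3x² + h². A plain-Rust sketch that checks it numerically (no matten; `cube` and `central_difference` are hypothetical names, and the step size and sample points are assumptions):

```rust
// Plain-Rust sketch of the central difference for f(x) = x³.

fn cube(x: f64) -> f64 {
    x * x * x
}

/// (f(x + h) - f(x - h)) / (2h)
fn central_difference(f: impl Fn(f64) -> f64, x: f64, h: f64) -> f64 {
    (f(x + h) - f(x - h)) / (2.0 * h)
}

fn main() {
    let h = 0.1;
    for i in 0..5 {
        let x = 0.5 * i as f64;
        let approx = central_difference(cube, x, h);
        let exact = 3.0 * x * x;
        // The printed error should match h*h up to floating-point rounding.
        println!("x = {x:.1}: approx = {approx:.6}, error = {:.6}", approx - exact);
    }
}
```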
40_trapezoidal_integration.rs
Difficulty: Intermediate. Approximates ∫₀¹ x² dx with the composite trapezoidal
rule and compares against the known exact value 1/3. The grid comes from linspace,
the values from elementwise squaring, and the running total from a Tensor::sum
reduction. It is a numerical approximation, not an integration library.
cargo run --example 40_trapezoidal_integration
Source: 40_trapezoidal_integration.rs
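A plain-Rust sketch of the composite rule, without matten. `trapezoid` is a hypothetical name and the interval count is an assumption. For x² on [0, 1] with n equal intervals, the rule evaluates to (2n² + 1) / (6n²), that is, 1/3 plus an error of 1/(6n²).

```rust
// Plain-Rust sketch of the composite trapezoidal rule.

/// Integrate f over [a, b] using `intervals` equal-width trapezoids.
fn trapezoid(f: impl Fn(f64) -> f64, a: f64, b: f64, intervals: usize) -> f64 {
    assert!(intervals >= 1, "need at least one interval");
    let h = (b - a) / intervals as f64;
    // Endpoints carry weight 1/2, interior grid points weight 1.
    let mut total = 0.5 * (f(a) + f(b));
    for i in 1..intervals {
        total += f(a + i as f64 * h);
    }
    total * h
}

fn main() {
    let estimate = trapezoid(|x| x * x, 0.0, 1.0, 100);
    println!("estimate = {estimate:.8}, exact = {:.8}", 1.0 / 3.0);
}
```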
What this is not
These are single-file demonstrations of accepted APIs. They do not imply that
matten is an optimization library, a PDE/finite-element framework, or a SciPy
replacement.
ML-like
Two small algorithms often associated with machine learning, written with matten
to show that a Tensor is enough for recognizable ML-shaped tasks. They use only the
default numeric Tensor API, small hard-coded inputs, and deterministic output.
The boundary is deliberate: these are algorithm demonstrations, not an ML
framework. There is no training loop abstraction, no model object, no autograd, and
no randomness — k, initial centroids, labels, and iteration counts are all fixed and
explicit. Both find the nearest point with Tensor::argmin (RFC-038).
Examples
37_kmeans_small.rs
Difficulty: Advanced-small. Clusters six 2-D points into two groups with Lloyd’s algorithm: assign each point to the nearest centroid, then move each centroid to the mean of its points. Deterministic initial centroids make the run reproducible; it converges to the two obvious clusters.
cargo run --example 37_kmeans_small
Source: 37_kmeans_small.rs
38_nearest_neighbor_classification.rs
Difficulty: Beginner. Classifies a query point by the label of its single nearest
training point (1-NN) over a labeled [samples, features] data matrix. No training
step, no fitted parameters — just a nearest-point search.
cargo run --example 38_nearest_neighbor_classification
Source: 38_nearest_neighbor_classification.rs
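The nearest-point search is an argmin over distances. A plain-Rust sketch follows; it does not call Tensor::argmin, the training data is made up for illustration, and `squared_distance` and `nearest_index` are hypothetical names.

```rust
// Plain-Rust sketch of 1-NN classification.

/// Squared Euclidean distance; the square root is not needed to rank neighbours.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the sample closest to `query`, or None if there are no samples.
fn nearest_index(samples: &[Vec<f64>], query: &[f64]) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None;
    for (i, sample) in samples.iter().enumerate() {
        let d = squared_distance(sample, query);
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((i, d));
        }
    }
    best.map(|(i, _)| i)
}

fn main() {
    // [samples, features] data matrix with one label per row (illustrative data).
    let samples = vec![
        vec![1.0, 1.0],
        vec![1.5, 2.0],
        vec![8.0, 8.0],
        vec![9.0, 7.5],
    ];
    let labels = ["small", "small", "large", "large"];
    if let Some(i) = nearest_index(&samples, &[2.0, 1.5]) {
        println!("nearest row = {i}, label = {}", labels[i]);
    }
}
```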
What this is not
These are single-file demonstrations of accepted APIs. They do not imply that
matten is an ML framework, a clustering/classification library, or a replacement for
a dedicated ML toolkit.
Practical numeric recipes
A set of small, self-contained numeric recipes that combine core matten primitives
into common data-processing patterns. Each file is a single runnable example with
hard-coded data, assertions, and stable output.
These live in the 50_–56_ band, separate from the core tutorial (00_–16_),
the numeric building blocks (20_–28_), and the famous-problem examples (30_–40_).
Examples
50_rowwise_scoring.rs
Row-wise weighted scoring: multiply each row of a feature matrix by a weight vector,
then sum across columns to produce one score per row. Shows broadcasting between a
[rows, cols] tensor and a [cols] weight vector, followed by sum_axis.
cargo run --example 50_rowwise_scoring
Source: 50_rowwise_scoring.rs
51_standardize_columns.rs
Z-score normalisation of each column (zero mean, unit variance) using only
mean_axis, broadcasting, and element-wise arithmetic — no external crate needed.
cargo run --example 51_standardize_columns
Source: 51_standardize_columns.rs
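The arithmetic behind column standardisation, sketched in plain Rust without matten. It assumes the population standard deviation (divide by n); the shipped example may use a different convention. `standardize_columns` is a hypothetical name.

```rust
// Plain-Rust sketch of per-column z-scores on a [rows, cols] matrix.

/// Subtract each column's mean and divide by its population standard deviation.
/// Assumes rectangular input; a constant column would divide by zero.
fn standardize_columns(rows: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = rows.len() as f64;
    let cols = rows.first().map_or(0, |r| r.len());
    let mut out = rows.to_vec();
    for c in 0..cols {
        let mean = rows.iter().map(|r| r[c]).sum::<f64>() / n;
        let var = rows.iter().map(|r| (r[c] - mean).powi(2)).sum::<f64>() / n;
        let std = var.sqrt();
        for r in out.iter_mut() {
            r[c] = (r[c] - mean) / std;
        }
    }
    out
}

fn main() {
    let data = vec![vec![1.0, 10.0], vec![2.0, 20.0], vec![3.0, 30.0]];
    for row in standardize_columns(&data) {
        println!("{row:?}");
    }
}
```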
52_minmax_scaling.rs
Min-max (0–1) scaling of each column using min_axis, max_axis, and broadcasting.
A common feature-normalisation step before ML algorithms.
cargo run --example 52_minmax_scaling
Source: 52_minmax_scaling.rs
53_gram_matrix.rs
Gram matrix: G = X · Xᵀ, computed with matmul. Used in kernel methods and
feature covariance. Shows that a single matmul call produces a symmetric
[n, n] similarity matrix from an [n, d] data matrix.
cargo run --example 53_gram_matrix
Source: 53_gram_matrix.rs
54_pairwise_distance.rs
Pairwise Euclidean distances between rows using the identity
‖a−b‖² = ‖a‖² + ‖b‖² − 2aᵀb, computed with broadcasting and matmul.
Demonstrates efficient distance computation without an explicit loop over pairs.
cargo run --example 54_pairwise_distance
Source: 54_pairwise_distance.rs
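The identity itself can be checked in plain Rust. This sketch does not use matten and, unlike the example described above, it loops over pairs explicitly for readability. `pairwise_distances` is a hypothetical name and the points are illustrative.

```rust
// Plain-Rust sketch of ‖a−b‖² = ‖a‖² + ‖b‖² − 2aᵀb.

/// Distance matrix between the rows of `x`.
fn pairwise_distances(x: &[Vec<f64>]) -> Vec<Vec<f64>> {
    // Squared norm of every row, computed once.
    let norms: Vec<f64> = x
        .iter()
        .map(|row| row.iter().map(|v| v * v).sum::<f64>())
        .collect();
    let n = x.len();
    let mut d = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            let dot: f64 = x[i].iter().zip(&x[j]).map(|(a, b)| a * b).sum();
            let squared = norms[i] + norms[j] - 2.0 * dot;
            // Rounding can push a zero distance slightly negative; clamp before sqrt.
            d[i][j] = squared.max(0.0).sqrt();
        }
    }
    d
}

fn main() {
    let points = vec![vec![0.0, 0.0], vec![3.0, 4.0], vec![6.0, 8.0]];
    for row in pairwise_distances(&points) {
        println!("{row:?}");
    }
}
```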
55_moving_average.rs
Simple moving average over a 1-D series using slice windows (slice_str).
Shows a sliding-window pattern with overlapping slices and mean reduction.
cargo run --example 55_moving_average
Source: 55_moving_average.rs
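The sliding-window pattern, sketched on a plain slice. It uses the standard library's `windows` instead of slice_str, so it shows the arithmetic and not matten's slicing API. `moving_average` is a hypothetical name.

```rust
// Plain-Rust sketch of a simple moving average.

/// Mean of every run of `window` consecutive values.
/// Returns an empty vector when the series is shorter than the window.
fn moving_average(series: &[f64], window: usize) -> Vec<f64> {
    assert!(window >= 1, "window must be at least 1");
    series
        .windows(window)
        .map(|w| w.iter().sum::<f64>() / window as f64)
        .collect()
}

fn main() {
    let series = [1.0, 2.0, 3.0, 4.0, 5.0];
    println!("{:?}", moving_average(&series, 3));
}
```

A series of length n with window w yields n − w + 1 averages.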
56_rolling_windows_basic.rs
Rolling window sum and max over overlapping slices of a 1-D series. Extends the moving-average idea to multiple aggregations in one pass.
cargo run --example 56_rolling_windows_basic
Source: 56_rolling_windows_basic.rs
What this is not
These recipes show how to compose accepted matten APIs into common patterns. They
do not imply that matten is a feature-engineering framework, a signal-processing
library, or a statistics package. For preprocessing helpers with a proper API, see
matten-mlprep.
matten-data — table to Tensor
matten-data is a small companion crate, currently a production-ready candidate, for the boring step between a
small table-like input (such as a CSV) and a numeric matten::Tensor. It is a
conversion helper, not a dataframe library or query engine.
For joins, group-by, lazy queries, datetime handling, or large/streaming data, use
Polars, DataFusion, or Pandas.
matten-data deliberately does none of those.
Install
[dependencies]
matten = "0.28"
matten-data = "0.28"
Both crates share one lock-step family version (RFC-030); maturity is a per-crate Status label, not a separate version number.
Quickstart
use matten::Tensor;
use matten_data::Table;
fn main() -> Result<(), matten_data::MattenDataError> {
let csv = "region,sales,cost\nnorth,100,40\nsouth,150,\neast,120,55";
let tensor: Tensor = Table::from_csv_str(csv)?
.select_columns(["sales", "cost"])? // choose columns by name, in this order
.fill_missing(0.0)? // the missing south/cost becomes 0.0
.try_numeric()? // strict, explicit conversion to f64
.to_tensor()?; // a normal [rows, columns] Tensor
assert_eq!(tensor.shape(), &[3, 2]);
Ok(())
}
The example suite
The numbered tutorial suite teaches one step at a time; csv_to_tensor is a single
comprehensive overview.
| Example | What it shows |
|---|---|
| data_00_quickstart | The full happy path in one place |
| data_01_schema_summary | Row/column counts, names, missing counts, inferred kinds |
| data_02_select_columns | Select by name; output order matches the request |
| data_03_missing_values | Missing values never become zero silently; explicit fill |
| data_04_to_tensor | Output shape, row-major order, core matten interop |
| data_05_errors | Duplicate header, ragged row, non-numeric, missing-at-conversion |
| csv_to_tensor | Comprehensive overview of the whole workflow |
cargo run -p matten-data --example data_00_quickstart
Output Tensor shape
to_tensor produces a tensor of shape [rows, columns], where rows are the data
rows (the header is not a row) and columns are the selected columns in the order you
requested them. The data is row-major: row 0’s values come first, then row 1’s,
and so on. Once converted, the result is an ordinary matten::Tensor — every core
operation applies.
Missing-value policy
Missing cells are never silently turned into 0. A missing value that reaches
numeric conversion is a precise MissingValue { column, row } error (the row is the
1-based CSV line number). You decide what a missing value means by calling
fill_missing with an explicit value before converting.
Numeric conversion policy
Conversion is strict and explicit (try_numeric then to_tensor): integers and
floats become f64; booleans and non-numeric text are rejected (they are never
coerced to 1/0); and a remaining missing cell is rejected. This keeps the
boundary between “table-like text” and “numbers” honest and visible.
Limitations
matten-data has no joins, group-by, pivot, query DSL, lazy execution,
indexing/loc/iloc, rolling/window operations, datetime engine, categorical dtype
system, or large-data streaming. It is for small, application-validated or trusted
data. When you need those capabilities, reach for a dataframe/query engine
(Polars, DataFusion, Pandas) instead.
Status and maturity
Production-ready candidate (0.28.x family). The table-to-Tensor API is mostly stable but pre-1.0;
pin the minor version. The crate’s scope is locked and enforced in CI (RFC-042), and
core matten never depends on it (RFC-022).
Companion crate examples
Each companion crate ships its own runnable examples, living in that crate’s
examples/ directory (never in core matten). They are small, deterministic, and
self-checking, and they all respect the one-way dependency rule: companions depend
on matten, but core matten depends on no companion.
These examples were audited and improved in place under RFC-048; the program does not add duplicate or renamed companion examples.
matten-ndarray — interop with ndarray
| Example | What it shows |
|---|---|
| from_arrayd | ndarray::ArrayD<f64> → matten::Tensor, including a transposed (non-contiguous) input |
| to_arrayd | matten::Tensor → ndarray::ArrayD<f64> |
Both conversions copy data (no zero-copy claim) and preserve shape. Only numeric
tensors convert to ndarray. The full conversion rules are documented as a
bridge conversion contract; the
bridge-crate policy covers how bridge crates are
structured (own their target dependency, never re-export Tensor).
cargo run -p matten-ndarray --example from_arrayd
cargo run -p matten-ndarray --example to_arrayd
matten-mlprep — small preprocessing
| Example | What it shows |
|---|---|
| mlprep_standardize_columns | Per-column z-score (zero mean, unit std) |
| mlprep_minmax_scale | Per-column scaling into [0, 1] |
| mlprep_add_bias_column | Prepend a constant intercept column |
| mlprep_train_test_split | Deterministic, ordered train/test split |
Convention throughout: rows are samples, columns are features; every transform is deterministic with no hidden randomness and no model training.
cargo run -p matten-mlprep --example mlprep_standardize_columns
cargo run -p matten-mlprep --example mlprep_train_test_split
matten-data — table-to-Tensor (production-ready candidate)
| Example | What it shows |
|---|---|
| data_00_quickstart | The full happy path in one place |
| data_01_schema_summary | Inspect rows, columns, names, missing counts, kinds |
| data_02_select_columns | Select by name; output order matches the request |
| data_03_missing_values | Missing values never become zero silently |
| data_04_to_tensor | Output shape, row-major order, core interop |
| data_05_errors | The common boundary errors |
| csv_to_tensor | Comprehensive overview of the whole workflow |
matten-data is a production-ready candidate and intentionally small. It is not a dataframe:
no group-by, join, merge, pivot, or query. Missing values and numeric conversion are
explicit, never silent. See matten-data: table to Tensor for the full guide.
cargo run -p matten-data --example csv_to_tensor
What this is not
Companion examples demonstrate accepted bridge/preprocessing APIs. They do not imply
that matten is a dataframe engine, an ML framework, a linear-algebra backend, or a
replacement for ndarray, nalgebra, NumPy, or Pandas.
Error model
matten uses a single public error type, MattenError, and splits every API
into one of two zones. Understanding the split is the key to writing correct
code with matten.
Panic zone vs Result zone
| Zone | When | How |
|---|---|---|
| Panic zone | Local, developer-authored PoC code where shapes are known | API panics with an actionable matten <category> error in <operation>: ... message |
| Result zone | Any external boundary — parsing, file I/O, user-supplied shapes | API returns Result<Tensor, MattenError> and never panics on ordinary invalid input |
Rule of thumb: if the shape or data comes from outside your code (a file,
a web request, user input), use the try_* form.
#![allow(unused)]
fn main() {
use matten::{MattenError, Tensor};
// Panic zone: shape is a trusted literal
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
// Result zone: shape comes from somewhere external.
// `data` and `user_shape` are placeholders standing in for parsed input.
let data = vec![1.0, 2.0, 3.0, 4.0];
let user_shape: Vec<usize> = vec![2, 2];
let result = Tensor::try_new(data, &user_shape);
match result {
    Ok(t) => println!("{t:?}"),
    Err(e) => eprintln!("bad input: {e}"),
}
}
MattenError variants
#![allow(unused)]
fn main() {
#[derive(Debug)]
#[non_exhaustive]
pub enum MattenError {
Shape { operation: &'static str, message: String },
Broadcast { left: Vec<usize>, right: Vec<usize> },
Allocation { requested_elements: usize, message: String },
Slice { input: Option<String>, message: String },
Parse { format: DataFormat, message: String },
Io { path: std::path::PathBuf, source: std::io::Error },
Unsupported { operation: &'static str, message: String },
InvalidArgument { operation: &'static str, argument: &'static str, message: String },
}
}
MattenError is #[non_exhaustive], so match it with a wildcard arm to stay
forward-compatible.
DataFormat identifies which parser produced a Parse error:
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
pub enum DataFormat { Json, Csv }
}
Variant guide
| Variant | Produced by |
|---|---|
| Shape | construction mismatch, reshape, invalid arange arguments |
| Broadcast | incompatible operand shapes in arithmetic |
| Allocation | shape product overflow or arange element-count limit |
| Slice | slice builder bounds errors, slice_str parse/bounds errors |
| Parse | from_json, from_csv, and their file-loading variants |
| Io | load_json, load_csv file I/O errors |
| Unsupported | disabled-feature or not-yet-implemented operation, or a numeric-only API called on a dynamic tensor |
| InvalidArgument | a supported operation given an out-of-range/ill-defined argument (e.g. clip with min > max); distinct from Unsupported |
Matching errors
MattenError embeds std::io::Error in Io, which is neither Clone nor
PartialEq. Never compare with ==; always match by variant.
#![allow(unused)]
fn main() {
use matten::{MattenError, Tensor};
let err = Tensor::try_new(vec![1.0], &[2, 2]).unwrap_err();
// correct: match on the variant only
assert!(matches!(err, MattenError::Shape { .. }));
// correct: destructure to read the fields
if let MattenError::Shape { operation, message } = &err {
    println!("{operation}: {message}");
}
// will not compile — MattenError does not implement PartialEq
// assert_eq!(err, MattenError::Shape { .. });
}
Panic message format
Panic messages from panic-zone APIs always begin with "matten":
matten shape error in reshape: cannot reshape tensor with 6 elements
from shape [2, 3] into shape [4, 2] requiring 8 elements
The format is matten <category> error in <operation>: <detail>. When
something panics unexpectedly, this prefix makes it easy to grep.
Using ? in application code
MattenError implements std::error::Error, so it works with ? and
Box<dyn Error>:
#![allow(unused)]
fn main() {
use matten::Tensor;
fn load_and_process(path: &str) -> Result<Tensor, Box<dyn std::error::Error>> {
    let t = Tensor::load_json(path)?; // Io or Parse on failure
    let flat = t.try_reshape(&[t.len()])?; // Shape on mismatch
    Ok(flat)
}
}
Construction and conversion
All matten construction produces an owned, contiguous, row-major Vec<f64>
paired with a validated shape. Fields are private; users interact only through
methods.
Core constructors
#![allow(unused)]
fn main() -> Result<(), matten::MattenError> {
use matten::Tensor;
// From data + shape (panic zone)
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
// From data + shape (Result zone); `?` needs a Result-returning function
let t = Tensor::try_new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2])?;
// 1-D from flat vector
let t = Tensor::from_vec(vec![1.0, 2.0, 3.0]); // shape [3]
Ok(())
}
new panics on mismatch; try_new returns MattenError::Shape or
MattenError::Allocation.
Fill constructors
#![allow(unused)]
fn main() {
let z = Tensor::zeros(&[3, 4]); // all 0.0, shape [3, 4]
let o = Tensor::ones(&[3, 4]); // all 1.0
let f = Tensor::full(&[3, 4], -1.0); // all -1.0
let s = Tensor::scalar(42.0); // shape [], len 1
}
All fill constructors validate the shape before allocating — a bad shape panics with an actionable message.
Range constructor
#![allow(unused)]
fn main() -> Result<(), matten::MattenError> {
use matten::Tensor;
// Half-open, step > 0: [0.0, 1.0, 2.0, 3.0, 4.0]
let r = Tensor::arange(0.0, 5.0, 1.0);
// Negative step: [3.0, 2.0, 1.0]
let r = Tensor::arange(3.0, 0.0, -1.0);
// Result zone (step or bounds from user input; placeholder values here)
let (start, end, step) = (0.0, 1.0, 0.25);
let r = Tensor::try_arange(start, end, step)?;
Ok(())
}
arange rejects zero or non-finite step, non-finite bounds, and a computed
element count above the allocation limit (2²⁸).
Evenly spaced values and identity (RFC-038)
#![allow(unused)]
fn main() -> Result<(), matten::MattenError> {
use matten::Tensor;
// `count` evenly spaced values, inclusive of both endpoints:
let xs = Tensor::linspace(0.0, 1.0, 5); // [0.0, 0.25, 0.5, 0.75, 1.0]
let one = Tensor::linspace(2.0, 9.0, 1); // [2.0]
// n × n identity matrix:
let i3 = Tensor::eye(3); // 1.0 on the diagonal, 0.0 elsewhere
// Result zone (placeholder values standing in for user input):
let (start, end, count, n) = (0.0, 1.0, 5, 3);
let xs = Tensor::try_linspace(start, end, count)?;
let i = Tensor::try_eye(n)?;
Ok(())
}
linspace includes both endpoints when count >= 2, returns [start] when
count == 1, and rejects count == 0. eye produces shape [n, n] and rejects
n == 0. Both are budget-checked like the fill constructors (oversized results
yield MattenError::Allocation).
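The endpoint rules can be written down in a few lines of plain Rust. This is a sketch of the stated semantics only, not matten's implementation: it returns a Vec and a String error instead of a Tensor and MattenError, and it omits the allocation budget check.

```rust
// Plain-Rust sketch of the linspace rules described above.

/// `count` evenly spaced values from `start` to `end`, both included.
fn linspace(start: f64, end: f64, count: usize) -> Result<Vec<f64>, String> {
    match count {
        0 => Err("linspace requires count >= 1".to_string()),
        1 => Ok(vec![start]),
        _ => {
            // count - 1 gaps separate count points.
            let step = (end - start) / (count - 1) as f64;
            Ok((0..count).map(|i| start + step * i as f64).collect())
        }
    }
}

fn main() {
    println!("{:?}", linspace(0.0, 1.0, 5));
    println!("{:?}", linspace(2.0, 9.0, 1));
    println!("{:?}", linspace(0.0, 1.0, 0));
}
```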
Shape model
Shapes are runtime Vec<usize>. There is no const-generic or type-level
shape arithmetic.
| Shape | Meaning |
|---|---|
| [] | scalar: len() == 1, is_scalar() == true |
| [n] | 1-D vector: is_vector() == true |
| [rows, cols] | 2-D matrix: is_matrix() == true |
| [d0, …, d7] | up to rank 8 |
Rules enforced on every constructor:
- Zero-sized dimensions are rejected (support is deferred to a future RFC).
- Rank may not exceed 8.
- Shape product is computed with checked arithmetic; overflow returns
MattenError::Allocation.
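The three rules can be sketched in a few lines of std-only Rust. `checked_len`, `ShapeCheck`, and `MAX_RANK` are hypothetical names for illustration; matten reports these conditions through `MattenError`, and the sketch leaves out the separate allocation-limit check.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum ShapeCheck {
    ZeroDim,
    RankTooHigh,
    Overflow,
}

const MAX_RANK: usize = 8;

/// Validate a shape and return its element count.
fn checked_len(shape: &[usize]) -> Result<usize, ShapeCheck> {
    if shape.len() > MAX_RANK {
        return Err(ShapeCheck::RankTooHigh);
    }
    let mut len: usize = 1; // the empty shape [] is a scalar with one element
    for &dim in shape {
        if dim == 0 {
            return Err(ShapeCheck::ZeroDim);
        }
        // checked_mul returns None on overflow instead of wrapping.
        len = len.checked_mul(dim).ok_or(ShapeCheck::Overflow)?;
    }
    Ok(len)
}
```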
Nested row construction
#![allow(unused)]
fn main() {
// Panic zone (convenience for trusted literals)
let t: Tensor = vec![vec![1.0, 2.0], vec![3.0, 4.0]].into();
// Result zone (ragged rows return Err)
let t = Tensor::try_from_rows(vec![vec![1.0, 2.0], vec![3.0, 4.0]])?;
}
From<Vec<Vec<f64>>> panics on ragged rows with an actionable message.
try_from_rows returns MattenError::Shape with the ragged-row detail.
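For readers curious what a ragged-row check involves, the following std-only sketch flattens nested rows into row-major data plus a `[rows, cols]` shape. `flatten_rows` and its `String` error are assumptions made for illustration; they are not matten's code or error type.

```rust
/// Hypothetical sketch: flatten rectangular rows, reporting the first ragged row.
fn flatten_rows(rows: Vec<Vec<f64>>) -> Result<(Vec<f64>, [usize; 2]), String> {
    let n_rows = rows.len();
    let cols = match rows.first() {
        Some(first) => first.len(),
        None => return Err("no rows".to_string()),
    };
    let mut flat = Vec::with_capacity(n_rows * cols);
    for (i, row) in rows.into_iter().enumerate() {
        // Every row must have the same length as the first one.
        if row.len() != cols {
            return Err(format!("row {} has {} values, expected {}", i, row.len(), cols));
        }
        flat.extend(row);
    }
    Ok((flat, [n_rows, cols]))
}
```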
Inspection
#![allow(unused)]
fn main() {
t.shape() // &[usize] — no allocation
t.ndim() // usize — shape().len()
t.len() // usize — element count
t.is_scalar() // bool — ndim() == 0
t.is_vector() // bool — ndim() == 1
t.is_matrix() // bool — ndim() == 2
t.as_slice() // &[f64] — flat row-major view
}
Conversion out
#![allow(unused)]
fn main() {
let v: Vec<f64> = t.to_vec(); // clone
let v: Vec<f64> = t.into_vec(); // move, no copy
let v: Vec<f64> = Vec::from(&t); // borrow-clone
let v: Vec<f64> = t.into(); // consuming From
let rows: Vec<Vec<f64>> = t.try_into()?; // fails for non-rank-2
}
Migration to faster libraries
When a PoC moves to a performance-sensitive path, hand the flat data to a specialised crate:
#![allow(unused)]
fn main() {
let flat: Vec<f64> = tensor.into_vec(); // zero-copy move
// pass `flat` to ndarray, nalgebra, candle, etc.
}
Operators and broadcasting
matten implements element-wise arithmetic for borrowed tensors with
NumPy-style right-aligned broadcasting. All results are new owned tensors;
operands are never mutated.
Element-wise operators
#![allow(unused)]
fn main() {
use matten::Tensor;
let a = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
let b = Tensor::full(&[2, 2], 10.0);
let c = &a + &b; // [11.0, 12.0, 13.0, 14.0]
let d = &a - &b; // [-9.0, -8.0, -7.0, -6.0]
let e = &a * &b; // [10.0, 20.0, 30.0, 40.0] ← element-wise, not matmul
let f = &a / &b; // [0.1, 0.2, 0.3, 0.4]
let g = -&a; // [-1.0, -2.0, -3.0, -4.0]
}
* is always element-wise. Matrix multiplication is explicit via matmul / dot.
Scalar operators
All eight scalar forms are supported (the four arithmetic operators, each with the tensor on either side). The examples show + and *:
#![allow(unused)]
fn main() {
let t = Tensor::new(vec![1.0, 2.0, 3.0], &[3]);
// tensor on left
let r = &t + 10.0; // [11.0, 12.0, 13.0]
let r = &t * 2.0; // [2.0, 4.0, 6.0]
// scalar on left
let r = 10.0 + &t; // [11.0, 12.0, 13.0]
let r = 2.0 * &t; // [2.0, 4.0, 6.0]
}
Broadcasting rules
Shapes are compatible when aligned from the right and each dimension pair satisfies one of:
- dimensions are equal;
- one dimension is 1 (it broadcasts to match the other);
- one operand has fewer dimensions (the missing leading axes are treated as 1).
| Left | Right | Result |
|---|---|---|
| [] | [3, 4] | [3, 4] (scalar broadcasts everywhere) |
| [4] | [3, 4] | [3, 4] (row vector broadcasts across rows) |
| [3, 1] | [1, 4] | [3, 4] (outer product pattern) |
| [2, 3] | [2] | incompatible (panics) |
#![allow(unused)]
fn main() {
// bias addition: add a [3] bias vector to every row of a [2, 3] matrix
let matrix = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
let bias = Tensor::new(vec![10.0, 20.0, 30.0], &[3]);
let result = &matrix + &bias;
// [[11.0, 22.0, 33.0],
// [14.0, 25.0, 36.0]]
}
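The compatibility rules can be written as a short shape computation. This is a std-only sketch of NumPy-style right-aligned broadcasting; `broadcast_shape` is a hypothetical helper and returns `None` where matten's operators would panic.

```rust
/// Hypothetical sketch: result shape of broadcasting `left` against `right`,
/// or None when the shapes are incompatible.
fn broadcast_shape(left: &[usize], right: &[usize]) -> Option<Vec<usize>> {
    let rank = left.len().max(right.len());
    let mut out = vec![0; rank];
    for i in 0..rank {
        // Walk from the trailing axis; a missing leading axis counts as 1.
        let l = if i < left.len() { left[left.len() - 1 - i] } else { 1 };
        let r = if i < right.len() { right[right.len() - 1 - i] } else { 1 };
        out[rank - 1 - i] = match (l, r) {
            (a, b) if a == b => a, // equal dimensions
            (1, b) => b,           // left broadcasts
            (a, 1) => a,           // right broadcasts
            _ => return None,      // incompatible pair
        };
    }
    Some(out)
}
```

Applied to the table above, `[3, 1]` against `[1, 4]` yields `[3, 4]`, while `[2, 3]` against `[2]` yields `None` because the trailing dimensions 3 and 2 differ.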
Incompatible shapes
Incompatible shapes panic in operator code with an actionable message:
matten broadcast error in add: shapes [2, 3] and [2] are not compatible
IEEE 754 semantics
matten does not intercept NaN or inf:
- Division by zero produces inf, -inf, or NaN per IEEE 754.
- NaN propagates through all arithmetic.
- No silent sanitisation.
No intermediate copies
The broadcast implementation maps result coordinates directly to source element indices using zero-stride tricks. No expanded broadcast copies of the operands are allocated.
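One common way to realise this idea is to give every broadcast axis a stride of 0, so that moving along that axis keeps reading the same source element. The sketch below shows the technique on flat `f64` slices; the function names are hypothetical, the shapes are assumed to be already validated as compatible, and matten's actual internals may differ.

```rust
/// Row-major strides for `src` viewed as `out_shape`; broadcast axes get stride 0.
/// Assumes `src` is broadcast-compatible with `out_shape`.
fn broadcast_strides(src: &[usize], out_shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![0; out_shape.len()];
    let offset = out_shape.len() - src.len(); // missing leading axes stay at 0
    let mut stride = 1;
    for axis in (0..src.len()).rev() {
        if src[axis] != 1 {
            strides[offset + axis] = stride;
        }
        stride *= src[axis];
    }
    strides
}

/// Element-wise add that reads both operands in place, without expanding them.
fn broadcast_add(
    a: &[f64],
    a_shape: &[usize],
    b: &[f64],
    b_shape: &[usize],
    out_shape: &[usize],
) -> Vec<f64> {
    let sa = broadcast_strides(a_shape, out_shape);
    let sb = broadcast_strides(b_shape, out_shape);
    let len: usize = out_shape.iter().product();
    let mut out = Vec::with_capacity(len);
    let mut coord = vec![0usize; out_shape.len()];
    for _ in 0..len {
        // Map the result coordinate to a source index in each operand.
        let ia: usize = coord.iter().zip(&sa).map(|(c, s)| c * s).sum();
        let ib: usize = coord.iter().zip(&sb).map(|(c, s)| c * s).sum();
        out.push(a[ia] + b[ib]);
        // Advance the coordinate like an odometer, last axis fastest.
        for axis in (0..coord.len()).rev() {
            coord[axis] += 1;
            if coord[axis] < out_shape[axis] {
                break;
            }
            coord[axis] = 0;
        }
    }
    out
}
```

For a `[2, 3]` matrix and a `[3]` bias, the bias strides come out as `[0, 1]`, so both rows read the same three bias values.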
Elementwise comfort math (RFC-038)
Beyond the operators above, Tensor provides a few familiar elementwise
transforms. Each preserves shape, follows ordinary f64 NaN/Inf behavior, and
panics on dynamic tensors (call try_numeric() first):
| Method | Effect |
|---|---|
| abs() | absolute value |
| sqrt() | square root (negative → NaN) |
| exp() | e^x |
| ln() | natural log (ln(0.0) → -inf, negative → NaN) |
| clip(min, max) | clamp each element into [min, max] |
#![allow(unused)]
fn main() {
use matten::Tensor;
let t = Tensor::from_vec(vec![-5.0, 0.5, 9.0]);
assert_eq!(t.clip(0.0, 1.0).as_slice(), &[0.0, 0.5, 1.0]);
}
clip panics if min > max; try_clip(min, max) returns
MattenError::InvalidArgument instead (or MattenError::Unsupported on a dynamic
tensor).
Shape operations
All shape-transformation methods return new independent owned tensors. The numeric core copies data internally; no view lifetime is ever exposed.
Reshape
#![allow(unused)]
fn main() {
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
// Panic zone
let r = t.reshape(&[3, 2]); // shape [3, 2], same flat order
// Result zone
let r = t.try_reshape(&[3, 2])?; // MattenError::Shape on mismatch
}
Only the element count matters — reshape never fails because of memory layout. Flat data order (row-major) is preserved unchanged.
#![allow(unused)]
fn main() {
// Any compatible shape works
let flat = t.reshape(&[6]); // [6]
let col = t.reshape(&[6, 1]); // [6, 1]
let cube = t.reshape(&[1, 2, 3]); // [1, 2, 3]
}
Panic message on mismatch:
matten shape error in reshape: cannot reshape tensor with 6 elements
from shape [2, 3] into shape [4, 2] requiring 8 elements
Flatten
#![allow(unused)]
fn main() {
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
let f = t.flatten(); // shape [4]
// A scalar becomes shape [1]
let s = Tensor::scalar(7.0).flatten(); // shape [1]
}
Transpose
transpose() reverses the axis order. t() is an alias.
#![allow(unused)]
fn main() {
// 2-D: swap rows and columns
let m = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
let mt = m.transpose();
// shape [3, 2], data [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
// Higher rank: axes are fully reversed
// [d0, d1, d2] → [d2, d1, d0]
let t3 = Tensor::new((1..=24).map(|x| x as f64).collect(), &[2, 3, 4]);
let t3t = t3.transpose(); // shape [4, 3, 2]
}
Transposing twice is the identity:
#![allow(unused)]
fn main() {
assert_eq!(t.transpose().transpose(), t);
}
Transposing a scalar panics — there are no axes to reverse.
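To make the physical reordering concrete, here is a std-only sketch of a rank-2 transpose on a flat row-major buffer. `transpose_2d` is a hypothetical helper for illustration; matten's `transpose` also handles higher ranks.

```rust
/// Hypothetical sketch: transpose a rows x cols row-major buffer into a
/// fresh cols x rows row-major buffer.
fn transpose_2d(data: &[f64], rows: usize, cols: usize) -> Vec<f64> {
    assert_eq!(data.len(), rows * cols, "data length must match rows * cols");
    let mut out = Vec::with_capacity(data.len());
    // Output row c is input column c.
    for c in 0..cols {
        for r in 0..rows {
            out.push(data[r * cols + c]);
        }
    }
    out
}
```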
Swap axes
#![allow(unused)]
fn main() {
let t = Tensor::new((1..=24).map(|x| x as f64).collect(), &[2, 3, 4]);
let s = t.swap_axes(0, 2); // shape [4, 3, 2]
}
Swapping an axis with itself is a no-op. Out-of-range axes panic:
matten shape error in swap_axes: axis 5 is out of range for rank-3 tensor
Squeeze and expand_dims (RFC-038)
#![allow(unused)]
fn main() {
use matten::Tensor;
// squeeze: drop every length-1 axis (data order unchanged)
let t = Tensor::new(vec![1.0, 2.0, 3.0], &[1, 3, 1]);
let s = t.squeeze(); // shape [3]
// an all-ones shape squeezes to a scalar
let one = Tensor::new(vec![5.0], &[1, 1]).squeeze(); // shape []
// expand_dims: insert a length-1 axis at `axis` (0..=ndim)
let v = Tensor::from_vec(vec![1.0, 2.0, 3.0]);
let row = v.expand_dims(0); // [1, 3]
let col = v.expand_dims(1); // [3, 1]
// Result zone: axis > ndim is an InvalidArgument
let r = v.try_expand_dims(axis)?;
}
squeeze removes all length-1 axes and never fails on a numeric tensor (a scalar stays a scalar).
expand_dims accepts axis in 0..=ndim; an out-of-range axis panics, while
try_expand_dims returns MattenError::InvalidArgument. Both clone data and reject
dynamic tensors (call try_numeric() first).
Element access
#![allow(unused)]
fn main() {
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
t.get(&[0, 1]) // Some(2.0)
t.get(&[5, 0]) // None — out of bounds
t.get(&[0]) // None — rank mismatch
// Scalar element
Tensor::scalar(99.0).get(&[]) // Some(99.0)
}
get returns Option<f64> and never panics. There is no mutable element
setter.
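The three outcomes shown above (value, out of bounds, rank mismatch) follow from the usual row-major index calculation. A std-only sketch, with `flat_index` as a hypothetical helper name:

```rust
/// Hypothetical sketch: row-major flat offset for a multi-dimensional index,
/// or None on rank mismatch or an out-of-bounds coordinate.
fn flat_index(shape: &[usize], index: &[usize]) -> Option<usize> {
    if shape.len() != index.len() {
        return None; // rank mismatch
    }
    let mut flat = 0;
    for (&dim, &i) in shape.iter().zip(index) {
        if i >= dim {
            return None; // out of bounds on this axis
        }
        flat = flat * dim + i;
    }
    Some(flat)
}
```

For shape `[2, 2]`, the index `[0, 1]` maps to flat offset 1, which holds `2.0` in the example tensor. The empty index on the empty shape maps to offset 0, the single scalar element.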
Numeric Tensor ownership note
Every method above clones or physically reorders data into a fresh contiguous
buffer. This keeps the API lifetime-free and predictable, at the cost of
higher allocation than a view-based library. When this matters for large data,
migrate to ndarray or nalgebra using tensor.into_vec().
See also
To join several tensors into one — along an existing axis (concatenate) or a new
axis (stack) — see Shape composition.
Shape composition
Shape composition joins several tensors into one. matten provides two functions
(RFC-039), both on the numeric Tensor only:
- concatenate: join along an existing axis.
- stack: join along a new axis.
Each has a panicking convenience form and a non-panicking try_* form. Both take a
borrowed slice &[&Tensor], so callers never have to clone inputs just to pass them.
Dynamic tensors are rejected — convert with try_numeric() first.
repeat, tile, and meshgrid are intentionally deferred (see RFC-039 §8):
they need a separate indexing and allocation policy, and are not part of the API.
concatenate
#![allow(unused)]
fn main() {
Tensor::concatenate(tensors: &[&Tensor], axis: usize) -> Tensor
Tensor::try_concatenate(tensors: &[&Tensor], axis: usize) -> Result<Tensor, MattenError>
}
All inputs must have the same rank and the same size on every axis except
axis. The output axis size is the sum of the inputs’ axis sizes; all other
axes are unchanged. axis must be in 0..rank.
[2, 3] ++ [4, 3] along axis 0 -> [6, 3]
[2, 3] ++ [2, 5] along axis 1 -> [2, 8]
A single-element list returns a clone of that tensor (after validating the axis and dynamic status).
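The shape rule for concatenate can be sketched without any tensor data. `concat_shape` is a hypothetical std-only helper that uses `String` errors; matten reports the same conditions through `MattenError`.

```rust
/// Hypothetical sketch: output shape of concatenating `shapes` along `axis`.
fn concat_shape(shapes: &[&[usize]], axis: usize) -> Result<Vec<usize>, String> {
    let first: &[usize] = shapes.first().copied().ok_or("empty input list")?;
    if axis >= first.len() {
        return Err(format!("axis {} out of range for rank {}", axis, first.len()));
    }
    let mut out = first.to_vec();
    for &shape in &shapes[1..] {
        if shape.len() != first.len() {
            return Err("rank mismatch".to_string());
        }
        // Every axis except `axis` must match the first input.
        for (i, (&a, &b)) in first.iter().zip(shape.iter()).enumerate() {
            if i != axis && a != b {
                return Err(format!("size mismatch on axis {}", i));
            }
        }
        out[axis] += shape[axis]; // sizes along `axis` add up
    }
    Ok(out)
}
```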
stack
#![allow(unused)]
fn main() {
Tensor::stack(tensors: &[&Tensor], axis: usize) -> Tensor
Tensor::try_stack(tensors: &[&Tensor], axis: usize) -> Result<Tensor, MattenError>
}
All inputs must have identical shapes. A new axis of size n (the number of
inputs) is inserted at position axis, so the output rank is the input rank plus
one. axis may be 0..=rank.
three [2, 4] tensors stacked at axis 0 -> [3, 2, 4]
three [2, 4] tensors stacked at axis 1 -> [2, 3, 4]
three [2, 4] tensors stacked at axis 2 -> [2, 4, 3]
A single-element list inserts a length-1 axis (the analogue of expand_dims).
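The corresponding shape rule for stack is even shorter. As before, `stack_shape` is a hypothetical std-only helper, not matten's API, and it skips the rank-limit and allocation checks.

```rust
/// Hypothetical sketch: output shape of stacking identically shaped inputs
/// along a new axis at position `axis`.
fn stack_shape(shapes: &[&[usize]], axis: usize) -> Result<Vec<usize>, String> {
    let first: &[usize] = shapes.first().copied().ok_or("empty input list")?;
    if axis > first.len() {
        return Err(format!("axis {} out of range for rank {}", axis, first.len()));
    }
    if shapes.iter().any(|&s| s != first) {
        return Err("shape mismatch".to_string());
    }
    let mut out = first.to_vec();
    out.insert(axis, shapes.len()); // the new axis has one slot per input
    Ok(out)
}
```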
Errors
Both functions follow the same error policy:
| Condition | try_* returns |
|---|---|
| empty input list | InvalidArgument { argument: "tensors" } |
| any dynamic input | Unsupported (convert with try_numeric() first) |
| rank / dimension / shape mismatch | Shape |
| axis out of range (0..rank for concatenate, 0..=rank for stack) | Shape |
| result exceeds the allocation limit | Allocation |
The convenience forms (concatenate, stack) panic with the same message the
try_* forms would return.
Allocation safety
The output shape is checked against MattenLimits before any
data is copied, so an oversized result fails with Allocation (or Shape when the
stacked rank would exceed the dimension limit) rather than attempting a huge
allocation.
Example
See 14_concatenate_stack.rs
for a runnable walkthrough.
Slicing
matten provides two slicing APIs. The builder is the canonical form; slice_str
is a NumPy-like convenience. Both return owned tensors and never produce view
lifetimes.
Builder API (canonical)
#![allow(unused)]
fn main() {
use matten::Tensor;
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
// One method call per axis; finish with .build()
let row = t.slice().index(0).all().build()?; // shape [3]
let top2 = t.slice().range(0..2).all().build()?; // shape [2, 3]
let col1 = t.slice().all().index(1).build()?; // shape [2]
}
Builder methods:
| Method | Meaning |
|---|---|
| .all() | all elements along this axis (:), axis kept |
| .index(n) | single element, axis removed from output shape |
| .range(0..2) | half-open range, axis kept |
| .range(1..) | from index 1 to end |
| .range(..3) | from start to index 3 (exclusive) |
| .range(..) | entire axis (same as .all()) |
| .range(0..=2) | inclusive range → converted to 0..3 |
| .build() | validate and materialise, returns Result<Tensor, MattenError> |
Index semantics follow NumPy: index(n) removes the axis, collapsing one
dimension. range keeps it.
#![allow(unused)]
fn main() {
// Shape [2, 3]: index both axes
let scalar_result = t.slice().index(0).index(1).build()?;
assert!(scalar_result.is_scalar()); // both axes indexed out → shape []
}
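The difference between index (axis removed) and range (axis kept) is easiest to see in the output shape. The sketch below models it with a hypothetical `AxisSpec` enum and `sliced_shape` function; these names are illustrative and are not matten's builder types. It applies only the checks listed on this page and does not decide how an empty range should be treated.

```rust
use std::ops::Range;

/// Hypothetical per-axis slice spec for this sketch only.
enum AxisSpec {
    All,
    Index(usize),
    Range(Range<usize>),
}

/// Output shape of slicing `shape` with one spec per axis.
fn sliced_shape(shape: &[usize], specs: &[AxisSpec]) -> Result<Vec<usize>, String> {
    if specs.len() != shape.len() {
        return Err(format!("{} specs for rank-{} tensor", specs.len(), shape.len()));
    }
    let mut out = Vec::new();
    for (axis, (spec, &dim)) in specs.iter().zip(shape).enumerate() {
        match spec {
            AxisSpec::All => out.push(dim),
            // An in-bounds index removes the axis: nothing is pushed.
            AxisSpec::Index(i) if *i < dim => {}
            // A valid half-open range keeps the axis with its new length.
            AxisSpec::Range(r) if r.start <= r.end && r.end <= dim => out.push(r.end - r.start),
            _ => return Err(format!("spec for axis {} is out of bounds", axis)),
        }
    }
    Ok(out)
}
```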
slice_str (convenience)
#![allow(unused)]
fn main() {
let row = t.slice_str("0, :")?; // first row
let top2 = t.slice_str("0:2, :")?; // first two rows
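// A one-token spec needs a rank-1 tensor, so use a vector here:
let v = Tensor::from_vec(vec![1.0, 2.0, 3.0, 4.0]);
let step = v.slice_str("::2")?; // every other element of the 1-D tensor v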
}
Grammar:
| Pattern | Meaning |
|---|---|
| : | all (All) |
| n | single index (Index(n)) |
| start:end | half-open range |
| start: | from start to axis end |
| :end | from axis start to end |
| start:end:step | stepped range |
Whitespace around tokens is ignored: "0:2, :" and " 0:2 , : " are
equivalent.
slice_str always returns Result and never panics on malformed input.
It rejects specs longer than 512 bytes.
Builder vs slice_str
The builder is the primary API because it is type-checked at the call site.
slice_str is useful for exploratory work and tutorials where NumPy-familiar
syntax is more readable.
#![allow(unused)]
fn main() {
// These produce the same tensor
let a = t.slice().range(0..2).all().build()?;
let b = t.slice_str("0:2, :")?;
assert_eq!(a, b);
}
When in doubt, use the builder — it gives better error messages and is documented in examples as canonical.
Numeric Tensor ownership
Every slice result is a new contiguous owned tensor; no borrowed view of the source tensor is returned. Slicing therefore always allocates, but the API is lifetime-free and results can be passed across function boundaries without lifetime annotations.
Error handling
build() and slice_str() both return MattenError::Slice on:
- number of specs ≠ tensor rank;
- index out of bounds;
- range start > end or end > dimension;
- slice_str parse error (carries the original spec string).
#![allow(unused)]
fn main() {
let err = t.slice().all().build().unwrap_err(); // too few specs for rank-2
assert!(matches!(err, MattenError::Slice { .. }));
}
Boundary integration
All external-input APIs in matten are Result-zone: they never panic on
malformed data and always return Result<Tensor, MattenError>.
JSON
Canonical object form
The preferred form for programmatic use — unambiguous for any rank:
#![allow(unused)]
fn main() {
use matten::Tensor;
let t = Tensor::from_json(
r#"{"shape":[2,2],"data":[1.0,2.0,3.0,4.0]}"#
).unwrap();
assert_eq!(t.shape(), &[2, 2]);
}
Convenience nested-array form
Rank 1 and rank 2 nested arrays are also accepted:
#![allow(unused)]
fn main() {
let t = Tensor::from_json("[[1.0,2.0],[3.0,4.0]]").unwrap();
assert_eq!(t.shape(), &[2, 2]);
let v = Tensor::from_json("[1.0,2.0,3.0]").unwrap();
assert!(v.is_vector());
}
Ragged arrays and non-numeric values return MattenError::Parse:
#![allow(unused)]
fn main() {
assert!(Tensor::from_json("[[1.0,2.0],[3.0]]").is_err()); // ragged
assert!(Tensor::from_json(r#"[[1.0,"text"]]"#).is_err()); // non-numeric
}
Serde integration
Tensor implements Serialize and Deserialize using the canonical object
form (requires the serde or json feature, both on by default):
#![allow(unused)]
fn main() {
use matten::Tensor;
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
let json = serde_json::to_string(&t).unwrap();
let t2: Tensor = serde_json::from_str(&json).unwrap();
assert_eq!(t, t2);
}
File loading
#![allow(unused)]
fn main() {
let t = Tensor::load_json("examples/data/tensor_2x2.json")?;
}
File errors map to MattenError::Io; parse errors to MattenError::Parse.
CSV
Numeric CSV ingestion accepts rectangular numeric-only CSV. Shape is inferred as
[rows, cols].
#![allow(unused)]
fn main() {
let t = Tensor::from_csv("1.0,2.0,3.0\n4.0,5.0,6.0\n")?;
assert_eq!(t.shape(), &[2, 3]);
assert_eq!(t.as_slice(), &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]);
}
Errors include row and column context:
matten csv parse error: at row 1, column 1: expected f64, got "active"
#![allow(unused)]
fn main() {
let t = Tensor::load_csv("examples/data/numeric_2x3.csv")?;
}
Cargo features
| Feature | Default | What it enables |
|---|---|---|
| serde | yes | Serialize/Deserialize for Tensor |
| json | yes (implies serde) | from_json, load_json |
| csv | yes | from_csv, load_csv |
Lean install (no I/O dependencies):
matten = { version = "0.28", default-features = false }
Error mapping
| Situation | Error variant |
|---|---|
| Malformed JSON, wrong type, ragged array | MattenError::Parse { format: DataFormat::Json, .. } |
| Non-numeric CSV field, ragged rows | MattenError::Parse { format: DataFormat::Csv, .. } |
| File not found, permission error | MattenError::Io { path, source } |
| Shape/data length mismatch in JSON payload | MattenError::Parse (wraps the shape error message) |
Reductions and matrix multiplication
matten provides whole-tensor reductions, axis reductions, and explicit
matrix/vector multiplication. * remains element-wise — matrix multiplication
always requires matmul or dot.
Whole-tensor reductions
#![allow(unused)]
fn main() {
use matten::Tensor;
let v = Tensor::from_vec(vec![1.0, 2.0, 3.0, 4.0]);
v.sum() // 10.0
v.mean() // 2.5
v.min() // 1.0
v.max() // 4.0
}
All four return f64. sum and mean propagate NaN naturally (IEEE 754).
min and max return NaN if any element is NaN — this is deliberate
and documented (see below).
NaN / Inf policy
| Operation | NaN behaviour |
|---|---|
| sum | propagates (NaN + x = NaN) |
| mean | propagates |
| min | returns NaN if any element is NaN |
| max | returns NaN if any element is NaN |
| argmin / argmax | error/panic if any element is NaN (an index is ill-defined) |
#![allow(unused)]
fn main() {
let t = Tensor::from_vec(vec![1.0, f64::NAN, 3.0]);
assert!(t.min().is_nan());
assert!(t.max().is_nan());
}
Inf is handled normally: it participates in comparisons as expected.
Implementation note: min/max detect NaN explicitly and
short-circuit. They do not use f64::min/f64::max (which silently
ignore NaN).
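The difference between the two behaviours is visible with the standard library alone. The sketch below contrasts an explicit NaN check with a fold over `f64::min`; both function names are hypothetical, and the first is only one plausible way to write the short-circuit described above.

```rust
/// Sketch of a NaN-propagating minimum: returns NaN as soon as one is seen.
fn nan_propagating_min(values: &[f64]) -> f64 {
    let mut best = f64::INFINITY;
    for &v in values {
        if v.is_nan() {
            return f64::NAN; // short-circuit on the first NaN
        }
        if v < best {
            best = v;
        }
    }
    best
}

/// For contrast: folding with f64::min drops NaN, because f64::min returns
/// the other operand when one side is NaN.
fn nan_ignoring_min(values: &[f64]) -> f64 {
    values.iter().copied().fold(f64::INFINITY, f64::min)
}
```

On `[1.0, NaN, 3.0]` the first function returns NaN and the second returns `1.0`, which is the silent loss of information the policy avoids.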
Index reductions (argmin / argmax, RFC-038)
argmin/argmax return the flat, row-major index of the smallest/largest
element, with the first occurrence winning ties:
#![allow(unused)]
fn main() {
use matten::Tensor;
let t = Tensor::new(vec![2.0, 9.0, 3.0, 1.0, 0.0, 4.0], &[2, 3]);
assert_eq!(t.argmin(), 4); // the 0.0
assert_eq!(t.argmax(), 1); // the 9.0
}
Unlike the value reductions above, an index is ill-defined when any element is
NaN. These therefore follow the selection branch of the NaN policy:
try_argmin/try_argmax return MattenError::InvalidArgument, and the convenience
argmin/argmax panic with the same context. (On a dynamic tensor the try_* forms
return MattenError::Unsupported; call try_numeric() first.)
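A std-only sketch of the same policy (first occurrence wins, NaN is an error) might look like this. `argmin` here is a hypothetical free function with a `String` error, not matten's method.

```rust
/// Hypothetical sketch: flat index of the smallest value, first occurrence
/// winning ties; any NaN is reported as an error.
fn argmin(values: &[f64]) -> Result<usize, String> {
    let mut best: Option<(usize, f64)> = None;
    for (i, &v) in values.iter().enumerate() {
        if v.is_nan() {
            return Err(format!("NaN at flat index {}", i));
        }
        match best {
            Some((_, b)) if v >= b => {} // keep the earlier index on ties
            _ => best = Some((i, v)),
        }
    }
    best.map(|(i, _)| i).ok_or_else(|| "empty input".to_string())
}
```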
Axis reductions
#![allow(unused)]
fn main() {
// [[1,2,3],[4,5,6]]
let m = Tensor::new(vec![1.0,2.0,3.0,4.0,5.0,6.0], &[2,3]);
m.sum_axis(0) // column sums -> shape [3] -> [5,7,9]
m.sum_axis(1) // row sums -> shape [2] -> [6,15]
m.mean_axis(0) // column means -> shape [3] -> [2.5,3.5,4.5]
m.mean_axis(1) // row means -> shape [2] -> [2.0,5.0]
}
The reduced axis is removed from the output shape. Reducing a vector along its only axis gives a scalar-shaped tensor.
sum_axis and mean_axis both panic with an actionable message if axis >= ndim.
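For a rank-2 tensor, an axis reduction amounts to choosing which coordinate survives. The following std-only sketch shows this for sums on a flat row-major buffer; `sum_axis_2d` is a hypothetical helper, restricted to rank 2 for brevity.

```rust
/// Hypothetical sketch: sum a rows x cols row-major buffer along `axis`.
/// axis 0 collapses rows (one sum per column); axis 1 collapses columns.
fn sum_axis_2d(data: &[f64], rows: usize, cols: usize, axis: usize) -> Vec<f64> {
    assert_eq!(data.len(), rows * cols, "data length must match rows * cols");
    assert!(axis < 2, "axis {} is out of range for a rank-2 tensor", axis);
    let mut out = vec![0.0; if axis == 0 { cols } else { rows }];
    for r in 0..rows {
        for c in 0..cols {
            // The surviving coordinate selects the output slot.
            let target = if axis == 0 { c } else { r };
            out[target] += data[r * cols + c];
        }
    }
    out
}
```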
Vector dot product
#![allow(unused)]
fn main() {
let a = Tensor::from_vec(vec![1.0, 2.0, 3.0]);
let b = Tensor::from_vec(vec![4.0, 5.0, 6.0]);
let d = a.dot(&b);
assert!(d.is_scalar());
assert_eq!(d.as_slice(), &[32.0]); // 1*4 + 2*5 + 3*6
}
dot on two vectors [n] and [n] returns a scalar tensor (shape []).
Matrix multiplication
matmul is an alias for dot. Use whichever reads more clearly.
| Left shape | Right shape | Result shape |
|---|---|---|
| [n] | [n] | [] scalar |
| [m, n] | [n] | [m] |
| [n] | [n, p] | [p] |
| [m, n] | [n, p] | [m, p] |
#![allow(unused)]
fn main() {
let a = Tensor::new(vec![1.0,2.0,3.0,4.0], &[2,2]);
let b = Tensor::new(vec![5.0,6.0,7.0,8.0], &[2,2]);
let c = a.matmul(&b);
// [[19,22],[43,50]]
assert_eq!(c.as_slice(), &[19.0, 22.0, 43.0, 50.0]);
}
Incompatible shapes panic with an actionable message including both shapes. Batched matmul (rank > 2) is out of scope for the numeric core.
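The textbook triple loop behind a rank-2 product looks as follows on flat row-major buffers. This is a generic std-only sketch under the assumption of an `[m, n]` by `[n, p]` product, with a hypothetical function name; it is not copied from matten's source.

```rust
/// Sketch: multiply an m x n matrix by an n x p matrix, both row-major.
fn matmul(a: &[f64], b: &[f64], m: usize, n: usize, p: usize) -> Vec<f64> {
    assert_eq!(a.len(), m * n, "left operand must hold m * n values");
    assert_eq!(b.len(), n * p, "right operand must hold n * p values");
    let mut out = vec![0.0; m * p];
    for i in 0..m {
        for j in 0..p {
            // out[i, j] = sum over k of a[i, k] * b[k, j]
            let mut acc = 0.0;
            for k in 0..n {
                acc += a[i * n + k] * b[k * p + j];
            }
            out[i * p + j] = acc;
        }
    }
    out
}
```

A vector dot product is the special case `m = 1, p = 1`, which is why a single function can cover every row of the shape table.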
Axis reductions (min and max)
min_axis and max_axis reduce along an axis, removing it from the output
shape, and propagate NaN the same way min and max do.
#![allow(unused)]
fn main() {
use matten::Tensor;
// [[3,1,4],[1,5,9]]
let m = Tensor::new(vec![3.0,1.0,4.0,1.0,5.0,9.0], &[2,3]);
m.min_axis(0) // column minimums -> shape [3] -> [1.0, 1.0, 4.0]
m.max_axis(0) // column maximums -> shape [3] -> [3.0, 5.0, 9.0]
m.min_axis(1) // row minimums -> shape [2] -> [1.0, 1.0]
m.max_axis(1) // row maximums -> shape [2] -> [4.0, 9.0]
}
NaN propagation: if any element along the reduced axis is NaN, the output
for that position is NaN.
* is always element-wise
#![allow(unused)]
fn main() {
let a = Tensor::new(vec![1.0,2.0,3.0,4.0], &[2,2]);
let b = Tensor::new(vec![5.0,6.0,7.0,8.0], &[2,2]);
let elem = &a * &b; // [5, 12, 21, 32] ← element-wise
let mat = a.matmul(&b); // [19, 22, 43, 50] ← matrix product
}
matten never overloads * for matrix multiplication. If you need the matrix
product, always call matmul or dot explicitly.
Performance note
matmul uses plain nested loops — correct and readable, but not
cache-optimised. For large matrices, migrate the flat data to ndarray or
nalgebra:
#![allow(unused)]
fn main() {
let flat: Vec<f64> = tensor.into_vec();
// hand off to your preferred crate
}
See also
For the three linalg-adjacent helpers norm, trace, and outer — and the list
of advanced linear algebra that is intentionally out of core scope — see
Linear algebra (core-lite).
For population variance and standard deviation — var, std, var_axis,
std_axis — see Statistics (core-lite).
Linear algebra (core-lite)
Core matten provides small linalg-adjacent helpers, not a linear algebra backend. matten prioritizes PoC ergonomics, not numerical linear algebra performance or stability leadership.
matten offers exactly three linalg-adjacent helpers (RFC-041), alongside the
dot/matmul already in Reductions and matrix multiplication:
- norm: L2 / Frobenius norm over all elements.
- trace: diagonal sum of a rank-2 tensor.
- outer: rank-1 × rank-1 outer product.
norm
#![allow(unused)]
fn main() {
Tensor::norm(&self) -> f64
}
The L2 / Frobenius norm over all elements: sqrt(sum(x_i^2)). It works at any
rank — for a matrix this is the Frobenius norm. NaN propagates (any NaN element
yields NaN). No overflow-avoidance scaling is applied, so extreme magnitudes may
overflow to infinity.
Like the other value reductions (sum, mean), norm has no try_* form; it
panics on a dynamic tensor (convert with try_numeric() first).
norm([3, 4]) = 5 // sqrt(9 + 16)
norm([[1, 2], [2, 4]]) = 5 // Frobenius: sqrt(1 + 4 + 4 + 16)
trace
#![allow(unused)]
fn main() {
Tensor::trace(&self) -> f64
Tensor::try_trace(&self) -> Result<f64, MattenError>
}
The sum of the diagonal of a rank-2 tensor. Rectangular matrices are allowed:
the trace sums self[i, i] for i in 0..min(rows, cols).
trace([[1, 2], [3, 4]]) = 5 // 1 + 4
trace([[1, 2, 3], [4, 5, 6]]) = 6 // min(2,3)=2 -> self[0,0] + self[1,1]
try_trace returns MattenError::Shape if the tensor is not rank-2, or
MattenError::Unsupported on a dynamic tensor; trace panics in those cases.
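The rectangular rule is a one-liner on a flat row-major buffer. A std-only sketch, with `trace` as a hypothetical free function rather than the matten method:

```rust
/// Sketch: sum of data[i, i] for i in 0..min(rows, cols), row-major layout.
fn trace(data: &[f64], rows: usize, cols: usize) -> f64 {
    assert_eq!(data.len(), rows * cols, "data length must match rows * cols");
    // Element (i, i) lives at flat offset i * cols + i.
    (0..rows.min(cols)).map(|i| data[i * cols + i]).sum()
}
```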
outer
#![allow(unused)]
fn main() {
Tensor::outer(&self, other: &Tensor) -> Tensor
Tensor::try_outer(&self, other: &Tensor) -> Result<Tensor, MattenError>
}
The outer product of two rank-1 tensors: out[i, j] = self[i] * other[j], with
shape [self.len(), other.len()]. The output is checked against
MattenLimits before allocation.
[1, 2, 3] ⊗ [4, 5] -> [[4, 5], [8, 10], [12, 15]] // shape [3, 2]
try_outer returns MattenError::Shape if either input is not rank-1,
MattenError::Unsupported on a dynamic tensor, or MattenError::Allocation if the
result exceeds the limit; outer panics in those cases.
Out of scope for core
The following are intentionally not in core matten (RFC-041 §5):
inverse determinant solve least_squares
eigenvalues eigenvectors SVD QR
LU Cholesky sparse BLAS / LAPACK
For serious numerical linear algebra, use a specialized crate such as nalgebra
or ndarray-linalg. A future matten-nalgebra / matten-ndarray-linalg bridge
would require its own RFC.
Example
See 15_norm_trace_outer.rs
for a runnable walkthrough.
Statistics (core-lite)
Core matten provides exactly four statistics reductions (RFC-040), alongside the
mean/mean_axis already in Reductions and matrix multiplication:
- var / std: population variance / std over all elements.
- var_axis / std_axis: the same along one axis.
Anything with significant statistical policy — quantile, percentile, histogram,
covariance, correlation, z-score, sample variance — is out of core scope (a
possible future matten-stats companion). matten is a family-car PoC library,
not a statistics package.
Population variance, not sample variance
All four use population variance (ddof = 0):
mean = sum(x) / n
var = sum((x - mean)^2) / n
std = sqrt(var)
There is no sample-variance (ddof = 1) variant, no var_with_ddof, and no
nanvar/nanstd in core. A single-element tensor has variance 0.0. A two-pass
algorithm is used (mean first, then squared deviations) to avoid the
catastrophic cancellation that the naive one-pass E[x^2] - E[x]^2 formula suffers when the mean is large relative to the spread.
NaN propagates: any NaN element yields NaN (per-slice for the axis variants),
consistent with the other f64 reductions. Use try_numeric() to convert a
dynamic tensor first; the statistics methods reject dynamic tensors.
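The two formulations can be compared with a short std-only sketch. Both functions are hypothetical illustrations of the formulas above, not matten's source; the naive version is included only for contrast.

```rust
/// Two-pass population variance (ddof = 0): mean first, then squared deviations.
fn population_variance(values: &[f64]) -> f64 {
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    values.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n
}

/// Naive one-pass form E[x^2] - E[x]^2. It subtracts two large, nearly equal
/// numbers, so it can lose most of its significant digits.
fn naive_variance(values: &[f64]) -> f64 {
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    let mean_sq = values.iter().map(|x| x * x).sum::<f64>() / n;
    mean_sq - mean * mean
}
```

For `[1.0, 2.0, 3.0, 4.0]` the two-pass version returns `1.25`, and it still returns `1.25` when every value is shifted by `1e9`, because the deviations from the mean are computed before squaring.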
var / std
#![allow(unused)]
fn main() {
Tensor::var(&self) -> f64
Tensor::std(&self) -> f64
Tensor::try_var(&self) -> Result<f64, MattenError>
Tensor::try_std(&self) -> Result<f64, MattenError>
}
[1, 2, 3, 4] -> mean 2.5, var 1.25, std sqrt(1.25) ≈ 1.118
The try_* forms return MattenError::Unsupported on a dynamic tensor. They also
guard the empty-tensor case with MattenError::InvalidArgument, but matten
forbids zero-sized dimensions, so an empty tensor is not constructible and that
branch is unreachable through normal construction.
var_axis / std_axis
#![allow(unused)]
fn main() {
Tensor::var_axis(&self, axis: usize) -> Tensor
Tensor::std_axis(&self, axis: usize) -> Tensor
Tensor::try_var_axis(&self, axis: usize) -> Result<Tensor, MattenError>
Tensor::try_std_axis(&self, axis: usize) -> Result<Tensor, MattenError>
}
The reduced axis is removed from the output shape (no keepdims), matching the
existing axis reductions (mean_axis, sum_axis):
[[1, 2, 3], [4, 5, 6]] var_axis(0) -> [2.25, 2.25, 2.25] // shape [3], per column
[[1, 2, 3], [4, 5, 6]] var_axis(1) -> [2/3, 2/3] // shape [2], per row
The try_* forms return MattenError::Shape if axis >= rank, or
MattenError::Unsupported on a dynamic tensor.
Out of scope for core
sample variance (ddof = 1) quantile percentile
histogram covariance correlation
z-score nanvar/nanstd statistical tests
These are deferred to a possible future matten-stats companion, which would only
be created once at least three clearly-useful, well-scoped APIs are accepted
(RFC-040 §9). Some (z-score) overlap with matten-mlprep and must not be
duplicated.
Example
See 16_variance_std.rs
for a runnable walkthrough.
Dynamic feature (Element model)
The dynamic feature enables heterogeneous dynamic tensors. Enable it in
Cargo.toml:
matten = { version = "0.28", features = ["dynamic"] }
matten is not a dataframe library. The dynamic feature is for ingesting
and cleaning messy PoC data before converting to numeric tensors or handing off
to a specialised crate.
Element variants
#![allow(unused)]
fn main() {
use matten::Element;
Element::Float(1.5) // IEEE 754 f64
Element::Int(42) // i64
Element::text("active") // UTF-8 text (Arc<str> internally)
Element::Bool(true) // boolean
Element::None // missing / null
}
size_of::<Element>() == 24 bytes on 64-bit targets (all text representations
give the same size; Arc<str> was chosen for cheap clone in CoW slices).
Constructing dynamic tensors
#![allow(unused)]
fn main() {
use matten::{Element, Tensor};
let t = Tensor::from_elements(
vec![
Element::Float(1.0), Element::text("ok"), Element::Bool(true),
Element::Int(2), Element::None, Element::Bool(false),
],
&[2, 3],
);
// Boundary-safe variant:
let t = Tensor::try_from_elements(data, &[2, 3])?;
}
Element predicates and coercion
#![allow(unused)]
fn main() {
Element::None.is_none() // true
Element::Float(1.0).is_numeric() // true
Element::Int(42).is_numeric() // true
Element::Bool(true).is_numeric() // false — no silent bool coercion
Element::Float(1.5).try_as_f64() // Some(1.5)
Element::Int(7).try_as_f64() // Some(7.0)
Element::text("3").try_as_f64() // None — no silent text coercion
Element::None.try_as_f64() // None
}
Coercion policy (RFC-011 §11)
| From | To f64 | Allowed? |
|---|---|---|
| Float(f64) | itself | yes |
| Int(i64) | cast | yes |
| Bool | - | no |
| Text | - | no |
| None | - | no |
Use fill_none or explicit conversion helpers to clean data before arithmetic.
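The policy table maps directly onto a match expression. The sketch below uses a hypothetical stand-in enum, `Elem`, with a `String` payload for text; it mirrors the variants listed earlier but is not matten's `Element` type.

```rust
/// Hypothetical stand-in for a dynamic element (not matten's Element).
#[derive(Debug, Clone, PartialEq)]
enum Elem {
    Float(f64),
    Int(i64),
    Text(String),
    Bool(bool),
    Missing,
}

impl Elem {
    /// Only Float and Int coerce to f64; everything else is refused.
    fn try_as_f64(&self) -> Option<f64> {
        match self {
            Elem::Float(x) => Some(*x),
            Elem::Int(i) => Some(*i as f64),
            // No silent coercion from text, booleans, or missing values.
            Elem::Text(_) | Elem::Bool(_) | Elem::Missing => None,
        }
    }
}
```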
Accessing elements
#![allow(unused)]
fn main() {
t.get_element(&[0, 1]) // Option<Element> — None if out of bounds
t.is_dynamic() // true for dynamic tensors
t.to_elements() // Vec<Element> in row-major order
}
Missing-value utilities
#![allow(unused)]
fn main() {
use matten::{Element, Tensor};
let t = Tensor::from_elements(
vec![Element::Float(1.0), Element::None, Element::Float(3.0), Element::None],
&[4],
);
// Count None values
t.count_none() // 2
// Boolean-like mask: 1.0 where None, 0.0 elsewhere (numeric f64 tensor)
let mask = t.none_mask(); // [0.0, 1.0, 0.0, 1.0]
// RFC-011 named alias:
let mask = t.is_none_mask(); // identical result
// Constant fill
let filled = t.fill_none(Element::Float(0.0)); // [1.0, 0.0, 3.0, 0.0]
// Forward-fill: carry last non-None value forward (fallback for leading None)
let t2 = Tensor::from_elements(
vec![Element::None, Element::Float(1.0), Element::None, Element::Float(4.0)],
&[4],
);
let fwd = t2.forward_fill_none(Element::Float(-1.0));
// [-1.0, 1.0, 1.0, 4.0] (leading None takes fallback)
// Sum skipping None (panics on non-numeric non-None elements)
t.sum_skip_none() // 4.0 (1.0 + 3.0, None values skipped)
}
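The forward-fill and skip-sum semantics can be reproduced on a plain slice of `Option<f64>`, which is a convenient mental model for a numeric column with gaps. Both helpers below are hypothetical std-only sketches, not matten methods.

```rust
/// Sketch: carry the last present value forward; leading gaps take `fallback`.
fn forward_fill(values: &[Option<f64>], fallback: f64) -> Vec<f64> {
    let mut last = fallback;
    values
        .iter()
        .map(|v| {
            if let Some(x) = v {
                last = *x; // remember the most recent present value
            }
            last
        })
        .collect()
}

/// Sketch: sum of the present values, skipping the gaps.
fn sum_skip_missing(values: &[Option<f64>]) -> f64 {
    values.iter().flatten().sum()
}
```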
Parsing mixed data
#![allow(unused)]
fn main() {
// JSON: null→None, booleans→Bool, strings→Text, integers→Int, floats→Float
#[cfg(feature = "json")]
let t = Tensor::from_json_dynamic(r#"[[1, "active", true], [2, null, false]]"#)?;
// CSV: empty field→None, "true"/"false"→Bool, integers→Int, floats→Float, rest→Text
#[cfg(feature = "csv")]
let t = Tensor::from_csv_dynamic("1,active,true\n2,,false\n")?;
}
Current limitations (guard model)
In the current release, many numeric operations reject dynamic
tensors with a clear matten unsupported error message. You must convert
to a numeric tensor first using try_numeric().
Guarded (will panic or return Err):
- reshape, flatten, transpose, swap_axes
- slice() builder and slice_str() → MattenError::Unsupported
- all arithmetic operators and reductions
- dot / matmul
- as_slice, to_vec, into_vec, get, get_flat
- Serialize / serde
The underlying Arc-based CoW storage (DynamicTensor) is implemented
internally and will back future public dynamic slicing and reshape in a later
release.
#![allow(unused)]
fn main() {
// Correct pattern: ingest → clean → convert → arithmetic
let raw = Tensor::from_csv_dynamic("1.0,2.0\n3.0,4.0\n")?;
let filled = raw.fill_none(Element::Float(0.0));
let numeric: Tensor = filled.try_numeric()?; // convert to numeric
let result = &numeric * 2.0; // numeric arithmetic
}
Workflow pattern
#![allow(unused)]
fn main() {
use matten::{Element, Tensor};
fn process_messy_csv(input: &str) -> Result<Tensor, Box<dyn std::error::Error>> {
// 1. Ingest as dynamic
let raw = Tensor::from_csv_dynamic(input)?;
// 2. Fill missing values
let clean = raw.fill_none(Element::Float(0.0));
// 3. Convert to numeric tensor for arithmetic
let numeric = clean.try_numeric()?;
// 4. Use numeric arithmetic, reductions, matmul...
Ok(numeric)
}
}
Limitations
- No dataframe joins, group-by, pivot, or query operations.
- No date/time dtype.
- No categorical dtype.
- No silent text-to-number or bool-to-number coercion.
- Batched matmul on dynamic tensors requires try_numeric first.
- For large datasets, consider specialised crates (polars, ndarray).
Migration to specialised libraries
For the full narrative guide — when to stay vs. migrate, a target-selection matrix, and per-target playbooks — see the Production migration guide. This reference page is the quick, copy-paste companion: data-export snippets and minimal conversions.
matten is a starting point, not an endpoint. When a PoC graduates to
production or numerical performance becomes critical, migrate the data to a
specialised crate. This page shows how.
When to migrate
| Signal | Recommended path |
|---|---|
| Matrix operations on > 1 000 × 1 000 data | ndarray + BLAS, or nalgebra |
| Machine learning / automatic differentiation | candle, burn, or tch |
| Large sparse data | sprs or domain-specific crates |
| Web API payloads needing serde but no math | stay with matten |
| Mixed messy data → clean numeric → arithmetic | stay with matten dynamic |
Exporting data from matten
Every matten tensor exposes its flat row-major data. The simplest data-export
path is:
#![allow(unused)]
fn main() {
let flat: Vec<f64> = tensor.into_vec(); // consuming, no copy
// or
let flat: Vec<f64> = tensor.to_vec(); // borrowing clone
}
The shape is available as:
#![allow(unused)]
fn main() {
let shape: &[usize] = tensor.shape();
}
To ndarray
The bridge-first path uses the matten-ndarray crate (copies, numeric-only, rejects
dynamic tensors, preserves logical row-major order — see the
bridge contract):
#![allow(unused)]
fn main() {
use matten::Tensor;
use matten_ndarray::to_arrayd;
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
let arr = to_arrayd(&t)?; // ArrayD<f64>, logical row-major
println!("{arr}");
}
Without the bridge crate, convert manually from the flat Vec<f64> plus shape:
#![allow(unused)]
fn main() {
use matten::Tensor;
use ndarray::ArrayD;
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
let shape: Vec<usize> = t.shape().to_vec();
let flat: Vec<f64> = t.into_vec();
let arr = ArrayD::from_shape_vec(shape, flat).unwrap();
println!("{arr}");
}
ndarray supports BLAS-backed matrix multiplication, advanced indexing,
views, and strided arrays.
To nalgebra
#![allow(unused)]
fn main() {
use matten::Tensor;
use nalgebra::DMatrix;
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
let flat: Vec<f64> = t.into_vec();
// DMatrix stores data column-major; from_row_slice reads the row-major
// source correctly, so no manual transpose is needed here.
let mat = DMatrix::from_row_slice(2, 2, &flat);
println!("{mat}");
}
nalgebra provides static and dynamic matrices, LU/QR/SVD decomposition,
and linear algebra operations.
To candle (ML tensors)
#![allow(unused)]
fn main() {
use matten::Tensor;
// candle_core = { version = "0.x", features = ["..."] }
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
// matten uses f64; convert to f32 if the Candle workflow wants f32.
let flat_f32: Vec<f32> = t.as_slice().iter().map(|&v| v as f32).collect();
let shape = t.shape().to_vec();
// let candle_t = candle_core::Tensor::from_vec(flat_f32, shape, &device)?;
println!("data ready for candle: {flat_f32:?}, shape: {shape:?}");
}
candle targets GPU-accelerated ML workflows (transformers, training loops).
Dynamic tensors: clean then migrate
If your data went through matten’s dynamic feature, convert to a numeric tensor
first:
#![allow(unused)]
fn main() {
use matten::{Element, Tensor};
let raw = Tensor::from_csv_dynamic("1.0,2.0\n3.0,4.0\n")?;
let filled = raw.fill_none(Element::Float(0.0));
let numeric: Tensor = filled.try_numeric()?; // MattenError if non-numeric
let flat: Vec<f64> = numeric.into_vec(); // hand off
}
Allocation warning
matten clones on every reshape and slice. For large datasets,
migrate before performing many transformations:
#![allow(unused)]
fn main() {
// Prefer this pattern for large data:
let result = compute_in_matten(&small_data);
let flat = result.into_vec();
// then pass `flat` to ndarray/nalgebra for the heavy lifting
}
Compatibility promise (v0.x)
During v0.x, API changes are allowed but minimised after a release. The core
Tensor type, the public error model, and the panic-vs-Result split are stable
design decisions and will not change without a documented breaking change. See the
public API snapshot and the CHANGELOG for the exact
current export surface.
v1.0.0 requires explicit maintainer confirmation and a full public API
review. See the project CHANGELOG for migration notes on any breaking changes.
Compatibility and stability policy
Public API contract
matten exports the following public names from the crate root:
#![allow(unused)]
fn main() {
use matten::Tensor; // always
use matten::{MattenError, DataFormat}; // always
use matten::MattenLimits; // always (RFC-018)
use matten::SliceBuilder; // always; returned by Tensor::slice()
use matten::Element; // #[cfg(feature = "dynamic")]
use matten::NumericPolicy; // #[cfg(feature = "dynamic")] (RFC-017)
}
SliceBuilder is returned by Tensor::slice() and is held by value; users
do not need to import it by name in the common case.
IntoSliceRange and SliceConvert are hidden implementation plumbing for
SliceBuilder::range. They are exported #[doc(hidden)] and use a private
sealed::Sealed supertrait so downstream crates cannot meaningfully implement
them. Users never need to name them in imports.
SliceSpecRepr is #[doc(hidden)]; it is a visibility-chain artefact and
not part of the stable API.
Panic zone vs Result zone
This split is a permanent design decision and will not change:
| Zone | When | Guarantee |
|---|---|---|
| Panic | Local, trusted, literal construction | Rich matten … error in …: message |
| Result | Any external boundary (parsing, files, user shapes) | Result<Tensor, MattenError> — never panics on ordinary input |
See Error model for the full list of each zone’s APIs.
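The split can be pictured as a panicking convenience wrapper layered over a non-panicking try_ form. The sketch below uses hypothetical stand-in types (Grid and GridError are not matten names) and is only an illustration of the pattern, not matten's implementation:

```rust
// Hypothetical stand-in types for illustration; not matten's actual code.
#[derive(Debug, PartialEq)]
struct Grid {
    data: Vec<f64>,
    shape: Vec<usize>,
}

#[derive(Debug, PartialEq)]
enum GridError {
    Shape { expected: usize, actual: usize },
}

impl Grid {
    /// Result zone: reports a mismatch as a value instead of panicking.
    fn try_new(data: Vec<f64>, shape: &[usize]) -> Result<Grid, GridError> {
        let expected: usize = shape.iter().product();
        if expected != data.len() {
            return Err(GridError::Shape { expected, actual: data.len() });
        }
        Ok(Grid { data, shape: shape.to_vec() })
    }

    /// Panic zone: thin wrapper for trusted, literal construction.
    fn new(data: Vec<f64>, shape: &[usize]) -> Grid {
        match Grid::try_new(data, shape) {
            Ok(grid) => grid,
            Err(err) => panic!("grid shape error in new: {err:?}"),
        }
    }
}

fn main() {
    // Literal data we control: the panicking form is convenient.
    let ok = Grid::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
    assert_eq!(ok.shape, vec![2, 2]);
    assert_eq!(ok.data.len(), 4);

    // Data from an external boundary: use the Result form.
    let err = Grid::try_new(vec![1.0, 2.0, 3.0], &[2, 2]);
    assert_eq!(err, Err(GridError::Shape { expected: 4, actual: 3 }));
}
```

Keeping the panicking form as a wrapper means both zones share one validation path, so their behaviour cannot drift apart.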
Feature flags
| Feature | Default | Stability |
|---|---|---|
| serde | yes | stable |
| json | yes | stable |
| csv | yes | stable |
| dynamic | no | stable (dynamic ingestion) |
Disabling default features is supported: default-features = false gives
the lean core. Enabling dynamic does not rename or remove any numeric Tensor API.
v0.x compatibility
matten is on the v0.x line. The policy:
- Breaking changes are allowed but must be documented in CHANGELOG.
- Public API churn decreases after each minor release.
- Feature-gated additions (new #[cfg] methods) are not breaking.
- #[non_exhaustive] on MattenError and DataFormat means match arms must include a wildcard, because new variants may be added without a semver break.
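A minimal sketch of the wildcard-arm requirement, using a hypothetical local enum (LibError) as a stand-in for a #[non_exhaustive] enum from another crate. Within the defining crate the attribute is not enforced, so this compiles either way; downstream crates are the ones that must keep the wildcard:

```rust
// Hypothetical stand-in for a #[non_exhaustive] error enum exported by a
// library. The variant names are illustrative only.
#[non_exhaustive]
#[derive(Debug)]
enum LibError {
    Shape { message: String },
    Unsupported { message: String },
    Parse { message: String },
}

fn describe(err: &LibError) -> String {
    match err {
        LibError::Shape { message } => format!("shape problem: {message}"),
        LibError::Unsupported { message } => format!("unsupported: {message}"),
        // Wildcard arm: handles Parse today and any variant added later.
        other => format!("other error: {other:?}"),
    }
}

fn main() {
    let shape = LibError::Shape { message: "expected 4 elements".to_string() };
    let unsupported = LibError::Unsupported { message: "dynamic tensor".to_string() };
    let parse = LibError::Parse { message: "bad row".to_string() };
    println!("{}", describe(&shape));
    println!("{}", describe(&unsupported));
    println!("{}", describe(&parse));
}
```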
v1.0 requirements
v1.0.0 requires explicit maintainer confirmation. Before that can happen:
- public API review must be complete;
- cargo public-api snapshot must be taken and approved;
- the panic/Result split must be finalised;
- the serde canonical format must be declared stable;
- limitations and non-goals must be clearly documented.
MSRV
rust-version = "1.85" (Rust 2024 edition). The MSRV may be relaxed in a
future release; it will not be raised without a documented breaking change.
Deferred items
The following items were considered and explicitly deferred:
| Item | Status | Reason |
|---|---|---|
| is_empty() | Deferred | Zero-sized dims rejected; always false. Future RFC. |
| set_flat | Not implemented | Mutation API deferred. |
| arange max elements | 1<<20 (~1 M) | Lowered from 1<<28 in v0.12.0 for OOM safety. |
| get_flat | Implemented | Tensor::get_flat(index) -> Option<f64> added in v0.11.0. |
| Negative slice indices | Deferred | Not in RFC-008 grammar for 0.1.0. |
| Step slicing ::2 | Supported | slice_str("0:10:2") grammar works. |
| Mutable element API | Deferred | Internal Arc-shared storage / CoW is implemented; the public mutation API that would expose CoW is intentionally deferred. |
| Dynamic slicing via builder | Deferred | slice().build() works on numeric tensors only. Use get_element column-by-column for dynamic. |
| Batched matmul (rank > 2) | Deferred | RFC-010 scope: [m,n]×[n,p] maximum. |
| Axis reductions on dynamic | Not needed yet | Convert with try_numeric() first. |
Phase status
The v0.20 family completed the materialization phase: the core numeric comfort
APIs (RFC-038 — elementwise, selection, creation, and shape helpers) and the
30_–40_ famous-problem examples program (RFC-043–048). The matten-data
CSV→tensor ingestion API first shipped in this family (RFC-034, RFC-035).
The v0.21 family delivered selective boundary implementation: shape composition
(concatenate / stack), small statistics (var / std), linalg-lite helpers
(norm / trace / outer), and the matten-data scope guard. These are additive
under lock-step family versioning (RFC-030).
The v0.22 family promotes matten-data to Beta: the RFC-036 example suite
(data_00–data_05) plus an explicit malformed-CSV test complete the documented
Beta gate (RFC-023 §9). Maturity is a per-crate Status label, not a separate version,
under lock-step family versioning (RFC-030).
The v0.23 family adds the production migration guide (RFC-050–052): when to stay vs.
migrate, per-target playbooks (ndarray, nalgebra, Polars/Pandas, Candle, NumPy), and the
bridge conversion-contract template with the matten-ndarray reference contract. This is
documentation only — no public API, runtime, or dependency change, and core matten gains
no dependency.
The v0.24 family completes the reduction surface (RFC-055 / RFC-056): every scalar value
reduction (try_sum / try_mean / try_min / try_max / try_norm) and every axis reduction
(try_sum_axis / try_mean_axis / try_min_axis / try_max_axis) now has a non-panicking
Result form, joining try_var / try_std and their axis variants. The panic forms are
unchanged in behaviour and remain convenience wrappers. These are additive under lock-step
family versioning (RFC-030); no existing signature, numeric result, output shape, NaN policy,
or dependency changes, and core matten gains no dependency.
The v0.25 family opens the companion-maturity line by promoting matten-ndarray from
production-ready candidate to production-ready (RFC-057). This is a maturity Status label
only — no API, runtime, error-variant, dependency, copy-semantics, or ndarray-version change —
and it does not imply v1.0, which still requires explicit maintainer confirmation. Under
lock-step family versioning (RFC-030) the crate stays on the shared family version. matten-mlprep
and matten-data remain at Beta pending their own maturity decisions.
The v0.26 family continues the companion-maturity line by promoting matten-mlprep from
Beta to production-ready candidate (RFC-058). Label/docs only — no API, runtime,
error-variant, or dependency change. The candidate rung reflects an honest limitation:
train_test_split is ordered-only (no shuffle), so the crate suits workflows that can accept
that documented limit; full production-ready is deferred (RFC-058 §5.1). This does not imply v1.0.
matten-ndarray remains production-ready; matten-data remains Beta pending its own
maturity decision.
The v0.27 family completes the companion-maturity line by promoting matten-data from
Beta to production-ready candidate (RFC-059), with two promotion-blocking hygiene fixes
first (a maturity-neutral package description; required-features = ["csv"] on the data_0X
examples). Label/docs/packaging only — no API, runtime, error-variant, or dependency change, and
no scope expansion: the RFC-042 lock holds (still a CSV→tensor on-ramp, not a dataframe
engine). Full production-ready is deferred to a separate future review. This does not imply
v1.0. The ladder now reads matten-ndarray production-ready, matten-mlprep and matten-data
production-ready candidates.
The v0.28 family moves the matten-ndarray bridge to ndarray 0.17 (RFC-062): the
supported requirement changes from the 0.16 minor to 0.17 (CI targets 0.17.2). Because
to_arrayd/from_arrayd expose ndarray::ArrayD<f64>, the supported ndarray minor is part of the
bridge’s public type identity — consumers build against ndarray 0.17. (ndarray 0.17.0 is yanked;
use a non-yanked 0.17 patch.) RFC-062 first evaluated supporting 0.16 and 0.17 together via a
bounded range; the maintainer chose the single-version requirement to keep Cargo.toml simple and
readable — the architect ruling listed this as an acceptable alternative. No bridge API, behavior,
copy-semantics, error, or zero-copy change, and core matten still carries no ndarray dependency.
This is a public-dependency compatibility event handled as a lock-step family minor (RFC-030); it
does not imply v1.0.
Public API snapshot
This page lists every public item in matten at the current v0.28 family. It serves as the
baseline for tracking breaking changes toward v1.0.0 and as the review gate
required by RFC-015.
Root exports
#![allow(unused)]
fn main() {
// Primary user-facing types
pub use matten::Tensor;
pub use matten::MattenError;
pub use matten::DataFormat;
pub use matten::MattenLimits; // RFC-018: resource safety limits
pub use matten::SliceBuilder;
// Feature-gated
#[cfg(feature = "dynamic")]
pub use matten::Element;
#[cfg(feature = "dynamic")]
pub use matten::NumericPolicy; // RFC-017: numeric conversion policy
// Compiler-visibility plumbing — #[doc(hidden)], NOT user-facing extension points.
// IntoSliceRange and SliceConvert use a private sealed::Sealed supertrait;
// downstream crates cannot meaningfully implement either trait.
// Users never need to name them in imports.
#[doc(hidden)] pub use matten::IntoSliceRange;
#[doc(hidden)] pub use matten::SliceConvert;
#[doc(hidden)] pub use matten::SliceSpecRepr;
}
Dynamic tensor behaviour
Methods marked numeric-only panic with a matten unsupported error message
when called on a dynamic tensor. Call try_numeric() to convert first.
| Numeric method group | Dynamic behaviour |
|---|---|
| reshape, flatten, transpose, swap_axes, squeeze, expand_dims | panic |
| slice() builder, slice_str() | returns MattenError::Unsupported |
| Arithmetic operators, scalar operators | panic |
| Reductions (sum, mean, min, max, norm, *_axis) | panic; non-panicking try_* forms return Unsupported (and Shape for axis) |
| dot / matmul | panic |
| as_slice, to_vec, into_vec, get, get_flat | panic |
| From<Tensor> for Vec<f64>, From<&Tensor>, TryFrom | panic / Err |
| Serialize | returns serde error |
Tensor — construction
| Method | Returns | Notes |
|---|---|---|
| new(data, shape) | Tensor | panics on mismatch |
| try_new(data, shape) | Result<Tensor, MattenError> | |
| scalar(value) | Tensor | shape [] |
| zeros(shape) | Tensor | |
| ones(shape) | Tensor | |
| full(shape, value) | Tensor | |
| from_vec(data) | Tensor | shape [n] |
| arange(start, end, step) | Tensor | panics on invalid / too large |
| try_arange(start, end, step) | Result<Tensor, MattenError> | |
| linspace(start, end, count) | Tensor | RFC-038; count evenly spaced, both endpoints; panics if count == 0 |
| try_linspace(start, end, count) | Result<Tensor, MattenError> | RFC-038; budget-checked |
| eye(n) | Tensor | RFC-038; n × n identity; panics if n == 0 |
| try_eye(n) | Result<Tensor, MattenError> | RFC-038; budget-checked |
| try_from_rows(rows) | Result<Tensor, MattenError> | ragged → error |
| try_zeros(shape) | Result<Tensor, MattenError> | RFC-018; budget-checked |
| try_ones(shape) | Result<Tensor, MattenError> | RFC-018; budget-checked |
| try_full(shape, value) | Result<Tensor, MattenError> | RFC-018; budget-checked |
| try_zeros_with_limits(shape, limits) | Result<Tensor, MattenError> | custom budget |
| try_ones_with_limits(shape, limits) | Result<Tensor, MattenError> | custom budget |
| try_full_with_limits(shape, value, limits) | Result<Tensor, MattenError> | custom budget |
Tensor — shape inspection
| Method | Returns | Notes |
|---|---|---|
| shape() | &[usize] | |
| ndim() | usize | |
| len() | usize | logical element count |
| is_scalar() | bool | ndim == 0 |
| is_vector() | bool | ndim == 1 |
| is_matrix() | bool | ndim == 2 |
Tensor — data access (numeric Tensor)
| Method | Returns | Notes |
|---|---|---|
| as_slice() | &[f64] | panics on dynamic |
| to_vec() | Vec<f64> | clone; panics on dynamic |
| into_vec(self) | Vec<f64> | consuming; panics on dynamic |
| get(coord) | Option<f64> | panics on dynamic |
| get_flat(index) | Option<f64> | panics on dynamic |
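Coordinate access and flat access are related by the row-major layout. The sketch below shows the standard row-major index arithmetic with a hypothetical helper (flat_index is not a matten function) and assumes plain C-order storage:

```rust
/// Row-major (C order) flat index for `coord` within `shape`.
/// Returns None if the rank differs or any coordinate is out of bounds.
/// Hypothetical helper for illustration; not part of matten.
fn flat_index(shape: &[usize], coord: &[usize]) -> Option<usize> {
    if shape.len() != coord.len() {
        return None;
    }
    let mut index = 0usize;
    for (&dim, &c) in shape.iter().zip(coord) {
        if c >= dim {
            return None;
        }
        // Each step scales the running index by the current dimension.
        index = index * dim + c;
    }
    Some(index)
}

fn main() {
    // Shape [2, 3]: element (1, 2) is the last one in the flat buffer.
    assert_eq!(flat_index(&[2, 3], &[1, 2]), Some(5));
    assert_eq!(flat_index(&[2, 3], &[0, 1]), Some(1));
    // Out-of-bounds coordinates yield None rather than a wrong index.
    assert_eq!(flat_index(&[2, 3], &[2, 0]), None);
}
```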
Tensor — shape operations (numeric Tensor)
| Method | Returns | Notes |
|---|---|---|
| reshape(shape) | Tensor | panics on mismatch or dynamic |
| try_reshape(shape) | Result<Tensor, MattenError> | returns Unsupported on dynamic |
| flatten() | Tensor | panics on dynamic |
| transpose() | Tensor | reverses axes; panics on dynamic |
| t() | Tensor | alias for transpose |
| swap_axes(a, b) | Tensor | panics on dynamic |
| squeeze() | Tensor | RFC-038; removes length-1 axes; panics on dynamic |
| expand_dims(axis) | Tensor | RFC-038; inserts a length-1 axis; panics if axis > ndim or dynamic |
| try_expand_dims(axis) | Result<Tensor, MattenError> | RFC-038; InvalidArgument if axis > ndim; Unsupported on dynamic |
Tensor — shape composition (numeric Tensor, RFC-039)
Associated functions (called as Tensor::concatenate(...)), not methods. Both take
a borrowed slice &[&Tensor] and reject dynamic inputs.
| Function | Returns | Notes |
|---|---|---|
| concatenate(tensors, axis) | Tensor | joins an existing axis; panics on empty/shape/axis error or dynamic |
| try_concatenate(tensors, axis) | Result<Tensor, MattenError> | InvalidArgument if empty; Shape on rank/dim/axis (0..rank); Unsupported on dynamic; Allocation if oversized |
| stack(tensors, axis) | Tensor | joins a new axis (rank + 1); panics on empty/shape/axis error or dynamic |
| try_stack(tensors, axis) | Result<Tensor, MattenError> | InvalidArgument if empty; Shape if shapes differ or axis > rank; Unsupported on dynamic; Allocation if oversized |
repeat/tile/meshgrid are deferred (RFC-039 §8) and not part of the API.
Tensor — slicing (numeric Tensor)
| Method | Returns | Notes |
|---|---|---|
| slice() | SliceBuilder<'_> | returns Unsupported on dynamic |
| slice_str(spec) | Result<Tensor, MattenError> | returns Unsupported on dynamic |
SliceBuilder methods
| Method | Returns |
|---|---|
| all() | SliceBuilder |
| index(i) | SliceBuilder |
| range<R: IntoSliceRange>(r) | SliceBuilder |
| build() | Result<Tensor, MattenError> |
Tensor — arithmetic (numeric Tensor)
Operator traits implemented for &Tensor:
Add, Sub, Mul, Div, Neg — element-wise with broadcasting.
Scalar operators: &Tensor + f64, &Tensor - f64, &Tensor * f64, &Tensor / f64
(and reverse: f64 + &Tensor, f64 - &Tensor, f64 * &Tensor, f64 / &Tensor).
All panic on dynamic tensors.
Tensor — elementwise comfort math (numeric Tensor, RFC-038)
| Method | Returns | Notes |
|---|---|---|
| abs() | Tensor | elementwise; shape preserved |
| sqrt() | Tensor | negative element → NaN |
| exp() | Tensor | natural exponential e^x |
| ln() | Tensor | ln(0.0) → -inf, negative → NaN |
| clip(min, max) | Tensor | clamp; panics if min > max |
| try_clip(min, max) | Result<Tensor> | InvalidArgument if min > max; Unsupported on dynamic |
All panic on dynamic tensors (except try_clip, which returns Unsupported).
| Method | Returns | Notes |
|---|---|---|
| sum() | f64 | |
| mean() | f64 | |
| min() | f64 | NaN if any element is NaN |
| max() | f64 | NaN if any element is NaN |
| try_sum() / try_mean() / try_min() / try_max() | Result<f64, MattenError> | Unsupported on dynamic; NaN propagates as a value (RFC-055) |
| sum_axis(axis) | Tensor | |
| mean_axis(axis) | Tensor | |
| min_axis(axis) | Tensor | NaN propagated per slice |
| max_axis(axis) | Tensor | NaN propagated per slice |
| try_sum_axis(axis) / try_mean_axis(axis) / try_min_axis(axis) / try_max_axis(axis) | Result<Tensor, MattenError> | Shape if axis >= rank; Unsupported on dynamic (RFC-056) |
| argmin() / argmax() | usize | flat row-major index; first tie; panics on NaN/dynamic |
| try_argmin() / try_argmax() | Result<usize> | InvalidArgument on NaN; Unsupported on dynamic |
| dot(rhs) | Tensor | 4 shape cases; panics on dynamic |
| matmul(rhs) | Tensor | alias for dot; panics on dynamic |
Tensor — linalg core-lite (numeric Tensor, RFC-041)
Small linalg-adjacent helpers — not a linear algebra backend. inverse,
determinant, solve, eigen-decomposition, SVD, QR, LU, Cholesky, sparse, and
BLAS/LAPACK are out of scope for core (use nalgebra or ndarray-linalg).
| Method | Returns | Notes |
|---|---|---|
| norm() | f64 | L2 / Frobenius over all elements; NaN propagates; panics on dynamic |
| try_norm() | Result<f64, MattenError> | Unsupported on dynamic; NaN propagates as a value (RFC-055) |
| trace() | f64 | rank-2 only; rectangular via min(rows, cols); panics on non-rank-2 or dynamic |
| try_trace() | Result<f64, MattenError> | Shape if not rank-2; Unsupported on dynamic |
| outer(other) | Tensor | rank-1 × rank-1 → [m, n]; panics on non-rank-1, dynamic, or oversized |
| try_outer(other) | Result<Tensor, MattenError> | Shape if not rank-1; Unsupported on dynamic; Allocation if oversized |
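To make the definitions in the table concrete, here is a small sketch of the Frobenius norm and the rectangular trace over a flat row-major buffer. The helper names are hypothetical and the code only illustrates the arithmetic described in the Notes column; it is not matten's implementation:

```rust
/// L2 / Frobenius norm over all elements: sqrt(sum(x_i^2)).
/// Hypothetical helper; NaN in the input propagates to the result.
fn frobenius_norm(flat: &[f64]) -> f64 {
    flat.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Trace of a rows x cols row-major matrix. For rectangular input the sum
/// runs over the first min(rows, cols) diagonal entries.
fn trace(flat: &[f64], rows: usize, cols: usize) -> f64 {
    assert_eq!(flat.len(), rows * cols, "buffer must match rows * cols");
    (0..rows.min(cols)).map(|i| flat[i * cols + i]).sum()
}

fn main() {
    // 3-4-5 triangle: norm of [3, 4] is 5.
    assert_eq!(frobenius_norm(&[3.0, 4.0]), 5.0);

    // 2 x 3 matrix [[1, 2, 3], [4, 5, 6]]: diagonal entries are 1 and 5.
    let m = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    assert_eq!(trace(&m, 2, 3), 6.0);

    // NaN propagates through the norm.
    assert!(frobenius_norm(&[1.0, f64::NAN]).is_nan());
}
```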
Tensor — statistics (numeric Tensor, RFC-040)
Population variance only (ddof = 0): var = sum((x_i - mean)^2) / n,
std = sqrt(var), two-pass, NaN-propagating. Sample variance, quantile,
percentile, histogram, covariance, correlation, and z-score are out of core scope.
| Method | Returns | Notes |
|---|---|---|
| var() / std() | f64 | population (ddof = 0); NaN propagates; singleton → 0.0; panics on dynamic |
| try_var() / try_std() | Result<f64, MattenError> | Unsupported on dynamic; InvalidArgument on empty (not constructible) |
| var_axis(axis) / std_axis(axis) | Tensor | reduces and drops the axis; panics if axis >= rank or dynamic |
| try_var_axis(axis) / try_std_axis(axis) | Result<Tensor, MattenError> | Shape if axis >= rank; Unsupported on dynamic |
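The population formula stated above can be sketched as a two-pass computation over a plain slice. The function names are hypothetical, and the code illustrates the formula var = sum((x_i - mean)^2) / n rather than matten's internals:

```rust
/// Two-pass population variance (ddof = 0). Returns None for empty input.
/// Hypothetical helper for illustration.
fn population_var(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }
    let n = xs.len() as f64;
    // Pass 1: the mean.
    let mean = xs.iter().sum::<f64>() / n;
    // Pass 2: the mean of squared deviations from that mean.
    let sum_sq = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>();
    Some(sum_sq / n)
}

/// Population standard deviation: sqrt(var).
fn population_std(xs: &[f64]) -> Option<f64> {
    population_var(xs).map(f64::sqrt)
}

fn main() {
    // mean = 2.5; squared deviations sum to 5.0; 5.0 / 4 = 1.25.
    assert_eq!(population_var(&[1.0, 2.0, 3.0, 4.0]), Some(1.25));
    // A singleton has zero population variance.
    assert_eq!(population_var(&[7.0]), Some(0.0));
    // NaN propagates through both passes.
    assert!(population_std(&[1.0, f64::NAN]).unwrap().is_nan());
}
```

Dividing by n rather than n - 1 is what distinguishes this from sample variance, which the section lists as out of core scope.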
Tensor — boundary / serde
| Method | Returns | Notes |
|---|---|---|
| from_json(input) | Result<Tensor, MattenError> | |
| load_json(path) | Result<Tensor, MattenError> | |
| from_csv(input) | Result<Tensor, MattenError> | numeric only |
| load_csv(path) | Result<Tensor, MattenError> | |
| Serialize (serde) | via feature serde | panics on dynamic |
| Deserialize (serde) | via feature serde | |
Tensor — dynamic (#[cfg(feature = "dynamic")])
| Method | Returns | Notes |
|---|---|---|
| from_elements(data, shape) | Tensor | |
| try_from_elements(data, shape) | Result<Tensor, MattenError> | |
| get_element(coord) | Option<Element> | |
| is_dynamic() | bool | |
| from_json_dynamic(input) | Result<Tensor, MattenError> | needs json |
| from_csv_dynamic(input) | Result<Tensor, MattenError> | needs csv |
| to_elements() | Vec<Element> | |
| fill_none(value: impl Into<Element>) | Tensor | |
| none_mask() | Tensor | 1.0/0.0 mask |
| is_none_mask() | Tensor | alias for none_mask |
| count_none() | usize | |
| forward_fill_none(fallback: impl Into<Element>) | Tensor | |
| sum_skip_none() | f64 | skips None; panics on non-numeric |
| try_numeric() | Result<Tensor, MattenError> | strict default |
| try_numeric_with(policy) | Result<Tensor, MattenError> | RFC-017; explicit policy |
| numeric_mask() | Tensor | RFC-016; 1.0/0.0 like none_mask |
| is_numeric_convertible() | bool | RFC-016; true if all Float/Int |
| schema_summary() | String | RFC-016; element-type counts |
MattenLimits (RFC-018)
#![allow(unused)]
fn main() {
pub struct MattenLimits {
pub max_dimensions: usize, // default: 8
pub max_elements: usize, // default: 1 048 576 (~1 M / ~8 MiB)
pub max_parse_bytes: usize, // default: 128 MiB
}
}
Methods: MattenLimits::default(), MattenLimits::strict().
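As an illustration of what a budget check against such limits can look like, the sketch below validates a shape before any allocation. The Limits struct and check_shape function are hypothetical stand-ins that reuse only the documented default values; matten's actual checks may differ:

```rust
/// Hypothetical stand-in mirroring two of the documented limit fields.
struct Limits {
    max_dimensions: usize,
    max_elements: usize,
}

/// Returns the element count if `shape` fits the budget, else a message.
/// Uses checked multiplication so a huge shape cannot overflow usize.
fn check_shape(shape: &[usize], limits: &Limits) -> Result<usize, String> {
    if shape.len() > limits.max_dimensions {
        return Err(format!(
            "rank {} exceeds max_dimensions {}",
            shape.len(),
            limits.max_dimensions
        ));
    }
    let mut total = 1usize;
    for &dim in shape {
        total = total
            .checked_mul(dim)
            .ok_or_else(|| "element count overflows usize".to_string())?;
    }
    if total > limits.max_elements {
        return Err(format!(
            "{} elements exceeds max_elements {}",
            total, limits.max_elements
        ));
    }
    Ok(total)
}

fn main() {
    // Documented defaults: 8 dimensions, 1 048 576 elements.
    let limits = Limits { max_dimensions: 8, max_elements: 1 << 20 };
    assert_eq!(check_shape(&[1024, 1024], &limits), Ok(1_048_576));
    assert!(check_shape(&[1024, 1025], &limits).is_err());
    assert!(check_shape(&[1; 9], &limits).is_err());
}
```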
NumericPolicy (RFC-017, #[cfg(feature = "dynamic")])
Controls how Element values coerce to f64 in try_numeric_with.
Builder methods: .strict(), .permissive(), .allow_bool(),
.allow_text_parse(), .none_as(value), .none_as_nan().
Conversion traits
| Trait | Notes |
|---|---|
| From<Vec<f64>> for Tensor | shape [n] |
| From<Vec<Vec<f64>>> for Tensor | panics if ragged |
| From<Tensor> for Vec<f64> | consuming; panics on dynamic |
| From<&Tensor> for Vec<f64> | clone; panics on dynamic |
| TryFrom<Tensor> for Vec<Vec<f64>> | requires rank-2; errors on dynamic |
MattenError variants
#![allow(unused)]
fn main() {
#[non_exhaustive]
pub enum MattenError {
Shape { operation: &'static str, message: String },
Broadcast { left: Vec<usize>, right: Vec<usize> },
Allocation { requested_elements: usize, message: String },
Slice { input: Option<String>, message: String },
Parse { format: DataFormat, message: String },
Io { path: PathBuf, source: std::io::Error },
Unsupported { operation: &'static str, message: String },
InvalidArgument { operation: &'static str, argument: &'static str, message: String },
}
}
DataFormat variants
#![allow(unused)]
fn main() {
pub enum DataFormat { Json, Csv }
}
Element variants (#[cfg(feature = "dynamic")])
#![allow(unused)]
fn main() {
pub enum Element {
Float(f64),
Int(i64),
Text(Arc<str>),
Bool(bool),
None,
}
}
Methods: try_as_f64() -> Option<f64>, is_numeric() -> bool,
is_none() -> bool, as_text() -> Option<&str>, as_bool() -> Option<bool>,
and the text(s) constructor.
Production migration guide
matten is the family car: small, approachable, Tensor-centered, and good for
proof-of-concept work, learning, and small serious workflows. It stays deliberately
dependency-light and does not try to become a dataframe engine, an ML framework, or a
high-performance linear-algebra backend.
This guide is about the other half of that promise: helping you know when and how to
leave matten when a workflow outgrows it. Moving a hot path to a production-oriented
ecosystem is not a failure — outgrowing matten is a successful PoC outcome. It means
the idea earned the move.
What this guide is — and is not
This guide helps you migrate intentionally. It is:
- a way to decide when to stay with matten and when to migrate;
- a target-selection matrix from your workload to the right ecosystem;
- a set of playbooks for specific targets (ndarray, nalgebra, Polars/Pandas, Candle, and NumPy);
- guidance on the bridge crates that own dependency-specific conversion.
It is explicitly not:
- a claim that matten is faster, or a promise that you can swap matten out unchanged;
- a claim that any target is universally “better”; it depends entirely on the workload;
- a tool that rewrites your code for you. matten helps you understand and plan a migration. (An assisted tool, matten-migrate, is a deferred future possibility, not part of this guide.)
The layered idea
core matten → owns Tensor; stays small; no heavy target-library dependencies
bridge crates → own dependency-specific conversion (e.g. matten-ndarray)
docs (here) → when to stay, when to migrate, and how
Core matten gains no new heavy dependency from any of this. The conversion to a
specific ecosystem lives in a dedicated bridge crate (such as matten-ndarray) or in your
own code — never inside core matten.
Where to go next
- When to migrate: signals that you have outgrown matten, and the equally important signals that you should stay.
- Choosing a target: a matrix from workload shape to ecosystem.
- Common pitfalls: mistakes to avoid when moving data out.
- Readiness checklist and report template: turn the signals into an explicit, advisory decision you can record and review.
- Target playbooks: step-by-step, per-target migration guides.
For quick, copy-paste data-export snippets, the reference page Migration to specialised libraries is the companion to this narrative guide.
When to migrate
The honest default is: stay with matten until a concrete signal tells you to move.
matten is built for PoC, learning, and small serious workflows, and most of those never
need to leave. Migration is a deliberate response to pressure, not a rite of passage.
Signals that you have outgrown matten
Treat any of these as a real reason to plan a migration of the affected part of your workflow:
- Data-size pressure. Your arrays are large enough that matten’s copy-on-every-reshape/slice behavior shows up in profiles, or you are pushing past comfortable in-memory sizes.
- Runtime pressure. A dense numeric kernel (matrix multiply, matrix–vector products, operator application) is a measured hot path. In the accepted RFC-049 Rust peer comparison, dense matmul and matrix–vector tasks showed a noticeably larger gap to ndarray/nalgebra than lighter vector tasks did, so those are the kernels most worth moving when they get hot.
- Linear-algebra pressure. You need decompositions (LU, QR, SVD), solvers, or eigenvalues. matten intentionally does not provide these.
- Dataframe pressure. You need group-by, joins, pivots, or query-style operations. matten-data is an ingestion on-ramp (CSV/table → Tensor) and will not grow into a dataframe engine.
- ML / device pressure. You need autodiff, training loops, or GPU execution.
- Dynamic-ingestion pressure. You are leaning heavily on the dynamic feature for large or repeated messy-data cleanup, beyond a one-time on-ramp.
Signals that you should stay
Equally important — these are reasons not to migrate:
- The numeric work is small and not on a hot path.
- You are wiring data into a web API (serde in, serde out) with light math in between.
- You are learning, prototyping, or teaching, and approachability matters more than raw speed.
- Your messy data needs a one-time clean-then-compute pass, which matten’s dynamic on-ramp handles.
If none of the pressure signals above apply, staying with matten is the right call, and
adding a heavyweight dependency would cost you simplicity for no real gain.
Migrate the hot path, not the whole program
Migration is rarely all-or-nothing. The common, healthy pattern is to keep matten for
construction, ingestion, and glue, and move only the measured hot kernel into a specialised
crate:
matten → build / ingest / shape the data, light math
specialised crate → the heavy kernel (matmul, decomposition, training, group-by)
The target-selection matrix helps you map each pressure signal to a destination, and the playbooks show the per-target mechanics.
Choosing a target
There is no universally “best” target — the right destination depends on the shape of the pressure you are feeling. Use the matrix below to map a signal to an ecosystem, then open the matching playbook.
Target-selection matrix
| Pressure / need | Recommended target | Notes |
|---|---|---|
| General N-D numeric arrays, dense matmul, axis reductions at scale | ndarray | The general Rust N-D array production path; BLAS-backed matmul available. |
| Small/mid dense vectors & matrices, decompositions, solvers (LU/QR/SVD), eigenvalues | nalgebra | The dense linear-algebra path. |
| Group-by, joins, pivots, query-style dataframe analytics | Polars (Rust) / Pandas (Python) | matten-data is an ingestion on-ramp only; it will not grow these. |
| Autodiff, training loops, GPU/device execution | Candle (Rust) / framework of choice | matten is not an ML framework. |
| Existing Python scientific stack, NumPy interop | NumPy (Python) | Manual/conceptual hand-off; no automatic bridge. |
| Small numeric work, ingestion, glue, learning/PoC | stay with matten | Migrating here would add dependencies for no real gain. |
A quick decision path
- Is the bottleneck a dense numeric kernel (matmul, matrix–vector, operator application, axis reductions) that you have measured as hot? → ndarray (general N-D) or nalgebra (if it is fundamentally small/mid dense linear algebra needing decompositions).
- Do you need linear-algebra results matten does not provide (LU/QR/SVD, solvers, eigenvalues)? → nalgebra.
- Is the real need tabular (group-by/join/pivot/query)? → Polars (Rust) or Pandas (Python). Not matten-data.
- Is it ML (autodiff/training/GPU)? → Candle or another ML framework.
- Are you already in Python? → NumPy/Pandas, with matten as the upstream Rust producer if useful.
- None of the above, or the work is small? → stay with matten.
Playbooks
Full per-target playbooks are available for every destination above:
ndarray, nalgebra,
Polars/Pandas, Candle, and
NumPy. The two Rust array/linalg targets carry task-scoped
positioning notes from the accepted RFC-049 peer comparison; the dataframe, ML, and
Python targets are different paradigms with no such benchmark (see each playbook).
Common pitfalls
A few mistakes come up repeatedly when moving data out of matten. None are hard to avoid
once you know to look for them.
Memory order: matten is row-major
matten stores tensor data in row-major (C order) logical layout. Some targets differ:
nalgebra’s DMatrix is column-major. If you hand matten’s row-major flat Vec<f64> to a
constructor that assumes column-major input, the buffer is silently reinterpreted: a square
matrix comes out transposed, and a non-square one comes out scrambled. Always use a
constructor that interprets the source order explicitly (for example
nalgebra::DMatrix::from_row_slice, which reads row-major source), or transpose
deliberately. The per-target playbooks show the correct constructor for each.
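To see why the layout matters, here is a minimal sketch of the two indexing rules over the same flat buffer. It uses only the standard library; at_row_major and at_col_major are illustrative helpers, not matten or nalgebra functions.

```rust
/// Element (r, c) of a matrix with `cols` columns, stored row-major.
fn at_row_major(data: &[f64], cols: usize, r: usize, c: usize) -> f64 {
    data[r * cols + c]
}

/// Element (r, c) of a matrix with `rows` rows, stored column-major.
fn at_col_major(data: &[f64], rows: usize, r: usize, c: usize) -> f64 {
    data[c * rows + r]
}

/// Read the same 2x2 buffer, [[1, 2], [3, 4]] flattened row-major, both ways.
fn read_both_ways() -> (f64, f64) {
    let flat = [1.0, 2.0, 3.0, 4.0];
    let correct = at_row_major(&flat, 2, 0, 1); // element (0, 1) is 2.0
    let misread = at_col_major(&flat, 2, 0, 1); // same buffer, wrong rule: 3.0
    (correct, misread)
}
```

The misread value is the element at (1, 0) of the original matrix, which is exactly the silent transpose this pitfall describes.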
Conversions copy — plan for it
Both directions of a bridge conversion copy the underlying data. That is the right
default for safety, but it means converting inside a tight loop is wasteful. Convert
once, at the boundary between “build/ingest in matten” and “compute in the specialised
crate”, not on every iteration.
do: build in matten → convert once → run the hot loop in the target
avoid: convert ↔ on every iteration of the hot loop
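The do/avoid pair can be sketched with a stand-in conversion that counts its copies. The convert function below is hypothetical (it just clones a slice); it stands in for any copying bridge call and is not a matten API.

```rust
/// Stand-in for a copying bridge conversion. Hypothetical, not a matten function.
fn convert(src: &[f64], copies: &mut usize) -> Vec<f64> {
    *copies += 1;
    src.to_vec()
}

/// Do: convert once at the boundary, then run the hot loop on the result.
fn convert_once(src: &[f64], steps: usize) -> (f64, usize) {
    let mut copies = 0_usize;
    let target = convert(src, &mut copies);
    let mut acc = 0.0_f64;
    for _ in 0..steps {
        acc += target.iter().sum::<f64>();
    }
    (acc, copies)
}

/// Avoid: converting again on every iteration of the hot loop.
fn convert_every_step(src: &[f64], steps: usize) -> (f64, usize) {
    let mut copies = 0_usize;
    let mut acc = 0.0_f64;
    for _ in 0..steps {
        let target = convert(src, &mut copies);
        acc += target.iter().sum::<f64>();
    }
    (acc, copies)
}
```

Both functions compute the same result; over 100 steps the first allocates and copies once, the second 100 times.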
f64 vs other dtypes
matten tensors are f64. Targets that want f32 (common in ML, e.g. Candle) need an
explicit conversion, which is another copy and a precision change. Decide this at the
boundary and do it once.
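A small standard-library sketch of what the precision change means in practice. The helper names to_f32 and max_roundtrip_error are illustrative only; the cast itself is Rust's ordinary `as f32`, which rounds to the nearest representable value.

```rust
/// Narrow f64 data to f32 once, at the boundary. This is a copy and a rounding step.
fn to_f32(data: &[f64]) -> Vec<f32> {
    data.iter().map(|&v| v as f32).collect()
}

/// Largest absolute error introduced by an f64 -> f32 -> f64 round trip.
fn max_roundtrip_error(data: &[f64]) -> f64 {
    data.iter()
        .map(|&v| (v - (v as f32) as f64).abs())
        .fold(0.0_f64, f64::max)
}
```

Values such as 0.5 survive the round trip exactly, 0.1 does not, and integers above 2^24 start to lose their low bits: 16,777,217.0 becomes 16,777,216.0 in f32. Measuring the round-trip error on representative data is one way to decide whether f32 is acceptable for your workload.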
Dynamic tensors must be made numeric first
If your data came through the dynamic feature (the messy-data on-ramp), it may hold
non-numeric or missing elements. Bridges reject dynamic tensors rather than guess. Resolve
to a numeric tensor first (fill or drop missing values, then try_numeric()), and only then
convert. See the dynamic reference.
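The "resolve first, then convert" step can be pictured with plain Option<f64> values. This is a conceptual sketch only: it does not use matten's dynamic API, and fill_missing, drop_missing, and require_numeric are hypothetical names.

```rust
/// Replace every missing value with `fill`.
fn fill_missing(values: &[Option<f64>], fill: f64) -> Vec<f64> {
    values.iter().map(|&v| v.unwrap_or(fill)).collect()
}

/// Drop missing values entirely (this changes the length).
fn drop_missing(values: &[Option<f64>]) -> Vec<f64> {
    values.iter().flatten().copied().collect()
}

/// Strict check: succeed only if nothing is missing,
/// otherwise report the index of the first gap instead of guessing.
fn require_numeric(values: &[Option<f64>]) -> Result<Vec<f64>, usize> {
    values
        .iter()
        .enumerate()
        .map(|(i, &v)| v.ok_or(i))
        .collect()
}
```

The strict variant mirrors the bridge behavior described above: unresolved data is an error to handle, not something to paper over during conversion.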
Don’t expect matten-data to grow dataframe features
matten-data is an ingestion on-ramp (CSV/table → Tensor). If you find yourself wanting
group-by, joins, pivots, or query expressions, that is a signal to move the tabular work to
Polars or Pandas — not a gap to be filled in matten-data. It will not grow those features.
Migrate the kernel, keep the glue
The goal is rarely to rewrite everything. Keep matten for construction, ingestion, and
glue; move only the measured hot kernel. A migration that replaces your whole program is
usually a sign of over-migrating.
Migration readiness checklist
This checklist turns the vague question “should I leave matten?” into a set of concrete
pressure signals. Work through it for the part of your workflow under question, mark each
signal, and follow the mapping to a playbook. It is an advisory
self-assessment — there is no tool that scans your code, and a high score does not by itself
mean you must migrate.
How to read it: each signal is a yes/no probe. The more signals you mark “yes” — especially
the first six — the stronger the case to move that hot part of the workflow. If almost
everything is “no”, staying with matten is the right answer.
Pressure signals → target
| # | Signal | You feel it when… | If yes, consider |
|---|---|---|---|
| 1 | Data-size pressure | arrays are large enough that matten’s copy-on-reshape/slice shows up, or memory is tight | ndarray |
| 2 | Runtime pressure | a dense kernel (matmul, matrix–vector, operator application) is a measured hot path | ndarray / nalgebra |
| 3 | Axis-reduction pressure | sums/means over axes at scale are a bottleneck | ndarray |
| 4 | Linear-algebra pressure | you need LU/QR/SVD, solvers, or eigenvalues (matten has none) | nalgebra |
| 5 | Dataframe pressure | you need group-by, joins, pivots, or query expressions | Polars / Pandas |
| 6 | ML / device pressure | you need autodiff, training loops, or GPU/device execution | Candle |
| 7 | Dynamic-ingestion pressure | you lean heavily on the dynamic feature beyond a one-time messy-data on-ramp | resolve to numeric, then 1–4 as they apply |
| 8 | Dependency policy | you cannot add heavy dependencies (binary size, audit, embedded) | stay with matten |
| 9 | Target ecosystem preference | the surrounding system is Rust, or specifically Python | Rust → 1–4; Python → NumPy |
| 10 | Team language preference | the team works in Python and wants the numeric code there | NumPy / Pandas |
Reading the result
- Signals 1–3 dominate → a dense-array hot path: ndarray.
- Signal 4 is present → you need capability matten lacks: nalgebra.
- Signal 5 dominates → the work is tabular, not array math: Polars/Pandas. Remember matten-data will not grow these.
- Signal 6 is present → you have crossed into ML: Candle or another framework.
- Signals 9–10 point to Python → NumPy/Pandas, with matten as an upstream producer.
- Signal 8 is “yes”, or almost everything is “no” → stay with matten. Adding a dependency would cost simplicity for no measured gain.
Migration is usually partial: move the signalled hot kernel, keep matten for construction,
ingestion, and glue. When you are ready to write the result down, use the
readiness report template; a filled-in example is in
examples/.
Migration readiness report
When you have worked through the readiness checklist and want to record a decision — for a code review, a design doc, or just your own notes — fill in the report template below. It is a manual template: you write it, drawing on what you know about your own workload. There is no generator and no source-scanner.
This report is advisory. It does not prove production readiness, does not guarantee a target library is better, and does not perform automatic conversion.
Keep that framing in mind: the report’s job is to make a migration decision explicit and reviewable, not to certify anything.
How to use the template
Copy the skeleton below into your own doc and fill each section. Sections you have nothing to say about can be marked “none” — that is itself useful information (e.g. “Manual redesign areas: none” means the move is a near-direct port).
Template
# matten Migration Readiness Report
## Summary
One or two sentences: what is being assessed, and the headline recommendation
(stay, or migrate which part to which target).
## Current matten usage
What the code does in matten today — the shapes, the operations (matmul, reductions,
slicing, dynamic ingestion), and which examples it resembles.
## Production pressure signals
Which checklist signals are present, and the evidence (a profile, a data size, a
required capability). Be concrete; "runtime pressure: the per-step matmul dominates
at N samples" beats "it feels slow".
## Recommended target(s)
The target(s) the signals point to, and why. It is fine to recommend "stay with
matten" if the signals are weak.
## Direct conversion candidates
The operations that map cleanly onto the target (e.g. matmul → ndarray `.dot()`),
including which bridge function carries the data across.
## Manual redesign areas
The parts that do not port mechanically and need rethinking (e.g. an iterative loop,
or switching an algorithm to a decomposition-based form). "none" is a valid answer.
## Bridge crates / tools
Which bridge crate applies (e.g. matten-ndarray) or that the conversion is manual
(e.g. nalgebra). Note copy/precision boundaries.
## Risks
What could go wrong: precision changes (f64 → f32), memory-order traps (row- vs
column-major), converting inside a hot loop, or scope creep into over-migration.
## Next steps
The concrete plan: profile to confirm the hot path, convert once at the boundary,
move the kernel, keep matten for setup/glue, and a checkpoint to reassess.
A filled-in example
See Linear regression (GD) readiness for the
template applied to the 35_linear_regression_gradient_descent example.
Worked example: linear regression (gradient descent) readiness
This applies the readiness report template to the
35_linear_regression_gradient_descent example. It is illustrative — the example itself runs
on toy data; the report imagines the same code scaled to real data and asks what would change.
This report is advisory. It does not prove production readiness, does not guarantee a target library is better, and does not perform automatic conversion.
matten Migration Readiness Report
Summary
Batch gradient descent for a linear model ŷ = X · θ. At the example’s toy size, stay with
matten. If the same code runs on a real design matrix (thousands of samples, many
features), move the per-step matrix products to ndarray via the matten-ndarray bridge,
keeping matten for setup. A closed-form solve in nalgebra is an optional redesign.
Current matten usage
- X is a [samples, 2] Tensor (leading bias column, so θ = [b, w]).
- Each step runs two Tensor::matmul calls: predictions X · θ ([n,2] × [2] → [n]) and the gradient Xᵀ · residual ([2,n] × [n] → [2]).
- Xᵀ is formed once with Tensor::transpose and reused.
- The residual and the θ update are plain Rust (zip/map).
- The loop runs many iterations (2000 in the example).
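For readers who do not have the example open, the loop structure described above looks roughly like the following standard-library sketch. It is not the example's code: it works on a flat row-major buffer instead of a Tensor, and the 1/n gradient scaling, the learning rate, and the toy data in the usage note are assumptions made for illustration.

```rust
/// Batch gradient descent for y ≈ X·θ, with X stored row-major as [n, 2]
/// (bias column first, so θ = [b, w]). Plain-Rust stand-in, not matten API.
fn gradient_descent(x: &[f64], y: &[f64], lr: f64, steps: usize) -> [f64; 2] {
    let n = y.len();
    let mut theta = [0.0_f64; 2];
    for _ in 0..steps {
        // grad accumulates Xᵀ · residual, where residual = X·θ - y.
        let mut grad = [0.0_f64; 2];
        for i in 0..n {
            let (x0, x1) = (x[2 * i], x[2 * i + 1]);
            let residual = x0 * theta[0] + x1 * theta[1] - y[i];
            grad[0] += x0 * residual;
            grad[1] += x1 * residual;
        }
        // Assumed update rule: θ ← θ - lr · grad / n.
        theta[0] -= lr * grad[0] / n as f64;
        theta[1] -= lr * grad[1] / n as f64;
    }
    theta
}
```

With X = [[1,0],[1,1],[1,2],[1,3]] flattened row-major, y = [1, 3, 5, 7], a learning rate of 0.05, and 2000 steps, θ approaches [1, 2]. The two inner accumulations are the products that become matmul calls in matten and .dot() calls after migration.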
Production pressure signals
- Runtime pressure (signal 2): present at scale. The two matmuls per step, over many iterations, are the hot path once X is large. This is the kernel worth moving.
- Data-size pressure (signal 1): present at scale. A large design matrix stresses matten’s copy-on-reshape/slice behavior.
- Linear-algebra pressure (signal 4): partial. The problem can be solved without iteration, via the normal equations, which needs a solver/decomposition matten lacks.
- Dependency policy (signal 8): low cost. matten-ndarray is already an available bridge, so the ndarray path adds little.
- Ecosystem/team (signals 9–10): Rust.
- Axis-reduction, dataframe, ML/device, and dynamic-ingestion signals are not present here.
Recommended target(s)
- ndarray (primary). Keep the gradient-descent structure as-is and run the two matrix products as ndarray .dot() (BLAS-backed for large X). This is a near-direct port.
- nalgebra (optional redesign). If you would rather not iterate, reformulate as a closed-form normal-equation solve using a decomposition. That is a change of algorithm, chosen for capability, not a mechanical port.
- Toy size: stay with matten. The signals only bite at real data sizes.
Direct conversion candidates
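- X, Xᵀ, and θ → ArrayD<f64> with matten_ndarray::to_arrayd, converted once before the loop.
- The two matmul calls → ndarray .dot(). In ndarray, .dot() is defined for 1-D and 2-D arrays, so narrow each ArrayD to a fixed rank first (for example with into_dimensionality).
- Final θ back to a Tensor with from_arrayd if downstream code expects one.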
Manual redesign areas
- The plain-Rust residual and θ update become ndarray elementwise operations: small, but not a literal copy-paste.
- The optional closed-form solve is a genuine redesign (assemble XᵀX and Xᵀy, solve), not a translation of the existing loop.
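To make "assemble XᵀX and Xᵀy, solve" concrete for the two-parameter case, here is a standard-library sketch that solves the 2×2 normal equations with Cramer's rule. This is an illustration of the redesign, not a recommendation: a real migration would use a decomposition from a linear-algebra crate, and the function name and singularity tolerance are assumptions.

```rust
/// Closed-form least squares for y ≈ X·θ with X row-major [n, 2].
/// Solves (XᵀX) θ = Xᵀy directly; returns None if XᵀX is (near) singular.
fn normal_equations(x: &[f64], y: &[f64]) -> Option<[f64; 2]> {
    // Assemble the symmetric 2x2 matrix XᵀX and the vector Xᵀy.
    let (mut a00, mut a01, mut a11) = (0.0_f64, 0.0_f64, 0.0_f64);
    let (mut b0, mut b1) = (0.0_f64, 0.0_f64);
    for (row, &yi) in x.chunks_exact(2).zip(y) {
        a00 += row[0] * row[0];
        a01 += row[0] * row[1];
        a11 += row[1] * row[1];
        b0 += row[0] * yi;
        b1 += row[1] * yi;
    }
    // Cramer's rule for the 2x2 system; the tolerance is an arbitrary choice.
    let det = a00 * a11 - a01 * a01;
    if det.abs() < 1e-12 {
        return None;
    }
    Some([(b0 * a11 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det])
}
```

On the toy data X = [[1,0],[1,1],[1,2],[1,3]], y = [1, 3, 5, 7], this returns θ = [1, 2] in a single pass, with no iteration and no learning rate to tune.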
Bridge crates / tools
- matten-ndarray (to_arrayd / from_arrayd): copies both directions, f64 on both sides, so no precision change. See the bridge contract.
- The nalgebra option has no bridge crate; conversion is manual via DMatrix::from_row_slice (mind the row- vs column-major boundary).
Risks
- Converting inside the loop. Convert X / Xᵀ / θ once, before iterating, not per step.
- Column-major trap (nalgebra option only). Build DMatrix with from_row_slice so the row-major data is not silently transposed.
- Over-migration. Keep matten for constructing X and y; only the kernel needs to move.
Next steps
- Profile at a realistic data size to confirm the matmuls are the bottleneck.
- Convert X, Xᵀ, θ once via matten-ndarray; run the loop with ndarray .dot().
- Keep matten for data construction and glue.
- Reassess later: if you want a non-iterative solve, move to nalgebra; if the model grows into trained ML with autodiff/GPU, that is a Candle question.
Bridge conversion contracts
A bridge crate converts a matten::Tensor to and from a specific external type (for
example matten-ndarray ↔ ndarray::ArrayD<f64>). Because a conversion can silently lose or
reshape data if its rules are vague, every bridge crate documents a conversion contract:
a fixed set of dimensions that say exactly what the conversion does. This page gives the
template and the filled-in contract for the reference bridge, matten-ndarray.
The contract template
Every bridge contract documents these dimensions:
| Dimension | What it states |
|---|---|
| Source type | The matten side of the conversion. |
| Target type | The external type. |
| Direction | One-way or bidirectional, and the function names. |
| Copy / view behavior | Whether data is copied or shared (zero-copy). |
| Shape / rank policy | How shape is preserved and any rank limits. |
| Memory-order policy | Row-major vs column-major, and how non-standard layouts are handled. |
| Dynamic-tensor policy | What happens to dynamic (non-numeric/missing-capable) tensors. |
| NaN policy | Whether NaN/inf are passed through or treated specially. |
| Missing-value policy | How missing/None values are handled (if reachable at all). |
| Integer / text / bool policy | How non-f64 element kinds are handled (if reachable). |
| Error behavior | Result vs panic, and the error type/variants. |
| Performance caveat | The cost the caller must plan around. |
| Examples | Runnable conversion snippets. |
Two rules are constant across all bridges (see bridge-crate policy):
conversions return Result and never panic on rejected input, and a bridge crate
does not re-export core Tensor.
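The first constant rule, returning Result rather than panicking on rejected input, can be sketched with a stand-in bridge written against the standard library only. Everything here is hypothetical: BridgeError, External, and to_external are invented for illustration and do not reproduce matten-ndarray's actual error enum or implementation.

```rust
/// Hypothetical error type for an illustrative bridge.
#[derive(Debug, PartialEq)]
enum BridgeError {
    ZeroSizedAxis(Vec<usize>),
    LengthMismatch { expected: usize, actual: usize },
}

/// Hypothetical external type: a row-major buffer plus its shape.
#[derive(Debug, PartialEq)]
struct External {
    shape: Vec<usize>,
    data: Vec<f64>,
}

/// to_<target>-style conversion: copies, preserves shape exactly,
/// and reports rejected input through Result instead of panicking.
fn to_external(data: &[f64], shape: &[usize]) -> Result<External, BridgeError> {
    if shape.iter().any(|&d| d == 0) {
        return Err(BridgeError::ZeroSizedAxis(shape.to_vec()));
    }
    let expected: usize = shape.iter().product();
    if expected != data.len() {
        return Err(BridgeError::LengthMismatch {
            expected,
            actual: data.len(),
        });
    }
    Ok(External {
        shape: shape.to_vec(),
        data: data.to_vec(),
    })
}
```

Each rejection case is a named variant the caller can match on, which is what lets a contract list its error behavior row by row.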
Reference contract: matten-ndarray
matten-ndarray converts between matten::Tensor and ndarray::ArrayD<f64>.
| Dimension | matten-ndarray |
|---|---|
| Source / target type | matten::Tensor ↔ ndarray::ArrayD<f64> |
| Direction | Bidirectional: to_arrayd(&Tensor), from_arrayd(ArrayD<f64>) |
| Copy / view | Copies both directions. No zero-copy is claimed. |
| Shape / rank | Shape is preserved exactly. Rank is bounded by core matten; an over-rank array is rejected via the core validation error. |
| Memory order | Row-major logical order both ways. from_arrayd reads logical order, so a transposed/sliced/non-standard-layout ArrayD converts correctly instead of being silently transposed. |
| Dynamic-tensor policy | Rejected. to_arrayd on a dynamic tensor returns DynamicTensor (a Result, not a panic). The guard is unconditional — it does not depend on the dynamic feature being enabled. |
| NaN policy | Passed through as ordinary f64 values; no special handling. |
| Missing-value policy | Not reachable: only numeric tensors convert, and dynamic tensors (which can carry missing values) are rejected first. |
| Integer / text / bool policy | Not reachable: matten’s numeric Tensor is f64 only; non-numeric element kinds live in the rejected dynamic model. |
| Error behavior | Returns Result<_, MattenNdarrayError>; never panics. Variants: DynamicTensor, ZeroSizedAxis(shape) (core has no zero-length axes), NdarrayShape(..) (ndarray shape mismatch), Matten(MattenError) (wraps a core validation error). |
| Performance caveat | Both directions allocate and copy. Convert once at the boundary, not inside a hot loop. |
| Examples | See below. |
Examples
#![allow(unused)]
use matten::Tensor;
use matten_ndarray::{from_arrayd, to_arrayd, MattenNdarrayError};
// main returns Result so that `?` can propagate conversion errors.
fn main() -> Result<(), MattenNdarrayError> {
    // Tensor -> ArrayD<f64> (copies; row-major)
    let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
    let arr = to_arrayd(&t)?;
    assert_eq!(arr[[1, 0]], 3.0);
    // ArrayD<f64> -> Tensor preserves *logical* order, even for a transposed array
    let back = from_arrayd(arr.t().to_owned())?; // logical shape [2, 2], transposed
    Ok(())
}
A dynamic tensor is rejected rather than guessed:
// to_arrayd(&dynamic_tensor) -> Err(MattenNdarrayError::DynamicTensor)
// Resolve to a numeric tensor first (e.g. try_numeric()), then convert.
Error-category note
The generic error categories sketched in RFC-051 (UnsupportedTensorKind, UnsupportedRank,
…) are illustrative for future bridges, not a required enum schema. matten-ndarray’s
existing variants (DynamicTensor / ZeroSizedAxis / NdarrayShape / Matten) document its
contract clearly and are compliant as-is; a bridge need not rename or expand its error enum
to match the sketch.
Bridge-crate policy
Bridge crates are how matten connects to specific external ecosystems without dragging
their dependencies into core. This page states the rules every bridge crate follows, and the
checklist a future bridge crate must satisfy.
Why bridges are separate crates
Core matten owns the Tensor type and stays small and dependency-light. It must not
gain a dependency on ndarray, nalgebra, Polars, Candle, or any other target library.
Each target-specific conversion therefore lives in its own crate (for example
matten-ndarray), which is the only place that target’s dependency appears.
core matten → owns Tensor; no target-library dependency
matten-ndarray → owns the ndarray dependency; converts Tensor ↔ ArrayD<f64>
(future bridges) → own their target's dependency; same pattern
This boundary is CI-enforced: scripts/check-published-dependency-isolation.sh proves
that the published core and companion crates do not pull in target/benchmark dependencies,
with matten-ndarray → ndarray as the one allowed, documented exception.
Rules every bridge crate follows
- Own the target dependency. The bridge crate is the only published crate that depends on its target library.
- Do not add a dependency to core matten. A bridge never causes core to gain a target-library dependency.
- Do not re-export core Tensor. A bridge takes and returns matten::Tensor, but users import Tensor from matten, not from the bridge. (For example, matten-ndarray exports only to_arrayd, from_arrayd, and MattenNdarrayError.)
- Return Result, never panic on rejected input. Document the rejection cases.
- Publish a conversion contract. Fill in every dimension of the contract template.
- Name conversions to_<target> / from_<target>. This follows the to_arrayd / from_arrayd precedent (e.g. to_dmatrix / from_dmatrix, to_dvector / from_dvector). Deviate only if the target ecosystem has a stronger idiom, and justify it in that bridge’s RFC.
Current bridges
- matten-ndarray: the reference bridge (Tensor ↔ ndarray::ArrayD<f64>). Its contract is documented in bridge contracts.
There is no matten-nalgebra bridge today; the nalgebra playbook
documents the manual conversion path, and a dedicated bridge is only a possible future
direction, not a commitment.
Future bridge-crate checklist
Before a new bridge crate is created (which requires separate approval — see below):
- The target library has a clear, recurring conversion need that does not fit an existing bridge.
- The crate owns the target dependency; core matten gains nothing.
- Conversions are to_<target> / from_<target> and return Result.
- The crate does not re-export Tensor.
- A full conversion contract is filled in (copy/shape/memory order/dynamic/NaN/missing/dtype/error/performance).
- scripts/check-published-dependency-isolation.sh is extended so the new crate’s allowed/forbidden dependencies are enforced.
- The dynamic-tensor policy is explicit (reject, or document the numeric-first step).
No new bridge crate without approval
This policy page does not authorize creating new bridge crates. A new bridge (such as a
hypothetical matten-nalgebra, matten-polars, or matten-candle) requires its own RFC and
explicit approval. Documenting the pattern here does not pre-approve any specific crate.
Target playbooks
Each playbook is a step-by-step guide for moving a matten workflow to one specific
ecosystem. They share a common structure: when to choose (and not choose) the target, how
matten concepts map onto it, worked example migrations drawn from the
examples, the conversion path, pitfalls, task-scoped positioning
notes, and a minimal checklist.
Available now
- ndarray: general Rust N-D arrays; the first stop for dense numeric workloads at scale, with a contract-backed bridge crate (matten-ndarray).
- nalgebra: dense linear algebra (vectors, matrices, decompositions, and solvers).
- Polars / Pandas: dataframe analytics (group-by, joins, pivots, query). matten-data is an on-ramp and will not grow these.
- Candle: ML tensors, training, and device execution, without implying matten is an ML framework.
- NumPy: the Python scientific path, as a manual/conceptual hand-off.
Decision tree
measured dense numeric hot path?
├─ general N-D arrays / BLAS matmul / axis reductions → ndarray
└─ small/mid dense linear algebra, decompositions → nalgebra
need LU / QR / SVD / solvers / eigenvalues? → nalgebra
need group-by / join / pivot / query? → Polars (Rust) / Pandas (Python)
need autodiff / training / GPU? → Candle / ML framework
already in Python / NumPy ecosystem? → NumPy (matten as upstream producer)
small work, ingestion, glue, learning? → stay with matten
If you are unsure whether you have outgrown matten at all, start with
When to migrate.
Migrating to ndarray
ndarray is the general Rust N-D array crate: strided arrays,
views, broadcasting, and (with a BLAS backend) fast matrix multiplication. It is the natural
first production target when a dense numeric workload outgrows matten. A dedicated bridge
crate, matten-ndarray, provides a contract-backed conversion in both directions.
Choose this target when
- You have general N-D numeric arrays and need production-grade array operations.
- Dense matmul or axis reductions are a measured hot path at scale.
- You want strided views, advanced indexing, or a BLAS backend.
Do not choose this target when
- The work is small or not on a hot path; staying with matten is simpler.
- You fundamentally need linear-algebra results (LU/QR/SVD, solvers, eigenvalues) → prefer nalgebra.
- The real need is tabular (group-by/join/pivot) → Polars/Pandas, not an array crate.
Concept mapping
| matten | ndarray |
|---|---|
| Tensor (row-major f64) | ArrayD<f64> / Array1 / Array2 |
| Tensor::new(data, &[r, c]) | Array2::from_shape_vec((r, c), data) |
| .matmul(&b) | a.dot(&b) |
| .sum_axis(i) / .mean_axis(i) | a.sum_axis(Axis(i)) / a.mean_axis(Axis(i)) |
| elementwise &a + &b | &a + &b |
| .reshape(&[..]) | ndarray’s current reshape APIs (e.g. to_shape / into_shape_with_order, per ownership/layout) |
| .shape() | .shape() / .dim() |
Example migrations
These map directly from the shipped examples:
- 22_matrix_multiplication → ndarray when the matrices are large or the multiply is hot.
- 27_axis_reductions → ndarray; axis reductions are exactly where matten’s internal baseline flagged the widest internal cost, so this is a strong candidate to move.
- 35_linear_regression_gradient_descent → ndarray (or nalgebra) once the GD loop runs on real-sized design matrices.
- 50_rowwise_scoring → ndarray if rows get large, otherwise stay with matten.
Conversion path
The clean path is the matten-ndarray bridge, which copies, is numeric-only, rejects
dynamic tensors, and preserves logical row-major order:
#![allow(unused)]
use matten::Tensor;
use matten_ndarray::{from_arrayd, to_arrayd, MattenNdarrayError};
// main returns Result so that `?` can propagate conversion errors.
fn main() -> Result<(), MattenNdarrayError> {
    let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], &[2, 3]);
    // matten Tensor -> ndarray ArrayD<f64> (copies)
    let arr = to_arrayd(&t)?;
    // ... heavy ndarray work here (dot, BLAS matmul, axis reductions, views) ...
    // ndarray ArrayD<f64> -> matten Tensor (copies)
    let back = from_arrayd(arr)?;
    Ok(())
}
If you are not using the bridge crate, the manual path via flat data also works (see the
reference page): t.into_vec() / t.shape() into
ArrayD::from_shape_vec(shape, flat).
Common pitfalls
- Convert once. Both directions copy — do it at the boundary, not inside the hot loop.
- Dynamic tensors are rejected. Make the tensor numeric (try_numeric()) before converting; the bridge returns a DynamicTensor error rather than guessing.
- Reshape APIs moved. Prefer ndarray’s current reshape APIs over the deprecated into_shape; check the ndarray version matten-ndarray pins before copying snippets.
Performance / positioning notes
In the accepted RFC-049 Rust peer comparison (task-scoped, small fixed sizes, single
machine — not a ranking), dense matmul and matrix–vector tasks showed the widest gap to
ndarray (roughly an order of magnitude at those sizes), while a lighter vector task was
competitive. The practical reading: if dense matmul, matrix–vector, or axis-reduction
kernels are your measured hot paths, moving those to ndarray is where the benefit is
concentrated. This is positioning, not a claim that either library is “better” in general.
Minimal checklist
- The hot path is a dense array kernel you have actually measured.
- You convert once at the boundary, not per iteration.
- The tensor is numeric (no dynamic elements) before conversion.
- You kept matten for construction/ingestion/glue where it was already fine.
Migrating to nalgebra
nalgebra is the dense linear-algebra crate: statically- and
dynamically-sized vectors and matrices, plus decompositions (LU, QR, SVD), solvers, and
eigenvalues. It is the right target when your workload is fundamentally small/mid dense
linear algebra, especially when you need results matten intentionally does not provide.
There is no matten-nalgebra bridge crate today — conversion is manual (a few lines) and
a dedicated bridge is only a documented future direction, not a commitment.
Choose this target when
- You need decompositions or solvers: LU, QR, SVD, eigenvalues, linear systems.
- Your data is naturally small/mid dense vectors and matrices.
- You want a typed linear-algebra API rather than general N-D arrays.
Do not choose this target when
- You need general N-D arrays or BLAS-backed bulk array ops → prefer ndarray.
- The work is small or not hot → stay with matten.
- The real need is tabular or ML → Polars/Pandas or Candle.
Concept mapping
| matten | nalgebra |
|---|---|
| Tensor of shape [n] | DVector<f64> |
| Tensor of shape [r, c] (row-major) | DMatrix<f64> (column-major; see pitfalls) |
| .matmul(&b) | &a * &b |
| matrix–vector | &m * &v |
| .dot(&b) (vectors) | a.dot(&b) |
| .transpose() | .transpose() |
| decompositions (not in matten) | .lu(), .qr(), .svd(..), .symmetric_eigen(), … |
Example migrations
- 20_dot_product / 21_matrix_vector_product → nalgebra DVector / DMatrix operations.
- 22_matrix_multiplication → nalgebra &a * &b (or ndarray for general N-D).
- 31_fibonacci_matrix_power → nalgebra matrix powers.
- 35_linear_regression_gradient_descent → nalgebra when you want a typed matrix/vector formulation (or want to switch to a closed-form solve via a decomposition).
Conversion path
Manual, via flat row-major data. DMatrix is column-major, so build it from a row-major
slice with from_row_slice, which reads the source in row-major order:
#![allow(unused)]
fn main() {
use matten::Tensor;
use nalgebra::{DMatrix, DVector};
// vector
let v = Tensor::from_vec(vec![1.0, 2.0, 3.0]);
let dv = DVector::from_vec(v.into_vec());
// matrix (row-major source -> from_row_slice keeps the logical layout)
let m = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
let shape = m.shape().to_vec();
let dm = DMatrix::from_row_slice(shape[0], shape[1], &m.into_vec());
// ... decompositions / solvers / matrix algebra here ...
}
To return to matten, read the matrix back in row-major order (transpose as needed) and
rebuild with Tensor::new(data, &shape).
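Reading a column-major matrix "back in row-major order" is a pure index reordering. A minimal sketch, assuming you already hold the matrix's column-major buffer as a flat slice (how you obtain it depends on the nalgebra API and version you use); col_major_to_row_major is an illustrative helper, not a nalgebra or matten function.

```rust
/// Reorder a column-major `rows x cols` buffer into row-major order.
fn col_major_to_row_major(data: &[f64], rows: usize, cols: usize) -> Vec<f64> {
    assert_eq!(data.len(), rows * cols, "buffer length must equal rows * cols");
    let mut out = Vec::with_capacity(data.len());
    for r in 0..rows {
        for c in 0..cols {
            // Column-major stores element (r, c) at index c * rows + r.
            out.push(data[c * rows + r]);
        }
    }
    out
}
```

For the 2×3 matrix [[1, 2, 3], [4, 5, 6]], the column-major buffer [1, 4, 2, 5, 3, 6] becomes [1, 2, 3, 4, 5, 6], which is the order Tensor::new(data, &shape) expects for shape [2, 3].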
Common pitfalls
- Column-major trap. Do not feed a row-major flat Vec to a column-major constructor as if it were column-major; use from_row_slice or transpose deliberately, or you will silently transpose your data.
- Convert once at the boundary; conversions copy.
- Make dynamic tensors numeric first (try_numeric()).
Performance / positioning notes
In the accepted RFC-049 Rust peer comparison (task-scoped, small fixed sizes, single
machine — not a ranking), nalgebra had lower overhead than matten on dense matmul and
matrix–vector kernels, while a lighter vector task was competitive. The value of nalgebra,
though, is usually capability rather than raw speed: decompositions and solvers that
matten does not implement at all. If you need those, the migration is about what you can
compute, not just how fast.
Minimal checklist
- You need dense linear algebra or a decomposition/solver matten does not provide.
- You build DMatrix with from_row_slice (or transpose deliberately).
- You convert once at the boundary; the tensor is numeric first.
- You kept matten for the parts where it was already a good fit.
Migrating to Polars / Pandas (dataframes)
Polars (Rust, with Python bindings) and
Pandas (Python) are dataframe libraries: labeled columns,
heterogeneous types, group-by, joins, pivots, and query expressions. They solve a different
problem than matten, which is a numeric Tensor. Reach for them when your real need is
tabular analytics, not array math.
The boundary you are crossing
matten-data is an ingestion on-ramp: it reads CSV/table data and hands you a numeric
Tensor. It is intentionally minimal and will not grow group-by, joins, pivots, or
query expressions. If you find yourself wanting any of those, that is the signal to move the
tabular layer to a dataframe library — it is not a missing matten-data feature.
matten-data → CSV/table → numeric Tensor (on-ramp only)
Polars/Pandas → group-by / join / pivot / query / labeled columns
Choose this target when
- You need group-by, joins, pivots, windowing, or query-style selection.
- Your data is tabular and heterogeneous (mixed column types, labels, nulls as a first-class concept).
- You want to explore/clean tabular data interactively before any numeric step.
Do not choose this target when
- Your data is already a clean numeric array and you only need array math → ndarray or stay with matten.
- You need decompositions/solvers → nalgebra.
- The tabular step is a one-time CSV-to-numeric on-ramp → matten-data already covers it.
Concept mapping
| matten / matten-data | Polars / Pandas |
|---|---|
| Table::from_csv_str(..) (on-ramp) | pl.read_csv(..) / pd.read_csv(..) |
| numeric Tensor (homogeneous f64) | a DataFrame of typed, labeled columns |
| select columns then to_tensor() | df.select([..]), then to ndarray/NumPy if needed |
| (not available) group-by / join / pivot | df.group_by(..), df.join(..), pivots |
Example migrations
- data_00_quickstart → if the next step is group-by/join/pivot rather than array math, read the CSV straight into Polars/Pandas instead of matten-data.
- A CSV → clean → single numeric pass with no tabular analytics → stay with matten-data; it is the right size for that.
Conversion path
The usual pattern is not to convert a matten tensor into a dataframe, but to enter
the dataframe library at the data source:
have a CSV and need tabular analytics? → read it directly into Polars/Pandas
already have a numeric matten Tensor? → export its columns if you must, but prefer
doing tabular work upstream of matten
If you genuinely need to move a numeric Tensor into a dataframe, export its data
(tensor.to_vec() / tensor.shape()) and build columns in the dataframe library; the exact
constructor depends on the library and version, so follow its current docs.
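Because a dataframe is built from columns while the tensor's flat data is row-major, the export step amounts to slicing the buffer column by column. A standard-library sketch of that step follows; the columns helper is hypothetical, and turning the resulting vectors into dataframe columns is left to the dataframe library's own constructors.

```rust
/// Split a row-major `[rows, cols]` buffer into `cols` column vectors.
fn columns(data: &[f64], shape: &[usize]) -> Vec<Vec<f64>> {
    assert_eq!(shape.len(), 2, "expected a 2-D shape");
    let (rows, cols) = (shape[0], shape[1]);
    assert_eq!(data.len(), rows * cols, "buffer length must equal rows * cols");
    (0..cols)
        // Row-major stores element (r, c) at index r * cols + c.
        .map(|c| (0..rows).map(|r| data[r * cols + c]).collect())
        .collect()
}
```

For a [2, 3] buffer [1, 2, 3, 4, 5, 6] this yields three columns: [1, 4], [2, 5], and [3, 6]. Column names and types are not carried by a numeric tensor, so you supply them on the dataframe side.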
Common pitfalls
- Don’t wait for matten-data to grow tabular features. It will not. Recognize the boundary early.
- Round-tripping is usually wrong. If tabular work is the point, do it in the dataframe library from the start rather than converting back and forth with matten.
Performance / positioning notes
There is no matten-vs-dataframe benchmark: a numeric tensor and a dataframe are
different paradigms, and a cross-library/cross-language ecosystem comparison would be
RFC-049 Phase 3, which is not authorized. Choose by capability and ecosystem fit
(do you need tabular operations?), not by a measured speed comparison.
Minimal checklist
- Your real need is tabular (group-by/join/pivot/query), not array math.
- You enter the dataframe library at the data source rather than round-tripping.
- You are not waiting for matten-data to grow dataframe features.
Migrating to Candle (ML tensors)
Candle is a Rust ML tensor framework: autograd, neural-network layers, model loading, and CPU/GPU execution. Move here when your workflow becomes machine learning — training loops, autodiff, or device acceleration.
matten is not an ML framework and does not aim to become one. It has no autograd, no
layers, no optimizers, and no device backend. When you need those, that capability lives in
Candle (or another ML framework), not in a future matten feature.
Choose this target when
- You need automatic differentiation / backprop.
- You are building or running a model (layers, training loop, inference).
- You need GPU/device execution.
Do not choose this target when
- You are doing plain numeric array math with no learning → ndarray or stay with matten.
- You need classical linear-algebra results (decompositions/solvers) → nalgebra.
- The “ML” is actually a small hand-written numeric step (e.g. a single gradient-descent update) that matten already expresses clearly; it may not be worth a framework yet.
Concept mapping
| matten | Candle |
|---|---|
| Tensor (f64, CPU, no grad) | candle_core::Tensor (often f32, CPU/GPU, autograd) |
| manual update step (e.g. 35_linear_regression_gradient_descent) | optimizer + loss.backward() |
| .matmul(&b) | a.matmul(&b)? on a device |
| (not available) autodiff / layers / optimizers | candle_nn modules, Var, optimizers |
Example migrations
- 35_linear_regression_gradient_descent → Candle once you want autodiff and an optimizer instead of a hand-written gradient step.
- 37_kmeans_small / 38_nearest_neighbor_classification → Candle (or a dedicated ML crate) if these grow into trained models on real data; for small teaching versions, matten is fine.
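For a sense of scale, a hand-written gradient step of the kind mentioned above can be very small. The sketch below is an assumption-laden illustration in plain Rust over slices: `gd_step` is a hypothetical helper, not matten's API and not the code of the 35_linear_regression_gradient_descent example. It fits y ≈ w·x + b under mean squared error.

```rust
/// One gradient-descent step for y ≈ w * x + b under mean squared error.
/// Returns the updated (w, b). Plain slices stand in for tensor data.
fn gd_step(x: &[f64], y: &[f64], w: f64, b: f64, lr: f64) -> (f64, f64) {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");
    assert!(!x.is_empty(), "need at least one sample");
    let n = x.len() as f64;

    // Accumulate the gradient of mean((w*x + b - y)^2) with respect to w and b.
    let (mut grad_w, mut grad_b) = (0.0, 0.0);
    for (&xi, &yi) in x.iter().zip(y) {
        let err = w * xi + b - yi;
        grad_w += 2.0 * err * xi / n;
        grad_b += 2.0 * err / n;
    }

    // Move against the gradient, scaled by the learning rate.
    (w - lr * grad_w, b - lr * grad_b)
}
```

Once the gradient lines have to be derived by hand for every new model, that is usually the point where an autodiff framework starts to pay for itself.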
Conversion path
matten is f64; Candle workflows are commonly f32, so the boundary involves a precision
conversion as well as a copy. The shape carries over directly. Illustratively (Candle is not
a matten dependency):
#![allow(unused)]
fn main() {
use matten::Tensor;
// candle_core = { version = "0.x", features = ["..."] }
let t = Tensor::new(vec![1.0, 2.0, 3.0, 4.0], &[2, 2]);
let shape = t.shape().to_vec();
let flat_f32: Vec<f32> = t.as_slice().iter().map(|&v| v as f32).collect();
// let device = candle_core::Device::Cpu;
// let candle_t = candle_core::Tensor::from_vec(flat_f32, shape, &device)?;
}
Common pitfalls
- f64→f32 is a precision change, not just a copy. Do it once at the boundary and be aware of the loss.
- Don’t expect matten to provide autograd or layers. If you reach for those, you have already crossed into ML-framework territory.
- A single update step is not a model. If your “training” is one hand-written step, consider whether you actually need a framework yet.
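To make the precision pitfall concrete, the following sketch narrows a flat buffer once and measures the worst round-trip error. It uses only the standard library; `to_f32` and `max_narrowing_error` are hypothetical helper names, not part of matten or Candle.

```rust
/// Narrow a flat f64 buffer to f32 once, at the boundary.
fn to_f32(data: &[f64]) -> Vec<f32> {
    data.iter().map(|&v| v as f32).collect()
}

/// Largest absolute error introduced by the f64 -> f32 -> f64 round trip.
fn max_narrowing_error(data: &[f64]) -> f64 {
    data.iter()
        .map(|&v| (v - (v as f32) as f64).abs())
        .fold(0.0, f64::max)
}
```

Values such as 0.5 or 2.0 survive the cast exactly, while a value such as 0.1 does not. Checking the error once at the boundary shows whether the loss matters for your data before it is fed into a model.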
Performance / positioning notes
There is no matten-vs-Candle benchmark. They occupy different layers (a plain numeric
tensor vs. an autodiff/device ML framework), and a cross-framework comparison would be
RFC-049 Phase 3, which is not authorized. Choose Candle for ML capability and device
support, not on the basis of a measured speed comparison.
Minimal checklist
- You actually need autodiff, layers/optimizers, or device execution.
- You handle the f64→f32 precision change once, at the boundary.
- You are not treating matten as an ML framework it never claimed to be.
Migrating to NumPy (Python scientific stack)
NumPy is the foundation of the Python scientific ecosystem (SciPy, scikit-learn, Pandas, and most ML tooling sit on top of it). Move here when the workflow’s center of gravity is Python, or when you need a library that only exists in that ecosystem.
This is a cross-language boundary, so it is manual/conceptual: there is no automatic
Rust↔Python conversion, and matten does not provide one. The realistic pattern is to use
matten as an upstream Rust producer and hand the data to Python through a serialization
format.
Choose this target when
- Your team or downstream pipeline is in Python.
- You need a Python-only library (SciPy, scikit-learn, a specific ML/stats package).
- The numeric work belongs next to Python data tooling rather than in a Rust binary.
Do not choose this target when
- You want to stay in Rust → ndarray (general arrays) or nalgebra (linear algebra).
- The work is small and already lives happily in matten.
Concept mapping
| matten (Rust) | NumPy (Python) |
|---|---|
| Tensor (f64, row-major) | numpy.ndarray (default C/row-major) |
| tensor.shape() | array.shape |
| tensor.to_vec() / into_vec() (flat row-major) | array.ravel() / array.reshape(..) |
| .matmul(&b) | a @ b |
| axis reductions | a.sum(axis=..) / a.mean(axis=..) |
Example migrations
- Any numeric example (e.g. 35_linear_regression_gradient_descent, 36_heat_equation_1d) → reimplement in NumPy when the surrounding pipeline is Python; the row-major layout and shape transfer directly.
Conversion path
Hand data across the language boundary via a serialization format. The flat data is row-major, which matches NumPy’s default, so only the shape needs to travel with it:
matten (Rust): tensor.to_vec() + tensor.shape()
↓ write to a shared format (CSV, JSON, or .npy / Arrow for larger data)
NumPy (Python): np.loadtxt(...) / np.load("data.npy").reshape(shape)
For small data, CSV/JSON is simplest; for larger or repeated transfers, a binary format
(.npy, or Arrow) avoids text parsing overhead. There is no in-process bridge — the two
runtimes do not share memory here.
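As one possible shape for the binary hand-off, the sketch below encodes a flat row-major f64 buffer and its shape as .npy bytes (format version 1.0) using only the standard library. `encode_npy_f64` is a hypothetical helper, not a matten API; it assumes you have already obtained the flat data and shape (for example from tensor.to_vec() and tensor.shape()).

```rust
/// Encode a flat row-major f64 buffer plus its shape as .npy (format 1.0) bytes.
fn encode_npy_f64(data: &[f64], shape: &[usize]) -> Vec<u8> {
    assert_eq!(
        data.len(),
        shape.iter().product::<usize>(),
        "shape does not match data length"
    );

    // Python tuple literal for the shape; a 1-D shape needs a trailing comma: (4,)
    let dims: Vec<String> = shape.iter().map(|d| d.to_string()).collect();
    let shape_str = if dims.len() == 1 {
        format!("({},)", dims[0])
    } else {
        format!("({})", dims.join(", "))
    };

    // '<f8' = little-endian f64; fortran_order False = row-major (C order).
    let mut header = format!(
        "{{'descr': '<f8', 'fortran_order': False, 'shape': {}, }}",
        shape_str
    );

    // Pad with spaces and end with '\n' so the preamble is a multiple of 64 bytes.
    // Preamble = 6 magic bytes + 2 version bytes + 2 header-length bytes + header.
    let unpadded = 10 + header.len() + 1;
    let padding = (64 - unpadded % 64) % 64;
    header.push_str(&" ".repeat(padding));
    header.push('\n');

    let mut out = Vec::with_capacity(10 + header.len() + data.len() * 8);
    out.extend_from_slice(b"\x93NUMPY");
    out.extend_from_slice(&[1, 0]);
    out.extend_from_slice(&(header.len() as u16).to_le_bytes());
    out.extend_from_slice(header.as_bytes());
    for v in data {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}
```

Writing the result with std::fs::write("data.npy", bytes) gives a file that np.load("data.npy") can read. Because the .npy header carries the dtype and shape, the array should arrive already shaped; with CSV or JSON the shape still has to travel separately.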
Common pitfalls
- No automatic bridge. Plan an explicit serialization step; do not expect in-process conversion between Rust and Python.
- Carry the shape. The flat buffer is row-major (NumPy’s default), but you must reattach the shape on the Python side.
- f64 everywhere in matten. If the Python side wants float32, cast there.
Performance / positioning notes
There is no matten-vs-NumPy benchmark, and one would be a cross-language RFC-049
Phase 3 comparison, which is not authorized. NumPy is C/BLAS-backed and fast on dense
numeric work, but the reason to migrate here is usually ecosystem and language fit, not
a measured speed comparison against matten.
Minimal checklist
- The workflow’s home is Python (team, pipeline, or a Python-only library).
- You have a concrete serialization hand-off (CSV/JSON for small, .npy/Arrow for larger) and you carry the shape across.
- You are not expecting an in-process Rust↔Python conversion.
Benchmarks
matten keeps a small, reproducible benchmarking and positioning program
(RFC-049). Its goal is to describe matten’s position honestly and with
evidence, not to win a performance contest.
The program answers questions like:
- What is matten good at?
- Where is it intentionally simpler?
- Where is it slower but acceptable?
- Where would performance become a blocker?
- How much code does a user write to solve small problems?
It deliberately does not claim that matten replaces ndarray, nalgebra,
NumPy, SciPy, Pandas, or Candle. matten is a small, approachable, Tensor-centered
Rust numeric crate for PoC, learning, and small workflows; the benchmarks exist to
make that position legible.
Current status
The benchmark program is staged.
- Phase 1 (internal Rust baseline): implemented and accepted. A benchmark harness (benchmarks/, kept outside the workspace and unpublished); a core micro set and five scenario workloads drawn from the examples; a peak-RSS memory note on Linux; and an accepted internal baseline report.
- Phase 2 (Rust peer comparison, ndarray/nalgebra): complete and accepted. The official peer comparison was filled from a maintainer run on the baseline’s machine class and accepted by architect ruling on 2026-06-25. Peer tasks are opt-in behind the peers feature (off by default).
Still deferred (designed in RFC-049, not yet implemented/authorized):
- Phase 3 — ecosystem reference comparison (NumPy/Pandas), script-driven;
- Phase 4 — regression tracking policy and hard thresholds/gates.
Read next
Two paths, depending on what you need:
- Just want the results? → Results — a curated, readable summary of the latest numbers (Phase 1 internal baseline and Phase 2 peer comparison), with the “positioning, not ranking” framing. This is the reader’s page.
- Need to regenerate or extend the benchmarks? → Methodology for what is measured and the rules that keep the program honest, then the harness README.md in benchmarks/ for the maintainer path: the environment-capture snippet and the exact cargo bench … commands under How to regenerate (with environment capture).
The full reports (complete tables, environment, regeneration commands) live in
benchmarks/reports/.
Benchmark methodology
This page records how matten’s benchmarks are measured and the rules that keep the
program honest. It reflects Phase 1 (internal Rust baseline) of RFC-049.
Purpose
Clarify matten’s position with reproducible evidence: execution time, memory
behavior, example-code size (ELOC), and dependency footprint. The output is a
positioning and regression-visibility tool, not a ranking or a marketing claim.
Non-goals
The benchmark program must not:
- claim matten is faster than NumPy, or a replacement for ndarray/nalgebra;
- include SciPy, Pandas, Candle, or GPU suites;
- add hard CI speed-fail thresholds (initially);
- change any public API merely to make a benchmark faster;
- pressure the project into scope creep.
Metrics
- Execution time: measured with criterion for Rust microbenchmarks. Inputs are pinned and built outside the timed body, no printing happens inside the measured section, and black_box is used to prevent the optimizer from deleting the work.
- Memory: peak resident set size (see below). Informative, not a gate.
- Example ELOC and dependency footprint: reported alongside timings when available, to show approachability and dependency trade-offs.
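The execution-time rules above can be illustrated without the harness. The sketch below is a standard-library stand-in, not the project's criterion setup: `median_time` and `add_elementwise` are hypothetical names, and a real criterion run adds warm-up and statistical analysis that this omits.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Time `work` over `samples` runs and return the median duration.
fn median_time<T>(samples: usize, mut work: impl FnMut() -> T) -> Duration {
    assert!(samples > 0, "need at least one sample");
    let mut times: Vec<Duration> = (0..samples)
        .map(|_| {
            let start = Instant::now();
            // black_box keeps the optimizer from discarding the unused result.
            black_box(work());
            start.elapsed()
        })
        .collect();
    times.sort();
    times[times.len() / 2]
}

/// Elementwise add over pinned inputs, the kind of body a micro-benchmark measures.
fn add_elementwise(a: &[f64], b: &[f64]) -> Vec<f64> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}
```

A caller would build the inputs first (for example two 4096-element vectors), then call median_time(100, || add_elementwise(black_box(&a), black_box(&b))) and print the result only after the measurement has finished, which mirrors the three rules: pinned inputs, no printing in the timed body, and black_box around inputs and output.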
Workloads (Phase 1)
A core micro set: construction, reshape/flatten, elementwise add/mul,
broadcasting, sum/mean, sum_axis/mean_axis, matmul, and a small slice. An
optional dynamic try_numeric micro-workload is available behind the harness’s
dynamic feature.
A scenario set of five small, well-known computations taken from the examples: cosine similarity, a Markov-chain step, a tiny PageRank step, a linear-regression gradient-descent step, and a 1-D heat-equation step.
Heavier examples (k-means, nearest-neighbor, finite differences, trapezoidal integration) and any peer/reference comparisons are deferred to later phases.
Memory measurement policy
Phase 1 uses Linux peak RSS, which is coarse but adequate and requires no allocator instrumentation:
/usr/bin/time -v cargo bench --manifest-path benchmarks/Cargo.toml --bench scenarios -- --noplot
# record "Maximum resident set size"
Measuring smaller per-scenario commands gives a more useful figure than one giant
run. No custom global allocator and no allocation-level instrumentation are added in
Phase 1. macOS (/usr/bin/time -l) and Windows are deferred; memory must never block
Phase 1 if allocation-level measurement is not ready.
Environment recording
Every report records: OS, kernel, CPU, RAM, rustc version, target, build profile,
the exact command, and the peak-RSS tool. Benchmarks are workload- and
environment-specific; numbers from different machines are not directly comparable.
A runnable capture snippet for these fields, plus the full regenerate steps, lives in the harness README under How to regenerate (with environment capture).
CI policy
CI compile-checks the harness (cargo bench --manifest-path benchmarks/Cargo.toml --no-run) but does not run full benchmarks. CI may fail if the harness does not
compile, a report generator breaks, or a result schema is invalid — but never because
a run is slower or uses more memory than a previous run. There are no hard performance
gates.
Required disclaimer (in every report)
These results are workload-specific and environment-specific. They are for positioning and regression visibility, not universal ranking.
Phase 2 — Rust peer comparison (implemented)
Phase 2 was authorized once the maintainer-run internal baseline was accepted, and the peer-comparison harness is implemented. Peer comparison is:
- task-scoped, not library-scoped: a task is included only if the compared implementations solve the same small mathematical problem with comparable data representation and no hidden extra library capability. It is a Rust peer comparison for positioning, never a competitor ranking or a “faster than X” claim;
- opt-in: behind the peers feature (ndarray/nalgebra as optional deps), off by default, so the default harness build and ordinary CI stay peer-free. The peers bench is compile-checked only in a separate, manually triggered or scheduled workflow, never with speed gates;
- isolated: published crates are positively proven free of peer dependencies by scripts/check-published-dependency-isolation.sh (the matten-ndarray → ndarray bridge is the one allowed exception).
Run it with cargo bench --manifest-path benchmarks/Cargo.toml --features peers --bench peers -- --noplot; results go in benchmarks/reports/peer-comparison-v0.1.md.
The Phase 2 harness, report template, and official peer report are complete: the
official Rust peer comparison was filled from a maintainer run on the same machine class as
the accepted internal baseline and accepted by architect ruling on 2026-06-25
(benchmarks/reports/peer-comparison-v0.1.md, Report ID
matten-rfc049-rust-peer-comparison-v0.1). Phase 3 (NumPy/Pandas) and hard performance
gates remain not authorized.
Benchmark results
This page is the reader’s view: a curated summary of matten’s benchmark results so they are
readable from inside the book. It is a small representative selection, not the full matrix — the
complete numbers, environment details, and regeneration steps live in the reports under
benchmarks/reports/. If you want to run the benchmarks, see the
methodology and the harness README.md.
These numbers are workload-specific and environment-specific. They were produced on one virtualized machine with microbenchmark methodology. They are a positioning and regression-visibility reference — not a ranking, and not a “faster than X” claim.
matten optimizes for time to a runnable PoC, not benchmark leadership.
The numbers below are the v0.2 maintainer refresh at workspace 0.28.3, produced under the
unchanged RFC-049 methodology. The architect-accepted reference baseline is v0.1 (see the reports);
the relative positioning matches v0.1. Absolute timings drift run-to-run with VM load — all
libraries move together — so the shape of the results is the signal, not the exact microseconds.
Phase 1 — internal baseline
matten measured against itself, to establish a reference point and make future regressions
visible (RFC-049 Phase 1).
- Baseline ID: matten-rfc049-internal-baseline-v0.2, a maintainer refresh at v0.28.3 (reference: …-v0.1, accepted 2026-06-24).
- Environment: Ubuntu 26.04, 8 vCPU AMD (virtualized), rustc 1.93.1, profile bench (opt-level 3), Criterion defaults; git 5953c9f, workspace 0.28.3. Not comparable across machines.
Representative medians (full table in the report):
| Workload | Time (median) |
|---|---|
| construction (4096-element vector) | ~1.0 µs |
| elementwise add (4096 elements) | ~10.3 µs |
| matmul (64×64) | ~78 µs |
| sum_axis + mean_axis (64×64, combined) | ~1.30 ms |
| cosine similarity (len 512) | ~803 ns |
| linear-regression GD step (m=256) | ~2.23 µs |
Peak RSS was not captured in this refresh (the VM lacked GNU /usr/bin/time); it is
informative-only and never a gate. The accepted v0.1 baseline recorded ~44 MiB for the full
scenario run under the same methodology, dominated by Criterion’s own footprint rather than the
small tensors.
The clearest signal is that axis reductions are currently matten’s most expensive core path —
the combined sum_axis/mean_axis workload (~1.30 ms) is roughly 400× the whole-tensor
sum/mean (~3.23 µs) and ~17× a 64×64 matmul. This is recorded as positioning /
regression-visibility information, not a defect: it is the natural first place to look if
axis-reduction cost ever matters for your workload.
Phase 2 — Rust peer comparison
The same small problems placed next to two established Rust numeric crates, ndarray and
nalgebra, each in its native type (RFC-049 Phase 2). This shows where matten’s approachable
Tensor API sits — including where it is slower but acceptable — not a ranking of libraries.
- Report ID: matten-rfc049-rust-peer-comparison-v0.2, a maintainer refresh at v0.28.3 (reference: …-v0.1, accepted 2026-06-25).
- Environment: same machine class as the baseline; git 5953c9f, workspace 0.28.3, ndarray 0.17.2, nalgebra 0.33.3. Peer tasks are opt-in behind the peers feature (off by default). This run was taken at ndarray 0.17.2, so the harness now matches the matten-ndarray bridge’s supported ndarray version. Not comparable across machines.
Representative Criterion medians (full six-task table in the report):
| Task | matten | ndarray | nalgebra |
|---|---|---|---|
| markov step (v·P, n=64) | ~924 ns | ~1.16 µs | ~2.15 µs |
| cosine similarity (len 512) | ~626 ns | ~175 ns | ~138 ns |
| matmul (64×64) | ~80.8 µs | ~10.8 µs | ~10.7 µs |
| heat step (operator·u, n=64) | ~6.77 µs | ~752 ns | ~741 ns |
On these small dense kernels the production-oriented peers generally carry less overhead than
matten’s Tensor API — expected, and consistent with matten’s DX-first role. The size of the
gap is the useful part, and it is not uniform: a vector×matrix step (markov) is competitive here —
ahead of both peers at this size — while dense matmul and matrix×vector steps (heat, pagerank)
show the widest gaps (~7.5–9×). A consistent internal pattern is that matten’s matrix×vector path
is its widest gap while its vector×matrix path is competitive — echoing the axis-reduction signal
from Phase 1.
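The gap figures quoted above follow directly from the table. As a small worked check, the sketch below recomputes each task's ratio of matten's median to the faster peer's median; the numbers are copied from the representative table (converted to nanoseconds), and the helper names are hypothetical. A ratio below 1 means matten was ahead on that task in this run.

```rust
/// How many times slower `slow_ns` is than `fast_ns`.
fn slowdown(slow_ns: f64, fast_ns: f64) -> f64 {
    slow_ns / fast_ns
}

/// (task, matten, ndarray, nalgebra) medians in nanoseconds, copied from the table above.
const MEDIANS_NS: [(&str, f64, f64, f64); 4] = [
    ("markov step", 924.0, 1_160.0, 2_150.0),
    ("cosine similarity", 626.0, 175.0, 138.0),
    ("matmul 64x64", 80_800.0, 10_800.0, 10_700.0),
    ("heat step", 6_770.0, 752.0, 741.0),
];

/// Ratio of matten's median to the faster of the two peers, per task.
fn gaps_vs_fastest_peer() -> Vec<(&'static str, f64)> {
    MEDIANS_NS
        .iter()
        .map(|&(task, matten, ndarray, nalgebra)| {
            (task, slowdown(matten, ndarray.min(nalgebra)))
        })
        .collect()
}
```

The same caveat applies to these ratios as to the timings themselves: they describe one run on one virtualized machine and are a positioning aid, not a ranking.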
Read next
- Methodology — what is measured, what is not, and the rules that keep the program honest.
- Full reports with complete tables, environment, and regeneration commands: benchmarks/reports/internal-baseline-v0.2.md and benchmarks/reports/peer-comparison-v0.2.md (and the accepted v0.1 references alongside them).
Phases 3 (NumPy/Pandas reference) and 4 (regression gates) are designed in RFC-049 but deferred and not yet measured.
Development process
This page distils the workflow that applies to every PR in matten. It is
drawn from the common sections that appear across all implementation handoffs
(RFC-002 through RFC-008).
Required QA commands
Run these before requesting review unless the PR is explicitly documentation-only:
cargo fmt --all --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-targets
cargo test --doc
When the PR touches feature-gated behaviour, also run:
cargo test --all-targets --no-default-features
cargo test --all-targets --features serde
cargo test --all-targets --features json
cargo test --all-targets --features csv
cargo test --all-targets --features dynamic
cargo test --all-features
When the PR touches parser, JSON, CSV, indexing, or shape arithmetic, add at least one targeted property or fuzz-oriented test module. The fuzz target does not need to run on every PR, but it must compile if the fuzz crate is included.
Reviewer checklist
Every PR is reviewed against this list regardless of which RFC it implements:
- The implementation keeps the matten::Tensor public surface simple.
- No public lifetime, storage, or dimension generic leaks into user-facing examples.
- Panic-zone APIs have actionable matten … error messages.
- Result-zone APIs do not panic for malformed external input.
- Shape product and allocation-sensitive paths use checked helpers.
- Documentation examples compile and match implemented behaviour.
- Any deferred work is listed explicitly rather than hidden in TODO comments.
Definition of done
A milestone or RFC is complete when:
- all planned PRs are merged;
- acceptance criteria in the RFC are satisfied;
- all QA commands pass;
- README and rustdoc examples for the affected API surface compile and are accurate.
File-size guideline
- Consider splitting a .rs file if it exceeds 300 effective lines of code (ELOC): non-blank, non-comment-only lines.
- Splitting is strongly recommended above 500 ELOC.
- Test code within src/ lives in tests.rs (a sibling file) or in a tests/ subdirectory; use the 2018+ module style (foo.rs + foo/ coexistence, no mod.rs).
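The ELOC definition above (non-blank, non-comment-only lines) is easy to approximate. The sketch below is one hypothetical way to count it, not the project's actual tooling, and it makes a simplifying assumption: only `//`-style comment lines (including doc comments) are recognized.

```rust
/// Count effective lines of code: lines that are neither blank nor comment-only.
/// Simplification: only `//`-style comments are recognized; lines inside
/// `/* ... */` blocks are counted as code.
fn eloc(source: &str) -> usize {
    source
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with("//"))
        .count()
}
```

A line with code followed by a trailing comment still counts, since it is not comment-only. Feeding the function the contents of a file (for example via std::fs::read_to_string) gives a quick number to compare against the 300 and 500 thresholds.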
Design-before-code sequence
Requirements → External design → RFC → Implementation → Tests
Do not widen the public API beyond what an accepted RFC specifies without a follow-up RFC or maintainer approval.
Release checklist
This page documents the steps required before publishing any matten release.
It is the canonical gate referenced by RFC-015.
Before every release
1. Source verification
cargo fmt --all --check
bash scripts/check-core-dependency-boundary.sh # RFC-022 core boundary gate
bash scripts/check-published-dependency-isolation.sh # RFC-049 §B1 per-crate peer-dep isolation
bash scripts/check-matten-data-scope.sh # RFC-042 matten-data anti-scope guard
bash scripts/check-benchmark-dependency-sync.sh # benchmark harness ndarray pin == workspace requirement
bash scripts/check-release-docs.sh # doc-truth + examples naming-band guards
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo clippy --all-targets --no-default-features --features dynamic -- -D warnings
RUSTFLAGS="-D warnings" cargo check --all-targets --all-features
cargo test --all-targets
cargo test --doc --all-features
2. Feature matrix
cargo test --no-default-features
cargo test --no-default-features --features serde
cargo test --no-default-features --features json
cargo test --no-default-features --features csv
cargo test --no-default-features --features dynamic
cargo test --no-default-features --features dynamic,json
cargo test --no-default-features --features dynamic,csv
cargo test --no-default-features --features dynamic,json,csv
cargo test --all-features
3. Examples
cargo check --examples
cargo check --examples --all-features
cargo run --example 00_quickstart
cargo run --example 06_broadcasting
cargo run --example 08_slicing_builder
cargo run --example 12_boundary_error_handling
cargo run --example dynamic_00_quickstart --features dynamic,json,csv
cargo run --example dynamic_05_dirty_csv_cleanup --features dynamic,json,csv
4. MSRV
cargo +1.85.0 build
cargo +1.85.0 test --all-features --quiet
5. Public API audit
Compare the current public surface against docs/src/reference/public-api-snapshot.md.
Allowed root exports:
- Tensor
- MattenError
- DataFormat
- MattenLimits
- SliceBuilder
- Element (under #[cfg(feature = "dynamic")])
- NumericPolicy (under #[cfg(feature = "dynamic")])
Allowed #[doc(hidden)] exports (compiler visibility only, not user-facing):
- IntoSliceRange
- SliceConvert
- SliceSpecRepr
Run a spot-check:
grep -n "^pub use" src/lib.rs
Verify no module accidentally became pub mod.
6. Documentation truth pass
# No stale version strings in user-facing files
grep -R "Status:.*0\.[0-9]\{2\}\." README.md docs/src/ src/lib.rs || true
# No stale "matten 0.x" in runtime messages
grep -rn "matten 0\." src/ | grep -v "CHANGELOG\|#\[" || true
# No version-specific claims in lib.rs crate docs
grep "This is.*0\." src/lib.rs || true
7. CHANGELOG
- Every API change has a changelog entry.
- Changelog entries describe actual changes, not planned ones.
- No changelog entry claims a fix that is not in the code.
8. Version bump
Update Cargo.toml version. During v0.x, patch releases (0.13.x) should not
introduce new public API unless a minor release (0.14.0) is intended.
Additional gates for minor releases (0.14.0, 0.15.0, …)
- New public API has a corresponding accepted RFC.
- Public API snapshot is regenerated and reviewed.
- mdBook examples for new APIs compile and run.
- Migration guide updated if any method signature changed.
Public-dependency-minor changes
When a published crate re-exposes a third-party type in its public API (for example
matten-ndarray exposing ndarray::ArrayD<f64> through to_arrayd/from_arrayd), changing the
supported minor of that dependency is a public-API compatibility event — not a routine
cargo update — and is handled as a lock-step family minor (RFC-030). Before releasing such a
change:
- The change has an accepted RFC recording the supported version(s) and the decision (a single bump vs. a bounded range). Precedent: RFC-062 (ndarray → 0.17), which weighed a 0.16+0.17 range before the maintainer chose a single-version requirement to keep Cargo.toml simple.
- If a range is supported, CI verifies the crate’s tests, doctests, and examples against each supported minor, e.g. cargo update -p <dep> --precise <ver> in a fresh checkout (so the per-job lockfile edit is not committed). A single-version requirement needs only the normal job against the resolved patch (document which patch CI targets).
- No version-conditional bridge/crate code. If the unchanged crate cannot compile against every supported minor, narrow the range instead of adding #[cfg(...)] branches or per-version feature flags.
- Docs state that the resolved dependency minor is part of the crate’s public type identity, name any yanked patch that is excluded and not a tested target, and note that docs.rs renders a single resolved minor even though CI verifies the full range.
- MSRV is re-verified with the new dependency version in the graph. A dependency’s own rust-version is not sufficient: its transitive dependencies can raise the floor independently.
- Core matten dependency isolation is re-confirmed (the published-dependency-isolation guard still passes; the change must not leak a peer dependency into the core graph).
- If the dependency is also used by the workspace-excluded benchmark harness (e.g. a peer pin in benchmarks/Cargo.toml), its pin is synced by hand and check-benchmark-dependency-sync.sh passes. The harness cannot inherit { workspace = true }, so this guard catches a forgotten sync.
v1.0.0 gate
v1.0.0 requires explicit confirmation from the maintainer (nabbisen). It is not triggered automatically by any feature or test passing.
Before v1.0.0, the project should have:
- stable core public API;
- clear dynamic on-ramp story;
- strong, scoped examples;
- reliable diagnostics;
- documented companion-crate boundary (RFC-022);
- clean feature matrix across all profiles.
Architecture
Source layout
src/
lib.rs crate root: public re-exports, #![forbid(unsafe_code)]
error.rs MattenError + DataFormat (RFC-005)
shape.rs validate_shape, strides, coord↔flat helpers (RFC-003)
tensor.rs Tensor struct, constructors, accessors, arange
tensor/
ops.rs shape ops, slicing, boundary APIs (split per 300-ELOC rule)
limits.rs MattenLimits — single source of truth for allocation budgets
convert.rs From/TryFrom trait impls (RFC-004)
reshape.rs permute_axes, reshape helpers (RFC-007)
slice.rs SliceSpec, SliceBuilder, slice_str parser (RFC-008)
ops.rs ops/ module root
ops/
broadcast.rs broadcast_shape, BroadcastCtx, apply_binary (RFC-006)
broadcast/
tests.rs BroadcastCtx unit tests
tensor_ops.rs Add/Sub/Mul/Div for &Tensor pairs
scalar_ops.rs &Tensor op f64, f64 op &Tensor
unary_ops.rs Neg
tests.rs test module root
tests/
tensor.rs construction, shape validation, fill ctors, arange, limits
convert.rs From/TryFrom
error.rs MattenError / DataFormat model
shape.rs row-major index helpers
ops.rs broadcasting, scalar ops
reshape.rs reshape/flatten/transpose/swap_axes/get
slice.rs SliceBuilder, slice_str
math.rs reductions, axis reductions, matmul, NaN policy
dynamic.rs dynamic test dispatcher
dynamic/
element.rs Element model tests
tensor.rs dynamic construction, JSON, CSV
lifecycle.rs storage, utility, is_none_mask, lifecycle
guards.rs accessor guards, diagnostics
policy.rs NumericPolicy, inspection helpers
Module style: foo.rs + foo/ coexistence (Rust 2018+). No mod.rs files.
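The shape.rs entry in the layout above mentions strides and coord↔flat helpers. For readers new to the idea, the sketch below shows what row-major helpers of that kind typically look like. The function names and signatures are hypothetical and are not taken from matten's source; overflow checking is omitted for brevity.

```rust
/// Row-major (C-order) strides, in elements, for a shape.
fn row_major_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last axis backwards; the last axis has stride 1.
    for axis in (0..shape.len().saturating_sub(1)).rev() {
        strides[axis] = strides[axis + 1] * shape[axis + 1];
    }
    strides
}

/// Flat offset of a coordinate, or None if the rank or any index is out of range.
fn coord_to_flat(shape: &[usize], coord: &[usize]) -> Option<usize> {
    if coord.len() != shape.len() || coord.iter().zip(shape).any(|(c, d)| c >= d) {
        return None;
    }
    let strides = row_major_strides(shape);
    Some(coord.iter().zip(&strides).map(|(c, s)| c * s).sum())
}

/// Inverse of `coord_to_flat`: the coordinate of a flat offset.
fn flat_to_coord(shape: &[usize], flat: usize) -> Option<Vec<usize>> {
    let len: usize = shape.iter().product();
    if flat >= len {
        return None;
    }
    let mut rest = flat;
    Some(
        row_major_strides(shape)
            .iter()
            .map(|&s| {
                let idx = rest / s;
                rest %= s;
                idx
            })
            .collect(),
    )
}
```

For a shape of [2, 3, 4] the strides are [12, 4, 1], so the coordinate [1, 2, 3] maps to flat offset 1·12 + 2·4 + 3·1 = 23, and the inverse recovers the same coordinate.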
Public re-exports
#![allow(unused)]
fn main() {
// Numeric core — always available:
pub use crate::error::{DataFormat, MattenError};
pub use crate::limits::MattenLimits;
pub use crate::slice::SliceBuilder;
pub use crate::tensor::Tensor;
// Dynamic on-ramp — under #[cfg(feature = "dynamic")]:
pub use crate::dynamic::Element;
pub use crate::dynamic::NumericPolicy;
// Hidden compiler-visibility plumbing (sealed trait chain):
#[doc(hidden)] pub use crate::slice::{IntoSliceRange, SliceConvert, SliceSpecRepr};
}
Cargo feature matrix
[features]
default = ["serde", "json", "csv"]
serde = ["dep:serde"]
json = ["serde", "dep:serde_json"]
csv = ["dep:csv"]
dynamic = []
Lean build: matten = { version = "0.28", default-features = false }.
The strict compile-time baseline (< 15 s on a modern laptop) applies to the
lean profile. The default profile is the convenient PoC baseline; dynamic
is off by default.
Design invariants
- One primary user type. Every user workflow starts with use matten::Tensor.
- No public lifetimes. All numeric-core methods that take or return tensors use owned values. Internal helpers may borrow, but lifetimes never appear in the public API signature of a method that returns a Tensor.
- No public generics on Tensor. The type is Tensor, not Tensor<T> or Tensor<T, D>. Generic dtype and dimension support belongs to the dynamic path (dynamic).
- #![forbid(unsafe_code)]. Any future exception requires a dedicated RFC.
- Panic zone / Result zone split. Convenience APIs for trusted local code may panic. Every external boundary returns Result<_, MattenError>.
- Checked arithmetic everywhere. Shape products and allocation counts use checked_mul; overflow surfaces as MattenError::Allocation, never wraps.
- Row-major canonical order. All operations that produce a new tensor materialise it in row-major contiguous order.
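Two of the invariants above, the panic/Result split and checked shape arithmetic, can be sketched together. The code below is an illustration under stated assumptions: `ShapeError`, `try_element_count`, and `element_count` are hypothetical stand-ins, not matten's MattenError, MattenLimits, or any real matten function.

```rust
use std::fmt;

/// Hypothetical stand-in for an allocation-style error; not matten's MattenError.
#[derive(Debug, PartialEq)]
enum ShapeError {
    Overflow { shape: Vec<usize> },
    OverLimit { elements: usize, limit: usize },
}

impl fmt::Display for ShapeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ShapeError::Overflow { shape } => {
                write!(f, "element count of shape {:?} overflows usize", shape)
            }
            ShapeError::OverLimit { elements, limit } => {
                write!(f, "{} elements exceeds the limit of {}", elements, limit)
            }
        }
    }
}

/// Result zone: never panics and never wraps on overflow.
fn try_element_count(shape: &[usize], limit: usize) -> Result<usize, ShapeError> {
    let elements = shape
        .iter()
        .try_fold(1usize, |acc, &dim| acc.checked_mul(dim))
        .ok_or_else(|| ShapeError::Overflow { shape: shape.to_vec() })?;
    if elements > limit {
        return Err(ShapeError::OverLimit { elements, limit });
    }
    Ok(elements)
}

/// Panic zone: a convenience wrapper for trusted local code.
fn element_count(shape: &[usize], limit: usize) -> usize {
    try_element_count(shape, limit).unwrap_or_else(|e| panic!("shape error: {}", e))
}
```

The design choice this illustrates is that the checked logic lives in one fallible function, and the convenience form is a thin wrapper that turns the error into an actionable panic message rather than duplicating the arithmetic.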
Milestone sequence
| Version | RFC(s) | Content |
|---|---|---|
| 0.0.1 | — | M0: crate skeleton, MattenError/DataFormat |
| 0.1.0 | RFC-001–005 | M1: Tensor contract, shape model, scalar/vector/matrix |
| 0.2.0 | RFC-004 | M2: construction, arange, From/TryFrom |
| 0.3.0 | RFC-006 | M3: broadcasting, Add/Sub/Mul/Div/Neg |
| 0.4.0 | RFC-007/008 | M4: reshape, transpose, SliceBuilder, slice_str |
| 0.5.0 | RFC-009 | M5: serde, from_json, from_csv |
| 0.6.0–0.7.0 | RFC-010/014 | M6: reductions, matmul, examples, CI gates |
| 0.8.0 | RFC-011/012 | Dynamic alpha: Element, CoW DynamicTensor, dynamic JSON/CSV |
| 0.9.0 | RFC-013 | Dynamic hardening: min_axis/max_axis, missing-value helpers |
| 0.10.0–0.11.0 | — | Stabilization, post-audit, get_flat, NumPy fixtures |
| 0.12.0–0.13.2 | — | Dynamic lifecycle hardening; accessor guards; sealed slice traits |
| 0.13.3 | RFC-015/020 | API stabilization, release checklist, diagnostics |
| 0.14.0 | RFC-016/017/018 | Dynamic on-ramp: NumericPolicy, MattenLimits, try_zeros/try_ones/try_full |
| 0.15.0–0.15.1 | RFC-019/021 | Axis reductions, tutorial/example path, file splits |
| 0.16+ | RFC-022–026 | Companion-crate design phase (design-only RFCs) |