v0.2.0 Mixed multi-field assembly, distributed FEM & open-domain waves

A fast, differentiable, JIT-free, debugging-friendly finite element library for PyTorch.

runnable examples: 50+
problem categories: 11
sparse-solver backends: 6
GPU speedup vs CPU FEM: 10×

Core capabilities

Why TensorMesh

Built for the workflow where finite elements meet deep learning — without sacrificing the speed and accuracy you expect from a real FEM library.

GPU-Native & Differentiable

Built on PyTorch — move the entire FEM workflow to GPU with one line. Autograd flows seamlessly through assembly and solve for end-to-end differentiable PDE pipelines.
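As a rough illustration of what "autograd flows through the solve" means, the sketch below uses plain PyTorch only. It is not the TensorMesh or torch-sla API: a small dense SPD matrix (the hypothetical `K0`) stands in for an assembled stiffness matrix, `torch.linalg.solve` stands in for the sparse solver, and `kappa` is an illustrative material parameter.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for an assembled stiffness matrix: a small SPD matrix.
A = torch.randn(5, 5, dtype=torch.float64)
K0 = A @ A.T + 5.0 * torch.eye(5, dtype=torch.float64)
b = torch.ones(5, dtype=torch.float64)

# Parameter we want a gradient for (for example, a conductivity).
kappa = torch.tensor(2.0, dtype=torch.float64, requires_grad=True)

# Forward solve (kappa * K0) u = b; autograd records the solve.
u = torch.linalg.solve(kappa * K0, b)

# Scalar objective J = b^T u, then backpropagate through the solve.
J = b @ u
J.backward()

# Because u = K0^{-1} b / kappa, J scales as 1 / kappa, so dJ/dkappa = -J / kappa.
expected = -J.detach() / kappa.detach()
print(kappa.grad.item(), expected.item())
```

The closed-form derivative at the end is only there to confirm that the gradient obtained through the solve matches the analytic one.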

Tensorized Assembly

A fully tensorized Map-Reduce algorithm powered by TensorGalerkin fuses element-wise operations into monolithic GPU kernels, yielding order-of-magnitude speedups over CPU-based FEM stacks.
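The map-reduce idea can be sketched in plain PyTorch for linear (P1) triangles. This is a minimal illustration under stated assumptions, not TensorGalerkin's actual implementation: the two-triangle mesh and the names `points`, `cells`, and `Ke` are hypothetical. The "map" step computes every local stiffness matrix in one batched operation, and the "reduce" step sums duplicate entries into the global sparse matrix.

```python
import torch

# Hypothetical tiny mesh: the unit square split into two triangles.
points = torch.tensor([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]],
                      dtype=torch.float64)
cells = torch.tensor([[0, 1, 2], [0, 2, 3]])
n_nodes = points.shape[0]

# Map: all local 3x3 stiffness matrices at once, no Python loop over elements.
p = points[cells]                                  # (E, 3, 2) vertex coordinates
e1, e2 = p[:, 1] - p[:, 0], p[:, 2] - p[:, 0]      # (E, 2) edge vectors
det = e1[:, 0] * e2[:, 1] - e1[:, 1] * e2[:, 0]    # (E,) twice the signed area
area = 0.5 * det.abs()

# Gradient of each P1 basis function: the opposite edge rotated by 90 degrees.
d = p[:, [1, 2, 0]] - p[:, [2, 0, 1]]              # (E, 3, 2)
grads = torch.stack([d[..., 1], -d[..., 0]], dim=-1) / det[:, None, None]
Ke = area[:, None, None] * (grads @ grads.transpose(1, 2))   # (E, 3, 3)

# Reduce: scatter local entries to global indices; coalesce() sums duplicates.
rows = cells[:, :, None].expand(-1, 3, 3).reshape(-1)
cols = cells[:, None, :].expand(-1, 3, 3).reshape(-1)
K = torch.sparse_coo_tensor(torch.stack([rows, cols]), Ke.reshape(-1),
                            (n_nodes, n_nodes)).coalesce()
print(K.to_dense())
```

Every step is a batched tensor operation, so the same code runs on a GPU by moving `points` and `cells` to that device.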

JIT-Free & Debugging-Friendly

Eager execution with no compilation overhead. Dynamic meshes, adaptive refinement, and interactive workflows just work — no recompilation latency, no opaque traces.

Element & Mesh Support

Triangular, tetrahedral, pyramid, and prismatic elements. Automated mesh generation for common geometries with seamless Gmsh and VTK-HDF5 I/O.

Flexible Solvers

Powered by torch-sla — linear, nonlinear, and eigenvalue solvers across CPU/GPU backends with autograd, batched solves, and multi-GPU scaling.

Pythonic API

Custom weak forms in pure Python — no DSL, no form compiler. If you can write PyTorch, you can write FEM.

Gallery

See it in action

Real outputs from the example gallery — meshes, fields, and animations rendered straight from TensorMesh.

Wave

Phononic crystals

Bloch–Floquet band structure; mean error 0.08% vs COMSOL.

Poisson

3D Poisson

Tetrahedral mesh, cut view of the scalar field.

Diffusion (animation)

Allen–Cahn phase field

Nonlinear time evolution with Newton iteration per step.

Wave (animation)

Wave equation

Explicit central-difference time integration.

Solid

Hyperelastic rubber

Large-deformation solid mechanics with a Newton solver.

Wave

Open-domain waves

Waveguide-coupled silicon microdisk with PML absorbing layers.

Fluid

Taylor–Hood Stokes

Stabilization-free mixed P2–P1 assembly with optimal convergence.

Solid

Modal analysis

Natural frequencies and mode shapes of a cantilever cylinder.

Maxwell

Magnetostatics

3D magnetic field around a current-carrying wire (stabilized nodal curl–curl).

Inverse design (animation)

Topology optimization

Compliance minimization via the Optimality Criteria method.

Physics-informed learning

A network trained to minimize the assembled Galerkin residual.

Fluid

Lid-driven cavity

Incompressible Navier–Stokes with Taylor–Hood (P2–P1) mixed elements.

Learn by example

50+ runnable examples

Eleven problem categories — every script ships with the repo, and every rendered result lives in the docs.

4 examples

Basics

Mesh viz, basis functions, element gallery.

3 examples

Poisson

2D/3D Poisson, batched RHS, h-adaptivity.

5 examples

Diffusion

Heat equation and Allen–Cahn phase field.

8 examples

Wave

Time-domain wave, Helmholtz, phononic band structures, PML & ports.

9 examples

Solid

Hyperelasticity, contact, plasticity, geomechanics, modal analysis.

8 examples

Fluid

Taylor–Hood Stokes, lid-driven cavity, cylinder flow, Rayleigh–Bénard, Taylor–Green.

1 example

Magnetostatics

3D Maxwell — field around a wire via nodal curl–curl.

4 examples

Inverse design

Coefficient ID and density-based topology optimization, via autograd.

1 example

Physics-informed

Train a network to minimize the assembled Galerkin residual.

3 examples

Dataset

Batch mesh & field generation for ML training.

4 examples

Distributed

Multi-GPU assembly, mesh partitioning, graph coloring.

Open the full example gallery →

Quickstart

From mesh to solution in pure Python

A complete Poisson solver — no DSL, no JIT, no surprises. Just PyTorch autograd flowing through every step.

quickstart.py

import math
import torch
from tensormesh import ElementAssembler, NodeAssembler, Mesh, Condenser

# 1. Triangular mesh of the unit square.
mesh = Mesh.gen_rectangle(chara_length=0.05)

# 2. Stiffness weak form:  a(u, v) = ∫ ∇u · ∇v dΩ
class LaplaceAssembler(ElementAssembler):
    def forward(self, gradu, gradv):
        return gradu @ gradv

# 3. Load weak form:  l(v) = ∫ f v dΩ
class SourceAssembler(NodeAssembler):
    def forward(self, v, f):
        return f * v

# 4. Source term, evaluated at every mesh node.
x, y = mesh.points[:, 0], mesh.points[:, 1]
f_vals = 2 * math.pi**2 * torch.sin(math.pi * x) * torch.sin(math.pi * y)

# 5. Assemble.
K = LaplaceAssembler.from_mesh(mesh)()
b = SourceAssembler.from_mesh(mesh)(point_data={"f": f_vals})

# 6. Apply Dirichlet BCs via static condensation, then solve.
condenser = Condenser(mesh.boundary_mask)
K_, b_ = condenser(K, b)
u = condenser.recover(K_.solve(b_, verbose=True))

Numerical solution on the unit square, approximating the exact solution u(x, y) = sin(πx)sin(πy)

$ python quickstart.py
[torch-sla] solve: n=431, nnz=2859, dtype=float64, device=cpu, symmetric=True, spd=False, backend=scipy, method=lu
L2 error: 3.135e-03
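The source term in step 4 is a manufactured one: it is chosen so that u(x, y) = sin(πx)sin(πy), which vanishes on the boundary of the unit square, satisfies −Δu = f. The check below confirms this with autograd at random sample points. It uses plain PyTorch and does not depend on TensorMesh; the sample count and variable names are illustrative.

```python
import math
import torch

# Random sample points inside the unit square.
torch.manual_seed(0)
xy = torch.rand(64, 2, dtype=torch.float64, requires_grad=True)
x, y = xy[:, 0], xy[:, 1]

# Candidate exact solution and the quickstart's source term.
u = torch.sin(math.pi * x) * torch.sin(math.pi * y)
f = 2 * math.pi**2 * torch.sin(math.pi * x) * torch.sin(math.pi * y)

# First derivatives; create_graph=True keeps the graph for second derivatives.
(grad_u,) = torch.autograd.grad(u.sum(), xy, create_graph=True)

# Laplacian = d2u/dx2 + d2u/dy2, one autograd pass per coordinate.
(g_x,) = torch.autograd.grad(grad_u[:, 0].sum(), xy, retain_graph=True)
(g_y,) = torch.autograd.grad(grad_u[:, 1].sum(), xy)
laplacian = g_x[:, 0] + g_y[:, 1]

# The strong-form residual should vanish up to floating-point round-off.
residual = (-laplacian - f).abs().max()
print(f"max |-Δu - f| = {residual.item():.2e}")
```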

The same script runs on GPU once you add mesh = mesh.cuda(), and becomes differentiable with respect to the node coordinates once you call mesh.points.requires_grad_(True).
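To make concrete what differentiating with respect to node coordinates looks like, here is a small device-agnostic sketch in plain PyTorch. It does not use the TensorMesh `Mesh` class: the two-triangle mesh is hypothetical, and the total mesh area stands in for a quantity computed from an FEM solution.

```python
import torch

# Use the GPU when one is present; the rest of the code is device-agnostic.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical two-triangle mesh of the unit square with differentiable nodes.
points = torch.tensor([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]],
                      dtype=torch.float64, device=device, requires_grad=True)
cells = torch.tensor([[0, 1, 2], [0, 2, 3]], device=device)

# Total mesh area from per-triangle 2D cross products.
p = points[cells]
e1, e2 = p[:, 1] - p[:, 0], p[:, 2] - p[:, 0]
area = 0.5 * (e1[:, 0] * e2[:, 1] - e1[:, 1] * e2[:, 0]).abs().sum()

# Gradient of the area with respect to every node coordinate.
area.backward()
print(area.item())
print(points.grad)
```

Each row of `points.grad` points away from the centre of the square: moving a corner outward is the direction that increases the area.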

Read the full quickstart →

Performance