8 Kernels | Pure Rust + WASM (wasm32-wasip2) | No CUDA | No unsafe

Production inference kernels you can actually read.

8 fused Rust + WASM (wasm32-wasip2) kernels covering the full LLM inference stack: embedding, attention, KV-cache, RoPE, LayerNorm+GeLU, fused MLP, SwiGLU, and cognitive-faculty WASM. No CUDA. No unsafe code. Ships to any runtime.

$1,500
Flat fee — source included
110 tests
Across 8 kernels, all green
0 unsafe
Pure safe Rust — audit it yourself
wasm32-wasip2
Ships to any runtime, no CUDA needed

Why we built it

Our agent memory system (NovaMem) processes memories through a vector search layer. We needed an embedding kernel we could verify, modify, and compile to wasm32-wasip2 (WASI Preview 2) without pulling in a Python runtime or CUDA toolchain. So we wrote one from scratch in Rust.

Approach | Cost | Auditable | WASM-compatible
sentence-transformers (Python) | Free | Complex | No
ONNX Runtime | Free | Partial | Partial
Custom CUDA kernel | $10K–$50K consulting | Yes | No
blitz-embedding | $1,500 | Yes (pure safe Rust) | Yes (wasm32-wasip2)

What's in the box

1. Fused embedding pipeline

Token lookup → mean pooling → layer norm → L2 normalization in a single pass. No intermediate allocations.

2. Ragged batch support

Real variable-length inputs with no padding needed. Tested at BERT-base scale: 256 sequences × 64 tokens × 768 dimensions.

3. Load your own weights

EmbeddingTable::from_weights() accepts your checkpoint. Outputs are unit-norm, so cosine similarity search reduces to a plain dot product.

4. 30-min architecture call included

We walk you through the code, integration path, and answer every question. Flat fee, no ongoing obligation.
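
To make the fused pipeline and ragged batching described above concrete, here is a minimal sketch in safe Rust. It is not the blitz-embedding source: the names RaggedBatch, embed_ragged, and dot are hypothetical, the layer norm omits learned scale and bias, and the offsets layout is an assumption. It only illustrates the idea of running lookup, mean pooling, layer norm, and L2 normalization in place in the caller's output buffer, with no padding and no intermediate allocations.

```rust
/// Hypothetical ragged batch (not the product's API): all token ids are
/// concatenated, and sequence i is ids[offsets[i]..offsets[i + 1]].
pub struct RaggedBatch<'a> {
    pub ids: &'a [u32],
    pub offsets: &'a [usize],
}

/// Embeds each sequence into one `dim`-wide row of `out` (row-major).
/// `table` is assumed to be a row-major vocab x dim weight matrix.
/// All four stages work in place on the output row, so nothing is allocated.
pub fn embed_ragged(table: &[f32], dim: usize, batch: &RaggedBatch, out: &mut [f32]) {
    const EPS: f32 = 1e-5; // assumed layer-norm epsilon
    assert!(dim > 0, "dim must be non-zero");
    let n_seqs = batch.offsets.len().saturating_sub(1);
    assert_eq!(out.len(), n_seqs * dim, "output buffer has the wrong size");

    for (seq, row) in out.chunks_exact_mut(dim).enumerate() {
        let ids = &batch.ids[batch.offsets[seq]..batch.offsets[seq + 1]];
        row.fill(0.0);
        if ids.is_empty() {
            continue; // an empty sequence yields an all-zero row
        }

        // 1. Token lookup, accumulated directly into the output row.
        for &id in ids {
            let start = id as usize * dim;
            for (acc, w) in row.iter_mut().zip(&table[start..start + dim]) {
                *acc += *w;
            }
        }

        // 2. Mean pooling over the true sequence length (no padding tokens).
        let inv_len = 1.0 / ids.len() as f32;
        for x in row.iter_mut() {
            *x *= inv_len;
        }

        // 3. Layer norm without learned scale or bias (a simplification).
        let mean = row.iter().sum::<f32>() / dim as f32;
        let var = row.iter().map(|x| (x - mean) * (x - mean)).sum::<f32>() / dim as f32;
        let inv_std = 1.0 / (var + EPS).sqrt();
        for x in row.iter_mut() {
            *x = (*x - mean) * inv_std;
        }

        // 4. L2 normalization, so every non-empty row has unit length.
        let norm = row.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in row.iter_mut() {
                *x /= norm;
            }
        }
    }
}

/// For unit-norm vectors, the dot product equals the cosine similarity.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Because the rows come out unit-norm, a caller can rank candidates with dot alone. The offsets array is what lets sequences of different lengths share one flat buffer without padding.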

Who this is for

If you're building an embedding pipeline and want a kernel you can actually read, audit, and ship to any runtime, this is for you.

Early Access — No Risk

If you buy and find anything in the source that doesn't match the spec, reach out. We'll fix it or refund you immediately. We want customers who got exactly what they expected.

Pricing

One-time purchase. No subscriptions. No vendor lock-in.

Early Access
$1,500
blitz-embedding — source, tests, and integration support.
  • Full source (pure safe Rust)
  • wasm32-wasip2 compatible binary
  • EmbeddingTable::from_weights() API
  • 18 tests (15 unit + 3 doc), all green
  • 30-min architecture call included
  • No-questions refund if spec not met
Get Early Access — $1,500
Enterprise
$15,000 /year
Full kernel library + custom builds + SLA.
  • Unlimited kernel library access
  • Custom kernel requests (priority)
  • Dedicated support engineer
  • SLA: 99.9% delivery uptime
  • Quarterly performance reviews
  • Early access to new kernels
Contact Sales

Optional: Kernel Support

Keep your kernels current as hardware and models evolve.

$200/month
Priority support + kernel updates as new GPU architectures ship.
Add Support — $200/mo

A kernel you can read is a kernel you can trust.

Flat fee. Source included. 30-min integration call. No ongoing obligation.

Contact to Buy — from $1,500