Today we’re publishing Lattica’s first technical whitepaper: an end-to-end look at the system we’ve built to make Fully Homomorphic Encryption practical for production AI inference.
FHE has been called the holy grail of cryptography for more than a decade. The maths has been ready; the engineering has not. Existing libraries expose low-level primitives, assume CPU-only execution, and don’t fit into modern ML pipelines. The whitepaper explains, in detail, how we close that gap.
What the paper covers
Architecture
A clean client/server split with key management on the client, heavy homomorphic computation on the server, and a tensor programming model that ties cryptography to accelerators.
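To make the split concrete, here is a minimal sketch of the flow, using a toy additively homomorphic scheme (a one-time pad mod p) as a stand-in rather than Lattica’s actual API: the client generates and keeps all key material, the server only ever receives and computes on ciphertexts, and decryption happens back on the client. Names like `Client` and `server_homomorphic_sum` are illustrative only.

```python
import secrets

P = 2**61 - 1  # toy plaintext/ciphertext modulus

class Client:
    def __init__(self, n_inputs: int):
        # secret keys stay on the client; they are never sent to the server
        self.keys = [secrets.randbelow(P) for _ in range(n_inputs)]

    def encrypt(self, values):
        return [(v + k) % P for v, k in zip(values, self.keys)]

    def decrypt_sum(self, ct_sum):
        return (ct_sum - sum(self.keys)) % P

def server_homomorphic_sum(ciphertexts):
    # the server computes on ciphertexts only: no plaintexts, no keys
    return sum(ciphertexts) % P

client = Client(n_inputs=3)
cts = client.encrypt([10, 20, 12])        # encrypted inference request
result_ct = server_homomorphic_sum(cts)   # heavy compute happens server-side
print(client.decrypt_sum(result_ct))      # 42, recovered only on the client
```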
GPU acceleration
How we re-engineer FHE primitives — NTTs, modular arithmetic, basis conversions, ciphertext rotations — into CUDA kernels that exploit the parallelism GPUs are built for.
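As a rough illustration of what those kernels compute, here is a plain-Python iterative Cooley-Tukey NTT over Z_q with toy parameters (q = 17, n = 4); production FHE uses ring dimensions in the tens of thousands and 50-to-60-bit primes, and the whitepaper’s CUDA kernels are of course not this code. The point is the structure: every butterfly within a stage is independent, which is exactly the parallelism a GPU exploits.

```python
def ntt(a, q, root):
    """Iterative Cooley-Tukey number-theoretic transform over Z_q.
    `root` must be a primitive len(a)-th root of unity mod q,
    and len(a) a power of two."""
    a = list(a)
    n = len(a)
    # bit-reversal permutation (reorders input for in-place butterflies)
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(n) stages; all butterflies inside one stage are independent,
    # so on a GPU each stage maps to one massively parallel kernel launch
    length = 2
    while length <= n:
        w_step = pow(root, n // length, q)
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + length // 2):
                u, v = a[k], a[k + length // 2] * w % q
                a[k] = (u + v) % q
                a[k + length // 2] = (u - v) % q
                w = w * w_step % q
        length <<= 1
    return a

# toy parameters: 13 is a primitive 4th root of unity mod 17
print(ntt([1, 2, 3, 4], q=17, root=13))   # -> [10, 6, 15, 7]
```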
Tensor programming model
Why representing ciphertexts and keys as integer tensors with explicit shapes and strides is the right abstraction for both cryptographers and hardware engineers.
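For a picture of what that looks like, here is a hypothetical layout in NumPy (not Lattica’s actual one): an RNS-form RLWE ciphertext is a pair of ring elements, each split across several residue limbs, so the whole object is a dense block of 64-bit integers with an explicit shape and stride, exactly like a weight tensor.

```python
import numpy as np

poly_degree = 8192          # ring dimension N (toy choice)
rns_limbs = 3               # number of RNS moduli in the ciphertext modulus chain
components = 2              # an RLWE ciphertext is a pair of ring elements (b, a)

ciphertext = np.zeros((components, rns_limbs, poly_degree), dtype=np.uint64)
print(ciphertext.shape, ciphertext.strides)   # shape and strides are explicit

# A batch of ciphertexts is just one more leading axis, which is exactly the
# kind of object a GPU kernel or an ML runtime already knows how to tile.
batch = np.zeros((16, components, rns_limbs, poly_degree), dtype=np.uint64)
```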
Hardware portability via HEAL
The Homomorphic Encryption Abstraction Layer: the contract that lets GPU, FPGA, and ASIC backends plug into the same stack without re-implementing the cryptography.
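A hypothetical sketch of what such a contract can look like in Python (the real HEAL interface, its method names, and its signatures may differ): the cryptographic layer is written once against a small set of tensor-level primitives, and each hardware backend implements only those.

```python
from abc import ABC, abstractmethod
import numpy as np

class HEBackend(ABC):
    """Illustrative hardware-abstraction contract: GPU, FPGA, and ASIC
    backends implement these tensor-level primitives; the cryptography
    above them never needs to be re-implemented per device."""

    @abstractmethod
    def ntt(self, poly: np.ndarray, modulus: int) -> np.ndarray:
        """Forward number-theoretic transform of an integer tensor."""

    @abstractmethod
    def pointwise_mulmod(self, a: np.ndarray, b: np.ndarray, modulus: int) -> np.ndarray:
        """Element-wise modular multiplication (ciphertext multiply in the NTT domain)."""

    @abstractmethod
    def rotate(self, ciphertext: np.ndarray, steps: int) -> np.ndarray:
        """Homomorphic slot rotation of a packed ciphertext."""
```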
Model adaptation
How standard ML models become FHE-compatible: approximating non-linear layers and restructuring computation graphs to reduce ciphertext depth and rotation overhead.
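One concrete example of that adaptation, sketched here with a generic least-squares fit rather than the paper’s actual approximation method: an FHE circuit can only add and multiply, so a ReLU is swapped for a low-degree polynomial, and the degree is kept small because every extra multiplication deepens the ciphertext circuit.

```python
import numpy as np

xs = np.linspace(-4, 4, 512)
relu = np.maximum(xs, 0.0)

coeffs = np.polyfit(xs, relu, deg=4)       # degree-4 polynomial approximation
poly_relu = np.poly1d(coeffs)

# Multiplicative depth is what matters homomorphically: evaluating a degree-4
# polynomial with Horner's rule costs 4 sequential ciphertext multiplications,
# so lower degree means a shallower circuit and cheaper parameters.
print(np.max(np.abs(poly_relu(xs) - relu)))   # worst-case error on the range
```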
Security & trust model
Why keys never leave the client, what the server can and cannot see, and how this compares to TEEs, SMPC, and differential privacy for per-query inference.
Who it’s for
- ML engineers evaluating whether encrypted inference is real enough to put behind a product surface.
- Hardware teams looking for a concrete, tensor-shaped target to optimise FHE workloads against.
- Security and compliance leads who need to understand the trust model before approving an encrypted-AI deployment.
- Researchers interested in how a production FHE stack actually composes, end to end.
Read the paper
The whitepaper is free to read and share. If you have feedback, questions, or want to dig into the parts we couldn’t fit into 28 pages, we’d love to hear from you.