Today we’re publishing Lattica’s first technical whitepaper: an end-to-end look at the system we’ve built to make Fully Homomorphic Encryption practical for production AI inference.
FHE has been called the holy grail of cryptography for more than a decade. The maths has been ready; the engineering has not. Existing libraries expose low-level primitives, assume CPU-only execution, and don’t fit into modern ML pipelines. The whitepaper explains, in detail, how we close that gap.
What the paper covers
Architecture
A clean client/server split with key management on the client, heavy homomorphic computation on the server, and a tensor programming model that ties cryptography to accelerators.
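To make the split concrete, here is a minimal sketch of the flow, using a toy additively homomorphic scheme (a one-time pad mod p) as a stand-in rather than Lattica’s actual API: the client generates and keeps all key material, the server only ever receives and computes on ciphertexts, and decryption happens back on the client. Names like `Client` and `server_homomorphic_sum` are illustrative only.

```python
import secrets

P = 2**61 - 1  # toy plaintext/ciphertext modulus

class Client:
    def __init__(self, n_inputs: int):
        # secret keys stay on the client; they are never sent to the server
        self.keys = [secrets.randbelow(P) for _ in range(n_inputs)]

    def encrypt(self, values):
        return [(v + k) % P for v, k in zip(values, self.keys)]

    def decrypt_sum(self, ct_sum):
        return (ct_sum - sum(self.keys)) % P

def server_homomorphic_sum(ciphertexts):
    # the server computes on ciphertexts only: no plaintexts, no keys
    return sum(ciphertexts) % P

client = Client(n_inputs=3)
cts = client.encrypt([10, 20, 12])        # encrypted inference request
result_ct = server_homomorphic_sum(cts)   # heavy compute happens server-side
print(client.decrypt_sum(result_ct))      # 42, recovered only on the client
```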
GPU acceleration
How we re-engineer FHE primitives — NTTs, modular arithmetic, basis conversions, ciphertext rotations — into CUDA kernels that exploit the parallelism GPUs are built for.
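As a rough illustration of what those kernels compute, here is a plain-Python iterative Cooley-Tukey NTT over Z_q with toy parameters (q = 17, n = 4); production FHE uses ring dimensions in the tens of thousands and 50-to-60-bit primes, and the whitepaper’s CUDA kernels are of course not this code. The point is the structure: every butterfly within a stage is independent, which is exactly the parallelism a GPU exploits.

```python
def ntt(a, q, root):
    """Iterative Cooley-Tukey number-theoretic transform over Z_q.
    `root` must be a primitive len(a)-th root of unity mod q,
    and len(a) a power of two."""
    a = list(a)
    n = len(a)
    # bit-reversal permutation (reorders input for in-place butterflies)
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(n) stages; all butterflies inside one stage are independent,
    # so on a GPU each stage maps to one massively parallel kernel launch
    length = 2
    while length <= n:
        w_step = pow(root, n // length, q)
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + length // 2):
                u, v = a[k], a[k + length // 2] * w % q
                a[k] = (u + v) % q
                a[k + length // 2] = (u - v) % q
                w = w * w_step % q
        length <<= 1
    return a

# toy parameters: 13 is a primitive 4th root of unity mod 17
print(ntt([1, 2, 3, 4], q=17, root=13))   # -> [10, 6, 15, 7]
```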
Tensor programming model
Why representing ciphertexts and keys as integer tensors with explicit shapes and strides is the right abstraction for both cryptographers and hardware engineers.
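For a picture of what that looks like, here is a hypothetical layout in NumPy (not Lattica’s actual one): an RNS-form RLWE ciphertext is a pair of ring elements, each split across several residue limbs, so the whole object is a dense block of 64-bit integers with an explicit shape and stride, exactly like a weight tensor.

```python
import numpy as np

poly_degree = 8192          # ring dimension N (toy choice)
rns_limbs = 3               # number of RNS moduli in the ciphertext modulus chain
components = 2              # an RLWE ciphertext is a pair of ring elements (b, a)

ciphertext = np.zeros((components, rns_limbs, poly_degree), dtype=np.uint64)
print(ciphertext.shape, ciphertext.strides)   # shape and strides are explicit

# A batch of ciphertexts is just one more leading axis, which is exactly the
# kind of object a GPU kernel or an ML runtime already knows how to tile.
batch = np.zeros((16, components, rns_limbs, poly_degree), dtype=np.uint64)
```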
Hardware portability via HEAL
The Homomorphic Encryption Abstraction Layer: the contract that lets GPU, FPGA, and ASIC backends plug into the same stack without re-implementing the cryptography.
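A hypothetical sketch of what such a contract can look like in Python (the real HEAL interface, its method names, and its signatures may differ): the cryptographic layer is written once against a small set of tensor-level primitives, and each hardware backend implements only those.

```python
from abc import ABC, abstractmethod
import numpy as np

class HEBackend(ABC):
    """Illustrative hardware-abstraction contract: GPU, FPGA, and ASIC
    backends implement these tensor-level primitives; the cryptography
    above them never needs to be re-implemented per device."""

    @abstractmethod
    def ntt(self, poly: np.ndarray, modulus: int) -> np.ndarray:
        """Forward number-theoretic transform of an integer tensor."""

    @abstractmethod
    def pointwise_mulmod(self, a: np.ndarray, b: np.ndarray, modulus: int) -> np.ndarray:
        """Element-wise modular multiplication (ciphertext multiply in the NTT domain)."""

    @abstractmethod
    def rotate(self, ciphertext: np.ndarray, steps: int) -> np.ndarray:
        """Homomorphic slot rotation of a packed ciphertext."""
```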
Model adaptation
How standard ML models become FHE-compatible: approximating non-linear layers and restructuring computation graphs to reduce ciphertext depth and rotation overhead.
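One concrete example of that adaptation, sketched here with a generic least-squares fit rather than the paper’s actual approximation method: an FHE circuit can only add and multiply, so a ReLU is swapped for a low-degree polynomial, and the degree is kept small because every extra multiplication deepens the ciphertext circuit.

```python
import numpy as np

xs = np.linspace(-4, 4, 512)
relu = np.maximum(xs, 0.0)

coeffs = np.polyfit(xs, relu, deg=4)       # degree-4 polynomial approximation
poly_relu = np.poly1d(coeffs)

# Multiplicative depth is what matters homomorphically: evaluating a degree-4
# polynomial with Horner's rule costs 4 sequential ciphertext multiplications,
# so lower degree means a shallower circuit and cheaper parameters.
print(np.max(np.abs(poly_relu(xs) - relu)))   # worst-case error on the range
```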
Security & trust model
Why keys never leave the client, what the server can and cannot see, and how this compares to TEEs, SMPC, and differential privacy for per-query inference.
Who it’s for
- ML engineers evaluating whether encrypted inference is real enough to put behind a product surface.
- Hardware teams looking for a concrete, tensor-shaped target to optimise FHE workloads against.
- Security and compliance leads who need to understand the trust model before approving an encrypted-AI deployment.
- Researchers interested in how a production FHE stack actually composes, end to end.
Read the paper
The whitepaper is free to read and share. If you have feedback, questions, or want to dig into the parts we couldn’t fit into 28 pages, we’d love to hear from you.