FULLY HOMOMORPHIC ENCRYPTION
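FHE lets a server run computations directly on ciphertext. The data is never decrypted on the server, never seen, and cannot be reconstructed there. Yet the result, once the user decrypts it, matches what the same computation would produce on plaintext: exactly for integer schemes such as BGV, and to within a small approximation error for CKKS.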
THE PRIMITIVE
A cryptographic scheme where operations on ciphertext correspond to operations on the underlying plaintext.
The defining property
Any function f (a matrix multiply, an attention layer, a similarity score, a SQL predicate) can be evaluated on encrypted inputs, producing an encrypted output that decrypts to the correct plaintext answer.
The user's secret key never leaves their device. Plaintext is encrypted before it touches the network.
The server runs additions and multiplications directly on encrypted values. Each operation on ciphertexts carries out the matching operation on the plaintexts hidden inside them.
Only the user can decrypt the encrypted output. The server never learns the input, intermediates, or answer.
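The encrypt, compute, decrypt flow above can be sketched with a toy scheme. This is a minimal illustration only: it uses a symmetric integer-based construction in the style of van Dijk, Gentry, Halevi and Vaikuntanathan, not the CKKS or BGV schemes named elsewhere on this page, and the key and noise sizes are assumptions chosen for readability, not security. The function names (keygen, encrypt, decrypt, server_eval) are hypothetical.

```python
import secrets

def keygen(bits=256):
    """Secret key: a random odd integer p with its top bit set. Stays on the client."""
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def encrypt(p, bit):
    """c = p*q + 2*r + bit: the bit hides under a multiple of p plus small even noise."""
    q = secrets.randbits(512)
    r = secrets.randbits(8)
    return p * q + 2 * r + bit

def decrypt(p, c):
    """Strip the multiple of p (centered remainder), then read the parity."""
    rem = c % p
    if rem > p // 2:
        rem -= p
    return rem % 2

def server_eval(ca, cb, cc):
    """Runs without any key: multiplication acts as AND, addition as XOR."""
    return ca * cb + cc

# Client encrypts, server evaluates (a AND b) XOR c, client decrypts.
p = keygen()
results = {}
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            out = server_eval(encrypt(p, a), encrypt(p, b), encrypt(p, c))
            results[(a, b, c)] = decrypt(p, out)
```

The server only ever touches the integers returned by encrypt; without p it cannot separate the bit from the noise and the multiple of p.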
THE 18-YEAR BOTTLENECK
The math has been known since Gentry's 2009 thesis. The problem was never feasibility; it was performance. For most of the last decade, FHE ran 10,000 to 1,000,000 times slower than the same computation on plaintext.
Figure: one encrypted nearest-neighbor query (cosine similarity with a threshold filter) over a small vector index. Ranges are illustrative; bars are log-scaled for visibility.
A single encrypted number is a high-degree polynomial with thousands of coefficients. A 4-byte weight becomes kilobytes of ciphertext.
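A rough back-of-the-envelope calculation makes the expansion concrete. The sketch below assumes a ring-based ciphertext made of two polynomials, with an illustrative ring degree of 8192 and a 200-bit coefficient modulus; these parameters and the function name are assumptions for the example, not Lattica's configuration.

```python
def ciphertext_bytes(ring_degree, modulus_bits, num_polys=2):
    """Size of one ciphertext: num_polys polynomials of ring_degree coefficients each."""
    return num_polys * ring_degree * modulus_bits // 8

N = 8192        # assumed ring degree ("thousands of coefficients")
LOG_Q = 200     # assumed coefficient modulus size in bits

size = ciphertext_bytes(N, LOG_Q)        # bytes for one ciphertext
expansion_single = size // 4             # versus one 4-byte weight stored alone
slots = N // 2                           # CKKS-style packing: N/2 values per ciphertext
bytes_per_packed_value = size // slots   # amortized cost when every slot is used
```

Under these assumptions one ciphertext is 409,600 bytes. Packing many values into the slots of a single ciphertext brings the amortized cost down to 100 bytes per value, which is why packing matters so much in practice.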
Every operation adds noise. Once the noise crosses a threshold, decryption fails, so an expensive 'bootstrapping' step must reset it first. Bootstrapping has historically been the slowest operation in FHE.
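Noise growth can be watched directly in a toy integer scheme. The sketch below is an illustration under loud assumptions: the scheme is a simplified symmetric construction, not CKKS or BGV; the noise is measured with the secret key, which a real server cannot do (real systems track analytic bounds instead); and refresh is only a stand-in for bootstrapping, which performs the same reset homomorphically without ever decrypting.

```python
import secrets

P_BITS = 128
p = secrets.randbits(P_BITS) | (1 << (P_BITS - 1)) | 1   # toy secret key (odd)

def encrypt(bit):
    r = 8 + secrets.randbelow(8)                 # small fresh noise
    return p * secrets.randbits(256) + 2 * r + bit

def noise(c):
    """Centered remainder mod p: the noise term that must stay below p/2."""
    rem = c % p
    return rem - p if rem > p // 2 else rem

def decrypt(c):
    return noise(c) % 2

def refresh(c):
    """Stand-in for bootstrapping (cheats by using the key): returns a fresh ciphertext."""
    return encrypt(decrypt(c))

def run(steps=8):
    budget_bits = (p // 2).bit_length() - 1      # noise must stay under 2**budget_bits
    c = encrypt(1)
    trace, refreshed_at = [], []
    for step in range(1, steps + 1):
        # Squaring roughly doubles the noise bit length; reset before it overflows.
        if 2 * noise(c).bit_length() > budget_bits:
            c = refresh(c)
            refreshed_at.append(step)
        c = c * c                                # an encryption of 1 squared is still 1
        trace.append((step, noise(c).bit_length(), decrypt(c)))
    return trace, refreshed_at

trace, refreshed_at = run()
```

With these toy sizes the noise bit length roughly doubles at every squaring, so the budget is exhausted after four multiplications and a refresh is needed before the fifth.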
Open-source FHE libraries were built for CPUs. A single encrypted inference call could take hours: fine for papers, unusable in production.
FHE natively supports + and ×. The non-linearities at the heart of AI inference (activations, softmax, comparisons, normalizations) must be approximated by polynomials. Done naively, this destroys accuracy or explodes cost.
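The accuracy versus cost trade-off can be seen with a plain (unencrypted) Chebyshev approximation of a sigmoid. This is a generic textbook technique shown as a sketch; the interval [-8, 8], the degrees tried, and the function names are assumptions for illustration, and nothing here is specific to Lattica's approximations.

```python
import math

def chebyshev_coeffs(f, degree, lo, hi):
    """Coefficients of the Chebyshev interpolant of f on [lo, hi]."""
    n = degree + 1
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    values = [f(0.5 * (hi - lo) * t + 0.5 * (hi + lo)) for t in nodes]
    coeffs = []
    for j in range(n):
        s = sum(values[k] * math.cos(math.pi * j * (k + 0.5) / n) for k in range(n))
        coeffs.append(2.0 * s / n)
    coeffs[0] *= 0.5
    return coeffs

def chebyshev_eval(coeffs, x, lo, hi):
    """Clenshaw recurrence: uses only additions and multiplications."""
    t = (2.0 * x - lo - hi) / (hi - lo)
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * t * b1 - b2, b1
    return coeffs[0] + t * b1 - b2

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def max_error(degree, lo=-8.0, hi=8.0, samples=2001):
    coeffs = chebyshev_coeffs(sigmoid, degree, lo, hi)
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(chebyshev_eval(coeffs, x, lo, hi) - sigmoid(x)) for x in xs)

# Higher degree means lower error but more encrypted multiplications.
errors = {d: max_error(d) for d in (3, 7, 15, 31)}
```

A low degree keeps the circuit shallow but misses the shape of the function, while a high degree tracks it closely at the price of more multiplications. Because a degree-d polynomial needs roughly log2(d) sequential multiplication levels, every extra level also consumes noise budget.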
WHY LATTICA
Lattica rebuilt the FHE stack from the kernels up around HEAL, our Homomorphic Encryption Abstraction Layer. Encrypted workloads compile to tensor operations and run on whatever hardware is fastest today (GPU, TPU, FPGA, or FHE-specific ASIC) with no software rewrite.
HEAL is our Homomorphic Encryption Abstraction Layer: think 'CUDA for FHE'. We compile CKKS and BGV primitives (NTT, key switching, rescaling, bootstrapping) to tensor operations that run on GPUs, TPUs, FPGAs, and FHE-specific ASICs without rewriting the software.
FHE is linear algebra at its core: polynomial rings, NTTs, matrix-style key switching. That makes it a natural fit for tensor hardware. HEAL lowers encrypted workloads to tensor operations so the accelerator does what it is already best at.
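To make the linear-algebra claim concrete, the sketch below writes a number-theoretic transform (NTT) as a plain matrix-vector product and uses it to multiply two polynomials modulo x^N + 1. The tiny parameters (N = 4, modulus 17, root 2) and all names are assumptions for illustration; this is not HEAL's API, and production code uses fast O(N log N) transforms over far larger rings.

```python
Q = 17      # toy prime modulus with Q ≡ 1 (mod 2N)
N = 4       # ring degree: polynomials are reduced modulo x^N + 1
PSI = 2     # primitive 2N-th root of unity mod Q (2**4 ≡ -1 mod 17)

# Forward transform: evaluate the polynomial at the odd powers of PSI.
V = [[pow(PSI, (2 * i + 1) * j, Q) for j in range(N)] for i in range(N)]

# Inverse transform: entry (j, i) is N^-1 * PSI^-((2i+1) j).
N_INV = pow(N, -1, Q)
PSI_INV = pow(PSI, -1, Q)
V_INV = [[N_INV * pow(PSI_INV, (2 * i + 1) * j, Q) % Q for i in range(N)]
         for j in range(N)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) % Q for row in m]

def ntt_multiply(a, b):
    """Polynomial product mod (x^N + 1, Q) as: matvec, pointwise multiply, matvec."""
    fa, fb = matvec(V, a), matvec(V, b)
    pointwise = [x * y % Q for x, y in zip(fa, fb)]
    return matvec(V_INV, pointwise)

def schoolbook_negacyclic(a, b):
    """Reference O(N^2) product with the wrap-around sign flip x^N = -1."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + x * y) % Q
            else:
                out[k - N] = (out[k - N] - x * y) % Q
    return out

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
fast = ntt_multiply(a, b)
slow = schoolbook_negacyclic(a, b)

# Transforming a batch of polynomials is the same matrix applied to many vectors.
batch = [matvec(V, poly) for poly in (a, b)]
```

Stacking the vectors in batch as columns turns the loop into a single matrix-matrix product, which is the kind of operation tensor hardware is built to run.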
Modern accelerators are massively parallel. We pack many ciphertexts into a single batched computation and process them together at effectively no extra cost, so large models and deep inference graphs scale without a per-item penalty.
We use carefully bounded polynomial approximations of the activations, comparisons, and normalizations used across AI inference and analytics, preserving accuracy to within a fraction of a percent of plaintext.
10,000×+ speedup over CPU reference implementations on encrypted AI inference workloads
<1% accuracy delta vs. plaintext baselines on production workloads
Zero plaintext exposure on Lattica infrastructure, by construction
THE STACK
Every layer, from GPU kernels up to the developer SDK, is designed for one job: making FHE fast enough to ship.
HEAL, our 'CUDA for FHE': CKKS and BGV primitives compile to tensor operations and run on GPU, TPU, FPGA, or FHE-specific ASICs, with no software rewrite when the hardware changes.
FHE is linear algebra under the hood, a natural match for tensor hardware. The compiler lowers high-level AI models and workloads to batched tensor ops that the accelerator runs in parallel.
A scheduler that batches ciphertexts, amortizes bootstrapping, and pipelines accelerator work.
Secret keys never leave the customer's device. Lattica only ever holds public evaluation keys and ciphertext. Plaintext is never seen, never stored, never reconstructable.