FULLY HOMOMORPHIC ENCRYPTION

Compute on encrypted data. No decryption. No exposure. No trust required.

FHE lets a server run computations directly on ciphertext. The data is never decrypted, never seen, never reconstructable, yet the result, when decrypted by the user, is identical to the result of running the same computation on plaintext.

THE PRIMITIVE

What is Fully Homomorphic Encryption?

A cryptographic scheme where operations on ciphertext correspond to operations on the underlying plaintext.

The defining property

ENCRYPT            Enc(x)       user-side
COMPUTE            f(Enc(x))    server-side
ENCRYPTED RESULT   Enc(f(x))    server output
DECRYPT            f(x)         user-side

Any function f (a matrix multiply, an attention layer, a similarity score, a SQL predicate) evaluates on encrypted inputs and produces an encrypted output that decrypts to the correct plaintext answer.

Encrypt locally

The user's secret key never leaves their device. Plaintext is encrypted before it touches the network.

Compute on ciphertext

The server runs additions and multiplications directly on encrypted values. The math works under the encryption.

Decrypt only the result

Only the user can decrypt the encrypted output. The server never learns the input, intermediates, or answer.
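
A minimal sketch of the whole loop, using a toy LWE-style additively homomorphic scheme. Everything below is illustrative: toy parameters with no real security, and not Lattica's scheme. It exists only to show the defining property end to end, the server adds ciphertexts without the key, and the user still decrypts the right answer.

    import random

    q     = 2**32           # ciphertext modulus
    n     = 64              # secret dimension (toy-sized; real schemes use far more)
    t     = 256             # plaintext modulus
    delta = q // t          # scaling factor that keeps the message above the noise

    secret = [random.randrange(q) for _ in range(n)]   # never leaves the user's device

    def encrypt(m):
        # User-side: hide delta*m behind an inner product with the secret key
        # plus a small noise term (the LWE assumption).
        a = [random.randrange(q) for _ in range(n)]
        e = random.randrange(-8, 9)
        b = (sum(ai * si for ai, si in zip(a, secret)) + delta * m + e) % q
        return (a, b)

    def add(ct1, ct2):
        # Server-side: component-wise addition of ciphertexts. No key, no plaintext.
        (a1, b1), (a2, b2) = ct1, ct2
        return ([(x + y) % q for x, y in zip(a1, a2)], (b1 + b2) % q)

    def decrypt(ct):
        # User-side: strip the mask with the secret key, then round away the noise.
        a, b = ct
        noisy = (b - sum(ai * si for ai, si in zip(a, secret))) % q
        return round(noisy / delta) % t

    ct = add(encrypt(17), encrypt(25))   # the server only ever sees ciphertext
    assert decrypt(ct) == 42             # Enc(17) + Enc(25) decrypts to 17 + 25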

vs. other privacy approaches
  • TEEs: trust the hardware vendor
  • MPC: needs every party online and chatty
  • Federated: trust the aggregator
  • FHE: trust no one. Provably.

THE 18-YEAR BOTTLENECK

Why FHE was too slow to use

The math has been known since Gentry's 2009 thesis. The problem was never feasibility; it was performance. For most of the last decade, FHE ran 10,000 to 1,000,000 times slower than the same computation on plaintext.

One encrypted query over a small vector DB (cosine similarity + threshold)

  Plaintext (GPU)               ~5 ms
  FHE on CPU (2018)             ~6 hours
  FHE on CPU (2026)             ~19 minutes
  Lattica FHE via HEAL (GPU)    ~200 ms

Illustrative ranges for an encrypted nearest-neighbor query (cosine similarity with a threshold filter) over a small vector index.

Ciphertexts are huge

A single encrypted number is a high-degree polynomial with thousands of coefficients. A 4-byte weight becomes kilobytes of ciphertext.
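
The arithmetic, with illustrative CKKS-style parameters (our assumptions, not Lattica's production settings). Packing many values into one ciphertext amortizes the cost; a lone value still carries the full overhead.

    ring_dim = 2**14    # polynomial degree N: 16,384 coefficients per ring element
    log_q    = 438      # total bit-width of the coefficient modulus
    polys    = 2        # a fresh ciphertext is a pair of ring elements

    ct_bytes  = polys * ring_dim * log_q // 8   # ~1.8 MB for one ciphertext
    slots     = ring_dim // 2                   # CKKS packs N/2 values per ciphertext
    per_value = ct_bytes / slots                # ~220 bytes per packed value

    print(f"{ct_bytes / 2**20:.1f} MiB per ciphertext, "
          f"~{per_value:.0f} bytes per packed 4-byte value")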

Noise growth & bootstrapping

Every operation adds noise. Once the noise crosses a threshold, an expensive 'bootstrapping' step must reset it; bootstrapping has historically been the slowest operation in FHE.
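
A toy budget model, with made-up numbers, shows why multiplication depth is the enemy:

    FRESH_BUDGET_BITS = 180   # headroom between the noise and the message at encryption
    COST_PER_MULT     = 20    # budget consumed by each multiplication (rescaling)

    budget, depth = FRESH_BUDGET_BITS, 0
    while budget >= COST_PER_MULT:
        budget -= COST_PER_MULT
        depth += 1

    print(f"{depth} multiplications, then bootstrapping must reset the budget")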

CPU-only reference libraries

Open-source FHE libraries were built for CPUs. A single encrypted inference call could take hours: fine for papers, unusable in production.

Non-linear functions are hard

FHE natively supports + and ×. The non-linearities at the heart of AI inference (activations, softmax, comparisons, normalizations) must be approximated by polynomials. Done naively, this destroys accuracy or explodes cost.
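
A quick illustration of the naive failure mode: fit ReLU with ordinary least squares over a wide, un-normalized range. Function, range, and degrees here are illustrative choices.

    import numpy as np
    from numpy.polynomial import Polynomial

    xs   = np.linspace(-10.0, 10.0, 2001)   # a wide, un-normalized input range
    relu = np.maximum(xs, 0.0)

    for degree in (3, 7, 15):
        fit = Polynomial.fit(xs, relu, deg=degree)   # plain least-squares fit
        err = np.abs(fit(xs) - relu).max()
        print(f"degree {degree:2d}: worst-case error {err:.3f}")

    # The error shrinks slowly with degree, while under FHE every extra degree
    # means more ciphertext multiplications and more noise to manage.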

WHY LATTICA

What changed

Lattica rebuilt the FHE stack from the kernels up around HEAL, our Homomorphic Encryption Abstraction Layer. Encrypted workloads compile to tensor operations and run on whatever hardware is fastest today (GPU, TPU, FPGA, or FHE-specific ASIC) with no software rewrite.

Classical FHE

  • Built for CPUs, can't exploit modern accelerators
  • Hand-written circuits per workload
  • Sequential, no real batching
  • Locked to one stack, rewrite the world for new hardware
  • Minutes to hours per query

Lattica with HEAL

  • FHE built for acceleration hardware, lowered to tensor ops
  • Compile models and workloads from a high-level SDK
  • Massively parallel, free batching across ciphertexts
  • Hardware-agnostic: GPU today, TPU/FPGA/ASIC tomorrow
  • Sub-second on production hardware

HEAL and acceleration

HEAL, hardware-agnostic FHE

Our Homomorphic Encryption Abstraction Layer, think 'CUDA for FHE'. We compile CKKS and BGV primitives (NTT, key switching, rescaling, bootstrapping) to tensor operations that run on GPUs, TPUs, FPGAs, and FHE-specific ASICs without rewriting the software.

FHE meets tensors

FHE is linear algebra at its core: polynomial rings, NTTs, matrix-style key switching. That makes it a natural fit for tensor hardware. HEAL lowers encrypted workloads to tensor operations so the accelerator does what it is already best at.
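
To make that concrete, here is the workhorse transform, the NTT over Z_q[x]/(x^n + 1), written as nothing more than matrix-vector products and an elementwise multiply. Toy parameters, and the naive O(n²) form rather than the fast butterfly version:

    n, q = 8, 97    # toy ring Z_q[x]/(x^n + 1); q = 1 (mod 2n) so the roots exist

    # psi is a primitive 2n-th root of unity mod q; psi^n = -1 encodes the
    # negacyclic wrap-around of x^n + 1. omega = psi^2 drives the NTT itself.
    psi   = next(x for x in range(2, q) if pow(x, n, q) == q - 1)
    omega = pow(psi, 2, q)

    # The forward transform is literally a matrix: T[i][j] = psi^j * omega^(i*j).
    T = [[pow(psi, j, q) * pow(omega, i * j, q) % q for j in range(n)]
         for i in range(n)]

    def matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) % q for row in M]

    def negacyclic_mul(a, b):
        # Transform, multiply pointwise, transform back: polynomial multiplication
        # in the FHE ring becomes two matmuls and an elementwise product.
        fa, fb = matvec(T, a), matvec(T, b)
        fc = [x * y % q for x, y in zip(fa, fb)]
        n_inv, psi_inv, omega_inv = pow(n, -1, q), pow(psi, -1, q), pow(omega, -1, q)
        return [n_inv * pow(psi_inv, j, q)
                * sum(fc[i] * pow(omega_inv, i * j, q) for i in range(n)) % q
                for j in range(n)]

    # Sanity check against schoolbook multiplication with x^n = -1:
    a, b = [3, 1, 4, 1, 5, 9, 2, 6], [2, 7, 1, 8, 2, 8, 1, 8]
    ref = [0] * n
    for i in range(n):
        for j in range(n):
            k = (i + j) % n
            ref[k] = (ref[k] + (-1) ** ((i + j) // n) * a[i] * b[j]) % q
    assert negacyclic_mul(a, b) == ref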

Free batching

Modern accelerators are massively parallel. We pack many ciphertexts into a single batched computation and process them together at effectively no extra cost, so large models and deep inference graphs scale without a per-item penalty.
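
The shape of the win, sketched with stand-in arrays (placeholders, not real ciphertexts):

    import numpy as np

    batch, n = 256, 512
    q = 2**31 - 1

    cts = np.random.randint(0, 2**16, size=(batch, n)).astype(np.int64)  # stand-ins
    W   = np.random.randint(0, 2**16, size=(n, n)).astype(np.int64)      # shared op

    # One at a time: the accelerator pays launch and memory overhead per item.
    looped = np.stack([ct @ W % q for ct in cts])

    # Batched: the identical work as a single (batch x n) @ (n x n) matmul,
    # the shape accelerators are built to saturate.
    batched = cts @ W % q

    assert np.array_equal(looped, batched)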

Approximations that hold

Carefully bounded polynomial approximations of the activations, comparisons, and normalizations used across AI inference and analytics keep accuracy within fractions of a percent of plaintext.
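
A sketch of the recipe, with an assumed input range and degree: normalize inputs into a fixed interval before encryption, then fit only on that interval.

    import numpy as np
    from numpy.polynomial import Chebyshev

    lo, hi, degree = -8.0, 8.0, 15      # assumed input range and degree
    xs  = np.linspace(lo, hi, 4001)
    sig = 1.0 / (1.0 + np.exp(-xs))

    fit = Chebyshev.fit(xs, sig, deg=degree)   # evaluable with only + and ×
    err = np.abs(fit(xs) - sig).max()
    print(f"degree {degree}: worst-case error {err:.1e} over [{lo}, {hi}]")

The Chebyshev basis keeps the fit stable at higher degrees, and the resulting polynomial evaluates under FHE with additions and multiplications alone.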

10,000×+

Speedup over CPU reference implementations on encrypted AI inference workloads

<1%

Accuracy delta vs. plaintext baselines on production workloads

Zero

Plaintext exposure on Lattica infrastructure, by construction

THE STACK

Built end to end for encrypted data

Every layer, from GPU kernels up to the developer SDK, is designed for one job: making FHE fast enough to ship.

HEAL, Hardware Abstraction

Our 'CUDA for FHE'. CKKS and BGV primitives compile to tensor operations and run on GPU, TPU, FPGA, or FHE-specific ASICs, no software rewrite when the hardware changes.

Compiler

FHE is linear algebra under the hood, a natural match for tensor hardware. The compiler lowers high-level AI models and workloads to batched tensor ops that the accelerator runs in parallel.

Runtime

A scheduler that batches ciphertexts, amortizes bootstrapping, and pipelines accelerator work, as sketched below.
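
A minimal sketch of the amortization idea only, our illustrative model rather than a description of Lattica's actual runtime: when many ciphertexts exhaust their noise budget together, one batched pass refreshes all of them.

    FRESH_BITS, MULT_COST = 180, 20   # illustrative noise-budget numbers

    def run_chain(budgets, levels):
        # Walk a deep multiplication chain; refresh exhausted ciphertexts in bulk.
        passes = 0
        for _ in range(levels):
            needing = [i for i, b in enumerate(budgets) if b < MULT_COST]
            if needing:                 # one batched bootstrap pass for the group
                for i in needing:
                    budgets[i] = FRESH_BITS
                passes += 1
            budgets = [b - MULT_COST for b in budgets]
        return passes

    budgets = [FRESH_BITS] * 256            # 256 ciphertexts in flight
    print(run_chain(budgets, levels=12))    # 1 batched pass, not 256 bootstraps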

Key Management

Secret keys never leave the customer's device. Lattica only ever holds public evaluation keys and ciphertext. Plaintext is never seen, never stored, never reconstructable.