Author Bernard van Gastel
License Apache License 2.0
Library for polymorphic pseudonymisation and encryption. The same library is available in different languages:
libpep-cpp (C++);
libpep on crates.io (Rust).
1 Introduction
This library implements PEP encryption, based on ElGamal over Curve25519, and operations on the resulting encrypted messages. A message M can be encrypted for a receiver with public key Y, which belongs to secret key y with Y = y*G (with G being a generator for the used curve). This encryption is randomised: every encryption uses a different random r, so encrypting the same message twice results in different ciphertexts (encrypted messages). We represent this encryption function as EG(r, M, Y) = (r*G, M + r*Y, Y). A ciphertext (B, C, Y) is decrypted using secret key y by calculating M = C - y*B.
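The encryption and decryption formulas can be tried out with a minimal sketch. It does not use this library or Curve25519: as an assumption made purely for illustration, the additive group of integers modulo a prime stands in for the curve group, which is not secure but obeys the same algebra. The names Ciphertext, encrypt, decrypt, mul, add and sub are hypothetical and not the library's API.

```rust
// TOY GROUP, NOT SECURE: integers modulo the prime Q stand in for the
// Curve25519 group so that the example needs no dependencies.
// All inputs are expected to be smaller than Q.
const Q: u64 = (1 << 61) - 1; // group order (a Mersenne prime)
const G: u64 = 5; // hypothetical generator of the toy group

/// Scalar multiplication s*P in the toy group.
fn mul(s: u64, p: u64) -> u64 {
    ((s as u128 * p as u128) % Q as u128) as u64
}

/// Group addition.
fn add(a: u64, b: u64) -> u64 {
    (a + b) % Q
}

/// Group subtraction.
fn sub(a: u64, b: u64) -> u64 {
    (a + Q - b) % Q
}

/// Ciphertext triple (B, C, Y) = (r*G, M + r*Y, Y).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ciphertext {
    b: u64,
    c: u64,
    y: u64,
}

/// EG(r, M, Y): encrypt message m for public key y_pub with random r.
fn encrypt(r: u64, m: u64, y_pub: u64) -> Ciphertext {
    Ciphertext { b: mul(r, G), c: add(m, mul(r, y_pub)), y: y_pub }
}

/// Recover M = C - y*B with the secret key.
fn decrypt(ct: &Ciphertext, y_sec: u64) -> u64 {
    sub(ct.c, mul(y_sec, ct.b))
}
```

Encrypting the same message twice with different values of r gives two different triples that decrypt to the same message under the same secret key.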
The library supports three operations on a ciphertext in = EG(r, M, Y), that is, the triple (B, C, Y) = (r*G, M + r*Y, Y) encrypting message M for public key Y with random r:
rerandomize(in, s) = out: scrambles a ciphertext with a random s. Both in and out can be decrypted by the same secret key y, both resulting in the same decrypted message M. However, the binary forms of in and out differ. Spec: input EG(r, M, Y) is transformed to EG(r+s, M, Y) by performing (B + s*G, C + s*Y, Y);
reshuffle(in, n) = out: modifies a ciphertext in (an encrypted form of M), so that after decryption of out the decrypted message will be equal to n*M. Spec: input EG(r, M, Y) is transformed to EG(n*r, n*M, Y) by performing (n*B, n*C, Y);
rekey(in, k) = out: if in can be decrypted by secret key y, then out can be decrypted by secret key k*y. Both decryptions result in message M. Spec: input EG(r, M, Y) is transformed to EG(r/k, M, k*Y) by performing (B/k, C, k*Y).
The rekey(in, k) and reshuffle(in, n) operations can be combined in a slightly more efficient rks(in, k, n).
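The three operations and their combination can be checked with a small sketch. As an assumption for illustration only, it uses an insecure toy group (integers modulo a prime) instead of Curve25519, and the function bodies are written from the standard ElGamal algebra rather than copied from this library, so all names and signatures here are hypothetical.

```rust
// TOY GROUP, NOT SECURE: integers modulo the prime Q stand in for the curve.
const Q: u64 = (1 << 61) - 1; // group order (a Mersenne prime)
const G: u64 = 5; // hypothetical generator

fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % Q as u128) as u64 }
fn add(a: u64, b: u64) -> u64 { (a + b) % Q }
fn sub(a: u64, b: u64) -> u64 { (a + Q - b) % Q }

/// Inverse of a non-zero scalar via Fermat's little theorem: s^(Q-2) mod Q.
fn invert(s: u64) -> u64 {
    let (mut base, mut exp, mut acc) = (s, Q - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 { acc = mul(acc, base); }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Ciphertext triple (B, C, Y) = (r*G, M + r*Y, Y).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ciphertext { b: u64, c: u64, y: u64 }

fn encrypt(r: u64, m: u64, y_pub: u64) -> Ciphertext {
    Ciphertext { b: mul(r, G), c: add(m, mul(r, y_pub)), y: y_pub }
}

fn decrypt(ct: &Ciphertext, y_sec: u64) -> u64 {
    sub(ct.c, mul(y_sec, ct.b))
}

/// (B + s*G, C + s*Y, Y): same message, same key, different bytes.
fn rerandomize(v: &Ciphertext, s: u64) -> Ciphertext {
    Ciphertext { b: add(v.b, mul(s, G)), c: add(v.c, mul(s, v.y)), y: v.y }
}

/// (n*B, n*C, Y): decrypts to n*M.
fn reshuffle(v: &Ciphertext, n: u64) -> Ciphertext {
    Ciphertext { b: mul(n, v.b), c: mul(n, v.c), y: v.y }
}

/// (B/k, C, k*Y): decryptable with k*y instead of y.
fn rekey(v: &Ciphertext, k: u64) -> Ciphertext {
    Ciphertext { b: mul(invert(k), v.b), c: v.c, y: mul(k, v.y) }
}

/// Rekey and reshuffle in one pass: ((n/k)*B, n*C, k*Y).
fn rks(v: &Ciphertext, k: u64, n: u64) -> Ciphertext {
    Ciphertext { b: mul(mul(n, invert(k)), v.b), c: mul(n, v.c), y: mul(k, v.y) }
}
```

In this sketch the combined rks applies a single scalar n/k to B, where rekey followed by reshuffle multiplies B twice, which is one plausible reading of why the combined form is slightly more efficient.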
There are also zero-knowledge proof versions of these operations. These are needed so that a party can prove to another party that it has applied the operation to the input data, without revealing the factors used in the operation.
When distributing trust over multiple central servers, these zero-knowledge proofs are essential, so that a malfunctioning server cannot violate the security guarantees of the system. If server B has a secret factor n, server A can check with these zero-knowledge proofs that server B has applied, for example, a reshuffle with factor n to a specific message, where server A only knows the public value N (N = n*G). Server A does not learn anything about n.
2 Applications
For pseudonymisation, the core operation is reshuffle with n. It modifies a main pseudonym with a factor n that is specific to the user (or user group) receiving the pseudonym. After a user-specific factor n has been applied, a pseudonym is called a local pseudonym. The factor n is typically tied to the access group of a user.
Using only a reshuffle is insufficient, as the pseudonym is still encrypted with the public key Y (which can be decrypted by the secret key y). To allow a user to decrypt the encrypted pseudonym, a rekey with k is needed, in combination with a protocol to hand the user the secret key k*y. The factor k is typically tied to the current session of a user.
To make pseudonyms harder to trace, rerandomize is applied frequently. This way, a binary comparison of two encrypted pseudonyms does not reveal whether they encrypt the same pseudonym.
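The pseudonymisation flow described in this section can be sketched end to end. This is an illustration under stated assumptions, not the library's code: an insecure toy group (integers modulo a prime) replaces Curve25519, the pseudonym is a plain number, and local_pseudonym is a hypothetical helper that reshuffles with an access-group factor n, rekeys with a session factor k, and rerandomizes with s.

```rust
// TOY GROUP, NOT SECURE: integers modulo the prime Q stand in for the curve.
const Q: u64 = (1 << 61) - 1; // group order (a Mersenne prime)
const G: u64 = 5; // hypothetical generator

fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % Q as u128) as u64 }
fn add(a: u64, b: u64) -> u64 { (a + b) % Q }
fn sub(a: u64, b: u64) -> u64 { (a + Q - b) % Q }

/// Inverse of a non-zero scalar: s^(Q-2) mod Q.
fn invert(s: u64) -> u64 {
    let (mut base, mut exp, mut acc) = (s, Q - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 { acc = mul(acc, base); }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Ciphertext triple (B, C, Y) = (r*G, M + r*Y, Y).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ciphertext { b: u64, c: u64, y: u64 }

fn encrypt(r: u64, m: u64, y_pub: u64) -> Ciphertext {
    Ciphertext { b: mul(r, G), c: add(m, mul(r, y_pub)), y: y_pub }
}

fn decrypt(ct: &Ciphertext, y_sec: u64) -> u64 {
    sub(ct.c, mul(y_sec, ct.b))
}

/// Hypothetical helper: turn an encrypted main pseudonym into an encrypted
/// local pseudonym for access-group factor n and session factor k.
fn local_pseudonym(main: &Ciphertext, n: u64, k: u64, s: u64) -> Ciphertext {
    // Reshuffle with n and rekey with k in one step.
    let b = mul(mul(n, invert(k)), main.b);
    let c = mul(n, main.c);
    let y = mul(k, main.y);
    // Rerandomize with s under the new public key k*Y.
    Ciphertext { b: add(b, mul(s, G)), c: add(c, mul(s, y)), y }
}
```

A user holding the session key k*y decrypts the result to n times the main pseudonym. Two outputs for the same group differ in binary form but decrypt to the same local pseudonym, while a different group factor yields a different local pseudonym.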
3 Implementation
This library uses the Ristretto encoding on Curve25519. There are a number of arithmetic rules for scalars and group elements: group elements can be added to and subtracted from each other. Scalars support addition, subtraction, and multiplication. Division can be done by multiplying with the inverse (using s.invert() for a non-zero scalar s). A scalar can be converted to a group element (by multiplying with the special generator G), but not the other way around. Group elements can also be multiplied by a scalar.
Group elements are encoded in 32 bytes, but do not cover the full 32-byte range (the top bit is always zero, and some other values are invalid). Therefore, not all AES-256 keys (which use the full 32-byte range) are valid group elements, but all group elements are valid AES-256 keys. Group elements can be generated by GroupElement::random(..) or GroupElement::from_hash(..). Scalars are also 32 bytes, and can be generated with Scalar::random(..) or Scalar::from_hash(..). We ensure that scalars are never zero, to avoid multiplication or division by zero.
The zero-knowledge proofs are offline (non-interactive) Schnorr proofs, based on the Fiat-Shamir transform. The hashing algorithm used is SHA-512.
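The idea of such a proof can be illustrated with a Schnorr-style proof of equal discrete logarithm, made non-interactive with a Fiat-Shamir challenge. This is a sketch under several assumptions and not the library's proof format: the group is an insecure toy group (integers modulo a prime), the standard library's DefaultHasher is a non-cryptographic stand-in for SHA-512, and Proof, prove, verify and challenge are hypothetical names. The prover shows that out = n*m for the same n as in the public value N = n*G, without revealing n.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// TOY GROUP, NOT SECURE: integers modulo the prime Q stand in for the curve.
const Q: u64 = (1 << 61) - 1; // group order (a Mersenne prime)
const G: u64 = 5; // hypothetical generator

fn mul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % Q as u128) as u64 }
fn add(a: u64, b: u64) -> u64 { (a + b) % Q }

/// Fiat-Shamir challenge derived from the public transcript.
/// Stand-in only: a real implementation would use a cryptographic hash.
fn challenge(transcript: &[u64]) -> u64 {
    let mut h = DefaultHasher::new();
    transcript.hash(&mut h);
    h.finish() % Q
}

/// Claimed result out = n*m, commitments c1 = r*G and c2 = r*m, response s.
#[derive(Clone, Copy, Debug)]
struct Proof { out: u64, c1: u64, c2: u64, s: u64 }

/// Prover: knows the secret n, picks a fresh random nonce r.
fn prove(n: u64, m: u64, r: u64) -> Proof {
    let big_n = mul(n, G);
    let out = mul(n, m);
    let (c1, c2) = (mul(r, G), mul(r, m));
    let e = challenge(&[big_n, m, out, c1, c2]);
    Proof { out, c1, c2, s: add(r, mul(e, n)) }
}

/// Verifier: knows only N = n*G, the input m and the proof.
fn verify(big_n: u64, m: u64, p: &Proof) -> bool {
    let e = challenge(&[big_n, m, p.out, p.c1, p.c2]);
    // s*G = c1 + e*N and s*m = c2 + e*out hold when the same n was used.
    mul(p.s, G) == add(p.c1, mul(e, big_n)) && mul(p.s, m) == add(p.c2, mul(e, p.out))
}
```

Because the challenge is computed from a hash of the transcript instead of being sent by the verifier, the proof can be checked later without any interaction with the prover.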
4 Building
Build using cargo:
cargo test
Run using cargo:
cargo run --bin peppy --features build-binary
5 Installing
Install using
cargo install libpep --features build-binary
Run using:
peppy --help
6 Background
This library is based on the article by Eric Verheul and Bart Jacobs, Polymorphic Encryption and Pseudonymisation in Identity Management and Medical Research. In Nieuw Archief voor Wiskunde (NAW), 5/18, nr. 3, 2017, p. 168-172. This article does not contain the zero-knowledge proofs.