Introduction to Decorrelation Theory
On-Line Manual
In order to make the decorrelation theory more easily accessable, here is
an easy introduction.
This "decorrelation manual" will be subject to frequent updates.
(It is mainly under construction at this time.)
For any feedback, please email to
Serge Vaudenay <>
Last update: July 15th, 2002.
Table of Content
- Goals
- d-wise Distribution Matrices
- Decorrelation Distance and Decorrelation Bias
- Matrix Norms and Decorrelation
- The Four Problems of Decorrelation Theory
- On the Choice of the Matrix Norm
- Norms and Undistinguishability
- Decorrelation Modules
- A few Decorrelated Cipher Constructions
- Differential and Linear Cryptanalysis
- Iterated Attacks
- References
The goals of decorrelation theory is
- to make the theory on k-wise independence usable in the block cipher
theory
- to provide handy tools for dealing with undistinguishability
- to make formal security results feasible on block ciphers
- to provide design guidelines for new algorithms.
For this we consider block ciphers as random permutations in a mathematical
sense:
the random choice of the secret key K induce a distribution on the permutation
CK.
In the following we will omit the reference to the key.
The goal of the designer of the block cipher is to make it "look like" a
truly random permutation, i.e. like a random permutation C*
with a uniform distribution among the set of permutations.
For this we need tools to compare C and C*.
Intuitively, we can compare C and C* at different level of depth.
Namely, if for any fixed plaintext x the corresponding ciphertext C(x) is
uniformly distributed, we say that C and C* are similar
"to the order 1".
Similarly, if for any two different plaintexts x1 and x2
the corresponding ciphertext pair (C(x1),C(x2)) has
a uniform distribution among all the pairs of different ciphertexts, we say
that C and C* are similar
"to the order 2".
And so on.
This intuitively defined the notion of order of decorrelation.
For any random function F (a block cipher is a random function with the
special property that it is always a permutation) and any integer d we
associate a huge matrix [F]d with real number entries which
is called the "d-wise distribution matrix".
Beginners do not really need to understand where this matrix come from.
For the others, here is the formal definition.
Definition (d-wise distribution matrix).
Let F be a random function and d be en integer.
Any d inputs x1, ..., xd define a d-multi-input
Any d outputs y1, ..., yd define a d-multi-output
To any d-multi-input x and any d-multi-output y we associate the probability
[F]d(x,y) = Pr[ C(x1=y1), ...,
C(xd=yd) ].
This way we define the d-wise distribution matrix [F]d in which
each row corresponds to a d-multi-input and each colomn corresponds to a
d-multi-output.
In particular, the row of [F]d corresponding to the d-multi-input
(x1,...,xd) defines the distribution of the random
d-multi-output (C(x1),...,C(xd)).
The most important property we need to know is the following.
Interestingly, if C* denotes the ideal random permutation
(which means it has a uniform distribution), we have
[C]d x [C*]d =
[C* o C]d = [C*]d =
[C o C*]d =
[C*]d x [C]d
(Note that this does not hold for functions.
Indeed if F is the constant function which is zero on all points, we have
[F o F*]d = [F]d and
[F* o F]d = [G]d where G is a random function
such that G(x) = G(y) for any x and y and G(x) is uniformly distributed.)
This implies the following property.
(To prove this, we just develop the right hand term and use
Proposition 1.)
In the introduction was addressed the problem of comparing a cipher C to the
ideal cipher C* "to a given order".
Now we can be a little more precise.
We will compare [C]d and [C*]d.
Assuming we are given a norm ||.|| on the matrices set, a straightforward way
to compare the two matrices [F]d [G]d is to compute the
distance
This is called the d-wise decorrelation distance between F and G
according to the norm ||.||.
Random functions F must be compared to an ideal random function F*
which has a uniform distribution.
For this we define
DecFd||.||(F) =
|| [F]d - [F*]d ||
the d-wise decorrelation bias of function F.
Similarly, ciphers C must be compared to the ideal cipher C*,
and we define
DecPd||.||(C) =
|| [C]d - [C*]d ||
the d-wise decorrelation bias of permutation C.
The question on the choice of the norm remains.
What we can already notice is that a decorrelation distance of zero do not
depend on this choice, but holds for any norm.
Thus we say that
F and G have the same decorrelation of order d
if their d-wise decorrelation distance is zero.
Similarly, we say that a
function F is perfectly decorrelated to the order d
if its d-wise decorrelation bias is zero.
We say that a cipher C is perfectly decorrelated to the order d
if its d-wise decorrelation bias is zero.
We can notice the link with well-known combinatorial properties.
First of all, when a cipher C is perfectly decorrelated to the order 1, it
obviously achieve perfect secrecy in Shannon's sense when used only
once.
Second we notice that a random hash function H is perfectly decorrelated to the
order 2 if, and only if it is strong universal2 hash function
in Carter-Wegman's sense.
(See Ref. [CW79,WC81].)
We have an equivalent to the Shannon Theorem.
Hence we need a lot of random bits in order to construct perfectly decorrelated
functions.
A mapping from the set of matrices to the set of real numbers is a norm if
- ||A|| = 0 holds if and only if A=0
- ||u.A|| = |u|.||A|| for any real number u
(|u| is the absolute value of u here)
- ||A+B|| <= ||A|| + ||B||
In addition "matrix norms" are norms such that
- ||AxB|| <= ||A||.||B||
If we measure decorrelation biases with matrix norms, from the property of
matrix norms, the definitions and Proposition 2, we have
the interesting property that
DecPd(C2oC1) <=
DecPd(C1).DecPd(C2).
This means that the decorrelation bias of an iterated cipher decreases
exponentially with the number of rounds (as long as the decorrelation of a
few rounds is less than one), which is quite a satisfactory property.
Note that this property does not hold for functions (despite what was claimed
in [Sac99]).
Decorrelation theory faces four complementary problems.
- How to choose the matrix norm?
- Does the decorrelation of a cipher provide protection?
- How to build simple decorrelated primitives (decorrelation modules)?
- How to use them in order to build decorrelated ciphers?
For the fourth problem, the strategy consists of considering any regular
conservative designs, and plugging into it a few decorrelated primitives
which are called decorrelation modules.
If we choose the right notion of decorrelation (the right norm), the
rights decorrelation modules and plug them in the right positions, we obtain
a decorrelated cipher which may be protected against the right attacks.
The Euclidean norm (the L2 norm) has first been considered.
For any matrix A, its Euclidean norm ||A||2 is the square root
of the sum of the square of all entries.
It is not obvious to see it is a matrix norm, but it is indeed.
One problem is that it leads with hard to manipulate quantities.
(See Ref. [Sac98] for more details.)
A much more convenient norm is the |||.|||oo norm.
For any matrix A, the norm |||A|||oo is the maximum over all
rows of the sum of the absolute values of all entries:
|||A|||oo = maxx sumy |A(x,y)|.
The notation of this norm comes from the fact that it is precisely the
matrix norm associated to the Loo vector-norm.
More precisely we define
for a vector and
|||A|||oo =
maxV ||A.V||oo / ||V||oo.
This is the regular way for constructing a matrix norm from a vector norm.
Other interesting matrix norms are the ||.||a norm and the
||.||s norm defined by
||A||a = maxx1 sumy1
maxx2 sumy2 ...
|A((x1,x2,...),
(y1,y2,...))|
||A||s = maxb1,b2,...
maxx1,0 sumx1,1
maxx2,0 sumx2,1 ...
|A((x1,b1,x2,b2,...),
(x1,1-b1,x2,1-b2,...))|
where b1, b2, ... are 0 or 1.
In the theory of randomness, we measure how much a function is pseudorandom
by measuring the advantage of Turing tests.
Actually, given two random functions F and G, we consider an oracle Turing
machine AO which outputs 0 or 1 after playing with the oracle O.
We measure the advantage for distinguishing F from G by
AdvA(F,G) = | Pr[AF=1] - Pr[AG=1] |.
Considering a class Cl of oracle Turing machines which output 0 or 1, we
consider the best advantage
AdvCl(F,G) = maxA in Cl AdvA(F,G).
Here are important results which motivate the choice of the
|||.|||oo, ||.||a and ||.||s matrix norms.
Theorem 1
(Non-adaptive distinguishers).
Let d be an integer.
Let Cldna be the class of non-adaptive distinguishers
limited to d chosen plaintexts.
Such an attack consists of choosing d plaintexts
x1, ..., xd, querying the oracle with it,
then choosing the final output 0 or 1.
For any random function F, and the ideal random function F*, we have
AdvCldna(F,F*) =
(1/2).DecFd|||.|||oo(F).
Similarly, for any cipher C and the ideal cipher C*, we have
AdvCldna(C,C*) =
(1/2).DecPd|||.|||oo(C).
(See Ref. [Sac99] for more details.)
This formalism turns out to make trivial (from the matrix norm properties)
the following intuitive results.
If we consider r independent instances C1, ..., Cr
of the same cipher C, we consider the iterated cipher
Cro...oC1.
We have
AdvCldna
(Cro...oC1,C*) <= 2r-1.
( AdvCldna(C,C*) )r
AdvClda
(Cro...oC1,C*) <= 2r-1.
( AdvClda(C,C*) )r
AdvClds
(Cro...oC1,C*) <= 2r-1.
( AdvClds(C,C*) )r
The best advantage for distinguishing an iterated cipher from the ideal cipher
C* decreases exponentially with the number of rounds r.
Inspired by what has been done based on the Carter-Wegman universal hashing
theory (see Ref. [CW79],
Ref. [WC81] and Ref. [CG89]),
it is quite straightforward to propose decorrelation modules.
Here are a few decorrelation modules for the corresponding norms.
- (NUT-I: Perfectly decorrelated random function in
GF(q))
First of all, making a random function on a finite field GF(q) which is
perfectly decorrelated to the order d
(which means with a decorrelation bias of 0) is easy.
If we let
(a1,a2,a3,...,ad)
in GF(q)d
be random with a uniform distribution, the function
F(x) = a1 + a2.x + a3.x2
+ ... + ad.xd-1
on GF(q) is such that DecFd(F) = 0.
- (NUT-II: Perfectly pairwise decorrelated
permutation in GF(q))
Making a random permutation on a finite field GF(q) perfectly decorrelated
to an arbitrary order d is still an open problem.
For d=1, the trivial solution
satisfies DecP1(C) = 0, where a is uniformly distributed in GF(q).
Note that this corresponds to Vernam's cipher (also known as one-time pad)
which is known to be unconditionally secure (from Shannon's theory)
if used only once.
For d=2, the well known solution
satisfies DecP2(C) = 0, where (a,b) is uniformly distributed in
GF(q)*xGF(q).
In general, polynomials of larger degree are not permutations.
Therefore it is not quite clear how to adapt the previous construction of
perfectly decorrelated functions.
- (NUT-III: Pairwise decorrelated function
in GF(p) for the L2 norm)
In general, practical applications require that we work with bit-strings,
in {0,1}m, which suggest that we use GF(2m).
It is a little frustrating however, for software, to have to implement the
multiplication in this field when we have an optimized built-in hardware
multiplication on integers.
For this we can use GF(p) with a prime p close to 2m.
The first possibility consists of choosing the maximal prime p smaller than
2m.
Let us write
Then we let
with
uniformly distributed.
We obtain that
DecF2L2(F) <= 2.sqrt(2.delta)
for delta < 1/14.
Unfortunately, the decorrelation with respect to the
|||.|||oo, ||.||a and ||.||s are quite bad.
Namely, we have
DecF2||.||s(F) >=
DecF2||.||a(F) >=
DecF2|||.|||oo(F) >=
2 - 21-m + delta
which is approximately 2, i.e. the worst decorrelation bias.
(See Ref. [Sac98] for more details.)
- (NUT-IV: Decorrelated function in GF(q) for
the |||.|||oo and ||.||a norms)
The other possibility consists of choosing the minimal prime power q larger
than 2m.
Let us write
Then we let
F(x) = pi( r(a1) + r(a2).r(x) +
r(a3).r(x)2 + ... +
r(ad).r(x)d-1 )
with
(a1,a2,a3,...,ad)
in {0,1,...,2m-1}d
uniformly distributed, and where r is any injective mapping from
{0,1,...,2m-1} to GF(q) (a "representation" mapping) and pi is any
surjective mapping from GF(q) to {0,1,...,2m-1}
(a "projection" mapping).
For instance we can use
F(x) = a1 + a2.x + a3.x2 + ...
+ ad.xd-1 mod p mod 2m
for a prime p when considering {0,1,...,2m-1} as a subset of GF(p).
We obtain that
DecF2|||.|||oo(F) <=
DecF2||.||a(F) <=
2.((1 + delta)d - 1)
(See Ref. [Sac99] for more details.)
Once we have a decorrelation module, we can plug it in regular conservative
ciphers.
In order to deduce an upper bound on the decorrelation bias of the cipher
from an upper bound on the decorrelation bias of these modules, we use the
following "meta-theorem".
Theorem 4 (Decorrelation modules to decorrelated ciphers).
Let
C = Omega(F1,...,Fr,C1,...,Cs)
be a computation network which defines a cipher from r independent
random functions F1,...,Fr and s random permutations
C1,...,Cs.
We consider an ideal version of this construction
C' = Omega(F*1,...,F*r,
C*1,...,C*s)
in which the F*i and C*j are
uniformly distributed functions or permutations.
We assume that a computation of C requires ai computations of
Fi for any i, and bj computations of
Cj for any j.
For any d we have
DecPd(C) <=
DecPd(C') +
DecFda1(F1)
+ ... + DecFdar(Fr) +
DecPdb1(C1) + ... +
DecPdbs(Cs)
for the |||.|||oo and ||.||a norms.
Similarly, assuming that a computation of C or C-1 requires
a'i computations of Fi for any i, and b'j
computations of Cj or C-1j for any j, we
have
DecPd||.||s(C) <=
DecPd||.||s(C') +
DecFda'1||.||a(F1)
+ ... +
DecFda'r||.||a(Fr) +
DecPdb'1||.||s(C1)
+ ... +
DecPdb's||.||s(Cs).
(See Ref. [Sac99] for more details.)
This theorem reduce the problem of estimating the decorrelation bias of C
to the problem of estimating the decorrelation bias of C'.
It shall be outlined that it makes trivial the extension of
Luby-Rackoff Theorem to pseudorandom functions instead of random ones:
assuming that the advantage of any distinguisher which is limited to d
chosen plaintext is at most epsilon against any round function Fi
of a Feistel construction with at least three rounds, we obtain that any
distinguisher which is limited to d chosen plaintext against the Feistel
cipher is at most
This result extends to chosen plaintext and ciphertext with 4 instead of 3.
We can use the following constructions.
- (Coconut design)
We let C1 and C3 be any ciphers, and C2
be a decorrelation invertible module.
We can use the
construction.
For the ideal version with C2 = C*, we obtain that
DecPd(C') = 0.
Therefore we have
for the |||.|||oo, ||.||a and ||.||s norms.
In Ref. [Stacs98] was propose the
Coconut98 cipher with a perfect pairwise decorrelation with
shown as the NUT-II decorrelation module.
- (Peanut design)
In Peanut ciphers, we use a regular r-round Feistel cipher with decorrelation
modules F1, ..., Fr as round functions.
With the standard notations we have
We recall that the Feistel scheme Psi is defined by
Psi(F1)(x,y) = ( x + F1(y) , y )
Psi(F1,...,Fr)(x,y) =
Psi(F2,...,Fr)( y , x + F1(y) ).
The decorrelation bias of the corresponding ideal function C' is given by
Luby-Rackoff Theorem (see Ref. [LR88]):
DecPd||.||a(C') <=
2.d2.2-m/2
for r >= 3
DecPd||.||s(C') <=
2.d2.2-m/2
for r >= 4.
Thus if we let epsilon denote the maximal decorrelation bias of the
decorrelation modules we have
DecPd||.||a(C) <=
( 3.epsilon||.||a +
2.d2.2-m/2 )floor(r/3) for r >= 3
DecPd||.||s(C) <=
( 4.epsilon||.||s +
2.d2.2-m/2 )floor(r/4)
for r >= 4.
(See Ref. [Sac99] for more details.)
In Ref. [Stacs98] was propose the
Peanut98 cipher with pairwise decorrelation with
Fi(x) = g( ax + b mod 232+15 mod 232 ).
In Ref. [Aes98] (see also Ref.
[Aes99]) was propose the DFC
cipher with pairwise decorrelation with
Fi(x) = g( ax + b mod 264+13 mod 264 ).
These are the NUT-IV decorrelation module.
In Ref. [Sac98] this construction was adapted to the
L2 norm in order to propose the Peanut97
cipher with pairwise decorrelation with
Fi(x) = g( ax + b mod 232-5 )
which is the NUT-III decorrelation module.
- (Walnut design)
We can also use the Lai-Massey Scheme which has been used in the IDEA cipher
(see Ref. [Lai92] and Ref. [LM90]).
More precisely we define
Lambdasigma(F1)(x,y) = ( x + F1(x-y) ,
y + F1(x-y) )
Lambdasigma(F1,...,Fr)(x,y) =
Lambdasigma(F2,...,Fr)(
sigma( x + F1(x-y) ) , y + F1(x-y) )
where sigma is a group alpha-orthomorphism.
Similar bounds than for the Luby-Rackoff construction can be obtained.
We need 3 rounds for pseudorandomness, and 4 rounds for super-pseudorandomness.
In the Walnut ciphers, F1, ..., Fr are decorrelation
modules.
(See Ref. [Asiacrypt99] for more details.)
Other constructions can be adapted.
In the literature we can find many theorems of constructions in the ideal
cases.
For instance Ref. [ZMI89] investigates generalized
Feistel schemes with more than two branches.
Ref. [NR99] improves the Luby-Rackoff construction by
using the same round functions several times.
Ref. [PRS99] investigates a new construction, ...
Note on the "nut" names:
the first aim of this name was to claim that decorrelation modules are cheap
and basically cost nuts (see [Liens-97-3]).
Here NUT stands for "N-Universal Transformation"
(which reminds the Carter-Wegman paradigm).
COCONUT stands for "Cipher Organized with Cute Operations and NUT".
PEANUT stands for "Pretty Encryption Algorithm and NUT".
WALNUT stands for "Wonderful Algorithm with Light NUT".
All other nuts are welcome!
Differential cryptanalysis against r-round block ciphers usually starts from
a simple distinguisher between a (r-i)-round block cipher and the perfect
cipher with a small i (typically i = 1, 2, or 3).
This distinguisher uses a fixed differential characteristic.
It can be formalized as follows.
Differential distinguisher
- iterate n times the following process
- pick a random X
- query C(X) and C(X XOR a)
- if C(X XOR a) = C(X) XOR b then output 1, exit the loop, and stop
- output 0 and stop
The capability of a distinguisher A to distinguish two block ciphers is
measured by the advantage
Adv( C1 , C2 ) =
Pr[ AC = 1 , C <- C1 ] -
Pr[ AC = 1 , C <- C2 ]
where the probabilities hold over the X and C samples spaces.
Given a function f we define
DPf(a,b) = PrX[ f(X XOR a) = f(X) XOR b ]
and given a block cipher C (defined by a random key) we define
EDPC(a,b) = PrC,X[ C(X XOR a) = C(X) XOR b ]
The link between the advantage of the differential distinguisher and this
quantity is given by the following result proven in
[Stacs98].
There is also a link between EDP and the decorrelation to the order 2.
This leads to the following theorem.
Other approach to guaranty low differential probabilities in block ciphers
hold for a target group law like XOR.
We can notice that the result here extends to differential attacks using
any group law which is somewhat stronger.
There is a similar treatment for linear distinguishers.
They are formalized by n,a,b and an acceptance set B:
- initialize a counter u to 0
- iterate n times the following process
- pick a random X
- query C(X)
- if a.X = b.C(X) then increment counter u
- If u/n is in B then output 1 otherwise output 0
We define
LPf(a,b) = ( 2.PrX[ a.X = b.f(X) ] - 1 )2
ELPC(a,b) = EC( LPC(a,b) )
and we have the following results.
This treatment is quite useful when desiging block ciphers:
if we manage to construct one with low decorrelation of order 2, we can
guaranty for free that no differential or linear characteristic has a useful
bias on average over the distribution of the key.
[to be completed]
[CW79] |
Universal Classes of Hash Functions.
By Carter and Wegman.
In Journal of Computer and System Sciences vol. 18, pp. 143-154, 1979. |
[WC81] |
New Hash Functions and their Use in Authentication and Set
Equality.
By Wegman and Carter.
In Journal of Computer and System Sciences vol. 22, pp. 265-279, 1981. |
[LR88] |
How to Construct Pseudorandom Permutations from Pseudorandom
Functions.
By Luby and Rackoff.
In SIAM Journal on Computing vol. 17, pp. 373-386, 1988. |
[CG89] |
On the Power of Two-Point Based Sampling.
By Chor and Goldreich.
In Jounal of Complexity vol. 5, pp. 96-106, 1989. |
[ZMI89] |
Impossibility and Optimality Results on Constructing
Pseudorandom Permutations.
By Zheng, Matsumoto and Imai.
In EUROCRYPT' 89, pp. 412-422, LNCS 434, Springer-Verlag, 1990. |
[LM90] |
A Proposal for a New Block Encryption Standard.
By Lai and Massey.
In EUROCRYPT' 90, pp. 389-404, LNCS 473, Springer-Verlag, 1991. |
[Lai92] |
On the Design and Security of Block Ciphers.
By Lai.
In ETH Series in Information Processing vol. 1, Hartung-Gorre Verlag Konstanz,
1992. |
[Liens-97-3] |
A Cheap Paradigm for Block Cipher Security
Strengthening.
Technical report LIENS-97-3, Ecole Normale Supérieure, 1997. |
[Stacs98] |
Provable Security for Block Ciphers by
Decorrelation.
In STACS' 98, pp. 249-275, LNCS 1373, Springer-Verlag, 1998. |
[Sac98] |
Feistel Ciphers with L2 Decorrelation.
In SAC' 98, pp. 1-14, LNCS 1556, Springer-Verlag, 1999. |
[Aes98] |
Decorrelated Fast Cipher: an AES Candidate.
By Gilbert, Girault, Hoogvoorst, Noilhan, Pornin, Poupard, Stern and Vaudenay.
In the Proceedings from the First Advanced Encryption Standard Candidate
Conference, National Institute of Standards and Technology (NIST), 1998. |
[NR99] |
On the Construction of Pseudorandom Permutations:
Luby-Rackoff Revisited.
By Naor and Reingold.
In Journal of Cryptology vol. 12, pp. 29-66, 1999. |
[Aes99] |
DFC Update.
By Baudron, Gilbert, Granboulan, Handschuh, Harley, Joux, Nguyen, Noilhan,
Pointcheval, Pornin, Poupard, Stern and Vaudenay.
In the Proceedings from the Second Advanced Encryption Standard Candidate
Conference, National Institute of Standards and Technology (NIST), 1999. |
[PRS99] |
Towards Making Luby-Rackoff Ciphers Optimal and
Practical.
By Patel, Ramzan, Sundaram.
In Fast Software Encryption' 99, pp. 171-185, LNCS 1636, Springer-Verlag,
1999. |
[Eurocrypt99] |
Resistance Against General Iterated Attacks.
In EUROCRYPT' 99, pp. 255-271, LNCS 1592, Springer-Verlag, 1999. |
[Sac99] |
Adaptive-Attack Norm for Decorrelation and
Super-Pseudorandomness.
Technical report LIENS-99-2, Ecole Normale Supérieure, 1999.
To appear in SAC'99. |
[Asiacrypt99] |
On the Lai-Massey Scheme.
Technical report LIENS-99-3, Ecole Normale Supérieure, 1999.
To appear in ASIACRYPT'99. |
Return to the Decorrelation Home
Page