Introduction to Decorrelation Theory
On-Line Manual
In order to make the decorrelation theory more easily accessable, here is an easy introduction. This "decorrelation manual" will be subject to frequent updates. (It is mainly under construction at this time.) For any feedback, please email to

Last update: July 15th, 2002.

Table of Content

  1. Goals
  2. d-wise Distribution Matrices
  3. Decorrelation Distance and Decorrelation Bias
  4. Matrix Norms and Decorrelation
  5. The Four Problems of Decorrelation Theory
  6. On the Choice of the Matrix Norm
  7. Norms and Undistinguishability
  8. Decorrelation Modules
  9. A few Decorrelated Cipher Constructions
  10. Differential and Linear Cryptanalysis
  11. Iterated Attacks
  12. References

1. Goals

The goals of decorrelation theory is
  1. to make the theory on k-wise independence usable in the block cipher theory
  2. to provide handy tools for dealing with undistinguishability
  3. to make formal security results feasible on block ciphers
  4. to provide design guidelines for new algorithms.

For this we consider block ciphers as random permutations in a mathematical sense: the random choice of the secret key K induce a distribution on the permutation CK. In the following we will omit the reference to the key.

The goal of the designer of the block cipher is to make it "look like" a truly random permutation, i.e. like a random permutation C* with a uniform distribution among the set of permutations. For this we need tools to compare C and C*.

Intuitively, we can compare C and C* at different level of depth. Namely, if for any fixed plaintext x the corresponding ciphertext C(x) is uniformly distributed, we say that C and C* are similar "to the order 1". Similarly, if for any two different plaintexts x1 and x2 the corresponding ciphertext pair (C(x1),C(x2)) has a uniform distribution among all the pairs of different ciphertexts, we say that C and C* are similar "to the order 2". And so on. This intuitively defined the notion of order of decorrelation.

2. d-wise Distribution Matrices

For any random function F (a block cipher is a random function with the special property that it is always a permutation) and any integer d we associate a huge matrix [F]d with real number entries which is called the "d-wise distribution matrix".

Beginners do not really need to understand where this matrix come from. For the others, here is the formal definition.

The most important property we need to know is the following.

Interestingly, if C* denotes the ideal random permutation (which means it has a uniform distribution), we have

(Note that this does not hold for functions. Indeed if F is the constant function which is zero on all points, we have [F o F*]d = [F]d and [F* o F]d = [G]d where G is a random function such that G(x) = G(y) for any x and y and G(x) is uniformly distributed.) This implies the following property.
(To prove this, we just develop the right hand term and use Proposition 1.)

In the introduction was addressed the problem of comparing a cipher C to the ideal cipher C* "to a given order". Now we can be a little more precise. We will compare [C]d and [C*]d.

3. Decorrelation Distance and Decorrelation Bias

Assuming we are given a norm ||.|| on the matrices set, a straightforward way to compare the two matrices [F]d [G]d is to compute the distance This is called the d-wise decorrelation distance between F and G according to the norm ||.||.

Random functions F must be compared to an ideal random function F* which has a uniform distribution. For this we define

the d-wise decorrelation bias of function F.

Similarly, ciphers C must be compared to the ideal cipher C*, and we define

the d-wise decorrelation bias of permutation C.

The question on the choice of the norm remains.
What we can already notice is that a decorrelation distance of zero do not depend on this choice, but holds for any norm. Thus we say that F and G have the same decorrelation of order d if their d-wise decorrelation distance is zero. Similarly, we say that a function F is perfectly decorrelated to the order d if its d-wise decorrelation bias is zero. We say that a cipher C is perfectly decorrelated to the order d if its d-wise decorrelation bias is zero.

We can notice the link with well-known combinatorial properties. First of all, when a cipher C is perfectly decorrelated to the order 1, it obviously achieve perfect secrecy in Shannon's sense when used only once. Second we notice that a random hash function H is perfectly decorrelated to the order 2 if, and only if it is strong universal2 hash function in Carter-Wegman's sense. (See Ref. [CW79,WC81].)

We have an equivalent to the Shannon Theorem.

Hence we need a lot of random bits in order to construct perfectly decorrelated functions.

4. Matrix Norms and Decorrelation

A mapping from the set of matrices to the set of real numbers is a norm if
  1. ||A|| = 0 holds if and only if A=0
  2. ||u.A|| = |u|.||A|| for any real number u (|u| is the absolute value of u here)
  3. ||A+B|| <= ||A|| + ||B||
In addition "matrix norms" are norms such that
  1. ||AxB|| <= ||A||.||B||

If we measure decorrelation biases with matrix norms, from the property of matrix norms, the definitions and Proposition 2, we have the interesting property that

This means that the decorrelation bias of an iterated cipher decreases exponentially with the number of rounds (as long as the decorrelation of a few rounds is less than one), which is quite a satisfactory property. Note that this property does not hold for functions (despite what was claimed in [Sac99]).

5. The Four Problems of Decorrelation Theory

Decorrelation theory faces four complementary problems.
  1. How to choose the matrix norm?
  2. Does the decorrelation of a cipher provide protection?
  3. How to build simple decorrelated primitives (decorrelation modules)?
  4. How to use them in order to build decorrelated ciphers?

For the fourth problem, the strategy consists of considering any regular conservative designs, and plugging into it a few decorrelated primitives which are called decorrelation modules. If we choose the right notion of decorrelation (the right norm), the rights decorrelation modules and plug them in the right positions, we obtain a decorrelated cipher which may be protected against the right attacks.

6. On the Choice of the Matrix Norm

The Euclidean norm (the L2 norm) has first been considered. For any matrix A, its Euclidean norm ||A||2 is the square root of the sum of the square of all entries. It is not obvious to see it is a matrix norm, but it is indeed. One problem is that it leads with hard to manipulate quantities. (See Ref. [Sac98] for more details.)

A much more convenient norm is the |||.|||oo norm. For any matrix A, the norm |||A|||oo is the maximum over all rows of the sum of the absolute values of all entries:

The notation of this norm comes from the fact that it is precisely the matrix norm associated to the Loo vector-norm. More precisely we define for a vector and This is the regular way for constructing a matrix norm from a vector norm.

Other interesting matrix norms are the ||.||a norm and the ||.||s norm defined by

where b1, b2, ... are 0 or 1.

7. Norms and Undistinguishability

In the theory of randomness, we measure how much a function is pseudorandom by measuring the advantage of Turing tests. Actually, given two random functions F and G, we consider an oracle Turing machine AO which outputs 0 or 1 after playing with the oracle O. We measure the advantage for distinguishing F from G by Considering a class Cl of oracle Turing machines which output 0 or 1, we consider the best advantage

Here are important results which motivate the choice of the |||.|||oo, ||.||a and ||.||s matrix norms.

(See Ref. [Sac99] for more details.)

This formalism turns out to make trivial (from the matrix norm properties) the following intuitive results. If we consider r independent instances C1, ..., Cr of the same cipher C, we consider the iterated cipher Cro...oC1. We have

The best advantage for distinguishing an iterated cipher from the ideal cipher C* decreases exponentially with the number of rounds r.

8. Decorrelation Modules

Inspired by what has been done based on the Carter-Wegman universal hashing theory (see Ref. [CW79], Ref. [WC81] and Ref. [CG89]), it is quite straightforward to propose decorrelation modules.
Here are a few decorrelation modules for the corresponding norms.
  1. (NUT-I: Perfectly decorrelated random function in GF(q))
    First of all, making a random function on a finite field GF(q) which is perfectly decorrelated to the order d (which means with a decorrelation bias of 0) is easy. If we let
      (a1,a2,a3,...,ad) in GF(q)d
    be random with a uniform distribution, the function
      F(x) = a1 + a2.x + a3.x2 + ... + ad.xd-1
    on GF(q) is such that DecFd(F) = 0.

  2. (NUT-II: Perfectly pairwise decorrelated permutation in GF(q))
    Making a random permutation on a finite field GF(q) perfectly decorrelated to an arbitrary order d is still an open problem. For d=1, the trivial solution
      C(x) = x + a
    satisfies DecP1(C) = 0, where a is uniformly distributed in GF(q). Note that this corresponds to Vernam's cipher (also known as one-time pad) which is known to be unconditionally secure (from Shannon's theory) if used only once.
    For d=2, the well known solution
      C(x) = ax + b
    satisfies DecP2(C) = 0, where (a,b) is uniformly distributed in GF(q)*xGF(q).
    In general, polynomials of larger degree are not permutations. Therefore it is not quite clear how to adapt the previous construction of perfectly decorrelated functions.

  3. (NUT-III: Pairwise decorrelated function in GF(p) for the L2 norm)
    In general, practical applications require that we work with bit-strings, in {0,1}m, which suggest that we use GF(2m). It is a little frustrating however, for software, to have to implement the multiplication in this field when we have an optimized built-in hardware multiplication on integers. For this we can use GF(p) with a prime p close to 2m.

    The first possibility consists of choosing the maximal prime p smaller than 2m. Let us write

      p = (1 - delta).2m.
    Then we let
      F(x) = ax + b mod p
    with
      (a,b) in {0,1,...,2m-1}2
    uniformly distributed. We obtain that
      DecF2L2(F) <= 2.sqrt(2.delta)
    for delta < 1/14. Unfortunately, the decorrelation with respect to the |||.|||oo, ||.||a and ||.||s are quite bad. Namely, we have
      DecF2||.||s(F) >= DecF2||.||a(F) >= DecF2|||.|||oo(F) >= 2 - 21-m + delta
    which is approximately 2, i.e. the worst decorrelation bias.
    (See
    Ref. [Sac98] for more details.)

  4. (NUT-IV: Decorrelated function in GF(q) for the |||.|||oo and ||.||a norms)
    The other possibility consists of choosing the minimal prime power q larger than 2m. Let us write
      q = (1 + delta).2m.
    Then we let
      F(x) = pi( r(a1) + r(a2).r(x) + r(a3).r(x)2 + ... + r(ad).r(x)d-1 )
    with
      (a1,a2,a3,...,ad) in {0,1,...,2m-1}d
    uniformly distributed, and where r is any injective mapping from {0,1,...,2m-1} to GF(q) (a "representation" mapping) and pi is any surjective mapping from GF(q) to {0,1,...,2m-1} (a "projection" mapping). For instance we can use
      F(x) = a1 + a2.x + a3.x2 + ... + ad.xd-1 mod p mod 2m
    for a prime p when considering {0,1,...,2m-1} as a subset of GF(p). We obtain that
      DecF2|||.|||oo(F) <= DecF2||.||a(F) <= 2.((1 + delta)d - 1)

    (See
    Ref. [Sac99] for more details.)

9. A few Decorrelated Cipher Constructions

Once we have a decorrelation module, we can plug it in regular conservative ciphers. In order to deduce an upper bound on the decorrelation bias of the cipher from an upper bound on the decorrelation bias of these modules, we use the following "meta-theorem". (See Ref. [Sac99] for more details.)
This theorem reduce the problem of estimating the decorrelation bias of C to the problem of estimating the decorrelation bias of C'.

It shall be outlined that it makes trivial the extension of Luby-Rackoff Theorem to pseudorandom functions instead of random ones: assuming that the advantage of any distinguisher which is limited to d chosen plaintext is at most epsilon against any round function Fi of a Feistel construction with at least three rounds, we obtain that any distinguisher which is limited to d chosen plaintext against the Feistel cipher is at most

This result extends to chosen plaintext and ciphertext with 4 instead of 3.

We can use the following constructions.

  1. (Coconut design)
    We let C1 and C3 be any ciphers, and C2 be a decorrelation invertible module. We can use the construction. For the ideal version with C2 = C*, we obtain that DecPd(C') = 0. Therefore we have for the |||.|||oo, ||.||a and ||.||s norms.

    In Ref. [Stacs98] was propose the Coconut98 cipher with a perfect pairwise decorrelation with

    shown as the NUT-II decorrelation module.

  2. (Peanut design)
    In Peanut ciphers, we use a regular r-round Feistel cipher with decorrelation modules F1, ..., Fr as round functions. With the standard notations we have We recall that the Feistel scheme Psi is defined by The decorrelation bias of the corresponding ideal function C' is given by Luby-Rackoff Theorem (see Ref. [LR88]): Thus if we let epsilon denote the maximal decorrelation bias of the decorrelation modules we have (See Ref. [Sac99] for more details.)
    In Ref. [Stacs98] was propose the Peanut98 cipher with pairwise decorrelation with

    In Ref. [Aes98] (see also Ref. [Aes99]) was propose the DFC cipher with pairwise decorrelation with

    These are the NUT-IV decorrelation module. In Ref. [Sac98] this construction was adapted to the L2 norm in order to propose the Peanut97 cipher with pairwise decorrelation with

    which is the NUT-III decorrelation module.

  3. (Walnut design)
    We can also use the Lai-Massey Scheme which has been used in the IDEA cipher (see Ref. [Lai92] and Ref. [LM90]). More precisely we define where sigma is a group alpha-orthomorphism. Similar bounds than for the Luby-Rackoff construction can be obtained. We need 3 rounds for pseudorandomness, and 4 rounds for super-pseudorandomness. In the Walnut ciphers, F1, ..., Fr are decorrelation modules.
    (See Ref. [Asiacrypt99] for more details.)

Other constructions can be adapted. In the literature we can find many theorems of constructions in the ideal cases. For instance Ref. [ZMI89] investigates generalized Feistel schemes with more than two branches. Ref. [NR99] improves the Luby-Rackoff construction by using the same round functions several times. Ref. [PRS99] investigates a new construction, ...

10. Differential and Linear Cryptanalysis

Differential cryptanalysis against r-round block ciphers usually starts from a simple distinguisher between a (r-i)-round block cipher and the perfect cipher with a small i (typically i = 1, 2, or 3). This distinguisher uses a fixed differential characteristic. It can be formalized as follows.
  1. iterate n times the following process
    1. pick a random X
    2. query C(X) and C(X XOR a)
    3. if C(X XOR a) = C(X) XOR b then output 1, exit the loop, and stop
  2. output 0 and stop
The capability of a distinguisher A to distinguish two block ciphers is measured by the advantage where the probabilities hold over the X and C samples spaces.

Given a function f we define

and given a block cipher C (defined by a random key) we define The link between the advantage of the differential distinguisher and this quantity is given by the following result proven in
[Stacs98]. There is also a link between EDP and the decorrelation to the order 2. This leads to the following theorem. Other approach to guaranty low differential probabilities in block ciphers hold for a target group law like XOR. We can notice that the result here extends to differential attacks using any group law which is somewhat stronger.

There is a similar treatment for linear distinguishers. They are formalized by n,a,b and an acceptance set B:

  1. initialize a counter u to 0
  2. iterate n times the following process
    1. pick a random X
    2. query C(X)
    3. if a.X = b.C(X) then increment counter u
  3. If u/n is in B then output 1 otherwise output 0
We define and we have the following results.

This treatment is quite useful when desiging block ciphers: if we manage to construct one with low decorrelation of order 2, we can guaranty for free that no differential or linear characteristic has a useful bias on average over the distribution of the key.

11. Iterated Attacks

[to be completed]


12. References

[CW79] Universal Classes of Hash Functions. By Carter and Wegman. In Journal of Computer and System Sciences vol. 18, pp. 143-154, 1979.
[WC81] New Hash Functions and their Use in Authentication and Set Equality. By Wegman and Carter. In Journal of Computer and System Sciences vol. 22, pp. 265-279, 1981.
[LR88] How to Construct Pseudorandom Permutations from Pseudorandom Functions. By Luby and Rackoff. In SIAM Journal on Computing vol. 17, pp. 373-386, 1988.
[CG89] On the Power of Two-Point Based Sampling. By Chor and Goldreich. In Jounal of Complexity vol. 5, pp. 96-106, 1989.
[ZMI89] Impossibility and Optimality Results on Constructing Pseudorandom Permutations. By Zheng, Matsumoto and Imai. In EUROCRYPT' 89, pp. 412-422, LNCS 434, Springer-Verlag, 1990.
[LM90] A Proposal for a New Block Encryption Standard. By Lai and Massey. In EUROCRYPT' 90, pp. 389-404, LNCS 473, Springer-Verlag, 1991.
[Lai92] On the Design and Security of Block Ciphers. By Lai. In ETH Series in Information Processing vol. 1, Hartung-Gorre Verlag Konstanz, 1992.
[Liens-97-3] A Cheap Paradigm for Block Cipher Security Strengthening. Technical report LIENS-97-3, Ecole Normale Supérieure, 1997.
[Stacs98] Provable Security for Block Ciphers by Decorrelation. In STACS' 98, pp. 249-275, LNCS 1373, Springer-Verlag, 1998.
[Sac98] Feistel Ciphers with L2 Decorrelation. In SAC' 98, pp. 1-14, LNCS 1556, Springer-Verlag, 1999.
[Aes98] Decorrelated Fast Cipher: an AES Candidate. By Gilbert, Girault, Hoogvoorst, Noilhan, Pornin, Poupard, Stern and Vaudenay. In the Proceedings from the First Advanced Encryption Standard Candidate Conference, National Institute of Standards and Technology (NIST), 1998.
[NR99] On the Construction of Pseudorandom Permutations: Luby-Rackoff Revisited. By Naor and Reingold. In Journal of Cryptology vol. 12, pp. 29-66, 1999.
[Aes99] DFC Update. By Baudron, Gilbert, Granboulan, Handschuh, Harley, Joux, Nguyen, Noilhan, Pointcheval, Pornin, Poupard, Stern and Vaudenay. In the Proceedings from the Second Advanced Encryption Standard Candidate Conference, National Institute of Standards and Technology (NIST), 1999.
[PRS99] Towards Making Luby-Rackoff Ciphers Optimal and Practical. By Patel, Ramzan, Sundaram. In Fast Software Encryption' 99, pp. 171-185, LNCS 1636, Springer-Verlag, 1999.
[Eurocrypt99] Resistance Against General Iterated Attacks. In EUROCRYPT' 99, pp. 255-271, LNCS 1592, Springer-Verlag, 1999.
[Sac99] Adaptive-Attack Norm for Decorrelation and Super-Pseudorandomness. Technical report LIENS-99-2, Ecole Normale Supérieure, 1999. To appear in SAC'99.
[Asiacrypt99] On the Lai-Massey Scheme. Technical report LIENS-99-3, Ecole Normale Supérieure, 1999. To appear in ASIACRYPT'99.

Return to the Decorrelation Home Page