At Asiacrypt'98, Eli Biham gave an invited talk about AES with a first evaluation of the candidates. Several criticisms were raised against our candidate, DFC, and we tried to correct them at the Rump Session. Here are our arguments.
The number of multiplications needed to encrypt one block with DFC depends on the word size:
8 on a 64-bit microprocessor
32 on a 32-bit microprocessor
512 on an 8-bit microprocessor
Although one could thus expect a huge computation time, we achieved an implementation of DFC on a Motorola 6805 which is twice as fast as DES. As for smart cards, although it is commonly claimed that DFC needs more than 100 bytes of RAM, we implemented it on a real smart card using less than 100 bytes. (See our CARDIS'98 paper.)
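The counts 8, 32 and 512 are consistent with eight 64-bit by 64-bit products (one per round, assuming DFC's eight rounds), each computed by schoolbook multiplication on machine words. The following is a minimal sketch under that assumption; the function names are illustrative and do not come from any DFC implementation.

```python
def schoolbook_mul(a, b, word_bits, operand_bits=64):
    """Multiply two operand_bits-wide integers using only
    word_bits x word_bits products; return (product, products_used)."""
    mask = (1 << word_bits) - 1
    limbs = operand_bits // word_bits
    # Split each operand into little-endian words ("limbs").
    a_limbs = [(a >> (i * word_bits)) & mask for i in range(limbs)]
    b_limbs = [(b >> (i * word_bits)) & mask for i in range(limbs)]
    product = 0
    count = 0
    for i, x in enumerate(a_limbs):
        for j, y in enumerate(b_limbs):
            # One word-sized multiplication per pair of limbs.
            product += (x * y) << ((i + j) * word_bits)
            count += 1
    return product, count


def word_multiplications(word_bits, operand_bits=64, rounds=8):
    """Word multiplications per block, assuming one full-width
    product per round (an assumption of this sketch)."""
    limbs = operand_bits // word_bits
    return rounds * limbs * limbs


for w in (64, 32, 8):
    print(w, "bit words:", word_multiplications(w), "multiplications")
```

With 8-bit words each 64-bit operand splits into 8 limbs, so one product costs 64 word multiplications, which is where the count of 512 would come from.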
Several factors reduce the cost of the multiplications in DFC:
from a Pentium/Pentium MMX to a Pentium Pro/Pentium II, the timing of a multiplication has been reduced from 8-39 cycles down to 4 cycles
from a 32-bit architecture to a 64-bit architecture, we need 4 times fewer multiplications
because ANSI-C lacks the long long int type, going from ANSI-C to ISO-C also reduces the number of multiplications by a factor of 4
Thus the multiplication part of the DFC computation (which is a significant part of it) can gain a factor of 32 to 160 between a Pentium MMX with ANSI-C and a 64-bit microprocessor with ISO-C. In his talk, Eli Biham used the former platform, which handicapped DFC. NIST uses a Pentium Pro.
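The factor of 32 to 160 appears to be the product of the three gains quoted in this passage. The short sketch below checks that arithmetic; the variable names are illustrative, and reading the upper bound as a rounding of 156 is an assumption.

```python
# Cycles per multiplication, as quoted in the text.
old_cycles_min, old_cycles_max = 8, 39   # Pentium / Pentium MMX
new_cycles = 4                           # Pentium Pro / Pentium II

arch_gain = 4   # 32-bit -> 64-bit: 4 times fewer multiplications
lang_gain = 4   # ANSI-C -> ISO-C (long long int): factor of 4

# Combined gain = (latency ratio) * (architecture gain) * (language gain).
low = old_cycles_min / new_cycles * arch_gain * lang_gain
high = old_cycles_max / new_cycles * arch_gain * lang_gain

print(f"combined factor: {low:.0f} to {high:.0f}")
```

The lower bound comes out at exactly 32; the upper bound comes out at 156, close to the quoted 160.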
To illustrate these arguments, compare the following benchmarks obtained for DFC (number of cycles to encrypt one block):
5874 on Pentium MMX (ANSI-C with gcc - experiment by Eli Biham)
3008 on Pentium 90 (standard C using long long int with gcc)
2592 on Pentium Pro (ANSI-C with gcc)
1750 on Pentium Pro (Visual C++ - implemented by Brian Gladman)
754 on Pentium Pro (assembly code)
428 on AXP 21164 (ANSI-C plus one opcode macro - implemented by Robert Harley)
(All implementations except Gladman's and Harley's are ours. They are all on NIST's CD-2. See our report.)
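To put these figures in perspective, the sketch below computes the speed-up of each benchmark relative to the 5874-cycle measurement. The cycle counts are the ones quoted in this passage; the dictionary labels are shortened, illustrative names.

```python
# Cycles per block encryption, as listed in the text.
cycles = {
    "Pentium MMX, ANSI-C gcc (Biham)": 5874,
    "Pentium 90, C with long long int": 3008,
    "Pentium Pro, ANSI-C gcc": 2592,
    "Pentium Pro, Visual C++ (Gladman)": 1750,
    "Pentium Pro, assembly": 754,
    "AXP 21164, ANSI-C + opcode macro (Harley)": 428,
}

baseline = cycles["Pentium MMX, ANSI-C gcc (Biham)"]

# Speed-up of each implementation relative to the baseline measurement.
speedups = {name: baseline / c for name, c in cycles.items()}

for name, s in speedups.items():
    print(f"{name}: {s:.2f}x")
```

The ratios only compare cycle counts; they say nothing about clock frequencies, which differ between the platforms.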
Date: October the 28th, 1998.