TASTY: Tool for Automating Secure Two-partY computations

Page created by Greg Bennett
 
CONTINUE READING
TASTY: Tool for Automating Secure Two-partY
                                computations
                                                           (Full Version)
                                  Wilko Henecka, Stefan Kögl, Ahmad-Reza Sadeghi,
                                        Thomas Schneider, Immo Wehrenberg
                                                          System Security Lab
                                                         Ruhr-University Bochum
                                                               Germany
                                   {ahmad.sadeghi,thomas.schneider}@trust.rub.de,
                               {wilko.henecka,stefan.koegl,immo.wehrenberg}@rub.de

ABSTRACT                                                              Categories and Subject Descriptors
Secure two-party computation allows two untrusting parties            D.0 [Software]: General
to jointly compute an arbitrary function on their respective
private inputs while revealing no information beyond the              General Terms
outcome. Existing cryptographic compilers can automati-
cally generate secure computation protocols from high-level           Design, Security, Languages, Performance, Measurement
specifications, but are often limited in their use and effi-
ciency of generated protocols as they are based on either gar-        Keywords
bled circuits or (additively) homomorphic encryption only.            Cryptography, secure function evaluation, compiler, garbled
   In this paper we present TASTY, a novel tool for automat-          circuits, homomorphic encryption
ing, i.e., describing, generating, executing, benchmarking,
and comparing, efficient secure two-party computation pro-
tocols. TASTY is a new compiler that can generate pro-                1.   INTRODUCTION
tocols based on homomorphic encryption and efficient gar-               The design of efficient secure two-party computation pro-
bled circuits as well as combinations of both, which often            tocols is vital for a variety of security-critical applications
yields the most efficient protocols available today. The user         with sophisticated privacy and security requirements such
provides a high-level description of the computations to be           as electronic auctions [40], data mining [34], remote diag-
performed on encrypted data in a domain-specific language.            nostics [7], medical data classification [1], or face recognition
This is automatically transformed into a protocol. TASTY              [15, 50, 44] to name some.
provides most recent techniques and optimizations for prac-             Modern cryptography provides various tools for secure
tical secure two-party computation with low online latency.           computation. The concept of two-party Secure Function
Moreover, it allows to efficiently evaluate circuits generated        Evaluation (SFE) was introduced in 1982 by Yao [59]. The
by the well-known Fairplay compiler.                                  idea is to let two mutually mistrusting parties compute an
   We use TASTY to compare protocols for secure multipli-             arbitrary function (known by both)1 on their private inputs
cation based on homomorphic encryption with those based               without revealing any information about their inputs beyond
on garbled circuits and highly efficient Karatsuba multipli-          the function’s output. However, the real-world deployment
cation. Further, we show how TASTY improves the online                of SFE was believed to be very limited and expensive for a
latency for securely evaluating the AES functionality by an           relatively long time. Fortunately, the cost of SFE has been
order of magnitude compared to previous software imple-               dramatically reduced in the recent years thanks to many
mentations. TASTY allows to automatically generate effi-              algorithmic improvements and automatic tools, as well as
cient secure protocols for many privacy-preserving applica-           faster computing platforms and communication networks.
tions where we consider the use cases for private set inter-            For several years, two different approaches for secure two-
section and face recognition protocols.                               party computation have co-existed. One of them is based
                                                                      on homomorphic encryption (HE). Here one party sends its
                                                                      encrypted inputs to the other party, who then computes
                                                                      the desired function under encryption using the homomor-
                                                                      phic properties of the cryptosystem, and sends back the en-
                                                                      crypted result. Popular examples are the additively homo-
                                                                      morphic cryptosystems of Paillier [45] and Damgård-Jurik
                                                                      [12], and the recent fully homomorphic schemes [17, 13, 55].
                                                                      The other approach is based on garbled circuits (GC), intro-
                                                                      duced by Yao [60], that works as follows: one party (con-
                                                                      1
                                                                        Universal circuits [58, 32, 49] allow to hide the function
This is the full version of the ACM CCS’10 paper [23].                from one party.
structor) “encrypts” the circuit (using symmetric keys), the       tion of these techniques speeds up the online phase for secure
other party (evaluator) obliviously obtains the keys corre-        evaluation of AES (a large circuit with more than 30, 000
sponding to both parties’ inputs and the garbled circuit,          gates) compared to the currently fastest software implemen-
and is able to decrypt the corresponding output value. Both        tation of GCs [48] from 5 s to only 0.5 s, while the total costs
approaches have their respective advantages and disadvan-          for setup plus online phase stay almost the same (§5.2).
tages, i.e., GC requires to transfer the garbled circuit (com-        Circuit Optimizations: Additionally, TASTY has built-
munication complexity is at least linear in the size of the        in tools for on-the-fly generation and minimization of boolean
circuit) but allows to pre-compute almost all expensive oper-      circuits (§4.3). As new circuit building block we implement
ations resulting in a low latency of the online phase, whereas     fast multiplication circuits based on Karatsuba method [28]
most HE schemes require relatively expensive public-key op-        that are more efficient than textbook multiplication (used in
erations in the online phase but can result in a smaller overall   previous SFE tools), already for 20 bit numbers; for multipli-
communication complexity.                                          cation of 128 bit values, it is more efficient by 45% (§5.1.1).
   In the recent years several cryptographic compilers and            Benchmarking: Using TASTY, we obtain measurements
specification languages have been proposed that, after a pro-      for a detailed performance comparison of multiplication pro-
grammer has manually mapped an existing algorithm to in-           tocols based on GCs with those based on HE. Our experi-
teger arithmetics, automatically compile this into SFE pro-        ments show that GC-based multiplication has large commu-
tocols. We will give an overview on such previous works in         nication and time complexity in the setup phase, but results
§1.2. However, such tools are currently restricted to gen-         in a more efficient online time than HE-based multiplication
erating protocols based on only one SFE paradigm, i.e.,            for small values (§5.1.2). In particular, multiplication of two
use either garbled circuits (GC) or homomorphic encryp-            garbled values with bitlength ` ≤ 16 bits requires less on-
tion (HE), which often results in protocols with suboptimal        line communication and time than the multiplication of two
efficiency. For instance HE allows efficient addition and mul-     homomorphically encrypted values for short-term security.
tiplication of large values (as confirmed by our implementa-          Applications: We show that TASTY is a usable and
tion results in §5.1.2), whereas GCs are better for non-linear     useful tool for describing and automatically generating ef-
functionalities such as comparison [31]. By combining both         ficient protocols for several privacy-preserving applications.
approaches, relatively efficient protocols can be obtained         We implemented set intersection and face recognition (§3).
when designing privacy-preserving applications, e.g., remote          The paper is concluded with an overview on future work
diagnostics [7], classification [1], or face recognition [50].     which could be based on the TASTY framework (§6).
   The main goal of this work is the design and implemen-
tation of the first compiler, we call TASTY, that can auto-        1.2    Related Work
matically generate efficient protocols based on homomorphic           While the theoretical foundations of two-party Secure
encryption (HE) and garbled circuits (GC) as well as combi-        Function Evaluation (SFE) have been laid already in the
nations of both from a high-level description of the protocol.     eighties [59, 60], recent optimizations and prototype imple-
   Finally, we would like to stress that although fully homo-      mentations show that SFE is ready to be used in practical
morphic encryption schemes have emerged recently [17, 13,          applications (e.g., [35, 48]). To allow the deployment of SFE
55], they are still not efficient enough to be used in practi-     in a wide range of privacy-preserving applications it is not
cal applications. Nevertheless, they could be integrated into      only important to maximize the efficiency of SFE proto-
our compiler framework once they are efficient enough.             cols, but also to make SFE usable by automatically generat-
                                                                   ing protocols from high-level descriptions. For this, several
1.1    Our Contribution and Outline                                frameworks for SFE consisting of languages and correspond-
   In this paper, we present the following contributions in        ing tools have been developed in the last years. We review
the respective sections.                                           these proposals briefly in the following.
   SFE Compiler: We present TASTY, a tool that allows                 Existing SFE frameworks can be divided into three classes
to automatically generate, benchmark and compare the per-          on different abstraction levels as summarized in Table 1.
formance of efficient two-party SFE protocols in the semi-
honest model (§4). We show how TASTY is related to, im-
                                                                   Table 1: Abstraction Levels for Automatic Genera-
proves over, and can be combined with existing tools for
                                                                   tion of SFE Protocols
automatic generation of (two-party) SFE protocols (§1.2).
   Specification Language: TASTYL, the TASTY input
Language, allows to describe SFE protocols as sequence of           Abstraction Level                 Primitives
operations on encrypted data based on combinations of Gar-
                                                                    Function Description              I/O, computation
bled Circuits (GC) and Homomorphic Encryption (HE). We
                                                                    Protocol Description              I/O, enc/dec, compu-
review the underlying theoretical framework for such modu-
                                                                                                      tation under encryption
larly composed SFE protocols [31] in §2. TASTYL is based
                                                                    Protocol Implementation           I/O, protocols, messages,
on the Python programming language and hides technical
                                                                                                      cryptographic primitives
cryptographic details from the programmer (§4.1).
   Efficient Building Blocks: TASTY implements effi-
cient building blocks for HE and GC which allow to shift
most of the complexity into the less time critical setup phase       Function Description languages allow to specify what
resulting in SFE protocols with a low-latency online phase         function should be computed securely. The function is de-
(§4.3). While the implemented techniques have been known           scribed in a domain-specific high-level programming lan-
before, their combination and implementation in a single           guage which allows programmers to write programs using
package is unique and useful. We show how the combina-             SFE without any expert knowledge about SFE. Functions
described in such languages can then be (formally) analyzed        ing Yao’s garbled circuit protocol, communicating over a
to ensure security of the function (e.g., no information leak      TCP socket. Fairplay is supplemented by FairplayMP [3], a
to the other party) and are compiled (potentially through          multi-party version of Fairplay suited for three or more par-
lower-level SFE languages) into SFE protocols. Examples            ties with the more powerful SFDL 2 input language (with
are Fairplay’s Secure Function Definition Language (SFDL)          support for ∗, / and generic functions) and a corresponding
[37, 3] which can be compiled to boolean circuits (see below),     circuit compiler. TASTY can serve as efficient runtime en-
or the Secure Multiparty Computation Language (SMCL)               vironment for the Fairplay compiler suite, i.e., it allows to
[42] and its Python-based successor PySMCL [41] which al-          read in circuits generated by the FairplayMP compiler from
low compilation into arithmetic circuit-based secure multi-        SFDL 2 programs3 and optimizes these for efficient secure
party computation (SMPC) protocols such as the Virtual             evaluation with state-of-the-art GC evaluation techniques.
Ideal Functionality Framework (VIFF) [11].                            Homomorphic Encryption (HE). VIFF [11], the Vir-
   Protocol Description languages allow to specify how             tual Ideal Functionality Framework, is an open source frame-
the SFE protocol is composed as sequence of basic opera-           work written in Python for specifying secure multi-party
tions on encrypted (or secret-shared data). Examples (de-          computation (SMPC) protocols as a sequence of operations
scribed in more detail below) are VIFF [11], the Secure            performed on secret-shared (i.e., encrypted) data. While
Multiparty Computation language (SMC) [43, 54], Share-             VIFF was mainly designed for secret-sharing based SMPC
mind [5], and the compiler of MacKenzie et al. [36]. These         protocols with three or more parties, it also offers a two-
languages allow to specify SFE protocols while abstracting         player runtime based on the additively homomorphic Paillier
away the details of the underlying cryptographic protocols.        cryptosystem [45]. Using operator overloading, VIFF allows
The language and compiler we present in this paper also fall       the programmer to express a desired secure computation di-
into this class. However, in contrast to previous works which      rectly as standard arithmetic without knowing about the
were restricted to using homomorphic encryption only, our          used protocol. Indeed, TASTYL, the input language of our
compiler TASTY allows arbitrary combinations of compu-             compiler, is inspired by the VIFF language, but additionally
tations under encryption based on garbled circuits and/or          allows to combine HE with GC-based computations.
homomorphic encryption for highly efficient SFE protocols.         In contrast to general-purpose compilers such as Fairplay,
   Protocol Implementation languages allow to describe             VIFF, and TASTY, the compilers described below are built
how exactly the target SFE protocol is composed as sequence        for specific application scenarios, e.g., use specific number
of basic cryptographic protocol building blocks. They reside       representations [36, 5] or require n ≥ 3 parties [43, 54, 5]:
at the lowest level of the abstraction hierarchy and require       The compiler of MacKenzie et al. [36] implements secure
a substantial amount of expert knowledge in cryptographic          two-party computations over values     which are secret-shared
                                                                   between both parties using 22 secret-sharing over a prime
                                                                                                    
protocol design. For example the L1 language [52] allows
to describe secure computation protocols as sequence of ba-        field. The computations are composed as sequence of basic
sic primitives such as oblivious transfer (OT), homomor-           operations on the shared data (e.g., addition or multiplica-
phic encryption/decryption, creation and evaluation of gar-        tion). The compiler can be used for specific functions such
bled circuits, and messages to be exchanged. Qilin [38] is a       as cryptographic primitives defined over prime fields, e.g.,
Java library for rapid prototyping of cryptographic protocols      signatures or encryption schemes, where the secret key is
which currently provides common cryptographic protocols            shared between both parties.
(e.g., OT [39] and coin flipping) using cryptographic prim-        SMC [43, 54], the Secure Multiparty Computation language,
itives (e.g., Pedersen Commitment [47] and ElGamal [14])           provides a declarative language for describing SMPC based
implemented with elliptic curves.                                  on constraint programming. A program is distributed among
                                                                   the parties in the computation along with an interpreter,
   Next we describe SFE frameworks which are closely re-           each party gives its secret inputs and the interpreter calcu-
lated to ours. In contrast to TASTY, the existing SFE              lates the result. Computations are specified as arithmetic
frameworks are based on either garbled circuits (GC) or ho-        circuits and at least 3 parties are required as the underlying
momorphic encryption (HE), but not combinations of both.           multiplication protocol is based on the BGW protocol [4].
   Garbled Circuits (GC). The most prominent example               Sharemind [5] allows secure computation over the ring of
for automatic generation of SFE protocols is Fairplay [37]         32-bit integers for three parties and provides an assembly-
which is based on GCs. Fairplay provides a high-level func-        like programming language. As this setting is fixed and very
tion description language, SFDL, which allows to specify           specific it allows highly efficient protocols.
the function to be computed securely, i.e., the inputs and
outputs of the involved parties, and how the outputs are to
be computed from the inputs. The language resembles a
                                                                   2.      THEORETICAL BACKGROUND
simplified version of a hardware description language, such           In this section we summarize the framework for modu-
as Verilog or VHDL2 , and supports types, variables, func-         lar design of efficient two-party Secure Function Evaluation
tions, boolean operators (∧, ∨, ⊕, . . . ), arithmetic operators   (SFE) protocols of [31] on which TASTY is built.
(+, −), comparison (
wards protocols with stronger security guarantees. Indeed,        erates a key-pair for the homomorphic cryptosystem and
most protocols and implementations of protocols for practi-       sends the public key together with his inputs encrypted un-
cal privacy-preserving applications focus on the semi-honest      der the public key to server S. S uses the homomorphic
model [40, 7, 1, 15, 50, 44]. For a detailed discussion on the    property to evaluate the arithmetic circuit on the encrypted
semi-honest model and its extensions we refer to [34, 31].        data. If the cryptosystem is only additively homomorphic,
   Notation. We call the two semi-honest SFE participants         multiplication under encryption requires the help of C in a
client C and server S. This naming choice is influenced by        single round of interaction (details in [31]). Finally, S sends
the asymmetry in the SFE protocols, which fits into the           the encrypted outcome of the computation back to C who
client-server model. We stress that, while in most real-          can decrypt. As often the maximum size of elements in the
life two-party SFE scenarios this client-server relationship      plaintext space (e.g., P = Zn with RSA modulus n for the
in fact exists, we do not limit ourself to this setting.          Paillier cryptosystem [45]) is substantially larger than the
   Function Representations. Given a function f that              size of encrypted values, S can pack multiple values under
should be computed securely, the first task during the de-        encryption using Horner’s method before sending them to C
sign of the corresponding SFE protocol is to find a suitable      to reduce communication and number of decryptions by C.
representation for f . Well-established representations which        As described in [31], the interactive approach for multipli-
allow efficient SFE protocols are boolean circuits and arith-     cation currently results in faster SFE protocols than using
metic circuits as shown in Fig. 1.4 The representation deter-     schemes which also provide one (e.g., [6, 18]) or arbitrary
mines the size of the function, e.g., multiplication can be ex-   many (e.g., [17, 13, 55]) multiplications under encryption,
pressed as arithmetic circuit with a single multiplication gate   called fully homomorphic encryption. Such schemes could
while its representation as boolean circuit is substantially      be integrated in TASTY in future work as described in §6.
larger (cf. §5.1). As described in §2.2, the online phase for
SFE of boolean circuits is substantially more efficient than      2.2    Garbled Circuits: SFE of Boolean Circuits
SFE of arithmetic circuits, so especially non-linear functions       Garbled circuits (GC) are an efficient method for SFE of
such as comparisons benefit from boolean circuits [30]. The       boolean circuits. The general idea of GCs, going back to
framework of [31] allows to modularly compose functions           Yao [60], is to encrypt (garble) each wire with a symmetric
from building blocks which are compactly represented as           encryption scheme. In contrast to homomorphic encryption
boolean or arithmetic circuits and then convert back and          (cf. §2.1), the encryptions/garblings here cannot be oper-
forth between the representations under encryption.               ated on directly, but require helper information which is
                                                                  generated and sent to C in the setup phase in form of a
          x1 x2 x3 x4                  x1    x2                   garbled table for each gate. On the other hand, the online
                                                      5
                                                                  phase of GCs is highly efficient as it requires only symmetric
            ∧    ⊕                                                cryptographic operations, e.g., the GC method of [48] im-
                                         +        ×
                                                                  plemented in TASTY needs one invocation of SHA-256 per
                                                                  non-XOR gate (cf. §4.3).
             ∨       =                       ×
                                                                     On a high-level, Yao’s GC protocol works as follows: In
            z1       z2                                           the setup phase, the constructor (server S) generates an en-
                                           z
                                                                  crypted version of the function f (represented as boolean
       (a) Boolean Circuit       (b) Arithmetic Circuit
                                                                  circuit), called garbled circuit fe. For this, he assigns to each
                                                                  wire Wi of f two randomly chosen garbled values w          ei0 , w
                                                                                                                                   ei1
         Figure 1: Function Representations                       (symmetric keys) that correspond to the respective values 0
                                                                  and 1. Note that w   eij does not reveal any information about
   In the following, we summarize efficient methods for SFE       its plain value j as both keys look random. Then, for each
of arithmetic and boolean circuits, and conversions between       gate of f , the constructor creates helper information in form
them which are implemented in TASTY. For a comprehen-             of a garbled table Tei that allows to decrypt only the output
sive description we refer to [31] and list the specific primi-    key from the gate’s input keys. The garbled circuit fe con-
tives implemented in TASTY in §4.3.                               sists of the garbled tables of all gates and is sent to C in
                                                                  the setup phase. Later, in the online phase the evaluator
2.1    Homomorphic Encryption: SFE of Arith-                      (client C) obliviously obtains the garbled values x     e and ye
       metic Circuits                                             corresponding to the plain inputs x and y of C and S, re-
   Additively homomorphic encryption schemes (e.g., [45, 12])     spectively (see below). Afterwards, C evaluates the garbled
are semantically secure encryption schemes with plaintext         circuit fe on x
                                                                                e, ye by evaluating the garbled gates one-by-one
space P and ciphertext space C that allow addition under          using their garbled tables. Finally, C obtains the correspond-
encryption: The operation + can be computed on plain-             ing garbled output values ye which allow S to decrypt them
texts by defining a corresponding operation  on ciphertexts      into the corresponding plain output z = f (x, y).
which satisfies ∀x, y ∈ P : JxKJyK = Jx+yK. This naturally          For converting a plain input bit yi of S into its garbled
allows for multiplication with a plaintext constant a using       equivalent, S simply sends the key yeiyi to C. Similarly, C
repeated doubling and adding: ∀a ∈ N, x ∈ P : aJxK = JaxK.        must obtain the garbled bit x     ei corresponding to his input
We write JxK for homomorphic encryption of plaintext x.           bit xi , but without S learning xi . This can be achieved
   SFE of arithmetic circuits can be naturally based on ad-       by running (in parallel for each bit xi of x) a 1-out-of-2
ditively homomorphic encryption as follows: Client C gen-         Oblivious Transfer (OT) protocol. OT is a cryptographic
4
 Ordered Binary Decision Diagrams (OBDDs), an alterna-            protocol into which C inputs his choice bit b = xi and S
tive function representation which also fits into the frame-      inputs two strings s0 = x     e0i and s1 = x  e1i . The protocol
work of [31] is not implemented in TASTY yet.                     guarantees that C obtains only the chosen string sb = x     exi i =
ei while S learns no information on b = xi . We summarize
x                                                                             Two parties, client C and server S, have as inputs a set
efficient instantiations for parallel OT later in §4.3.                    X = {x1 , . . . , xm } respectively Y = {y1 , . . . , yn }. The pro-
   We emphasize that GCs cannot be evaluated twice, and                    tocol should compute the intersection X ∩ Y without reveal-
refer to [33] for a proof of security for Yao’s protocol in the            ing any other elements to the other party. The main idea
semi-honest model and to [31] for a summary of different                   behind this protocol is to encode X as a polynomial p(x)
methods for constructing garbled tables and converting gar-                whose roots are thePm values xi , i.e., p(x) = (x − x1 )(x −
bled outputs into plain values.                                            x2 ) . . . (x − xm ) = m ai xi . C computes the coefficients ai
                                                                           of p(x), encrypts them separately using homomorphic en-
2.3      Hybrid SFE of Mixed Representations                               cryption and then sends these ciphertexts to S. Then, S
   The SFE framework proposed in [31] allows to modularly                  evaluates the polynomial p under homomorphic encryption:
compose SFE protocols as sequence of operations on en-                     Jp(yi )K = Jak Kyik  Jak−1 Kyik−1  . . .  Ja0 K. This is done
crypted data as shown in Fig. 2: Both parties have Plain                   efficiently with Horner’s method. Now, for each yi ∈ Y ,
Values as their inputs into the protocol. These plain val-                 S picks a random value ri , computes Jȳi K = Jri ∗ p(yi ) + yi K,
ues, denoted as x, are first encrypted by converting them                  and sends it to C. If yi is equal to an element in X, then
into their corresponding encrypted value. A Garbled Value,                 this is an encryption of yi (as p(yi ) evaluates to 0), and of
denoted as x e, held by client C or a Homomorphic Value,                   a random element otherwise. C finally decrypts Jȳi K into ȳi ,
denoted as JxK held by server S, depending on which oper-                  and if ȳi ∈ X, C puts ȳi into the intersection set.
ations should be applied. After encryption, the function is                   This protocol can be implemented in TASTYL as listed
securely evaluated on the encrypted values, which may in-                  in Appendix §A. The performance for random 32-bit inputs,
volve conversion of the encryptions into the respective other              measured automatically with TASTY, is shown in Table 2.
type of encryption (see below). Finally, the encrypted out-
put values are revealed and can be decrypted by converting
them into their corresponding plain output values. In the                   Table 2: Set Intersection of [16, 34] with TASTY.
following we describe how to efficiently convert between the                   Elements (m = n)      10       100     1,000
two types of encryptions.                                                      C setup            153 ms 969 ms       9.3 s
                                                                               S setup            194 ms     1.6 s    15.8 s
                          Client C                   Server S                  C online           357 ms     7.2 s    489 s
Inputs/Outputs         Plain Value x               Plain Value x               S online           216 ms     6.2 s    478 s
                                                                               Total send         19.6 kB 186 kB 1.86 MB
Encrypted Values                   �
                     Garbled Value x          Homomorphic Value �x�

                      Boolean Circuits            Arithmetic Circuits
      SFE of       using Garbled Circuits   using Homomorphic Encryption
                                                                           3.2    Privacy-Preserving Face Recognition
               Figure 2: Hybrid SFE Protocols                                 For privacy-preserving face recognition, client C has a
                                                                           query face which should be searched in a database (DB)
                                                                           of faces held by server S without disclosing any additional
   Conversion between Garbled and Homomorphic                              information on the queried face to S nor any information
Values. To convert an Homomorphic Value JxK into a Gar-                    on the DB to C (besides the size and the outcome of the
bled Value x e, S adds a random mask r under homomorphic                   computation). At the end, C obtains either the index of the
encryption, sends the blinded value Jx̄K = JxK  JrK to C                  queried face in the DB, or ⊥ if no match was found.
who decrypts and both parties evaluate a garbled subtrac-                     We summarize the face recognition protocol of [50] which
tion circuit which takes off the random mask under “garbled                evaluates the well-known Eigenface algorithm [57] under en-
encryption”. A similar method can be used for converting                   cryption and can be divided into the following three phases:
a Garbled Value x   e into an Homomorphic Value JxK. For                   Projection. First, the query face Γ is projected into a low-
details we refer to [31].                                                  dimensional eigenspace. This is done under homomorphic
                                                                           encryption as follows: C encrypts Γ pixelwise and sends JΓK
                                                                           to S who performs the projection under encryption and ob-
3.     SELECTED APPLICATIONS                                               tains the encrypted projected query face JΩ̄K.
   In this section we show how the TASTY framework can be                  Distance. Then, the squared Euclidean distance JDi K =
used to intuitively describe, and automatically generate and               J(Ωi − Ω̄)2 K between the projected face and all faces Ωi in
measure the performance of two privacy-preserving applica-                 S’s DB is computed under homomorphic encryption.
tions. We consider privacy-preserving set intersection (§3.1)              Minimum. Finally, the minimum value of {JDi K} is com-
and privacy-preserving face recognition (§3.2). A detailed                 puted and, if smaller than a threshold τ provided by S, the
description of TASTY and its input language TASTYL is                      corresponding index in the DB is revealed to C. Otherwise,
given later in §4; further performance results and the specs               no match was found and ⊥ is returned. The protocol of [50]
of the machines used in our performance measurements are                   improves over [15] by computing this phase with garbled
given in §5.                                                               circuits instead of homomorphic encryption.
                                                                              The TASTYL code of this protocol is given in Appendix §B.
3.1      Privacy-Preserving Set Intersection                               Performance. In the following we compare the perfor-
  Privacy-preserving set intersection is a fundamental build-              mance of this protocol implemented in TASTY with its hand-
ing block for many privacy-preserving applications such as                 optimized implementation of [50] and the original protocol
privacy-preserving checking of no-flight list. We briefly sum-             of [15] based on HE only. As previous works we perform
marize the HE-based set-intersection protocol of [16, 34]:                 our measurements with ultra-short term security parame-
18
ters (cf. Table 4). The results are summarized in Table 3                                                                                  Setup
                                                                                                                                           Online
and visualized in Fig. 3 (time) and Fig. 4 (communication).                                                    16                     [15] Online
                                                                                                                         [15] Online (theoretical)
   Setup Phase. While [50] focused on the online phase only,                                                   14
                                                                                                                         [50] Online (theoretical)

we also provide performance measurements for the setup
                                                                                                               12
phase. As expected, both time and communication of the

                                                                                              Data in MBytes
setup phase grow linearly in the DB size (corresponding to                                                     10

linear circuit size and number of OTs). A constant overhead                                                     8

is needed for pre-computing random masks for HE (≈ 20s).
                                                                                                                6
The setup phase is less efficient than that of the HE-only
protocol of [15] as GCs need to be generated and transferred.                                                   4

   Online Phase. When comparing the online time of the                                                          2

protocol generated by TASTY with the hand-optimized im-
                                                                                                                0
plementation of [50] we observe that they mostly differ by a                                                        0     100          200           300   400     500     600   700   800   900   1000
                                                                                                                                                                 DB Size
constant overhead. In fact, the online phase is dominated by
homomorphically encrypting the vector Γ in the projection
phase which is not yet optimized in TASTY. For the online                                     Figure 4: Hybrid Face Recognition: Communication
communication complexity we observe that data serializa-
tion in TASTY, which is also not optimized yet, requires
approximately twice as much data as the theoretical lower
bound of the protocol of [50]. Still, even without further
                                                                                              4.                        TASTY
serialization optimizations, the online communication com-                                      In this section we present TASTY, our tool for describing
plexity outperforms that of [15] already for databases with                                   and automatically generating, benchmarking, and evaluat-
slightly more than 200 faces. We note that the communica-                                     ing hybrid secure two-party computation protocols.
tion complexity of [15] closely matches the theoretical lower                                   Design Goals. TASTY was designed and developed to
bound, i.e., their serialization cannot be optimized further.                                 meet the following goals:
                                                                                                               1. SFE protocols are programmed in TASTYL, an intu-
                                                                                                                  itive high-level language for describing the protocol as
            Table 3: Hybrid Face Recognition with TASTY                                                           sequence of operations on encrypted data (cf. §4.1).
                                           Communication                                                       2. TASTY allows to test, benchmark and compare the
                               Time in s
                                              in MBytes                                                           performance of the generated SFE protocols (cf. §4.2).
                      |DB| Setup Online Setup Online                                                           3. The generated SFE protocols aim at minimizing the
              [15]       320  22      18          7.3                                                             latency of the online phase, i.e., the time from pro-
              [50]       320   -      8.4    -        2.8                                                         viding the inputs until obtaining the outputs. This
             TASTY       320 38.1    41.5   3.3       5.9                                                         is achieved by using a combination of highly efficient
              [50]    1, 000   -      13     -        3.5                                                         primitives and pre-computations (cf. §4.3).
             TASTY 1, 000    83.4    56.2   10.2      6.8
                                                                                                 Architecture and Workflow (cf. Fig. 5). The work-
                                                                                              flow for using TASTY is as follows:
                                                                                                               1. Both users, client C and server S, agree on a Protocol
            90
                          Setup
                                                                                                                  Description of the SFE protocol in the TASTY input
            80
                       C Online
                       S Online
                                                                                                                  Language (TASTYL) as described in detail in §4.1.
                     [15] Online
                     [50] Online                                                                               2. Both users invoke TASTY’s Runtime Environment (de-
            70
                                                                                                                  tails later in §4.2), a program that can automatically
            60                                                                                                    analyze, run, test, and benchmark the SFE protocol:
            50                                                                                                          (a) In the Analyzation Phase, the runtime environ-
Time in s

            40
                                                                                                                            ment checks the syntactical correctness of the pro-
                                                                                                                            tocol description, exchanges a hash of it to en-
            30
                                                                                                                            sure that both parties run the same protocol, and
            20                                                                                                              analyzes the protocol to automatically determine
            10                                                                                                              which parts of the protocol can be pre-computed.
                                                                                                                        (b) In the Setup Phase, the parties pre-compute those
             0
                 0         100     200   300   400     500     600   700   800   900   1000                                 parts of the protocol which are independent of
                                                     DB Size
                                                                                                                            their inputs, e.g., create/send garbled circuits and
                                                                                                                            oblivious transfers (OT), see §4.3 for details.
                     Figure 3: Hybrid Face Recognition: Times                                                           (c) Finally, in the Online Phase, both parties provide
                                                                                                                            their inputs to the computation, and the online
                                                                                                                            part of the SFE protocol is executed (e.g., ho-
   SCiFI - a system for secure face identification. We note                                                                 momorphic encryptions and decryptions, online
that the recent face recognition system of [44], consisting of a                                                            OTs, and evaluation of GCs) to jointly compute
novel recognition algorithm which was co-designed together                                                                  the respective outputs for both parties.
with a highly efficient SFE protocol, is more accurate and                                                     3. TASTY provides a tool to compare the performance
efficient than Eigenface-based protocols.                                                                         costs of multiple SFE protocols as described in §4.2.
Protocol Description                              Value                                       Garbled
         Client C                               Server S
                           in TASTYL                                                                                  Value
                                                                        bitlength
                                                                                                                    mux,
# −∗− c o d i n g : u t f −8 −∗−                                                     ception if the result does not fit into the plaintext space of
def p r o t o c o l ( c l i e n t , s e r v e r ) :                                  the homomorphic cryptosystem. For example, in Fig. 7 the
    N = 4
    L = 32
                                                                                     component-wise product of two vectors with N components
                                                                                     of unsigned L-bit values results in the homomorphic vector
       # input of c l i e n t                                                        server.hx with N components of unsigned 2L-bit values.
       c l i e n t . v = UnsignedVec ( b i t l e n=L , dim=N)                           Multiple Outputs. Garbled circuits can also have mul-
       c l i e n t . v . i n p u t ( d e s c=” e n t e r v a l u e s f o r v ” )
                                                                                     tiple garbled output values written as comma separated list
       # input of server                                                             on the left side of the assignment operator, e.g., the garbled
       s e r v e r . w = UnsignedVec ( b i t l e n=L , dim=N)                        minimum value gv and its index gi can be computed as
       s e r v e r . w . i n p u t ( d e s c=” e n t e r v a l u e s f o r w” )
                                                                                     (client.gv, client.gi)=client.gx.min_value_index().
       # c o n v e r t u n s i g n e d t o homomorphic v e c t o r                      Circuits from File. TASTY allows secure evaluation of
       c l i e n t . hv = HomomorphicVec ( v a l=c l i e n t . v )                   boolean circuits read from an external file, e.g., circuits gen-
       s e r v e r . hv
and ECRYPT II [19] as shown in Table 4. By using matching           as random oracle. The elliptic curve implementation pro-
basic primitives both security and efficiency are optimized         vides (optional) point compression to reduce communication
simultaneously. We use elliptic curves from the SECG stan-          at the cost of a negligibly larger computation overhead.
dard [53] and SHA-256 as cryptographic hash function.                 Compiler Optimizations. TASTY parses the TASTYL
                                                                    program and performs several optimizations on the result-
                                                                    ing abstract syntax tree (AST): constant propagation, dead
 Table 4: Pre-Defined Security Levels in TASTY.                     code elimination, partial code evaluation, and loop unrolling.
                Symmetric/
 Security Level              Asymmetric Curve [53]
                 Statistical                                        5.    PERFORMANCE MEASUREMENTS
 ultra-short       80 bit     1,248 bit  secp160r1                     We measure the performance of primitives implemented in
 short             96 bit     1,776 bit  secp192r1                  TASTY and compare different protocols against each other
 medium           112 bit     2,432 bit  secp224r1                  and with existing SFE implementations: multiplication cir-
 long             128 bit     3,248 bit  secp256r1                  cuits and protocols based on GC or HE (§5.1), SFE of an
                                                                    AES circuit generated by the Fairplay compiler (§5.2), and
                                                                    SFE of large GCs (§5.3).
   Homomorphic Encryption (HE). We use the addi-                       System Setup. All performance measurements are per-
tively homomorphic cryptosystem of Paillier [45]. As key            formed on two desktop PCs with Intel Core 2 Duo CPU
generation for Paillier (an RSA modulus n) is computation-          (E6850) running at 3.00GHz and 4GB RAM connected via
ally expensive and can be used over multiple protocol runs,         Gigabit Ethernet. The system runs on 64 bit Gentoo Linux
the public key is generated and exchanged in the analyza-           with Python version 2.6.5, gmpy version 1.11 and GMP ver-
tion phase. For efficient encryption we use the extensions          sion 4.3.2. Unless stated otherwise, all measurements were
of [12, Sect. 6] for pre-computing expensive modular expo-          performed for short-term security (cf. Table 4) and using
nentiations of the form rn mod n2 in the setup phase and            point compression for elliptic curves (cf. §4.3).
only two modular multiplications per encryption in the on-
line phase. As C knows the factorization p, q of n, he uses         5.1     Multiplication Circuits and Protocols
Chinese remaindering modulo p and q for pre-computing rn               As arithmetic circuits can express arbitrary computations
mod n2 and efficient decryption. Paillier ciphertexts have          as sequence of additions and multiplications, multiplication
twice the length of the asymmetric security parameter as            is a fundamental basic operation. Indeed, the main differ-
the ciphertext space is Z∗n2 . For modular arithmetics we use       ence between SFE protocols based on arithmetic and boolean
gmpy [21], a Python wrapper for the GMP library [20].               circuits is the cost for multiplications. We present efficient
   Garbled Circuits (GC). We use the GC construction                multiplication circuits in §5.1.1 and compare the perfor-
with free XORs and garbled row reduction of [48] secure in          mance of secure multiplication protocols in §5.1.2.
the random-oracle model. This GC construction provides
free XOR gates (no garbled table and negligible computa-            5.1.1    Multiplication Circuits
tion). For non-XOR d-input gates, the garbled table consists           Textbook Multiplication. The usual way of multi-
of 2d − 1 entries (of size t + 1 bit each with symmetric se-        plying two unsigned `-bit integers x and y, called “Text-
curity parameter t), creation requires 2d and evaluation 1          book Method”, multiplies x with each bit of y and adds
invocation of SHA-256 modeled as random oracle.                     up all the
                                                                             Pproperly    shifted results according to the formula
                                                                               `−1      i                                        2
   Circuits. For computations on Garbled Values/Vectors,            x·y =      i=0 xyi 2 . This results in a circuit with 2` − `
TASTY dynamically generates circuits using the efficient            non-XOR 2-input gates [30].
circuit constructions of [30] which are optimized for a low            Karatsuba Multiplication. As observed by Karatsuba
number of non-XOR gates (cf. §5.1.1 for multiplication cir-         [28], multiplication can be performed more efficiently using
cuits). Alternatively, circuits can be generated externally,        the following recursive method (details in Algorithm 1): x
e.g., using the Fairplay compiler [37], and read from a file (cf.   and y are split into two halves as x = xh 2d`/2e + xl and
§4.1.2). TASTY optimizes the circuits to a low number of            y = yh 2d`/2e + yl . Then, the product can be computed as
non-XOR gates using the optimization of [48] which replaces         xy = (xh 2d`/2e +xl )(yh 2d`/2e +yl ) = zh 22d`/2e +zd 2d`/2e +zl .
3-input gates with a low number of 2-input non-XOR gates.           After computing zh = xh yh and zl = xl yl , zd can be com-
XNOR gates are replaced by an XOR gate and an inversion             puted with only one multiplication as zd = (xh + xl )(yh +
gate which is propagated into successor gates [46]. Generat-        yl ) − zh − zl . This process is continued recursively until
ing, reading, and optimizing circuits is mostly pipelined to        the numbers are sufficiently small (` = 19 in our case as
allow processing of large circuits with low memory footprint.       described below) and multiplied with the classical school
   Oblivious Transfer (OT). All OTs are pre-computed in             method. Overall, multiplying two ` bit numbers with Karat-
the setup phase (cf. Fig. 5) using the construction of [2]; the     suba’s method requires three multiplications of `/2 bit num-
resulting online phase for OT is highly efficient (transfer and     bers and some additions and subtractions with linear bit
XOR of bitstrings) and depends mostly on the network la-            complexity resulting in costs
tency for two messages. To minimize the computation com-
                                                                                  TKara (`) = 3TKara (`/2) + c` + d
plexity of the setup phase, we use the efficient OT extension
of [24] to reduce the usually large number of OTs needed            for constants c and d. The master theorem [8, §4.3f] yields
in the protocol down to at most t real OTs and some in-             asymptotic complexity TKara (`) ∈ O(`log2 3 ) ≈ O(`1.585 ).
vocations of SHA-256 modeled as random oracle, where t is             Circuit Complexity. In TASTY we have implemented
the symmetric (computational) security parameter. The re-           both methods for multiplication based on efficient addition
maining real OTs (at most t) are implemented with the OT            and subtraction circuits of [30]. As shown in Fig. 8 and Ta-
protocol of [39, Sect. 3.1] using elliptic curves and SHA-256       ble 5, Karatsuba multiplication is more efficient, i.e., results
Algorithm 1 Karatsuba multiplication                              consider the case where both inputs are provided by one
 1: function karatsuba(x, y)          . x, y are `-bit integers   party (S for GC1 and C for HE1), or one by each of the
 2:    if ` ≤ 19 then                                             parties (GC2 and HE2). The inputs are Unsigned `-bit val-
 3:         return Textbook(x, y)                                 ues and the output, a 2`-bit Unsigned value is converted
 4:    end if                                                     into a Plain output for C. In the following, we compare
 5:    xh ||xl ← x                        . x = xh 2d`/2e + xl    the communication- and the computation complexity of the
 6:    yh ||yl ← y                         . y = yh 2d`/2e + yl   setup- and online phase of the protocols.
 7:    Ph ← KARATSUBA(xh , yh )                                                   1e+07
 8:    Pl ← KARATSUBA(yl , yl )                                                                HE1: Online
                                                                                               HE2: Online

 9:    xs ← xh + xl                                                                            GC1: Setup
                                                                                               GC1: Online
                                                                                               GC2: Setup
10:    ys ← yh + yl                                                               1e+06
                                                                                               GC2: Online

11:    Ps ← KARATSUBA(xs , ys )
12:    Pd ← Ps − Ph − Pl                                                          100000

                                                                  Data in Bytes
13:    return (Ph 22d`/2 e) + Pd 2d`/2e + Pl
14: end function
                                                                                  10000

in circuits with less non-XOR gates, than Textbook multipli-                       1000

cation already for multiplication of 20 bit operands. By in-
terpolating through the points for bitlength ` ∈ {32, 64, 128}                       100
                                                                                           1                 2   4   8               16   32   64   128
and solving the resulting system of linear equations we ob-                                                              Bitlength

tain as approximation for the number of non-XOR gates
           TKara (`) ≈ 9.0165`1.585 − 13.375` − 34.               Figure 9: Multiplication Protocols: Communication

                                                                     Communication (cf. Fig. 9). Our experiments show
                                                                  that GC-based multiplication requires a substantial amount
                                                                  of setup communication (for transfer of GCs) whereas the
                                                                  online communication of GC is better than HE for mul-
                                                                  tiplication of small values. The online communication for
                                                                  multiplying with HE is independent of the bitlength ` as a
                                                                  constant number of ciphertexts (2 for HE1 and 5 for HE2) is
                                                                  exchanged. For multiplying with GC, the setup communica-
                                                                  tion grows rapidly due to the large size of the GCs, whereas
                                                                  the online communication complexity grows much slower.
                                                                     Setup Time (cf. Fig. 10(a)). The time of the setup
                                                                  phase for GC-based multiplication protocols depends on the
                                                                  bitlength ` as GCs need to be computed; for better visual-
                                                                  ization we do not plot GC setup times for S in Fig. 10(a) as
                                                                  they are similar to those of C. For HE-based multiplication,
                                                                  the setup time is independent of ` as a constant number of
                                                                  encryptions is pre-computed.
        Figure 8: Size of Multiplication Circuits                    Online Time (cf. Fig. 10(b)). For GC-based multi-
                                                                  plication, the time needed by C depends on the size of the
                                                                  evaluated GC which grows with the bitlength `; GC’s online
                                                                  time for S is negligible. For HE-based multiplication, the
                                                                  time in the online phase is almost independent of ` for small
Table 5: Size of Multiplication Circuits (in number               bitlengths.
of 2-input non-XOR gates)                                            Conclusion. The setup phase for GC-based multiplica-
                                                                  tion is substantially more expensive than that of HE-based
                                                                  multiplication. However, for small values, GC-based multi-
 Bitlength `        19      20        32        64      128       plication can result in a faster online time than HE-based
 Textbook          703     780      2,016     8,128    32,640     multiplication. Furthermore, GC-based multiplication, in
 Karatsuba         703     721      1,729     5,683    17,973     contrast to HE-based multiplication, needs no (when com-
 Improvement      0.0 %   7.6 %    14.2 %    30.1 %    44.9 %     posed with other GC-based computations) or negligible on-
                                                                  line interaction and workload for S.
                                                                     Parallel Multiplications. When N multiplications are
                                                                  done in parallel, e.g., component-wise multiplication of two
5.1.2     Multiplication Protocols                                vectors of N components, time and data complexity of GC-
  Using TASTY we compare the performance of different             based multiplication grows linearly in N . HE-based paral-
secure multiplication protocols based on homomorphic en-          lel multiplication increases slower as multiple homomorphic
cryption (HE) and garbled circuits (GC). For this we con-         values can be packed before sending from S to C (cf. §2.1).
structed four basic test cases. For each SFE paradigm, we            Security Level. We note that when the security level is
10000
                                 HE1: C
                                 HE1: S
                                 HE2: C
                                                                                             Table 6: GC Evaluation of AES. Times in seconds.
                                 HE2: S
                     1000        GC1: C
                                 GC2: C
                                                                                                                                Time              KByte
                                                                                                          Security     Setup    Online    Total   Total
                                                                                               [37]      ultra-short     -        -         4      3760
 Setup Time in ms

                      100

                                                                                              TASTY      ultra-short    2.9      0.4       3.3      567
                       10
                                                                                               [48]          long        2        5         7       503
                                                                                              TASTY          long       4.0      0.5       4.5      860
                        1

                                                                                             than that of [48] by an order of magnitude. Recall, a short
                       0.1
                             1            2   4         8               16   32   64   128
                                                                                             online phase, i.e., latency from providing the inputs until
                                                            Bitlength                        obtaining the outputs, is important for many real-world ap-
                                                  (a) Setup Time                             plications. To minimize this, TASTY shifts most computa-
                                                                                             tions into the less time-critical setup phase (cf. §4). Also
                     1000
                                HE1: C
                                                                                             TASTY has a slightly shorter total time than [48], whereas
                                HE1: S
                                HE2: C
                                                                                             the data complexity is slightly larger due to less optimal
                                HE2: S
                                GC1: C                                                       data serialization in Python. More detailed, the setup time
                                GC1: S
                      100       GC2: C
                                GC2: S
                                                                                             of [48] is 1s for GC creation and 1s for data transfer, and
                                                                                             the online time is 3s for OT7 and 2s for GC evaluation. In
 Online Time in ms

                                                                                             TASTY the setup time is dominated by 1.1s for OT and
                       10                                                                    1.8s for GC creation, and the online time is dominated by
                                                                                             0.4s for GC evaluation.

                        1                                                                    5.3   Evaluation of Large GCs
                                                                                                As AES, compared in §5.2, is the only non-trivial circuit
                                                                                             for which performance results are reported in [48] and the
                      0.1
                            1             2   4        8                16   32   64   128   code is not available yet, we compare the performance of
                                                            Bitlength
                                                                                             TASTY for evaluation of large GCs with the original Fair-
                                                  (b) Online Time                            play system [37] implemented in Java.
                                                                                                Our test circuits have 10 unsigned 8-bit input values8 pro-
                                                                                             vided by C, |C| non-XOR 2-input gates and one output bit
                      Figure 10: Multiplication Protocols: Times                             for S. For our measurements we use ultra-short term secu-
                                                                                             rity parameters which provide the same security as Fairplay.
                                                                                             Our measurement results are shown in Table 7.
increased to medium- or even long-term security, the perfor-                                    Comparison. Memory. With 4GB memory, Fairplay
mance of HE-based multiplication decreases rapidly while                                     was not able to evaluate circuits with 222 gates and raised
the performance of GC-based multiplication is affected only                                  an OutOfMemoryError, while TASTY ran out of memory
moderately, as the asymmetric security parameter grows                                       for circuits with 223 gates. In both cases, the huge blowup
substantially faster than the symmetric one (cf. Table 4).                                   in memory is due to allocation of intermediate objects for
                                                                                             each gate.
5.2                          Evaluation of Fairplay Circuits and AES                            Communication. As expected, TASTY needs less com-
                                                                                             munication than Fairplay as the chosen GC technique needs
   As described in §4.1.2, TASTY can evaluate externally
                                                                                             3 instead of 4 entries per garbled 2-input non-XOR gate.
generated file circuits. Using this feature, we compare the
                                                                                             Further, as GC is transferred in the setup phase, TASTYś
performance of TASTY for evaluation of the AES function-
                                                                                             online communication is independent of the circuit size.
ality with the state of the art software implementation of
                                                                                                Time. TASTY’s setup time is slower than Fairplay’s to-
GCs reported in [48, Table 2] which is implemented in C++
                                                                                             tal time by approximately a factor of 2 which we assume
and measured on two machines also with Intel Core 2 Duo’s
                                                                                             is due to less efficient internal data structures and nested
running at 3.0 GHz and 4GB of RAM connected by gigabit
                                                                                             function calls in the completely interpreted Python language
ethernet. We use the AES circuit of [48] which has 128 bit
                                                                                             compared to bytecode-compiled Java. On the other hand,
input bits provided by each party, 128 output bits for C and
                                                                                             TASTY’s online time is faster than the total time of Fairplay
is optimized for a low number of non-XOR gates (22, 594
                                                                                             by factor 2 as we shifted complexity into the setup phase.
XOR gates and 11, 286 non-XOR 2-input gates).
                                                                                                Conclusion. From the performance measurements with
   The performance of different GC implementations for eval-
                                                                                             previous GC implementations in §5.2 and §5.3 we conclude
uating the AES functionality is compared in Table 6:
                                                                                             that choosing more efficient primitives has large impact on
   For ultra-short-term security, when evaluating AES with
                                                                                             the communication complexity whereas the memory and
Fairplay’s Java runtime [37], we see that Fairplay requires
                                                                                             time complexity are dominated by non-cryptographic factors
substantially more communication than TASTY, as Fairplay
provides no free XOR gates (2/3 of the gates are XOR gates).                                 7
                                                                                               As OT seemed not to be the performance bottleneck in [48],
Also TASTY’s time complexity is slightly better than that                                    they implemented a less efficient, UC secure OT protocol.
of Fairplay due to free XOR and more efficient OT.                                           8
                                                                                               This yields the maximum number of 80 OTs before the OT
   Also for long-term security, TASTY’s online phase is faster                               extension kicks in which is not provided by Fairplay.
SFE protocols can be adapted to this streaming scenario
            Table 7: Evaluation of Large GCs                       as described in [26, 27, 51]: In contrast to the compilation
                Time [s]         Communication [MB]
                                                                   paradigm used in Fairplay [37, 3], TASTY already gener-
                     TASTY                  TASTY                  ates the circuits on-the-fly gate-by-gate. Fore each gate, the
     |C|   [37]                  [37]
                Setup Online           Setup Online                garbled table can be generated on-the-fly [26], sent over the
     210     2   0.8     0.3    0.155   0.07    0.003              network and evaluated directly [27]. Also OT can be ex-
     215     4     6      1      3.67    1.7    0.003              tended on-the-fly as mentioned in [24].
     218    17    44       8     28.8   13.5    0.003
     220    80   177      32     113    54.2    0.003
     221   175   368      65     226    109     0.003              7.   ACKNOWLEDGEMENTS
     222     -   787     132       -    217     0.003                We thank anonymous reviewers of ACM CCS’10 for their
     223     -     -       -       -      -       -                helpful comments and Benny Pinkas for his suggestions on
                                                                   an early version of this paper. This work was funded by EU
                                                                   FP7 project CACE and ECRYPT II.
such as optimizing data structures and flow for the selected
programming language.                                              8.   REFERENCES
                                                                    [1] M. Barni, P. Failla, V. Kolesnikov, R. Lazzeretti,
6.     FUTURE WORK                                                      A.-R. Sadeghi, and T. Schneider. Secure evaluation of
   To facilitate future work, TASTY is available for down-              private linear branching programs with medical
load at [56]. It is ready for being used as a tool for describ-         applications. In European Symposium on Research in
ing, implementing, benchmarking, and comparing protocols                Computer Security (ESORICS’09), volume 5789 of
for many privacy-preserving applications. It could also be              LNCS, pages 424–439. Springer, 2009.
extended into a platform for comparing cryptographic primi-         [2] D. Beaver. Precomputing oblivious transfer. In
tives, e.g., rapidly emerging (fully) homomorphic encryption            Advances in Cryptology – CRYPTO’95, volume 963 of
schemes.                                                                LNCS, pages 97–109. Springer, 1995.
   Further Primitives. By adding 1-out-of-n OT as fur-              [3] A. Ben-David, N. Nisan, and B. Pinkas. FairplayMP:
ther primitive, TASTY could be used for Hamming distance                a system for secure multi-party computation. In ACM
based computations [25] (based on HE and 1-out-of-n OT)                 Conference on Computer and Communications
with application to secure face identification [44]. Also other         Security (ACM CCS’08), pages 257–266. ACM, 2008.
additively homomorphic encryption schemes for large [12] or             http://fairplayproject.net/fairplayMP.html.
small [9, 10] ciphertext space, or the schemes of [6, 18] which     [4] M. Ben-Or, S. Goldwasser, and A. Wigderson.
allow arbitrary many additions and one multiplication might             Completeness theorems for non-cryptographic
be useful for some protocols. If applications require multi-            fault-tolerant distributed computation. In Symp. on
plication of very large numbers within a circuit one might              Theory of Comp. (STOC’88), pages 1–10. ACM, 1988.
consider implementation of multiplication circuits which are        [5] D. Bogdanov, S. Laur, and J. Willemson. Sharemind:
asymptotically faster than Karatsuba multiplication, e.g.,
                                                                        A framework for fast privacy-preserving computations.
Toom-3 [29, Sect. 4.3.3.A] splits each factor into 3 parts and
                                                                        In European Symposium on Research in Computer
performs 5 instead of 9 sub-multiplications resulting in com-           Security (ESORICS’08), volume 5283 of LNCS, pages
plexity Θ(nlog3 5 ) ≈ Θ(n1.47 ).                                        192–206. Springer, 2008.
   Compilation to TASTYL. As a long-term goal it would
                                                                    [6] D. Boneh, E.-J. Goh, and K. Nissim. Evaluating
be beneficial to automatically generate TASTYL programs
                                                                        2-DNF formulas on ciphertexts. In Theory of
from a high-level description of the algorithm to be com-
                                                                        Cryptography (TCC’05), volume 3378 of LNCS, pages
puted securely in a function description language such as
                                                                        325–341. Springer, 2005.
Fairplay’s SFDL language (cf. §1.2). Using TASTY’s ca-
pabilities for measuring the costs of the generated protocols       [7] J. Brickell, D. E. Porter, V. Shmatikov, and
the compilation process could automatically choose between              E. Witchel. Privacy-preserving remote diagnostics. In
circuits or homomorphic encryption (with one out of multi-              ACM Conference on Computer and Communications
ple homomorphic encryption schemes) for specific sub-tasks              Security (ACM CCS’07), pages 498–507. ACM, 2007.
to generate highly efficient protocols.                             [8] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and
   Streaming vs. Pre-Computation. As discussed in §4,                   C. Stein. Introduction to Algorithms, Second Edition.
a crucial design goal of TASTY is to shift as many operations           MIT Press and McGraw-Hill Book Company, 2001.
as possible into the setup phase resulting in a highly efficient    [9] I. Damgård, M. Geisler, and M. Krøigaard. Efficient
online phase. This paradigm is justified when the two par-              and secure comparison for on-line auctions. In
ties have a long-term relation where the function is known              Australasian Conference on Information Security and
in advance and pre-computations can be performed ahead                  Privacy (ACISP’07), volume 4586 of LNCS, pages
of time; this leads to a quick response time once the parties           416–430. Springer, 2007.
provide their inputs in the online phase. In some applica-         [10] I. Damgård, M. Geisler, and M. Krøigaard. A
tion scenarios where parties make ad-hoc decision when and              correction to “Efficient and Secure Comparison for
what to compute securely, such pre-computations are not                 On-Line Auctions”. Cryptology ePrint Archive, Report
possible. HE-based SFE protocols naturally allow to change              2008/321, 2008. http://eprint.iacr.org.
the evaluated functionality on-the-fly and stream data as          [11] I. Damgård, M. Geisler, M. Krøigård, and J. B.
soon as it is ready to keep CPUs and network busy simulta-              Nielsen. Asynchronous multiparty computation:
neously (e.g., as implemented in VIFF [11]). Also GC-based              Theory and implementation. In Public Key
Cryptography (PKC’09), volume 5443 of LNCS, pages             383–397. Springer, 2010.
       160–179. Springer, 2009. http://viff.dk.                 [28] A. Karatsuba and Y. Ofman. Multiplication of
[12]   I. Damgård and M. Jurik. A generalisation, a                 many-digital numbers by automatic computers. SSSR
       simplification and some applications of paillier’s            Academy of Sciences, 145:293–294, 1962.
       probabilistic public-key system. In Public-Key           [29] D. E. Knuth. The Art of Computer Programming,
       Cryptography (PKC’01), volume 1992 of LNCS, pages             Volume II: Seminumerical Algorithms, 3rd Edition.
       119–136. Springer, 2001.                                      Addison-Wesley, 1997.
[13]   M. Dijk, C. Gentry, S. Halevi, and                       [30] V. Kolesnikov, A.-R. Sadeghi, and T. Schneider.
       V. Vaikuntanathan. Fully homomorphic encryption               Improved garbled circuit building blocks and
       over the integers. In Advances in Cryptology –                applications to auctions and computing minima. In
       EUROCRYPT’10, LNCS, pages 24–43. Springer, 2010.              Cryptology and Network Security (CANS’09), volume
[14]   T. El Gamal. A public key cryptosystem and a                  5888 of LNCS, pages 1–20. Springer, 2009.
       signature scheme based on discrete logarithms. In        [31] V. Kolesnikov, A.-R. Sadeghi, and T. Schneider. A
       Advances in Cryptology – CRYPTO’84, volume 196 of             systematic approach to practically efficient general
       LNCS, pages 10–18. Springer, 1985.                            two-party secure function evaluation protocols and
[15]   Z. Erkin, M. Franz, J. Guajardo, S. Katzenbeisser,            their modular design. Journal of Computer Security
       I. Lagendijk, and T. Toft. Privacy-preserving face            (JCS), 21(2):283–315, 01 2013. Preliminary version
       recognition. In Privacy Enhancing Technologies                available at http://eprint.iacr.org/2010/079.
       (PET’09), volume 5672 of LNCS, pages 235–253.            [32] V. Kolesnikov and T. Schneider. A practical universal
       Springer, 2009.                                               circuit construction and secure evaluation of private
[16]   M. J. Freedman, K. Nissim, and B. Pinkas. Efficient           functions. In Financial Cryptography and Data
       private matching and set intersection. In Advances in         Security (FC’08), volume 5143 of LNCS, pages 83–97.
       Cryptology – EUROCRYPT’04, volume 3027 of LNCS,               Springer, 2008.
       pages 1–19. Springer, 2004.                              [33] Y. Lindell and B. Pinkas. A proof of Yao’s protocol for
[17]   C. Gentry. Fully homomorphic encryption using ideal           secure two-party computation. Journal of Cryptology,
       lattices. In ACM Symposium on Theory of Computing             22(2):161–188, 2009. Cryptology ePrint Archive:
       (STOC’09), pages 169–178. ACM, 2009.                          Report 2004/175.
[18]   C. Gentry, S. Halevi, and V. Vaikuntanathan. A           [34] Y. Lindell and B. Pinkas. Secure multiparty
       simple BGN-type cryptosystem from LWE. In                     computation for privacy-preserving data mining. J. of
       Advances in Cryptology – EUROCRYPT’10, volume                 Privacy and Confidentiality, 1(1):59–98, 2009.
       6110 of LNCS, pages 506–522. Springer, 2010.             [35] Y. Lindell, B. Pinkas, and N. Smart. Implementing
[19]   D. Giry and J.-J. Quisquater. Cryptographic key               two-party computation efficiently with security against
       length recommendation, 2010. http://keylength.com.            malicious adversaries. In Security and Cryptography
[20]   GMP – GNU multi precision arithmetic library.                 for Networks (SCN’08), volume 5229 of LNCS, pages
       http://gmplib.org.                                            2–20. Springer, 2008.
[21]   gmpy – multiprecision arithmetic for Python.             [36] P. MacKenzie, A. Oprea, and M. K. Reiter. Automatic
       http://code.google.com/p/gmpy.                                generation of two-party computations. In ACM
[22]   Gnuplot.py. http://gnuplot-py.sourceforge.net.                Conference on Computer and Communications
[23]   W. Henecka, S. Kögl, A.-R. Sadeghi, T. Schneider,            Security (ACM CCS’03), pages 210–219. ACM, 2003.
       and I. Wehrenberg. TASTY: Tool for Automating            [37] D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella. Fairplay
       Secure Two-partY computations. In ACM Conference              — a secure two-party computation system. In
       on Computer and Communications Security (ACM                  USENIX Security Symposium, 2004.
       CCS’10), pages 451–462. ACM, 2010.                            http://fairplayproject.net/fairplay.html.
[24]   Y. Ishai, J. Kilian, K. Nissim, and E. Petrank.          [38] T. Moran. The qilin crypto SDK – an open-source
       Extending oblivious transfers efficiently. In Advances        java SDK for rapid prototyping of cryptographic
       in Cryptology – CRYPTO’03, volume 2729 of LNCS,               protocols. http://qilin.seas.harvard.edu.
       pages 145–161. Springer, 2003.                           [39] M. Naor and B. Pinkas. Efficient oblivious transfer
[25]   A. Jarrous and B. Pinkas. Secure hamming distance             protocols. In ACM-SIAM Symposium On Discrete
       based computation and its applications. In Applied            Algorithms (SODA’01), pages 448–457. Society for
       Cryptography and Network Security (ACNS’09),                  Industrial and Applied Mathematics, 2001.
       volume 5536 of LNCS, pages 107–124. Springer, 2009.      [40] M. Naor, B. Pinkas, and R. Sumner. Privacy
[26]   K. Järvinen, V. Kolesnikov, A.-R. Sadeghi, and               preserving auctions and mechanism design. In ACM
       T. Schneider. Embedded SFE: Offloading server and             Conf. on Electronic Commerce, pages 129–139, 1999.
       network using hardware tokens. In Financial              [41] J. D. Nielsen. Languages for Secure Multiparty
       Cryptography and Data Security (FC’10), volume 6052           Computation and Towards Strongly Typed Macros.
       of LNCS, pages 207–221. Springer, 2010.                       PhD thesis, University of Aarhus, Denmark, 2009.
[27]   K. Järvinen, V. Kolesnikov, A.-R. Sadeghi, and          [42] J. D. Nielsen and M. I. Schwartzbach. A
       T. Schneider. Garbled circuits for leakage-resilience:        domain-specific programming language for secure
       Hardware implementation and evaluation of one-time            multiparty computation. In Workshop on
       programs. In Cryptographic Hardware and Embedded              Programming Languages and Analysis for Security
       Systems (CHES’10), volume 6225 of LNCS, pages                 (PLAS’07), pages 21–30. ACM, 2007.
[43] J. Nzouonta, M. C. Silaghi, and M. Yokoo. Secure                (STOC’76), pages 196–203. ACM, 1976.
     computation for combinatorial auctions and market          [59] A. C. Yao. Protocols for secure computations. In
     exchanges. In Autonomous Agents and Multiagent                  IEEE Symposium on Foundations of Computer
     Systems (AAMAS’04), pages 1398–1399. IEEE, 2004.                Science (FOCS’82), pages 160–164. IEEE, 1982.
[44] M. Osadchy, B. Pinkas, A. Jarrous, and B. Moskovich.       [60] A. C. Yao. How to generate and exchange secrets. In
     SCiFI - a system for secure face identification. In             IEEE Symposium on Foundations of Computer
     IEEE Symposium on Security & Privacy (S&P’10),                  Science (FOCS’86), pages 162–167. IEEE, 1986.
     pages 239–254. IEEE, 2010.
[45] P. Paillier. Public-key cryptosystems based on             APPENDIX
     composite degree residuosity classes. In Advances in
     Cryptology – EUROCRYPT’99, volume 1592 of LNCS,            A.     SET INTERSECTION (§3.1) IN TASTYL
     pages 223–238. Springer, 1999.
[46] A. Paus, A.-R. Sadeghi, and T. Schneider. Practical        # −∗− c o d i n g : u t f −8 −∗−
     secure evaluation of semi-private functions. In Applied    from t a s t y . c r y p t . math import \
                                                                    getPolyCoefficients
     Cryptography and Network Security (ACNS’09),
     volume 5536 of LNCS, pages 89–106. Springer, 2009.         def p r o t o c o l ( c , s ) :
[47] T. P. Pedersen. Non-interactive and                            M = 100 # s i z e o f c l i e n t ’ s s e t
     information-theoretic secure verifiable secret sharing.        N = 100 # s i z e o f s e r v e r ’ s s e t
     In Advances in Cryptology – CRYPTO’91, volume 576
                                                                      c .X = ModularVec ( dim=M) . i n p u t ( d e s c=”X” )
     of LNCS, pages 129–140. Springer, 1992.                          s .Y = ModularVec ( dim=N) . i n p u t ( d e s c=”Y” )
[48] B. Pinkas, T. Schneider, N. P. Smart, and S. C.
     Williams. Secure two-party computation is practical.            # i n t e r p o l a t e c o e f f s o f p o l y w i t h r o o t s c .X
     In Advances in Cryptology – ASIACRYPT’09, volume                c . a = g e t P o l y C o e f f i c i e n t s ( ) ( c .X)
     5912 of LNCS, pages 250–267. Springer, 2009.
                                                                     # e n c r y p t and send c o e f f i c i e n t s t o s e r v e r
[49] A.-R. Sadeghi and T. Schneider. Generalized universal           s . ha
s . u = SignedVec ( b i t l e n =8 , dim=(K, N) ) . i n p u t ( )
    s . tau = Unsigned ( b i t l e n =50) . i n p u t ( )                   from t a s t y . t y p e s . d r i v e r import D r i v e r
                                                                            from t a s t y import u t i l s
    # Projection
    s . hgamma  tasty_post analyze.py \                                  t a s t y p l o t ( graph name , x l a b e l , y l a b e l ,
                                results/costs.bin                                     x v a l u e s , y v a l u e s , l e g e n d=” i n s i d e ” ,
                                                                                      o u t f i l e=” m u l t i p l i c a t i o n t i m e . p d f ” )
  The resulting graph is shown in Fig. 13. As expected, the
costs are similar to HE2 in Fig. 10(b).
                                                                            Figure 12: Postprocessing Script To Draw Graph
9
 To see a list of all costs which were measured use                         (analyze.py)
tasty post − i our example/results/costs.bin.
Times for HE Multiplication
             12
                      C Online
                      S Online

             10

              8
Time in ms

              6

              4

              2

              0
                  0              10        20          30               40                50   60   70   80
                                                                     bitlength

                                  Figure 13: Graph generated by TASTY(multiplication time.pdf )
You can also read