Robust P2P Primitives Using SGX Enclaves - Cryptology ...

Page created by Sara Joseph
 
CONTINUE READING
Robust P2P Primitives Using SGX Enclaves
Yaoqi Jia1,3       Shruti Tople1,4          Tarik Moataz2             Deli Gong1,5         Prateek Saxena1          Zhenkai Liang1
               1 NationalUniversity of Singapore {jiayaoqi, shruti90, gongdeli, prateeks, liangzk}@comp.nus.edu.sg
       2 Brown    University tarik moataz@brown.edu 3 Parity Technologies 4 Microsoft Research 5 Zilliqa Research

   Abstract—Peer-to-peer (P2P) systems such as BitTorrent and             Researchers have extensively worked in the byzantine model
Bitcoin are susceptible to serious attacks from byzantine nodes        to design solutions for fundamental P2P problems such as reli-
that join as peers. Due to well-known impossibility results            able broadcast and agreement among the peers [81], [65], [27],
for designing P2P primitives in unrestricted byzantine settings,
research has explored many adversarial models with additional          [52], [53], [25], [17], [18]. There are well-known impossibility
assumptions, ranging from mild (such as pre-established PKI)           results in the standard model of byzantine setting, such as
to strong (such as the existence of common random coins).              the inability to achieve reliable broadcast or agreement when
One such widely-studied model is the general-omission model,           over 13 of the network is byzantine [65], [81]. In a quest for
which yields simple protocols with good efficiency, but has been       efficient protocols that tolerate a larger fraction of malicious
considered impractical or unrealizable since it artificially limits
the adversary only to omitting messages.                               nodes, several failure models have been proposed which limit
   In this work, we study the setting of a synchronous network         the capabilities of the byzantine adversaries. For instance, one
wherein peer nodes have CPUs equipped with a recent trusted            such model is the general-omission model where the byzantine
computing mechanism called Intel SGX. In this model, we                node can only omit messages that are either sent or received by
observe that the byzantine adversary reduces to the adversary          it during the execution of a protocol [82], [79]. In this weaker
in the general-omission model. As a first result, we show
that by leveraging SGX features, we eliminate any source of
                                                                       adversarial model, it is possible to tolerate N2 adversarial
advantage for a byzantine adversary beyond that gained by              nodes and design relatively simple and efficient protocols for
omitting messages, making the general-omission model realizable.       reliable broadcast [82], [79], [40], [58]. However, many of
Second, we present new protocols that improve the commu-               these models make strong assumptions, which are not always
nication complexity of two fundamental primitives — reliable           realistic and have not had a concrete basis for implementation.
broadcast and common random coins (or beacons) — in the
synchronous general-omission model, by utilizing SGX features.
                                                                       Our approach. To this end, we study the possibility of using
Our evaluation of 1000 nodes running on 40 DeterLab machines           recent hardware-root-of-trust mechanisms for making previous
confirms theoretical efficiency claim.                                 adversarial models realizable in practical systems. We ob-
                                                                       serve that emerging hardware, specifically Intel SGX, provides
                       I. I NTRODUCTION                                stronger trusted computing capabilities, which allow running
                                                                       hardware-attested user-level enclaves on commodity OSes [7],
   Peer-to-peer systems such as BitTorrent [2], Symform [14],          [8], [45]. Enclaves provide hardware-isolated execution en-
CrashPlan [5], StorJ [13], Tor [15] and Bitcoin [1] are                vironment which guarantees that an application executing in
becoming popular among users due to ease of accessibility.             an enclave is tamper-resistant and can be attested remotely.
In such P2P systems, online users can simply volunteer as              Assuming that SGX-like capabilities become commodity and
peers (nodes) to join the network. However, this exact property        widescale in end hosts, we ask if it is feasible to build robust
allows adversarial or Sybil peers to be a part of the network          P2P protocols. Our main observation is that by leveraging
and exhibit a byzantine (malicious) behavior. The presence             the capabilities of such a trusted hardware, one can restrict
of byzantine adversaries is a major security concern in P2P            the behavior of byzantine adversaries to the general-omission
systems. For example, recently, researchers have demonstrated          model in synchronous networks [82], [79], [40], [58].
that in a popular cryptocurrency — Bitcoin — byzantine nodes              Specifically, we use four SGX features, i.e., enclave ex-
can collude to eclipse or partition the honest nodes leading to        ecution (F1), unbiased randomness (F2), remote attestation
double-spending and selfish mining attacks [59], [77]. Further,        (F3) and trusted elapsed time (F4). Based on these hardware
byzantine nodes in anonymous P2P networks can become                   features, we enforce six security properties (P1 - P6). First,
the entry and exit nodes of an honest node’s communication             we enforce execution integrity (P1), message integrity &
circuit, by advertising high-bandwidth connections and high-           authenticity (P2) and blind-box computation (P3) to restrict the
uptimes falsely [21]. These byzantine entry / exit nodes               attacker to not forge messages or deviate from the execution
can selectively deny service or severely weaken the core               of the given protocol. Thus, the adversarial node can only
anonymity properties of such systems as Tor, Cashmere and              delay, replay and omit messages. We further leverage lockstep
Hydra-Onions [30], [15]. In addition, byzantine nodes in the           execution (P5) and message freshness (P6) to reduce the adver-
network can selectively forge, divert, delay or drop messages          sarial model to the general-omission model, where byzantine
to disrupt the protocol execution. Therefore, designing robust         nodes have no additional advantage than omitting to send /
P2P protocols continues to be an important research problem            receive messages.In such model, P3 disallows the adversary
due to the attacks possible in a byzantine setting.                    to selectively omit messages based on the content. Lastly,
the halt-on-divergence (P4) allows us to detect and eliminate         Definition II.1. (Reliable Broadcast). A protocol for reliable
peers that selectively omit messages based on identities of           broadcast in synchronous settings satisfies the following con-
senders / receivers, thus in turn reducing round complexity and       ditions:
“sanitizing” the network. Leveraging these properties we can          • (Validity) If the sender is honest and broadcasts a message
further improve the efficiency of two fundamental protocols.            m, then all honest nodes eventually accept m.
We present efficient designs for reliably broadcasting messages       • (Agreement) If an honest node accepts m, then all honest
called Enclaved Reliable Broadcast (ERB) protocol and an                nodes eventually accept m.
unbiased common random generator called Enclaved Random               • (Integrity) For any message m, every honest node accepts m
Number Generator (ERNG) protocol. Both ERB and ERNG                     at most once, if m was previously broadcast by the sender.
primitives can be used as building blocks to solve a wide             • (Termination) Every honest node eventually accepts a mes-
range of problems in distributed systems, such as random                sage.
beacons [84], voting schemes [75], random walks [57], shared
                                                                        To define a common unbiased random number generator,
key generation [54], [55], cryptocurrency protocols [71] and
                                                                      we define the bias of any multi-variate function in a standard
load balancing protocols [46], [85].
                                                                      way [19].
Results. Our work targets synchronous network where every
machine is running an SGX-enabled CPU. Both of our pro-               Definition II.2. (Unbiasedness). Let G : {0, 1}k×N →
tocols asymptotically reduce the round and communication              {0, 1}k be a deterministic multi-variate function that maps
complexity as compared to previous works in the byzantine             N elements in {0, 1}k to one element in {0, 1}k . We define
model, and match with (or outperform) the results in general-         bias of G, β(G) below:
omission model. For a network of size N , the round and
                                                                                                                      
                                                                                                        E[S] EG [S] 
communication complexity for ERB are min{f + 2, t + 2}                         β(G) = max        max          ,          ,
                                                                                      S⊆{0,1}k         EG [S] E[S]
and O(N 2 ), where t / f (f ≤ t < N2 ) is the number of
byzantine peers / peers actually behaving maliciously for one           where EG [S] is the expected number of values in
execution of ERB. The communication complexity of the                 G(x1 , · · · , xN ) ∈ S, and E[S] = |S|
                                                                                                           2k
                                                                                                              , which is the expected
basic ERNG is O(N 3 ), and the optimized ERNG further                 value when the output of G is distributed uniformly at random.
reduces the complexity to O(N log N ). We have implemented            Definition II.3. (Common Unbiased Random Number). A
our solution and the source code is available online [10]. We         protocol G generates a common unbiased random number r
evaluate both ERB and ERNG, and our experimental results              among N nodes if it satisfies the following conditions with
match our theoretical claims.                                         high probability (w.h.p.):
Contributions. The main contributions of this paper are:              • (Agreement) At the end of the protocol, all the honest nodes
• Realizable General-Omission Model.        We leverage SGX             agree on the same value r.
  features to reduce byzantine model to general-omission              • (Unbiasedness) The bias of β(G) = 1.
  model, where byzantine nodes have no extra advantage than             For the analysis of protocols, we define the following
  omitting messages.                                                  complexities with respect to a single execution of the protocol.
• Better Synchronous P2P Protocols. By enforcing our prop-            • The message / communication complexity is defined as the
  erties, we can improve the efficiency of P2P protocols. As            total number of messages / bits transferred among all nodes
  the first attempt, we propose efficient protocols for reliable        in the worst case.
  broadcast (ERB) and unbiased random number generation               • The round complexity is defined as the number of executed
  (ERNG).                                                               rounds (or steps) in the worst-case.
• Security Analysis & Evaluation. We provide security anal-
  ysis and proof for our protocol constructions. Our experi-          B. Attacker Model
  mental evaluation confirms the theoretical expectations of             We consider a widely-studied standard synchronous model
  our solutions.                                                      of P2P systems [25], [52], [53], [17], [81], [18]. In this model,
                        II. P ROBLEM                                  our only new requirement is that every peer1 in the network
   Designing efficient solutions for P2P protocols in the byzan-      uses an SGX-enabled CPU to run the P2P protocols. In a
tine setting is a widely-recognized problem with limited solu-        network of N nodes, the number of byzantine nodes t is
tions [25], [52], [53], [17], [81], [18]. Our goal is to shed light   strictly bounded under a fraction of N2 . The number of peers
on how SGX can aid to improve efficiency of synchronous P2P           that actually behave maliciously for a particular execution of
protocols. In this work, we take two fundamental problems             the protocol is f (≤ t). Thus, a P2P network P is composed
as examples: 1) reliable broadcast and 2) common unbiased             of N peers P = {p1 , · · · , pN } such that N = 2t + 1.
random number generator.                                              Every peer pi in the P2P overlay has an identifier idi and can
                                                                      communicate with other peers using their ids. The underlying
A. Problem Definition
                                                                      TCP/IP substrate is assumed to provide reliable message
   In light of the previous works, we recall the standard defi-       delivery within a known bounded delay say ∆. Moreover,
nition of reliable broadcast [79], [40] and common unbiased
                                                                        1 We
random number [19] in the synchronous network:                                 use the word peer and node interchangeably in this work.
we consider a round-based synchronous model where each                   Algorithm 1:          Strawman distributed random number generation
round is equal to the time an honest node requires to send               protocol using reliable broadcast.
                                                                         Input: A P2P network P composed of N nodes, an initiator node idinit
a message and receive a response. Every peer is directly                 Output: A message m̂
connected to all other peers in the network and knows the           1         Initialization: m̂ ←⊥; Sm ← ∅; rnd ← 1
network size N . To summarize, we assume: the network size          2         upon self id is initiator:
is N (S1); the protocol starts synchronously (S2); the round        3         get(m) // m is a random number
                                                                    4         m̂ ← m
time is 2∆ (S3); the number of byzantine nodes is limited upto      5         add self id to Sm
N
 2 (S4); the peers are connected to each other (S5). This is a
                                                                    6         multicast INIT(m) to other peers
                                                                    7         for rnd ≤ t + 1 do
prominently used model in the previous literature of distributed    8               upon receiving INIT(m):
P2P systems [82], [79], [57], [17], [18], [19]. We discuss the      9               m̂ ← m
                                                                    10              add self id and sender id to Sm
validity of these assumptions in Appendix F in the extended         11              multicast ECHO(m) to other peers in round rnd + 1
version [60].                                                       12              upon receiving ECHO(m):
                                                                    13              if m̂ =⊥ then
Our Model using SGX. In our model, a byzantine peer                 14                    m̂ ← m
                                                                    15                    add self id to Sm
has a compromised or malware-ridden operating system but            16                    multicast ECHO(m) to other peers in round rnd + 1
executes protocols using SGX enclaves [7], [8], [45]. Enclaves      17              end
                                                                    18              if m = m̂ and sender id ∈   / Sm then
guarantee untampered execution in presence of malicious             19                    add sender id to Sm
underlying software or co-processes. The byzantine nodes can        20                    if |Sm | = N − t then
                                                                    21                         accept m̂
take arbitrary software actions as long as it does not violate      22                    end
SGX guarantees.                                                     23              end
                                                                    24              rnd ← rnd + 1
Scope. Our focus is showing how to leverage SGX features to         25        end
                                                                    26        if rnd > t + 1 then
improve the efficient of synchronous P2P protocols. Our model       27              accept ⊥
does not consider an adversary that can perform hardware            28        end

attacks and break SGX security guarantees. We do not aim
                                                                    Attacks by Byzantine Adversary. Byzantine initiator / peers
to prevent any information leakage through side-channels
                                                                    can tamper with the execution of Algorithm 1 and forge the
such as pagefaults, memory accesses or timing attacks to
                                                                    values of INIT and ECHO messages to perpetrate the following
which SGX-enabled CPUs are known to be susceptible [90],
                                                                    attacks.
[74], [66]. Indeed these problems are under investigation and
recent research shows that defending against them is feasible.      A1 (Execution Deviation): For this attack, an adversary
Existing solutions against these problems can directly apply        deviates from the control flow of the running program for
to our work [86], [78], [72].                                       the given protocol. The adversary can disregard essential
                                                                    conditions to jump to the desired instructions and execute
                                                                    them directly. For example, the adversary can skip all the
C. Strawman Solution & Attacks                                      conditions like Line 7 & 13 to directly multicast its ECHO
   Consider a strawman protocol for distributed random num-         value to parts of honest nodes but not all of them, to introduce
ber generation using reliable broadcast, where the initiator        equivocation to their final decisions. Moreover, the adversary
broadcasts a random number m using an initialization message        can also repeat particular instructions to obtain an output she
INIT to all the peers in a synchronous network (shown in            wants. For instance, if m is generated from a random source
Algorithm 1). If m is generated randomly and unbiasedly as          without being tampered during the execution of the protocol,
well as reaches every honest node without being tampered,           an unbiased common random number can be agreed among
then all honest nodes will agree on the common unbiased             all the peers in the network. A byzantine peer, however, can
random number m and the goal of the protocol is achieved.           repeat the step that generates m (Line 3) from the random
In Algorithm 1, upon receiving the INIT message, each peer          source until it returns a favorable random number. Hence, the
further multicasts an ECHO message to all other peers. After        output is biased as per Definition II.3.
receiving the ECHO messages from the majority of nodes,             A2 (Message Forgery): Suppose that the adversary does
each peer accepts m as the final message m̂. Note that if           not deviate from the execution of the given protocol, she
the initiator is honest, all honest nodes receive the message       can still alter the data flow (including input / output and
INIT during the first round and multicast ECHO messages             intermediate states) of the program to forge messages. As per
at the beginning of the second round. In the second round,          Definition II.1, a reliable broadcast protocol requires that if
every honest node receives at least N − t ECHO messages             one honest node accepts message m then all honest nodes
from N − t honest nodes and maybe some byzantine nodes.             accept m. The adversary can tamper with the INIT and ECHO
Thus, after two rounds, every honest node will output the same      messages to violate this agreement property of the protocol.
value m from the initiator, which satisfies all the conditions of   A byzantine initiator colluding with other byzantine peers
reliable broadcast in Definition II.1. However, we show how         in the network can tamper with Line 6, 11 and 16 in the
a byzantine initiator and other byzantine peers can attack this     algorithm such that some honest nodes receive most ECHO
protocol to violate Definitions II.1 and II.3.                      messages with m0 while others with m. This results in a
fraction of honest nodes assigning m̂ with m0 and accepting                 Peers                                                 Peerr
m0 , while other honest nodes accept m as the final output,                              Enclaves)                  Enclaver)
thereby causing inconsistency in the network.
                                                                                              Write    Transfer    Read
A3 (Selective Omission): Assume that the adversary does not                                 OSs)                      OSr)
deviate from the control flow (i.e., the execution) of the given
protocol or tamper with the data flow to forge messages, she         Fig. 1: Each peer consists of two entities: an Enclave and an OS. The OS
                                                                     models the operating system and memory. The Enclave models the isolated
can still omit, delay and replay messages in this restricted         memory and the secure execution of a program. The sender Enclaves can
model. For an omission attack, it has two types: one is based        send a message via a secure channel to the receiver Enclaver . The grey areas
on the content of the transmitted message and the other is           are secure against malicious OSes of byzantine nodes.
dependent on the identity of the sender / receiver. For the          A. SGX Features and Security Properties
first type, the adversary can observe its generated or received
random number m and selectively decide to drop or forward                We first start by recalling Intel SGX features which can also
it to other nodes based on its value, which introduces a bias        be provided by other trusted hardware.
in the final output for the honest nodes. For example, if the        F1: Enclaved Execution - SGX supports hardware-isolated
adversarial peers receive or initiate a message m, which is          memory region called enclaves such that a compromised un-
not the favorable one, they can omit to relay the message to         derlying OS cannot tamper the execution of the code running
the other nodes, thus all honest nodes may finally agree on ⊥        inside this enclave.
instead of m. Further, to violate the agreement condition in         F2: Unbiased Randomness - SGX provides a function
Definition II.1 and II.3, the adversary can selectively decide       sgx_read_ rand that executes the RDRAND instruction to
to omit the message m depending on whether the destination           generate hardware-assisted unbiased random numbers.
peer is honest or malicious. It can broadcast m correctly to a       F3: Remote Attestation - SGX allows a remote party to verify
few honest nodes and not send the message to the others for          that an application is running in an enclave on an SGX-enabled
the last round. The honest nodes receiving m can multicast m         CPU.
to the others, but the others will not accept it as the execution    F4: Trusted Elapsed Time - SGX provides a function
ends. Thus, the honest nodes that do not receive a message           sgx_get_tr usted_time that returns a trusted elapsed
will agree on ⊥ while others will agree on m.                        time in seconds relative to a reference point.
A4 (Message Delay): Alternatively, to generate an unbiased               Abstractly, a peer can be considered as the composition of
common random number, every peer can broadcast its random            two entities: an OS and an Enclave as shown in Figure 1. The
number to all other peers using Algorithm 1. All peers can           OS models the untrusted entity including the operating system
then XOR the random numbers in the final set to generate             and memory. It has access to all the system resources such as
the output. To bias this final output, a byzantine peer can          file system and network. The OS can arbitrarily invoke an
intentionally hold its random number until it receives inputs        enclave program and start its execution. The Enclave models
from all other honest peers [19]. In this way, the adversary         the isolated memory space that loads the program and executes
can “look ahead” in the protocol, calculate the final output         it securely. Thus, Enclave corresponds to the trusted entity of
and then decide whether to participate in the protocol by            a peer. We illustrate how to enforce P1 - P6 properties using
sending its random number. If the final random number already        SGX features to thwart A1 - A5 attacks.
favors the adversary then it does not participate in the protocol,   P1 (Execution Integrity): With remote attestation (F3), an
otherwise it sends its message to all the peers. Note that, for      enclave in one peer can verify the correctness of the running
t < N2 , all the byzantine adversaries can collude to introduce      program for the given protocol on the other nodes and whether
an exponential bias in the final value.                              it is executing on a valid SGX-enabled CPU or not. Moreover,
A5 (Message Replay): In the restricted model, the adversarial        F1 ensures that the execution in an enclave cannot be tampered
node can use a message mprev from an instance of the                 with by the OS. F1 and F3 together enforce the execution
protocol running in parallel, or which was run in the past to        integrity against A1. Hence, an adversary cannot deviate from
one (or more) honest node(s) and forward the correct message         the execution of the protocol in an enclave arbitrarily by
m to other honest nodes [69]. This results in an inconsistency       skipping / repeating instructions to violate the control flow
where few honest nodes agree on mprev and others agree on            of the running program.
m, thereby violating the agreement condition.                        P2 (Message Integrity & Authenticity): In designing our
                  III. S OLUTION OVERVIEW                            protocols, we first perform a setup phase where each peer
                                                                     connects to every other in the network and then performs a
   In this work, our ultimate goals, with the help of SGX
                                                                     series of steps. Analogous to P1, every enclave first uses F3
features, are twofold: 1) reducing the byzantine model to the
                                                                     to verify the correctness of the protocol executing on other
general-omission model; 2) achieving improvement for the
                                                                     peers. Next, they generate public / private key pairs inside the
efficiency of synchronous P2P protocols (e.g., lower commu-
                                                                     enclaves and exchange the public keys with each other. Then
nication / round complexity). In this section, we put forward
                                                                     all the messages transmitted between any two enclaves can
ideas using SGX features to enforce six security properties to
                                                                     be signed to ensure the integrity and authenticity against A2.
restrict the capabilities (A1 - A5) of a byzantine adversary, as
                                                                     Moreover, the internal states of the program are also protected
shown in Section II.
using F1. Therefore, the integrity of all messages including        P5 (Lockstep Execution): F4 allows us to realize a syn-
input / output / intermediate states is guaranteed. In this case,   chronized network across all rounds of a protocol. Each peer
it is clear that an adversary cannot forge valid messages to        uses F4 to decide the correct value of the ongoing round
bias the honest nodes to make inconsistent decisions.               and inserts this round number in all the sent messages. To
P3 (Blind-box Computation): F1 ensures that all interme-            detect delay attacks (A4), a peer simply matches the round
diate states of the protocol’s computation are hidden from          number present in an incoming message with the current
the OS. Leveraging F2, the provided randomness is also              round number. This defense is hard in the byzantine model
hidden from the OS. This guarantees that the input state is         with public-key infrastructure even if it supports F1, since the
hidden along with the intermediate states of the protocol’s         OS can tamper with the relative time to either increase or
execution We say in this case that the computation is a blind-      decrease the rounds of a node. Therefore, having access to a
box computation. As the adversarial node does not know the          trusted elapsed time functionality allows to perform lockstep
random number and given that the output of the computation is       execution and detect delay attacks in the restricted model.
encrypted between the Enclave and the OS, she cannot selec-         P6 (Message Freshness): Similar to [69], we use sequence
tively omit or drop messages based on their contents. Note that     numbers to ensure message freshness and therefore defend
an important part of instantiating such a blind-box computation     against replay attacks (A5). The main challenge lies in en-
is the ability to instantiate a secure channel between two or       suring secure exchange of the initial sequence numbers for
more enclaves. In fact, enclaves can agree on a shared key to       each peer and ensuring that the sequence number remains
establish a secure channel using Diffie-Hellman key exchange.       untampered with during the entire intermediate states of the
Nodes can then encrypt all the messages (including program’s        protocol execution. Using the secure channel, the peers se-
intermediate input / output) transmitted between each other         curely exchange a nonce or a sequence number, which is
to provide confidentiality against malicious OSes. Note that,       incremented sequentially by the peer. The nonce is generated
establishing such a shared key in the enclaved setting is           using F2 supported by SGX. This prevents the malicious
slightly weaker than the standard byzantine model, as the           adversary from tampering the initial nonce value to its own
malicious operating system cannot access the shared secret          advantage. Note that the keys and initial sequence numbers
keys and decrypt the exchanged messages due to F1. With             exchange occur only once during the setup phase. If an
P1 - P3, we can reduce the byzantine model to a restricted          adversarial node restarts or relaunches its enclave, all the data
model, where an adversarial node can only replay, omit and          in the enclave will be removed. Since the enclave does not have
delay messages.                                                     the valid sequence number and round number, it cannot re-join
P4 (Halt-on-Divergence): To mitigate selective omission             the same or any on-going execution, which is equivalent to be
based on nodes’ identities (A3), we enforce a security mech-        considered as a new node for the protocol.
anism called halt-on-divergence. This property halts any ma-        B. Overview of Our Results
licious node deviating from the protocol under some given              In this work, we achieve the results below.
condition. As an instance, if an adversarial node sends a mes-      R1: By enforcing (P1 - P6), we reduce the byzantine model
sage, but does not receive adequate responses, it will be forced    to the general-omission model.
to leave the current protocol execution. Halt-on-divergence            By enforcing P1 - P3, we first reduce byzantine model to
mechanism should be incorporated through a specific ac-             a restricted model, in which byzantine nodes can only delay
knowledgment protocol instantiation in such a way that every        / omit / replay messages. Due to space constraints, we defer
malicious node will be forced to leave if the acknowledgment        the formalization and proof to Appendix A [60]. We believe
is not verified. In particular, we introduce an acknowledgment      that the formalization, while based on traditional cryptographic
scheme where every receiver acknowledges the sender on              primitives, provides a new conceptual framing of SGX-enabled
receiving every valid message. A message sent over a secure         CPUs security features, and may be of independent interest.
channel is considered valid only if it contains the expected        By applying P5 and P6, we further confine the adversarial
sequence and round number. Naturally, an acknowledgment is          nodes into the general-omission model in the synchronous
not sent for a replayed, omitted or delayed message. Since          setting.
all honest receivers will reply with acknowledgment (ACK)           R2: We propose an efficient reliable broadcast protocol
messages on receiving valid messages, an honest sender should       (ERB) with early stopping, which improves communication
at least receive t + 1 ACK messages. Any node receiving less        complexity from O(N 3 ) to O(N 2 ) (refer to Section IV).
than t + 1 ACK messages will halt its execution and leave the          For this result, we leverage four properties. First, P1 -
network.                                                            P3 ensure that the adversarial nodes cannot forge messages
   The key idea here is to penalize any deviating adversary         and deviate from the execution of the protocol. Second, we
by churning the node out of the network. This effectively           leverage P4 to show that ERB can broadcast a message to
“sanitizes” the network. Thus, to remain a part of the network,     the entire network in min{f + 2, t + 2} rounds with better
every peer should send valid messages to the majority of the        performance than the classical approach by [82]. (Details can
network. This property also aids honest nodes in the protocol       be found in Table I [60].)
to decide the final output early and finish the execution           R3: We propose a new unbiased random number generation
immediately.                                                        protocol (ERNG) with communication complexity O(N 3 ) for
the basic version, or O(N log N ) for the optimized one (refer     value 1. The overhead of the setup is in O(N 2 ) while the
to Section V).                                                     storage overhead per node is in O(N ).
   With P3 and P5, our unoptimized ERNG solution directly          Initialization Phase: An initiator node first multicasts the
runs our ERB protocol as a sub-routine on the entire network       value val = hINIT, idinit , seqinit , m, rnd)i, where seqinit is the
to agree on a random number generated using F2. It has round       sequence number of the initiator node, and rnd is the round
and communication complexity of O(N ) and O(N 3 ), respec-         number. The round rnd is first initialized to 1, the enclave
tively. The ERB protocol is executed within this small cluster     will now increment the rnd after every 2∆ seconds—we take
to generate the final unbiased random number. The round and        advantage of the elapsed time feature of SGX to tie a round
communication complexity of this optimized ERNG is further         to an interval of 2∆ seconds.
reduced to O(log N ) and O(N log N ).                              Echo Phase: Until round t + 2, if a node receives an INIT
                                                                   or ECHO message for the first time, it performs the following
    IV. E NCLAVED R ELIABLE B ROADCAST P ROTOCOL                   actions: (1) start the local clock and initialize the round rnd
   We propose an enclaved reliable broadcast (ERB) in the          to 1, the round will increment every 2∆ seconds, (2) if both
synchronous model using SGX features. The transmitted mes-         rnd and seq are consistent with the expected values, it will
sage, val, between any two peers has the format: val :=            store the message m, else it just ignores it and treats it as an
htype, id, seq, m, rndi, where type ∈ {INIT, ECHO, ACK} and        omitted message. If there is no delay or replay detected, then
rnd represents the current round of the ERB protocol. If           it multicasts an ECHO message to all nodes at the end of the
type = INIT, then the initiator peer idinit is initiating the      current round. If the node has already received a valid ECHO
broadcast by sending the message m with sequence number            message from a distinct node, it will only add the sender’s
seqinit at round rnd. If type = ECHO, it means that its sender     identifier into the set Secho . Recall that at the end of the setup
knows that idinit has sent m, as it has already received either    phase, all honest nodes have the same copy of the sequence
a value with INIT or ECHO for the first time. Finally, if          number of all honest nodes. After every valid instance of the
type = ACK, it means that the peer acknowledges that it has        protocol, nodes will increase all sequence numbers by 1.
already received either INIT or ECHO values from the sender.       Decision Phase: If the node has received at least t + 1 correct
   We introduce three functions Halt, Multicast and Wait           ECHO messages from distinct nodes, i.e., |Secho | = t + 1, then
below:                                                             the node accepts m̂. After t + 2 rounds, if the node has not
• Halt(st): is a function that sets the state st to ⊥.
                                                                   received adequate distinct ECHO messages, it accepts m̂ :=⊥.
                     
• Multicast idi , val : is a functionality that multicasts the
                                                                   Every multicast requires the node to receive at least t + 1
   value val from the sender pi to the receiver pj , for all       ACK messages, else the node churns out itself using the Halt
   j ∈ [N ] \ {i}.                                                 function.
• Wait(τ ): is a function that has as an input the current            Algorithm 2 has a worst-case round complexity equal to
   elapsed time τ in the ongoing round, and suspends the           t + 2 with communication complexity in O(N 2 ) and t < N2
   protocol for (2∆ − τ ) seconds.                                 byzantine nodes.Due to space limitation, we defer the detailed
   Note that Halt function enforces the halt-on-divergence         proof to Appendix B the extended version [60].
property (P4) that we have introduced in Section III. When the
state of the node is set to ⊥ the node halts on-divergence and           V. E NCLAVED R ANDOM N UMBER G ENERATION
is ejected from the P2P network P. For the sake of exposition,        We present our algorithm that generates an unbiased com-
we write Wait(rnd) in the code description, we say in this case    mon random number called enclaved random number gener-
that the protocol waits until the end of the round rnd.            ation (ERNG).

A. ERB details                                                     A. Unoptimized ERNG
   Prior to running the very first instance of the ERB protocol,   We detail our unoptimized ERNG in Algorithm 3. At a higher
there is a setup phase. The setup is performed whenever the        level, every node generates a random number from the enclave,
program (ERB) needs to be updated or changed. We detail            and then performs ERB protocol to broadcast to every node.
the setup phase followed by the explanation of our algorithm.      All honest nodes in this case will receive the random numbers
Setup Phase: Every pair of sender and receiver peer use            from all honest nodes after t + 2 rounds, and may eventually
remote attestation (F3) along with enclaved execution (F1) to      receive several random numbers from other byzantine nodes.
verify the correctness of the execution, and therefore enforcing   According to the validity requirement, for each ERB instance,
P1 - P3. Then they establish a secure channel using Diffie-        every honest node will accept a random number from its
Hellman key exchange. This setup enforces P1 - P3, which           initiator or ⊥ so that all honest nodes have the same final set
restricts the byzantine nodes to only omit, replay and delay       Sfinal of random numbers. By performing exclusive disjunction
messages. Next, each peer picks at random a sequence number        (or XOR) of all received random numbers, every honest
                      $
such that seqs , seqr ← {0, 1}k and send it to each other. That    node obtains an unbiased common random number eventually.
is, every node has to store the sequence numbers of all other      Detailed unbiasedness and randomness analysis can be found
nodes in P. Finally, every node sets the variable rnd to the       in Appendix C [60].
Algorithm 2: ERB: Enclaved reliable broadcast protocol (for a node                Algorithm 4: ERNG: Enclaved unbiased random number generation
 idi with the initiator idinit sending a message m and a sequence number           protocol executed by peer pi .
 seqinit ).                                                                        Input: A P2P network P composed of N nodes
                                                                                   Output: A unbiased random number r
 Input: A P2P network P composed N nodes, a message m and a sequence
        number seqinit for the initiator idinit                                      •   initialization: SM ← ∅; Sfinal ← ∅; Schosen ← ∅; rnd ← 1
 Output: A message m̂                                                                    for rnd ≤ γ + 4 do
                                                                                                if rnd = 1 then
   •   initialization: m̂ ←⊥; Secho ← ∅; rnd ← 1                                                                                 $          N
   •   upon idi = idinit and sti 6= ⊥:                                                           •   every peer pi compute ri ← {0, · · · , 2γ − 1};
       m̂ ← m;                                                                                       if ri = 0 then
       Secho ← Secho ∪ {idinit };                                                                         Multicast(idi , val), where
       Multicast idinit , hINIT, idinit , seqinit , m, rndi ;                                              val = hCHOSEN, idi , seqi , ⊥, 1i;
   •   for rnd ≤ t + 2 do                                                                                  Schosen ← {idi };
                                                           0                                         end
          • upon receiving hINIT, idinit , seq, m, rnd i from idinit :
              if rnd0 = rnd and seq = seqinit then                                               •   upon receiving val = hCHOSEN, idj , seqj , mj , rndj i
                    send hACK, idinit , seq, H(m), rndi to idinit ;                                  if type = CHOSEN and rndj = 1 and seqj = seqi then
                    m̂ ← m;                                                                                Schosen ← Schosen ∪ {idj };
                    Secho ← Secho ∪ {idinit } ∪ {idi };                                             end
                    Wait rnd then                                                              end
                                                                                               if ri = 0 and rnd = 2 then
                                                                      
                    Multicast idi , hECHO, idinit , seq, m, rnd + 1i ;                     •
                                                                                                                  $
              end                                                                                    compute ri0 ← {0, · · · , γ 0 − 1};
                                                              0
          • upon receiving hECHO, idinit , seq, m, rnd i from peer idj :                             if ri0 = 0 then
              if rnd0 = rnd and seq = seqinit then                                                                                         $
                    send hACK, idinit , seq, H(val), rndi, where                                          initiate ERB with inputs mi ← {0, 1}k , seqi and peers in
                    val = hECHO, idinit , seq, m, rndi to peer idj ;                                      Schosen ;
                                                                                                     end
                    if m̂ =⊥ then
                          m̂ ← m;                                                                    seq0j ← seqj , for all idj ∈ Schosen ;
                          Secho ← Secho ∪ {idi };                                             end
                          Wait rnd then                                                        if ri = 0 and 3 ≤ rnd ≤ γ + 2 then
                                                                                                    execute ERB instances and wait for the output;
                          Multicast idi , hECHO, idinit , seq, m, rnd + 1i ;
                                                                                               end
                    end
                                                                                               if ri = 0 and   rnd = γ + 3 then
                    if idj ∈/ Secho then
                          Secho ← Secho ∪ {idj }                                                     Wait rnd then obtain Mi = {m̂1 , · · · , mˆli };
                          if |Secho | = N − t then                                                   seqj ← seq0j , for all idj ∈ Schosen ;
                                accept m̂;                                                     end
                          end                                                                  if rnd = γ + 4 then
                    end                                                                           • if ri = 0 then
              end                                                                                          SM ← SM ∪ {Mi };                                
          • upon Multicast(idi , val):                                                                     Multicast idi , hFINAL, idi , Mi , seqi , γ + 4i ;
              send val to idk , for all k ∈ [N ] \ {i};                                              end
                                                                                                                                               0
              receive Nack acknowledgements hACK, idinit , seq, H(val), rnd0 i,                   • upon receiving val = hFINAL, Mj , seqj , rndj i:
              where rnd0 = rnd and seq = seqinit ;                                                   if rndj = γ + 4 and seq0j = seqj then
              if Nack < t then                                                                             SM ← SM ∪ {Mj };
                    Halt(sti ) ;                                                                           if # of Mκ ≥ γ + 1 where Mκ ∈ SM then
              end                                                                                                Sfinal ← MκL ;
          • rnd ← rnd + 1;                                                                                       accept r =     v∈Sfinal v.
       end                                                                                                 end
   •   if |Secho | < N − t then                                                                      end
              m̂ ←⊥;                                                                           end
              accept m̂;                                                                       rnd ← rnd + 1;
       end                                                                               end
   •   seqinit ← seq + 1;                                                            •   seqj ← seqj + 1, for all j ∈ [N ];

                                                                                  intuition behind our optimization can be formulated as follows:
 Algorithm 3: Unoptimized-ERNG: Unoptimized enclaved unbiased
                                                                                  we notice that if we select uniformly at random nodes to
 random number generation protocol executed by peer pi .
 Input: A P2P network P composed of N nodes
                                                                                  constitute a representative cluster in P, we note that we can
 Output: A unbiased random number r                                               still guarantee w.h.p. an honest majority within this smaller
   •   initialization: Sfinal ← ∅; rnd ← 1                                        representative cluster. By leveraging F2 to generate a random
       for rnd ≤ t + 2 do
          • if rnd = 1 then
                                                                                  number and blind-box computation (P3), we can sample a
                                                $
                 initiate ERB with inputs mi ← {0, 1}k and seqi ;                 set of peers forming the representative cluster. The main
            end                                                                   remaining question, therefore, is how big this cluster should
            if 2 ≤ rnd ≤ t + 2 then
                 execute ERB instances and wait for the output                    be. As a starting point, note that if the cluster size is equal to
                 (Mi = {m̂1 , · · · , mˆli });                                    2N
            end                                                                    3 , the probability of having an honest majority is equal to
            rnd ← rnd + 1;                                                        one. Therefore, this remark already suggests that the cluster
       end
   •   Sfinal ← Mi ;
                                                                                  size can be smaller. Conceptually, the optimized ERNG can
       seqj ← seqL j + 1, for all j ∈ [N ]                                        be decomposed into three main steps:
       accept r =            v.
                     v∈S final                                                    Cluster Selection: The purpose of this step is to construct a
                                                                                  representative cluster of the entire P2P network. The cluster
B. Optimized ERNG
                                                                                  will consist of nodes selected uniformly at random from P. At
Next, we illustrate the main steps for ERNG with optimiza-                        round 1, every node picks uniformly at random a number from
tions. In this section, we consider that at most t ≤ N3                                       N
                                                                                  {0, · · · , 2γ − 1} using SGX (F2). This operation is protected
nodes of the network can be byzantine. ERNG terminates                            leveraging property P3 in such a way that the computation is
after γ + 4 rounds, where γ is a statistical parameter. The                       hidden from the OS. If the random number equals 0, then the
node is chosen to be part of the cluster, and then it multicasts      therefore run O(γ) Multicast functions. The communication
a CHOSEN message to all nodes in P. Upon receiving the                complexity of this first step is O(γ 2 ). Then, among this first
CHOSEN message, every chosen node adds the identifier of              representative cluster, a second cluster will be composed such
the sender to its own set Schosen . The size of the set Schosen       that all nodes of this cluster will run each an ERB instance.
                                                                                                                            √
represents the size of the cluster.                                   If the size of the second representative cluster is O( γ), then
                                                                                                                                √
ERB Instances: We first detail a pseudo-solution and then             the communication complexity of this step is O(γ 2 · γ).
detail our main construction in Algorithm 4. In round 2, the          Finally, the member of the first representative cluster will
nodes constituting the cluster will each generate a random            multicast the output of the ERB instances to all peers in P.
number and broadcast it only to the nodes constituting the            The communication complexity of this final step is O(N · γ).
cluster (i.e., peers’ identifiers in Schosen ). That is, every node   That is, overall, the communication complexity of ERNG
                                                                                             5
in the cluster will run an independent ERB instance. The              equals O(N · γ + γ 2 ). If N is large such that it verifies
intuition behind these multiple instances is the following: for       γ ∈ o(N ), then we can set γ ∈ O(log N ). In this case, the
the broadcast to be effective, at least one broadcast instance        communication complexity and round complexity of ERNG
has to succeed in that the accepted message is different from         are equal to O(N log N ) and O(log N ).
⊥. However, the complexity of such solution is cubic in                                      VI. E VALUATION
O(|Schosen |3 ) which can be a handicap in term of efficiency.
                                                                      Implementation. We have implemented a prototype of ERB,
As a solution, we incorporate a two-phases clustering. The
                                                                      unoptimized ERNG and ERNG in C/C++ using Intel SGX’s
idea behind this choice is the following: in order to generate a
                                                                      Linux SDK [8]. The implementation contains 4030 lines of
random number we only require one honest node to output a
                                                                      code (LOC) measured using CLOC tool [4]. Our prototype
random number r (otherwise the ERNG protocol may output
                                                                      implementation is open source and available online [10]. We
⊥). We can then proceed to select just a few number of nodes
                                                                      re-use the ported OpenSSL library including cryptographic
to perform the ERB protocol. As long as at least one of
                                                                      utilities (libcrypto available with Intel SDK), to perform
these nodes is honest, the correctness of our ERNG holds.
                                                                      Diffie-Hellman key exchange and AES encryption/decryption.
Concretely, to generate the second representative cluster, we
                                                                      We use boost [3] library to implement the communica-
perform the following: from nodes in Schosen , we uniformly
                                                                      tions between any two nodes and use Google protobuf
pick at random a value from {0, · · · , γ 0 − 1}, where γ 0 is a
                                                                      libraries [11] and rapidjson [12] to serialize transferred data.
parameter in function of γ that verifies γ 0 ≤ γ. The peers
                                                                      Experimental Setup. We use the DeterLab network testbed
that output a random number equal to zero will be the only
                                                                      for our experiments [6]. It consists of 40 servers running
peers able to initiate the ERB protocol. We will show that this
                                                                      Ubuntu 14.04 with dual Intel(R) Xeon(R) hexacore processors
strategy will greatly decrease the communication complexity
                                                                      running at 2.2 GHZ with 24 cores and 24 GB of RAM. All
and defer its analysis to Appendix D [60]. Note that this phase
                                                                      machines are connected and share the same link with the
lasts for γ + 2 rounds when all ERB instances terminate.
                                                                      bandwidth of 128MBps. Every node in our protocol takes up
Selection Decision: At the end of the broadcast phase, the            to 1 - 800 MB memory which limits the maximum number
node of the clusters will have each a set containing eventually       of nodes to 210 in our experiments. Since the trusted elapsed
several random numbers. Note that, as ERB is a reliable               time (F4) is not supported by all platforms yet and we need to
broadcast primitive, we know that all honest peers in the             run multiple nodes on each machine, we use SGX simulation
cluster will have the same set of random numbers. Once a              modefor our program and use a simulated Intel attestation
node in P receives at least γ + 1 sets of random numbers,             service (IAS).
Mκ , originating from the nodes in the cluster, it will output        Evaluation Methodology. To evaluate the correctness of
the set Mκ as Sfinal . All honest nodes will output the same set      our protocols, we measure the round complexity (time to
under the assumption that there is a majority of honest nodes         terminate) and communication complexity (network traffic) for
in the cluster. Finally, the random number equals the XOR             ERB, unoptimized-ERNG and ERNG, by varying the number
value of all random numbers in Sfinal .                               of nodes from 22 to 210 . We have highly optimized our system
   Due to space limitation, we defer the proofs for the Lemma         to handle dynamic ports allocations to handle a larger number
and Theorems below to Appendix D [60].                                of nodes within one machine (order of 25 nodes per machine).
Lemma V.1. If up to t = N3 nodes are byzantine, then with             Part of our results reported in this section are for the optimistic
at least 1 − negl(γ) probability, the representative cluster has      case where all nodes behave honestly. We evaluate the round
more than γ honest nodes, and less than γ byzantine nodes.            complexity of ERB while varying the number of byzantine
Theorem V.1. Agreement: All honest nodes eventually agree             nodes in the network up to 14 of the entire network composed
on the same common set Sfinal in ERNG.                                of 512 nodes. We also compare our experiment results for the
                                                                      traffic size with theoretical ones to verify if they match our
Theorem V.2. Unbiasedness: The output of the ERNG pro-                asymptotic analysis.
tocol is an unbiased random number.
                                                                      A. ERB Evaluation
ERNG Performance Analysis. Note that in ERNG, O(γ)                    Honest Termination: Constant Scalability. Determining
nodes will be chosen to form the first representative cluster and     the termination of ERB is essential to validate our reliable
100
                                          One Round                                                  100
                                       ERB termination
                                                                                                                                   One Round
                                                                                                                               ERNG termination

                                                                                                                                                                                    100
                                                                                                                                                                                                             N=512
    Time (s)

                                                                                          Time (s)

                                                                                                                                                                         Time (s)
                       10
                                                                                                      10

                                                                                                                                                                                     10

                        1                                                                              1
                             2   23     25        27          29        211                                2         23          25            27        29                               1/512 1/256 1/128 1/64 1/32 1/16             1/8    1/4
                                         Number of peers                                                                       Number of peers                                                         Byzantine peers (fraction)

(a) Termination of ERB slightly increase with the  (b) Termination time of ERNG in function of the (c) Time termination of ERB linearly increase with
number of peers.                                   number of nodes in P.                           the number of byzantine nodes in P.
Fig. 2: Termination time in seconds for ERB, both unoptimized and optimized versions of ERNG in honest and byzantine network with different fractions.
                       300                                                                           10000                                                                          100

                                                                                                                                                                                                    Ex
                       250                                                                           1000                                                                                           Th

                                                                                                                   Ex-ERNG-0
                       200                                                                            100          Ex-ERNG-1
       Size (MBytes)

                                                                                    Size (MBytes)

                                                                                                                                                                   Size (MBytes)
                                                                                                                   Th-ERNG-0
                                                                                                                   Th-ERNG-1
                       150                                                                             10

                       100                                                                                 1
                                       Ex
                                       Th                                                              0.1
                        50

                         0                                                                            0.01                                                                          10
                             2    23        25          27         29         211                              2          23        25              27        29                          1/256   1/128     1/64    1/32     1/16    1/8     1/4
                                            Number of peers                                                                    Number of peers                                                            Byzantine peers (fraction)

(a) Communication of ERB in function of the (b) Communication of ERNG in function of the (c) Communication of ERB in function of different
number of nodes in P.                              number of nodes in P.                             byzantine peers in P.
Fig. 3: (Th) theoretical and (Ex) experimental comparisons off network overall communication bandwidth in MB for ERB, both unoptimized (ERNG-0) and
optimized (ERNG-1) versions of ERNG in honest and byzantine network with different fractions.

broadcast primitive. Fig. (2a) shows that the termination time,                                                                       values of the number of peers in the network did not allow
in the case of an honest initiator, is nearly equal to twice the                                                                      us to optimally select a cluster size that can guarantee w.h.p.
value of one round. This validates our theoretical results where                                                                      the agreement. In this case, we fix the cluster to be 32 of the
we show that ERB finishes in 2 rounds when the initiator is                                                                           network and we show that the traffic size decreases and has
honest. The small increase at 28 is purely due to the bandwidth                                                                       a 60% improvement over the unoptimized one. Note that this
bottleneck of our testbed, as the nodes share the same link.                                                                          result can get much better for a larger number of peers in
Traffic Size: Quadratic Scalability. Fig. (3a) demonstrates                                                                           realistic settings. Here, we draw our theoretical curve for the
that the communication complexity quadratically increases in                                                                          ideal evaluation which can be guaranteed only for larger N .
function of the number of peers in P (note that the x-axis                                                                            C. Byzantine case
is logarithmic). The message size of INIT and ACK is around
100 Bytes and 80 Bytes, respectively. For 1024 nodes in P, the                                                                           In Fig (2c), we show that the termination time of ERB
traffic size equals 277 MB. We show that this result matches                                                                          linearly increases with the number of byzantine nodes behav-
our theoretical expectation.                                                                                                          ing maliciously in the current instance. We gradually increase
                                                                                                                                                                               1
                                                                                                                                      the fraction of byzantine nodes from 512    to 14 . As a strategy
B. ERNG Evaluation                                                                                                                    of byzantine nodes, we have taken into consideration the
Honest Termination: Limited Scalability. We show in                                                                                   worst-case where byzantine nodes create a chain (a byzantine
Fig (2b) that ERNG termination remains slightly constant                                                                              sends its message to only one byzantine node each round
from 22 to 27 and then increases afterwards. Unfortunately,                                                                           and then gets eliminated) in order to delay the termination
this does not reflect our theoretical findings and this is mainly                                                                     as much as possible. In the case of 14 byzantine fraction,
due to the limitation of our testbed, namely, the upper bound                                                                         the ERB termination takes 389 seconds while it only takes
on the communication link of 128MBps that all nodes have                                                                              4 seconds in the honest case. For traffic size, if the number
to share. For small values of peers N , the communication                                                                             of byzantine nodes increases, the communication complexity
complexity of the unoptimized ERNG is cubic in N , while the                                                                          of ERB decreases as shown in Fig. (3c). This is mainly due
optimized version is also (nearly) cubic for smaller values of                                                                        to the halt-on-divergence property that will eject the nodes
N . Given a fixed bandwidth, this explains why the termination                                                                        whenever it behaves maliciously. That is when an honest
increases for larger values of N to reach 103 s for one instance                                                                      node multicasts a message, the eliminated byzantine node
of ERNG.                                                                                                                              will not acknowledge this message which greatly reduces
Traffic Size: Cubic Scalability. Fig. (3b) demonstrates that                                                                          the communication complexity. For example, for 14 byzantine
the communication complexity cubically increases in function                                                                          fraction in a 512-node network, the traffic size equals 35 MB,
of the number of peers in P for the unoptimized ERNG. Our                                                                             while in an honest node, it is equal to 69 MB, a 50% decrease.
theoretical results back up our experimental result. For ERNG                                                                                             VII. R ELATED W ORK
as the bandwidth links get overflowed much faster, we limited                                                                            Reliable broadcast has been extensively investigated in
our experiments to 512 nodes. For the optimized ERNG, small                                                                           various adversarial models. In our work, we show how Intel
SGX improves the efficiency of existing protocols in these          introduce an attested append-only memory (A2M) to remove
models, renewing interest in studying these protocols with          the ability of adversarial replicas to equivocate without de-
SGX-based implementations.                                          tection, which helps to increase the resilience from N3 to
                                                                    N
Reliable Broadcast: Reliable broadcast has been extensively          2 [41]. However, these works have concentrated on han-
                                                                    dling asynchronous protocols with weak time assumptions
studied since the 1980s, and is closely related to the problem
                                                                    like PBFT. In this paper, in contrast to previous approaches,
of byzantine agreement (BA). Several excellent surveys on the
                                                                    we work on the round-based synchronous model. Our work
problem are available [64], [88]. Byzantine agreement can also
                                                                    extends these ideas to detecting and remediating failures of
achieve reliable broadcast [31], [83], [61], [34], [36], [73],
                                                                    synchronous network assumptions (e.g. our lockstep execution
[76], [88]. For the asynchronous network, Bracha’s classic
                                                                    and halt-on-divergence). Additionally, we investigate the use
reliable broadcast protocol requires O(N 2 ) communication
                                                                    of our blind-box execution primitive in our new distributed
complexity and tolerates up to N3 byzantine nodes [33], [32].
                                                                    RNG protocol which is bias-resistant, and more efficient using
Cachin and Tessaro [37] leverage erasure codes to improve
                                                                    secure sampling for cluster creation. We leave the extension
efficiency and reduce communication complexity. However, as
                                                                    of applying our properties and primitives to asynchronous
the time is not bounded, messages may incur arbitrary delays,
                                                                    protocols for future work.
and most protocols do not guarantee terminating runs, except
under some special assumptions such as sharing a “common            Distributed RNG: Generating common coins in a distributed
coin” [83], [31].                                                   manner for randomized BA in asynchronous networks can also
   Without any extra assumptions, reliable broadcast and            be used for generating unbiased random numbers [35], [83],
byzantine agreement in the synchronous setting can tolerate         [26]. However, these protocols either require a trusted dealer
N                                                                   to set up the initial states of different nodes or pre-distribute
 3 byzantine nodes at most, and with min{f + 2, t + 1} round
complexity [65], [81], [47]. Lamport et al. and Pease et            data to the nodes in the network. Other works employing asyn-
al. propose protocols terminating within t + 1 rounds and           chronous verifiable secret sharing (AVSS) protocols do not
tolerating up to N3 byzantine nodes, but with exponential           have the trusted dealer, but can probabilistically execute with
communication complexity [65], [81]. Berman et al. achieve          errors [31], [38], [23], [87]. Most of these works employ some
O(poly(N )) communication complexity but only tolerating            cryptographic primitives that, in most case, can be considered
upto N4 byzantine nodes [25]. Garay et al. later present a BA       heavy-weight and performance unfriendly. Awerbuch et al.
protocol terminating within min{f + 5, t + 1} rounds [53].          propose a solution that tolerates up to N6 byzantine nodes,
                                                                    with O(N ) round complexity and O(N 3 ) communication
   To tolerate a larger fraction of byzantine nodes, additional
                                                                    complexity [19] to generate a random number with a constant
assumptions are often needed. A common assumption is that
                                                                    bias. Other works, such as Andrychowicz et al.’s one, generate
of having a one-time trusted dealer that pre-deploys PKI in
                                                                    a common random number based on proof of work [18]
the infrastructure. This assumption, for instance, allows digital
                                                                    with O(N 4 ) communication complexity, but the output can
signatures to be used for authentication, wherein a message
                                                                    eventually be biased. Moreover, the large communication cost
claimed to be sent by a node A can be assured to be originating
                                                                    for most of these approaches prevents scalability to a large
from A [65], [48], [51], [62]. This weakens the capabilities
                                                                    number of nodes. We present more efficient (with O(N log N )
of the byzantine adversary, which cannot forge messages on
                                                                    communication complexity) and unbiased RNG generation for
behalf of honest nodes. Researchers have proposed protocols
                                                                    the synchronous network case.
to use digital signatures to boost the resilience from N3 to                              VIII. C ONCLUSION
N − 1, but the communication complexity is still large, i.e.,
                                                                       The recent availability of Intel SGX in commodity laptops
O(exp(N )) and O(N 3 ) [65], [48]. Katz et al. extend the
                                                                    and servers provides a promising research direction for ad-
work of Feldman and Micali [50] to employ authenticated
                                                                    vancing the area of P2P systems. Our main observation is
channels, and present protocols tolerating N2 byzantine nodes
                                                                    that leveraging SGX features can restrict a byzantine model
with O(poly(N )) complexity [62]. Fitzi et al. also give an
                                                                    to a general-omission model in synchronous systems. We
authenticated BA protocol that beats this bound ( N2 ) but under
                                                                    highlight that using SGX we can improve the efficiency of
specific number-theoretic assumptions [51]. Abraham et al.
                                                                    P2P protocols such as reliable broadcast and unbiased random
provide a solution with early stopping (min{f + 2, t + 1})
                                                                    number generator in synchronous settings.
and polynomial complexity [17]. In this work, we use SGX
features to reduce the byzantine model to the general omission                                   R EFERENCES
model, and further propose ERB to achieve min{f + 2, t + 2}          [1] “Bitcoin,” https://bitcoin.org/en/, Accessed: 2020.
round complexity and O(N 2 ) communication complexity.               [2] “BitTorrent,” http://www.bittorrent.com/, Accessed: 2020.
   Researchers also have proposed byzantine fault-tolerant           [3] “Boost C++ library,” http://www.boost.org/, Accessed: 2020.
                                                                     [4] “CLOC,” http://cloc.sourceforge.net/, Accessed: 2020.
algorithms using trusted services, such as by using trusted          [5] “CrashPlan,” http://www.code42.com/crashplan/, Accessed: 2020.
computing primitives, primarily focusing on making PBFT              [6] “DeterLab,” https://www.isi.deterlab.net/index.php3, Accessed: 2020.
more efficient [39], [41], [67], [43], [44], [89], [70], [22].       [7] “Intel Software Guard Extensions,” https://software.intel.com/en-us/sgx,
                                                                         Accessed: 2020.
These works have observed similar relation to crash-fault-           [8] “Intel Software Guard Extensions for Linux OS,” https://01.org/
tolerant protocols, as we have. For example, Chun et al.                 intel-softwareguard-extensions, Accessed: 2020.
You can also read