

Realizing the Metaverse with Edge Intelligence: A Match Made in Heaven

Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, IEEE Fellow, Xianbin Cao, Chunyan Miao, Sumei Sun, IEEE Fellow, Qiang Yang, IEEE Fellow

arXiv:2201.01634v1 [cs.NI] 5 Jan 2022

Abstract—Dubbed “the successor to the mobile Internet”, the concept of the Metaverse has recently exploded in popularity. While there exist lite versions of the Metaverse today, we are still far from realizing the vision of a seamless, shardless, and interoperable Metaverse given the stringent sensing, communication, and computation requirements. Moreover, the birth of the Metaverse comes amid growing privacy concerns among users. In this article, we begin by providing a preliminary definition of the Metaverse. We discuss the architecture of the Metaverse and mainly focus on motivating the convergence of edge intelligence and the infrastructure layer of the Metaverse. We present major edge-based technological developments and their integration to support the Metaverse engine. Then, we present our research attempts through a case study of virtual city development in the Metaverse. Finally, we discuss the open research issues.

Index Terms—Metaverse, Edge intelligence, Future communications, Resource allocation

I. INTRODUCTION

The concept of the Metaverse first appeared in the science fiction novel Snow Crash written by Neal Stephenson in 1992. More than twenty years later, the Metaverse has re-emerged as a buzzword. In short, the Metaverse is commonly described as an embodied version of the Internet. Just as we navigate today’s Internet with a mouse cursor, users will explore the Metaverse with the aid of virtual reality (VR) or augmented reality (AR) technologies. Moreover, powered by Artificial Intelligence (AI), blockchain technology, and 5G and Beyond (B5G), the Metaverse is envisioned to facilitate peer-to-peer interactions and support novel, decentralized ecosystems of service provisions that will blur the lines between the physical and virtual worlds.

To date, tech giants have invested heavily towards realizing the Metaverse as “the successor to the mobile Internet”. Among others, Facebook was even rebranded as “Meta” to reinforce its commitment towards the development of the Metaverse. There are two fundamental driving forces behind the excitement surrounding the Metaverse. First, the Covid-19 pandemic has resulted in a paradigm shift in how social interactions are conducted today, thereby positioning the Metaverse as a necessity in the near future. Second, emerging technological enablers have made the Metaverse a growing possibility. For example, advances in VR/AR and haptic technologies enable users to be visually and physically immersed in a virtual world. To date, there exist “lite” versions of the Metaverse that have evolved mainly from Massively Multiplayer Online (MMO) games. Among others, Roblox¹ and Fortnite² started as online gaming platforms. Yet, just recently, the virtual concerts held on Roblox and Fortnite attracted millions of views.

However, we are still far from realizing the Metaverse. For one, the aforementioned “lite” versions are distinct platforms operated by separate entities. In other words, one’s Fortnite avatar and virtual items mean nothing in the Roblox world. In contrast, the Metaverse is envisioned to be a seamless integration of virtual worlds. Next, while MMO games can host more than a hundred players at once, albeit with high-specification system requirements, an open-world VRMMO application is still a relatively nascent concept even in the gaming industry. Similarly, it will be a challenge to develop a “shardless” Metaverse that is persistent, rather than one that separates players into different sessions. This is exacerbated by the expectation that large parts of the Metaverse have to integrate the physical and virtual worlds, e.g., through digital twins. The stringent sensing, communication, and computation requirements impede the real-time, scalable, and ubiquitous implementation of the Metaverse. Finally, the birth of the Metaverse comes amid increasingly stringent privacy regulations.

In this article, we begin by motivating a definition and introduction to the architecture of the Metaverse. To realize the Metaverse amid its unique challenges, we mainly focus on the edge-intelligence-driven infrastructure layer, which is a core feature in B5G wireless networks. In short, edge intelligence is the convergence between edge computing and AI. We adopt the two commonly quoted divisions of edge intelligence, i.e., i) Edge for AI, which refers to the end-to-end framework of bringing sensing, communication, AI model training, and inference closer to where data is produced, and ii) AI for Edge, which refers to the use of AI algorithms to improve the orchestration of the aforementioned framework. Then, as a case study, we present a framework for the collaborative edge-driven virtual city development in the Metaverse. Finally, we discuss the open research issues.

WYB. Lim is with Alibaba Group and Alibaba-NTU Joint Research Institute (JRI), Nanyang Technological University (NTU), Singapore. Email: limw0201@e.ntu.edu.sg. Z. Xiong is with Singapore University of Technology and Design, Information Systems Technology and Design (ISTD) Pillar, Singapore. Email: zehui_xiong@sutd.edu.sg. D. Niyato and C. Miao are with School of Computer Science and Engineering, NTU, Singapore. Emails: dniyato@ntu.edu.sg and ascymiao@ntu.edu.sg. X. Cao is with School of Electronic and Information Engineering, Beihang University, Beijing, China. Email: xbcao@buaa.edu.cn. S. Sun is with Communications and Networks Department, Institute for Infocomm Research (I2R), Singapore. Email: sunsm@i2r.a-star.edu.sg. Q. Yang is with Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, and Webank. Email: qyang@cse.ust.hk.

¹ https://www.roblox.com/
² https://www.epicgames.com/fortnite/en-US/home

Our contributions are as follows:
  1) We present a general architecture of the Metaverse and its major components, thereby providing a holistic view of the Metaverse ecosystems. We outline the key technologies that enable the edge-driven Metaverse, emphasizing their roles to support virtual services.
  2) We discuss potential applications and services that can be delivered in the Metaverse, and through a case study on virtual city development, demonstrate the convergence between edge intelligence and the Metaverse engine.
  3) We present research perspectives and highlight the interdisciplinary open issues and research opportunities.

II. THE METAVERSE: ARCHITECTURE, TECHNOLOGIES, AND APPLICATIONS

The Metaverse is an embodied version of the Internet that comprises a seamless integration of interoperable, immersive, and shardless virtual ecosystems navigable by user-controlled avatars. In this section, we present the layers of the Metaverse architecture (Fig. 1).

A. Physical-virtual world and the Metaverse engine

  1) Physical-virtual world interaction: Each non-mutually-exclusive stakeholder in the physical world controls components that influence the virtual world. The consequences in the virtual world in turn feed back to the physical world. The key stakeholders are:
       • Users can experience the virtual world through Head Mounted Displays (HMDs) or AR goggles. The users can in turn execute actions to interact with other users or virtual objects.
       • IoT and sensor networks deployed in the physical world collect data from the environment. The insights derived are used to update the virtual environment, e.g., through feeding information to update a digital twin. The sensor network may be independently owned by sensing service providers (SSPs) that contribute live data feeds to virtual service providers (VSPs) to generate and maintain the virtual environment.
       • Virtual service providers (VSPs) develop and maintain the virtual worlds of the Metaverse. Similar to user-created videos today (e.g., YouTube), the Metaverse is envisioned to be enriched with user-generated content (UGC) that includes virtual art, games, and social applications. UGC can be traded in the Metaverse.
       • Physical service providers operate the physical infrastructure that supports the Metaverse engine and respond to transaction requests that originate from the Metaverse. This includes the operations of communication and computation resources at the edge of the network, or logistics services for the delivery of physical goods transacted in the Metaverse.
  2) The Metaverse engine obtains inputs such as data from stakeholder-controlled components. The virtual world is generated, maintained, and enhanced with these inputs.
       • VR/AR enables users to experience the Metaverse visually, whereas haptics enable users to experience the Metaverse through the additional dimension of touch, e.g., using haptic gloves. This enhances user interactions, e.g., through transmitting a handshake across the world, and opens up the possibilities of providing physical services in the Metaverse, e.g., remote surgery. These technologies are developed under standards that facilitate interoperability, e.g., Virtual Reality Modelling Language (VRML)³, and that govern the properties, physics, animation, and rendering of virtual assets, so that users can traverse the Metaverse smoothly.
       • Digital twins enable some virtual worlds within the Metaverse to be modeled after the physical world in real-time. This is accomplished through modeling and data fusion. Digital twins add to the realism of the Metaverse and facilitate new dimensions of services and social interaction. For example, Microsoft Mesh allows users working from multiple sites to collaborate with each other in real-time digital copies of their office.
       • Artificial Intelligence can be leveraged to incorporate intelligence into the Metaverse for improved user experience, e.g., for efficient object rendering, intelligent chatbots, and UGC. For example, the MetaHuman project⁴ by Epic Games utilizes AI to generate life-like digital characters quickly. The generated characters may be deployed by VSPs as conversational virtual assistants to populate the Metaverse.
       • Blockchain technology will be key to preserving the value and universality of virtual goods, as well as establishing the economic ecosystem within the Metaverse. It is difficult for current virtual goods to be of value outside the platforms on which they are traded or created. Blockchain technology will play an essential role in reducing the reliance on such centralization. For example, a Non-fungible token (NFT) serves as a mark of a virtual good’s uniqueness and authenticates one’s ownership of the good. This protects the value of virtual goods and facilitates peer-to-peer trading in a decentralized environment. As virtual worlds in the Metaverse are developed by different parties, the user data may also be managed separately. To enable seamless traversal across virtual worlds, multiple parties will need to access and operate on such user data. Due to value isolation among blockchains, cross-chain is a crucial technology to enable secure data interoperability.

³ https://www.web3d.org/documents/specifications/14772/V2.0/part1/javascript.html
⁴ https://www.unrealengine.com/en-US/digital-humans

[Figure 1: layered architecture diagram. Physical world (Section II.A.1): user, IoT/sensor, virtual service provider, physical service provider. Virtual world: avatar, virtual environment, virtual goods/services, tangible goods/services. Metaverse engine (Section II.A.2): VR/AR, haptic, digital twin, AI, blockchain. Infrastructure (Section II.B): communication, computation, storage.]

Fig. 1: The Metaverse architecture features the immersive and real-time physical-virtual world interaction supported by the Metaverse engine. The supporting infrastructure ensures that the Metaverse is scalable, shardless, enables ubiquitous access, and is trustworthy for users.

B. Edge intelligence-empowered infrastructure

The general functions of the infrastructure layer are:
      • Communication and Networking: To prevent breaks in presence (BIP), i.e., disruptions that cause a user to be aware of the real world setting, VR requires a data rate of 250 Mbit/s and a packet error rate on the order of $10^{-1} \sim 10^{-3}$. Haptic traffic requires a lower data rate of 1 Mbit/s and a packet error rate on the order of $10^{-4} \sim 10^{-5}$ [1]. This may be enabled through enhanced mobile broadband (eMBB) and ultra-reliable and low latency communication (URLLC) links, which are the main technology pillars in B5G. Due to the expected explosive growth of data traffic, ultra-dense networks are deployed in B5G networks to alleviate the constrained system capacity.
      • Computation and Storage: Today, MMO games can host more than a hundred players in a single game session and hence require high-specification GPUs. VRMMO games, which are the rudiment of the Metaverse system, are still scarce in the industry and may require devices such as HMDs to be connected to powerful computers to render both the immersive virtual worlds and the interactions with hundreds of other players. To enable ubiquitous access to the Metaverse, a promising solution is the cloud-edge-end computation paradigm. Specifically, local computations can be performed on end devices for the least resource-consuming tasks, e.g., computations required by the physics engine to determine the movement and positioning of an avatar. To reduce the burden on the cloud for scalability, and further reduce end-to-end latency, edge servers can be leveraged to perform costly foreground rendering, which requires fewer graphical details but lower latency [2]. The more computation-intensive but less delay-sensitive tasks, e.g., background rendering, can in turn be executed on cloud servers. Moreover, popular content can be cached at the edge of the network for efficient retrieval and reduction in computation overheads.

The infrastructure layer leverages edge intelligence (Fig. 2) to (i) support AI for the intelligent Metaverse (i.e., Edge for AI), and (ii) utilize AI to realize the resource-efficient collaborative edge paradigm (i.e., AI for Edge).
      • Edge for AI
        Edge offloading: Apart from offloading rendering computations to the edge or cloud, costly computation tasks required for data processing and AI model training, e.g., matrix multiplication, can also be decomposed into subtasks to be offloaded to edge servers (i.e., workers). The completed subtasks

[Figure 2: diagram of VR/AR rendering, intelligence, and data processing flows across the user/sensor, edge, and cloud tiers, e.g., foreground rendering at the edge, background rendering in the cloud, local model training with intermediate aggregation at the edge and global aggregation in the cloud.]

Fig. 2: Applications of Edge Intelligence for the Metaverse.

are aggregated at a master node to recover the computation result. However, a major drawback of computation offloading is the existence of stragglers, which are the processing nodes that run slower than expected or nodes that may be disconnected from the network due to several factors such as imbalanced work allocation and network congestion. As a result, the overall time needed to execute the task is determined by the slowest processing node. One way to mitigate the straggler effect is to utilize worker selection schemes to eliminate straggling workers. Another way is to leverage coded redundancy to reduce the recovery threshold, i.e., the number of workers that need to submit their results for the master to reconstruct the final result. For example, polynomial codes [3] can be used to generate redundant intermediate computations. The total computation time is then determined not by the slowest straggler but by the time taken for the master node to receive computed results from some decodable set of workers. For polynomial codes, the recovery threshold does not scale with the number of workers involved, thereby ensuring the scalability of the edge-empowered Metaverse.

Caching: Edge caching is instrumental in reducing computation and communication redundancy, which refers to the wastage of network resources as a result of repetitive user access of popular content or computations. For the former, the probabilistic model for the popularity distribution of files, e.g., fields of view (FOVs) in the Metaverse, can be learned [4]. Then, the popular FOVs can be stored at edge servers close to the users who request them most to reduce rendering computation cost and latency. For the latter, the computation results from AI models can be cached at edge servers to respond to computation requests that are of a similar nature. Moreover, pre-trained models can be cached at the edge to perform costly inference tasks for faster response to users.

Local machine learning model training: As with the Internet, the Quality of Experience (QoE) that users derive from the Metaverse will improve with more insights gathered from usage data. However, following the introduction of increasingly stringent privacy laws such as the General Data Protection Regulation (GDPR), the Metaverse will have to be developed while preserving user privacy. Moreover, the risk of data leaks increases in tandem with the increase in attack surfaces as more users are connected to the Metaverse. One solution is the privacy-preserving machine learning paradigm known as Federated Learning (FL) [5]. In FL, users of the Metaverse can carry out AI model training on their local devices before transmitting the model parameters or gradient updates, instead of the raw data, to the model owner for aggregation. This enables privacy-preserving collaborative machine learning while leveraging the computation capabilities of these users, e.g., during idle device usage periods. As model parameters are typically smaller in size than raw data, FL also alleviates the burden on backbone communication networks. Recently, the edge-assisted Hierarchical FL framework has also been proposed [6], in which intermediate model aggregations are performed at edge servers before global cloud aggregation, so as to reduce link distances and instances of costly global communication with the cloud.

•   AI for Edge

Semantic communication: The advent of the Metaverse will inevitably contribute to a growing demand for bandwidth amid the explosive data traffic volume required to support the Metaverse engine. This necessitates a paradigm shift from Shannon’s conventional focus on how accurately the communication symbols can be transmitted to how precisely the transmitted symbols can convey the meaning of the message. In particular, human-to-machine (H2M) semantic communication can be a key technology to optimize VR/AR implementation for the ubiquitous Metaverse [7]. As an illustration, we reference the AR architecture proposed in [8] that is divided into the user, edge, and cloud tiers (Fig. 2). The user tier senses the environment and transmits the raw video stream and other user controls to the edge tier. At the edge tier, image frames from the video stream are utilized to find a match with the cached images, for the retrieval of relevant information such as image annotations. If the image frame is not found in the cache, the frame is offloaded to the cloud for further matching. If a match is still not found, computation is executed at the cloud and the edge cache is updated. Clearly, the image frames of the raw video streams are of heterogeneous importance. With AI-enabled semantic extraction and pre-processing of the video stream, the redundant transmission of repetitive or unimportant frames to the edge or cloud can be greatly reduced to alleviate the burden on backbone networks. Beyond semantic encoding for text, audio, or images, semantic communication has also emerged as a key enabler of efficient communications in distributed machine learning, e.g., gradient quantization schemes can significantly reduce the communication overhead of distributed AI model training.

Edge resource optimization: In a heterogeneous user network, it is of utmost importance that resources at the edge, e.g., for storage and computation, are well allocated to maximize the user QoE. AI-enabled solutions are increasingly utilized to solve the allocation problem given the dense distribution and mobility of users. The study in [2] discusses how the rendering strategies of VR/AR users can be calibrated among local rendering, edge-assisted rendering, and edge-cloud rendering (i.e., local rendering of foreground interactions and edge rendering of background environment). The user QoE can be formulated as a function of latency and energy consumption, based on the user device and the required functions. Then, an effective rendering scheme can be formulated based on a deep reinforcement learning (DRL) algorithm trained offline, subject to the queue states at the edge servers and service requirements of the user. Moreover, the algorithm can be further refined using mechanism design when implemented online to account for the ad-hoc transitions in user usage requirements that may affect other users’ QoE or rendering strategies.

Incentive mechanisms: The stakeholders of the Metaverse, e.g., users, blockchain miners, and edge servers, each own valuable resources such as data and computation resources that can be leveraged for the enhancement of the Metaverse. To incentivize their participation, one may naturally consider a one-size-fits-all reward in which a homogeneous reward is allocated to all stakeholders. However, the result is that desirable stakeholders that can contribute more to the process, e.g., in terms of providing more resources for edge rendering, will lack the incentive to do so. As such, it is essential for the service requesters (e.g., VSPs) to design incentive mechanisms to motivate the participation of these stakeholders. In light of the interactions among stakeholders and complex system states in the dynamic networks, AI approaches have increasingly been proposed to design learning-based incentive mechanisms.

   The edge intelligence empowered infrastructure layer connects all users in the Metaverse and supports its scalable, shardless, ubiquitous, and trustworthy realization.

C. Applications

   We identify some important emerging applications and services in the Metaverse as follows.

   1) Entertainment and social activities: Currently, entertainment and social activities are held on platforms that support audio and video transmission. Nevertheless, user interactions are limited to rigid 2D grids of users, and still fall somewhat short of what is experienced in the physical world. With the aid of VR and haptic technology, social interactions will be more immersive.

   2) Pilot testing: Before products are released to the market, they are usually tested by a small group of users in a controlled environment due to the cost of large-scale deployment or for safety reasons. The Metaverse will be a channel to pilot test products before they are released to the physical world at a low cost with fewer safety considerations.
   Moreover, users can have virtual twins of physical products delivered to their inventories directly in the Metaverse for marketing purposes. As an example, Hyundai has begun experimenting with providing virtual test drives for users, albeit in the lower resolution Roblox world⁵. In the Metaverse, test drive environments can be modeled exactly after highways with realistic traffic conditions.

   3) Virtual education: The pandemic has necessitated the online delivery of education. However, a drawback of virtual education is the lack of personalization and difficulty of delivering “hands-on” lessons. With more users in the Metaverse, the wealth of data can be used to further refine AI tutors for personalized lessons. Hands-on lessons that involve dealing with machines or tools can be delivered more effectively with haptics technology.

   4) Gig economy and creative industries: The Metaverse will mitigate the adverse effects of piracy on the gig economy and creative industries. The Metaverse will provide a platform for gig workers to create UGC and trade it actively as NFTs that uniquely identify the originality of the product, e.g., game object creation in GameFi⁶. When the product is transferred among buyers, a portion of the sales proceeds can be programmed to go to the creators automatically.

   ⁵ https://www.roblox.com/games/7280776979/Hyundai-Mobility-Adventure
   ⁶ https://gamefi.org/

[Fig. 3 panels: A) Physical-Virtual World Synchronization; B) Edge Rendering; C) Physical-Virtual World Resource Allocation.]

Fig. 3: We propose a framework for virtual city development in the Metaverse. In the first study, we propose collaborative sensing for physical-virtual world synchronization. In the second study, we propose a pricing and allocation mechanism for edge rendering services among resource-constrained users. In the third study, we propose a resource allocation scheme that accounts for the unknown user demand to derive optimal resource reservation ex-ante.
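
The automatic split of resale proceeds described for NFT-based UGC can be illustrated with a small arithmetic sketch. It only shows the bookkeeping; how such a rule would be enforced on a particular platform is not specified in the text. The function name `split_resale` and the 5% royalty rate are hypothetical and chosen purely for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def split_resale(price, royalty_rate):
    """Split a resale price between the original creator and the seller.

    price        : resale price (string or number), in any currency unit
    royalty_rate : fraction of each resale owed to the creator
                   (hypothetical parameter, not taken from the paper)
    """
    price = Decimal(str(price))
    rate = Decimal(str(royalty_rate))
    if not (Decimal("0") <= rate <= Decimal("1")):
        raise ValueError("royalty_rate must lie in [0, 1]")
    # Round the creator's share to two decimal places; the seller keeps the rest,
    # so the two shares always add up to the resale price.
    creator_share = (price * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    seller_share = price - creator_share
    return creator_share, seller_share


if __name__ == "__main__":
    creator, seller = split_resale("200.00", "0.05")
    print(f"creator receives {creator}, seller receives {seller}")
```

Using `Decimal` rather than floating point keeps the two shares summing exactly to the resale price, which matters once the same item is resold many times.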

III. CASE STUDY: A FRAMEWORK FOR COLLABORATIVE EDGE-DRIVEN VIRTUAL CITY DEVELOPMENT IN THE METAVERSE

   In this section, we present a case study of developing a virtual city in the Metaverse. For example, the development of “Metaverse Seoul” has recently been proposed⁷ to cater to both tourists and local users, e.g., to access civil services online using HMDs. We motivate the collaborative edge-driven development of a virtual city in which the sensing, computation, communication, and storage resources at the network edge are leveraged to achieve the desirable qualities and features of the Metaverse (Fig. 3).

   ⁷ https://www.euronews.com/next/2021/11/10/seoul-to-become-the-first-city-to-enter-the-metaverse-what-will-it-look-like

[Fig. 4 plot: social welfare (left axis) and auction information exchange cost (right axis) versus required bitrates (Mbps), for the DRL, SOTA, vanilla DDA, and random schemes.]

Fig. 4: In [10], we compare the DRL-based DDA against the vanilla DDA and the state-of-the-art method that adjusts the auction clock stepsize using the Ornstein-Uhlenbeck process [11]. The DRL-based DDA can achieve comparable social welfare (based on user QoE and edge server utility) at a much lower auction information exchange cost under various bitrates.

A. Collaborative sensing for real-time physical-virtual world synchronization

   With continuous data synchronization, the virtual city is able to reflect the physical city in real time. An enabling technology is collaborative sensing, in which IoT and wireless sensor networks are deployed to feed digital twins within the Metaverse with fresh data streams.
   In [9], we formulate a resource allocation problem in which SSPs (e.g., Drones-as-a-Service) are employed to collect data to maintain a regular sync between the physical and virtual worlds. The Unmanned Aerial Vehicle (UAV) fleets are owned by distinct SSPs, whereas the virtual city is maintained by distinct VSPs, each of which develops different areas of the virtual city that correspond to the real world. To employ the services of the SSPs, the VSP posts a reward pool (based on its budget) to be divided among SSPs that service the area. As more SSPs service the area, the data is uploaded at a higher frequency. However, each SSP receives a smaller proportion of the rewards and may churn to service other VSPs. To model the dynamic strategy adaptation of non-cooperative SSPs across the network, we utilize an evolutionary game-based framework in which the SSPs are clustered into populations based on their sensing capabilities, starting location, and energy cost. Using our evolutionary game-based framework, we are able to model how the calibration of rewards by VSPs affects the composition of SSPs servicing them, and thereby simulate how the synchronization frequency for each virtual region varies with the rewards provided.
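
The strategy adaptation described above can be sketched with standard replicator dynamics. This is a minimal illustration under simplifying assumptions and is not the formulation of [9]: each VSP's reward pool is assumed to be split equally among the SSPs servicing it, each population has a fixed hypothetical cost per VSP, and the names `replicator_step`, `simulate`, `pop_sizes`, `rewards`, `costs`, and `lr` are all introduced here for illustration.

```python
import numpy as np


def replicator_step(x, pop_sizes, rewards, costs, lr=0.01):
    """One Euler step of replicator dynamics for SSP populations.

    x[p, k]   : share of population p currently servicing VSP k (rows sum to 1)
    pop_sizes : number of SSPs in each population (hypothetical values)
    rewards   : reward pool posted by each VSP (hypothetical values)
    costs     : costs[p, k], cost for an SSP of population p to service VSP k
    """
    n = pop_sizes @ x                                  # SSPs servicing each VSP
    # Assumed utility: equal share of the pool minus the population-specific cost.
    utility = rewards / np.maximum(n, 1e-9) - costs
    avg = (x * utility).sum(axis=1, keepdims=True)     # average utility per population
    # Strategies that beat the population average gain share; the rest lose share.
    x_new = np.clip(x + lr * x * (utility - avg), 0.0, None)
    return x_new / x_new.sum(axis=1, keepdims=True)


def simulate(pop_sizes, rewards, costs, steps=5000, lr=0.01):
    """Iterate from a uniform strategy profile; return shares and SSPs per VSP."""
    num_pops, num_vsps = costs.shape
    x = np.full((num_pops, num_vsps), 1.0 / num_vsps)
    for _ in range(steps):
        x = replicator_step(x, pop_sizes, rewards, costs, lr)
    return x, pop_sizes @ x


if __name__ == "__main__":
    pop_sizes = np.array([20.0, 10.0])
    costs = np.array([[1.0, 2.0], [3.0, 0.5]])
    for pool in (60.0, 120.0):
        # Vary the pool of VSP 2 while holding VSP 1 fixed.
        _, n = simulate(pop_sizes, np.array([120.0, pool]), costs)
        print(f"VSP 2 pool = {pool}: SSPs per VSP = {np.round(n, 2)}")
```

Under these assumptions, sweeping a VSP's reward pool and reading off the resulting number of servicing SSPs gives a rough proxy for how the synchronization frequency of its region would respond, since more SSPs imply more frequent uploads.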

B. Edge-assisted efficient rendering of the immersive virtual world

   In light of battery limitations of user devices, non-panoramic VR rendering has been proposed such that only the images to cover the viewport of each eye are rendered, thereby demanding less data traffic and computation workload [13]. In [10], we study the provision of non-panoramic VR rendering services provided by edge servers and propose an incentive mechanism based on Double Dutch Auction (DDA) for edge server-user association, as well as to price the services of edge rendering. The objective is to allow VR rendering service providers to serve VR users such that their benefits (i.e., valuations of the services) are maximized.

   To derive the user valuation of VR rendering services, we propose to formulate the user QoE as a function of Video Multi-Method Assessment Fusion (VMAF) and Structural SIMilarity (SSIM) values. The former reflects the user’s perception of streaming quality, whereas the latter measures VR image quality. The VMAF and SSIM values for a user in the Metaverse are in turn affected by the user’s head rotation speeds (depending on VR functions) and expected bit rates of VR streaming from the edge. The edge server valuation is formulated based on energy cost and the available computation and storage resources. To derive the edge server-user association, the users adjust their bids upwards, whereas the edge servers adjust their selling prices downwards until a match in valuation is derived. The evaluation results show that the proposed incentive mechanism can motivate the providers and the users to participate rationally in the auction with desirable properties such as truthfulness. Moreover, we design a DRL-based auctioneer to accelerate this auction process by adjusting the stepsize of the auction clocks dynamically (Fig. 4).

C. Resource allocation in the physical-virtual world ecosystem

   To support the Metaverse engine, VSPs have to leverage both virtual and physical resources that are often owned by separate entities. For example, VSPs can utilize logistics services for physical goods delivery or edge services for computation offloading.
   Similar to other shared services (e.g., cloud services), such resources are usually priced based on two subscription plans, i.e., a reservation plan and an on-demand plan. Generally, the reservation plan is cheaper than the on-demand plan, which is used on an ad-hoc basis when demand spikes. However, the VSP will need to decide on the resources to be allocated via the reservation plan before the actual user demand is known (i.e., ex-ante). Therefore, a resource over-provisioning problem can occur if the VSP subscribes to too many resources on the reservation plan. In contrast, a resource under-provisioning problem can happen if the VSP subscribes to too few resources, i.e., the VSP has to use the more expensive on-demand plan. Taking into account the demand uncertainty of the users, we propose a two-stage stochastic integer programming (SIP) formulation for the VSPs in the Metaverse to minimize their operation cost by allocating the resources across the two plans most strategically [12]. Using historical data on user demand, our resource allocation scheme achieves a much lower cost than other schemes that do not consider the probability distribution of user demand (Fig. 5).

[Fig. 5 plot: total cost versus on-demand cost (1x, 1.5x, 2x, and 2.5x) for the EVF, SIP, and random schemes.]

Fig. 5: In [12], we compare the SIP with the expected-value formulation (EVF) and the random scheme that models the user demand as the average historical value. The SIP can always achieve the best solution among the three to reduce the on-demand cost.

IV. OPEN CHALLENGES AND FUTURE RESEARCH DIRECTIONS

A. Redefining user QoE

   The Internet has been optimized based on gradually evolving QoE metrics. Similarly, there exists a need to redefine the user QoE for the Metaverse. This requires interdisciplinary efforts, e.g., to draw relations among network requirements and user visual perceptions. For example, the human eye is unable to perceive images shown for less than 13 ms [14], thereby setting an upper bound on the network timing requirements. Moreover, VR applications in the Metaverse will place less emphasis on the traditional focus of video resolution. Instead, foveated rendering uses eye tracking to render important scenes and reduce the image quality of scenes in the peripheral vision [15].

B. B5G and the Metaverse

   B5G communication systems will shift from conventional metrics such as data transmission rate to Value of Information (VoI) [14], which accounts for both the contents and the age of the packet to be transmitted. As the Metaverse will feature novel and differentiated service provision, the supporting communication and networking infrastructure must be semantic-aware and goal-oriented.

C. Interoperability standards

   While tech companies race to compete for the upper hand in the development of the Metaverse, the need to develop interoperability standards has arisen so that the vision for a seamless
Metaverse can be realized. This is crucial to encourage the proliferation of UGC in the
Metaverse. Moreover, a unified model to standardize the communication protocols of the
Metaverse will eventually be necessary to enable access from diverse communication systems in
different virtual worlds.

D. Security and Privacy
   The Metaverse will be built on blockchain-empowered economic ecosystems. As more
transactions are conducted on the blockchain, the attack surface increases and security
concerns arise. For example, attackers can use malicious smart contracts⁸ to gain access to the
user’s main crypto wallet. Moreover, new forms of hardware used to access the Metaverse bring
about security challenges, e.g., the finger tracking of VR users can be used to infer their
passwords.
   In contrast to click-through rates for the Internet, new dimensions of user data (e.g., eye
tracking) can be collected and leveraged for more personalized advertising directly delivered
to the FOV of users. This presents novel challenges to user data privacy.

 ⁸ https://consensys.github.io/smart-contract-best-practices/known_attacks/

E. Economics of the edge-driven Metaverse
   The Metaverse will open up novel possibilities of physical and virtual service and resource
trading among users and service providers. The contention for resources now extends from the
physical to the virtual world, in which rational users and service providers will have to
optimize their resource usage in consideration of the newly defined QoE.

V. CONCLUSION
   In this article, we have discussed an architecture of the Metaverse and motivated the edge
intelligence driven supporting infrastructure. Then, we presented a case study of smart city
development in the Metaverse, followed by future research directions. Our work serves as an
initial attempt to motivate the confluence of edge intelligence and the Metaverse.

REFERENCES
[1] J. Park and M. Bennis, “URLLC-eMBB slicing to support VR multimodal perceptions over
    wireless cellular systems,” in 2018 IEEE Global Communications Conference (GLOBECOM).
    IEEE, 2018, pp. 1–7.
[2] F. Guo, F. R. Yu, H. Zhang, H. Ji, V. C. Leung, and X. Li, “An adaptive wireless virtual
    reality framework in future wireless networks: A distributed learning approach,” IEEE
    Transactions on Vehicular Technology, vol. 69, no. 8, pp. 8514–8528, 2020.
[3] Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, “Polynomial codes: an optimal design for
    high-dimensional coded matrix multiplication,” in Proceedings of the 31st International
    Conference on Neural Information Processing Systems, 2017, pp. 4406–4416.
[4] Y. Sun, Z. Chen, M. Tao, and H. Liu, “Communications, caching, and computing for mobile
    virtual reality: Modeling and tradeoff,” IEEE Transactions on Communications, vol. 67,
    no. 11, pp. 7573–7586, 2019.
[5] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient
    learning of deep networks from decentralized data,” in Artificial intelligence and
    statistics. PMLR, 2017, pp. 1273–1282.
[6] M. S. H. Abad, E. Ozfatura, D. Gunduz, and O. Ercetin, “Hierarchical federated learning
    across heterogeneous cellular networks,” in ICASSP 2020-2020 IEEE International Conference
    on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 8866–8870.
[7] Q. Lan, D. Wen, Z. Zhang, Q. Zeng, X. Chen, P. Popovski, and K. Huang, “What is semantic
    communication? a view on conveying meaning in the era of machine intelligence,” arXiv
    preprint arXiv:2110.00196, 2021.
[8] J. Ren, Y. He, G. Huang, G. Yu, Y. Cai, and Z. Zhang, “An edge-computing based
    architecture for mobile augmented reality,” IEEE Network, vol. 33, no. 4, pp. 162–169,
    2019.
[9] Y. Han, D. Niyato, C. Leung, C. Miao, and D. I. Kim, “A dynamic resource allocation
    framework for synchronizing metaverse with IoT service and data,” arXiv preprint
    arXiv:2111.00431, 2021.
[10] M. Xu, D. Niyato, J. Kang, Z. Xiong, C. Miao, and D. I. Kim, “Wireless edge-empowered
    metaverse: A learning-based incentive mechanism for virtual reality,” arXiv preprint
    arXiv:2111.03776, 2021.
[11] D. Friedman, The double auction market: institutions, theories, and evidence. Routledge,
    2018.
[12] W. C. Ng, W. Y. B. Lim, J. S. Ng, Z. Xiong, D. Niyato, and C. Miao, “Unified resource
    allocation framework for the edge intelligence-enabled metaverse,” arXiv preprint
    arXiv:2110.14325, 2021.
[13] V. Kelkkanen, M. Fiedler, and D. Lindero, “Bitrate requirements of non-panoramic VR
    remote rendering,” in Proceedings of the 28th ACM International Conference on Multimedia,
    2020, pp. 3624–3631.
[14] P. Popovski, F. Chiariotti, V. Croisfelt, A. E. Kalør, I. Leyva-Mayorga, L. Marchegiani,
    S. R. Pandey, and B. Soret, “Internet of things (IoT) connectivity in 6G: An interplay of
    time, space, intelligence, and value,” arXiv preprint arXiv:2111.05811, 2021.
[15] A. Patney, M. Salvi, J. Kim, A. Kaplanyan, C. Wyman, N. Benty, D. Luebke, and A. Lefohn,
    “Towards foveated rendering for gaze-tracked virtual reality,” ACM Transactions on
    Graphics (TOG), vol. 35, no. 6, pp. 1–12, 2016.

BIOGRAPHIES
   WEI YANG BRYAN LIM is currently pursuing the Ph.D. degree (Alibaba Talent Programme) with
the Alibaba-NTU Joint Research Institute (JRI), Nanyang Technological University (NTU),
Singapore. His research interests include edge intelligence and resource allocation.
   ZEHUI XIONG is an Assistant Professor at Singapore University of Technology and Design.
Prior to that, he was a researcher with Alibaba-NTU Joint Research Institute, Singapore. He
received the Ph.D. degree in Computer Science and Engineering at Nanyang Technological
University, Singapore. He was a visiting scholar with Princeton University and University of
Waterloo. His research interests include wireless communications, network games and economics,
blockchain, and edge intelligence.
   SUMEI SUN [IEEE Fellow] is the Principal Scientist, Acting Executive Director (Research),
and the Head of the Communications and Networks Department with the Institute for Infocomm
Research (I2R), Singapore. She is the Editor-in-Chief of IEEE Open Journal of Vehicular
Technology, member of the IEEE Transactions on Wireless Communications Steering Committee, and
a Distinguished Speaker of the IEEE Vehicular Technology Society 2018–2024. She is also the
Director of IEEE Communications Society Asia Pacific Board and a member at large with the IEEE
Communications Society.
   DUSIT NIYATO [IEEE Fellow] is currently a Professor with the School of Computer Science and
Engineering and, by courtesy, School of Physical and Mathematical Sciences, Nanyang
Technological University, Singapore. He has published more than 380 technical papers in the
area of wireless and mobile networking, and is an inventor of four U.S. and German patents. He
was named the 2017–2021 Highly Cited Researcher in Computer Science. He is currently the
Editor-in-Chief for IEEE Communications Surveys and Tutorials.
   XIANBIN CAO received the Ph.D. degree in signal and information processing from the
University of Science and Technology of China, Hefei, China, in 1996. He is the Dean and a
Professor with the School of Electronic and Information Engineering, Beihang University,
Beijing, China. His research interests include intelligent
transportation systems, airspace transportation management, and intelligent computation.
   CHUNYAN MIAO is currently a professor in the School of Com-
puter Science and Engineering, Nanyang Technological University
(NTU), and the director of the Joint NTU-UBC Research Centre of
Excellence in Active Living for the Elderly (LILY).
   QIANG YANG [IEEE Fellow] is the head of AI at WeBank
(Chief AI Officer) and Chair Professor at the Computer Science and
Engineering (CSE) Department of Hong Kong University of Science
and Technology (HKUST).
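
As an illustrative aside to the incentive mechanism of the case study, the matching rule described there (users raise their bids and edge servers lower their sell prices until the two meet) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and not the mechanism of [10]: the names `match_bid_and_ask` and `MatchResult`, the starting prices, the midpoint clearing price, and the rule that both sides move once per round are hypothetical choices made for illustration. In particular, the DRL-based auctioneer that adapts the stepsize is replaced here by a constant `step`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchResult:
    price: float  # assumed clearing price: midpoint of the final bid and ask
    rounds: int   # number of adjustment rounds needed to reach the match


def match_bid_and_ask(user_valuation: float, server_cost: float,
                      start_bid: float, start_ask: float,
                      step: float) -> Optional[MatchResult]:
    """Raise the user's bid and lower the server's ask until they meet.

    The user never bids above its valuation and the server never asks
    below its cost. Returns None when both sides reach their limits
    without meeting, i.e. no mutually beneficial trade exists.
    """
    if step <= 0:
        raise ValueError("step must be positive")

    bid = min(start_bid, user_valuation)
    ask = max(start_ask, server_cost)
    rounds = 0
    while bid < ask:
        next_bid = min(bid + step, user_valuation)  # user adjusts upwards
        next_ask = max(ask - step, server_cost)     # server adjusts downwards
        if next_bid == bid and next_ask == ask:
            return None  # both sides are stuck at their limits
        bid, ask = next_bid, next_ask
        rounds += 1
    return MatchResult(price=(bid + ask) / 2, rounds=rounds)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the function.
    print(match_bid_and_ask(10, 4, start_bid=0, start_ask=14, step=1))
    print(match_bid_and_ask(10, 4, start_bid=0, start_ask=14, step=2))
    print(match_bid_and_ask(3, 5, start_bid=0, start_ask=10, step=1))
```

With these hypothetical numbers, a step of 1 reaches a match after 7 rounds whereas a step of 2 needs only 4, which hints at why adjusting the stepsize can accelerate the auction. A larger step also lets the bid overshoot the ask, so a real auctioneer would have to trade speed against pricing accuracy.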