Lawrence Berkeley National Laboratory
Recent Work

Title
Euclidean Symmetry and Equivariance in Machine Learning

Permalink
https://escholarship.org/uc/item/7q29x3qh

Journal
Trends in Chemistry, 3(2)

ISSN
2589-5974

Author
Smidt, TE

Publication Date
2021-02-01

DOI
10.1016/j.trechm.2020.10.006

Peer reviewed


Trends in Chemistry
Special Issue: Machine Learning for Molecules and Materials
Forum

Euclidean Symmetry and Equivariance in Machine Learning

Tess E. Smidt 1,2,*

Understanding the role of symmetry in the physical sciences is critical for choosing an appropriate machine-learning method. While invariant models are the most prevalent symmetry-aware models, equivariant models such as Euclidean neural networks more faithfully represent physical interactions and are ready to take on challenges across the physical sciences.

Glossary
3D Euclidean symmetry: the symmetry group of 3D space; includes 3D translations, rotations, and inversion.
3D point groups: subgroups of 3D Euclidean symmetry; symmetry groups of molecules; includes rotations, mirrors, and inversion.
3D space groups: subgroups of 3D Euclidean symmetry; symmetry groups of crystals; includes rotations, mirrors, glides, screws, inversion, and discrete translations.
Equivariant: changes deterministically under operation of group elements.
Equivariant model: a machine learning model that can handle equivariant quantities as input, intermediate data, and output and uses equivariant (i.e., tensor) operations.
Equivariant operations: operations on equivariant quantities (i.e., geometric tensors); for example, dot products, cross-products, tensor products; includes invariant operations.
Euclidean neural networks: neural networks that are equivariant to 3D Euclidean symmetry.
Geometric tensors: equivariant properties of 3D physical systems; includes invariant scalars.
Invariant: does not change under operation of group elements.
Neural networks: a subset of machine learning methods where models must be differentiable; model weights are updated using gradients of the error; also known as deep learning.

Euclidean Symmetry Is the Freedom to Choose a Coordinate System
There is no inherent way to orient physical systems; yet we still need to choose a coordinate system to articulate their geometry and physical properties (Figure 1A). Unless coded otherwise, machine-learned models make no assumption of symmetry and are sensitive to an arbitrary choice of coordinate system. To be able to recognize a 3D pattern in any orientation, such a model will need to be shown roughly 500 rotated versions of the same example, a process called data augmentation. One of the motivations of explicitly treating symmetry in machine-learning models is to eliminate the need for data augmentation, so that the model can instead focus on, for example, learning quantum mechanics.

In 3D space, we can transform between coordinate systems using elements of Euclidean symmetry [3D rotations, 3D translations, and inversion, (x, y, z) → (−x, −y, −z), which includes mirror symmetry]; hence, we say that 3D space has Euclidean symmetry.

One useful way of categorizing machine learning models applied in the physical sciences is by whether they use symmetry and, if so, where they use invariant (see Glossary) versus equivariant operations. Between the two types of symmetry-aware models, invariant models eliminate coordinate systems by dealing only with quantities that are invariant to the choice of coordinate system (scalars), while equivariant models preserve how quantities predictably change under coordinate transformations.

Invariance Is Easier to Deal with Than Equivariance
There is good reason for the popularity of invariant scalar features for machine learning; scalars can be given to any machine learning algorithm without violating symmetry. More practically, scalars are easier to handle than geometric tensors, and invariant models are high (if not top) performers on many existing benchmarks [1].

It is common practice to use equivariant operations to generate invariant features for machine learning models. For example, SOAP [2] descriptors are equivariant functions that operate on the geometry and atom types of local atomic environments to produce invariant scalars (Figure 2A). Geometry in invariant models is reduced to bond lengths, bond angles, dihedral angles, and other scalar invariants of geometric elements; Figure 1B describes the invariant versus equivariant properties of 3D vectors. Invariant models can yield equivariant quantities, but only through gradients of the equivariant operations used in featurization [3]; this may not be practical or possible depending on the quantity of interest.
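
To make the invariant-versus-equivariant distinction concrete, the short NumPy sketch below (illustrative only; it is not from the original article, and the helper names are made up) builds a toy set of atomic coordinates, applies a rotation, and checks that the raw coordinates change with the frame (equivariant) while pairwise distances, a typical invariant feature, do not:

import numpy as np

def pairwise_distances(coords):
    # Invariant features: interatomic distances do not depend on the coordinate system.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def rotation_z(theta):
    # Proper rotation about the z axis by angle theta.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

coords = np.random.randn(5, 3)   # toy "molecule" of five atoms
R = rotation_z(0.7)
rotated = coords @ R.T           # equivariant: coordinates transform with R

print(np.allclose(pairwise_distances(coords), pairwise_distances(rotated)))  # True: invariant
print(np.allclose(coords, rotated))                                          # False: changed with the frame

An invariant model sees only quantities like the first kind; an equivariant model can consume the second kind directly and still respect the symmetry.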

When Invariance Is Not Enough
Physical systems and their interactions are inherently equivariant. With an invariant model, a featurization method must be used to represent a naturally equivariant physical system in terms of invariant features. With an equivariant model, a physical system can be articulated using the same means used by many physical simulations: with geometric coordinates and any relevant quantities you need to describe the system (e.g., external fields, atom-wise properties such as velocities). Even if the desired target is a scalar, the interactions that yield that scalar may not be scalar in nature; for example, the only way to interact a particle with charge q and velocity v with an external magnetic field B is to use the equivariant cross-product, F = q v × B, or the difference between the momentum p and the charge-weighted vector potential qA, H = |p − qA|²/2m. To predict quantities that are fundamentally generated from equivariant interactions: (i) equivariant interactions must be included in the scalar featurization used for an invariant model (which requires knowing to include those interactions); or (ii) an equivariant model may be used, which may make more accurate or efficient predictions because it has more expressive operations [4]. The more exotic or complex a property, the more likely there are non-trivial geometric tensor interactions at play (e.g., multipole interactions, antisymmetric exchange).
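
To make the cross-product example concrete, here is a minimal NumPy sketch (illustrative only, not from the article) verifying that the Lorentz force F = q v × B is equivariant under a proper rotation R: rotating v and B and then taking the cross-product gives the same result as rotating F itself.

import numpy as np

def rotation_z(theta):
    # Proper rotation about the z axis (det R = +1).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

q = 1.6e-19                      # charge
v = np.array([1.0, 2.0, 0.5])    # velocity vector
B = np.array([0.0, 0.0, 1.0])    # magnetic field vector

F = q * np.cross(v, B)           # Lorentz force in the original frame

R = rotation_z(1.1)
F_from_rotated_inputs = q * np.cross(R @ v, R @ B)

# Equivariance: computing F in the rotated frame equals rotating F from the original frame.
print(np.allclose(F_from_rotated_inputs, R @ F))  # True

For proper rotations the cross-product commutes with the rotation; under inversion, a cross-product of two ordinary vectors instead behaves as a pseudovector, one of the 'vector-like' quantities catalogued in Figure 1C.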


Figure 1. Examples of Equivariant and Invariant Properties of Geometric Objects in 3D. (A) A system of molecules described by two different coordinate systems. (B) A 3D vector has a magnitude (orange), direction (pink), and location (purple), which are invariant (filled square) or equivariant (open square) under (table from top to bottom) 3D translations, rotations, and inversion. The magnitude is a scalar and thus invariant to all of these operations. (C) In Euclidean symmetry there are four 'vector-like' geometric tensors that transform distinctly under (table from top to bottom) inversion, perpendicular rotation by π, and parallel reflection: a vector, a pseudovector, a double-headed vector, and a helix [5].

Figure 2. Visualizing Equivariant Properties as Spherical Harmonic Projections. (A) A simplified example of how invariant descriptors are derived from equivariant operations. The local atomic environments of the two carbons in an ethane molecule can be projected onto spherical harmonics (shown for L ≤ 6 with negative magnitudes set to zero). The coloring on the projection surface signifies the distance from the origin. The coefficients of the projection, organized in the visual by rows of equal degree L and columns of equal order m, form a geometric tensor in the irreducible basis (the same basis as spherical harmonics). One method of computing scalars from these two geometric tensors is by computing the 'power spectrum', which is the tensor generalization of the dot product. This results in seven scalars (one for each L used in the projection). (B) A symmetric 3 × 3 matrix and its visualization as a spherical harmonic signal. The magnitude of the signal at a given angle is interpreted as a radial distance and the projection is colored to signify the magnitude. For both (B) and (C), yellow indicates the maximum value and dark purple the minimum value, which in this case can be negative. (C) The output of three randomly initialized Euclidean neural networks when given a tetrahedron geometry (top) and octahedron geometry (bottom) as input. The outputs have equal or higher symmetry than the inputs.

Euclidean Neural Networks Are Equivariant Models
Neural networks are one of the most flexible machine learning methods since the only requirement is that the network be differentiable. A neural network is a function f that takes in inputs x and trainable parameters W to produce outputs y, f(x, W) = y. Given pairs of inputs x and target outputs y_true, a neural network is trained by computing derivatives of the loss [e.g., L = (y − y_true)²] with respect to the trainable parameters W, and updating W according to a learning rate η, W ← W − η ∂L/∂W.
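
The update rule above is ordinary gradient descent. The following NumPy sketch (illustrative only, using a hypothetical one-parameter linear model) writes out a single training step explicitly:

import numpy as np

# Toy one-parameter model: f(x, W) = W * x
x = np.array([1.0, 2.0, 3.0])
y_true = np.array([2.0, 4.0, 6.0])     # data generated by W_true = 2

W = 0.5                                 # initial trainable parameter
eta = 0.05                              # learning rate

y = W * x                               # forward pass: f(x, W) = y
loss = np.mean((y - y_true) ** 2)       # L = (y - y_true)^2, averaged over examples
grad = np.mean(2.0 * (y - y_true) * x)  # dL/dW by the chain rule

W = W - eta * grad                      # update: W <- W - eta * dL/dW
print(loss, W)

Repeating this step drives W toward the value that minimizes the loss; a Euclidean neural network is trained the same way, only its operations are constrained to be equivariant.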

First proposed in 2018, Euclidean neural networks (tensor field networks [6], Clebsch–Gordan nets [7], 3D steerable CNNs [8], and their descendants [9,10]) are a flexible, general framework for learning 3D Euclidean symmetry equivariant functions that can be trained in the context of a given dataset. To achieve equivariance in Euclidean neural networks, scalar multiplication is replaced by the more general tensor product and convolutional kernels are restricted to be composed of spherical harmonics and learnable radial functions, W(r) = R(|r|) Y_lm(r̂). Scalar nonlinearities must also be replaced with equivariant equivalents.
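
To give a sense of what such a kernel looks like, here is a minimal NumPy sketch (illustrative only, not the e3nn implementation) of the l = 1 case, where the real spherical harmonics are proportional to the components of the unit vector r̂ and a simple polynomial stands in for the learnable radial function R(|r|):

import numpy as np

def spherical_harmonics_l1(r_hat):
    # Real spherical harmonics for l = 1, ordered (Y_{1,-1}, Y_{1,0}, Y_{1,1}) ∝ (y, z, x).
    c = np.sqrt(3.0 / (4.0 * np.pi))
    x, y, z = r_hat
    return c * np.array([y, z, x])

def radial_function(r_norm, weights):
    # Stand-in for the learnable radial function R(|r|); here a tiny polynomial with trainable weights.
    return weights[0] + weights[1] * r_norm + weights[2] * r_norm ** 2

def kernel(r, weights):
    # W(r) = R(|r|) * Y_1m(r_hat): the angular part is fixed, only the radial part is learned.
    r_norm = np.linalg.norm(r)
    return radial_function(r_norm, weights) * spherical_harmonics_l1(r / r_norm)

weights = np.array([0.1, -0.3, 0.05])   # toy "learned" radial weights
r = np.array([1.0, 0.5, -2.0])          # displacement between two atoms
print(kernel(r, weights))               # a 3-component (l = 1) filter value

Because the angular dependence is restricted to spherical harmonics, rotating r rotates the l = 1 output by the corresponding rotation matrix, which is exactly the equivariance constraint placed on the kernel.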

Due to these mathematical complexities, Euclidean neural networks can be challenging to implement; however, there are open-source implementations (e.g., e3nn is an open-source PyTorch library that combines the implementations of [6,8] and implements a variety of additional equivariant layers and utility functions for converting and visualizing geometric tensors)^i.

Equivariance has also been encoded in other machine learning approaches, such as kernel methods, by using equivariant kernels and an equivariant definition of covariance [11]. Euclidean neural networks can learn equivariant functions that generate scalar invariants or equivariant kernels for use with these traditional machine learning methods.

Euclidean neural networks can be used to build end-to-end models for the prediction of physical properties (e.g., molecular dynamics forces) from atomic geometries and initial atomic features (e.g., atom types). They can be used to manipulate atomic geometries and craft hierarchical features [12]. The only difference between models for these varied purposes is how equivariant operations and learnable equivariant modules are composed.

Euclidean Neural Networks Naturally Handle Geometric Tensors, Point Groups, and Space Groups
Euclidean neural networks provide a mathematically rigorous framework for articulating scientific questions in terms of geometric tensors and their tensor interactions. Inputs, outputs, and intermediate data are completely specified by their transformation properties.

Geometric tensors take many forms and can represent many different quantities: numerical geometric tensors, atomic orbitals, or projections of geometry. Figures 1C and 2B give additional examples of this variety.

Another particularly useful aspect of handling Euclidean symmetry in full generality is that you get all subgroup symmetries (e.g., 3D point groups, 3D space groups) for free. Figure 2C shows an example of how the output of even randomly initialized Euclidean neural networks will have equal or higher symmetry than the input.

Since these networks intrinsically uphold any and all selection rules that occur in physical systems, they act as 'symmetry compilers' that check your thinking about the data types of your physical system. Using these models requires more forethought than a traditional neural network; in exchange, they cannot learn to do something that does not symmetrically make sense.
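
The 'symmetry compiler' guarantee can be phrased as a simple equivariance unit test. The sketch below (illustrative only; it uses a hand-built toy model rather than an actual Euclidean neural network) checks that a candidate function mapping atomic positions to per-atom vectors commutes with rotation, which is the property such networks satisfy by construction for every choice of weights:

import numpy as np

def toy_vector_model(positions):
    # For each atom, sum displacement vectors to all other atoms, weighted by a scalar
    # function of distance. Built only from equivariant operations, so the output
    # vectors rotate with the input.
    out = np.zeros_like(positions)
    for i in range(len(positions)):
        for j in range(len(positions)):
            if i == j:
                continue
            d = positions[j] - positions[i]
            out[i] += np.exp(-np.linalg.norm(d)) * d
    return out

def random_rotation(rng):
    # QR decomposition of a random matrix gives an orthogonal matrix; fix det = +1.
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1
    return Q

rng = np.random.default_rng(0)
pos = rng.standard_normal((4, 3))    # a toy four-atom geometry
R = random_rotation(rng)

# Equivariance check: rotate-then-evaluate equals evaluate-then-rotate.
print(np.allclose(toy_vector_model(pos @ R.T), toy_vector_model(pos) @ R.T))  # True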

Beginning to Uncover Features of E(3) Equivariant Models
Euclidean symmetry is a simple assumption that has many unintuitive consequences: geometry, geometric tensors, normal modes, selection rules in spectroscopy, space groups, point groups, multipole interactions, second-order phase transitions, and so on. Likewise, the uses for Euclidean symmetry equivariant machine learning models may similarly surprise us in the as-yet-untapped modes of investigation they offer. For example, we recently found that the gradients of Euclidean neural networks can be used to find symmetry-implied 'missing' data unbeknown to the researcher (e.g., order parameters of second-order phase transitions [13]). By bounding learnable functions of physical data to be equivariant to 3D Euclidean symmetry, we can ask more targeted questions with machine learning and explore the full-ranging consequences of our fundamental assumption of symmetry.

Acknowledgments
T.E.S. thanks Lucy Morrell, Sinéad Griffin, Josh Rackers, and Risi Kondor for helpful discussions. T.E.S. was supported by the Laboratory Directed Research and Development Program of Lawrence Berkeley National Laboratory under US Department of Energy Contract No. DE-AC02-05CH11231.

Resources
i. https://doi.org/10.5281/zenodo.3723557

1 Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
2 Center for Advanced Mathematics for Energy Research Applications (CAMERA), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

*Correspondence: tsmidt@lbl.gov (T.E. Smidt).

https://doi.org/10.1016/j.trechm.2020.10.006

© 2020 Elsevier Inc. All rights reserved.


References
1. Klicpera, J. et al. (2020) Directional message passing for molecular graphs. International Conference on Learning Representations
2. Bartók, A.P. et al. (2013) On representing chemical environments. Phys. Rev. B 87, 184115
3. Schütt, K. et al. (2017) SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. Adv. Neural Inf. Proces. Syst. 31, 991–1001
4. Miller, B.K. et al. (2020) Relevance of rotationally equivariant convolutions for predicting molecular properties. arXiv. Published online August 22, 2020. https://arxiv.org/abs/2008.08461
5. Hlinka, J. (2014) Eight types of symmetrically distinct vectorlike physical quantities. Phys. Rev. Lett. 113, 165502
6. Thomas, N. et al. (2018) Tensor field networks: rotation- and translation-equivariant neural networks for 3D point clouds. arXiv. Published online May 18, 2018. https://arxiv.org/abs/1802.08219
7. Kondor, R. et al. (2018) Clebsch–Gordan nets: a fully Fourier space spherical convolutional neural network. Adv. Neural Inf. Proces. Syst. 32, 10117–10126
8. Weiler, M. et al. (2018) 3D steerable CNNs: learning rotationally equivariant features in volumetric data. Adv. Neural Inf. Proces. Syst. 32, 10402–10413
9. Anderson, B.M. et al. (2019) Cormorant: covariant molecular neural networks. Adv. Neural Inf. Proces. Syst. 32, 14510–14519
10. Fuchs, F.B. et al. (2020) SE(3)-Transformers: 3D roto-translation equivariant attention networks. arXiv. Published online June 22, 2020. https://arxiv.org/abs/2006.10503
11. Grisafi, A. et al. (2018) Symmetry-adapted machine learning for tensorial properties of atomistic systems. Phys. Rev. Lett. 120, 036002
12. Eismann, S. et al. (2020) Hierarchical, rotation-equivariant neural networks to predict the structure of protein complexes. arXiv. Published online June 5, 2020. https://arxiv.org/abs/2006.09275
13. Smidt, T.E. et al. (2020) Finding symmetry breaking order parameters with Euclidean neural networks. arXiv. Published online July 4, 2020. https://arxiv.org/abs/2007.02005
