ORIGINAL RESEARCH
published: 21 January 2021
doi: 10.3389/fphy.2020.618224

Weighted Low-Rank Tensor Representation for Multi-View Subspace Clustering

Shuqin Wang 1, Yongyong Chen 2* and Fangying Zheng 3

1 Institute of Information Science, Beijing Jiaotong University, Beijing, China; 2 School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China; 3 Department of Mathematical Sciences, Zhejiang Sci-Tech University, Hangzhou, China

Multi-view clustering has been deeply explored since the compatible and complementary information among views can be well captured. Recently, low-rank tensor representation-based methods have effectively improved clustering performance by exploring high-order correlations between multiple views. However, most of them express the low-rank structure of the self-representation tensor by the sum of the nuclear norms of its unfolded matrices, which may cause a loss of information in the tensor structure. In addition, the amount of effective information differs across views, so it is unreasonable to treat their contributions to clustering equally. To address these issues, we propose a novel weighted low-rank tensor representation (WLRTR) method for multi-view subspace clustering, which encodes the low-rank structure of the representation tensor through Tucker decomposition and weights the core tensor to retain the main information of the views. Under the augmented Lagrangian method framework, an iterative algorithm is designed to solve the WLRTR model. Numerical studies on four real databases demonstrate that WLRTR is superior to eight state-of-the-art clustering methods.

Keywords: clustering, low-rank tensor representation, subspace clustering, Tucker decomposition, multi-view clustering

Edited by: Gang Zhang, Nanjing Normal University, China
Reviewed by: Guangqi Jiang, Dalian Maritime University, China; Wei Yan, Guangdong University of Technology, China
*Correspondence: Yongyong Chen, YongyongChen.cn@gmail.com
Specialty section: This article was submitted to Radiation Detectors and Imaging, a section of the journal Frontiers in Physics
Received: 16 October 2020; Accepted: 07 December 2020; Published: 21 January 2021
Citation: Wang S, Chen Y and Zheng F (2021) Weighted Low-Rank Tensor Representation for Multi-View Subspace Clustering. Front. Phys. 8:618224. doi: 10.3389/fphy.2020.618224

INTRODUCTION

The advance of information technology has unleashed a multi-view feature deluge, which allows data to be described by multiple views. For example, an article can be expressed in multiple languages; an image can be characterized by colors, edges, and textures. Multi-view features not only contain compatible information but also provide complementary information, which boosts the performance of data analysis. Recently, [1] applied multi-view binary learning to obtain supplementary information from multiple views, and [2] proposed a kernelized multi-view subspace analysis method via self-weighted learning. Due to the lack of labels, clustering using multiple views has become a popular research direction [3].

A large number of clustering methods have been developed in the past several decades. The most classic is the k-means method [4–6]. However, it cannot guarantee clustering accuracy since it relies on distances between the original features, which are easily affected by outliers and noise. Many researchers have pointed out that subspace clustering can effectively overcome this problem. As a promising technique, subspace clustering aims to find clusters within different subspaces under the assumption that each data point can be represented as a linear combination of the other samples [7].


The subspace clustering-based methods can be roughly divided into three types: matrix factorization methods [8–11], statistical methods [12] and spectral clustering methods [7,13,14]. The matrix factorization-based methods perform low-rank matrix factorization on the data matrix to achieve clustering, but they are only suitable for noise-free data matrices and thus lose generality. Although the statistical methods can explicitly deal with the influence of outliers or noise, their clustering performance is affected by the number of subspaces, which hinders their practical application. At present, the spectral clustering-based methods are widely used because they deal well with high-dimensional data containing noise and outliers. Two representative examples are sparse subspace clustering (SSC) [13] and low-rank representation (LRR) [7], which obtain a sparse or a low-rank linear representation of the dataset, respectively. When encountering multi-view features, SSC and LRR cannot well discover the high correlation among views. To overcome this limitation, Xia et al. [15] applied LRR to multi-view clustering to learn a low-rank transition probability matrix as the input of the standard Markov chain clustering method. Taking the different types of noise in samples into account, Najafi et al. [16] combined low-rank approximation with error learning to eliminate noise and outliers. The work in [17] used low-rank and sparse constraints for multi-view clustering simultaneously. One common limitation of these methods is that they only capture the pairwise correlation between different views. Considering the possible high-order correlation of multiple views, Zhang et al. [3] proposed a low-rank tensor constraint-regularized multi-view subspace clustering method. The study in [18] was inspired by [3] and introduced a hyper-Laplacian constraint to preserve the geometric structure of the data. Compared with most matrix-based methods [15,17], the tensor-based multi-view clustering methods have achieved satisfactory results, which demonstrates that the high-order correlation of the data is indispensable. The above methods impose the low-rank constraint on the constructed self-representation tensor through the nuclear norms of its unfolding matrices. Unfortunately, this rank-sum tensor norm lacks a clear physical meaning for a general tensor [19].

In this paper, we propose the weighted low-rank tensor representation (WLRTR) method for multi-view subspace clustering. Similar to the above tensor-based methods [3,18], WLRTR stacks the self-representation matrices of all views into a representation tensor and then applies a low-rank constraint on it to obtain the high-order correlation among multiple views. Different from them, we exploit the classic Tucker decomposition to encode the low-rank property, which decomposes the representation tensor into one core tensor and three factor matrices. Considering that the information contained in different views may be partially different, and that the complementary information between views contributes differently to clustering, the proposed WLRTR treats the singular values differently to improve the clustering capability. The main contributions of this paper are summarized as follows:

(1) We propose a weighted low-rank tensor representation (WLRTR) method for multi-view subspace clustering, in which all representation matrices are stored as a representation tensor with two spatial modes and one view mode.
(2) Tucker decomposition is used to calculate the core tensor of the representation tensor, and low-rank constraints are applied to capture the high-order correlation among multiple views and remove redundant information. WLRTR assigns different weights to the singular values in the core tensor so that they are treated according to their importance.
(3) Based on the augmented Lagrangian multiplier method, we design an iterative algorithm to solve the proposed WLRTR model, and conduct experiments on four challenging databases to verify the superiority of the WLRTR method over eight state-of-the-art single-view and multi-view clustering methods.

The remainder of this paper is organized as follows. Section 2 summarizes the notations, basic definitions and related content of subspace clustering. In Section 3, we introduce the proposed WLRTR model and design an iterative algorithm to solve it. Extensive experiments and model analysis are reported in Section 4. The conclusion is summarized in Section 5.

RELATED WORKS

In this section, we introduce the notations and basic definitions used throughout this paper, as well as the framework of subspace clustering methods.

Notations

For a third-order tensor, we use a bold calligraphic letter (e.g., $\mathcal{X}$). Matrices and vectors are represented by upper-case letters (e.g., $X$) and lower-case letters (e.g., $x$), respectively. The elements of a tensor and a matrix are denoted by $\mathcal{X}_{ijk}$ and $x_{ij}$. The $\ell_{2,1}$ norm of a matrix $X$ is defined as $\|X\|_{2,1} = \sum_j \sqrt{\sum_i |x_{ij}|^2} = \sum_j \|x_j\|_2$, where $x_j$ is the $j$-th column of $X$. The Frobenius norm of $X$ is $\|X\|_F = \sqrt{\sum_{ij} |x_{ij}|^2}$. We denote the $i$-th horizontal, lateral and frontal slices by $\mathcal{X}(i, :, :)$, $\mathcal{X}(:, i, :)$ and $\mathcal{X}(:, :, i)$. The Frobenius norm and $\ell_1$ norm of a tensor are $\|\mathcal{X}\|_F = \sqrt{\sum_{ijk} |\mathcal{X}_{ijk}|^2}$ and $\|\mathcal{X}\|_1 = \sum_{ijk} |\mathcal{X}_{ijk}|$. For a 3-mode tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, the Tucker decomposition of $\mathcal{X}$ is $\mathcal{X} = \mathcal{S} \times_1 U_1 \times_2 U_2 \times_3 U_3$, where $\mathcal{S} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is the core tensor and $U_i \in \mathbb{R}^{n_i \times n_i}$ ($i = 1, 2, 3$) are orthogonal factor matrices. The Tucker decomposition will be exploited to depict the low-rank property of the representation tensor.
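To make the Tucker notation concrete, the following minimal numpy sketch (our illustration, not code from the paper; names such as `tucker_hosvd` and `mode_product` are our own) computes a full Tucker decomposition via the higher-order SVD, taking each orthogonal factor from the SVD of the corresponding mode unfolding:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a third-order tensor (mode in {0, 1, 2})."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold: restore the tensor from its mode-n unfolding."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def mode_product(T, U, mode):
    """n-mode product T x_n U (multiply U into the given mode)."""
    shape = list(T.shape)
    shape[mode] = U.shape[0]
    return fold(U @ unfold(T, mode), mode, tuple(shape))

def tucker_hosvd(X):
    """Full Tucker decomposition X = S x1 U1 x2 U2 x3 U3 via HOSVD:
    orthogonal factors from the unfoldings' SVDs, core by back-projection."""
    Us = [np.linalg.svd(unfold(X, m), full_matrices=True)[0] for m in range(3)]
    S = X
    for m, U in enumerate(Us):
        S = mode_product(S, U.T, m)    # S = X x1 U1^T x2 U2^T x3 U3^T
    return S, Us
```

Because the factors are orthogonal, multiplying the core back, `mode_product(mode_product(mode_product(S, Us[0], 0), Us[1], 1), Us[2], 2)`, recovers `X` up to floating-point error; the `mode_product` helper is reused in the optimization sketch later in the paper.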


Subspace Clustering

Subspace clustering is an effective method for clustering high-dimensional data. It divides the original feature space into several subspaces and then imposes constraints on each subspace to construct the similarity matrix. Suppose $X = [x_1, x_2, \dots, x_n] \in \mathbb{R}^{d \times n}$ is a feature matrix with $n$ samples, where $d$ is the dimension of one sample. The subspace clustering model based on LRR is expressed as follows:

$$\min_{Z,E} \|Z\|_* + \beta\|E\|_{2,1} \quad \text{s.t.}\ X = XZ + E, \tag{1}$$

where $\|Z\|_*$ denotes the nuclear norm (the sum of all singular values of $Z$). This model achieves a promising clustering effect because the self-representation matrix $Z$ captures the correlation between samples, from which the final similarity matrix $C = \frac{|Z| + |Z^T|}{2}$ is conveniently obtained. However, the above LRR method is only suitable for single-view clustering. For a dataset $\{X^{(v)}\}_{v=1}^{V}$ with $V$ views, an effective single-view clustering method is usually extended to multi-view clustering:

$$\min_{Z,E^{(v)}} \|Z\|_* + \beta\sum_{v=1}^{V}\|E^{(v)}\|_{2,1} \quad \text{s.t.}\ X^{(v)} = X^{(v)}Z + E^{(v)},\ (v = 1, 2, \dots, V). \tag{2}$$

The LRR-based multi-view method not only improves clustering accuracy but also detects outliers from multiple angles [20]. However, as the number of feature views increases, the above models inevitably suffer from information loss when fusing high-dimensional data views. It is therefore urgent to explore more efficient clustering methods.
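For orientation, once a self-representation matrix $Z$ has been learned by a model such as Eq. 1 or Eq. 2, the final labels are typically obtained by spectral clustering on the similarity matrix $C = (|Z| + |Z^T|)/2$ defined above. The following is a minimal sketch, not code from the paper; in particular, averaging the view-wise representation matrices in the multi-view case is our own assumed (though common) choice:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def affinity_from_representation(Zs):
    """Average the view-wise representation matrices (assumed fusion rule),
    then symmetrize: C = (|Z| + |Z^T|) / 2."""
    Z = sum(Zs) / len(Zs)
    return (np.abs(Z) + np.abs(Z.T)) / 2

def labels_from_affinity(C, n_clusters):
    """Spectral clustering on a precomputed affinity matrix."""
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=0)
    return sc.fit_predict(C)
```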


WEIGHTED LOW-RANK TENSOR REPRESENTATION MODEL

In this section, we first introduce an existing tensor-based multi-view clustering method and then propose the novel weighted low-rank tensor representation (WLRTR) method. Finally, WLRTR is solved by the augmented Lagrangian multiplier (ALM) method.

Model Formulation

In order to make full use of the compatible and complementary information among multiple views, Zhang et al. [3] used LRR to perform tensor-based multi-view clustering. The main idea is to stack the self-representation matrix of each view as a frontal slice of a third-order representation tensor, on which a low-rank constraint is imposed. The whole model is formulated as follows:

$$\min_{\mathcal{Z},E} \|\mathcal{Z}\|_* + \beta\|E\|_{2,1} \quad \text{s.t.}\ X^{(v)} = X^{(v)}Z^{(v)} + E^{(v)},\ (v = 1,2,\dots,V),\quad \mathcal{Z} = \Phi(Z^{(1)},Z^{(2)},\dots,Z^{(V)}),\ E = [E^{(1)};E^{(2)};\dots;E^{(V)}], \tag{3}$$

where the tensor nuclear norm is directly extended from the matrix nuclear norm: $\|\mathcal{Z}\|_* = \sum_{m=1}^{3}\xi_m\|Z_{(m)}\|_*$, in which $Z_{(m)}$ is the unfolding matrix along the $m$-th mode and $\xi_m$ is a constant satisfying $\sum_{m=1}^{3}\xi_m = 1$, $\xi_m > 0$. However, this rank-sum tensor norm lacks a clear physical meaning for a general tensor [19]. In addition, the meaningful information contained in each view is not completely equal, so it is unreasonable to use the same weight to penalize the singular values of $Z_{(m)}$ in the tensor nuclear norm. To overcome these limitations, we propose a novel weighted low-rank tensor representation (WLRTR) method, which uses Tucker decomposition to simplify the calculation of the tensor nuclear norm and assigns different weights to the core tensor to take advantage of the main information in different views. The proposed WLRTR is formulated as:

$$\min_{\mathcal{Z},\mathcal{S},E,U_1,U_2,U_3} \|\mathcal{Z}-\mathcal{S}\times_1 U_1\times_2 U_2\times_3 U_3\|_F^2 + \alpha\|\omega\ast\mathcal{S}\|_1 + \beta\|E\|_{2,1} \quad \text{s.t.}\ X^{(v)} = X^{(v)}Z^{(v)} + E^{(v)},\ (v=1,2,\dots,V),\quad \mathcal{Z} = \Phi(Z^{(1)},Z^{(2)},\dots,Z^{(V)}),\ E = [E^{(1)};E^{(2)};\dots;E^{(V)}], \tag{4}$$

where $\omega = c/(|\sigma(\mathcal{Z})| + \epsilon)$ with constants $c$ and $\epsilon$, $\ast$ denotes the elementwise weighted product, and $\alpha$ and $\beta$ are two nonnegative parameters. The WLRTR model consists of three parts: the first term obtains the core tensor through Tucker decomposition; the second term weights the core tensor to preserve the main feature information and uses the $\ell_1$ norm to encode the low-rank structure of the self-representation tensor; and, since the errors are sample-specific, the third term uses the $\ell_{2,1}$ norm to encourage column sparsity and eliminate noise and outliers.

Optimization of WLRTR

We use the ALM to solve the proposed WLRTR model in Eq. 4. Since the variable $\mathcal{Z}$ appears in both the objective function and the constraints, the model is difficult to solve directly. We therefore apply the variable-splitting technique and introduce one auxiliary tensor variable $\mathcal{Y}$, so that Eq. 4 is transformed into the following formulation:

$$\min_{\mathcal{Z},\mathcal{S},E,U_1,U_2,U_3} \|\mathcal{Z}-\mathcal{S}\times_1 U_1\times_2 U_2\times_3 U_3\|_F^2 + \alpha\|\omega\ast\mathcal{S}\|_1 + \beta\|E\|_{2,1} \quad \text{s.t.}\ X^{(v)} = X^{(v)}Y^{(v)} + E^{(v)},\ (v=1,2,\dots,V),\quad \mathcal{Z} = \Phi(Z^{(1)},\dots,Z^{(V)}),\ E = [E^{(1)};\dots;E^{(V)}],\ \mathcal{Z} = \mathcal{Y}. \tag{5}$$

Correspondingly, the augmented Lagrangian function of the constrained model in Eq. 5 is

$$\mathcal{L}(\mathcal{Z},\mathcal{Y},\mathcal{S},E;U_1,U_2,U_3,\Theta,\Pi,\rho) = \|\mathcal{Z}-\mathcal{S}\times_1 U_1\times_2 U_2\times_3 U_3\|_F^2 + \alpha\|\omega\ast\mathcal{S}\|_1 + \beta\|E\|_{2,1} + \frac{\rho}{2}\left(\sum_{v=1}^{V}\left\|X^{(v)} - X^{(v)}Y^{(v)} - E^{(v)} + \frac{\Theta^{(v)}}{\rho}\right\|_F^2 + \left\|\mathcal{Z} - \mathcal{Y} + \frac{\Pi}{\rho}\right\|_F^2\right), \tag{6}$$

where $\Theta$ and $\Pi$ are the Lagrange multipliers and $\rho > 0$ is the penalty parameter. Each variable is then updated iteratively with the other variables fixed. The detailed iteration procedure is as follows.

Update self-representation tensor $\mathcal{Z}$: When the other variables are fixed, $\mathcal{Z}$ can be updated by

$$\mathcal{Z}^* = \arg\min_{\mathcal{Z}} \|\mathcal{Z}-\mathcal{S}\times_1 U_1\times_2 U_2\times_3 U_3\|_F^2 + \frac{\rho}{2}\left\|\mathcal{Z} - \mathcal{Y} + \frac{\Pi}{\rho}\right\|_F^2. \tag{7}$$

By setting the derivative of Eq. 7 with respect to $\mathcal{Z}$ to zero, we have

$$\mathcal{Z}^* = \frac{2\,\mathcal{S}\times_1 U_1\times_2 U_2\times_3 U_3 + \rho\mathcal{Y} - \Pi}{2+\rho}. \tag{8}$$

Update auxiliary variable $Y^{(v)}$: Updating $Y^{(v)}$ with the remaining variables fixed is equivalent to optimizing

$$Y^{(v)*} = \arg\min_{Y^{(v)}} \frac{\rho}{2}\left(\left\|X^{(v)} - X^{(v)}Y^{(v)} - E^{(v)} + \frac{\Theta^{(v)}}{\rho}\right\|_F^2 + \left\|Z^{(v)} - Y^{(v)} + \frac{\Pi^{(v)}}{\rho}\right\|_F^2\right). \tag{9}$$

The closed form of $Y^{(v)*}$ is obtained by setting the derivative of Eq. 9 to zero:

$$Y^{(v)*} = \left(\rho I + \rho X^{(v)T}X^{(v)}\right)^{-1}\left(\rho X^{(v)T}X^{(v)} - \rho X^{(v)T}E^{(v)} + X^{(v)T}\Theta^{(v)} + \rho Z^{(v)} + \Pi^{(v)}\right). \tag{10}$$

Update core tensor $\mathcal{S}$: By fixing the other variables, the subproblem of updating $\mathcal{S}$ can be written as follows:

$$\mathcal{S}^* = \arg\min_{\mathcal{S}} \|\mathcal{Z}-\mathcal{S}\times_1 U_1\times_2 U_2\times_3 U_3\|_F^2 + \alpha\|\omega\ast\mathcal{S}\|_1. \tag{11}$$

According to [21], Eq. 11 can be rewritten as

$$\mathcal{S}^* = \arg\min_{\mathcal{S}} \|\mathcal{S}-\mathcal{O}\|_F^2 + \alpha\|\omega\ast\mathcal{S}\|_1, \tag{12}$$

where $\mathcal{O} = \mathcal{Z}\times_1 U_1^T\times_2 U_2^T\times_3 U_3^T$. The closed-form solution $\mathcal{S}^*$ is

$$\mathcal{S}^* = \operatorname{sign}(\mathcal{O})\,\max\!\left(|\mathcal{O}| - \frac{\omega\alpha}{2},\, 0\right). \tag{13}$$

Update error matrix $E$: Similar to the subproblems of $\mathcal{Z}$, $Y^{(v)}$ and $\mathcal{S}$, the subproblem of $E$ is expressed as:

$$E^* = \arg\min_{E} \beta\|E\|_{2,1} + \frac{\rho}{2}\sum_{v=1}^{V}\left\|X^{(v)} - X^{(v)}Y^{(v)} - E^{(v)} + \frac{\Theta^{(v)}}{\rho}\right\|_F^2 = \arg\min_{E} \frac{\beta}{\rho}\|E\|_{2,1} + \frac{1}{2}\|E-F\|_F^2, \tag{14}$$

where $F$ is the matrix that vertically concatenates the matrices $X^{(v)} - X^{(v)}Y^{(v)} + (1/\rho)\Theta^{(v)}$ along the column. The $j$-th column of the optimal solution $E^*$ is

$$E^*(:,j) = \begin{cases} \dfrac{\|F(:,j)\|_2 - \beta/\rho}{\|F(:,j)\|_2}\,F(:,j), & \text{if } \beta/\rho < \|F(:,j)\|_2;\\[2mm] 0, & \text{otherwise.} \end{cases} \tag{15}$$

Update Lagrangian multipliers $\Theta$, $\Pi$ and penalty parameter $\rho$: These can be updated by

$$\Theta^{(v)} \leftarrow \Theta^{(v)} + \rho\left(X^{(v)} - X^{(v)}Y^{(v)} - E^{(v)}\right);\quad \Pi \leftarrow \Pi + \rho(\mathcal{Z}-\mathcal{Y});\quad \rho \leftarrow \min(\lambda\rho,\,\rho_{\max}), \tag{16}$$

where $\lambda > 0$ facilitates the convergence speed [22]. In order to increase $\rho$, we set $\lambda = 1.5$; $\rho_{\max}$ is the maximum value of the penalty parameter $\rho$. The WLRTR algorithm is summarized in Algorithm 1; a numpy sketch of these closed-form updates is given at the end of this section.

Algorithm 1: WLRTR for multi-view subspace clustering
Input: multi-view features $\{X^{(v)}, v = 1, 2, \dots, V\}$; parameters $\alpha$, $\beta$, $c$;
Initialize: $\mathcal{Z}$, $\mathcal{S}$, $E$, $\Theta_1$, $\Pi_1$ initialized to 0; $\rho_1 = 10^{-3}$, $\lambda = 1.5$, $tol = 10^{-7}$, $t = 1$;
1: while not converged do
2:   Update $\mathcal{Z}_{t+1}$ according to Eq. 8;
3:   Update $Y^{(v)}_{t+1}$ according to Eq. 10;
4:   Update $\mathcal{S}_{t+1}$ according to Eq. 13;
5:   Update $E_{t+1}$ according to Eq. 15;
6:   Update $\Theta^{(v)}_{t+1}$, $\Pi_{t+1}$ and $\rho_{t+1}$ according to Eq. 16;
7:   Check the convergence conditions:
8:     $\|X^{(v)} - X^{(v)}Y^{(v)} - E^{(v)}\|_\infty \le tol$, $\|Y^{(v)} - Z^{(v)}\|_\infty \le tol$;
9: end while
Output: $\mathcal{Z}^*$ and $\mathcal{Y}^*$.

EXPERIMENTAL RESULTS

In this section, we conduct experiments on four real databases and compare WLRTR with eight state-of-the-art methods to verify its effectiveness. In addition, we report a detailed analysis of the parameter selection and the convergence behavior of the proposed WLRTR method.

Experimental Settings

(1) Datasets: We evaluate the performance of WLRTR on three categories of databases: news stories (BBC4view, BBCSport), face images (ORL) and handwritten digits (UCI-3views). BBC4view contains 685 documents and BBCSport contains 544 documents, both belonging to 5 clusters; we use 4 and 2 features, respectively, to construct the multi-view data. ORL includes 400 face images from 40 clusters; we use 3 features on ORL, i.e., 4096d (dimension, d) intensity, 3304d LBP and 6750d Gabor. UCI-3views includes 2000 instances from 10 clusters; we adopt the 240d Fourier coefficients, the 76d pixel averages and the 6d morphological features to construct 3 views. Some examples of ORL and UCI-3views are shown in Figure 1. Table 1 summarizes the statistics of these four databases.
(2) Compared methods: We compare WLRTR with eight state-of-the-art methods, including three single-view clustering methods and five multi-view clustering methods. Single-view clustering methods: SSC [13], LRR [7] and LSR [23], which use the l1 norm, the nuclear norm and least squares regression, respectively, to learn a self-representation matrix.
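Before turning to the experiments, the following numpy sketch makes the closed-form updates of Eqs. 8, 10, 13, 15 and 16 concrete. It is our illustration rather than the authors' reference code: it reuses the `mode_product` helper from the Tucker sketch in the Notations section, and it reads the weight $\omega = c/(|\sigma(\mathcal{Z})| + \epsilon)$ as acting on the magnitudes of the core coefficients $\mathcal{O}$, which is one plausible interpretation.

```python
import numpy as np

def update_Z(S, Us, Y, Pi, rho):
    """Eq. 8: closed-form update of the representation tensor Z."""
    recon = S
    for m, U in enumerate(Us):                # S x1 U1 x2 U2 x3 U3
        recon = mode_product(recon, U, m)
    return (2 * recon + rho * Y - Pi) / (2 + rho)

def update_Yv(Xv, Ev, Zv, Theta_v, Pi_v, rho):
    """Eq. 10: closed-form update of the auxiliary variable Y^(v)."""
    n = Xv.shape[1]
    A = rho * np.eye(n) + rho * Xv.T @ Xv
    b = (rho * Xv.T @ Xv - rho * Xv.T @ Ev
         + Xv.T @ Theta_v + rho * Zv + Pi_v)
    return np.linalg.solve(A, b)

def update_S(Z, Us, alpha, c, eps=1e-6):
    """Eqs. 12-13: weighted soft-thresholding of the back-projected core
    O = Z x1 U1^T x2 U2^T x3 U3^T (weight interpretation is our assumption)."""
    O = Z
    for m, U in enumerate(Us):
        O = mode_product(O, U.T, m)
    w = c / (np.abs(O) + eps)                 # omega = c / (|sigma| + eps)
    return np.sign(O) * np.maximum(np.abs(O) - alpha * w / 2, 0.0)

def update_E(F, beta, rho):
    """Eq. 15: column-wise shrinkage, the proximal operator of the l2,1 norm."""
    norms = np.linalg.norm(F, axis=0)
    scale = np.maximum(norms - beta / rho, 0.0) / np.maximum(norms, 1e-12)
    return F * scale

def update_multipliers(Thetas, Pi, Xs, Ys, Es, Z, Y, rho, lam=1.5, rho_max=1e10):
    """Eq. 16: dual ascent on the multipliers and growth of the penalty."""
    Thetas = [Th + rho * (Xv - Xv @ Yv - Ev)
              for Th, Xv, Yv, Ev in zip(Thetas, Xs, Ys, Es)]
    Pi = Pi + rho * (Z - Y)
    return Thetas, Pi, min(lam * rho, rho_max)
```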


  FIGURE 1 | Samples of (A) ORL and (B) UCI-3views databases.

TABLE 1 | Information of four real multi-view databases.

Categories            Databases     Instance    Cluster    View 1    View 2    View 3    View 4    View 5

News stories          BBC4view      685         5          4659d     4633d     4655d     4684d     —
News stories          BBCSport      544         5          3183d     3203d     —         —         —
Face images           ORL           400         40         4096d     3304d     6750d     —         —
Handwritten digits    UCI-3views    2000        10         240d      76d       6d        —         —

TABLE 2 | Clustering results on ORL database.

Method                    ACC                      NMI                      AR                     F-score               Precision                     Recall

SSC                  0.765   ±   0.008        0.893   ±   0.007     0.694   ±   0.013            0.682   ±   0.012     0.673   ±   0.007            0.764   ±   0.005
LRR                  0.773   ±   0.003        0.895   ±   0.006     0.724   ±   0.020            0.731   ±   0.004     0.701   ±   0.001            0.754   ±   0.002
LSR                  0.787   ±   0.029        0.904   ±   0.010     0.719   ±   0.026            0.726   ±   0.025     0.684   ±   0.029            0.774   ±   0.024
RMSC                 0.723   ±   0.007        0.872   ±   0.012     0.645   ±   0.003            0.654   ±   0.007     0.607   ±   0.009            0.709   ±   0.004
LT-MSC               0.795   ±   0.007        0.930   ±   0.003     0.750   ±   0.003            0.768   ±   0.004     0.766   ±   0.009            0.837   ±   0.005
MLAN                 0.705   ±   0.022        0.854   ±   0.018     0.384   ±   0.010            0.376   ±   0.015     0.254   ±   0.021            0.721   ±   0.020
GMC                  0.633   ±   0.000        0.857   ±   0.000     0.337   ±   0.000            0.360   ±   0.000     0.232   ±   0.000            0.801   ±   0.000
AWP                  0.753   ±   0.000        0.908   ±   0.000     0.697   ±   0.000            0.705   ±   0.000     0.615   ±   0.000            0.824   ±   0.000
SMSC                 0.728   ±   0.000        0.885   ±   0.000     0.660   ±   0.000            0.669   ±   0.000     0.601   ±   0.000            0.755   ±   0.000
WLRTR                0.846   ±   0.019        0.920   ±   0.007     0.776   ±   0.016            0.781   ±   0.016     0.751   ±   0.022            0.815   ±   0.010

Multi-view clustering methods: RMSC [15] utilizes low-rank and sparse matrix decomposition to learn a shared transition probability matrix; LT-MSC [3] is the first tensor-based multi-view clustering method, which uses a tensor nuclear norm constraint to learn a representation tensor; MLAN [24] performs clustering and local structure learning with adaptive neighbors simultaneously; GMC [25] is a graph-based multi-view clustering method; AWP [26] is a multi-view clustering method via adaptively weighted Procrustes; and SMSC [27] uses nonnegative embedding and spectral embedding for multi-view clustering.
(3) Evaluation metrics: We exploit six popular clustering metrics, namely accuracy (ACC), normalized mutual information (NMI), adjusted rand index (AR), F-score, Precision and Recall, to comprehensively evaluate clustering performance; a minimal sketch of computing these metrics is given at the end of this subsection. The closer the values of all evaluation metrics are to 1, the better the clustering results. We run 10 trials for each experiment and report the average performance.

Experimental Results

Tables 2–5 report the clustering performance of all comparison methods on the four databases. The best results are highlighted in bold and the second best results are underlined. From the four tables we can draw the following conclusions. Overall, the proposed WLRTR method achieves better or comparable clustering results over all competing methods on all databases. Especially on the BBC4view database, WLRTR outperforms all competing methods on all six metrics. In terms of ACC, the proposed WLRTR is higher than all compared methods on all datasets. In particular, WLRTR shows better results than the single-view clustering methods SSC, LRR and LSR in most cases, because the multi-view clustering methods fully capture the complementary information among multiple views. These observations verify the effectiveness of the proposed WLRTR method.

On the ORL database, the proposed WLRTR and LT-MSC methods achieve the best clustering performance among all comparison methods, which shows that tensor-based clustering methods can well explore the high-order correlation of multi-view features. Compared with the LT-MSC method, WLRTR improves the ACC, AR and F-score metrics by 5.1%, 2.6% and 1.3%, respectively.
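As referenced in the settings above, a minimal sketch of the evaluation metrics follows. It is our illustration: ACC uses the standard Hungarian matching between predicted and true labels, labels are assumed to be 0-indexed integers, and NMI and AR come directly from scikit-learn.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

def clustering_accuracy(y_true, y_pred):
    """ACC: best one-to-one mapping of cluster labels to ground truth."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    k = int(max(y_true.max(), y_pred.max())) + 1
    count = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        count[t, p] += 1
    rows, cols = linear_sum_assignment(-count)   # maximize total agreement
    return count[rows, cols].sum() / len(y_true)

def evaluate(y_true, y_pred):
    """ACC, NMI and adjusted rand index for one clustering run."""
    return {"ACC": clustering_accuracy(y_true, y_pred),
            "NMI": normalized_mutual_info_score(y_true, y_pred),
            "AR": adjusted_rand_score(y_true, y_pred)}
```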


TABLE 3 | Clustering results on UCI-3views database.

Method                    ACC                    NMI                 AR                        F-score               Precision                 Recall

SSC                  0.815   ±   0.011       0.840   ±   0.001   0.770   ±   0.005          0.794   ±   0.004      0.747   ±   0.010        0.848   ±   0.004
LRR                  0.871   ±   0.001       0.768   ±   0.002   0.736   ±   0.002          0.763   ±   0.002      0.759   ±   0.002        0.767   ±   0.002
LSR                  0.819   ±   0.000       0.863   ±   0.000   0.787   ±   0.000          0.810   ±   0.000      0.756   ±   0.000        0.872   ±   0.000
RMSC                 0.915   ±   0.024       0.822   ±   0.008   0.789   ±   0.014          0.811   ±   0.012      0.797   ±   0.017        0.826   ±   0.006
LT-MSC               0.803   ±   0.001       0.775   ±   0.001   0.725   ±   0.001          0.753   ±   0.001      0.739   ±   0.001        0.767   ±   0.001
MLAN                 0.874   ±   0.000       0.910   ±   0.000   0.847   ±   0.000          0.864   ±   0.000      0.797   ±   0.000        0.943   ±   0.000
GMC                  0.736   ±   0.000       0.815   ±   0.000   0.678   ±   0.000          0.713   ±   0.000      0.644   ±   0.000        0.799   ±   0.000
AWP                  0.806   ±   0.000       0.842   ±   0.000   0.759   ±   0.000          0.785   ±   0.000      0.734   ±   0.000        0.842   ±   0.000
SMSC                 0.734   ±   0.000       0.779   ±   0.000   0.666   ±   0.000          0.700   ±   0.000      0.700   ±   0.000        0.734   ±   0.000
WLRTR                0.917   ±   0.001       0.846   ±   0.001   0.828   ±   0.001          0.845   ±   0.001      0.842   ±   0.001        0.848   ±   0.001

TABLE 4 | Clustering results on BBC4view database.

Method                    ACC                    NMI                 AR                        F-score               Precision                 Recall

SSC                  0.660   ±   0.002       0.494   ±   0.005   0.470   ±   0.001          0.599   ±   0.001      0.578   ±   0.001        0.622   ±   0.001
LRR                  0.802   ±   0.000       0.568   ±   0.000   0.621   ±   0.000          0.712   ±   0.000      0.697   ±   0.000        0.727   ±   0.000
LSR                  0.815   ±   0.001       0.589   ±   0.001   0.608   ±   0.002          0.699   ±   0.001      0.701   ±   0.001        0.697   ±   0.001
RMSC                 0.775   ±   0.003       0.616   ±   0.004   0.560   ±   0.002          0.656   ±   0.002      0.703   ±   0.003        0.616   ±   0.001
LT-MSC               0.591   ±   0.000       0.442   ±   0.005   0.400   ±   0.001          0.546   ±   0.000      0.525   ±   0.000        0.570   ±   0.001
MLAN                 0.853   ±   0.007       0.698   ±   0.010   0.716   ±   0.005          0.783   ±   0.004      0.776   ±   0.003        0.790   ±   0.004
GMC                  0.693   ±   0.000       0.563   ±   0.000   0.479   ±   0.000          0.633   ±   0.000      0.501   ±   0.000        0.860   ±   0.000
AWP                  0.904   ±   0.000       0.761   ±   0.000   0.797   ±   0.000          0.845   ±   0.000      0.838   ±   0.000        0.851   ±   0.000
SMSC                 0.816   ±   0.000       0.601   ±   0.000   0.608   ±   0.000          0.709   ±   0.000      0.648   ±   0.000        0.781   ±   0.000
WLRTR                0.931   ±   0.003       0.805   ±   0.001   0.851   ±   0.002          0.886   ±   0.002      0.885   ±   0.001        0.887   ±   0.002

TABLE 5 | Clustering results on BBCSport database.

Method                    ACC                    NMI                 AR                        F-score               Precision                 Recall

SSC                  0.627   ±   0.003       0.534   ±   0.008   0.364   ±   0.007          0.565   ±   0.005      0.427   ±   0.004        0.834   ±   0.004
LRR                  0.836   ±   0.001       0.698   ±   0.002   0.705   ±   0.001          0.776   ±   0.001      0.768   ±   0.001        0.784   ±   0.001
LSR                  0.846   ±   0.002       0.629   ±   0.002   0.625   ±   0.003          0.719   ±   0.001      0.685   ±   0.002        0.756   ±   0.001
RMSC                 0.826   ±   0.001       0.666   ±   0.001   0.637   ±   0.001          0.719   ±   0.001      0.766   ±   0.001        0.677   ±   0.001
LT-MSC               0.460   ±   0.046       0.222   ±   0.028   0.167   ±   0.043          0.428   ±   0.014      0.328   ±   0.028        0.629   ±   0.053
MLAN                 0.721   ±   0.000       0.779   ±   0.000   0.591   ±   0.000          0.714   ±   0.000      0.567   ±   0.000        0.962   ±   0.000
GMC                  0.807   ±   0.000       0.760   ±   0.000   0.722   ±   0.000          0.794   ±   0.000      0.727   ±   0.000        0.875   ±   0.000
AWP                  0.809   ±   0.000       0.723   ±   0.000   0.726   ±   0.000          0.796   ±   0.000      0.743   ±   0.000        0.857   ±   0.000
SMSC                 0.787   ±   0.000       0.715   ±   0.000   0.679   ±   0.000          0.762   ±   0.000      0.701   ±   0.000        0.835   ±   0.000
WLRTR                0.880   ±   0.002       0.736   ±   0.002   0.747   ±   0.001          0.806   ±   0.001      0.822   ±   0.001        0.791   ±   0.001

The main reason is that WLRTR takes the different contributions of each view to the construction of the affinity matrix into consideration and assigns weights to retain the important information. In addition, WLRTR uses Tucker decomposition to impose the low-rank constraint on the core tensor instead of directly computing the matrix-based tensor nuclear norm.

On the UCI-3views and BBCSport databases, although MLAN is better than WLRTR on some metrics, its clustering results across databases are unstable, and on the ORL database they are even lower than those of all single-view clustering methods. In addition, the results of the recently proposed GMC method on the four databases are not satisfactory. The reason may be that the graph-based clustering methods MLAN and GMC use the original features to construct the affinity matrix, and the original features are often corrupted by noise and outliers.

Model Analysis

In this section, we conduct the parameter selection and convergence analysis of the proposed WLRTR method.

Parameter Selection

We perform experiments on the ORL and BBCSport databases to investigate the influence of the three parameters α, β and c on the proposed WLRTR method, where α and β are empirically selected from [0.001, 0.1] and c is selected from [0.01, 0.2]. The influence of α and β on ACC is shown in the first column of Figure 2. After fixing c, we find that WLRTR achieves the best result when α is set to a larger value, which indicates that noise has a considerable impact on clustering. Similarly, we fix α and β to analyze the influence of c on ACC. As shown in the second column of Figure 2, ACC reaches its maximum at c = 0.2 on the ORL database, while on the BBCSport database ACC peaks at c = 0.04 and decreases as c becomes larger.


  FIGURE 2 | ACC versus different values of parameters α and β (left), and parameter c (right) on (A) ORL and (B) BBCSport.

  FIGURE 3 | Errors versus iterations on (A) ORL and (B) BBCSport databases.

The results show that c is very important for the weight distribution of the core tensor.

Numerical Convergence

This subsection investigates the numerical convergence of the proposed WLRTR method. Figure 3 shows the iterative error curves on the ORL and BBCSport databases. The iterative errors are calculated by $\|X^{(v)} - X^{(v)}Y^{(v)} - E^{(v)}\|_\infty$ and $\|Y^{(v)} - Z^{(v)}\|_\infty$. It can be seen that the error curves gradually decrease as the iterations proceed and are close to 0 after 25 iterations. In addition, the error curves stabilize after only a few fluctuations. These observations show that the proposed WLRTR method has strong numerical convergence; similar conclusions can be obtained on the BBC4view and UCI-3views databases.
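For completeness, a minimal sketch of the stopping test whose residuals are plotted in Figure 3, mirroring lines 7–8 of Algorithm 1 (our illustration; the per-view matrices are assumed to be stored in Python lists):

```python
import numpy as np

def converged(Xs, Ys, Es, Zs, tol=1e-7):
    """Infinity-norm residuals of the reconstruction constraint
    X^(v) = X^(v) Y^(v) + E^(v) and the splitting constraint Y^(v) = Z^(v)."""
    rec = max(np.max(np.abs(X - X @ Y - E)) for X, Y, E in zip(Xs, Ys, Es))
    spl = max(np.max(np.abs(Y - Z)) for Y, Z in zip(Ys, Zs))
    return rec <= tol and spl <= tol
```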


CONCLUSION AND FUTURE WORK

In this paper, we developed a novel clustering method called weighted low-rank tensor representation (WLRTR) for multi-view subspace clustering. The main advantage of WLRTR is that it encodes the low-rank structure of the representation tensor through Tucker decomposition and the l1 norm, avoiding the error of approximating the tensor nuclear norm by the sum of nuclear norms of the unfolded matrices, and that it assigns different weights to the core tensor to exploit the main feature information of the views. In addition, the l2,1 norm is used to remove sample-specific noise and outliers. Extensive experimental results showed that the proposed WLRTR method has promising clustering performance. In future work, we will further explore the structural information of the representation tensor and use other tensor decompositions to improve the performance of multi-view clustering.

DATA AVAILABILITY STATEMENT

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS

SW conducted extensive experiments and wrote this paper. YC proposed the basic idea and revised this paper. FZ contributed to the multi-view clustering experiments and funding support.

FUNDING

This work was supported by the Natural Science Foundation of Zhejiang Province, China (Grant No. LY19A010025).

REFERENCES

1. Jiang G, Wang H, Peng J, Chen D, Fu X. Graph-based multi-view binary learning for image clustering (2019). Available at: https://arxiv.org/abs/1912.05159
2. Wang H, Wang Y, Zhang Z, Fu X, Li Z, Xu M, et al. Kernelized multiview subspace analysis by self-weighted learning. IEEE Trans Multimedia (2020). doi:10.1109/TMM.2020.3032023
3. Zhang C, Fu H, Liu S, Liu G, Cao X. Low-rank tensor constrained multiview subspace clustering. In: 2015 IEEE International Conference on Computer Vision (ICCV); December 7–13, 2015; Santiago, Chile. IEEE (2015). p. 1582–1590. doi:10.1109/ICCV.2015.185
4. Bickel S, Scheffer T. Multi-view clustering. In: Fourth IEEE International Conference on Data Mining (ICDM'04); November 1–4, 2004; Brighton, UK. IEEE (2004). p. 19–26. doi:10.1109/ICDM.2004.10095
5. MacQueen JB. Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability; Berkeley, CA. University of California Press (1967). p. 281–297.
6. Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, et al. Top 10 algorithms in data mining. Knowl Inf Syst (2008) 14:1–37. doi:10.1007/s10115-007-0114-2
7. Liu G, Lin Z, Yan S, Sun J, Yu Y, Ma Y. Robust recovery of subspace structures by low-rank representation. IEEE Trans Pattern Anal Mach Intell (2013) 35:171–184. doi:10.1109/TPAMI.2012.88
8. Gao J, Han J, Liu J. Multi-view clustering via joint nonnegative matrix factorization. In: Proceedings of the 2013 SIAM International Conference on Data Mining; May 2–4, 2013; Austin, TX (2013). p. 252–260. doi:10.1137/1.9781611972832.28
9. Shao W, He L, Yu P. Multiple incomplete views clustering via weighted nonnegative matrix factorization with l2,1 regularization. Proc Joint Eur Conf Mach Learn Knowl Disc Data (2015) 9284:318–334. doi:10.1007/978-3-319-23528-8_20
10. Zong L, Zhang X, Zhao L, Yu H, Zhao Q. Multi-view clustering via multi-manifold regularized non-negative matrix factorization. Neural Networks (2017) 88:74–89. doi:10.1016/j.neunet.2017.02.003
11. Huang S, Kang Z, Xu Z. Auto-weighted multiview clustering via deep matrix decomposition. Pattern Recognit (2020) 97:107015. doi:10.1016/j.patcog.2019.107015
12. Rao SR, Tron R, Vidal R, Ma Y. Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition; June 23–28, 2008; Anchorage, AK (2008). p. 1–8. doi:10.1109/CVPR.2008.4587437
13. Elhamifar E, Vidal R. Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans Pattern Anal Mach Intell (2013) 35:2765–2781. doi:10.1109/TPAMI.2013.57
14. Chen Y, Xiao X, Zhou Y. Jointly learning kernel representation tensor and affinity matrix for multi-view clustering. IEEE Trans Multimedia (2020) 22:1985–1997. doi:10.1109/TMM.2019.2952984
15. Xia R, Pan Y, Du L, Yin J. Robust multi-view spectral clustering via low-rank and sparse decomposition. Proc AAAI Conf Artif Intell (2014) 28:2149–2155.
16. Najafi M, He L, Yu PS. Error-robust multi-view clustering. In: Proceedings of the IEEE International Conference on Big Data (2017). p. 736–745. doi:10.1109/BigData.2017.8257989
17. Brbic M, Kopriva I. Multi-view low-rank sparse subspace clustering. Pattern Recognit (2018) 73:247–258. doi:10.1016/j.patcog.2017.08.024
18. Lu G, Yu Q, Wang Y, Tang G. Hyper-Laplacian regularized multi-view subspace clustering with low-rank tensor constraint. Neural Networks (2020) 125:214–223. doi:10.1016/j.neunet.2020.02.014
19. Xie Y, Tao D, Zhang W, Liu Y, Zhang L, Qu Y. On unifying multi-view self-representations for clustering by tensor multi-rank minimization. Int J Comput Vis (2018) 126:1157–1179. doi:10.1007/s11263-018-1086-2
20. Li S, Shao M, Fu Y. Multi-view low-rank analysis with applications to outlier detection. ACM Trans Knowl Discov Data (2018) 12:1–22. doi:10.1145/3168363
21. Chang Y, Yan L, Fang H, Zhong S, Zhang Z. Weighted low-rank tensor recovery for hyperspectral image restoration. Comput Vis Pattern Recognit (2017) 50:4558–4572. doi:10.1109/TCYB.2020.2983102
22. Chen Y, Guo Y, Wang Y, Wang D, Peng C, He G. Denoising of hyperspectral images using nonconvex low rank matrix approximation. IEEE Trans Geosci Remote Sens (2017) 55:5366–5380. doi:10.1109/TGRS.2017.2706326
23. Lu C-Y, Min H, Zhao Z-Q, Zhu L, Huang D-S, Yan S. Robust and efficient subspace segmentation via least squares regression. Proc Eur Conf Comput Vis (2012) 7578:347–360. doi:10.1007/978-3-642-33786-4_26
24. Nie F, Cai G, Li J, Li X. Auto-weighted multi-view learning for image clustering and semi-supervised classification. IEEE Trans Image Process (2018) 27:1501–1511. doi:10.1109/TIP.2017.2754939
25. Wang H, Yang Y, Liu B. GMC: graph-based multi-view clustering. IEEE Trans Knowl Data Eng (2019) 32:1116–1129. doi:10.1109/TKDE.2019.2903810
26. Nie F, Tian L, Li X. Multiview clustering via adaptively weighted Procrustes. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; July 2018 (2018). p. 2022–2030. doi:10.1145/3219819.3220049
27. Hu Z, Nie F, Wang R, Li X. Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding. Information Fusion (2020) 55:251–259. doi:10.1016/j.inffus.2019.09.005

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2021 Wang, Chen and Zheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
