COREDIAG: Eliminating Redundancy in Constraint Sets


Alexander Felfernig (1), Christoph Zehentner (1), Paul Blazek (2)

(1) Graz University of Technology, Inffeldgasse 16b, 8010 Graz, Austria
    alexander.felfernig@ist.tugraz.at
    christoph.zehentner@ist.tugraz.at
(2) cyLEDGE Media GmbH, Schottenfeldgasse 60, 1070 Vienna, Austria
    p.blazek@cyledge.com

arXiv:2102.12151v1 [cs.AI] 24 Feb 2021

ABSTRACT

Constraint-based environments such as configuration systems, recommender systems, and scheduling systems support users in different decision making scenarios. These environments exploit a knowledge base for determining solutions of interest for the user. The development and maintenance of such knowledge bases is an extremely time-consuming and error-prone task. Users often specify constraints which do not reflect the real world. For example, redundant constraints are specified which often increase both the effort for calculating a solution and the effort related to knowledge base development and maintenance. In this paper we present a new algorithm (COREDIAG) which can be exploited for the determination of minimal cores (minimal non-redundant constraint sets). The algorithm is especially useful for distributed knowledge engineering scenarios where the degree of redundancy can become high. In order to show the applicability of our approach, we present an empirical study conducted with commercial configuration knowledge bases.

Keywords: redundant constraints, minimal cores.

1 INTRODUCTION

The central element of a constraint-based application is a knowledge base (constraint set). When developing and maintaining constraint sets, users often define faulty constraints (the system calculates solutions which are not allowed or, in the worst case, no solution can be found) (Bakker et al., 1993; Felfernig et al., 2004) or redundant constraints which are not needed to express the domain knowledge in a complete fashion (Sabin and Freuder, 1999; Piette, 2008; Fahad and Qadir, 2008; Levy and Sagiv, 1992). In this paper we focus on situations where users define redundant constraints which, when deleted from the constraint set (knowledge base), do not change the semantics of the remaining constraint set. More formally, if C = {c1, c2, ..., cn} is the initial set of constraints defined for the knowledge base and one constraint ci is redundant, then $(C - \{c_i\}) \cup \overline{C}$ is inconsistent ($\overline{C}$ is the negation of C).

Redundancy elimination in knowledge bases is a topic extensively investigated by AI research. The identification of redundant constraints plays a major role, for example, in the development and maintenance of configuration knowledge bases (see, e.g., (Sabin and Freuder, 1999)). The authors introduce concepts for the detection of redundant constraints in conditional constraint satisfaction problems (CCSPs). The approach is based on the idea of analyzing the solution space of the given problem (on the level of individual solutions) in order to detect different types of redundant constraints. (Piette, 2008) provides an in-depth discussion of the role of redundancy elimination in SAT solving, introduces an (incomplete) algorithm for the elimination of redundant clauses, and shows its applicability on the basis of an empirical study. The role of redundancies in ontology development is analyzed by (Fahad and Qadir, 2008). The authors point out the importance of redundancy elimination and discuss typical modeling errors that occur during ontology development and maintenance. (Grimm and Wissmann, 2011) introduce algorithms for redundancy elimination in OWL ontologies. The authors propose an algorithm that computes redundant axioms by exploiting prior knowledge of the dispensability of axioms. (Levy and Sagiv, 1992) analyze two types of redundancies in Datalog programs. First, they interpret redundancy in terms of reachability, i.e., rules and predicates are eliminated that are not part of any derivation tree. Second, redundancy is defined on the basis of the concept of minimal derivation trees which do not include any pair of identical atoms where one is the predecessor of the other one.

All the mentioned approaches focus on the identification of redundant constraints in centralized scenarios where a knowledge engineer is interested in identifying redundant constraints in the given knowledge base. In such scenarios it is assumed that only a small subset of the given constraints is redundant (this assumption is also denoted as low redundancy assumption

22nd International Workshop on Principles of Diagnosis

(Grimm and Wissmann, 2011)). Existing algorithms focus on such centralized scenarios. In this paper we go one step further and propose an algorithm which is especially useful in distributed knowledge engineering scenarios where we can expect a larger number of redundant constraints due to the fact that different contributors add constraints which are related to the same topic (see, e.g., (Chklovski and Gil, 2005; Richardson and Domingos, 2003)). We denote the assumption of larger sets of redundant constraints the high redundancy assumption. For example, we envision a scenario where a large number of users propose constraints to be applied by a constraint-based configuration or recommendation engine (Felfernig and Burke, 2008) and the task of an underlying diagnosis algorithm is to identify minimal sets of constraints which retain the semantics of the original constraint set. We denote such constraint sets as minimal cores. Note that the following discussions are based on the assumption of consistent constraint sets. Methods for consistency restoration are discussed in (Bakker et al., 1993; Felfernig et al., 2004; Friedrich and Shchekotykhin, 2005; Felfernig et al., 2011).

The major contributions of our paper are the following. First, we introduce a new algorithm which allows for a more efficient determination of redundant constraints especially in the context of distributed (community-based) knowledge engineering scenarios. Second, we present the results of a performance analysis of our algorithm conducted with real-world configuration knowledge bases.

The remainder of this paper is organized as follows. In Section 2 we introduce a simple example configuration knowledge base from the automotive domain. In Section 3 we introduce a basic algorithm for the determination of redundant constraints in centralized settings (SEQUENTIAL). In Section 4 we introduce the COREDIAG algorithm. Thereafter we report the results of a performance evaluation conducted with real-world configuration knowledge bases (Section 5). The paper is concluded with Section 6.

2 WORKING EXAMPLE

For illustration purposes we use a car configuration knowledge base throughout this paper. A configuration task can be defined as a basic constraint satisfaction problem (CSP) (Tsang, 1993) (see the following definition).[1]

Definition (Configuration Task): A configuration task can be defined as a CSP (V, D, C). V = {v1, v2, ..., vn} is a set of finite domain variables. D = {dom(v1), dom(v2), ..., dom(vn)} is a set of corresponding domain definitions where dom(vk) is the domain of the variable vk. C = CKB ∪ CR where CKB = {c1, c2, ..., cq} is a set of domain-specific constraints (the configuration knowledge base) and CR = {cq+1, cq+2, ..., ct} is a set of customer requirements (likewise represented as constraints).

[1] Note that the presented concepts are also applicable to other types of knowledge representations such as SAT or description logics.

The following configuration task will be used as a working example throughout the paper. The variable type represents the type of the car, pdc is the park distance control feature, fuel represents the average fuel consumption per 100 kilometers, a skibag allows convenient ski stowage inside the car, and 4-wheel represents the actuation type (4-wheel supported or not supported). These variables represent the possible combinations of customer requirements. The set CKB = {c1, c2, c3, c4, c5} defines additional restrictions on the set of possible customer requirements CR = {c6, c7, c8, c9, c10}.

    • V = {type, fuel, skibag, 4-wheel, pdc}

    • D = {
        dom(type) = {city, limo, combi, xdrive},
        dom(fuel) = {4l, 6l, 10l},
        dom(skibag) = {yes, no},
        dom(4-wheel) = {yes, no},
        dom(pdc) = {yes, no}}

    • CKB = {
        c1 : 4-wheel = yes → type = xdrive,
        c2 : skibag = yes → type ≠ city,
        c3 : fuel = 4l → type = city,
        c4 : fuel = 6l → type ≠ xdrive,
        c5 : type = city → fuel ≠ 10l}

    • CR = {
        c6 : 4-wheel = no,
        c7 : fuel = 4l,
        c8 : type = city,
        c9 : skibag = no,
        c10 : pdc = yes}

On the basis of this example configuration task we now give a definition of a corresponding configuration (solution).

Definition (Configuration): A configuration (solution) for a configuration task is an instantiation I = {v1 = ins1, v2 = ins2, ..., vn = insn} where insk is an element of the domain of vk. A configuration is consistent if the assignments in I are consistent with the constraints in C. A complete solution is one in which all the variables are instantiated. Finally, a configuration is valid if it is both consistent and complete.

A configuration for our example configuration task would be I = {4-wheel = no, fuel = 4l, type = city, skibag = no, pdc = yes}.

3 DETERMINING REDUNDANT CONSTRAINTS

Let us now consider a simple adaptation of the original set of constraints CKB which we denote with C′KB. C′KB includes an additional constraint ca which has been added by a knowledge engineer.

    C′KB = {
        ca : skibag ≠ no → type = limo ∨ type = combi ∨ type = xdrive,
        c1 : 4-wheel = yes → type = xdrive,
        c2 : skibag = yes → type ≠ city,
        c3 : fuel = 4l → type = city,
        c4 : fuel = 6l → type ≠ xdrive,
        c5 : type = city → fuel ≠ 10l}
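The working example is small enough (96 complete assignments) to be checked by brute force. The following Python sketch is only an illustration and not part of the original work: it assumes that each constraint can be written as a predicate over a complete assignment, and the names `DOMAINS`, `solutions`, and `ca` are hypothetical. It checks that the example configuration I is valid and that adding ca leaves the solution space of CKB unchanged.

```python
from itertools import product

# Domains of the five variables of the working example.
DOMAINS = {
    "type": ["city", "limo", "combi", "xdrive"],
    "fuel": ["4l", "6l", "10l"],
    "skibag": ["yes", "no"],
    "4-wheel": ["yes", "no"],
    "pdc": ["yes", "no"],
}

# Constraints as predicates over a complete assignment (a dict).
# An implication "p -> q" is encoded as "not p or q".
CKB = [
    lambda a: a["4-wheel"] != "yes" or a["type"] == "xdrive",  # c1
    lambda a: a["skibag"] != "yes" or a["type"] != "city",     # c2
    lambda a: a["fuel"] != "4l" or a["type"] == "city",        # c3
    lambda a: a["fuel"] != "6l" or a["type"] != "xdrive",      # c4
    lambda a: a["type"] != "city" or a["fuel"] != "10l",       # c5
]
CR = [
    lambda a: a["4-wheel"] == "no",  # c6
    lambda a: a["fuel"] == "4l",     # c7
    lambda a: a["type"] == "city",   # c8
    lambda a: a["skibag"] == "no",   # c9
    lambda a: a["pdc"] == "yes",     # c10
]


def ca(a):
    # Additional constraint: skibag != no -> type in {limo, combi, xdrive}.
    return a["skibag"] == "no" or a["type"] in ("limo", "combi", "xdrive")


# All complete assignments over the domains (4 * 3 * 2 * 2 * 2 = 96).
ALL = [dict(zip(DOMAINS, values)) for values in product(*DOMAINS.values())]


def solutions(constraints):
    """Return every complete assignment that satisfies all constraints."""
    return [a for a in ALL if all(c(a) for c in constraints)]


I = {"4-wheel": "no", "fuel": "4l", "type": "city", "skibag": "no", "pdc": "yes"}

# Valid = complete (all variables instantiated) and consistent with CKB and CR.
is_valid = set(I) == set(DOMAINS) and all(c(I) for c in CKB + CR)

# ca is redundant if adding it does not shrink the solution space of CKB.
ca_is_redundant = solutions(CKB) == solutions(CKB + [ca])

print(is_valid, ca_is_redundant)
```

Enumeration only works for toy domains like this one; a constraint solver would take the place of `solutions` for realistic knowledge bases.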


It is obvious that ca is redundant since it does not further restrict the solution space defined by the constraints CKB = {c1, c2, c3, c4, c5}. In order to discuss constraint redundancy on a more formal level, we introduce the following definitions.

Definition (Redundant Constraint): Let ca be a constraint element of the configuration knowledge base CKB. ca is called redundant iff CKB − {ca} |= ca. If this condition is not fulfilled, ca is said to be non-redundant. Redundancy can also be analyzed by checking $(C_{KB} - \{c_a\}) \cup \overline{C_{KB}}$ for consistency: if consistency is given, ca is non-redundant.

Iterating over each constraint of CKB, executing the non-redundancy check $(C_{KB} - \{c_a\}) \cup \overline{C_{KB}}$, and deleting redundant constraints from CKB results in a set of non-redundant constraints (the minimal core). If the non-redundancy check fails (no solution can be found), the constraint ca is redundant and can be deleted from CKB. Otherwise (the non-redundancy check is successful), ca is non-redundant.

Definition (Minimal Core): Let CKB be a configuration knowledge base. CKB is denoted as minimal core iff ∀ci ∈ CKB: $(C_{KB} - \{c_i\}) \cup \overline{C_{KB}}$ is consistent. Obviously, $C_{KB} \cup \overline{C_{KB}} \models \bot$.

The principle of the following algorithm (SEQUENTIAL, Algorithm 1) is often used for determining such redundancies (see, e.g., (Piette, 2008; Grimm and Wissmann, 2011)).

Algorithm 1 SEQUENTIAL(CKB): ∆

  {CKB: configuration knowledge base}
  {$\overline{C_{KB}}$: the complement of CKB}
  {∆: set of redundant constraints}
  CKBt ← CKB;
  for all ci in CKBt do
    if isInconsistent((CKBt − {ci}) ∪ $\overline{C_{KB}}$) then
       CKBt ← CKBt − {ci};
    end if
  end for
  ∆ ← CKB − CKBt;
  return ∆;

The approach of SEQUENTIAL is straightforward: each individual constraint ci is evaluated w.r.t. redundancy by checking whether CKBt − {ci} is still inconsistent with $\overline{C_{KB}}$. If this is the case, ci can be considered as redundant. If CKBt − {ci} is consistent with $\overline{C_{KB}}$, ci is a non-redundant constraint since its deletion induces consistency with $\overline{C_{KB}}$. Applying the algorithm SEQUENTIAL to our example C′KB results in ∆ = {ca}, since C′KB − {ca} together with the complement of C′KB is inconsistent and no further constraint ci can be deleted from C′KB such that C′KB − {ca} − {ci} is still inconsistent with that complement.

The problem of checking whether a given constraint can be inferred from the remaining part of a constraint set has been shown to be Co-NP-complete in the general case (Piette, 2008). The major goal of our work was to figure out whether there exist alternative algorithms that have a better runtime performance compared to SEQUENTIAL in situations with a large number of redundant constraints in CKB. Large numbers of redundant constraints typically occur in distributed knowledge engineering scenarios where a large number of users specify rules that subsequently have to be aggregated into one consistent constraint set (see, e.g., (Chklovski and Gil, 2005)).

In the following section we introduce the COREDIAG algorithm which is a valuable alternative to SEQUENTIAL in situations with a large number of redundant constraints. After having introduced COREDIAG we will analyze the performance of both algorithms (SEQUENTIAL and COREDIAG) on the basis of real-world configuration knowledge bases (Section 5).

4 COREDIAG

The COREDIAG algorithm (together with CORED) is based on the principle of divide-and-conquer: whenever a set S which is a subset of CKB is inconsistent with $\overline{C_{KB}}$, it is or contains a minimal core, i.e., a set of constraints which preserve the semantics of CKB. In our implementation CORED is responsible for determining such minimal cores; COREDIAG returns the complement of a minimal core, which is a maximal set of redundant constraints in CKB. CORED is based on the principle of QuickXPlain (Junker, 2004). As a consequence, a minimal core (minimal set of constraints that preserve the semantics of CKB) can be interpreted as a minimal conflict, i.e., a minimal set of constraints that are inconsistent with $\overline{C_{KB}}$.

CORED allows the determination of preferred minimal cores since the algorithm is based on the assumption of a strict lexicographical ordering of the constraints in CKB. On an informal level a preferred minimal core can be characterized as follows: if we have different options for choosing a minimal core, we would select the one with the most agreed-upon constraints. For more details on the role of strict lexicographical orderings of constraints we refer the reader to the work of (Junker, 2004) and (Felfernig et al., 2011).

The COREDIAG algorithm generates $\overline{C_{KB}}$ from CKB. It then activates CORED (see Algorithm 3) which determines a minimal core on the basis of a divide-and-conquer strategy that divides the constraints in C into two subsets (C1 and C2) with the goal to figure out whether one of those subsets already contains a minimal core. If C2 contains a minimal core, C1 is not further taken into account. If C contains only one element (cα) and B is still consistent, then cα is part of the minimal core.

Algorithm 2 COREDIAG(CKB): ∆

  {CKB = {c1, c2, ..., cn}}
  {$\overline{C_{KB}}$: the complement of CKB}
  {∆: set of redundant constraints}

  $\overline{C_{KB}}$ ← {¬c1 ∨ ¬c2 ∨ ... ∨ ¬cn};
  return(CKB − CORED($\overline{C_{KB}}$, $\overline{C_{KB}}$, CKB));
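To make the two strategies concrete, the following Python sketch runs SEQUENTIAL (Algorithm 1) and COREDIAG with CORED (Algorithms 2 and 3) on the example knowledge base that extends c1..c5 by ca. It is a minimal sketch under stated assumptions, not the authors' implementation: the consistency check is replaced by brute-force enumeration over the 96 assignments, constraints are plain predicates, and all names (`PRED`, `is_inconsistent`, `not_CKB`, `CHECKS`) are hypothetical.

```python
from itertools import product

DOMAINS = {
    "type": ["city", "limo", "combi", "xdrive"],
    "fuel": ["4l", "6l", "10l"],
    "skibag": ["yes", "no"],
    "4-wheel": ["yes", "no"],
    "pdc": ["yes", "no"],
}
ALL = [dict(zip(DOMAINS, values)) for values in product(*DOMAINS.values())]

# Example knowledge base (ca plus c1..c5); dict order is the constraint ordering.
CKB = {
    "ca": lambda a: a["skibag"] == "no" or a["type"] in ("limo", "combi", "xdrive"),
    "c1": lambda a: a["4-wheel"] != "yes" or a["type"] == "xdrive",
    "c2": lambda a: a["skibag"] != "yes" or a["type"] != "city",
    "c3": lambda a: a["fuel"] != "4l" or a["type"] == "city",
    "c4": lambda a: a["fuel"] != "6l" or a["type"] != "xdrive",
    "c5": lambda a: a["type"] != "city" or a["fuel"] != "10l",
}


def not_ckb(a):
    # Complement of the knowledge base: at least one constraint is violated.
    return any(not c(a) for c in CKB.values())


PRED = dict(CKB, not_CKB=not_ckb)
CHECKS = {"count": 0}  # number of consistency checks performed so far


def is_inconsistent(names):
    """Brute-force stand-in for a solver call: no assignment satisfies all names."""
    CHECKS["count"] += 1
    return not any(all(PRED[n](a) for n in names) for a in ALL)


def sequential(order):
    """Algorithm 1: test each constraint once against the complement."""
    kept = list(order)
    for c in order:
        rest = [x for x in kept if x != c]
        if is_inconsistent(rest + ["not_CKB"]):
            kept = rest  # c is redundant
    return [c for c in order if c not in kept]


def cored(b, d, c):
    """Algorithm 3: QuickXPlain-style search for a minimal core within c."""
    if d and is_inconsistent(b):
        return []
    if len(c) <= 1:
        return list(c)
    k = (len(c) + 1) // 2  # ceil(p / 2)
    c1, c2 = c[:k], c[k:]
    delta1 = cored(b + c2, c2, c1)
    delta2 = cored(b + delta1, delta1, c2)
    return delta1 + delta2


def corediag(order):
    """Algorithm 2: redundant constraints = knowledge base minus minimal core."""
    core = cored(["not_CKB"], ["not_CKB"], list(order))
    return [c for c in order if c not in core]


for algorithm in (sequential, corediag):
    CHECKS["count"] = 0
    print(algorithm.__name__, algorithm(list(CKB)), CHECKS["count"], "checks")
```

With only one redundant constraint among six, the example sits at the low-redundancy end, where SEQUENTIAL is not expected to lose; the sketch is meant to show the control flow, not the performance difference reported in Section 5.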


                                                            Redundancy Rate
 KB (|cKB |)     Alg.          ˜0-10%                    ˜50%                  ˜75%                         ˜87.5%
 Bike A (32)      S        32.0 / 205.4 / 0        64.0 / 408.6 / 32   128.0 / 1209.0 / 96           256.0 / 4073.2 / 224
 Bike A (32)     CD        63.0 / 614.4 / 0        88.8 / 863.2 / 32   106.6 / 1352.0 / 96           107.4 / 1737.2 / 224
 Bike B (35)      S        35.0 / 256.8 / 1        70.0 / 616.4 / 36  140.0 / 1710.0 / 106           280.0 / 4854.0 / 246
 Bike B (35)     CD        68.6 / 693.4 / 1        94.0 / 960.8 / 36  109.6 / 1365.2 / 106           117.8 / 1893.0 / 246
 Bike C (37)      S        37.0 / 297.0 / 1        74.0 / 696.6 / 38  148.0 / 1824.8 / 112           296.0 / 5722.8 / 260
 Bike C (37)     CD        72.4 / 703.6 / 1       101.2 / 1091.2 / 38 114.8 / 1524.2 / 112           122.2 / 2115.4 / 260
 Bike D (34)      S        34.0 / 280.2 / 1        68.0 / 606.8 / 35  136.0 / 1672.0 / 103           272.0 / 5033.6 / 239
 Bike D (34)     CD        66.2 / 727.6 / 1       94.8 / 1031.8 / 35  104.8 / 1433.8 / 103           114.8 / 2000.8 / 239
 Bike E (35)      S        35.0 / 254.2 / 9        70.0 / 601.0 / 44  140.0 / 1628.6 / 114           280.0 / 5124.8 / 254
 Bike E (35)     CD        60.8 / 663.0 / 9        83.4 / 821.4 / 44   96.0 / 1182.6 / 114           103.6 / 1671.2 / 254
 Bike F (33)      S        33.0 / 274.0 / 1        66.0 / 601.8 / 34  132.0 / 1573.8 / 100           264.0 / 4525.0 / 232
 Bike F (33)     CD        64.6 / 632.8 / 1        88.6 / 931.2 / 34  108.2 / 1345.6 / 100           110.8 / 1822.2 / 232
 Bike G (36)      S        36.0 / 281.4 / 2        72.0 / 660.6 / 38  144.0 / 1729.8 / 110           288.0 / 5434.4 / 254
 Bike G (36)     CD        70.6 / 714.6 / 2        96.0 / 939.8 / 38  111.6 / 1409.6 / 110           122.4 / 2081.8 / 254
 Bike H (24)      S        24.0 / 194.2 / 0        48.0 / 398.8 / 24    96.0 / 1047.0 / 72           192.0 / 3010.0 / 168
 Bike H (24)     CD        47.0 / 443.4 / 0        63.0 / 587.4 / 24     77.2 / 869.8 / 72            80.0 / 1240.4 / 168
 Bike I (35)      S        35.0 / 268.4 / 1        70.0 / 647.0 / 36  140.0 / 1696.4 / 106           280.0 / 4976.4 / 246
 Bike I (35)     CD        68.4 / 708.6 / 1        93.6 / 985.0 / 36  112.2 / 1371.4 / 106           117.0 / 1897.0 / 246
 Bike J (46)      S        46.0 / 366.8 / 4        92.0 / 867.8 / 50  184.0 / 2309.8 / 142           368.0 / 7234.4 / 326
 Bike J (46)     CD        88.4 / 896.0 / 4       119.4 / 1258.8 / 50 139.8 / 1886.8 / 142           142.0 / 2413.6 / 326
 Bike K (35)      S        35.0 / 254.0 / 1        70.0 / 805.4 / 36  140.0 / 1852.8 / 106           280.0 / 5146.8 / 246
 Bike K (35)     CD        68.8 / 712.4 / 1       95.6 / 1021.6 / 36  108.8 / 1374.2 / 106           117.6 / 1945.4 / 246
 Bike L (37)      S        37.0 / 290.0 / 2        74.0 / 645.8 / 39  148.0 / 1822.0 / 113           296.0 / 5740.8 / 261
 Bike L (37)     CD        71.4 / 716.4 / 2       96.6 / 1001.6 / 39  113.2 / 1425.8 / 113           111.0 / 1829.8 / 261
 Bike 2 (32)      S        32.0 / 883.0 / 3       64.0 / 2386.4 / 35   128.0 / 8218.8 / 99          256.0 / 37784.4 / 227
 Bike 2 (32)     CD       61.2 / 2165.2 / 3       85.4 / 3749.6 / 35    97.2 / 5693.2 / 99          108.0 / 10276.8 / 227
 esvs (21)        S        21.0 / 340.0 / 0        42.0 / 870.8 / 21    84.0 / 2771.8 / 63          168.0 / 10231.8 / 147
 esvs (21)       CD        41.0 / 724.0 / 0       56.0 / 1170.6 / 21    65.6 / 1844.0 / 63           71.0 / 3296.4 / 147
 fs (16)          S        16.0 / 291.6 / 1        32.0 / 664.0 / 17    64.0 / 1989.2 / 49           128.0 / 7238.0 / 113
 fs (16)         CD        30.6 / 658.8 / 1        42.0 / 933.4 / 17    49.2 / 1504.2 / 49           52.2 / 2431.8 / 113
 hypo (21)        S        21.0 / 116.6 / 1        42.0 / 321.0 / 22     84.0 / 975.4 / 64           168.0 / 3297.6 / 148
 hypo (21)       CD        40.6 / 383.8 / 1        55.2 / 549.0 / 22     62.2 / 802.2 / 64            71.0 / 1293.0 / 148
 large2 (185)     S      130.0 / 2552.8 / 75     260.0 / 4721.8 / 260 520.0 / 7860.0 / 445          1040.0 / 15025.4 / 630
 large2 (185)    CD       76.8 / 1868.8 / 75      79.8 / 2085.6 / 260  96.6 / 2834.0 / 445           103.0 / 3870.4 / 630

Table 1: Application of SEQUENTIAL (S) (Algorithm 1) and COREDIAG (CD) (Algorithm 2) to configuration
knowledge bases of www.itu.dk/research/cla/externals/clib. "Bike x": bicycles; "esvs": corporate networks; "fs":
financial services (insurances); "hypo": financial services (investments); "large2": electronic circuits. Evaluation
data per cell: #TP-calls / runtime (ms) / #redundant constraints.
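The column headers of Table 1 can be cross-checked against its cells. The short Python sketch below recomputes the redundancy rate R (Formula 1) and the ratio of consistency checks (S divided by CD) for three rows copied from the table. It rests on one assumption: the total number of constraints in each version is read off the #TP-calls of SEQUENTIAL, which performs exactly one check per constraint. The names `TABLE1` and `summarize` are hypothetical.

```python
# Values copied from Table 1, one list entry per redundancy setting
# (~0-10%, ~50%, ~75%, ~87.5%).
TABLE1 = {
    "Bike A": {
        "n": [32, 64, 128, 256],           # #TP-calls of S = number of constraints
        "redundant": [0, 32, 96, 224],     # #redundant constraints
        "tp_cd": [63.0, 88.8, 106.6, 107.4],  # #TP-calls of CD
    },
    "Bike B": {
        "n": [35, 70, 140, 280],
        "redundant": [1, 36, 106, 246],
        "tp_cd": [68.6, 94.0, 109.6, 117.8],
    },
    "esvs": {
        "n": [21, 42, 84, 168],
        "redundant": [0, 21, 63, 147],
        "tp_cd": [41.0, 56.0, 65.6, 71.0],
    },
}


def summarize(row):
    """Return (redundancy rates, S/CD check ratios) for one knowledge base."""
    rates = [r / n for r, n in zip(row["redundant"], row["n"])]
    ratios = [n / cd for n, cd in zip(row["n"], row["tp_cd"])]
    return rates, ratios


for kb, row in TABLE1.items():
    rates, ratios = summarize(row)
    print(kb,
          [f"{100 * r:.1f}%" for r in rates],
          [f"{x:.2f}x" for x in ratios])
```

A ratio below 1 means SEQUENTIAL needed fewer consistency checks than COREDIAG, a ratio above 1 the opposite.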

5 EVALUATION

We now compare the performance of COREDIAG with the SEQUENTIAL algorithm discussed in Section 3. The worst case complexity (and best case complexity) of SEQUENTIAL in terms of the number of needed consistency checks is n (the number of constraints in CKB). Worst case and best case complexity are identical since SEQUENTIAL checks the redundancy of each individual constraint ci with respect to CKB. In contrast, the worst case complexity of COREDIAG depends on the number of redundant constraints in CKB. The worst case complexity of COREDIAG in terms of the number of needed consistency checks is $2c \cdot \log_2(n/c) + 2c$ where n is the number of constraints in CKB and c is the minimal core size. The best case complexity in terms of the number of needed consistency checks is achieved if all constraints of the minimal core are positioned in one branch of the CORED search tree: $\log_2(n/c) + 2c$. Consequently, the performance of COREDIAG heavily relies on the number of constraints contained in the minimal core (the lower the number of constraints in the minimal core, the better the performance of COREDIAG).

Table 1 reflects the results of our analysis conducted with the knowledge bases of the configuration benchmark.[2] The tests have been executed on a standard desktop computer (Intel® Core™2 Quad CPU Q9400 with 2.66GHz and 2GB RAM) using the CLib library. We compared the performance of SEQUENTIAL and COREDIAG for the different configuration knowledge bases. In order to show the advantages of COREDIAG that come along with an increasing number of redundant constraints, we generated three additional versions from the benchmark knowledge bases (see Table 1) that differ in their redundancy rate R (see Formula 1). The number of iterations per setting was set to 10; for each iteration we applied a randomized constraint ordering. Note that an evaluation of the indi-

[2] www.itu.dk/research/cla/externals/clib.


                                                     Redundancy Rate
                        KB         redundancy-free    ˜0-10% ˜50% ˜75%           ˜87.5%
                        Bike A           9.9            10.2   14.2  33.7         43.0
                        Bike B           9.3             9.5   29.0  24.6         41.9
                        Bike C           8.4            11.0   18.3  32.3         44.0
                        Bike D           7.8             8.9   25.6  28.0         42.6
                        Bike E           6.3            13.0   19.0  24.2         41.5
                        Bike F           7.7            12.8   19.5  22.3         41.4
                        Bike G           7.3            10.3   16.9  26.0         44.8
                        Bike H          10.5            11.0   13.6  24.1         37.2
                        Bike I           8.0             9.0   15.7  31.5         45.4
                        Bike J           9.1            19.3   24.8  26.6         49.2
                        Bike K           7.7            11.4   17.3  24.8         43.5
                        Bike L           9.3            13.4   25.8  26.7         45.5
                        Bike 2          44.8            48.3   84.1 162.0         323.4
                        esvs            22.7            26.0   45.1  79.5         157.8
                        fs              22.6            24.2   44.0  76.5         149.7
                        hypo             8.3             8.4   19.1  26.0         47.6
                        large2          15.5            16.6   22.0  25.5         36.4

    Table 2: Time in ms needed for calculating a solution for a given configuration knowledge base version.

Algorithm 3 CORED(B, D, C = {c1, c2, ..., cp}): ∆         redundant constraints included (see Table 2) – for this
                                                          evaluation as well the number of iterations per setting
  {B: consideration set}                                  was set to 10; for each iteration we applied a random-
  {D: constraints added to B}                             ized constraint ordering.
  {C: set of constraints to be checked for redundancy}
  if D ≠ ∅ and inconsistent(B) then
     return ∅;                                            6   CONCLUSIONS
  end if                                                  The detection of redundant constraints plays a major
  if singleton(C) then                                    role in the context of (configuration) knowledge base
     return(C);                                           development and maintenance. In this paper we have
  end if                                                  proposed two algorithms which can be applied for the
  k ← ⌈p/2⌉;                                              identification of minimal cores, i.e., minimal sets of
  C1 ← {c1, c2, ..., ck};                                 constraints that preserve the semantics of the original
  C2 ← {ck+1, ck+2, ..., cp};                             knowledge base. The SEQUENTIAL algorithm can be
  ∆1 ← CORED(B ∪ C2, C2, C1);                             applied in settings where the number of redundant con-
  ∆2 ← CORED(B ∪ ∆1, ∆1, C2);                             straints in the knowledge base is low. The second al-
  return(∆1 ∪ ∆2);                                        gorithm (COREDIAG) is more efficient but restricted in
                                                          its application to knowledge bases that contain a large
                                                          number of redundant constraints.
vidual properties of the used knowledge bases is within
the scope of future work.                                 7   ACKNOWLEDGEMENTS
                                                          The work presented in this paper has been conducted
              |redundant constraints in CKB |             within the scope of the research project ICONE (Intel-
 R(CKB ) =                                    (1)
                   |constraints in CKB |                  ligent Assistance for Configuration Knowledge Base
                                                          Development and Maintenance) funded by the Aus-
   In addition to the original version (redundancy rate   trian Research Promotion Agency (827587).
= ˜0-10%) we generated three knowledge bases with
the redundancy rates 50%, 75%, and 87.5%. For ex-
ample, a knowledge base with redundancy rate 50%          REFERENCES
can be generated by simply duplicating each constraint    (Bakker et al., 1993) R. Bakker, F. Dikker, F. Tem-
of the original knowledge base. Starting with a re-         pelman, and P. Wogmim. Diagnosing and solv-
dundancy rate of 50% we can observe a transition in         ing over-determined constraint satisfaction prob-
the runtime performance (C ORE D IAG starts to per-         lems. In 13th International Joint Conference on
form better than S EQUENTIAL) due to the increased          Artificial Intelligence, pages 276–281, Chambery,
number of redundant constraints (see the large2 con-        France, 1993.
figuration knowledge base in Figure 1). Another out-
come of our analysis is that nearly each of the inves-    (Chklovski and Gil, 2005) T. Chklovski and Y. Gil.
tigated configuration knowledge bases contains redun-       An analysis of knowledge collected from volunteer
dant constraints (see Table 1). The average runtime         contributors. In 20th National Conference on Arti-
for determining configurations without the redundant        ficial Intelligence (AAAI-05), pages 564–571, Pitts-
constraints is lower compared to the runtime with the       burg, PA, 2005.


(Fahad and Qadir, 2008) M. Fahad and M. Qadir. A
   framework for ontology evaluation. In 16th
   International Conference on Conceptual Structures
   (ICCS 2008), pages 149–158, Toulouse, France, 2008.
(Felfernig and Burke, 2008) A. Felfernig and
   R. Burke. Constraint-based recommender systems:
   Technologies and research issues. In ACM
   International Conference on Electronic Commerce
   (ICEC’08), pages 17–26, Innsbruck, Austria, 2008.
(Felfernig et al., 2004) A. Felfernig, G. Friedrich,
   D. Jannach, and M. Stumptner. Consistency-based
   diagnosis of configuration knowledge bases.
   Artificial Intelligence, 152(2):213–234, 2004.
(Felfernig et al., 2011) A. Felfernig, M. Schubert, and
   C. Zehentner. An efficient diagnosis algorithm for
   inconsistent constraint sets. Artificial Intelligence
   for Engineering Design, Analysis, and Manufacturing
   (AIEDAM), 25(2):175–184, 2011.
(Friedrich and Shchekotykhin, 2005) G. Friedrich
   and K. Shchekotykhin. A general diagnosis method
   for ontologies. In 4th Intl. Semantic Web
   Conference (ISWC05), number 3729 in Lecture Notes
   in Computer Science, pages 232–246, Galway,
   Ireland, 2005. Springer.
(Grimm and Wissmann, 2011) S. Grimm and
   J. Wissmann. Elimination of redundancy in ontologies. In
   Extended Semantic Web Conference (ESWC2011),
   pages 260–274, Heraklion, Greece, 2011.
(Junker, 2004) U. Junker. QuickXplain: Preferred
   explanations and relaxations for over-constrained
   problems. In 19th National Conference on
   Artificial Intelligence (AAAI04), pages 167–172, San
   Jose, CA, 2004.
(Levy and Sagiv, 1992) A. Levy and Y. Sagiv.
   Constraints and redundancy in Datalog. In 11th
   Conference on the Principles of Database Systems, pages
   67–80, San Diego, CA, 1992.
(Piette, 2008) C. Piette. Let the solver deal with
   redundancy. In 20th IEEE International Conference
   on Tools with Artificial Intelligence, pages 67–73,
   Dayton, OH, 2008.
(Richardson and Domingos, 2003) M. Richardson
   and P. Domingos. Building large knowledge
   bases by mass collaboration. In 2nd International
   Conference on Knowledge Capture (K-CAP 2003),
   pages 129–137, Sanibel Island, FL, 2003.
(Sabin and Freuder, 1999) M. Sabin and E. Freuder.
   Detecting and resolving inconsistency and
   redundancy in conditional constraint satisfaction
   problems. In AAAI 1999 Workshop on Configuration,
   pages 90–94, Orlando, FL, 1999.
(Tsang, 1993) E. Tsang. Foundations of Constraint
   Satisfaction. Academic Press, 1993.
