Similarity Search for Web Services

Xin Dong        Alon Halevy        Jayant Madhavan        Ema Nemes        Jun Zhang
{lunadong, alon, jayant, enemes, junzhang}@cs.washington.edu
University of Washington, Seattle

Proceedings of the 30th VLDB Conference, Toronto, Canada, 2004

Abstract

Web services are loosely coupled software components, published, located, and invoked across the web. The growing number of web services available within an organization and on the Web raises a new and challenging search problem: locating desired web services. Traditional keyword search is insufficient in this context: the specific types of queries users require are not captured, the very small text fragments in web services are unsuitable for keyword search, and the underlying structure and semantics of the web services are not exploited.

We describe the algorithms underlying the Woogle search engine for web services. Woogle supports similarity search for web services, such as finding similar web-service operations and finding operations that compose with a given one. We describe novel techniques to support these types of searches, and an experimental study on a collection of over 1500 web-service operations that shows the high recall and precision of our algorithms.

1    Introduction

Web services are loosely coupled software components, published, located, and invoked across the web. A web service comprises several operations (see examples in Figure 1). Each operation takes a SOAP package containing a list of input parameters, fulfills a certain task, and returns the result in an output SOAP package. Large enterprises are increasingly relying on web services as a methodology for large-scale software development and for sharing services within an organization. If current trends continue, then in the future many applications will be built by piecing together web services published by third-party producers.

    W1: Web Service: GlobalWeather
        Operation: GetTemperature
          Input: Zip
          Output: Return
    W2: Web Service: WeatherFetcher
        Operation: GetWeather
          Input: PostCode
          Output: TemperatureF, WindChill, Humidity
    W3: Web Service: GetLocalTime
        Operation: LocalTimeByZipCode
          Input: Zipcode
          Output: LocalTimeByZipCodeResult
    W4: Web Service: PlaceLookup
        Operation1: CityStateToZipCode
          Input: City, State
          Output: ZipCode
        Operation2: ZipCodeToCityState
          Input: ZipCode
          Output: City, State

Figure 1: Several example web services (not including their textual descriptions). Note that each web service includes a set of operations, each with input and output parameters. For example, web services W1 and W2 provide weather information.

The growing number of web services available within an organization and on the Web raises a new and challenging search problem: locating desired web services. In fact, to address this problem, several simple search engines have recently sprung up [1, 2, 3, 4]. Currently, these engines provide only simple keyword search on web service descriptions.

As one considers search for web services in more detail, it becomes apparent that the keyword search paradigm is insufficient for two reasons. First, keywords do not capture the underlying semantics of web services. Current web-service search engines return a particular service if its functionality description contains the keywords in the query; such a search may miss results. For example, when searching for zipcode, web services whose descriptions contain the term zip or postal code, but not zipcode, will not be returned.

Second, keywords do not suffice for accurately specifying users' information needs. Since a web-service operation is going to be used as part of an application, users would like to specify their search criteria more precisely than by keywords. Current web-service search engines often enable a user to explore the details of a particular web-service operation, and in some cases to try it out by entering an input value. Nevertheless, investigating a single web-service operation often requires several browsing steps. Once users drill down all the way and find the operation inappropriate for some reason, they want to be able to find operations similar to the ones just considered, as opposed to laboriously following parallel browsing patterns. Similarly, users may want to find operations that take similar inputs (respectively, outputs), or that can compose with the current operation being browsed.

To address the challenges involved in searching for web services, we built Woogle (see http://www.cs.washington.edu/woogle), a web-service search engine. In addition to simple keyword searches, Woogle supports similarity search for web services. A user can ask for web-service operations similar to a given one, those that take similar inputs (or outputs), and those that compose with a given one.

This paper describes the novel techniques we have developed to support these types of searches, and experimental evidence that shows the high accuracy of our algorithms. In particular, our contributions are the following:

 1. We propose a basic set of search functionalities that an effective web-service search engine should support.

 2. We describe algorithms for supporting similarity search. Our algorithms combine multiple sources of evidence in order to determine similarity between a pair of web-service operations. The key ingredient of our algorithm is a novel clustering algorithm that groups names of parameters of web-service operations into semantically meaningful concepts. These concepts are then leveraged to determine the similarity of inputs (or outputs) of web-service operations.

 3. We describe a detailed experimental evaluation on a set of over 1500 web-service operations. The evaluation shows that we can provide both high precision and recall for similarity search, and that our techniques substantially improve on naive keyword search.

The paper is organized as follows. Section 2 begins by placing our search problem in the context of the related work. Section 3 formally defines the similarity search problem for web services. Section 4 describes the algorithm for clustering parameter names, and Section 5 describes the similarity search algorithm. Section 6 describes our experimental evaluation. Section 7 discusses other types of search that Woogle supports, and Section 8 concludes.

2    Related Work

Finding similar web-service operations is closely related to three other matching problems: text document matching, schema matching, and software component matching.

Text document matching: Document matching and classification is a long-standing problem in information retrieval (IR). Most solutions to this problem (e.g., [10, 20, 27, 19]) are based on term frequency analysis. However, these approaches are insufficient in the web-service context: the text documentation of web-service operations is highly compact, and term-frequency analysis ignores structural information that helps capture the underlying semantics of the operations.

Schema matching: The database community has considered the problem of automatically matching schemas [24, 12, 13, 22]. The work in this area has developed several methods that try to capture clues about the semantics of the schemas, and suggest matches based on them. Such methods include linguistic analysis, structural analysis, and the use of domain knowledge and previous matching experience. However, the search for similar web-service operations differs from schema matching in two significant ways. First, the granularity of the search is different: operation matching can be compared to finding a similar schema, whereas schema matching looks for similar components in two given schemas that are assumed to be related. Second, the operations in a web service are typically much more loosely related to each other than are tables in a schema, and each web service in isolation has much less information than a schema. Hence, we are unable to adapt techniques for schema matching to this context.

Software component matching: Software component matching is considered important for software reuse. [28] formally defines the problem by examining signature (data type) matching and specification (program behavior) matching. The techniques employed there require analysis of data types and post-conditions, which are not available for web services.

Some recent work (e.g., [9, 23]) has proposed annotating web services manually with additional semantic information, and then using these annotations to compose services [8, 26]. In our context, annotating the collection of web services is infeasible, and we rely on only the information provided in the WSDL file and the UDDI entry.

In [15] the authors studied the supervised classification and unsupervised clustering of web services.

Our work differs in that we are doing unsupervised matching at the operation level, rather than supervised classification at the entire web-service level. Hence, we face the challenge of understanding the operations in a web service from a very limited amount of information.

3    Web Service Similarity Search

We begin by briefly describing the structure of web services, and then we motivate and define the search problem we address.

3.1    The Structure of Web Services

Each web service has an associated WSDL file describing its functionality and interface. A web service is typically (though not necessarily) published by registering its WSDL file and a brief description in UDDI business registries. Each web service consists of a set of operations. For each web service, we have access to the following information:

  • Name and text description: A web service is described by a name, a text description in the WSDL file, and a description that is put in the UDDI registry.

  • Operation descriptions: Each operation is described by a name and a text description in the WSDL file.

  • Input/Output descriptions: Each input and output of an operation contains a set of parameters. For each parameter, the WSDL file describes the name, data type, and arity (if the parameter is of array type). Parameters may be organized in a hierarchy by using complex types.

3.2    Searching for Web Services

To motivate similarity search for web services, consider the following typical scenario. Users begin a search for web services by entering keywords relevant to the search goal. They then start inspecting some of the returned web services. Since the result of the search is rather complex, the users need to drill down in several steps. They first decide which web service to explore in detail, and then consider which specific operations in that service to look at. Given a particular operation, they will look at each of its inputs and outputs, and if the engine provides a try it feature, they will try entering some value for the inputs.

At this point, the users may find that the web service is inappropriate for some reason, but not want to repeat the same process for each of the other potentially relevant services. Hence, our goal is to provide a more direct method for searching, given that the users have already explored a web service in detail. Suppose they explored the operation GetTemperature in W1. We identify the following important similarity search queries they may want to pose:

Similar operations: Find operations with similar functionalities. For example, the web-service operation GetWeather in W2 is similar to the operation GetTemperature in W1. Note that we are searching for specific operations that are similar, rather than for similar web services. The latter type of search is typically too coarse for our needs. There is no formal definition of operation similarity because, just as in other types of search, similarity depends on the specific goal in the user's mind. Intuitively, we consider operations to be similar if they take similar inputs, produce similar outputs, and the relationships between the inputs and outputs are similar.

Similar inputs/outputs: Find operations with similar inputs. As a motivating example for such a search, suppose our goal is to collect a variety of information about locations. While W1 provides weather, the operations LocalTimeByZipCode in W3 and ZipCodeToCityState in W4 provide other information about locations, and thereby may be of interest to the user.

Alternatively, we may want to search for operations with similar outputs, but different inputs. For example, we may be looking for temperature, but the operation we are considering takes zipcode as input, while we need one that takes city and state as input.

Composable operations: Find operations that can be composed with the current one. One of the key promises of building applications with web services is that one should be able to compose a set of given services to create ones that are specific to the application's needs. In our example, there are two opportunities for composition. In the first case, the output of an operation is similar to the input of the given operation, as with CityStateToZipCode in W4: composing CityStateToZipCode with GetTemperature in W1 offers another option for getting the weather when the zipcode is not known. In the second case, the output of the given operation may be similar to the input of another operation; e.g., one that converts between Centigrade and Fahrenheit and thereby produces results in the desired scale.

In this paper we focus on the following two problems, from which we can easily build up the above search capabilities.

Operation matching: Given a web-service operation, return a list of similar operations.

Input/output matching: Given the input (respectively, output) of a web-service operation, return a list of web-service operations with similar inputs (respectively, outputs).
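To make the setting concrete, the following minimal sketch shows one way to represent the information listed in Section 3.1 together with the two search primitives just defined. All field and function names here are illustrative only; they are not part of the WSDL standard or of Woogle's implementation.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Operation:
    name: str                  # e.g. "GetTemperature"
    description: str           # text description from the WSDL file
    inputs: List[str]          # input parameter names, e.g. ["Zip"]
    outputs: List[str]         # output parameter names, e.g. ["Return"]

@dataclass
class WebService:
    name: str                  # e.g. "GlobalWeather"
    description: str           # WSDL documentation plus UDDI description
    operations: List[Operation] = field(default_factory=list)

def operation_matching(op: Operation, corpus: List[WebService]) -> List[Operation]:
    """Return a ranked list of operations similar to op (see Section 5)."""
    raise NotImplementedError

def io_matching(params: List[str], corpus: List[WebService]) -> List[Operation]:
    """Return operations whose inputs (or outputs) are similar to params."""
    raise NotImplementedError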

We note that these two problems are also at the core of two other types of search that Woogle supports (see Section 7): template search and composition search. Template search goes beyond keyword search by specifying the functionality, input, and output of a desired operation. Composition search returns not only single operations, but also compositions of operations that fulfill the user's need.

3.3    Overview of Our Approach

Similarity search for web services is challenging because neither the textual descriptions of web services and their operations nor the names of the input and output parameters completely convey the underlying semantics of the operation. Nevertheless, knowledge of the semantics is important for determining similarity between operations.

Broadly speaking, our algorithm combines multiple sources of evidence to determine similarity. In particular, it considers similarity between the textual descriptions of the operations and of the entire web services, and similarity between the parameter names of the operations. The key ingredient of the algorithm is a technique that clusters parameter names in the collection of web services into semantically meaningful concepts. By comparing the concepts that input or output parameters belong to, we are able to achieve good similarity measures. Section 4 describes the clustering algorithm, and Section 5 describes how we combine the multiple sources of evidence.

4    Clustering Parameter Names

To effectively match inputs/outputs of web-service operations, it is crucial to get at their underlying semantics. However, this is hard for two reasons. First, parameter naming is dependent on the developers' whim. Parameter names tend to be highly varied given the use of synonyms, hypernyms, and different naming rules. They may not even be composed of proper English words: there may be misspellings, abbreviations, etc. Therefore, lexical references, such as WordNet [5], are hard to apply. Second, inputs/outputs typically have few parameters, and the associated WSDL files rarely provide rich descriptions for parameters. Traditional IR techniques, such as TF/IDF [25] and LSI [11], rely on word frequencies to capture the underlying semantics and thus do not apply well.

A parameter name is typically a sequence of concatenated words (not necessarily proper English words), with the first letter of every word capitalized (e.g., LocalTimeByZipCodeResult). Such words are referred to as terms. We exploit the co-occurrence of terms in web-service inputs and outputs to cluster terms into meaningful concepts. As we shall see later, using these concepts, in addition to the original terms, greatly improves our ability to identify similar inputs/outputs and hence to find similar web-service operations.

Applying an off-the-shelf text clustering algorithm directly in our context does not perform well because web-service inputs/outputs are sparse. For example, whereas synonyms tend to occur in the same document in an IR application, they seldom occur in the same operation input/output; therefore, they will not get clustered. Our clustering algorithm is a refinement of agglomerative clustering. We begin by describing a particular kind of association rules that capture our notion of term co-occurrence, and then describe the clustering algorithm.

4.1    Clustering Parameters by Association

We base our clustering on the following heuristic: parameters tend to express the same concept if they often occur together. This heuristic is validated by our experimental results. We use it to cluster parameters by exploiting their conditional probabilities of occurrence in the inputs and outputs of web-service operations. Specifically, we are interested in association rules of the form

    t1 → t2 (s, c)

In this rule, t1 and t2 are two terms. The support, s, is the probability that t1 occurs in an input/output; i.e., s = P(t1) = ||IOt1|| / ||IO||, where ||IO|| is the total number of inputs and outputs of operations, and ||IOt1|| is the number of inputs and outputs that contain t1. The confidence, c, is the probability that t2 occurs in an input or output given that t1 is known to occur in it; i.e., c = P(t2 | t1) = ||IOt1,t2|| / ||IOt1||, where ||IOt1,t2|| is the number of inputs and outputs that contain both t1 and t2. Note that the rule t1 → t2 (s12, c12) and the rule t2 → t1 (s21, c21) may have different support and confidence values. These rules can be efficiently computed using the A-Priori algorithm [7].
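As a concrete illustration of these definitions, the sketch below computes s and c for every ordered pair of terms by brute-force counting over the collection of inputs and outputs; for a collection of modest size this suffices, and the A-Priori optimization matters only at larger scale. The function name and data layout are ours, not part of the system.

from itertools import permutations

def association_rules(ios):
    """ios: one set of terms per operation input or output."""
    n = len(ios)                                     # ||IO||
    single, pair = {}, {}                            # ||IO_t|| and ||IO_t1,t2||
    for terms in ios:
        for t in terms:
            single[t] = single.get(t, 0) + 1
        for t1, t2 in permutations(terms, 2):
            pair[(t1, t2)] = pair.get((t1, t2), 0) + 1
    rules = {}
    for (t1, t2), both in pair.items():
        s = single[t1] / n                           # support:    s = P(t1)
        c = both / single[t1]                        # confidence: c = P(t2 | t1)
        rules[(t1, t2)] = (s, c)
    return rules

# e.g. rules[("zip", "code")] gives (s, c) for the rule zip -> code
rules = association_rules([{"zip", "code"}, {"city", "state", "zip"},
                           {"temperature", "zip", "code"}, {"zip"}])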

4.2    Criteria for Ideal Clustering

Ideally, the parameter clustering results should have the following two features:

 1. Frequent and rare parameters should be left unclustered; strongly connected parameters in between are clustered into concepts. First, not clustering frequent parameters is consistent with the IR community's observation that this technique leads to the best performance in automatic query expansion [16]. Second, leaving rare parameters unclustered avoids over-fitting.

 2. The cohesion of a concept (the connections between parameters inside the concept) should be strong; the correlation between concepts (the connections between parameters in different concepts) should be weak.

Traditionally, cohesion is defined as the sum of squares of Euclidean distances from each point to the center of the cluster it belongs to, and correlation is defined as the sum of squares of distances between cluster centers [14]. This definition does not apply well in our context because of "the curse of dimensionality": our feature sets are so large that a Euclidean distance measure is no longer meaningful. We hence quantify the cohesion and correlation of clusters based on our association rules.

We say that t1 is closely associated with t2 if the rule t1 → t2 has a confidence greater than a threshold tc. The threshold tc is chosen manually to be the value that best separates correlated and uncorrelated pairs of terms.

Given a cluster I, we define the cohesion of I as the percentage of closely associated term pairs over all term pairs. Formally,

    cohI = ||{(i, j) | i, j ∈ I, i ≠ j, i → j(c > tc)}|| / (||I|| · (||I|| − 1))

where i → j(c > tc) is the association rule for terms i and j. As a special case, the cohesion of a single-term cluster is 1.

Given clusters I and J, we define the correlation between I and J as the percentage of closely associated cross-cluster term pairs. Formally,

    corIJ = (C(I, J) + C(J, I)) / (2 · ||I|| · ||J||)

where C(I, J) = ||{(i, j) | i ∈ I, j ∈ J, i → j(c > tc)}||.

To measure the overall quality of a clustering C, we define the cohesion/correlation score as

    scoreC = (ΣI∈C cohI / ||C||) / (ΣI,J∈C, I≠J corIJ / (||C||(||C|| − 1)/2))
           = ((||C|| − 1) · ΣI∈C cohI) / (2 · ΣI,J∈C, I≠J corIJ)

The cohesion/correlation score captures the trade-off between having a high cohesion score and a low correlation score. Our goal is to obtain a high scoreC, indicating tight connections inside clusters and loose connections between clusters.

4.3    Clustering Algorithm

We can now describe our clustering algorithm as a series of refinements to classical agglomerative clustering [18].

4.3.1    The basic agglomeration algorithm

Agglomerative clustering is a bottom-up version of hierarchical clustering. Each object is initialized to be a cluster of its own. In general, at each iteration the two most similar clusters are merged, until no more clusters can be merged.

In our context, each term is initialized to be a cluster of its own; i.e., there are as many clusters as terms. The algorithm proceeds in a greedy fashion. It sorts the association rules in descending order, first by confidence and then by support. Infrequent rules with less than a minimum support ts are discarded. At every step, the algorithm chooses the highest-ranked rule that has not been considered previously. If the two terms in the rule belong to different clusters, the algorithm merges the clusters. Formally, the condition that triggers merging clusters I and J is

    ∃ i ∈ I, j ∈ J . i → j(s > ts, c > tc)

where i and j are terms. The threshold ts is chosen to control the clustering of terms that do not occur frequently. We note that in our experiments the results of operation and input/output matching are not sensitive to the values of ts and tc.

4.3.2    Increasing cluster cohesion

The basic agglomerative algorithm merges two clusters whenever any two terms in the two clusters are closely associated. This merge condition is very loose and can easily result in clusters with low cohesion. To illustrate, suppose there is a concept for weather, containing temperature as a term, and a concept for address, containing zip as a term. If, when operations report temperature, they often report the area zipcode as well, then the confidence of the rule temperature → zip is high. As a result, the basic algorithm will inappropriately combine the weather concept and the address concept.

The cohesion of a cluster is determined by the association of each pair of terms in the cluster. To ensure that we obtain clusters with high cohesion, we merge two clusters only if they satisfy a stricter condition, called the cohesion condition.

Given a cluster C, a term is called a kernel term if it is closely associated with at least half of the remaining terms in C. (We tried different values for this fraction and found that 1/2 yielded the best results.) Our cohesion condition requires that all the terms in the merged cluster be kernel terms. Formally, we merge two clusters I and J only if they satisfy the cohesion condition:

    ∀ i ∈ I ∪ J . ||{j | j ∈ I ∪ J, j ≠ i, i → j(c > tc)}|| ≥ (||I|| + ||J|| − 1) / 2
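A sketch of a single merge pass under this stricter condition is shown below. It assumes the association rules of Section 4.1 and a precomputed set assoc of closely associated ordered term pairs (rule confidence above tc); the thresholds and helper names are placeholders, and the splitting refinement of Section 4.3.3 is omitted here.

def is_cohesive(cluster, assoc):
    """Cohesion condition: every term must be a kernel term, i.e. closely
    associated with at least half of the remaining terms in the cluster."""
    need = (len(cluster) - 1) / 2.0
    return all(sum((i, j) in assoc for j in cluster if j != i) >= need
               for i in cluster)

def merge_pass(terms, rules, assoc, ts, tc):
    """One greedy pass: process rules by descending confidence, then support."""
    cluster_of = {t: frozenset([t]) for t in terms}
    candidates = [(c, s, t1, t2) for (t1, t2), (s, c) in rules.items()
                  if s > ts and c > tc]
    for c, s, t1, t2 in sorted(candidates, reverse=True):
        I, J = cluster_of[t1], cluster_of[t2]
        if I == J:
            continue                      # already in the same cluster
        merged = I | J
        if is_cohesive(merged, assoc):    # stricter than the basic merge condition
            for t in merged:
                cluster_of[t] = merged
    return set(cluster_of.values())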

Figure 2: Splitting and merging clusters. Panels (a), (b), and (c) correspond to the three merging cases described below.

4.3.3    Splitting and Merging

A greedy algorithm pursues locally optimal solutions at each step, but usually cannot obtain the globally optimal solution. In parameter clustering, an inappropriate clustering decision at an early stage may prevent subsequent appropriate clustering. Consider the case where there is a cluster for zipcode, {zip, code}, formed because of the frequent occurrences of the parameter ZipCode. Later we need to decide whether to merge this cluster with another cluster for address, {state, city, street}. The term zip is closely associated with state, city, and street, but code is not, because code also occurs often in other parameters, such as TeamCode and ProxyCode, which typically do not co-occur with state, city, or street. Consequently, the two clusters cannot merge; the clustering result contrasts with the ideal one: {state, city, street, zip} and {code}.

The solution to this problem is to split already-formed clusters so as to obtain a better set of clusters with a higher cohesion/correlation score. Formally, given clusters I and J, we denote

    I′ = {i | i ∈ I, ||{j | j ∈ I ∪ J, i → j(c > tc)}|| ≥ (||I|| + ||J|| − 1) / 2}
    J′ = {j | j ∈ J, ||{i | i ∈ I ∪ J, j → i(c > tc)}|| ≥ (||I|| + ||J|| − 1) / 2}        (1)

Intuitively, I′ (respectively, J′) denotes the set of terms in I (respectively, J) that are closely associated with terms in the union of I and J. Our algorithm makes the splitting decision depending on which of the following cases occurs:

  • If I′ = I and J′ = J, then I and J can be merged directly (see Figure 2(a)).

  • If I′ ≠ I and J′ = J, then merging I and J directly disobeys the cohesion condition. There are two options: one is to split I into I′ and I − I′, and then merge I′ with J (see Figure 2(b)); the other is not to split or merge. We decide in two steps: the first step checks whether the merged result in the first option satisfies the cohesion condition; if so, the second step computes the cohesion/correlation score for each option, and chooses the option with the higher score. The decision is similar for the symmetric case where J′ ≠ J and I′ = I.

  • If I′ ≠ I and J′ ≠ J, then again, merging I and J directly disobeys the cohesion condition. There are two options: one is to split I into I′ and I − I′, split J into J′ and J − J′, and then merge I′ with J′ (see Figure 2(c)); the other is not to split or merge. We choose an option in two steps: the first step checks whether, in the first option, the merged result satisfies the cohesion condition; if so, the second step computes the cohesion/correlation score for each option, and chooses the option with the higher score.

After the above processing, the merged cluster necessarily satisfies the cohesion condition. However, the clusters that are split off from the original clusters may not. To ensure cohesion, we further split such clusters: each time, we split the cluster into two, one containing all kernel terms, and the other containing the rest. We repeat splitting until eventually all resulting clusters satisfy the cohesion condition. Note that applying such a splitting strategy to an arbitrary cluster may generate clusters of small size. Therefore, we do not merge two clusters directly (without applying the above judgment) and then split the merged cluster.

Remark 4.1. Our splitting-and-clustering technique is different from the dynamic modeling in the Chameleon algorithm [17], which also first splits and then merges. We do splitting and clustering at each step of the greedy algorithm. The Chameleon algorithm first considers the whole set of parameters as one big cluster and splits it into relatively small sub-clusters, and then repeatedly merges these sub-clusters.

4.3.4    Removing noise

Even with splitting, the results may still contain terms that do not express the same concept as the other terms in their cluster. We call such terms noise terms. To illustrate how noise terms can arise, we continue with the zipcode example. Suppose there is a cluster for address, {city, state, street, zip, code}, where code is a noise term. The cluster is formed because the rules zip → city, zip → state, and zip → street all have very high confidence, e.g., 90%; even if the rule code → zip has a lower confidence, e.g., 50%, the rules code → city, code → state, and code → street can still have high confidence.

We use the following heuristic to detect noise terms. A term is considered to be noise if in half of its occurrences there are no other terms from the same concept. After one pass of the greedy algorithm (considering all association rules above a given threshold), we scan the resulting concepts to remove noise terms. Formally, for a term t, denote by ||IOt|| the number of inputs/outputs that contain t, and by ||SIOt|| the number of inputs/outputs that contain t but no other terms in the same concept as t. We remove t from its concept if ||SIOt|| ≥ ||IOt|| / 2.
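Before everything is assembled in Section 4.3.5, here is a sketch of the two refinements just described: the split-and-merge decision based on formula (1), and the noise-term filter. The comparison of the cohesion/correlation scores of the split-and-merge option against the no-merge option is elided, and all names are ours.

def assoc_count(t, others, assoc):
    """Number of terms in others that t is closely associated with."""
    return sum((t, o) in assoc for o in others if o != t)

def split_and_merge(I, J, assoc):
    """Return the new clusters produced from I and J, or None to keep both."""
    need = (len(I) + len(J) - 1) / 2.0
    union = I | J
    I_prime = {i for i in I if assoc_count(i, union, assoc) >= need}   # formula (1)
    J_prime = {j for j in J if assoc_count(j, union, assoc) >= need}
    if I_prime == I and J_prime == J:
        return [I | J]                               # Figure 2(a): merge directly
    merged = I_prime | J_prime                       # Figure 2(b)/(c): split, then merge
    if not merged or any(assoc_count(t, merged, assoc) < (len(merged) - 1) / 2.0
                         for t in merged):
        return None                                  # cohesion condition fails: no change
    # (a full implementation also keeps this option only if it yields a
    #  higher cohesion/correlation score than not merging at all)
    return [merged] + [rest for rest in (I - I_prime, J - J_prime) if rest]

def remove_noise(concepts, ios):
    """Drop t from its concept if at least half of the inputs/outputs that
    contain t contain no other term of the concept (||SIO_t|| >= ||IO_t|| / 2)."""
    for concept in concepts:                         # concepts: mutable sets of terms
        for t in list(concept):
            if len(concept) < 2:
                break
            with_t = [io for io in ios if t in io]
            alone = [io for io in with_t if not (concept - {t}) & io]
            if with_t and 2 * len(alone) >= len(with_t):
                concept.remove(t)                    # t is a noise term
    return [c for c in concepts if c]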

4.3.5    Putting it all together

Figure 3 puts all the pieces together and shows the details of a single pass of the clustering algorithm.

    procedure MergeParameters(T, R) return (C)
    // T is the term set, R is the association rule set
    // C is the resulting concept set
      for (i = 1, n) Ci = {ti};   // initialize clusters
      sort R first in descending order of confidence,
        then in descending order of support;
      for each (r : t1 → t2 (s > ts, c > tc) in R)
        if t1 and t2 are in different clusters I and J
          compute I′ and J′ according to formula (1);
          if (I′ = I ∧ J′ = J) merge I and J;
          else if (splitting and merging satisfies the
             cohesion condition and has a higher scoreC)
               split and merge;
               if (I″ = I − I′ and/or J″ = J − J′
                  does not satisfy the cohesion condition)
                  split I″ and/or J″ iteratively;
      scan inputs/outputs and remove noise terms;
      return result clusters;

Figure 3: Algorithm for parameter clustering

The above algorithm still has two problems. First, the cohesion condition is too strict for large clusters, so it may prevent closely associated large clusters from merging. Second, early inappropriate merging may prevent later appropriate merging: although we do splitting, the terms taken off from the original clusters may have already missed the chance to merge with other closely associated terms. We solve these problems by running the clustering algorithm iteratively. After each pass, we replace each term with its corresponding concept, re-collect the association rules, and then re-run the clustering algorithm. This process continues until no more clusters can be merged.

We illustrate with an example that iterating the clustering does not sharply loosen the clustering condition. Consider the case where {zip} is not clustered with {temperature, windchill, humidity}, because zip is closely associated with only temperature, but not with the other two. Another iteration of clustering will replace each occurrence of temperature, windchill, and humidity with a single concept, say weather. The term zip will be closely associated with weather; however, the term weather is not necessarily closely associated with zip, because that would require zip to occur often whenever any of temperature, windchill, or humidity occurs. Thus, the iteration will (correctly) keep the two clusters separate.

4.4    Clustering Results

We now briefly outline the results of our clustering algorithm. Our dataset, which we describe in detail in Section 6, contains 431 web services and 3148 inputs/outputs. There are a total of 1599 terms. The clustering algorithm converges after the seventh run. It clusters 943 terms into 182 concepts. The remaining 656 terms, including 387 infrequent terms (each occurring in at most 3 inputs/outputs) and 54 frequent terms (each occurring in at least 30 inputs/outputs), are left unclustered. There are 59 dense clusters, each with at least 5 terms. Some of them correspond roughly to the concepts of address, contact, geology, maps, weather, finance, commerce, statistics, baseball, etc. The overall cohesion is 0.96, the correlation is 0.003, and the average cohesion of the dense clusters is 0.76. This result exhibits the two features of an ideal clustering.

5    Finding Similar Operations

In this section we describe how to predict the similarity of input/output sets and of web-service operations. We determine similarity by combining multiple sources of evidence. The intuition behind our matching algorithm is that the similarity of a pair of inputs (or outputs) is related to the similarity of the parameter names, that of the concepts represented by the parameter names, and that of the operations they belong to. Note that parameter-name similarity compares inputs/outputs at a fine-grained level, whereas concept similarity compares inputs/outputs at a coarse-grained level. The similarity between two web-service operations is related to the similarity of their descriptions, that of their inputs and outputs, and that of their host web services.

Input/output similarity: We identify the input i of a web-service operation op with a vector i = (pi, ci, op), where pi is the set of input parameter names, and ci is the set of concepts associated with the parameter names (as determined by the clustering algorithm described in Section 4). When comparing a pair of inputs, we determine the similarity of each of the three components separately, and then combine them. We treat op's output o as a vector o = (po, co, op), and process it analogously.

Web-service operation similarity: We identify a web-service operation op with a vector op = (w, f, i, o), where w is the text description of the web service to which op belongs, f is the textual description of op, and i and o denote the input and output parameters. Here too, we determine similarity by combining the similarities of the individual components of the vector.
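As a hypothetical sketch, the two kinds of vectors can be represented directly as records; the similarity functions over their components are described in Sections 5.1 and 5.2. The field names are ours.

from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class IOVector:                  # i = (p_i, c_i, op), and analogously o = (p_o, c_o, op)
    params: Set[str]             # parameter names (terms) of the input or output
    concepts: Set[str]           # concepts the terms belong to (Section 4)
    op: "OpVector"               # the operation this input/output belongs to

@dataclass
class OpVector:                  # op = (w, f, i, o)
    service_text: str            # w: text description of the host web service
    op_text: str                 # f: text description of the operation itself
    input: Optional[IOVector] = None
    output: Optional[IOVector] = None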

Observe that there is a recursive relationship between the similarity of inputs/outputs and the similarity of web-service operations. Intuitively, this relationship holds because each one depends on the other, and any decision on how to break this recursive relationship would be arbitrary. In Section 5.2 we show that, with sufficient care in choosing the combination weights, we can guarantee that the recursive computation converges.

5.1    Computing Individual Similarities

We now describe how we compute similarities for each one of the components of the vectors.

Input/output parameter name similarity: We consider the terms in an input/output as a bag of words and use the TF/IDF (Term Frequency/Inverse Document Frequency) measure [25] to compute the similarity of two such bags.

To improve our accuracy, we pre-process the terms as follows.

 1. Perform word stemming and remove stopwords. Stemming improves recall by removing term suffixes and reducing all forms of a term to a single stemmed form. Stopword removal improves precision by eliminating words with little substantive meaning.

 2. Group terms with close edit distance [21] and replace the terms in a group with a normalized form. This step helps normalize misspelled and abbreviated terms.

 3. Remove from the output bag the terms that refer to the inputs. For example, in the output parameter LocalTimeByZipCodeResult, the term By indicates that the following terms describe inputs; thus, the terms Zip and Code can be removed.

 4. Extract additional information from the names of web-service operations. Most operations are named after the output (e.g., GetWeather), and some include input information (e.g., ZipCodeToCityState). We put such terms into the corresponding input/output bag.

Input/output concept similarity: To compute the similarity of the concepts represented by the inputs/outputs, we replace each term in the bag of words described above with its corresponding concept, and then use the TF/IDF measure. Note that the clustering algorithm is applied to the input/output terms after preprocessing.

Operation description similarity: To compute the similarity of operation descriptions, we consider the tokenized operation name and WSDL documentation as a bag of words, and use the TF/IDF measure. Furthermore, we supplement this information by adding the terms in the operations' inputs and outputs to the bag of words.

Web service description similarity: To compute the similarity of web service descriptions, we create a bag of words from the following: the tokenized web service name, WSDL documentation, and UDDI description; the tokenized names of the operations in the web service; and their input and output terms. We again apply TF/IDF to this bag of words.
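Each of these component similarities ultimately compares two bags of words (or of concepts). A minimal stand-in for that comparison, using raw counts, a smoothed IDF, and the cosine measure rather than the off-the-shelf TF/IDF implementation used in our system, might look as follows.

import math
from collections import Counter

def tfidf_cosine(bag_a, bag_b, doc_freq, n_docs):
    """Cosine similarity of two term bags under TF/IDF weighting.
    doc_freq[t]: number of 'documents' (e.g. inputs/outputs) containing t."""
    def weights(bag):
        tf = Counter(bag)
        # term frequency times a smoothed inverse document frequency (illustrative)
        return {t: tf[t] * math.log(n_docs / (1.0 + doc_freq.get(t, 0)))
                for t in tf}
    wa, wb = weights(bag_a), weights(bag_b)
    dot = sum(wa[t] * wb.get(t, 0.0) for t in wa)
    norm = (math.sqrt(sum(v * v for v in wa.values())) *
            math.sqrt(sum(v * v for v in wb.values())))
    return dot / norm if norm else 0.0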
5.2    Combining Individual Similarities

We use a linear combination to combine the similarity of each component of the operation. Each type of similarity is assigned a weight that depends on its relevance to the overall similarity. Currently we set the weights manually, based on our analysis of the results from different trials. Learning these weights from direct or indirect user feedback is a subject of future work.

As noted earlier, there is a recursive dependency between the similarity of operations and that of inputs/outputs. We prove that computing the recursive similarities ultimately converges.

Proposition 1. Computing operation similarity and input/output similarity converges.

Proof (Sketch): Let Sop, Si, and So be the similarity of operations, of inputs, and of outputs. Let wi and wo be the weights for input similarity and output similarity in computing operation similarity, and let wop be the weight for operation similarity in computing input/output similarity.

We start by assigning zero to the operation similarity, and based upon it we compute input/output similarity and operation similarity iteratively. We can prove that if z = wop(wi + wo) < 1, the computation converges and the results are

    Sop^(∞) = Sop^(0) · 1/(1 − z)
    Si^(∞)  = Si^(0) + Sop^(0) · wop/(1 − z)
    So^(∞)  = So^(0) + Sop^(0) · wop/(1 − z)

where Sop^(0), Si^(0), and So^(0) are the results of the first round, and Sop^(∞), Si^(∞), and So^(∞) are the converged results.
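The fixed point can also be checked numerically. The sketch below iterates the two mutually recursive updates starting from zero operation similarity, using hypothetical base similarities and weights, and compares the result with the closed form above.

def iterate(base_op, base_in, base_out, w_in, w_out, w_op, rounds=50):
    """Iterate S_op, S_i, S_o; base_* are the non-recursive parts of each score."""
    s_op = 0.0
    for _ in range(rounds):
        s_in = base_in + w_op * s_op
        s_out = base_out + w_op * s_op
        s_op = base_op + w_in * s_in + w_out * s_out
    return s_op, s_in, s_out

base_op, base_in, base_out = 0.4, 0.3, 0.5     # hypothetical component similarities
w_in, w_out, w_op = 0.3, 0.3, 0.2              # hypothetical weights
z = w_op * (w_in + w_out)                      # z < 1 guarantees convergence
s_op_0 = base_op + w_in * base_in + w_out * base_out     # first-round S_op
assert abs(iterate(base_op, base_in, base_out, w_in, w_out, w_op)[0]
           - s_op_0 / (1 - z)) < 1e-9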
6.1    Experimental Setup                                                                                     Func          Comb            Woogle

We implemented a web-service search engine, called                                       1

                                                                                        0.9
Woogle, that has access to 790 web services from                                        0.8
the main authoritative UDDI repositories. The cov-                                      0.7

erage of Woogle is comparable to that of the other                                      0.6

                                                                           Precision
                                                                                        0.5
web-service search engines [1, 2, 3, 4]. We ran our                                     0.4
experiments on the subset of web services whose as-                                     0.3

sociated WSDL files are accessible from the web, so                                     0.2

we can extract information about their functionality                                    0.1
                                                                                         0
descriptions, inputs and outputs. This set contains                                                Top 2                   Top 5                Top 10

431 web services, and 1574 operations in total.
                                                                                                                       (a)
   Woogle performs parameter clustering, operation
matching and input/output matching offline, and                                                                         Top 2   Top 5

stores the results in a database. TF/IDF was im-                                          1

plemented using the publicly available Rainbow [6]                                      0.9
                                                                                        0.8
classification tool.                                                                    0.7

Our experiments compared our method, which we refer to as Woogle, with two naive algorithms, Func and Comb. The Func method matches operations by comparing only the words in the operation names and text documentation. The Comb method considers the words mentioned in the web service names, descriptions and parameter names as well; in contrast to Woogle, these words are all put into a single bag of words.
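To make the two baselines concrete, the sketch below (Python; the operation record layout, the tokenize helper, and the TF/IDF weighting are illustrative assumptions, not the paper's actual implementation) shows how Func and Comb differ only in which words they pour into a single bag before a standard TF/IDF cosine comparison.

    import math
    import re
    from collections import Counter

    def tokenize(text):
        # Crude word splitter; stands in for the real tokenizer.
        return re.findall(r"[a-z]+", (text or "").lower())

    def func_bag(op):
        # Func: words from the operation name and its text documentation only.
        return tokenize(op["name"]) + tokenize(op["doc"])

    def comb_bag(op):
        # Comb: the same words plus web-service name, web-service description,
        # and parameter names, all lumped into one undifferentiated bag.
        bag = func_bag(op) + tokenize(op["service_name"]) + tokenize(op["service_doc"])
        for p in op["inputs"] + op["outputs"]:
            bag += tokenize(p)
        return bag

    def tfidf_cosine(bag_a, bag_b, idf):
        # Standard TF/IDF-weighted cosine similarity between two bags of words.
        def weights(bag):
            return {w: c * idf.get(w, 0.0) for w, c in Counter(bag).items()}
        wa, wb = weights(bag_a), weights(bag_b)
        dot = sum(v * wb.get(w, 0.0) for w, v in wa.items())
        norm = math.sqrt(sum(v * v for v in wa.values())) * math.sqrt(sum(v * v for v in wb.values()))
        return dot / norm if norm else 0.0

Woogle, in contrast, keeps these word sources separate and combines the resulting similarities, which is what the experiments below evaluate.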
Figure 4: Top-k precision for Woogle similarity search. (a) Top-k precision of Func, Comb, and Woogle for operation matching. (b) Top-2 and top-5 precision for the similar-input, similar-output, compose-with-input, and compose-with-output lists.
Performance Measure: We measured overall performance using recall (r), precision (p), R-precision (p_r), and top-k precision (p_k). Consider these measures for operation matching. Let Rel be the set of relevant operations, Ret be the set of returned operations, Retrel be the set of returned relevant operations, and Retrel_k be the set of relevant operations among the top k returned operations. We define

   p = |Retrel| / |Ret|,   r = |Retrel| / |Rel|,   p_k = |Retrel_k| / k,   p_r = p_|Rel| = |Retrel_|Rel|| / |Rel|;

that is, R-precision is the precision obtained after exactly |Rel| operations have been returned.
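For concreteness, these definitions translate directly into code; the following minimal Python sketch (illustrative only, not the evaluation harness used in the experiments) computes p, r, p_k, and p_r from a ranked result list and a hand-labeled set of relevant operations.

    def top_k_precision(ranked, relevant, k):
        # p_k: fraction of the top-k returned operations that are relevant.
        return sum(1 for op in ranked[:k] if op in relevant) / k

    def precision(ranked, relevant):
        # p = |Retrel| / |Ret|
        return sum(1 for op in ranked if op in relevant) / len(ranked) if ranked else 0.0

    def recall(ranked, relevant):
        # r = |Retrel| / |Rel|
        return sum(1 for op in ranked if op in relevant) / len(relevant) if relevant else 0.0

    def r_precision(ranked, relevant):
        # p_r = p_|Rel|: precision after exactly |Rel| operations are returned.
        return top_k_precision(ranked, relevant, len(relevant)) if relevant else 0.0

    # Toy example: two of the four returned operations are relevant.
    ranked = ["GetWeather", "GetQuote", "GetForecast", "GetTime"]
    relevant = {"GetWeather", "GetForecast"}
    print(precision(ranked, relevant), recall(ranked, relevant),
          top_k_precision(ranked, relevant, 2), r_precision(ranked, relevant))
    # -> 0.5 1.0 0.5 0.5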
                                                                   Woogle and Comb both returned more than 10
   Among the above measures, pr is considered                      relevant operations. (Func may return less than 10
to most precisely capture the precision and rank-                  relevant operations because typically it obtains re-
ing quality of a system. We also plotted the re-                   sult sets of very small size.)
call/precision curve (R-P curve). In an R-P curve                     Figure 4(a) shows the results for top-k precision
figure, the X-axis represents recall, and the Y-axis               on operation matching. The top-2, top-5, and top-
represents precision. An ideal search engine has a                 10 precisions of Woogle are 98%, 83%, 68% respec-
horizontal curve with a high precision value; a bad                tively, higher than those of the two naive methods by
search engine has a horizontal curve with a low pre-               10 to 30 percentage points. This demonstrates that
cision value. The R-P curve is considered by the IR                considering different sources of evidence, and con-
community as the most informative graph showing                    sidering them separately, will increase the precision.
the effectiveness of a search engine.                              We also observe that Comb has a higher top-2 and
                                                                   top-5 precision than Func, but its top-10 precision
6.2    Measuring Precision

Given a web service, Woogle generates five lists: similar operations, operations with similar inputs, operations with similar outputs, operations that compose with the output of the given operation, and operations that compose with the input of the given operation. We evaluated the precision of these returned lists, and report the average top-2, top-5 and top-10 precision.

We selected a benchmark of 25 web-service operations for which we tried to obtain similar operations from our entire collection. When selecting these, we ensured that they come from a variety of domains and that they have different input/output sizes and description sizes. To ensure that the top-10 precision is meaningful, we selected only operations for which Woogle and Comb both returned more than 10 relevant operations. (Func may return fewer than 10 relevant operations because it typically obtains very small result sets.)

Figure 4(a) shows the results for top-k precision on operation matching. The top-2, top-5, and top-10 precisions of Woogle are 98%, 83%, and 68% respectively, higher than those of the two naive methods by 10 to 30 percentage points. This demonstrates that considering different sources of evidence, and considering them separately, increases precision. We also observe that Comb has higher top-2 and top-5 precision than Func, but its top-10 precision is lower. This shows that considering more evidence by simple combination does not greatly enhance performance.

Figure 4(b) shows the precision for the four other returned lists. Note that we report only the top-2 and top-5 precision, as these lists are much smaller. From the 25-operation test set, we selected the 20 operations whose input and output parameters are both non-empty and whose returned lists are not too short. Figure 4(b) shows that for the majority of the four lists, the top-2 and top-5 precisions are between 80% and 90%.
Figure 5: Performance for different operation matchers (Func, Comb, FuncWS, FuncIO, ParOnly, ConOnly, Woogle). (a) Precision, recall, and R-precision. (b) R-P curves.

Figure 6: Performance of different input/output matchers (ParIO, ConIO, ParConIO, Woogle). (a) Precision, recall, and R-precision. (b) R-P curves.
6.3    Measuring Recall

In order to measure recall of similarity search, we need to know the set of all operations in the collection that are relevant to a given operation. For this purpose, we created a benchmark of 8 operations from six different domains: weather (2), address (2), stock (1), sports (1), finance (1), and time (1) (weather and address are two major domains in the web-service corpus). We chose operations of different popularity: four of them have more than 30 similar operations each, and the other four each have about 10 similar operations. Among the 8 operations, one has an empty input, so we have 15 inputs/outputs in total. When choosing the operations, we ensured that their inputs/outputs convey different numbers of concepts, and that the concepts involved vary in popularity.

For each of the 8 operations, we hand-labeled the other operations in our collection as relevant or irrelevant. We began by inspecting a set of operations that had similar web service descriptions, similar operation descriptions, or similar inputs or outputs. From that list we chose the set of similar operations and labeled them as relevant; the rest were labeled as irrelevant. In a similar fashion, we labeled relevant inputs and outputs.

In this experiment we also wanted to test the contributions of the different components of Woogle. To do that, we also considered the following stripped-down variations of Woogle:

  • FuncWS: considers only operation descriptions and web service descriptions;
  • FuncIO: considers only operation descriptions, inputs, and outputs;
  • ParOnly: considers all four components, but compares inputs/outputs based only on parameter names;
  • ConOnly: considers all four components, but compares inputs/outputs based only on the concepts they express.

Figure 5(a) plots the average precision, recall and R-precision on the eight operations in the benchmark for each of the above matchers and also for Func, Comb, and Woogle. Figure 5(b) plots the average R-P curves. We observe the following.
First, Woogle generally beats all the other matchers. Its recall and R-precision are 88% and 78% respectively, much higher than those of the two naive methods. Second, considering evidence from different sources by simply putting it into one big bag of words (Comb) does not help much. This strategy only beats Func, which considers evidence from a single source. Even FuncWS, which discards all input and output information, performs better than Comb. Third, FuncIO performs better than FuncWS. This shows that in operation matching, the semantics of the inputs and outputs provides stronger evidence than the web service description, which agrees with the intuition that operation similarity depends more on input and output similarity. Fourth, Woogle performs better than ParOnly, and also slightly better than ConOnly. ParOnly has higher precision but lower recall; ConOnly has higher recall but lower precision. By considering parameter matching (fine-grained matching) and concept matching (coarse-grained matching) together, Woogle obtains a recall as high as ConOnly's and a precision as high as ParOnly's.

An interesting observation is that Woogle beats FuncIO in precision up to the point where recall reaches 80%. Also, the recall of Woogle is 8 percentage points lower than that of FuncIO. This is not surprising, because verbose textual descriptions of web services have a two-fold effect: on the one hand, they provide additional evidence, which helps significantly for the top returned operations, where the input and output already provide strong evidence; on the other hand, they contain noise that dilutes the high-quality evidence, especially at the end of the returned list, where the real evidence is weak.

In our experiments we also observed that, compared with the benefits of our clustering technique and of the structure-aware matching, tuning the parameters within a reasonable range and pre-processing the input/output terms improve performance only slightly.
6.3.1   Input/output matching

We performed an additional experiment focusing on the performance of input/output matching. This experiment considered the following matchers (an illustrative sketch of how such evidence can be combined follows the list):

  • Woogle: matches inputs/outputs by considering parameter names, their corresponding concepts, and the operations they belong to.
  • ParConIO: considers both parameter names and concepts, but not the operations.
  • ConIO: considers only concepts.
  • ParIO: considers only parameter names.
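The sketch below illustrates the kind of evidence each matcher uses; the weights, helper functions, and record layout are hypothetical, since the paper does not publish its exact combination formula. Setting w_op to zero gives ParConIO, keeping only the concept term gives ConIO, and keeping only the parameter-name term gives ParIO.

    def bag_similarity(bag_a, bag_b):
        # Dice coefficient over two sets of terms; a simple stand-in for the
        # TF/IDF-style comparison used in the paper.
        a, b = set(bag_a), set(bag_b)
        return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

    def io_similarity(io_a, io_b, concept_of, op_sim, w_par=0.4, w_con=0.4, w_op=0.2):
        # Combine three sources of evidence for matching two inputs (or outputs):
        # parameter-name similarity, concept similarity, and the similarity of
        # the operations that host them. The weights are illustrative only.
        par = bag_similarity(io_a["params"], io_b["params"])
        con = bag_similarity([concept_of[p] for p in io_a["params"]],
                             [concept_of[p] for p in io_b["params"]])
        host = op_sim(io_a["operation"], io_b["operation"])
        return w_par * par + w_con * con + w_op * host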
Figure 6(a) shows the average recall, precision and R-precision on the fifteen inputs/outputs in the benchmark for each of the above matchers. We also plotted the average R-P curves in Figure 6(b). We observe the following. Matching inputs/outputs by comparing the concepts they express significantly improves performance: the three concept-aware matchers obtain a recall 25 percentage points higher than that of ParIO. Starting from concept comparison, the performance of input/output matching can be further improved by also considering parameter-name similarity and host-operation similarity.

7    Searching with Woogle

Similarity search supplements keyword search for web services. Moreover, its core techniques power other search methods in the Woogle search engine, namely template search and composition search. These two methods go beyond keyword search by directly exploiting the semantics of web-service operations. For lack of space, we describe them only briefly.

Template search: The user can specify the functionality, input and output of the desired web-service operation, and Woogle returns a list of operations that fulfill the requirements. It is distinguished from keyword search in that (1) it explores the underlying structure of operations; and (2) the parameters of the returned operations are relevant to the user's requirement but do not necessarily contain the specific words the user typed. For example, the user can ask for operations that take the zipcode of an area and return its nine-day forecast by specifying the input as zipcode, the output as forecast, and the description as the weather in the next nine days. The inputs of the returned operations can be named zip, zipcode, or postcode; the outputs can be forecast, weather, or even temperature and humidity toward the end of the returned list.

Template search is implemented by treating the user-specified template as an operation and applying the similarity search algorithm. A key challenge is to perform the operation matching efficiently on the fly.
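A minimal sketch of that idea, assuming a similar_operations(op, corpus) routine that implements the ranking described in the previous sections (all names here are illustrative, not Woogle's API):

    def template_search(description, input_terms, output_terms, corpus, similar_operations):
        # Wrap the user's template in the same record format as a real operation
        # (field names here are illustrative) and reuse the similarity search.
        template = {
            "name": "",
            "doc": description,              # e.g. "the weather in the next nine days"
            "service_name": "",
            "service_doc": "",
            "inputs": list(input_terms),     # e.g. ["zipcode"]
            "outputs": list(output_terms),   # e.g. ["forecast"]
        }
        return similar_operations(template, corpus)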
Composition search: Much of the promise of web services lies in the ability to build complex services by composition. Composition search in Woogle returns not only single operations, but also compositions of operations that achieve the desired functionality. The composition can be of any length. For example, when no single operation satisfies the above search requirement, it is valuable to return the composition of an operation that takes a zipcode and returns city and state with an operation that takes city and state and returns the nine-day forecast.

Based on the machinery we have already built for matching operation inputs and outputs, we can discover such compositions automatically. The challenge lies in avoiding redundancy and loops in the composition. Another challenge is to discover the compositions efficiently on the fly.
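One way to discover such chains, shown below as a sketch rather than the authors' algorithm, is a bounded breadth-first search: grow a chain with any operation whose inputs are covered by the query input plus the outputs produced so far, never reuse an operation within a chain (which avoids loops and redundancy), and report chains whose accumulated outputs cover the desired output. The io_matches predicate and the operation representation are assumptions.

    from collections import deque

    def find_compositions(query_in, query_out, ops, io_matches, max_len=3):
        # Each queue entry is (chain of operations so far, terms available so far).
        results = []
        queue = deque([([], frozenset(query_in))])
        while queue:
            chain, available = queue.popleft()
            if io_matches(available, query_out):
                if chain:                   # a non-empty chain that reaches the goal
                    results.append(chain)
                continue                    # do not extend a chain that already suffices
            if len(chain) >= max_len:
                continue
            for op in ops:
                if op in chain:             # no operation appears twice in a chain
                    continue
                if io_matches(available, op["inputs"]):
                    queue.append((chain + [op], available | frozenset(op["outputs"])))
        return results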
8    Conclusions and Future Work

As the use of web services grows, the problem of searching for relevant services and operations will become more acute. We proposed a set of similarity-search primitives for web-service operations, and described algorithms for implementing these searches effectively. Our algorithms exploit the structure of web services and employ a novel clustering mechanism that groups parameter names into meaningful concepts. We implemented our algorithms in Woogle, a web-service search engine, and experimented on a set of over 1500 operations. The experimental results show that our techniques significantly improve precision and recall compared with two naive methods, and perform well overall.

In future work, we plan to expand Woogle to include automatic web-service invocation; i.e., after finding candidate operations, Woogle should be able to fill in the input parameters and invoke the operations automatically for the user. This direction is particularly promising because it will, in the end, make it possible to answer questions such as "what is the weather in the area with zipcode 98195?".

While this paper focuses exclusively on search for web services, the search strategy we have developed applies to other important domains. As a prime example, if we model web forms as web-service operations, a deep-web search can be performed by first finding web forms with the desired functionality, and then automatically filling in the inputs and displaying the results. As another example, applying template search and composition search to class libraries (considering each class as a web service, and each of its methods as a web-service operation) would be a valuable tool for software component reuse.

Acknowledgments

We would like to thank Pedro Domingos, Oren Etzioni and Zack Ives for many helpful discussions, and the reviewers of this paper for their insightful comments. This work was supported by NSF ITR grant IIS-0205635 and NSF CAREER grant IIS-9985114.

References

 [1] Binding Point. http://www.bindingpoint.com/.
 [2] Grand Central. http://www.grandcentral.com/directory/.
 [3] Salcentral. http://www.salcentral.com/.
 [4] Web Service List. http://www.webservicelist.com/.
 [5] WordNet. http://www.cogsci.princeton.edu/~wn/.
 [6] Rainbow. http://www.cs.cmu.edu/~mccallum/bow, 2003.
 [7] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo. Fast discovery of association rules. Advances in Knowledge Discovery and Data Mining, 1996.
 [8] J. Cardoso. Quality of Service and Semantic Composition of Workflows. PhD thesis, University of Georgia, 2002.
 [9] DAML-S Coalition. DAML-S: Web service description for the semantic web. In ISWC, 2002.
[10] S. Cost and S. Salzberg. A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning, 10:57-78, 1993.
[11] S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. Indexing by latent semantic analysis. JASIS, 41(6):391-407, 1990.
[12] H.-H. Do and E. Rahm. COMA - A System for Flexible Combination of Schema Matching Approaches. In Proc. of VLDB, 2002.
[13] A. Doan, P. Domingos, and A. Halevy. Reconciling schemas of disparate data sources: a machine learning approach. In Proc. of SIGMOD, 2001.
[14] D. Hand, H. Mannila, and P. Smyth. Principles of Data Mining. The MIT Press, 2001.
[15] A. Hess and N. Kushmerick. Learning to attach semantic metadata to web services. In ISWC, 2003.
[16] K. S. Jones. Automatic keyword classification for information retrieval. Archon Books, 1971.
[17] G. Karypis, E. H. Han, and V. Kumar. Chameleon: A hierarchical clustering algorithm using dynamic modeling. COMPUTER, 32, 1999.
[18] L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, New York, 1990.
[19] L. S. Larkey. Automatic essay grading using text classification techniques. In Proc. of ACM SIGIR, 1998.
[20] L. S. Larkey and W. Croft. Combining classifiers in text categorization. In Proc. of ACM SIGIR, 1996.
[21] V. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10:707-710, 1966.
[22] S. Melnik, H. Garcia-Molina, and E. Rahm. Similarity Flooding: A Versatile Graph Matching Algorithm. In Proc. of ICDE, 2002.
[23] M. Paolucci, T. Kawamura, T. Payne, and K. Sycara. Semantic matching of web services capabilities. In Proc. of the International Semantic Web Conference (ISWC), 2002.
[24] E. Rahm and P. A. Bernstein. A survey of approaches to automatic schema matching. VLDB Journal, 10(4), 2001.
[25] G. Salton, editor. The SMART Retrieval System - Experiments in Automatic Document Retrieval. Prentice Hall Inc., Englewood Cliffs, NJ, 1971.
[26] E. Sirin, J. Hendler, and B. Parsia. Semi-automatic composition of web services using semantic descriptions. In WSMAI-2003, 2003.
[27] Y. Yang and J. Pedersen. A comparative study on feature selection in text categorization. In International Conference on Machine Learning, 1997.
[28] A. M. Zaremski and J. M. Wing. Specification matching of software components. TOSEM, 6:333-369, 1997.
