Recommendations - Item-to-Item Collaborative Filtering

Industry Report

                                  Item-to-Item Collaborative Filtering
Greg Linden, Brent Smith, and Jeremy York • Amazon.com

IEEE INTERNET COMPUTING, JANUARY • FEBRUARY 2003. Published by the IEEE Computer Society. 1089-7801/03/$17.00 © 2003 IEEE

Recommendation algorithms are best known for their use on e-commerce Web sites [1], where they use input about a customer’s interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists.

At Amazon.com, we use recommendation algorithms to personalize the online store for each customer. The store radically changes based on customer interests, showing programming titles to a software engineer and baby toys to a new mother. The click-through and conversion rates, two important measures of Web-based and email advertising effectiveness, vastly exceed those of untargeted content such as banner advertisements and top-seller lists.

E-commerce recommendation algorithms often operate in a challenging environment. For example:

• A large retailer might have huge amounts of data, tens of millions of customers and millions of distinct catalog items.
• Many applications require the results set to be returned in real time, in no more than half a second, while still producing high-quality recommendations.
• New customers typically have extremely limited information, based on only a few purchases or product ratings.
• Older customers can have a glut of information, based on thousands of purchases and ratings.
• Customer data is volatile: Each interaction provides valuable customer data, and the algorithm must respond immediately to new information.

There are three common approaches to solving the recommendation problem: traditional collaborative filtering, cluster models, and search-based methods. Here, we compare these methods with our algorithm, which we call item-to-item collaborative filtering. Unlike traditional collaborative filtering, our algorithm’s online computation scales independently of the number of customers and number of items in the product catalog. Our algorithm produces recommendations in real time, scales to massive data sets, and generates high-quality recommendations.

Recommendation Algorithms
Most recommendation algorithms start by finding a set of customers whose purchased and rated items overlap the user’s purchased and rated items [2]. The algorithm aggregates items from these similar customers, eliminates items the user has already purchased or rated, and recommends the remaining items to the user. Two popular versions of these algorithms are collaborative filtering and cluster models. Other algorithms, including search-based methods and our own item-to-item collaborative filtering, focus on finding similar items, not similar customers. For each of the user’s purchased and rated items, the algorithm attempts to find similar items. It then aggregates the similar items and recommends them.

Traditional Collaborative Filtering
A traditional collaborative filtering algorithm represents a customer as an N-dimensional vector of items, where N is the number of distinct catalog items. The components of the vector are positive for purchased or positively rated items and negative for negatively rated items. To compensate for
best-selling items, the algorithm typically multiplies the vector components by the inverse frequency (the inverse of the number of customers who have purchased or rated the item), making less well-known items much more relevant [3]. For almost all customers, this vector is extremely sparse.

The algorithm generates recommendations based on a few customers who are most similar to the user. It can measure the similarity of two customers, A and B, in various ways; a common method is to measure the cosine of the angle between the two vectors [4]:

\[ \mathrm{similarity}(\vec{A}, \vec{B}) = \cos(\vec{A}, \vec{B}) = \frac{\vec{A} \cdot \vec{B}}{\lVert \vec{A} \rVert \, \lVert \vec{B} \rVert} \]

The algorithm can select recommendations from the similar customers’ items using various methods as well; a common technique is to rank each item according to how many similar customers purchased it.

Using collaborative filtering to generate recommendations is computationally expensive. It is O(MN) in the worst case, where M is the number of customers and N is the number of product catalog items, since it examines M customers and up to N items for each customer. However, because the average customer vector is extremely sparse, the algorithm’s performance tends to be closer to O(M + N). Scanning every customer is approximately O(M), not O(MN), because almost all customer vectors contain a small number of items, regardless of the size of the catalog. But there are a few customers who have purchased or rated a significant percentage of the catalog, requiring O(N) processing time. Thus, the final performance of the algorithm is approximately O(M + N). Even so, for very large data sets (such as 10 million or more customers and 1 million or more catalog items) the algorithm encounters severe performance and scaling issues.

It is possible to partially address these scaling issues by reducing the data size [4]. We can reduce M by randomly sampling the customers or discarding customers with few purchases, and reduce N by discarding very popular or unpopular items. It is also possible to reduce the number of items examined by a small, constant factor by partitioning the item space based on product category or subject classification. Dimensionality reduction techniques such as clustering and principal component analysis can reduce M or N by a large factor [5].

Unfortunately, all these methods also reduce recommendation quality in several ways. First, if the algorithm examines only a small customer sample, the selected customers will be less similar to the user. Second, item-space partitioning restricts recommendations to a specific product or subject area. Third, if the algorithm discards the most popular or unpopular items, they will never appear as recommendations, and customers who have purchased only those items will not get recommendations. Dimensionality reduction techniques applied to the item space tend to have the same effect by eliminating low-frequency items. Dimensionality reduction applied to the customer space effectively groups similar customers into clusters; as we now describe, such clustering can also degrade recommendation quality.

Cluster Models
To find customers who are similar to the user, cluster models divide the customer base into many segments and treat the task as a classification problem. The algorithm’s goal is to assign the user to the segment containing the most similar customers. It then uses the purchases and ratings of the customers in the segment to generate recommendations.

The segments typically are created using a clustering or other unsupervised learning algorithm, although some applications use manually determined segments. Using a similarity metric, a clustering algorithm groups the most similar customers together to form clusters or segments. Because optimal clustering over large data sets is impractical, most applications use various forms of greedy cluster generation. These algorithms typically start with an initial set of segments, which often contain one randomly selected customer each. They then repeatedly match customers to the existing segments, usually with some provision for creating new or merging existing segments [6]. For very large data sets, especially those with high dimensionality, sampling or dimensionality reduction is also necessary.

Once the algorithm generates the segments, it computes the user’s similarity to vectors that summarize each segment, then chooses the segment with the strongest similarity and classifies the user accordingly. Some algorithms classify users into multiple segments and describe the strength of each relationship [7].

Cluster models have better online scalability and performance than collaborative filtering [3] because they compare the user to a controlled number of segments rather than the entire customer base. The complex and expensive clustering computation is run offline. However, recommendation quality is low [1]. Cluster models group numerous customers together in a segment, match a user to a segment, and then consider all customers in the segment similar customers for the purpose of making recommendations. Because the similar customers that the cluster models find are not the most similar customers, the recommendations they produce are less relevant. It is possible to improve quality by using numerous fine-grained segments, but then online user–segment classification becomes almost as expensive as finding similar customers using collaborative filtering.

Search-Based Methods
Search- or content-based methods treat the recommendations problem as a search for related items [8]. Given the user’s purchased and rated items, the algorithm constructs a search query to find other popular items by the same author, artist, or director, or with similar keywords or subjects. If a customer buys the Godfather DVD Collection, for example, the system might recommend other crime drama titles, other titles starring Marlon Brando, or other movies directed by Francis Ford Coppola.

If the user has few purchases or ratings, search-based recommendation algorithms scale and perform well. For users with thousands of purchases, however, it’s impractical to base a query on all the items. The algorithm must use a subset or summary of the data, reducing quality. In all cases, recommendation quality is relatively poor. The recommendations are often either too general (such as best-selling drama DVD titles) or too narrow (such as all books by the same author). Recommendations should help a customer find and discover new, relevant, and interesting items. Popular items by the same author or in the same subject category fail to achieve this goal.

Amazon.com Item-to-Item Collaborative Filtering
Amazon.com uses recommendations as a targeted marketing tool in many email campaigns and on most of its Web sites’ pages, including the high-traffic Amazon.com homepage. Clicking on the “Your Recommendations” link leads customers to an area where they can filter their recommendations by product line and subject area, rate the recommended products, rate their previous purchases, and see why items are recommended (see Figure 1).

Figure 1. The “Your Recommendations” feature on the Amazon.com homepage. Using this feature, customers can sort recommendations and add their own product ratings.

As Figure 2 shows, our shopping cart recommendations offer customers product suggestions based on the items in their shopping cart. The feature is similar to the impulse items in a supermarket checkout line, but our impulse items are targeted to each customer.

Figure 2. Amazon.com shopping cart recommendations. The recommendations are based on the items in the customer’s cart: The Pragmatic Programmer and Physics for Game Developers.

Amazon.com extensively uses recommendation algorithms to personalize its Web site to each customer’s interests. Because existing recommendation algorithms cannot scale to Amazon.com’s tens of millions of customers and products, we developed our own. Our algorithm, item-to-item collaborative filtering, scales to massive data sets and produces high-quality recommendations in real time.

How It Works
Rather than matching the user to similar customers, item-to-item collaborative filtering matches each of the user’s purchased and rated items to similar items, then combines those similar items into a recommendation list [9].

To determine the most-similar match for a given item, the algorithm builds a similar-items table by finding items that customers tend to purchase together. We could build a product-to-product matrix by iterating through all item pairs and computing a similarity metric for each pair. However, many product pairs have no common customers, and thus the approach is inefficient in terms of processing time and memory usage. The following
iterative algorithm provides a better approach by calculating the similarity between a single product and all related products:

   For each item in product catalog, I1
      For each customer C who purchased I1
         For each item I2 purchased by customer C
            Record that a customer purchased I1 and I2
      For each item I2
         Compute the similarity between I1 and I2

It’s possible to compute the similarity between two items in various ways, but a common method is to use the cosine measure we described earlier, in which each vector corresponds to an item rather than a customer, and the vector’s M dimensions correspond to customers who have purchased that item.

This offline computation of the similar-items table is extremely time intensive, with O(N^2 M) as worst case. In practice, however, it’s closer to O(NM), as most customers have very few purchases. Sampling customers who purchase best-selling titles reduces runtime even further, with little reduction in quality.

Given a similar-items table, the algorithm finds items similar to each of the user’s purchases and ratings, aggregates those items, and then recommends the most popular or correlated items. This computation is very quick, depending only on the number of items the user purchased or rated.

Scalability: A Comparison
Amazon.com has more than 29 million customers and several million catalog items. Other major retailers have comparably large data sources. While all this data offers opportunity, it’s also a curse, breaking the backs of algorithms designed for data sets three orders of magnitude smaller. Almost all existing algorithms were evaluated over small data sets. For example, the MovieLens data set [4] contains 35,000 customers and 3,000 items, and the EachMovie data set [3] contains 4,000 customers and 1,600 items.

For very large data sets, a scalable recommendation algorithm must perform the most expensive calculations offline. As a brief comparison shows, existing methods fall short:

• Traditional collaborative filtering does little or no offline computation, and its online computation scales with the number of customers and catalog items. The algorithm is impractical on large data sets, unless it uses dimensionality reduction, sampling, or partitioning, all of which reduce recommendation quality.
• Cluster models can perform much of the computation offline, but recommendation quality is relatively poor. To improve it, it’s possible to increase the number of segments, but this makes the online user–segment classification expensive.
• Search-based models build keyword, category, and author indexes offline, but fail to provide recommendations with interesting, targeted titles. They also scale poorly for customers with numerous purchases and ratings.

The key to item-to-item collaborative filtering’s scalability and performance is that it creates the expensive similar-items table offline. The algorithm’s online component (looking up similar items for the user’s purchases and ratings) scales independently of the catalog size or the total number of customers; it is dependent only on how many titles the user has purchased or rated. Thus, the algorithm is fast even for extremely large data sets. Because the algorithm recommends highly correlated similar items, recommendation quality is excellent [10]. Unlike traditional collaborative filtering, the algorithm also performs well with limited user data, producing high-quality recommendations based on as few as two or three items.

Conclusion
Recommendation algorithms provide an effective form of targeted marketing by creating a personalized shopping experience for each customer. For large retailers like Amazon.com, a good recommendation algorithm is scalable over very large customer bases and product catalogs, requires only subsecond processing time to generate online recommendations, is able to react immediately to changes in a user’s data, and makes compelling recommendations for all users regardless of the number of purchases and ratings. Unlike other algorithms, item-to-item collaborative filtering is able to meet this challenge.

In the future, we expect the retail industry to more broadly apply recommendation algorithms for targeted marketing, both online and offline. While e-commerce businesses have the easiest vehicles for personalization, the technology’s increased conversion rates as compared with traditional broad-scale approaches will also make it compelling to offline retailers for use in postal mailings, coupons, and other forms of customer communication.
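
As an illustration of the cosine measure described under Traditional Collaborative Filtering, the following is a minimal Python sketch, not the authors' implementation. It assumes purchase-only data (no negative ratings), stores each customer vector sparsely as a dictionary, and weights each item by the inverse of the number of customers who bought it. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_vectors(purchases):
    """Build sparse customer vectors: {item: 1 / number of customers who bought it}."""
    item_counts = Counter(
        item for items in purchases.values() for item in set(items)
    )
    return {
        customer: {item: 1.0 / item_counts[item] for item in set(items)}
        for customer, items in purchases.items()
    }


def cosine(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(weight * b[item] for item, weight in a.items() if item in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_customers(user, vectors, k=2):
    """Scan every other customer (the O(M) step) and keep the k most similar."""
    scored = [
        (cosine(vectors[user], vec), other)
        for other, vec in vectors.items()
        if other != user
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(other, score) for score, other in scored[:k]]


# Hypothetical toy data: customer -> purchased items.
purchases = {
    "ann": ["book_a", "book_b"],
    "bob": ["book_a", "book_b", "dvd_c"],
    "cat": ["dvd_c"],
}
vectors = inverse_frequency_vectors(purchases)
neighbours = most_similar_customers("ann", vectors, k=1)
```

Because each vector holds only the items a customer actually touched, one comparison costs time proportional to the shorter vector rather than to the catalog size N, which matches the article's point about sparsity.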

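
The offline table build and the online lookup described under How It Works can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the production algorithm: it uses binary purchase data, the cosine measure over item vectors, and a simple sum of similarity scores to aggregate candidates (the article leaves the aggregation rule open). The names `build_similar_items` and `recommend` and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def build_similar_items(purchases):
    """Offline step: for each item, score only items that share a customer."""
    customers_of = defaultdict(set)  # item -> customers who bought it
    for customer, items in purchases.items():
        for item in items:
            customers_of[item].add(customer)

    table = {}
    for item1, buyers in customers_of.items():
        co_counts = defaultdict(int)  # item2 -> customers who bought both
        for customer in buyers:
            for item2 in set(purchases[customer]):
                if item2 != item1:
                    co_counts[item2] += 1
        # Cosine over binary purchase vectors:
        # |bought both| / sqrt(|bought item1| * |bought item2|)
        table[item1] = {
            item2: n / math.sqrt(len(buyers) * len(customers_of[item2]))
            for item2, n in co_counts.items()
        }
    return table


def recommend(user_items, table, top_n=3):
    """Online step: cost depends only on how many items the user has."""
    owned = set(user_items)
    scores = defaultdict(float)
    for item in owned:
        for other, sim in table.get(item, {}).items():
            if other not in owned:
                scores[other] += sim  # assumed aggregation: sum of similarities
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical toy data: customer -> purchased items.
purchases = {
    "c1": ["A", "B"],
    "c2": ["A", "B", "C"],
    "c3": ["B", "C"],
    "c4": ["D"],
}
table = build_similar_items(purchases)
suggestions = recommend(["A"], table)
```

Item pairs with no common customer never enter the table, which is the saving over iterating through all item pairs. Only `recommend` needs to run at request time; `build_similar_items` can be rerun offline on whatever schedule the data volume allows.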
References
1. J.B. Schafer, J.A. Konstan, and J. Riedl, “E-Commerce Recommendation Applications,” Data Mining and Knowledge Discovery, Kluwer Academic, 2001, pp. 115-153.
2. P. Resnick et al., “GroupLens: An Open Architecture for Collaborative Filtering of Netnews,” Proc. ACM 1994 Conf. Computer Supported Cooperative Work, ACM Press, 1994, pp. 175-186.
3. J. Breese, D. Heckerman, and C. Kadie, “Empirical Analysis of Predictive Algorithms for Collaborative Filtering,” Proc. 14th Conf. Uncertainty in Artificial Intelligence, Morgan Kaufmann, 1998, pp. 43-52.
4. B.M. Sarwar et al., “Analysis of Recommendation Algorithms for E-Commerce,” ACM Conf. Electronic Commerce, ACM Press, 2000, pp. 158-167.
5. K. Goldberg et al., “Eigentaste: A Constant Time Collaborative Filtering Algorithm,” Information Retrieval J., vol. 4, no. 2, July 2001, pp. 133-151.
6. P.S. Bradley, U.M. Fayyad, and C. Reina, “Scaling Clustering Algorithms to Large Databases,” Knowledge Discovery and Data Mining, Kluwer Academic, 1998, pp. 9-15.
7. L. Ungar and D. Foster, “Clustering Methods for Collaborative Filtering,” Proc. Workshop on Recommendation Systems, AAAI Press, 1998.
8. M. Balabanovic and Y. Shoham, “Content-Based Collaborative Recommendation,” Comm. ACM, Mar. 1997, pp. 66-72.
9. G.D. Linden, J.A. Jacobi, and E.A. Benson, Collaborative Recommendations Using Item-to-Item Similarity Mappings, US Patent 6,266,649 (to Amazon.com), Patent and Trademark Office, Washington, D.C., 2001.
10. B.M. Sarwar et al., “Item-Based Collaborative Filtering Recommendation Algorithms,” 10th Int’l World Wide Web Conference, ACM Press, 2001, pp. 285-295.

Greg Linden was cofounder, researcher, and senior manager in the Amazon.com Personalization Group, where he designed and developed the recommendation algorithm. He is currently a graduate student in management in the Sloan Program at Stanford University’s Graduate School of Business. His research interests include recommendation systems, personalization, data mining, and artificial intelligence. Linden received an MS in computer science from the University of Washington.

Brent Smith leads the Automated Merchandising team at Amazon.com. His research interests include data mining, machine learning, and recommendation systems. He received a BS in mathematics from the University of California, San Diego, and an MS in mathematics from the University of Washington, where he did graduate work in differential geometry.

Jeremy York leads the Automated Content Selection and Delivery team at Amazon.com. His interests include statistical models for categorical data, recommendation systems, and optimal choice of Web site display components. He received a PhD in statistics from the University of Washington, where his thesis won the Leonard J. Savage award for best thesis in applied Bayesian econometrics and statistics.