DEPARTMENT OF INFORMATICS
           TECHNISCHE UNIVERSITÄT MÜNCHEN

             Master’s Thesis in Informatics

IT-Assisted Provision of Product Data to Online
    Retailers in the Home & Living Sector

                 Philipp Schlieker
DEPARTMENT OF INFORMATICS
           TECHNISCHE UNIVERSITÄT MÜNCHEN

             Master’s Thesis in Informatics

IT-Assisted Provision of Product Data to Online
    Retailers in the Home & Living Sector

      IT-Unterstützte Bereitstellung von
    Produktdaten an Onlineshops im
          Home- & Living-Bereich

           Author:            Philipp Schlieker
           Supervisor:        Prof. Dr. Florian Matthes
           Advisor:           Tim Schopf
           Submission Date:   15.08.2021
I confirm that this master’s thesis in informatics is my own work and I have documented all
sources and material used.

Munich, 15.08.2021                               Philipp Schlieker
Acknowledgments

   I want to thank all the people who helped me write this thesis. First, I would like to thank
everyone who took time out of their busy schedules to help me conduct my interviews. This
was truly helpful and, beyond the thesis itself, allowed me to learn a lot. Next, I would like to
thank my advisor Tim for all his input, ideas, and feedback. Further, I would like to thank
Prof. Matthes for his very good and pointed questions. I am also grateful for the support and
understanding of my co-founder Daniel during the last months. Last but not least, I would
like to show my appreciation for the never-ending help of my girlfriend Anika.
Abstract
The proliferation of e-commerce in the Home & Living industry has increased the importance
of product data, such as information about size, color, and material. In most cases, online
retailers require their suppliers to provide this information about their products. This is
mainly done by entering the information into Excel templates provided by the online retailers,
which define the syntactic and semantic structure. Due to a lack of systems supporting the
suppliers and the differences among these templates, this process is largely manual. This
thesis first presents a clear picture of this process in practice by conducting interviews and
analyzing the data structures of manufacturers and online retailers. The limited data quality
of manufacturers, as well as great syntactic and semantic differences among the online
retailers’ templates, pose challenges for automated exchange. Based on these results, different
IT-based approaches to assisting with the provision of product data in the Home & Living
industry are explored. A best-practice approach leveraging a common ontology and separating
concerns is presented and evaluated as a proof of concept. Further interviews confirm the
proposed system.

Kurzfassung
The spread of e-commerce in the Home & Living sector has increased the importance of
product data such as information on size, color, and material. In most cases, online retailers
require their suppliers to provide the information about their products. This is mainly done
by entering the information into Excel templates from the online retailers, which define the
syntactic and semantic structure. Due to a lack of systems supporting the suppliers and the
differences among these templates, this process is largely carried out manually. This thesis
first depicts the process by means of interviews and an analysis of the data structures of
manufacturers and online retailers. The limited data quality of the manufacturers, as well as
large syntactic and semantic differences among the online retailers’ templates, pose challenges
for automated exchange. Based on these results, various approaches to the IT-assisted
provision of product data to online shops in the Home & Living sector are examined. A
best-practice approach that uses a common ontology and separates responsibilities is
presented and evaluated as a proof of concept. Interviews conducted afterwards confirm the
proposed system.

Contents

Acknowledgments

Abstract

Kurzfassung

1. Introduction

2. Foundations
   2.1. Research Methodology
   2.2. Research Questions

3. Related Work
   3.1. Data Exchange in the Home & Living Sector
   3.2. Industrial Information Integration
        3.2.1. Metamodel-Based Information Integration
        3.2.2. Ontology & Schema Matching
        3.2.3. Intra-Organizational Information Integration
        3.2.4. Inter-Organizational Information Integration
   3.3. Product Catalog Integration
        3.3.1. Layered Integration
        3.3.2. Syntactic Integration using XML
        3.3.3. Semantic Integration using Ontologies
        3.3.4. Integration using Mediators
        3.3.5. Information Extraction

4. State of the Art

5. IT-Assisted Provision of Product Data
   5.1. Problem Definition
        5.1.1. Source Formats of Manufacturers
        5.1.2. Target Formats of Online Shops
        5.1.3. Analysis of Transformations between Formats
   5.2. Approaches
        5.2.1. Theoretical Evaluation Schemata
        5.2.2. Possible Approaches
        5.2.3. Theoretical Evaluation
   5.3. Design Principles
        5.3.1. Metamodel-Based Integration
        5.3.2. Separation of Concerns
        5.3.3. Mediator
        5.3.4. Ontology
   5.4. Architecture
        5.4.1. Syntax Layer
        5.4.2. Normalization Layer
        5.4.3. Data Model Layer
        5.4.4. Ontology Layer
        5.4.5. Enrichment Layer
   5.5. Design of Common Ontology
        5.5.1. Methodology
        5.5.2. Purpose and Scope
        5.5.3. Building of Ontology
   5.6. Implementation
   5.7. Evaluation
        5.7.1. Quantitative Evaluation
        5.7.2. Qualitative Evaluation

6. Discussion

7. Conclusion

A. General Addenda
   A.1. Interview Guide
   A.2. Interview Summaries
        A.2.1. Interview 1
        A.2.2. Interview 2
        A.2.3. Interview 3
        A.2.4. Interview 4
        A.2.5. Interview 5
        A.2.6. Interview 6

List of Figures

List of Tables

Acronyms

Bibliography
1. Introduction
E-commerce is currently one of the fastest-growing sales channels [1]. In recent years, this
has also applied to the Home & Living industry [2], with 34% of customers in Germany
expressing a preference for online over offline sales in a 2017 survey [3]. Interviews have
shown that product data, such as descriptions of size, material, and color, plays a key role in
customers’ purchasing decisions [4]. The present work addresses the exchange of this data
between manufacturers in the Home & Living industry and their retail partners.
The exchange of product data in the Home & Living industry comes with the general
challenges of data exchange in B2B transactions. In the 2000s, the advent of XML documents
that follow a predefined Document Type Definition (DTD) eliminated the first interoperability
challenges in many industries on a purely syntactic level. Following this, many different
standards for these XML documents emerged, leaving the challenge of integrating them,
especially on a semantic level [5]. The usage of ontologies as a shared definition of the
vocabulary used, or as data sources including semantics, has been proposed as a solution [6].
Large initiatives have tried to introduce common standards and ontologies with varying
success. One reason is that different stakeholders often neither agree on the proper structure
of product data nor share the same requirements. The lack of standards in many cases leaves
the challenge of product data integration open [7]. This is also the case in the Home & Living
industry.
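To make the semantic integration problem concrete, the following is a minimal Python sketch, invented for illustration: the retailer names, field names, and synonym table are hypothetical. Two schemas describe the same product attribute with different names, units, and vocabulary, and a small shared vocabulary, standing in for a common ontology, reconciles them.

```python
# Two hypothetical retailer templates describing the same product.
RETAILER_A = {"Farbe": "anthrazit", "Breite (cm)": "120"}
RETAILER_B = {"colour": "anthracite", "width_mm": "1200"}

# Per-retailer mapping of local field names onto shared vocabulary terms.
FIELD_MAPPINGS = {
    "retailer_a": {"Farbe": "color", "Breite (cm)": "width_cm"},
    "retailer_b": {"colour": "color", "width_mm": "width_mm"},
}

def normalize(term: str, value: str) -> tuple[str, str]:
    """Value-level normalization: unify units and vocabulary."""
    if term == "width_mm":                      # convert mm to the shared cm unit
        return "width_cm", f"{float(value) / 10:g}"
    if term == "color":
        synonyms = {"anthrazit": "anthracite"}  # tiny illustrative synonym table
        return term, synonyms.get(value, value)
    return term, value

def to_common(record: dict, retailer: str) -> dict:
    """Translate a retailer-specific record into shared vocabulary terms."""
    common = {}
    for field, value in record.items():
        term, norm_value = normalize(FIELD_MAPPINGS[retailer][field], value)
        common[term] = norm_value
    return common

print(to_common(RETAILER_A, "retailer_a"))  # → {'color': 'anthracite', 'width_cm': '120'}
print(to_common(RETAILER_B, "retailer_b"))  # → {'color': 'anthracite', 'width_cm': '120'}
```

Both records converge on the same representation only because the mapping resolves naming, unit, and vocabulary differences at once; an ontology generalizes exactly this role across many partners.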
The online share of the Home & Living industry has increased, and with it the importance of
product data. Interviews with suppliers of online retailers in the Home & Living industry
have shown that the requirements towards product data have drastically increased. Whereas
before only common information, such as packaging information, was required, this has
extended to very granular information, e.g., on the material. The provision of such product
data is currently a mainly manual process that consumes large amounts of resources and is
error-prone. The importance of, as well as the challenges behind, product data exchange in
other industries has long been a topic of research [8][5][9][7]. Therefore, the present work
explores approaches towards the IT-assisted provision of product data to online retailers in
the Home & Living industry. The objective is to provide a clear picture of the current state of
the art, an analysis of possible approaches, and a best-practice approach to guide future
development.
The thesis proceeds as follows. First, the current state of the art is analyzed. This is
done through semi-structured expert interviews to ensure practical relevance. Throughout
the interviews, it became clear that no standard exists within the industry. This leaves the
challenge of product data exchange and integration. The most common approach is the
exchange of Excel files between manufacturers and online retailers. In most cases, the online
retailer will provide a template with prefilled values indicating the required structure. The
manufacturer will then fill in the information for the products to be sold. Next, in order to get
a deeper understanding of the requirements for product data integration, the product data
structures of manufacturers and online retailers are analyzed. Based on the identified
challenges, best practices are identified within the literature. These are used to answer the
question of which IT-based approaches could support users with the provision of product
data. The approaches are then compared, and the most promising one is implemented and
evaluated as a Proof of Concept (PoC). The comparison is done through experiments and
semi-structured expert interviews. Last but not least, an outlook on future work is given.
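The template-filling step described above can be sketched in a few lines of Python. All column and product names here are invented for illustration; real retailer templates are Excel files with far more columns and per-column value constraints.

```python
# Hypothetical required columns of one retailer's template.
TEMPLATE_COLUMNS = ["SKU", "Product Name", "Color", "Material", "Package Height (cm)"]

# A manufacturer's record, typically exported from an ERP or maintained in Excel.
manufacturer_record = {
    "SKU": "CH-1042",
    "Product Name": "Oak Side Table",
    "Color": "natural oak",
    # "Material" and "Package Height (cm)" are absent and would have to be
    # researched and entered by hand -- the manual effort the thesis targets.
}

def fill_template(record: dict, columns: list[str]) -> tuple[dict, list[str]]:
    """Fill one template row; report the columns that still need manual work."""
    row = {col: record.get(col, "") for col in columns}
    missing = [col for col in columns if not row[col]]
    return row, missing

row, missing = fill_template(manufacturer_record, TEMPLATE_COLUMNS)
print(missing)  # → ['Material', 'Package Height (cm)']
```

Multiplied over hundreds of products and a different template per retailer, this gap-reporting step is where most of the manual, error-prone effort accumulates.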

2. Foundations

2.1. Research Methodology
The research methodology applied in this work is based on the Design Science Research
(DSR) Framework introduced by Hevner, March, Park, and Ram [10]. DSR aims to unify two
components that are deemed fundamental to the Information Systems (IS) discipline. On one
side is behavioral science, which aims to develop and verify theories connected to human or
organizational behavior. On the other side is design science, which innovates by creating
new artifacts, thereby extending human possibilities. DSR combines these two sides into a
framework for understanding, executing, and evaluating research. Figure 2.1 shows the
resulting overall framework.

      Figure 2.1.: Overview of the Information Systems Research Framework by [10]

The overall framework contains three main components: environment, IS research, and
the knowledge base. The environment describes the problem space which is addressed.
This includes mainly the people, organizations, and their existing or planned technologies.
Together these describe the goals, tasks, problems, and opportunities the people perceive
within the organization. This defines the business needs, also called the problem. Addressing
these business needs ensures relevance. They are then addressed within IS research.
IS research contains a cycle between the development and building of artifacts or theories
and the justification and evaluation of these. The knowledge base provides the raw materials
used within IS research and encompasses the foundations and methodologies to be applied
and used. The results are added to the knowledge base for further research and practice. The
knowledge base thereby ensures rigor within DSR.
These three components structure the thesis. The environment is analyzed based on semi-
structured interviews, which were conducted with experts in the field. The interviews were
conducted after the implementation of the PoC to include feedback on it. The interview
guide is included in addendum A.1. The interviews were summarized according to the
interview guide and can be found in addendum A.2. The identified business needs are
discussed within chapter 4 on State of the Art. The knowledge base is analyzed in chapter
3 on Related Work. By analyzing prior work, common solution approaches, as well as best
practices, are identified. These are applied within chapter 5 on IT-Assisted Provision of
Product Data. This chapter is dedicated to IS research and extends the prior knowledge. The
product data structures of 15 different manufacturers with product assortments ranging from
accessories, lighting, kitchen supplies, and bedding to small furniture are analyzed. Based on
the classification of implisense1 , they are micro to medium-sized companies with fewer than
249 employees and less than €50 million in annual revenue. The selection of manufacturers was
restricted to manufacturers that sell to large online retailers and can provide their product
data in one file. The target formats of eight online shops carrying products of the Home &
Living sector are examined. Four of them are among the top ten revenue leaders of online
shops in the Home & Living industry [11]. The requirements for transforming the data
between source and target formats are evaluated by experiment. Various approaches to
automating the transformation process were developed within the loop of developing,
building, justifying, and evaluating. These are analyzed from a theoretical point of view
based on informed argument. The most promising approach was implemented as a PoC. The
result is the artifact in the form of a model as well as the implementation of a PoC system to
solve the challenge at hand. The outcome was evaluated, on the one hand, through
experimentation concerning the required manual work and, on the other hand, through
informed argument by presenting the results to experts during the semi-structured interviews.

2.2. Research Questions
The research questions answered within this work are the following:

RQ1: How do manufacturers in the Home & Living sector provide product data to online
retailers? In order to ensure the relevance of the conducted research, the problem space is
evaluated. This is done by providing a thorough analysis of the status quo on how product
data is currently provided to online retailers in the Home & Living industry. On the one side,
semi-structured interviews are conducted. On the other side, product data is transformed by
experimentation to understand the problem better.

 1 https://blog.implisense.com/neue-einstufung-fuer-unternehmensgroessen-im-implisense-datenbestand/

RQ2: What are IT-based approaches for assisting manufacturers with the provision of
product data to online retailers? In general, a variety of different solution approaches
towards the provision of product data are possible. By analyzing common approaches within
the literature and other industries, a set of possible approaches is developed.

RQ3: What approaches for the assisted provision of product data provide the greatest
benefit for the user? These different approaches are then compared with regard to the
greatest benefit for the user. The user’s benefit is defined as the expected Return on
Investment (ROI). In the first step, this is done from a theoretical point of view. In the second
step, the most promising approach is developed as PoC and evaluated. The evaluation is
based on experimentation with sample data as well as semi-structured expert interviews.

3. Related Work

3.1. Data Exchange in the Home & Living Sector
Over the years, approaches towards product data exchange, specifically in the furniture
industry, have been proposed. One of them is the FunStep initiative1 , which together with
its partners strives to facilitate and support interoperability within the worldwide furniture
industry by developing and implementing e-business activities. This especially keeps in mind
the requirement for information exchange along the supply chain with different external
business partners [12]. The main motivations behind the initiative and its creation can be
found in [13]. It is worth highlighting that Nobilia, a large kitchen manufacturer, is among
the six initial members. Hence, planning-intensive products, such as in this case kitchens,
were the original focus. This means that data, such as data for planning and order
management, plays an important role. In order to support the overall goal of interoperability,
the ISO-Norm 10303-2362 under the title “Industrial automation systems and integration —
Product data representation and exchange — Part 236: Application protocol: Furniture catalog
and interior design” was introduced. In addition, an ontology was proposed which, among
others, covers different pieces of furniture as well as services, detailed logistics, and
manufacturing processes and techniques in the furniture industry [14]. Nevertheless, since its
publication in 2006, neither the ISO-Norm nor the ontology has seen widespread adoption in
industry or in scientific publications.
The ISO-Norm 10303-236 was applied within a large Brazilian furniture company. The
lessons learned are discussed in [15], which highlights the process of transforming industry-
and company-specific knowledge into the ontology for seven different product pieces. They
showcase the steps of interviewing relevant stakeholders and integrating this information
into the ontology using Protégé. As challenges, they identify vague definitions within the
norm from a technical and user point of view. More specifically, they note that the norm was
not always clear to the members of the furniture industry. Last but not least, they emphasize
the flexibility of the standard and point towards the risk that this flexibility will lead to
ongoing challenges in data exchange, hence not resolving its difficulties. These conclusions
drawn by [15] are, however, limited by the fact that the work focuses only on the adoption
within the company and does not include any learnings from using the norm for data
exchange.
[16] study the information resources in the furniture industry as part of the Business
Innovation and Virtual Enterprise Environment (BIVEE) project in Spain. The BIVEE project
strives to promote innovation and production improvements in Small and Medium
Enterprises (SMEs). In order to achieve this goal, they analyze various SMEs concerning their
needs and challenges with respect to information resources. Thereby they include the
requirements of AIDIMA (Technology Institute of Furniture, Wood, and Packaging), which
was also part of the previously mentioned FunStep initiative, as the end user. In the SMEs
they worked with, they highlight the successful implementation of Enterprise Resource
Planning (ERP) systems. However, they point to challenges in the planning of production.
Regarding the previously mentioned FunStep ontology, they note that it lacks references to
production technologies from their point of view. Last but not least, the work addresses
challenges with regard to change management when introducing new systems.

 1 http://www.funstep.org/
 2 https://www.iso.org/standard/42340.html
As the example of Nobilia shows, manufacturers of planning-intensive products, such as
kitchens, face a variety of challenges based on the large number of possible configurations.
[17] develop an ontology for Verso Design Furniture Inc., a furniture customization company,
to address the challenge of deciding whether a particular furniture combination is possible.
Even though they were not able to prove that the ontology rules out physically impossible
furniture configurations, their results seem promising, considering that all existing joint
combinations could be successfully represented. They point towards the opportunities of
mass customization that can be enabled by ontologies.
As another planning-intensive furniture segment, parts of the German office furniture
industry have adopted a standard called OFML, which is driven by the Industrieverband
Büro und Arbeitswelt (industry association for office and working environments) [18]. The
adoption of OFML has simplified the exchange of data relevant to the planning of larger
offices, including 3D data and aspects related to order management. [19] highlight the
chances of a highly integrated production environment from a manufacturer’s point of view.
Nevertheless, based on our interviews, adoption remains limited to the office furniture
segment and is not fully adapted to the needs of online resellers with regard to product
information.
[20] analyzes the status of integrations between businesses within the German furniture
industry by conducting interviews. He notes that a wide variety of integrations is to be found
even within the German furniture industry and selects the upholstery and kitchen industry
to be further analyzed. Within the two segments, he mentions that the kitchen industry
has widely adopted the IDM-Kitchen-Standard. In contrast, no such standard has been
adopted in the upholstery segment, even though one, IDM-Upholstery, is available.
[20] therefore compares the influences on the German upholstery and kitchen industry for
the establishment of infrastructures to facilitate the data exchange.
A few common factors can be identified from the limited number of publications on data
exchange in the furniture industry. The complexity involved in planning-intensive products
drives the need for standardization. The kitchen industry has therefore attracted interest
from research as well as industry [20][13]. The office furniture industry has
widely adopted a standard [18][19]. Other segments such as the upholstery, as well as other
customizable furniture, have seen efforts in this direction with varying success [12][13][17][15].
The success of these approaches is highly dependent on the segment. In addition, these efforts
are all restricted to specific planning-intensive furniture segments. Further, the adoption of
some of these standards is rather geographically limited, e.g., to Germany or Spain.

3.2. Industrial Information Integration
The exchange of data between different systems and organizations can be seen in the general
context of Industrial Information Integration. [21] defines the engineering of Industrial
Information Integrations as “complex giant system that can advance and integrate the
concepts, theory, and methods in each relevant discipline and open up a new discipline for
industry information integration purposes“. Following this definition, Industrial Information
Integration Engineering is the set of concepts as well as techniques that enable the integration
process between different systems, especially with regard to information integration [22].
The general discipline can be structured along the addressed discipline, e.g., engineering,
management, or social science, and along the application engineering field, e.g., chemical,
civil, or materials engineering. Taking these structures into account, [22] provides
a thorough literature analysis of the discipline. Looking at the different approaches in
different industries, the wide variety of challenges becomes evident. The following section
will present common best practices within different areas. First, general approaches based on
metamodels and ontologies are presented. Then information integration within organizations
is discussed, followed by the discussion of the approaches between organizations.

3.2.1. Metamodel-Based Information Integration
Generally speaking, integration problems can be described using metamodels. A metamodel is
a model of models. Thus a metamodel defines what models are valid within the space of a
certain modeling language. One of the most popular metamodels in software engineering is
the Unified Modeling Language (UML), originally defined by the Object Management Group
(OMG). Their architecture encompasses four different layers, with each layer being the type
model of the layer below [23]. Figure 3.1 shows this hierarchy. As in the case of [24], this
hierarchy can be used to clarify the different abstraction levels of a model, in their case a
product ontology. Metamodels have further been used in information integration.
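This hierarchy is easiest to grasp in a concrete type system. As a loose analogy of our own (not taken from [23] or [24]), Python's object/class/metaclass layering mirrors the M0, M1, and M2 layers:

```python
# A loose analogy to the OMG layers using Python's type system:
# M0 (instance), M1 (model/class), M2 (metamodel/metaclass).

class Chair:
    """M1: a model element describing chairs."""
    def __init__(self, article_number: str):
        self.article_number = article_number

chair = Chair("C-4711")            # M0: a concrete instance
assert isinstance(chair, Chair)    # M0 conforms to M1
assert isinstance(Chair, type)     # M1 conforms to M2 (Python's metaclass)
assert type(type) is type          # the top layer describes itself
```

Each layer defines what is valid at the layer below, just as the OMG metamodel hierarchy does.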
[25] showcase a metamodel-based approach toward information integration at industrial scale.
The approach is demonstrated by an example from the oil and gas industry sector. Their
example transforms engineering assets, such as a fuel pump, between different standards,
such as ISO and MIMOSA. They motivate their work by noting that the main challenge in
information integration is created by the constant change of information systems and their
models on the one side and the constant change of information requirements of applications
and users on the other side. They explain that current approaches do not have enough
flexibility to accommodate this constant change. Hence, they propose to address the integration at
a higher level of abstraction through metamodel-based information integration. This means
that the mappings between models become more flexible and reusable through mapping
templates. The integration decisions are then made for small generic fragments of the models,
e.g., a single conceptual entity such as a fuel pump in their case. Their approach contains

Figure 3.1.: The four-level metamodel hierarchy defined by the Object Management Group
             [23]

three different levels: the metamodel level, the model level, and the instance level. At the
metamodel level, the different formats to be integrated are defined as entities, relationships,
and mapping operators. At the model level, mapping templates specify how
one part of the source metamodel is to be represented in the target metamodel. These
are then instantiated at the instance level by the application user, who applies the mapping
templates to his source model. Figure 3.2 shows this conceptual view. Coming back to the
presented example from the oil and gas sector, the end-user can automatically create the
ISO-compliant representation of a fuel pump from the MIMOSA model, automatically
transforming the representation in one norm into the semantically equivalent representation
in the other. [25] explain that, among other advantages, this approach greatly decreases
complexity by separating different integration tasks into different layers and making these
smaller integration decisions reusable.
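The separation of levels can be sketched in a few lines of Python. All names and the rule format below are invented for illustration and greatly simplify the approach of [25]:

```python
# Hypothetical sketch of metamodel-based integration (not the actual
# implementation of [25]): the mapping decision is made once, for a small
# model fragment, and then reused for every instance of that fragment.

# Model level: a reusable mapping template for one conceptual entity.
def fuel_pump_template(mimosa: dict) -> dict:
    """Map a MIMOSA-style fuel pump fragment to an ISO-style fragment."""
    return {
        "AssetType": "FuelPump",
        "SerialNo": mimosa["serial_number"],
        "Manufacturer": mimosa["maker"],
    }

# Instance level: the application user applies the template to concrete data.
mimosa_pump = {"serial_number": "FP-4711", "maker": "ACME Pumps"}
iso_pump = fuel_pump_template(mimosa_pump)
assert iso_pump["SerialNo"] == "FP-4711"
```

Because each template covers only a small fragment, templates can be written, maintained, and reused independently, which is the reusability benefit the authors emphasize.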
[26] present a metamodel for ontology mappings based on set and relation theory. They
first motivate their work by explaining that many different ontologies exist covering overlap-
ping concepts. This creates the challenge of exchanging information between them, reusing
adjacent parts of other ontologies, or further synchronizing changes. For this integration,
mappings between common concepts within the ontologies are needed. For the management
of these mappings, they present a metamodel for ontology mappings. They define each single
correspondence between two sets of concepts of two different ontologies as a mapping. Further,
they denote the set of mappings between some ontology models as a mapping model. These
mapping models contain common elements and associations. The metamodel introduces
the common structure of these mapping models. As components of this metamodel, they
define the different elements of an ontology (e.g., OntologyElement, OESet, OESetGroup)
and the different elements required for the mappings (e.g., Mapping, MappingClassification,
MappingDefinitionRule). Since this is based on set and relation theory, further properties
can be used, e.g., the synchronization of ontologies and automatic generation of mappings
among them.

       Figure 3.2.: Conceptual view of metamodel-based integration approaches [25]
[27] define a generic metamodel for schema merging. Schema merging is defined as the task
of combining several heterogeneous schemas into one unified schema. For this process, [27]
give a formal definition of the resulting schema together with an algorithm to implement
it. Similar to [26], they explain that the mapping between the elements of two schemas to be
unified is not a simple set of one-to-one correspondences and thus represents a mapping
model. Their approach is based on GeRoMe, a generic metamodel that, in contrast to other
metamodels, includes semantic information to resolve conflicts in mappings and can be used
with different metamodels, e.g., XML schemas. Based on GeRoMe [27] give formal definitions
of models, mappings, and the merging operator.
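As a toy illustration of the merging operator (far simpler than GeRoMe; attribute names are invented), a mapping of corresponding attributes keeps the merged schema free of duplicates:

```python
# Simplified schema merge: two attribute sets are unified, and a mapping of
# corresponding attributes prevents duplicates in the merged schema.

def merge_schemas(a: set, b: set, mapping: dict) -> set:
    """Merge schema b into schema a; mapping relates b-attributes to their
    a-equivalents, which are kept under the a name."""
    merged = set(a)
    for attr in b:
        merged.add(mapping.get(attr, attr))
    return merged

catalog_a = {"name", "price", "width_cm"}
catalog_b = {"title", "price", "depth_cm"}
merged = merge_schemas(catalog_a, catalog_b, {"title": "name"})
assert merged == {"name", "price", "width_cm", "depth_cm"}
```

Real schema merging must additionally resolve type and structural conflicts, which is what the semantic information in GeRoMe is used for.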

3.2.2. Ontology & Schema Matching
Ontologies provide a common vocabulary for a certain domain of interest. Depending on
the specific definition, this encompasses several data and conceptual models, including
terms, classifications, and database schemas [28]. Schemas are formal definitions of the
structure of an artifact, such as a SQL schema, an XML schema, an interface definition, or an
ontology description [29]. Since both ontologies and schemata provide a vocabulary of terms,
matching them is often done with similar solutions. Therefore, solutions from both areas are
discussed within this section [28]. Schema and ontology matching can be defined as the
problem of finding correspondences between elements of different schemas. Correspondences
are relationships between the elements, e.g., representing the same notion or information [29].
[30] define a classification of these approaches, including previous classifications such as [31].
Figure 3.3 displays the overview. Read top-down, from the perspective of granularity / input
interpretation, the classification distinguishes the following elements:
   • Element-level vs. structure-level: Element-level matching techniques only take an
     element in isolation into account when calculating correspondences. In contrast,
     structure-level approaches consider the relations of elements with each other to
     calculate correspondences.

      Figure 3.3.: Classification of ontology & schema matching approaches [30]

   • Semantic vs. syntactic: Syntactic approaches follow clearly stated algorithms which
     analyze the input based on its structure alone. Semantic approaches use some
     formal semantics, such as model-theoretic semantics, to analyze the input and justify
     the results. Exact semantic algorithms are complete with respect to the semantics.

Reading the classification bottom-up, looking at the origin / kind of input, the classification
provides the following categories:

   • Context-based: Context-based approaches do not restrict the information to a single
     ontology or schema, but rather use information coming from external resources,
     such as other ontologies or thesauri describing the terms of the ontology. The external
     resources are referred to as context.
        – Semantic: As previously seen, semantic approaches follow formal semantics for
          matching.
         – Syntactic: Syntactic approaches in this case could be further differentiated into
           terminological, structural, and extensional, as for content-based approaches. Due to
           their limited application in practice, they are grouped together under syntactic approaches.

   • Content-based: Content-based approaches limit the information taken into account to
     the content of a single ontology or schema.
         – Terminological: Terminological approaches consider their input as strings.
         – Structural: Structural approaches look at the structure of elements (classes,
           individuals, relations) within the ontology.
         – Extensional: Extensional approaches use data instances to find correspondences.
         – Semantic: Semantic approaches work based on a semantic interpretation of the
           input, usually using a reasoner.
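For illustration, an element-level, terminological matcher can be as simple as comparing element names as strings, here with Python's built-in sequence similarity. This is our own minimal sketch, not one of the surveyed systems:

```python
# Minimal element-level, terminological matcher: attribute names are
# compared as strings in isolation, ignoring structure and semantics.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match(source: list, target: list, threshold: float = 0.7) -> dict:
    """Return, per source element, the most similar target element whose
    name similarity exceeds the threshold."""
    result = {}
    for s in source:
        best = max(target, key=lambda t: similarity(s, t))
        if similarity(s, best) >= threshold:
            result[s] = best
    return result

pairs = match(["ProductName", "Colour"], ["product_name", "color", "weight"])
assert pairs == {"ProductName": "product_name", "Colour": "color"}
```

Structure-level and semantic techniques exist precisely because such purely terminological comparisons fail whenever names diverge while meanings coincide, or vice versa.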

Further information on the concrete classes can be found in [30]. A more detailed literature
review looking at current advances in the area is provided by [32].
The visualization of mappings, especially when schemas and mappings are large, is challenging.
Therefore, approaches towards the visualization of mappings have been proposed [33].

3.2.3. Intra-Organizational Information Integration
The engineering field in particular has long dealt with the challenge of information
integration, especially concerning different data representations. [34] presents an ontology-based
approach towards the data integration within the design process of chemical plants. The
requirement for an ontology-based approach arises because different phases of the design
process and different disciplines require different viewpoints. Making use of reasoning within
the ontology, they can satisfy these information demands and provide compliance checks. In
the same direction, [35] propose the use of ontologies for information integration in industrial
environments with multiple applications and data sources. They point out that the technical
integration has been widely solved. Nevertheless, combining the semantics between the
sources remains a challenge. In constantly changing environments, point-to-point integrations
are expensive. As a solution, they propose to introduce an integration ontology. For this,
they describe each data source as an ontology model and map these together to generate a
uniform integration ontology.
[36] and [37] address the challenge of different information requirements within one organi-
zation during product development. This challenge is driven by the fact that the design of
complex systems requires interactions between experts of different areas, such as Computer-
Aided Design (CAD), Engineering (CAE), and Manufacturing (CAM). The challenge they
address is to maintain consistency of related information across the different design
systems. [36] therefore apply a Model-Driven Engineering approach from software design and
propose a metamodel to integrate the information from each domain to create a shared
reference. This is done by first creating a mapping between the generic concepts and the specific
knowledge model. Then based on the mapping result, the generic model is instantiated.
[38], [39], [40], and [41] address the challenge of integrating the different data sources for
product data at a somewhat different level by the use of product data management systems.
Such systems integrate and manage all information related to a product across different life
cycle stages such as design, manufacturing, and end-user support. Hence, they integrate
different areas to ensure that the correct information is available in the proper form for the
end-user [38]. [38] provides a review of web-based product data management systems. [39]
propose a distributed, open and intelligent product data management system. By supporting
standards such as the Standard for the Exchange of Product model data (STEP), they achieve the
suggested openness. [40] propose an ontology-based Product Data Management (PDM)
system. More recently, [41] present a holistic view of this topic for the practitioner.

3.2.4. Inter-Organizational Information Integration
Moving from the intra-organizational perspective to an inter-organizational perspective, the
role of standards increases. Standards, as jointly agreed specifications, enable communication
across different systems for a variety of user requirements and thereby improve economic
efficiency [42]. [42] show the use of different standards for engineering assets, noting that
ISO and MIMOSA are the leading bodies for defining such standards. They provide a
comprehensive review of the standards for the integration of engineering assets. Again for
the oil and gas sector [25] highlight challenges involved in the use of different standards.
The large number of different standards creates the challenge of integrating them. Therefore,
they showcase a metamodel-based approach towards integrating different ISO and MIMOSA
norms. Other approaches towards the transformation of ISO-Norms by using ontologies can
be found in [43].
One area of inter-organizational communication that has seen significant interest is e-
procurement or e-business, which refers to the use of electronic communications for the
business processes between sellers and buyers. E-procurement integrates inter-organizational
business processes and systems to automate the requisition and approval of purchase orders
and the connected accounting processes using Internet-based protocols. Thereby,
it can improve the efficiency not only of single purchases but also the overall administration
and functioning of markets. Hence, it is seen as a strategic tool to improve the
competitiveness of organizations as well as to generate economies of scale for both sellers and buyers
[44]. Apart from the legal framework, resolving technical issues for the proper integration of
heterogeneous environments is a key success factor. [44] review the current state of the art
concerning the integration and list controlled vocabularies, ontologies, frameworks, as well as
e-procurement platforms as the commonly proposed solutions. They consider Electronic Data
Interchange (EDI), company websites, B2B hubs, e-procurement systems, and web services
as relevant architectures. [45] compare the different standards and platforms used for order
management. They conclude that there is currently a lack of common standards. [46] point
towards EDI as being used in the domain for supporting the inter-organizational information
exchange with a focus on order management. However, as [46] point out, different systems
still require complex message transformations in order to become compatible with each other.
A range of different publications have suggested solutions. [46] propose a visual mapping
system to handle different EDI and XML-based applications. [47] suggest creating an ontology
for EDI to support easy integration. Among others, [48] and [44] discuss the creation of B2B
marketplaces or B2B hubs in order to facilitate the data exchange.
Taking a look at this wide range of challenges, the general context of product data exchange
within the domain of Industrial Information Integration becomes clear. First and foremost, the
challenge of integrating different data sources needs to be resolved on an intra-organizational
level. Especially in the field of engineering, ontologies, and central representations have been
proposed for this. Concerning product data, PDM or Product Information Management (PIM)
systems are the standard solution for integrating information about products at a central
place. In order to create a shared understanding for data exchange at an inter-organizational
level, standards are a common solution. Nevertheless, because of the number of different
standards, integrations are still often needed. In e-procurement EDI has long been a standard.
Nonetheless, EDI has not entirely resolved the challenge of interoperability in this area.

3.3. Product Catalog Integration
As seen in the previous section, data integration across different organizations remains a
challenge. As [7] note, this applies especially to product data and product descriptions due to
the autonomy of vendors in describing their products.
Similar to the engineering domain, standards for product data and catalog exchange have
evolved. [49] discuss the design of the catalog exchange process and review four different
XML-based standards for this. This is done by considering the whole process of product
catalog exchange and defining the requirements of different stakeholders. They conclude
that none of the four selected standards, BMEcat, cXML, OAGIS, and xCBL, satisfy the
found requirements, especially regarding requirements from e-markets and content hubs.
They point out that none of the standards include further semantic checks and a feedback
mechanism on potential errors during import. [50] provide an even more granular analysis of
the functionality of each data format. In addition to the XML-based standards, [48] mention
two non-XML catalog formats: EDIFACT, a format approved by the United Nations Economic
Commission for Europe, and ISO 10303-41, known as part of the STEP family. Due to
the complexity of EDIFACT, the United Nations Centre for Trade Facilitation and Electronic
Business has already published an XML format for EDIFACT. The same applies to
STEP. Apart from pure product information, the categorization of products and services is
often of interest, e.g., for accounting purposes. [49] note that none of the reviewed standards
contains such categorizations. [51] present an analysis of the different categorization standards eCl@ass,
UNSPSC, eOTD, and RosettaNet, noting that they differ in structural properties as well as the
content. [52] describe the shortcomings of the UNSPSC from a practitioners’ point of view.
The lack of standards and the remaining challenges when using standards underline the point
made by [7] and [53]. They argue that the current degree of acceptance and the multiplicity
of standards hinder the progress of standardization. Further, [54] explain that standards are
slow to adapt to changes and emergent requirements. As seen in section 3.1, this also applies
to the Home & Living industry. [7] explain that the alternative to the simplification of the
integration using standards is product schema integration, also referred to as product data
mapping. Product schema integration can be defined as the process of building mappings
between different product attributes from different product descriptions [7]. Using these
mappings, product data from different sources can be integrated and unified.

3.3.1. Layered Integration
The challenges within product schema integration are mainly twofold: syntactic and semantic
integration [49]. [55] suggest separating concerns into different layers in order to reduce
complexity in data integration on the semantic web. Their concept includes three layers:
a syntax layer, an object layer, and a semantic layer. The syntax layer is responsible for
serializing and de-serializing objects stored in a given file, thereby handling the encoding and
file format. The object layer provides object-oriented access for the application that later uses
the data. This also includes the provision of identities and binary relationships, as well as
basic typing. Last, the semantic layer provides an interpretation of the object model from the
object layer. Hence, the objects are mapped onto physical or abstract objects such as books,
airplane tickets, and paragraphs of text.
[56] follow this concept and introduce three different layers in their model. Generally
speaking, [56] showcase a system that allows the transformation of XML catalogs between different
structures and formats. As later seen in section 3.3.2, they demonstrate that a set of direct
rules can be used to translate one catalog directly from one format into another. However,
they note that this approach is not suitable for building a scalable mapping service. Using
direct rules makes these rules very difficult to write and hence also to maintain. Further, the
reuse of the rules is limited. In consequence, they propose the use of three distinct layers
to separate different concerns. This allows dividing complex transformations into smaller,
simpler rules, which are then concatenated. [56] suggest that the identification of reusable
rule patterns for these smaller rules is then feasible. In order to achieve this goal, [56] use
three different layers, which align with the previously seen concept of [55]. The first layer is
the syntax layer which is responsible for handling the de-serialization of the XML documents.
In the data model layer, the products are then represented by object-property-value triplets,
removing differences imposed by different representations in the syntax layer. This means
that the properties are normalized and aligned with the structure of the following layer, the
ontology layer. The ontology layer contains the actual mappings between different elements.
Taking the example of an address, the address is first de-serialized from the XML document.
Next, it is normalized, e.g., street name and house number are separated into different fields.
Last, the position in the target document is assigned according to the ontology. Figure 3.4
shows the model of this integration approach.

                    Figure 3.4.: Layers of integration of approach by [56]
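The address example can be traced through the three layers in a short sketch (element and property names are invented; this is an illustration, not the system of [56]):

```python
# Sketch of the three integration layers of [56] (names invented): the
# syntax layer de-serializes the document, the data model layer normalizes
# the data into triplets, and the ontology layer maps the properties onto
# the target vocabulary.
import xml.etree.ElementTree as ET

SOURCE = "<supplier><addrline>Boltzmannstr. 3, 85748 Garching</addrline></supplier>"

def syntax_layer(doc: str) -> dict:
    """De-serialize the XML document into raw fields."""
    root = ET.fromstring(doc)
    return {"addrline": root.findtext("addrline")}

def data_model_layer(raw: dict) -> list:
    """Normalize into (object, property, value) triplets; the combined
    address line is split into separate properties."""
    street, rest = raw["addrline"].split(", ")
    zip_code, city = rest.split(" ", 1)
    return [("supplier", "street", street),
            ("supplier", "zip", zip_code),
            ("supplier", "city", city)]

ONTOLOGY = {"street": "StreetName", "zip": "PostalCode", "city": "CityName"}

def ontology_layer(triplets: list) -> dict:
    """Map normalized properties onto the target vocabulary."""
    return {ONTOLOGY[prop]: value for _, prop, value in triplets}

target = ontology_layer(data_model_layer(syntax_layer(SOURCE)))
assert target == {"StreetName": "Boltzmannstr. 3",
                  "PostalCode": "85748", "CityName": "Garching"}
```

Each layer can be replaced independently, which is exactly the separation of concerns the layered approaches aim for.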

[57] also use different layers to transform product information between different sources.
However, their approach does not include a dedicated syntax layer. Further, it adds a layer
to handle differences imposed by geography. Their concept includes
three different layers: source, local, and common. The source to local mapping normalizes
and extends the information by adding implicit information, e.g., the currency of the source
catalog. The local to common mapping then maps this information into a common format
that is parsed back into the local and source schema. As mentioned, this separation makes it
possible to efficiently handle geographic differences, such as the language of the source and the target
catalog.
[52] explain that B2B business transactions over the internet present the challenge of
integrating information from many different sources. By working as an intermediate layer, B2B
marketplaces propose a solution to this challenge. However, this requires the B2B
marketplaces to integrate the different sources. Their work elaborates in great detail on the different
challenges encountered when integrating multiple product catalogs and transforming them
between different formats. In addition to the requirements already seen, they recognize that the
product descriptions of the different catalogs are often unstructured and hence not easily
computer-interpretable. They therefore add another layer to extract and structure the information.
As seen throughout this section, product catalog integration consists of a variety of different
tasks. The following sections are structured along these. The first section considers the syntactic
integration, hence parsing the given input file into objects, thereby removing differences
imposed by different file formats and encodings. Since most literature only considers XML
documents, the focus lies on syntactically integrating these. Afterward, approaches to seman-
tic integration focusing on ontologies are presented. As already highlighted, the integration
of different sources remains a challenging task under any circumstance. The following
section explores approaches using a mediator, also referred to as intermediate layer, to reduce
the number of integrations needed [52]. Last but not least, different approaches towards
information extraction from unstructured sources are discussed.

3.3.2. Syntactic Integration using XML
XML is the prevailing data format within the literature, as it is the underlying format for the
previously presented standards. It remains to note that other formats, such as HTML and
Microsoft Excel, have also been considered [58]. However, as these works do not specifically tackle the
challenge of product data integration, they are not further explored here. The main reason for
the adoption of XML is that it is seen as an important step towards the reduction of challenges
involving the heterogeneity of data exchange between different systems [59]. This is the
case, even though the standard does not touch the structural and semantic differences, which
means that semantically identical properties can be encoded in XML elements with different
names. Moreover, elements with the same name do not necessarily have the same semantics.
Further, the order of XML tags is of relevance and can be different. For better understanding
figure 3.5 shows part of such an XML document.

         Figure 3.5.: Example of an address in XML using the OAGIS standard [48]
         (the XML element markup is omitted; the example address contains the
         street "Boltzmannstr. 3", the type "office", the city "Garching by
         Munich", the country "Germany", the postal code "85748", and the
         telephone number "089 189659220")

Taking into consideration XML as the
underlying data format, [59] propose to extend the XSLT language to generate transformations
between different XML documents. The XSLT language (Extensible Stylesheet Language
Transformations) was originally developed for rendering and transforming XML documents.
It allows defining a set of rules for transforming a source tree of an XML document into a
target tree. The rules defined with XSLT can then be expressed as an XML document, which
allows its validation and parsing through XML parsers. XSLT references to the input tree
can be used to create the nodes of the target tree. Figure 3.6 shows an example of such a
transformation. Further, XSLT can make use of XPath expressions. [48] give the example of an

            Figure 3.6.: Example of a one-to-one mapping in XSLT notation [48]

address where this becomes necessary. One document might combine the street name and
the house number within one field, whereas another document separates these into different
elements. XPath expressions such as select="substring-after($addrline,’, ’)" can be
used to extract the relevant part. [48] note that there are four possible ways that elements
of the different XML documents can be related: one-to-one, one-to-many, many-to-one and
many-to-many mappings. Their research transforming documents in the xCBL, IOTP, OAGIS,
and RETML standards shows that 89% of all mappings are one-to-one mappings.
However, as [5] also point out, this approach is rather focused on syntactic integration for
reasons described earlier, such as the missing commitment to a domain-specific vocabulary,
which makes the names of XML tags ambiguous.
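The address split described by [48] can be emulated in a few lines of Python, mimicking what the XPath substring expressions achieve in XSLT (element names are invented):

```python
# Emulation of a one-to-many XSLT mapping: the combined <AddrLine> of the
# source tree is split into <Street> and <HouseNumber> in the target tree,
# similar to XPath's substring-before/substring-after functions.
import xml.etree.ElementTree as ET

source = ET.fromstring("<Address><AddrLine>Boltzmannstr. 3, Garching</AddrLine></Address>")
addrline = source.findtext("AddrLine")
street_and_no, city = addrline.split(", ", 1)     # substring-before / -after
street, number = street_and_no.rsplit(" ", 1)

target = ET.Element("PostalAddress")
ET.SubElement(target, "Street").text = street     # one-to-many mapping
ET.SubElement(target, "HouseNumber").text = number
ET.SubElement(target, "City").text = city         # one-to-one mapping

assert ET.tostring(target, encoding="unicode") == (
    "<PostalAddress><Street>Boltzmannstr.</Street>"
    "<HouseNumber>3</HouseNumber><City>Garching</City></PostalAddress>")
```

In actual XSLT, the same split would be written declaratively as a template with XPath string functions rather than imperative code.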

3.3.3. Semantic Integration using Ontologies
After going over different approaches towards syntactic integration, the challenge of semantic
integration of different product data remains. For this, ontologies are a promising approach.
Ontologies provide support for integrating heterogeneous and distributed data [7]. Ontologies
can be defined as “a formal, explicit specification of a shared conceptualization” [60, p. 11].
In this definition, conceptualization refers to the abstract model of a phenomenon in the world,
which describes the relevant concepts of that phenomenon. Defining the type of concepts as
well as the constraints on their use makes this model explicit. Being formal makes an ontology
machine-readable. This means that ontologies allow machines to understand the semantics of
data [7]. [7] further explain that ontologies support the organization, browsing, searching,
and more intelligent access to online information and services. They argue that building
reusable and agreed-upon product catalogs is at its core building ontologies for the respective
domain.
[5] discuss the semantic integration of various information sources leveraging ontologies.
They explore the shortcomings of XML, mainly arguing that often a common vocabulary is
missing, making a direct semantic integration infeasible. They present their approach towards
the integration of documents with different structures and vocabularies. This includes the
creation of the common vocabulary in the form of ontologies. The integration is done by
creating mappings between the semantic terms of ontologies and the structure of XML
documents. They review the benefits of this ontology-based approach. Their presented
software supports multiple use cases, among others a top-down approach in which the
ontology is defined, e.g., by a consortium. Then the respective XML data structure is created
from it. The different parties involved use this generated XML data structure to exchange data
using the common vocabulary, removing the need for semantic integration. [5] also present
the use case relevant to the scenario discussed in this work, the bottom-up approach. In the
bottom-up approach, XML documents with different structures need to be integrated. In this
case, the mapping between the data structures is done as shown in Figure 3.7 and described
as follows. First, an ontology is created from the structure of the source XML document.
The structure of the XML document is referred to as the DTD (Document Type Definition). This means that concepts and
relationships within the XML schema are identified. The focus lies on the reengineering of
the conceptual models. In the next step, mapping rules are created between the source and
the target ontology. This is done semi-automatically by providing a GUI for the evaluation
and application of the automatically generated rules. In the last step, these rules are compiled
to a set of XSLT transformations that can be applied to transform an XML document in the
source schema into an XML document in the target schema.

                   Figure 3.7.: Structure of the integration approach by [5]
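The bottom-up approach can be sketched as a composition of two rule sets (tag names and rule format are invented; the actual approach of [5] compiles the rules to XSLT):

```python
# Sketch of the bottom-up approach (hypothetical rule format): mapping
# rules relate source tags to shared ontology terms and ontology terms to
# target tags; composing both yields the source-to-target transformation.

source_to_ontology = {"artNr": "ProductNumber", "bez": "ProductName"}
ontology_to_target = {"ProductNumber": "SKU", "ProductName": "Title"}

def compile_rules(src_map: dict, tgt_map: dict) -> dict:
    """Compose the two rule sets into one source-to-target mapping."""
    return {src: tgt_map[term] for src, term in src_map.items()}

rules = compile_rules(source_to_ontology, ontology_to_target)

def transform(record: dict) -> dict:
    return {rules[tag]: value for tag, value in record.items()}

assert transform({"artNr": "4711", "bez": "Oak Chair"}) == {"SKU": "4711", "Title": "Oak Chair"}
```

Routing every mapping through the shared ontology terms is what makes the individual rules reusable across document pairs.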

[61] present the architecture of an ontology-based approach towards product data integration
focusing mainly on the integration between different applications. This is done by adding an
ontology layer between the data sources and consumers, e.g., a database and CAD or ERP
systems. [61] analyze the advantages of such an approach regarding data exchange. Using
a common ontology helps to make the intrinsic semantics of concepts explicit. This allows
exchanging information based not only on the syntax of various modeling languages but also
on a common understanding of the semantics. Further, they argue that creating this ontology
will help organizations structure and reorganize product data more thoughtfully. In addition,
the ontology will act as a buffer between different syntactic representations. This is relevant
as the syntax of product data changes over time, whereas the semantics usually stay the
same. Their work further details the implementation by providing examples and showcasing
the ontology languages DAML and OIL. DAML (DARPA Agent Markup Language) is a markup
language used to construct ontologies and create the markup for exchanging them, while OIL
(Ontology Inference Layer) provides the modeling primitives used for the exchange of these
ontologies.
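The buffering role of such an ontology layer can be illustrated with a minimal sketch: two applications use different field names for the same property, but both map onto one shared ontology concept, so consumers depend only on the semantics. The field names, the concept `hasWidth`, and the mappings below are assumptions made for illustration, not taken from [61]:

```python
# Per-application syntax -> shared ontology concept mappings. The ontology
# layer sits between the data sources (e.g., CAD, ERP) and the consumers,
# so syntactic changes in one source only require updating its mapping.
SYNTAX_MAPPINGS = {
    "cad": {"w_mm": "hasWidth"},   # hypothetical CAD export field
    "erp": {"width": "hasWidth"},  # hypothetical ERP field
}

def to_ontology(source: str, record: dict) -> dict:
    """Lift an application-specific record into the common ontology vocabulary."""
    mapping = SYNTAX_MAPPINGS[source]
    return {mapping.get(key, key): value for key, value in record.items()}

cad_record = {"w_mm": 800}
erp_record = {"width": 800}

# Both records agree once expressed in ontology terms, even though their
# syntactic representations differ.
assert to_ontology("cad", cad_record) == to_ontology("erp", erp_record)
```

If the CAD export later renames `w_mm`, only the `"cad"` mapping changes; consumers reading `hasWidth` are unaffected, which is the buffering effect described above.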
[58] present an approach towards the semi-automatic integration of existing standards and
initiatives for the classification of products and services through ontological mappings. The
concept has six steps. First, the standards and joint initiatives for product classification of
the respective domain that will be integrated are selected. Next, the knowledge models are
semi-automatically extracted from these. Afterward, the relationships between concepts in
the different models are identified, and a multi-layered knowledge architecture is manually
designed. Based on these mappings, the knowledge models are then integrated. Afterward,
the attributes of the newly created integrated ontology can be enriched using additional
information included in the standards. In the last step, the ontology can then be automatically
exported
into different formats. For the first step, the selection of standards, the work describes the
standards UNSPSC, RosettaNet, E-cl@ass, and one additional product catalog from an existing
e-commerce platform. Tools including the ontological engineering platform WebODE and its
companion for data extraction WebPicker are presented for the next steps. The identification
of common concepts between the different standards for creating the mappings is done using
a multi-layered approach. [58] explain that ontologies can be classified and layered according
to their use case and specificity. They propose to align the integration and mapping with
these layers, reducing complexity and allowing for the interoperability of vertical markets
from specialized domains. Apart from these two benefits, using a multi-layered ontology
allows reasoning based on the taxonomy of concepts. The mappings between ontologies are
done using notions such as equivalence, subclass-of, and union-of.
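A minimal sketch of how these mapping notions enable reasoning over the layered taxonomy is shown below. All concept names and mappings are hypothetical; following equivalence and subclass-of links transitively lets a concept from one standard be placed under a category of another:

```python
# "subclass-of" edges within each hypothetical standard.
SUBCLASS_OF = {
    "unspsc:OfficeChair": "unspsc:Chair",
    "unspsc:Chair": "unspsc:Furniture",
    "ecl:SwivelChair": "ecl:Seating",
}

# "equivalence" mappings between standards (assumed, for illustration only).
EQUIVALENT = {
    "ecl:SwivelChair": "unspsc:OfficeChair",
}

def falls_under(concept: str, category: str) -> bool:
    """Check whether a concept falls under a category by transitively
    following equivalence and subclass-of links."""
    seen = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        if current == category:
            return True
        if current in seen:
            continue
        seen.add(current)
        if current in EQUIVALENT:
            frontier.append(EQUIVALENT[current])
        if current in SUBCLASS_OF:
            frontier.append(SUBCLASS_OF[current])
    return False

print(falls_under("ecl:SwivelChair", "unspsc:Furniture"))  # -> True
```

This is the taxonomy-based reasoning benefit mentioned above: once the mapping is in place, class membership can be inferred across standards without any manual per-product alignment. The sketch omits union-of, which would require tracking sets of concepts rather than single links.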
[62] discuss the usage of ontologies for the integration between product design and catalog
information. Over the life-cycle of a product, the data used primarily for engineering during
the initial design and manufacturing phase needs to be transferred into catalog information
targeted towards sales management. The core component for integrating
these two phases is the semantic mapping between the two ontologies using description logic.
The mapping is initially found using heuristic methods based on names, structures, and
types. Considering that both design and sales information belong to the same upper ontology,
the range of possible mappings is further reduced. In the last step, the user can adjust the
previously automatically created mapping rules. The work goes into detail about each of
these components. Among others, the similarity calculations are based on the structure of
WordNet. Structural and type matching are performed using the upper ontology shared by
both ontologies. After correction by the user, the mapping rules enable the automatic
