# Preprocessing Data: A Study on Testing Transformations for Stationarity of Financial Data

Sara Barwary, Tina Abazari


Degree project in technology, first cycle, 15 credits. Stockholm, Sweden 2019. KTH School of Engineering Sciences.

Preprocessing Data: A Study on Testing Transformations for Stationarity of Financial Data

SARA BARWARY, TINA ABAZARI

Degree Projects in Applied Mathematics and Industrial Economics (15 hp)
Degree Programme in Industrial Engineering and Management (300 hp)
KTH Royal Institute of Technology, 2019

Supervisors at Handelsbanken: Rickard Henricsson, Peyman Dabiri & Cecilia Pettersson
Supervisors at KTH: Camilla Landén, Per Jörgen Säve-Söderbergh & Julia Liljegren
Examiner at KTH: Per Jörgen Säve-Söderbergh

TRITA-SCI-GRU 2019:270
MAT-K 2019:29
Royal Institute of Technology, School of Engineering Sciences
KTH SCI, SE-100 44 Stockholm, Sweden
URL: www.kth.se/sci

## Abstract

In this thesis, within Industrial Economics and Applied Mathematics and in cooperation with Svenska Handelsbanken, a set of given transformations was examined in order to assess their ability to make a given time series stationary. In addition, a parameter α belonging to each of the transformation formulas was to be determined. To do this, an extensive study of previous research was conducted, and two different hypothesis tests were used to confirm the output. The result is a chosen value or interval of α for each transformation. Moreover, the first-difference transformation is shown to have a positive effect on the stationarity of financial data.

## Sammanfattning (Swedish abstract)

Det här kandidatexamensarbetet inom Industriell Ekonomi och tillämpad matematik i samarbete med Handelsbanken undersöker givna transformationer för att bedöma deras förmåga att göra givna tidsserier stationära. Dessutom skulle en parameter α tillhörande varje transformations formel bestämmas. För att göra detta utfördes en omfattande studie av tidigare forskning och två olika hypotestester gjordes för att bekräfta output. Ett resultat sammanställdes där ett värde eller ett intervall för α valdes till varje transformation. Dessutom visade det sig att ”first difference”-transformationen är bra för stationäritet av finansiell data.

**Keywords:** Bachelor thesis, financial outcome, transformations, stationarity, hypothesis testing, EWMA

## 1 Preface

This Bachelor's thesis was written in the spring of 2019 by Sara Barwary and Tina Abazari during a five-year Master's programme in Industrial Engineering and Management at KTH Royal Institute of Technology. The thesis applies theory from mathematical statistics as well as the field of industrial economics. We would like to thank Cecilia Pettersson, Rickard Henricsson and Peyman Dabiri at Handelsbanken for contributing to the work and providing the resources needed. We would also like to express our appreciation to our supervisor Camilla Landén, and additionally Per Jörgen Säve-Söderbergh, at KTH for their help and support throughout the work. Julia Liljegren at the Department of Industrial Engineering and Management also provided valuable input and guidance.

## Contents

- 1 Preface
- 2 Introduction
  - 2.1 Background
  - 2.2 Research Question
  - 2.3 Goal and Purpose
  - 2.4 Scope and Limitations
- 3 Economic Theory
  - 3.1 Terminology
    - 3.1.1 Securities
    - 3.1.2 Market Index
    - 3.1.3 Exchange Rates
    - 3.1.4 Commodities
    - 3.1.5 Volatility
    - 3.1.6 Bonds
  - 3.2 Timing of Entry Framework
    - 3.2.1 First Mover Advantages
    - 3.2.2 First Mover Disadvantages
  - 3.3 Porter's Five Forces
- 4 Mathematical Theory
  - 4.1 Time Series
    - 4.1.1 The Objectives of Time Series Analysis
    - 4.1.2 Time Series Decomposition
    - 4.1.3 Trends
    - 4.1.4 Seasonality
    - 4.1.5 Stationarity
  - 4.2 Stationarity Hypothesis Testing
    - 4.2.1 Dickey-Fuller Test
    - 4.2.2 Kwiatkowski–Phillips–Schmidt–Shin (KPSS) Test
  - 4.3 Transformations
    - 4.3.1 Level Transformation
    - 4.3.2 First Difference Transformation
    - 4.3.3 Mean EWMA Transformation
    - 4.3.4 Variance EWMA Transformation
    - 4.3.5 Skewness EWMA Transformation
    - 4.3.6 Kurtosis EWMA Transformation
    - 4.3.7 Autocorrelation Transformation
    - 4.3.8 Correlation EWMA Transformation
- 5 Methodology
  - 5.1 Data Collection
  - 5.2 Data and Notations
    - 5.2.1 Exchange Rates (FX Data)
    - 5.2.2 US Sectors Data
    - 5.2.3 Countries Stock Index Data
    - 5.2.4 Commodities Data
    - 5.2.5 VIX Market Volatility Index Data
    - 5.2.6 Bond (IR) Data
    - 5.2.7 Transformations
  - 5.3 Selection of Transformations and Hypothesis Tests
  - 5.4 Selection of Market Entry Frameworks
  - 5.5 Literature Study
  - 5.6 Procedure of Work
- 6 Results
  - 6.1 First Trial: Plots for Currency Rates, with Fixed α
    - 6.1.1 Statistics for First Trial
  - 6.2 Second Trial: Plots for Commodity, with a Fixed α
    - 6.2.1 Statistics for Second Trial
  - 6.3 Third Trial: Plots for Commodity Prices, with a Fixed α
    - 6.3.1 Statistics with Trial 3
  - 6.4 Seasonality and Trends
  - 6.5 Skewness and Kurtosis
  - 6.6 First Differences on All Data
  - 6.7 Finding the Optimal α
    - 6.7.1 Currencies
    - 6.7.2 US Sectors
    - 6.7.3 Countries Index
    - 6.7.4 Commodities
    - 6.7.5 VIX (Market Volatility)
    - 6.7.6 IR (Bonds)
    - 6.7.7 Aggregated α
- 7 Conclusions
  - 7.1 Interpretation and Impact
    - 7.1.1 Trial 1
    - 7.1.2 Trial 2
    - 7.1.3 Trial 3
    - 7.1.4 Skewness and Kurtosis
    - 7.1.5 First Difference as a Transformation
    - 7.1.6 Finding the Optimal α
  - 7.2 Analysis of Timing of Entry and Competitive Rivalry
  - 7.3 Future Work
  - 7.4 Benefits for SHB and its Stakeholders
  - 7.5 Final Words

## 2 Introduction

### 2.1 Background

Over the last couple of years, machine learning based forecasting has gained increasing attention and become more established. In particular, using machine learning to predict financial outcomes has become desirable among financial institutions and private investors.[1] There are ongoing discussions and research about how to improve these prediction models, as well as about how to pre-process input data in order to obtain predictions with high accuracy.[2] Machine learning is attractive here because it combines computer science and mathematics to develop models with the intent of delivering maximal predictive precision.

Predictions of financial outcomes, for example security prices or market indices, involve a time component, since future price movements may depend on past values. Thus, the time dimension needs to be taken into account when using a machine learning based prediction model. Prices in the financial market can be seen as observations at points in time, so a financial price over a time period can be described as a time series. As the interest in using machine learning to predict price movements has grown, time series forecasting has become an increasingly important area of machine learning.[3]

The underlying assumption in time series forecasting and the related machine learning methods is that the input data is a stationary process; that is, statistical properties such as the mean, variance and autocorrelation of the time series should not change over time.[4] However, most data is not stationary.

[1] Sarlin, Peter and Björk, Kaj-Mikael. "Machine learning in finance". Neurocomputing, vol. 264, 2017: 1-88. (Retrieved 2019-02-02)
[2] Palaniappan, Vivek. "Using Machine Learning to Predict Stock Prices", 2018-10-31. https://medium.com/analytics-vidhya/using-machine-learning-to-predict-stock-prices-c4d0b23b029a (Retrieved 2019-02-02)
[3] Brownlee, Jason. "What is Time Series Forecasting?". Machine Learning Mastery, 2016-12-02. https://machinelearningmastery.com/time-series-forecasting/ (Retrieved 2019-05-01)
[4] Lindgren, George. "Stationary stochastic processes", p. 13-16. http://www.math.chalmers.se/~rootzen/fintid/stationary120312.pdf (Retrieved 2019-02-02)

As the time span of historical observations increases, so does the probability of the time series showing non-stationary characteristics.[5] For many machine learning methods, handling non-stationary data sets is a challenge, since it increases the risk of obtaining predictions significantly different from the real outcomes. A non-stationary time series results from data exhibiting trends, seasonal effects, cycles, noise and other structures that depend on the time of observation. It therefore cannot be analyzed with traditional techniques; forecasting non-stationary time series may instead require more complex models. In order to obtain more reliable output from a prediction model, effects such as seasonal components and trends may need to be removed from the input data set.[6] It is possible to make data stationary, or at least approximately stationary, by the use of mathematical transformations.

In the last couple of months, Svenska Handelsbanken AB (SHB) has been discussing a market entry for new financial products. The idea is to predict the return of securities with a machine learning based model, upon which the products can be built in the future. Rickard Henricsson at SHB conducted research ten years ago on mathematical transformations and their ability to generate stationary financial data. In considering this potential business idea, SHB has raised the question of whether those transformations are still applicable to data today. Henricsson found several transformations, including both established ones and his own approximations, the latter derived with the aim of reducing the complexity of some of the transformations. His studies resulted in seven chosen transformations:

- Differencing (first order)
- Exponentially weighted moving average: Mean
- Exponentially weighted moving average: Variance

[5] Adhikari, Ratnadip et al. "An Introductory Study on Time Series Modeling and Forecasting", p. 16-19. https://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf (Retrieved 2019-02-19)
[6] Kang, Eugine. "Time Series: Check Stationarity", 2018-08-26. https://medium.com/@kangeugine/time-series-check-stationarity-1bee9085da05 (Retrieved 2019-02-23)

- Exponentially weighted moving average: Skewness
- Exponentially weighted moving average: Kurtosis
- Exponentially weighted moving average: Autocorrelation
- Exponentially weighted moving average: Correlation

The definitions and meaning of these will be explained more thoroughly in the theoretical background, Section 4. Except for the first-difference transformation, the transformations depend on an unknown constant α. Changing the value of α results in different output from each transformation. Consequently, the choice of α for a given transformation may affect whether the data can be made stationary. Accordingly, SHB is interested in examining which specific values of α can potentially make financial time series stationary.

Furthermore, SHB is one of the biggest banks in the Nordic countries. In the Nordic financial sector, there are today few commercial players providing financial products related to machine learning based prediction of financial outcomes. Consequently, SHB has the potential to be among the first players in this area. It is therefore of interest for SHB to understand how the timing of market entry can affect its business.

### 2.2 Research Question

The work of this thesis was done in cooperation with SHB. The main research question is whether financial data can be transformed to become stationary, and for which value or values of the parameter α stationarity is achieved. The time span of all the data sets is 2001-01-01 to 2018-12-31. The main research questions to be answered are consequently the following:

1. Are the given transformations sufficient to make the data stationary?
2. Which parameter value or values of α for each transformation will make the data potentially stationary?

Also, a discussion will be held regarding the effects of the timing of entry to a new market. More precisely, given that the transformations can make financial data

stationary and SHB can develop financial products based on machine learning prediction of financial outcomes, how will the timing of potential new product launches affect its competitive advantage?

### 2.3 Goal and Purpose

Predicting financial outcomes for securities is relevant for private investors as well as for financial institutions. The goal of this thesis is to examine how financial data may be pre-processed in order to make it useful as input to a prediction model. The main goal for SHB is, based on the results of this thesis, to separately develop a machine learning based forecasting model for the prediction of financial outcomes. More precisely, their model will indicate future price movements in the financial market, mainly for stocks in well developed countries such as the US, for example stock indices such as the Dow Jones Industrial Average or the S&P 500.

Since SHB is currently in the initial phase of the model development, it is important for them to know whether it is possible to make the input data for the future model stationary; the goal of this research is to provide insight into that question. If it is not possible to make the data stationary, they may need to consider further research on building a model on non-stationary data. Alternatively, this thesis can answer whether SHB needs to conduct further research on how to make data stationary. The greater purpose of this study is therefore to give a direction for SHB's future work.

In the market entry discussion, the market of SHB will be limited to other banks and financial institutions in Sweden and the Nordics, since SHB's main business activity lies within this area.
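To make the role of the first difference and the smoothing constant α concrete, here is a minimal Python sketch of the first two transformations listed in Section 2.1: first-order differencing and the EWMA mean. This is a generic illustration with made-up prices; the exact formulas used in the thesis, including Henricsson's approximations, are defined in Section 4.3 and may differ in detail.

```python
def first_difference(x):
    """First-order differencing: y_t = x_t - x_{t-1}."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

def ewma_mean(x, alpha):
    """Exponentially weighted moving average (EWMA) of a series.

    alpha in (0, 1) is the smoothing constant: a larger alpha gives more
    weight to recent observations, a smaller alpha smooths more heavily.
    """
    out = [x[0]]
    for t in range(1, len(x)):
        out.append(alpha * x[t] + (1 - alpha) * out[-1])
    return out

prices = [100.0, 101.0, 103.0, 102.0, 104.0]   # made-up price series
print(first_difference(prices))   # [1.0, 2.0, -1.0, 2.0]
print(ewma_mean(prices, 0.5))     # [100.0, 100.5, 101.75, 101.875, 102.9375]
```

Changing `alpha` changes the smoothed output, which is exactly why a value of α must be chosen per transformation before stationarity can be assessed.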

### 2.4 Scope and Limitations

The scope of this thesis is limited to examining transformations given by SHB. The data is also provided by SHB and is mainly related to the financial markets of the US and other well developed countries. Moreover, it is necessary to determine what qualifies as stationarity, since there exist both a strong and a weak form. For this research, it has been decided that it is sufficient if a time series fulfills the requirements of weak stationarity, since proving strict stationarity for a whole data set is complex. The difference between these types of stationarity is explained in Section 4.1.5.[7]

Moreover, the project is limited to only two different hypothesis tests, both chosen by SHB. These were chosen because they are based on different model assumptions and hypotheses and may therefore give a wider perspective in the analysis of the results.

[7] "Stationarity Differencing". https://www.statisticshowto.datasciencecentral.com/stationarity/ (Retrieved 2019-03-02)

## 3 Economic Theory

Terminology related to the financial market that is mentioned in the thesis or used as input data in the research is explained in this section. The purpose is to facilitate an understanding of the content of this thesis. Theory regarding stocks, bonds and other financial assets will be provided to explain why they are important to look at when studying an economy. Moreover, Porter's Five Forces model will be introduced and discussed, as well as the benefits of being a first mover in a market.

### 3.1 Terminology

#### 3.1.1 Securities

A security is a financial asset that can be traded. There are several types of securities, generally classified as equity securities, debt securities and derivatives. Equity securities represent ownership in an entity. The most common equity security is a stock, an ownership of a share of a company.[8] The issuer of a debt security borrows money which later must be repaid; when a debt security is issued, terms are formulated for, among other things, the size of the loan, the maturity date and the interest rate. Corporate bonds and government bonds are two common examples of debt securities.[9] Derivatives are contracts between at least two parties. The value of the contract is based on an underlying asset such as a stock, a market index or an interest rate. There are various derivatives, such as options and futures.[10]

[8] Kenton, Will. "Security", 2019-05-20. https://www.investopedia.com/terms/s/security.asp (Retrieved 2019-05-22)
[9] Chen, James. "Debt Security", 2019-03-23. https://www.investopedia.com/terms/d/debtsecurity.asp (Retrieved 2019-05-20)
[10] Chen, James. "What is a Derivative?", 2019-05-19. https://www.investopedia.com/ask/answers/12/derivative.asp (Retrieved 2019-05-22)

#### 3.1.2 Market Index

A market index is a measurement of a segment of the financial market. More precisely, the index shows the performance of the securities within the chosen segment. A market index is computed from the prices of those securities, and there are several weighting methods for determining the impact of each price.[11]

#### 3.1.3 Exchange Rates

An exchange rate is the value of an economic zone's currency compared to the currency of another nation or economic zone. The exchange rate is one of the most important indicators of a country's economic health relative to others, and it is vital to a country's level of trade and financial flows.[12] Movements in the exchange rate influence the decisions of businesses, governments and individuals in society. Collectively, this may affect activity in the financial markets, for example how people trade and how securities are valued.[13]

#### 3.1.4 Commodities

Commodities are basic goods used in commerce and as input in the production of both products and services. Their prices are usually set by the market as a whole, and a commodity can be anything from raw materials to chemicals. Commodities are most commonly sold and purchased through futures contracts that standardize the quantity and minimum quality of the commodity being traded. The commodities market is important since it offers a market place where members can transact business. It also establishes regulated trading with rules and

[11] Young, Julie. "Market Index", 2019-05-02. https://www.investopedia.com/terms/m/marketindex.asp (Retrieved 2019-05-22)
[12] Twin, Alexandra. "6 Factors that Influence Exchange Rates", 2019-05-20. https://www.investopedia.com/trading/factors-influence-exchange-rates/ (Retrieved 2019-05-20)
[13] Hamilton, Adam. "Understanding Exchange Rates and Why They Are Important", 2018. https://www.rba.gov.au/publications/bulletin/2018/dec/pdf/understanding-exchange-rates-and-why-they-are-important.pdf (Retrieved 2019-05-20)

regulations. Moreover, it is a place for collecting and disseminating information as well as grading commodities by quality.[14]

One example used in the thesis is the spot price of crude oil, which is considered one of the most important commodities in the world. Since today's society and economy depend on non-renewable fossil fuels, crude oil plays an important role in the commodities market. The cost of a barrel of crude oil is determined by the global market, more precisely by its supply and demand; for example, if the demand for crude oil is high and the supply is low, the result is higher oil prices. Predicting these prices is important for economists and experts since they are volatile. The price of oil can, directly or indirectly through multiple steps, affect the costs of goods and services in the economy, which can result in inflation. West Texas Intermediate crude oil is considered one of the major benchmarks for crude oil.[15]

#### 3.1.5 Volatility

Volatility is the standard deviation of the return of an asset. The standard deviation is the square root of the variance, and both measure the variability of a return. Volatility is an indicator of the risk level of an asset, for instance a security, portfolio or market. It is expected to be more challenging to predict the price of a highly volatile asset; consequently, volatile assets are viewed as riskier than less volatile ones. In short, volatility is the risk related to changes in the asset's price. The VIX Index is an example of a market volatility measure. Before making an investment decision, investors normally look at VIX values to gain insight into the market risk.[16]

[14] Lioudis, Nick. "Commodities Trading: An Overview", 2018-05-18. https://www.investopedia.com/investing/commodities-trading-overview/ (Retrieved 2019-05-20)
[15] Premkumar, Divya. "How do oil prices affect the stock market?", 2019-01-08. https://www.tradebrains.in/how-do-oil-prices-affect-the-stock-market/ (Retrieved 2019-05-01)
[16] Kuepper, Justin. "Volatility Definition", 2019-04-18. https://www.investopedia.com/terms/v/volatility.asp (Retrieved 2019-05-01)
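The definition of volatility in Section 3.1.5 can be made concrete with a short Python sketch: compute one-period returns from a price series and take their sample standard deviation. The prices below are made-up numbers for illustration only.

```python
import math

def simple_returns(prices):
    """One-period simple returns: r_t = p_t / p_{t-1} - 1."""
    return [prices[t] / prices[t - 1] - 1 for t in range(1, len(prices))]

def volatility(prices):
    """Volatility as the sample standard deviation of the returns."""
    r = simple_returns(prices)
    mean = sum(r) / len(r)
    variance = sum((x - mean) ** 2 for x in r) / (len(r) - 1)
    return math.sqrt(variance)

calm = [100, 101, 102, 103, 104]   # steadily rising prices
wild = [100, 110, 95, 120, 90]     # large swings
print(volatility(calm) < volatility(wild))  # True: the wild series is riskier
```

This matches the intuition stated above: the series with larger price swings has the larger volatility and is therefore viewed as riskier.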

#### 3.1.6 Bonds

A bond is a fixed income instrument representing a loan made by an investor to a borrower. When companies or other institutions need to finance new projects, ongoing operations or other investments, they can issue bonds directly to investors. The borrower, the issuer of the bond, specifies the terms of the loan, the interest payments and the maturity date. The interest payment, the coupon, is what bondholders earn for lending their funds; the interest rate that determines the payment is called the coupon rate.

A government bond is a bond issued by a government. The Treasury yield is the return on investment on the US government's debt obligations. It is important when analysing stocks since it tends to signal investor confidence: when confidence is high, the bond's price drops and its yield increases, since investors believe they can find investments with higher returns; when confidence is low, the opposite occurs. Bonds also affect the amount of liquidity in a country, since they help determine how easy or difficult it is to take loans and buy on credit. Because bonds are so strongly related to the economy, they are important for forecasting; bond yields indicate what investors think the economy will do.[17]

### 3.2 Timing of Entry Framework

When firms are about to enter a new market, either by launching a new product or by expanding to new regions, one main concern is when to enter the market. Entrants are usually divided into three categories depending on their time of entrance: first movers, early followers and late entrants. Earlier research has produced contradictory answers to the question of which entry timing strategy is optimal and why.

The first movers in a market are the first to bring and sell a new good or service to the market. Early followers are relatively early to the market, even though

[17] Amadeo, Kimberly. "How Bonds Affect the U.S. Economy", 2019-01-20. https://www.thebalance.com/how-do-bonds-affect-the-us-economy-3305601 (Retrieved 2019-05-01)

they are not the first to enter. Lastly, late entrants appear when a product is becoming, or has become, more commercial, in other words when the product gains mass market penetration.

#### 3.2.1 First Mover Advantages

The theory of timing of entry also covers the advantages and disadvantages of being the first mover. According to the theory, the first mover will gain brand loyalty and technological leadership. Additionally, first movers have more time on the market, enabling them to gain more market share; this could eventually result in a winner-takes-all market. The reason is that the company may be positioned as a technological innovator and gain a reputation as a leader. Being first also enables the player to shape the characteristics of the technology, for instance its features and functionality, as well as the pricing. Firms that enter the market early can capture important resources such as key locations, government permits, patents on the technology and access to distribution channels, and can develop relationships with suppliers. Another advantage of being early is exploiting buyer switching costs: if a buyer faces switching costs when changing to another, superior technology and has invested time in the current technology, the first mover that captures customers may be able to keep them. If the industry pressures and encourages the adoption of a dominant design, the timing of entry can be critical to a firm's likelihood of success.

#### 3.2.2 First Mover Disadvantages

Studies have shown that many first movers are exposed to higher costs, which reduce the profits of their businesses. To become the first mover, it may be necessary to devote resources to research and development. Late entrants, on the other hand, can use the work, technology and knowledge already developed by the first mover to create a similar product. They can also adapt the product or service to the customers' preferences instead of facing uncertainty about customer requirements; as a result, they can avoid high development expenses.

Another negative aspect is that newly developed technologies may require other technologies or components produced by other firms; the first mover is then dependent on the efforts of those firms and cannot rely on enabling technologies being in place. Moreover, when firms introduce new technologies and innovations, often no appropriate suppliers or distributors exist yet. The firm may then have to assist its suppliers, or perhaps develop its own, which is a time and resource demanding task.

### 3.3 Porter's Five Forces

Porter's Five Forces framework, developed by Michael Porter, is a tool for analyzing the market dynamics and competition of a business. The purpose of the model is to identify and analyze five competitive forces that shape every industry, helping to determine an industry's weaknesses and strengths. The insights are often used to assess whether new product or service offerings can be profitable. The model may also be used to answer strategic questions such as how, where and when a market entry should be made. The five forces are the threat of new entrants, the bargaining power of suppliers, the bargaining power of customers, the threat of substitute products and competitive rivalry. Together, the first four forces shape the fifth, competitive rivalry.

Figure 3.1: Porter's five forces model and important questions to answer during the analysis

## 4 Mathematical Theory

The following section provides information regarding the mathematical theories and models used in the thesis. It also explains the assumptions that the models are based upon.

### 4.1 Time Series

A time series is a series of data points measured over a time period and indexed in time order; in other words, values taken by a variable over time in chronological order.[18] The time series is denoted {X_t}, t = 0, 1, 2, ..., where t represents the time and X_t is a random variable. A time series can be either discrete, with time indices t = 0, 1, 2, ..., or continuous, with t ∈ [0, ∞).

#### 4.1.1 The Objectives of Time Series Analysis

The primary objective of time series analysis is the development of mathematical models that describe the data sample. The purpose is to extract meaningful statistics and characteristics of the data. In general, time series analysis has two main goals:

1. Identifying the nature of the phenomenon: what does the series contain?
2. Forecasting, in other words predicting future values of the time series variable.

Both goals require identifying the pattern observed in the time series. Once identified, the pattern can be interpreted and integrated with other data in a forecast model.[19]

[18] "Time Series". http://www.businessdictionary.com/definition/time-series.html (Retrieved 2019-01-30)
[19] "Time Series". https://www.stat.ncsu.edu/people/bloomfield/courses/st730/slides/SnS-01-2.pdf (Retrieved 2019-02-02)
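As a toy illustration of goal 1, identifying the nature of a series, the sketch below builds an artificial series with a trend and a seasonal pattern and uses a crude half-sample comparison of means to reveal the drifting mean, before and after first-differencing. The construction and the heuristic are illustrative assumptions of ours, not a procedure from the thesis; the trend, seasonality and stationarity concepts involved are defined in Sections 4.1.2 to 4.1.5.

```python
import math
import random

random.seed(1)

# Toy series with a linear trend, a period-12 seasonal pattern and noise
# (the additive structure anticipates the decomposition in Section 4.1.2).
n = 400
x = [0.5 * t + 3.0 * math.sin(2 * math.pi * t / 12) + random.gauss(0, 1)
     for t in range(n)]

def half_sample_mean_gap(series):
    """Crude check of whether the mean drifts: compare half-sample means."""
    h = len(series) // 2
    first = sum(series[:h]) / h
    second = sum(series[h:]) / (len(series) - h)
    return abs(second - first)

gap_raw = half_sample_mean_gap(x)             # large: the trend shifts the mean
diff = [x[t] - x[t - 1] for t in range(1, n)]
gap_diff = half_sample_mean_gap(diff)         # small: differencing removes the trend

print(round(gap_raw, 1))    # roughly 100, half of the trend's total rise
print(round(gap_diff, 2))   # close to 0
```

A mean that shifts this much between halves is a simple symptom of non-stationarity; the formal hypothesis tests used in the thesis are described in Section 4.2.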

#### 4.1.2 Time Series Decomposition

Within time series analysis, one can decompose a time series into several components. Let {X_t} be a sequence of random variables. Then a time series can be decomposed either additively as

X_t = T_t + S_t + ϵ_t

or multiplicatively as

X_t = T_t · S_t · ϵ_t

where T_t is the trend component at time t, S_t is the seasonal component at time t and ϵ_t is an irregular component at time t.[20]

Over a long time period, a time series may show a general tendency to decrease, increase or stagnate; this is represented by the trend component. The seasonal component exhibits patterns driven by seasonal factors such as the day of the week or the quarter of the year; the period of the seasonality is fixed and known. The irregular component captures events that do not occur regularly and are unpredictable in character.[21] It corresponds to the residual obtained after the trend and seasonality have been removed; that is, ϵ_t is a random noise component. Additionally, ϵ_t is stationary, at least in the weak sense (described in Section 4.1.5).[22]

#### 4.1.3 Trends

Usually one wants to know whether there is a trend in the time series, to support future forecasting. In some cases a trend is the accumulated effect of certain factors; in other cases it indicates an influence that needs further investigation. The trend could, for example, be linear, exponential or even mixed

[20] Adhikari, Ratnadip et al. "An Introductory Study on Time Series Modeling and Forecasting". https://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf (Retrieved 2019-02-16)
[21] Adhikari, Ratnadip et al. "An Introductory Study on Time Series Modeling and Forecasting", p. 12-18. https://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf (Retrieved 2019-03-23)
[22] Brockwell, Peter J. and Davis, Richard A. "Introduction to Time Series and Forecasting", p. 20. Third ed., Springer.

between different types.23

4.1.4 Seasonality

In time series data, seasonality is the presence of variations that occur at specific regular intervals, for example every autumn. These repeat regularly over time. Identifying or removing seasonal components could result in a clearer relationship between the input and output variables. It could also provide information that helps improve model performance.24

4.1.5 Stationarity

A stationarity assumption is equivalent to saying that the generating mechanism of the process is itself time-invariant, so that neither the form nor the parameter values of the generation procedure change over time. A process {Xt}, t ∈ Z (where Z is the set of integers) is defined to be weakly stationary if it satisfies

1. E[Xt] = µ
2. Var[Xt] = σX² < ∞
3. γX(s, t) = γX(s + h, t + h) for all s, t, h ∈ Z,

where γX is the autocovariance function. In other words, a stochastic process that is stationary has a mean and variance that do not change over time. Also, the autocovariance, meaning the covariance between the values of the process at two points in time, depends only on the distance between the time points and not on time itself.25

There is also a more restrictive definition of stationarity than the one above. A process {Xt, t = 0, ±1, ±2, ...} is strictly stationary if the same joint probability distribution holds for (Xt1, ..., Xtn) as for (Xt1+h, ..., Xtn+h), that is

23 Deshpande, Bala. 2014-03-12 ”Time series forecasting: understanding trend and seasonality” http://www.simafore.com/blog/bid/205420/Time-series-forecasting-understanding-trend-and-seasonality (Retrieved 2019-05-01)
24 Brownlee, Jason. 2016-12-23 ”How to Identify and Remove Seasonality from Time Series Data with Python” https://machinelearningmastery.com/time-series-seasonality-with-python/ (Retrieved 2019-04-14)
25 A. Lincoln. Introduction to the theory of time series, Chapter 1 p. 4-6

(Xt1, ..., Xtn) =d (Xt1+h, ..., Xtn+h)

for all integers h and n > 0, where =d denotes equality in distribution.26

The importance of stationarity is great. If a time series is non-stationary, its behaviour and properties change over time, which makes a regression based on the data points hard to justify. Also, if the variables in a regression model are not known to be stationary, the assumptions of the asymptotic analysis may not be valid.27 Non-stationary time series exhibit trends, seasonal effects and other structures that depend on the time of observation.28 A time series is usually non-deterministic; hence what occurs in the future cannot be predicted with certainty. The concept of stationarity therefore reduces the complexity of forecasting the future.29

In order to check for stationarity there are a number of different approaches. The most common methods are examining plots and statistical tests.30 One can run a sequence of plots and examine them for obvious trends or seasonal effects. In addition, summary statistics can be computed; these are used to summarize a set of observations and communicate as much of the information as possible. The data is partitioned into intervals, and it is then checked whether there are obvious or significant differences in the summary statistics between the intervals. Statistical tests provide a method for making quantitative decisions about a particular sample.

26 Brockwell, J Peter. Davis, A Richard. ”Introduction to Time Series and Forecasting”, p. 13. Third ed, Springer
27 Ryabko, Daniil. ”Asymptotic Nonparametric Statistical Analysis of Stationary Time Series”, 2019-03-30. https://arxiv.org/abs/1904.00173 (Retrieved 2019-05-01)
28 Kang, Eugine. ”Time Series: Check Stationarity”, 2018-08-26. https://medium.com/@kangeugine/time-series-check-stationarity-1bee9085da05 (Retrieved 2019-02-23)
29 Adhikari, Ratnadip et al. ”An Introductory Study on Time Series Modeling and Forecasting” p. 12-18 https://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf (Retrieved 2019-03-23)
30 ”Tests of Stationarity” https://people.maths.bris.ac.uk/~magpn/Research/LSTS/TOS.html (Retrieved 2019-02-12)
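The interval-based summary statistics check described above can be sketched in Python with numpy. This is a minimal illustration on simulated data, not the code used in the thesis; the function name and split into two halves are choices made here:

```python
import numpy as np

def summary_stat_check(x, n_parts=2):
    """Partition a series into intervals and return (mean, variance)
    per interval. Large differences between intervals suggest
    non-stationarity; similar values are consistent with stationarity."""
    parts = np.array_split(np.asarray(x, dtype=float), n_parts)
    return [(p.mean(), p.var()) for p in parts]

rng = np.random.default_rng(0)
noise = rng.normal(size=1000)   # white noise: weakly stationary
walk = np.cumsum(noise)         # random walk: non-stationary

stats_noise = summary_stat_check(noise)
stats_walk = summary_stat_check(walk)
```

For the white noise series the two half-means stay close to zero, while the random walk's half-means typically drift far apart, hinting that its mean depends on time.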

Figure 4.1: The graph illustrates a non-stationary time series, a random walk that has not been adjusted.

Figure 4.2: This figure illustrates the same data after stationarity has been obtained with the first difference transformation. As one can see, the graph resembles an even line, indicating stationarity.
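Data of the kind shown in Figures 4.1 and 4.2 can be generated with a short simulation. The actual series behind the figures is not specified, so simulated data is used here as a stand-in:

```python
import numpy as np

# Simulated stand-in for the figures: a random walk (non-stationary)
# and its first difference (stationary increments).
rng = np.random.default_rng(42)
steps = rng.normal(size=1000)   # iid increments
walk = np.cumsum(steps)         # wanders far from any fixed level
diffed = np.diff(walk)          # equals steps[1:]: constant level and spread

# Plotting walk and diffed (e.g. with matplotlib) reproduces the look of
# Figure 4.1 and Figure 4.2 respectively.
```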

4.2 Stationarity Hypothesis Testing

As mentioned in the limitations of this project, only two different stationarity tests are used. These hypothesis tests are used to obtain an indication as to whether a time series is stationary. However, they cannot be used as proof of stationarity. If the null hypothesis is not rejected, it is not thereby confirmed; a non-significant result only means that the alternative hypothesis is not a strong competitor to the null hypothesis. In general, there can also be many other null hypotheses that would not have been rejected.31

4.2.1 Dickey-Fuller Test

A commonly used method for checking the existence of a unit root is the Dickey-Fuller test, developed by David Dickey and Wayne Fuller (1979). The Dickey-Fuller hypothesis test gives an indication of whether a process is stationary or not.32 The test checks if a process follows a unit root process. The augmented Dickey-Fuller (ADF) test is an extension of the original Dickey-Fuller (DF) test, used for higher order correlation, since the original test is only valid for AR(1) processes. An AR(1) process is an autoregressive process of the first order, meaning that the current value is based on the immediately preceding value.33 Similar to the original DF test, the ADF tests for a unit root in a time series sample. The primary difference is that the ADF can be used for more complicated and larger sets of time series models.34 If there is higher order correlation, instead of only an AR(1) structure, the augmented version must be used. The purpose is to test the null hypothesis that a unit root is present against the alternative hypothesis that there is no unit root, which indicates that the data is stationary.
31 ”Hypotesprövning” http://gauss.stat.su.se/gu/sg/2012VT/Kompendium/KAP17new.pdf (Retrieved 2019-05-03)
32 ”ADF — Augmented Dickey Fuller Test” https://www.statisticshowto.datasciencecentral.com/adf-augmented-dickey-fuller-test/ (Retrieved 2019-03-15)
33 Pantelis, Anastasios. 2008. ”Testing for unit roots in the presence of structural change” http://lup.lub.lu.se/luur/download?func=downloadFile&recordOId=1338330&fileOId=1646631 (Retrieved 2019-03-09)
34 ”The Augmented Dickey-Fuller Test” https://www.thoughtco.com/the-augmented-dickey-fuller-test-1145985 (Retrieved 2019-02-27)
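In practice, ready-made implementations such as the adfuller and kpss functions in Python's statsmodels package are typically used for these tests. As an illustration of the mechanics, the plain Dickey-Fuller regression and its τ statistic, formalised below, can be sketched with numpy. This is a sketch without augmentation lags and without the tabulated critical values, not the thesis code:

```python
import numpy as np

def df_tau(x):
    """Tau statistic of the plain Dickey-Fuller regression
    dX_t = delta + pi * X_{t-1} + e_t (no augmentation lags)."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    X = np.column_stack([np.ones(len(dx)), x[:-1]])   # intercept, X_{t-1}
    beta, *_ = np.linalg.lstsq(X, dx, rcond=None)
    resid = dx - X @ beta
    s2 = resid @ resid / (len(dx) - 2)                # error variance estimate
    se_pi = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_pi                            # pi_hat / SE(pi_hat)

rng = np.random.default_rng(1)
e = rng.normal(size=600)
ar1 = np.zeros(600)
for t in range(1, 600):          # stationary AR(1) with theta = 0.5
    ar1[t] = 0.5 * ar1[t - 1] + e[t]
walk = np.cumsum(e)              # unit root process

tau_ar1 = df_tau(ar1)            # strongly negative for the stationary series
tau_walk = df_tau(walk)          # much closer to zero for the random walk
```

The τ values are compared against Dickey-Fuller critical values (about −2.86 at the 5% level with an intercept, for large samples); statsmodels reports the p-value directly.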

Consider the first order autoregressive model

Xt = δ + θXt−1 + ϵt

where θ = 1 corresponds to a unit root and ϵt is a white noise process with constant variance and zero mean. In a stationary AR(1) process, the constant term δ can be expressed as δ = (1 − θ)µ, where µ is the mean of the series. The null hypothesis of a unit root is that θ = 1, which also implies that δ = 0. This is difficult to test directly; therefore the model is rewritten as

∆Xt = δ + (θ − 1)Xt−1 + ϵt = δ + πXt−1 + ϵt

where π = θ − 1. The null hypothesis states that θ − 1 = 0, or equivalently π = 0. The hypotheses are thus formulated as

H0 : π = 0
H1 : π < 0

When the hypotheses are established, the Dickey-Fuller test performs a t-test on H0. The test statistic

τ̂ = (θ̂ − 1)/SE(θ̂) = π̂/SE(π̂)

is compared to a critical value from the Dickey-Fuller distribution.35 When performing the ADF test, a p-value < 0.05 indicates strong evidence against the null hypothesis of a unit root; the null hypothesis is then rejected and stationarity is supported. If, on the other hand, the p-value ≥ 0.05, the evidence against the null hypothesis is weak, the unit root cannot be rejected, and stationarity of the time series is not supported.

4.2.2 Kwiatkowski–Phillips–Schmidt–Shin (KPSS) Test

The KPSS test is a test of the stationarity hypothesis proposed by Kwiatkowski, Phillips, Schmidt and Shin (1992). Like the Dickey-Fuller test, it gives an indication as to whether there exists a unit root or the process is stationary, but its null hypothesis is stationarity.36 Let Xt, t = 1, 2, ..., T be a time series of observed values. Assume the series can be decomposed into a deterministic trend, a random walk, and a stationary error. The data generating process (DGP) of Xt in KPSS can then be defined as

Xt = Yt + ϵt + ξt

where Yt is the deterministic trend term, ϵt is the error term, and ξt is the random walk term, so that ξt = ξt−1 + ηt. By definition of the random walk, ηt ∼ iid(0, σ²).37 If σ² = 0, meaning the variance of ηt is zero, then it holds that

ξt = ξt−1

That is, the random walk process collapses to a constant term and Xt becomes

35 Verbeek, Marno. ”A Guide to Modern Econometrics” 2014, 2nd Edition, p. 265-268
36 ”What is a Critical Value?”, 2019. https://support.minitab.com/en-us/minitab-express/1/help-and-how-to/basic-statistics/inference/supporting-topics/basics/what-is-a-critical-value/ (Retrieved 2019-05-04)
37 Nabeya, Seiji et al. ”Asymptotic Theory of a Test for the Constancy of Regression Coefficients Against the Random Walk Alternative” 1987. https://projecteuclid.org/download/pdf_1/euclid.aos/1176350701 (Retrieved 2019-04-30)

trend-stationary, meaning that the series grows around the deterministic trend. Consequently, the hypotheses can be formulated as

H0 : σ² = 0
H1 : σ² > 0

Under the null hypothesis the process is trend-stationary, and the alternative hypothesis implies that Xt, t = 1, 2, ..., T is a unit root process.38 To reduce complexity, the deterministic component of the series may also be removed, Yt = 0. This is a special case for which the null hypothesis is that Xt is level-stationary around a level or mean (ξ0) instead of around a trend, meaning that the mean value no longer depends on t.39 A statistic that can be used for the null hypothesis is the LM statistic, which is defined as

LM = ∑_{t=1}^{T} St² / σ̂²

where

St = ∑_{i=1}^{t} ei

That is, St² is the squared partial sum of the residuals, where et, t = 1, 2, ..., T denotes the residuals from a regression of X on a time trend and an intercept. Further, σ̂² is the estimated error variance obtained from the regression. If the aim is instead to test for level stationarity, the residual is redefined as

ei = Xi − X̄

38 Cappuccio, Nunzio et al. ”The Fragility of the KPSS Stationarity Test” 2009. http://leonardo3.dse.univr.it/home/workingpapers/fragility_kpss.pdf (Retrieved 2019-04-30)
39 Journal of Econometrics, ”Testing the null hypothesis of stationarity against the alternative of a unit root”, 1991. http://debis.deu.edu.tr/userweb//onder.hanedar/dosyalar/kpss.pdf (Retrieved 2019-04-30)

which is the regression of X only on an intercept.40

4.3 Transformations

This section provides theory regarding the transformations that Henricsson found to be relevant during his research, and discusses their purpose. Data transformation is a process where information or data is converted from one format to another. In this case the goal is to transform data from non-stationary to stationary. To describe the given equations, the following variables are introduced: data is measured on the range (t0, ..., t, ..., tmax) and consists of T elements. The dataset X is an N × T matrix containing the N variable vectors (x1, x2, ..., xN), where xi = (xi,t0, ..., xi,t, ..., xi,tmax). For a certain point in time t, and a specific variable k, a number of approximations of transformations are presented. Most of the transformations depend on a rate of decay α, which can be varied, so there is a suitable number of varieties of each transformation and an estimation of α may be needed. In general, the formula for the new forecast after the transformation follows the pattern

NewForecast = α ∗ NewData + (1 − α) ∗ MostRecentForecast

One can say that the choice of α decides how much weight the new forecast gives to new data and how much it gives to the past.41 Previous studies have suggested that the value of α should be below 0.3 for a smoothing result.42

40 Journal of Econometrics, ”Testing the null hypothesis of stationarity against the alternative of a unit root”, 1991. http://debis.deu.edu.tr/userweb//onder.hanedar/dosyalar/kpss.pdf (Retrieved 2019-04-30)
41 Ragnarstrom, Elsa. ”How to calculate forecast accuracy for stocked items with a lumpy demands”, 2015. https://www.diva-portal.org/smash/get/diva2:901177/FULLTEXT01.pdf (Retrieved 2019-05-03)
42 ”How To Identify Patterns in Time Series Data: Time Series Analysis”
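The role of α in the forecast pattern above can be illustrated with a short numpy sketch (an illustration on simulated data, not the thesis code): a lower α weights the past more heavily and therefore smooths more.

```python
import numpy as np

def exp_smooth(x, alpha):
    """NewForecast = alpha * NewData + (1 - alpha) * MostRecentForecast."""
    s = np.empty(len(x))
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s

rng = np.random.default_rng(2)
data = rng.normal(size=2000)               # noisy input series

smooth_low = exp_smooth(data, alpha=0.1)   # weights the past heavily
smooth_high = exp_smooth(data, alpha=0.9)  # tracks the new data closely
# For white noise input, the steady-state variance of the smoothed series
# is alpha / (2 - alpha) times the input variance, so small alpha smooths more.
```

This is consistent with the recommendation that α below 0.3 gives a smoothing result: at α = 0.1 the smoothed series retains only about 5% of the input variance.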

4.3.1 Level Transformation

Let {Xt, t = 0, 1, 2, ...} be a time series. The level transformation is defined as

F1i,t = Xi,t̄    where t̄ = max(tj ≤ t)

Here t̄ is the largest observation time in the sample up to a specific point in time t; that is, it corresponds to the latest observation. In other words, if there are any missing values, the most recent value obtained is used.

4.3.2 First Difference Transformation

The first difference at time t, F2i,t, is obtained by looking at the change between an observation at time t and the previous time step, t − 1, in the original series.43 The first difference transformation is defined as

F2i,t = Xi,t̄ − Xi,t̄−1

A commonly encountered non-stationary behaviour is when the level of the process changes while the variability remains homogeneous. Taking the (first) difference may in these cases lead to stationarity.44 In time series analysis, differencing is frequently used for removing dependency on time, including structures such as trend and seasonality.

http://www.statsoft.com/Textbook/Time-Series-Analysis (Retrieved 2019-05-03)
43 Kulahci, Murat et al. ”Time Series Analysis and Forecasting by Example”, 2011, p. 90
44 Bisgaard, S., Kulahci, M. ”Time Series Analysis and Forecasting”, 2017-06-22. https://www.vividcortex.com/blog/exponential-smoothing-for-time-series-forecasting (Retrieved 2019-02-18)
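A minimal sketch of the first difference transformation on simulated data with a linear trend (illustrative only; numpy's diff function plays the role of F2, and the slope 0.5 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(1000)
x = 0.5 * t + rng.normal(size=1000)   # linear trend plus noise: non-stationary mean

f2 = np.diff(x)                       # first difference: F2_t = X_t - X_{t-1}
# The differenced series has a roughly constant mean equal to the slope
# (0.5 here) and no remaining dependence on t.
```

With pandas, the equivalent operation is Series.diff(), which keeps the original time index and leaves a missing value at the first position.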

4.3.3 Mean EWMA Transformation

An exponentially weighted moving average, also called EWMA, is a type of moving average that places greater weight and significance on the most recent data points. For example, it can be assumed that a security's price depends more on recent prices than on historical data from long ago. The previous value of the EWMA is taken into account in the calculation of the following EWMA, and as the name indicates, the weights are based on the exponential function.45 This is a very popular scheme for producing a smoothed time series. In general, for a time series {Xt}, the smoothed version looks like

St = α ∗ Xt + (1 − α) ∗ St−1 46

The definition of the EWMA mean in this case is

F3i,t = (1 − α) ∗ F3i,t−1 + α ∗ F2i,t

4.3.4 Variance EWMA Transformation

As mentioned, exponentially weighted moving averages are often used for smoothing irregular fluctuations in a time series, to better find the patterns over a specific time period. Applying the same exponential weighting to squared deviations gives the EWMA variance transformation

F4i,t = (1 − α) ∗ F4i,t−1 + α(F2i,t − F3i,t)²

From the EWMA variance, a future variance is estimated by the weighted average of

45 ”Exponentially Weighted Moving Average” https://www.value-at-risk.net/exponentially-weighted-moving-average-ewma (Retrieved 2019-03-02)
46 Jinka, Preetam. ”Exponential Smoothing for Time Series Forecasting”, 2017-06-22. https://www.vividcortex.com/blog/exponential-smoothing-for-time-series-forecasting (Retrieved 2019-02-18)

past variances.47

4.3.5 Skewness EWMA Transformation

This transformation measures the skewness of the change in the variable and uses it to transform the data. The formula used is

F5i,t = (1 − α) ∗ F5i,t−1 + α(F2i,t − F3i,t)³

4.3.6 Kurtosis EWMA Transformation

This transformation measures the kurtosis of the change in the variable. The formula used is

F6i,t = (1 − α) ∗ F6i,t−1 + α(F2i,t − F3i,t)⁴

4.3.7 Autocorrelation Transformation

In probability theory and statistics, for a given stochastic process, the autocorrelation is a number that represents the similarity between a given time series and a lagged version of itself over successive time intervals. In other words, it is the same as calculating the correlation between the current values of the series and its past values. The result varies between -1 and 1. If the autocorrelation is positive, an increase in the series tends to coincide with an increase in its lagged version as well.48 Firstly, the EWMA autocovariance is calculated by the following formula

47 Breaking Down Finance. ”Exponentially Moving Average Volatility (EWMA)”. https://breakingdownfinance.com/finance-topics/risk-management/ewma/ (Retrieved 2019-05-03)
48 Kenton, Will. ”Autocorrelation”, 2019-03-31. https://www.investopedia.com/terms/a/autocorrelation.asp (Retrieved 2019-04-13)

F7i,t = (1 − α) ∗ F7i,t−1 + α(F2i,t − F3i,t)(F2i,t−1 − F3i,t−1)

Normally, the autocovariance function between times t1 and t2 for Xt is defined as

γX(t1, t2) = Cov(Xt1, Xt2)

and the autocorrelation is defined as

φX,X(t1, t2) = γX(t1, t2) / (σt1 ∗ σt2)

where σt² is the variance at time t.49 To obtain the EWMA autocorrelation between t1 = t and t2 = t − 1, the standard variances are replaced with the corresponding EWMA variances, and the EWMA autocovariance is used. The formula is hence

EWMA autocorrt = F7i,t / (√F4i,t ∗ √F4i,t−1)

4.3.8 Correlation EWMA Transformation

In probability theory, the correlation measures the degree to which two time series move in relation to each other. Just as in the autocorrelation case, a positive correlation indicates that if one series moves up, the other tends to follow.50 Let {Xt, t = 0, 1, 2, ...} be a time series representing one set of observed data, and {Yt, t = 0, 1, 2, ...} be another time series representing another set of observed data. To begin with, the EWMA covariance is calculated by the formula

F8i,j,t = (1 − α) ∗ F8i,j,t−1 + α(F2i,t − F3i,t)(F2j,t−1 − F3j,t−1)

49 Kulahci, Murat et al. ”Time Series Analysis and Forecasting by Example”, 2011, p. 62
50 Hayes, Adam. ”Correlation”, 2019-04-30. https://www.investopedia.com/terms/c/correlation.asp (Retrieved 2019-05-01)

where index i and index j correspond to Xt and Yt, respectively. In general, the covariance between two random variables X and Y is denoted Cov(X, Y), and the correlation between the random variables is defined as

φX,Y = Cov(X, Y) / (σX ∗ σY)

where σX² is the variance of X and σY² is the variance of Y.51 Using the EWMA covariance, and replacing the standard variances with their corresponding EWMA variances, the EWMA correlation is formulated as

EWMA corrt = F8i,j,t / (√F4i,t ∗ √F4j,t)

51 Kulahci, Murat et al. ”Time Series Analysis and Forecasting by Example”, 2011, p. 62
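The EWMA recursions F3 through F7 and the normalised autocorrelation can be sketched directly from the formulas above. This is an illustrative numpy sketch, not the thesis implementation: the formulas do not specify initial values, so starting the series at zero (and the variance at one, to avoid division by zero) is an assumption made here.

```python
import numpy as np

def ewma_transforms(f2, alpha):
    """EWMA recursions F3-F7 from the formulas above, plus the
    normalised EWMA autocorrelation. Initialisation (zeros, F4 at one)
    is a choice not specified in the text."""
    n = len(f2)
    F3, F4, F5, F6, F7 = (np.zeros(n) for _ in range(5))
    F4[0] = 1.0
    for t in range(1, n):
        F3[t] = (1 - alpha) * F3[t - 1] + alpha * f2[t]
        d = f2[t] - F3[t]                  # current deviation
        d_prev = f2[t - 1] - F3[t - 1]     # lagged deviation
        F4[t] = (1 - alpha) * F4[t - 1] + alpha * d ** 2      # variance
        F5[t] = (1 - alpha) * F5[t - 1] + alpha * d ** 3      # skewness
        F6[t] = (1 - alpha) * F6[t - 1] + alpha * d ** 4      # kurtosis
        F7[t] = (1 - alpha) * F7[t - 1] + alpha * d * d_prev  # autocovariance
    # EWMA autocorrelation: F7 normalised by the EWMA standard deviations
    autocorr = F7[1:] / np.sqrt(F4[1:] * F4[:-1])
    return F3, F4, F5, F6, F7, autocorr

rng = np.random.default_rng(5)
f2 = rng.normal(size=5000)   # iid differences: true lag-1 autocorrelation 0
F3, F4, F5, F6, F7, autocorr = ewma_transforms(f2, alpha=0.05)
```

The two-series covariance F8 and the EWMA correlation follow the same pattern, with the deviations of a second series j (lagged by one step, as the F8 formula is written) in place of the lagged deviations of series i.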

5 Methodology

The tools used in this project were limited to the programming language Python and the spreadsheet software Microsoft Excel. These tools were chosen since they are well suited to time series data, and all the required hypothesis tests and transformations can be performed with them.52

5.1 Data Collection

The data was provided by SHB and consisted of different security prices and indices. These covered the time period from 2001-01-01 to 2018-12-31 and were noted on a daily basis, in order to capture real trends and seasonality of the time series. The data regarded US related securities, such as US sector stock indices, US treasury bonds, exchange rates against the US dollar and more. Processing this type of data may lay the basis for SHB to use the data to predict future outcomes of the US stock market. For example, future values of US stock market indices such as the Dow Jones Index or the S&P 500 may potentially be forecasted by a prediction model after the data is pre-processed. This was an area of interest for SHB. The data is considered quantitative since it only contains numbers.53 Qualitative data was also used, in the form of discussions with professionals with previous expertise in data pre-processing, for example on how to interpret results or to understand more about the chosen data.

5.2 Data and Notations

This section contains the data and notation used in this thesis and explanations regarding them.

52 Brownlee, Jason. ”How to Check if Time Series Data is Stationary with Python”, 2016-12-30 https://machinelearningmastery.com/time-series-data-stationary-python/ (Retrieved 2019-03-09)
53 ”Collecting Data” http://betterthesis.dk/research-methods/lesson-1different-approaches-to-research/collecting-data (Retrieved 2019-02-09)

5.2.1 Exchange Rates (FX Data)

An exchange rate shows the value of one currency unit relative to a unit of another currency in the foreign exchange market.54 Further in this report, a currency pair Currency1Currency2 represents the price, given in currency 2, of one unit of currency 1. As FX data, the currency pairs used are EURUSD, GBPUSD, AUDUSD, NZDUSD, USDCAD, USDCHF, USDJPY, USDNOK and USDSEK.

5.2.2 US Sectors Data

The sector data used are indices, each one describing the performance of a chosen sector in the United States. Each index is designed by Morgan Stanley Capital International (MSCI) and covers securities in the large and mid cap segments within the specific sector. MSCI is a provider of security indices and performance analytics.55 The classification of the securities follows the Global Industry Classification Standard (GICS®).56 The notations for the sectors are MXUS0EN (Energy), MXUS0MT (Materials), MXUS0IN (Industrials), MXUS0CD (Consumer Discretionary), MXUS0CS (Consumer Staples), MXUS0HC (Health Care), MXUS0FN (Financials), MXUS0IT (Information Technology), MXUS0TC (Telecom Services) and MXUS0UT (Utilities).

5.2.3 Countries - Stock Index Data

The country (and region) indices used are MXDE (Germany), MXEU (Europe), MXGB (United Kingdom), MXFR (France), MXCH (Switzerland), MXES (Spain), MXIT (Italy) and MXUS (the United States). Each index is developed

54 Investopedia, ”Currency Pair Definition”. https://www.investopedia.com/terms/c/currencypair.asp (Retrieved 2019-05-04)
55 ”Index solutions”. MSCI, https://www.msci.com/index-solutions (Retrieved 2019-05-18)
56 ”MSCI USA MATERIALS INDEX”. MSCI, 2019-04-30. https://www.msci.com/documents/10199/6ce4617e-9127-480f-8f3b-1fdf4c0c8962 (Retrieved 2019-05-03)
