Autotuner Feature Guide - Bisheng Compiler - HUAWEI TECHNOLOGIES CO., LTD - Issue Date

Page created by Veronica Lee
 
Bisheng Compiler

Autotuner Feature Guide

Issue           05
Date            2021-06-22

HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior
written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees
or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.


                                                                                                                                                             Contents

1 Overview
1.1 Concepts
1.2 Functions of the Bisheng Compiler
1.3 Functions of the Autotuner
1.4 Autotuner Tuning Process

2 Quick Start
2.1 Obtaining the Autotuner
2.2 Environment Requirements
2.3 Installing the Autotuner
2.4 Running the Autotuner
2.4.1 Running Modes
2.4.2 llvm-autotune (Recommended)
2.4.3 auto-tuner
2.5 Uninstalling the Autotuner

3 Preparations

4 Usage
4.1 llvm-autotune (Recommended)
4.1.1 Tool Introduction
4.1.2 Help Information
4.1.3 Compiler-related Options
4.2 auto-tuner
4.2.1 Tool Introduction
4.2.2 Help Information
4.2.3 Parse Instruction
4.2.3.1 Usage of the Parse Instruction
4.2.3.2 Filters
4.2.3.3 Search Configuration File
4.2.3.4 Parse Example
4.2.4 Run Instruction
4.2.4.1 Running the Tuner
4.2.4.2 Configuration File
4.2.4.3 Tuners
4.2.4.4 Search Space File
4.2.4.5 Algorithm
4.2.4.6 Run Example
4.2.5 Auto-run Instruction
4.2.5.1 Usage of the Auto-run Instruction
4.2.5.2 Auto-run Example

5 Appendix
5.1 Feedback
5.2 Change History


                                                                     1         Overview

                 1.1 Concepts
                 1.2 Functions of the Bisheng Compiler
                 1.3 Functions of the Autotuner
                 1.4 Autotuner Tuning Process

1.1 Concepts
Automatic Tuning
                 Automatic tuning is an automatic iterative process that optimizes a given program
                 by manipulating compilation options for optimal performance. This process is
                 completed by the collaboration of two components, the Bisheng compiler and the
                 Autotuner command line tool.

Bisheng Compiler
                 The Bisheng compiler provides the automatic tuning feature and works with the
                 Autotuner to control optimization at a finer granularity.

Autotuner
                 The Autotuner is a command line tool that needs to be used together with the
                 Bisheng compiler. It manages the generation and parameter operations of search
                 spaces and drives the entire tuning process.

1.2 Functions of the Bisheng Compiler
                 As a feature of the Bisheng compiler, automatic tuning controls optimization at a
                 finer granularity. You do not need to add pragma directives to the source code.
                 Instead, you specify the optimization configuration in a simple YAML file. The file
                 contains the optimization information and the corresponding code region
                 information, including the name and line number. In addition, the compiler can
                 record optimization results, generate a tuning opportunity list, and export the
                 list in YAML format.

Purposes
                 ●      Make the compilation process more flexible and controllable.
                 ●      Fine-grained compilation control provides more tuning opportunities.

Functions
                 ●      Read the compilation configuration corresponding to each code region.
                 ●      Output the tuning opportunities, that is, which structures in the target
                        program can be used for tuning.

1.3 Functions of the Autotuner
                 ●      Interact with the Bisheng compiler:
                        –   Create a search space based on the tuning opportunities generated by
                            the compiler.
                        –   Generate the compilation configuration and invoke the compiler to
                            compile the source code.
                 ●      Manipulate tuning parameters and apply the search algorithm.
                        –   A genetic algorithm is built in.
                 ●      Obtain performance data.

1.4 Autotuner Tuning Process
                 As shown in Figure 1-1, the tuning process consists of two phases: initial
                 compilation and tuning process.

                 Figure 1-1 Autotuner tuning process

Initial Compilation
                 In the initial compilation phase, before tuning starts, the Autotuner instructs the
                 compiler to compile the target program code. During compilation, the Bisheng
                 compiler generates YAML files that list all tuning opportunities, that is, the
                 structures in the target program that can be tuned, such as modules, functions,
                 and loops. For example, loop unrolling is one of the most common compiler
                 optimizations. By replicating the loop body multiple times, loop unrolling gives
                 the instruction scheduler more room and reduces the overhead of loop branch
                 instructions. If tuning is based on the unroll factor, the compiler writes every
                 loop that can be unrolled to the YAML file as a tuning opportunity.

Tuning Process
                 After the tuning opportunities are generated, the tuning process starts.
                 1.     The Autotuner reads the YAML files of the tuning opportunities to generate
                        the corresponding search spaces, that is, the parameters and ranges for each
                        tuning opportunity.
                 2.     The Autotuner selects a group of parameters based on the specified search
                        algorithm and generates a compilation configuration file in YAML format. The
                        compiler uses this file to compile the target program code into a binary file.
                 3.     The Autotuner runs the compiled file in a user-defined manner and obtains
                        the performance information as feedback.
                 4.     After a certain number of iterations, the Autotuner finds the optimal
                        configuration, generates the optimal compilation configuration file, and stores
                        the file in YAML format.


                                                                  2        Quick Start

                 2.1 Obtaining the Autotuner
                 2.2 Environment Requirements
                 2.3 Installing the Autotuner
                 2.4 Running the Autotuner
                 2.5 Uninstalling the Autotuner

2.1 Obtaining the Autotuner
                 The Autotuner is included in the release package of the Bisheng compiler. You can
                 find it in the bisheng-compiler-1.3.3-aarch64-linux/lib/autotuner directory.

2.2 Environment Requirements
                 Mandatory:
                 ●      Operating systems: openEuler 21.03, openEuler 20.03 (LTS), CentOS 7.6,
                        Ubuntu 18.04, Ubuntu 20, Kylin V10, and UOS 20
                 ●      Architecture: AArch64
                 ●      Python 3.8.2
                 ●      SQLite 3.0
                 Optional:
                 ●      LibYAML (recommended; it can improve the Autotuner file parsing speed)

2.3 Installing the Autotuner
                 The Autotuner has been included in the release package of the Bisheng compiler.
                 If you have installed the Bisheng compiler, you only need to configure the
                 environment variable of the Bisheng compiler. Otherwise, install the Bisheng
                 compiler first.


                 ●      Run the following command to configure the environment variable of the
                        Bisheng compiler:
                        export PATH=/opt/compiler/bisheng-compiler-1.3.3-aarch64-linux/bin:$PATH

                            NOTICE

                        /opt/compiler is used as an example. Replace it with your actual
                        installation directory.

                 ●      Verify the installation.
                        Run the following commands:
                        llvm-autotune -h
                        auto-tuner -h

                        If the help information is displayed, the installation is successful.

                            NOTICE

                        If an error occurs when you run these commands, ensure that your system
                        meets the requirements described in 2.2 Environment Requirements.
                        For example, if the following error is reported:
                        bad magic number in 'autotuner': b'U\r\r\n'

                        Ensure that your Python 3 version is 3.8.2 and that its installation path is
                        in PATH. Run the python3 -V command to check the Python 3 version.
                        If the following error is reported:
                        No module named '_sqlite3'

                        Ensure that SQLite 3.0 has been installed.
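
                 The two checks in the notice above can be scripted before the first run. The
                 following is a minimal sketch, not part of the Autotuner: the function names
                 python_version_ok, sqlite_module_ok, and preflight are hypothetical. Only the
                 python3 -V command, the required version 3.8.2, and the _sqlite3 module name
                 come from this guide.

```shell
#!/bin/sh
# Hypothetical preflight helpers; they are not shipped with the Bisheng compiler.

# Succeed when the text printed by `python3 -V` matches the required version.
# $1: output of `python3 -V`, for example "Python 3.8.2"
# $2: required version, for example "3.8.2"
python_version_ok() {
    [ "$1" = "Python $2" ]
}

# Succeed when the python3 found on PATH can import the SQLite bindings.
sqlite_module_ok() {
    python3 -c 'import _sqlite3' >/dev/null 2>&1
}

# Run both checks and print a short diagnosis for the first one that fails.
preflight() {
    required="3.8.2"
    found="$(python3 -V 2>&1)"
    if ! python_version_ok "$found" "$required"; then
        echo "python3 reports '$found', expected 'Python $required'"
        return 1
    fi
    if ! sqlite_module_ok; then
        echo "python3 cannot import _sqlite3; ensure that SQLite is installed"
        return 1
    fi
    echo "preflight ok"
}
```

                 Call preflight before llvm-autotune -h or auto-tuner -h. A non-zero return
                 status indicates which requirement in 2.2 Environment Requirements to check.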

2.4 Running the Autotuner

2.4.1 Running Modes
                 Currently, the Autotuner can be used in two modes with two different command
                 line tools, llvm-autotune and auto-tuner.

                 ●      llvm-autotune lets users drive the tuning process and provides auxiliary
                        functions to work with the compiler. Compared with auto-tuner, llvm-autotune
                        greatly simplifies configuration and the tuning procedure. llvm-autotune is
                        recommended because it works out of the box.
                 ●      The auto-tuner is a traditional tuning tool that manages the entire tuning
                        process. You need to adapt the configuration file to set the details during the
                        tuning, including how to compile and run code, and how to obtain the
                        performance information and tunable parameters.

                 The following sections use coremark as an example to describe how to perform
                 automatic tuning. The release package of the Bisheng compiler does not contain
                 coremark. Obtain coremark from the community. For details about the tools, see
                 4 Usage.


2.4.2 llvm-autotune (Recommended)
                 You can write tuning scripts as required. The following example script tunes
                 coremark over 20 iterations. The release package of the Bisheng compiler does
                 not contain coremark. Obtain coremark from the community.
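                 # Directory where tuning-related data is stored
                 export AUTOTUNE_DATADIR=/tmp/autotuner_data/
                 CompileCommand="clang -Ilinux64 -I. -g -DFLAGS_STR=\"\" -DITERATIONS=300000 core_list_join.c
                 core_main.c core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -o coremark"

                 # Initial compilation: generate the tuning opportunities
                 $CompileCommand -fautotune-generate
                 # Initialize the tuning task; the target is to minimize the metric
                 llvm-autotune minimize
                 for i in $(seq 20)
                 do
                   # Compile with the configuration generated for this iteration
                   $CompileCommand -fautotune
                   # Take the wall-clock time from the "real" line printed by time -p
                   time=$(/usr/bin/time -p ./coremark 0x0 0x0 0x66 300000 2>&1 1>/dev/null | grep real | awk '{print $2}')
                   echo "iteration: $i cost time: $time"
                   llvm-autotune feedback "$time"
                 done
                 # Stop tuning and save the optimal configuration
                 llvm-autotune finalize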

                 The steps are as follows:

         Step 1 Configuring environment variable

                 Use the environment variable AUTOTUNE_DATADIR to specify the storage
                 location of tuning-related data.
                 export AUTOTUNE_DATADIR=/tmp/autotuner_data/

         Step 2 Initial compilation procedure

                 Add the -fautotune-generate option to the Bisheng compiler to generate tuning
                 opportunities.
                 cd examples/coremark/
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 core_list_join.c core_main.c
                 core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -g -o coremark -fautotune-generate

                        NOTICE

                 Use this option only for the hotspot code files that require tuning. If the
                 application has too many code files (more than 500), a large number of tuning
                 opportunity files are generated. As a result, the initialization in Step 3 may take
                 several minutes. In addition, the huge search space lengthens the convergence
                 time and degrades the tuning result.

         Step 3 Initial tuning

                 Run the llvm-autotune command to initialize the tuning task and generate the
                 initial compilation configuration for the next compilation.
                 llvm-autotune minimize

                 minimize sets the tuning target to minimizing a metric such as program running
                 time. Use maximize instead to maximize a metric such as program throughput.


         Step 4 Tuning and compilation

                 Add the -fautotune option so that the Bisheng compiler reads the current
                 configuration from AUTOTUNE_DATADIR and compiles with it.
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 core_list_join.c core_main.c
                 core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -g -o coremark -fautotune

         Step 5 Performance feedback

                 You can run the program and obtain performance data based on your
                 requirements. Run the llvm-autotune feedback command to feed back the
                 performance data. For example, if you want to perform the tuning based on the
                 coremark running speed, run the following commands:
                 time -p ./coremark 0x0 0x0 0x66 300000 2>&1 1>/dev/null

                 llvm-autotune feedback 31.09

                        NOTICE

                 Before running the llvm-autotune feedback command, check that the compilation
                 in Step 4 succeeded and that the compiled program runs properly. If the
                 compilation or the run fails, enter the worst value of the tuning target. For
                 example, if the tuning target is minimize, enter llvm-autotune feedback 9999. If
                 the tuning target is maximize, enter 0 or -9999. Incorrect performance feedback
                 may affect the final tuning result.
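
                 The rule in the notice above can be captured in a few helper functions. This is
                 a minimal sketch under stated assumptions: the names parse_real_time,
                 worst_value, and feedback_value are hypothetical and are not part of
                 llvm-autotune. The worst values 9999 and 0 are the ones suggested in the notice.

```shell
#!/bin/sh
# Hypothetical helpers for choosing the value passed to `llvm-autotune feedback`.

# Print the value of the "real" line from `time -p` output read on standard input.
parse_real_time() {
    awk '$1 == "real" { print $2; exit }'
}

# Print the worst feedback value for a tuning target (minimize or maximize).
worst_value() {
    case "$1" in
        minimize) echo 9999 ;;
        maximize) echo 0 ;;
        *) echo "unknown tuning target: $1" >&2; return 1 ;;
    esac
}

# Print the measured value, or the worst value when the build or run failed.
# $1: tuning target
# $2: exit status of the compile-and-run step
# $3: measured value (may be empty)
feedback_value() {
    if [ "$2" -ne 0 ] || [ -z "$3" ]; then
        worst_value "$1"
    else
        echo "$3"
    fi
}
```

                 In a tuning script, the result would be passed on as, for example,
                 llvm-autotune feedback "$(feedback_value minimize "$status" "$measured")",
                 where status and measured are variables set by your own script.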

         Step 6 Tuning iteration

                 Repeat steps 4 and 5 for the specified number of iterations.

         Step 7 Stopping tuning

                 After multiple iterations, you can stop the tuning and save the optimal
                 configuration file. The configuration file is saved in the directory specified by the
                 environment variable AUTOTUNE_DATADIR.
                 llvm-autotune finalize

         Step 8 Final compilation

                 Use the optimal configuration file obtained in Step 7 to perform the final
                 compilation. If the environment variable is unchanged, you can use the
                 -fautotune option directly.
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 core_list_join.c core_main.c
                 core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -g -o coremark -fautotune

                 Alternatively, you can use the -mllvm -auto-tuning-input= option to point
                 directly to the configuration file.
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 core_list_join.c core_main.c
                 core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -g -o coremark -mllvm
                 -auto-tuning-input=/tmp/autotuner_data/config.yaml
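
                 A build script can choose between the two forms of the final compilation. The
                 following is a minimal sketch; the function name final_tuning_flags is
                 hypothetical, and only the -fautotune and -mllvm -auto-tuning-input= options
                 come from this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the compiler flags for the final compilation.
# $1 (optional): path to the optimal configuration file saved in Step 7.
final_tuning_flags() {
    if [ -n "${1:-}" ]; then
        # Point the compiler directly at the configuration file.
        printf '%s\n' "-mllvm -auto-tuning-input=$1"
    else
        # Rely on AUTOTUNE_DATADIR still pointing at the tuning data.
        printf '%s\n' "-fautotune"
    fi
}
```

                 The output is meant to be expanded unquoted at the end of the clang command
                 line, for example $(final_tuning_flags /tmp/autotuner_data/config.yaml), so
                 that it splits into separate arguments. This assumes that the path contains no
                 spaces.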

                 ----End

2.4.3 auto-tuner
                 Use the auto-tuner tool to manage the tuning process. The procedure is as
                 follows. It uses the configuration file for tuning coremark, which you can find
                 in the Bisheng software package at /lib/autotuner/config/coremark_sample.ini.

         Step 1 Generating a tuning opportunity list
                 Use the -mllvm -auto-tuning-opp= option of the Bisheng compiler to
                 generate a tuning opportunity list for the search space.
                 cd examples/coremark/
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 core_list_join.c core_main.c
                 core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -g -o coremark -mllvm
                 -auto-tuning-opp=opp

         Step 2 Parsing
                 Parse the tuning opportunity list to generate the search space.
                 cd ../..
                 auto-tuner parse ./examples/coremark/opp/* -o loop_search.yaml --type-filter loop

                 If you want to perform tuning only at the loop level, you can use the --type-filter
                 loop option to specify that only the loop search space is generated.
         Step 3 Running
                 Use the generated search space file to start automatic tuning.
                 auto-tuner run config/coremark_sample.ini --results-log module.log --stop-after 600 -ss loop_search.yaml
                 --time-after-convergence 300

                 You can use --stop-after or --time-after-convergence to set the tuning time. In
                 this example, the task will stop 600 seconds after the tuning starts, or 300 seconds
                 after no better configuration can be found.
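
                 The way the two limits combine can be written as a single predicate. This is
                 only an illustration of the rule described above, not the auto-tuner
                 implementation. The function name should_stop is hypothetical, and the sketch
                 assumes that tuning stops as soon as either limit is reached.

```shell
#!/bin/sh
# Hypothetical predicate: succeed (exit status 0) when tuning should stop.
# $1: seconds since tuning started
# $2: seconds since the best configuration so far was found
# $3: value given to --stop-after
# $4: value given to --time-after-convergence
should_stop() {
    [ "$1" -ge "$3" ] || [ "$2" -ge "$4" ]
}
```

                 With the values in this example, should_stop 400 300 600 300 succeeds because
                 no better configuration was found for 300 seconds, even though the 600-second
                 limit has not been reached.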

                         NOTE

                        If the following error occurs:
                        /bin/sh: config/../../../bin/clang not found
                        This error occurs because BinPath in config/coremark_sample.ini is set incorrectly.
                        Change the value to the bin directory of the Bisheng compiler.

                 ----End

                 Alternatively, run the auto_run command, which completes the preceding three
                 steps automatically: it generates a tuning opportunity list, parses the list into
                 a search space, and then starts tuning. Command:
                 auto-tuner auto_run config/coremark_sample.ini --results-log coremark.log --stop-after 600


                 The auto_run command performs automatic tuning in three phases (module ->
                 function -> loop). In each phase, parameters are adjusted at one granularity
                 level (module, function, loop, or machine_basic_block).

                         NOTE

                        If you want to tune only at a specific fine-grained level, use the --stage-order option (for
                        example, --stage-order loop).

2.5 Uninstalling the Autotuner
                 Edit the PATH environment variable and remove the Bisheng compiler path that
                 you added, for example /opt/compiler/bisheng-compiler-1.3.3-aarch64-linux/bin.
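
                 For the current shell session, the entry can be removed without editing PATH by
                 hand. The following is a minimal sketch; the function name remove_path_entry is
                 hypothetical, and the sketch assumes that PATH entries contain no wildcard
                 characters.

```shell
#!/bin/sh
# Hypothetical helper: print the colon-separated list $1 without entries equal to $2.
remove_path_entry() {
    rpe_result=""
    rpe_old_ifs="$IFS"
    IFS=:
    # Unquoted on purpose: with IFS set to ":", $1 is split into its entries.
    for rpe_entry in $1; do
        if [ "$rpe_entry" != "$2" ]; then
            rpe_result="${rpe_result:+$rpe_result:}$rpe_entry"
        fi
    done
    IFS="$rpe_old_ifs"
    printf '%s\n' "$rpe_result"
}
```

                 For example, PATH="$(remove_path_entry "$PATH" /opt/compiler/bisheng-compiler-1.3.3-aarch64-linux/bin)"
                 updates the current session. If the export command was added to a shell startup
                 file, delete it from that file as well.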


                                                               3         Preparations

         Step 1 Install the Autotuner. For more information, see 2 Quick Start.
         Step 2 The Autotuner must be used with a compiler that supports tuning.
                 Before running the Autotuner, check whether the environment variable of the
                 compiler is correctly set. Alternatively, you can put the environment variable in the
                 configuration file. For details, see 4 Usage.

                 ----End


                                                                                 4       Usage

                 4.1 llvm-autotune (Recommended)
                 4.2 auto-tuner

4.1 llvm-autotune (Recommended)

4.1.1 Tool Introduction
                 Currently, the Autotuner can be used in two modes with two different command
                 line tools, llvm-autotune and auto-tuner.
                 llvm-autotune lets users drive the tuning process and provides auxiliary
                 functions to work with the compiler. Compared with auto-tuner, llvm-autotune
                 greatly simplifies configuration and the tuning procedure. llvm-autotune is
                 recommended because it works out of the box.

4.1.2 Help Information
                 Help command: llvm-autotune -h. The execution format of the llvm-autotune is
                 as follows:
                 llvm-autotune [-h] {minimize,maximize,feedback,dump,finalize}

                 Optional instructions:
                 ●      minimize: initializes tuning and generates an initial compiler configuration
                        file to minimize indicators (such as running time).
                 ●      maximize: initializes tuning and generates the initial compiler configuration
                        file to maximize indicators (such as throughput).
                 ●      feedback: feeds back the performance optimization result and generates new
                        compiler configuration.
                 ●      dump: generates the optimal configuration without stopping the tuning
                        (feedback can be continued).
                 ●      finalize: stops tuning and generates the optimal compiler configuration
                        (feedback cannot be executed afterwards).
                 Help information.


                 ●      --help/-h
                 usage: llvm-autotune [-h] {minimize,maximize,feedback,dump,finalize} ...

                 positional arguments:
                   {minimize,maximize,feedback,dump,finalize}
                     minimize    Initialize tuning and generate the initial compiler
                                 configuration file, aiming to minimize the metric
                                 (e.g. run time)
                     maximize    Initialize tuning and generate the initial compiler
                                 configuration file, aiming to maximize the metric
                                 (e.g. throughput)
                     feedback    Feed back performance tuning result and generate a new
                                 test configuration
                     dump        Dump the current best configuration without
                                 terminating the tuning run
                     finalize    Finalize tuning and generate the optimal compiler
                                 configuration

                 optional arguments:
                   -h, --help    show this help message and exit

4.1.3 Compiler-related Options
                 llvm-autotune must be used with the -fautotune-generate and -fautotune
                 options of the Bisheng compiler.

                 ●      -fautotune-generate:
                        –    The tuning opportunity list is generated in the autotune_datadir
                             directory. The default directory can be modified by the environment
                             variable AUTOTUNE_DATADIR.
                        –    As the first step of tuning preparation, you need to use the option before
                             running the llvm-autotune minimize/maximize command.
                        –    You can also assign a value to this option to change the tuning
                             granularity. The options are Other, Function, Loop, and
                             MachineBasicBlock. For example, -fautotune-generate=Function
                             enables the tuning opportunities of the function type. Each function is
                             assigned a different parameter value during tuning. Other indicates
                             global. The generated tuning opportunities correspond to compilation
                             units (code files).
                             -fautotune-generate is equivalent to
                             -fautotune-generate=Function,Loop by default. The default value is
                             recommended.
                 ●      -fautotune:
                        –    Use the compiler configuration in the autotune_datadir directory for
                             tuning and compilation. (The default directory can be modified by the
                             environment variable AUTOTUNE_DATADIR.)
                        –    This option is used after the llvm-autotune minimize/maximize/
                             feedback command is run during tuning iteration.
                         NOTE

                        For details, see 2.4.2 llvm-autotune (Recommended).
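The order in which the compiler options and the llvm-autotune instructions are combined can be sketched as a small Python driver. This is an illustrative sketch, not a tool shipped with the Autotuner. The function and parameter names (tuning_session, execute, measure) are hypothetical, and passing the metric to feedback as a positional argument is an assumption; check llvm-autotune feedback -h for the actual form.

```python
from typing import Callable, List, Sequence

Command = List[str]


def tuning_session(
    compile_cmd: Sequence[str],
    iterations: int,
    execute: Callable[[Command], None],
    measure: Callable[[], float],
) -> None:
    """Drive one llvm-autotune session that minimizes a measured metric."""
    # Step 1: generate the tuning opportunity list.
    execute([*compile_cmd, "-fautotune-generate"])
    # Step 2: initialize tuning and create the initial compiler configuration.
    execute(["llvm-autotune", "minimize"])
    for _ in range(iterations):
        # Step 3: rebuild with the current configuration, then measure it.
        execute([*compile_cmd, "-fautotune"])
        metric = measure()
        # Step 4: report the metric (argument form is an assumption).
        execute(["llvm-autotune", "feedback", str(metric)])
    # Step 5: stop tuning and write the optimal configuration.
    execute(["llvm-autotune", "finalize"])
```

To run the commands for real, pass a function such as lambda cmd: subprocess.run(cmd, check=True) as execute, and a measure function that runs the program and returns, for example, its run time in seconds.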

4.2 auto-tuner


4.2.1 Tool Introduction
                 Currently, the Autotuner can be used in two modes with two different command
                 line tools, llvm-autotune and auto-tuner.
                 The auto-tuner is a traditional tuning tool that manages the entire tuning process.
                 You need to adapt the configuration file to set the details during the tuning,
                 including how to compile and run code, and how to obtain the performance
                 information and tunable parameters.

4.2.2 Help Information
                 Help command: auto-tuner -h. The execution format of auto-tuner is as follows:
                 auto-tuner [-h] {run,merge,divide,parse,auto_run} ...

                 Optional instructions:
                 ●      run: runs the tuner.
                 ●      merge: merges multiple compilation configuration files.
                 ●      divide: divides a compilation configuration file into multiple files based on the
                        source code file name in the configuration file.
                 ●      parse: parses the tuning opportunity list to generate the search space.
                 ●      auto_run (recommended): automatically generates the search space and
                        performs the tuning by phase. The default phase sequence is module >
                        function > loop.
                        The three main instructions are parse, run, and auto_run.
                 Help information.
                 ●      --help/-h
                 usage: auto-tuner [-h] {run,merge,divide,parse,auto_run} ...

                 positional arguments:
                   {run,merge,divide,parse,auto_run}
                                 commands help
                     run         Run the tuner
                     merge       Merge LLVM configuration input files
                     divide      Divide LLVM configuration input file into multiple
                                 files based on file_name
                     parse       Parse the tuning opportunity files and generate search
                                 space
                     auto_run    (recommended) auto-generate the search space and run
                                 the auto-phase-based tuning (the default order of
                                 stages is module -> function -> loop)

                 optional arguments:
                   -h, --help    show this help message and exit

4.2.3 Parse Instruction

4.2.3.1 Usage of the Parse Instruction
                 The parse instruction is used to parse the tuning opportunity list and generate the
                 search space. The format of the parse instruction is as follows:
                 auto-tuner parse <opp_file> ...


                  Mandatory parameter:
                  ●     opp_file: tuning opportunity file generated by the compiler
                  Common optional parameter:
                  ●     --output/-o <file>: specifies the path of the output file.
                  Help information:
                  ●     --help/-h
                  positional arguments:
                   opp_file         Opportunity files generated by LLVM

                  optional arguments:
                    -h, --help         show this help message and exit
                    --parse-format [{xml,yaml}]
                                    choose the format of LLVM auto-tuning-
                                    input/opp,(default: yaml)
                    -nf Name [Name ...], --name-filter Name [Name ...]
                                    to filter code regions by names when generating search
                                    space
                    --func-name-filter Name [Name ...]
                                    to filter code regions by function names when
                                    generating search space
                    --file-name-filter Name [Name ...]
                                    to filter code regions by file names when generating
                                    search space
                    -scf SEARCH_CONFIG_FILE, --search-config-file SEARCH_CONFIG_FILE
                                    The Search space config file
                    -o FILE, --output FILE
                                    output file
                    -tf {machine_basic_block,loop,function,module} [{machine_basic_block,loop,function,module} ...], --type-filter {machine_basic_block,loop,function,module} [{machine_basic_block,loop,function,module} ...]
                                    to filter code regions by types when generating search
                                    space

4.2.3.2 Filters
                  When the search space is generated, the code regions in the opp file can be
                  filtered based on the region name, function name, file name, and type. If no filter
                  is applied, the search space will contain all code regions. The format of the
                  instruction is as follows:
                  --name-filter Region name 1 Region name 2 Region name 3
                  --func-name-filter Function name 1 Function name 2 Function name 3
                  --file-name-filter File name 1 File name 2 File name 3
                  --type-filter Type name 1 Type name 2 Type name 3

                        NOTICE

                  These options filter the code regions by matching the text information in the opp
                  file.

                  For example, use file_name to filter the following code regions:
                  --- !AutoTuning
                  Pass:        machine-scheduler
                  Name:         '%bb.2:if.end'
                  DebugLoc:       { File: core_list_join.c, Line: 287, Column: 7 }
                  Function:     core_list_insert_new
                  CodeRegionType: machine_basic_block
                  ...


                 Select the correct value for --file-name-filter from the following options:
                 ●      [×] ./core_list_join.c
                 ●      [×] /home/user/coremark/core_list_join.c
                 ●      [√] core_list_join.c
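The reason only the bare file name matches can be illustrated with a short Python sketch. It assumes that the filter compares the given text for exact equality with the File field recorded in the opp file; the real matching logic of auto-tuner is not shown in this guide, and filter_by_file_name is a hypothetical helper.

```python
from typing import Dict, Iterable, List

# The code region from the example above, reduced to the fields used for filtering.
REGION = {
    "name": "%bb.2:if.end",
    "file": "core_list_join.c",
    "function": "core_list_insert_new",
    "type": "machine_basic_block",
}


def filter_by_file_name(
    regions: Iterable[Dict[str, str]], file_names: Iterable[str]
) -> List[Dict[str, str]]:
    """Keep the regions whose recorded file name equals one of the given names."""
    wanted = set(file_names)
    return [region for region in regions if region["file"] in wanted]


for candidate in ("./core_list_join.c",
                  "/home/user/coremark/core_list_join.c",
                  "core_list_join.c"):
    matched = bool(filter_by_file_name([REGION], [candidate]))
    print(f"{candidate}: {'match' if matched else 'no match'}")
```

Under this assumption, a path prefix such as ./ or /home/user/coremark/ makes the text differ from the File field, so the region is not selected.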

4.2.3.3 Search Configuration File
                 The search configuration file defines global parameter settings for each type of
                 code region. You can use --search-config-file to specify a personalized search
                 configuration file. If --search-config-file is not specified, the Autotuner uses the
                 default search configuration file.
                 The content of the default search configuration file is as follows:
                 CodeRegion:
                   CodeRegionType: loop
                   Args:
                     VectorizationInterleave:
                       value: [1, 2, 4]
                       type: enum
                     UnrollCount:
                       value: [0, 1, 2, 4, 8]
                       type: enum
                     PeelCount:
                       value: [0, 1]
                       type: enum
                 ---
                 CodeRegion:
                   CodeRegionType: machine_basic_block
                   Args:
                     MachineScheduling:
                       value: ["TopDown", "BottomUp", "Bidirectional"]
                       type: enum
                 ---
                 CodeRegion:
                   CodeRegionType: function
                   Args:
                     InlineThreshold:
                       value: [175, 225, 275, 325, 375, 425, 500]
                       type: enum
                 ---
                 CodeRegion:
                   CodeRegionType: other
                   Args:
                     OptPass:
                       type: selection
                       value: [ipsccp, globalopt, mem2reg, deadargelim, instcombine, simplifycfg, prune-eh, inline, functionattrs,
                         argpromotion, sroa, jump-threading, simplifycfg, aggressive-instcombine, instcombine, tailcallelim, simplifycfg,
                         reassociate, loop-simplify, lcssa, loop-rotate, licm, loop-unswitch, simplifycfg, instcombine, loop-simplify,
                         lcssa, indvars, loop-deletion, loop-unroll, gvn, memcpyopt, sccp, instcombine, jump-threading, dse, loop-simplify,
                         lcssa, licm, simplifycfg, instcombine, globalopt, globaldce, loop-simplify, lcssa, loop-rotate, loop-simplify, instcombine,
                         simplifycfg, instcombine, loop-simplify, lcssa, loop-unroll, instcombine, loop-simplify, lcssa, licm, strip-dead-prototypes,
                         globaldce, constmerge, loop-simplify, lcssa, simplifycfg]

                 When configuring the personalized search configuration file, refer to the preceding
                 default search configuration file.
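When you adjust the value lists, it helps to know how many combinations they produce. The following Python sketch counts the combinations per code region for the enum parameters of the default file. It assumes that every parameter of a type is applied to every region of that type and that the values are combined independently; the names DEFAULT_ENUM_ARGS and combinations_per_region are illustrative.

```python
from math import prod
from typing import Dict, List

# Enum parameters copied from the default search configuration file.
DEFAULT_ENUM_ARGS: Dict[str, Dict[str, List]] = {
    "loop": {
        "VectorizationInterleave": [1, 2, 4],
        "UnrollCount": [0, 1, 2, 4, 8],
        "PeelCount": [0, 1],
    },
    "machine_basic_block": {
        "MachineScheduling": ["TopDown", "BottomUp", "Bidirectional"],
    },
    "function": {
        "InlineThreshold": [175, 225, 275, 325, 375, 425, 500],
    },
}


def combinations_per_region(args: Dict[str, List]) -> int:
    """Number of distinct parameter assignments for one code region."""
    return prod(len(values) for values in args.values())


for region_type, args in DEFAULT_ENUM_ARGS.items():
    print(region_type, combinations_per_region(args))
```

Under these assumptions, a single loop region has 3 x 5 x 2 = 30 combinations, and N loop regions have 30 to the power of N, which is why filters and shorter value lists reduce tuning time noticeably.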


Important Configuration Attributes
                  Key                                  Value

                  CodeRegionType                       other, loop, function, machine_basic_block

                  type                                 bool, enum, range, permutation, selection

Variable Type
                 ●      bool: indicates a parameter of the Boolean type.
                        Args:
                          ParamName:
                              type: bool

                 ●      enum: indicates a parameter that takes one value from an unordered set.
                        The tuner selects a value from the specified set.
                        Args:
                          ParamName:
                              type: enum
                              value: [0, 2, 4, 8]

                 ●      range: indicates a parameter whose value is an integer within the valid range
                        (from 0 to 255). The minimum value and maximum value must be specified.
                        Args:
                          ParamName:
                              type: range
                              min: 1
                              max: 6

                 ●      permutation: indicates a permutation parameter. The elements in value
                        are reordered to form a permutation.
                        Args:
                          ParamName:
                              type: permutation
                              value: [option1, option2, option3, option4]

                 ●      selection: indicates a selection parameter. Any number of elements are
                        selected from value and arranged in any order.
                        Args:
                          ParamName:
                              type: selection
                              value: [option1, option2, option4, option5]
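The five variable types can be read as five ways of drawing a candidate value. The following Python sketch draws one random value per type. It is an illustration only: the Autotuner uses search techniques rather than plain random sampling, sample_parameter is a hypothetical function, and treating min and max as inclusive and allowing an empty selection are assumptions.

```python
import random
from typing import Any, Dict


def sample_parameter(spec: Dict[str, Any], rng: random.Random) -> Any:
    """Draw one candidate value for a parameter described like an Args entry."""
    kind = spec["type"]
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "enum":
        # One element of the set.
        return rng.choice(spec["value"])
    if kind == "range":
        # Integer between min and max (assumed inclusive).
        return rng.randint(spec["min"], spec["max"])
    if kind == "permutation":
        # All elements, in a new order.
        return rng.sample(spec["value"], k=len(spec["value"]))
    if kind == "selection":
        # Any number of elements (assumed to include none), in any order.
        count = rng.randint(0, len(spec["value"]))
        return rng.sample(spec["value"], k=count)
    raise ValueError(f"unknown parameter type: {kind}")


rng = random.Random(0)
print(sample_parameter({"type": "range", "min": 1, "max": 6}, rng))
print(sample_parameter({"type": "selection",
                        "value": ["option1", "option2", "option3"]}, rng))
```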

4.2.3.4 Parse Example
                 Run the following command as an example:
                 auto-tuner parse opp1.yaml opp2.yaml opp3.yaml -o search_space.yaml --type-filter loop module

                 ●      opp1.yaml opp2.yaml opp3.yaml is a tuning opportunity list generated by
                        the compiler through -auto-tuning-opp.
                 ●      -o search_space.yaml is used to generate the search space file
                        search_space.yaml, which will be used as the input of the run instruction.
                 ●      --type-filter loop module keeps only the code regions of the loop and
                        module types in the search space.

4.2.4 Run Instruction


4.2.4.1 Running the Tuner
                 The format of the run instruction is as follows:
                 auto-tuner run <config_file> --search_space <search_space>

                 Mandatory parameters:
                 ●      config_file: tuning configuration file, which is used to configure the
                        compilation and running methods and related paths.
                 ●      --search_space <search_space>: search space file, which is generated by the
                        parse instruction.
                 Common optional parameters:
                 ●      --results-log <results_log>: log file, which records the information
                        generated each time a new optimal configuration is found.
                 ●      --results-log-details <results_log_details>: log file, which records
                        information about each iteration.
                 ●      --test-limit <test_limit>: maximum number of iterations.
                 ●      --stop-after <stop_after>: stops the tuning after the specified time
                        (in seconds).
                 ●      --time-after-convergence <time>: stops the tuning if no better compilation
                        configuration is found within the specified time (in seconds).
                 Help information:
                 ●      --help/-h
                 positional arguments:
                  config_file      The tuning config file.

                 optional arguments:
                  -h, --help          show this help message and exit
                  --machine-class MACHINE_CLASS
                                  name of the machine class being run on
                  --parallel-compile present if compiling can be done in parallel
                  --test-limit TEST_LIMIT
                                  stop tuning after given tests count
                  --stop-after STOP_AFTER
                                  stop tuning after given seconds
                  --parallelism PARALLELISM
                                  how many tests to support at once
                  --pipelining PIPELINING
                                  how long a delay (in generations) before results are
                                  available
                  --bail-threshold BAIL_THRESHOLD
                                  abort if no requests have been made in X generations
                  --no-dups            don't print out warnings for duplicate requests
                  --seed-configuration FILENAME
                                  Start search at a given configuration. Can be
                                  specified multiple times. Configurations are loaded
                                  with ConfigurationManipulator.load_from_file() and
                                  file format is detected from extension.
                  --results-log RESULTS_LOG
                                  file to store log of the best configuration times
                  --results-log-details RESULTS_LOG_DETAILS
                                  file to store log of the non-best configuration times
                  --quiet            print less information
                  --display-frequency DISPLAY_FREQUENCY
                                  how often for DisplayPlugin to print
                  --technique TECHNIQUE, -t TECHNIQUE
                                  which technique to use
                  --list-techniques, -lt
                                    list techniques available and exit
                  --generate-bandit-technique, -gbt
                                    randomly generate a bandit to use
                  --label LABEL           name for the TuningRun
                  --print-search-space-size
                                    Print out the estimated size of the search space and
                                    exit
                  --database DATABASE database to store tuning results in, see: http://docs.
                                    sqlalchemy.org/en/rel_0_8/core/engines.html#database-
                                    urls
                  --print-params, -pp show parameters of the configuration being tuned
                  --time-after-convergence TIME, -tac TIME
                                    stop tuning if no new best results after given
                                    seconds
                  -o DIR, --output DIR write " "optimal yaml config into the given directory
                  --parse-format [{xml,yaml}]
                                    choose the format of LLVM auto-tuning-
                                    input/opp,(default: yaml)
                  --plugin-dir DIR        specify the dir to load customized tuner scripts
                  -tr TUNER, --tuner TUNER
                                    Select which tuner to use
                  -lr, --list-tuners List all available tuners
                  --add-llvm-inputs ADD_LLVM_INPUTS [ADD_LLVM_INPUTS ...]
                                    add existing llvm configuration input files
                                    asconstants in addition to the llvm
                                    configurations generated in each iteration of the
                                    tuning run
                  -ss SEARCH_SPACE, --search_space SEARCH_SPACE
                                    The search space file.
                  --enable-final-compile
                                    perform final compilation with optimal config at the
                                    end of tuning
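The parse and run instructions are usually chained: parse writes the search space file and run consumes it. The following Python sketch only assembles the two command lines from options listed in the help output above. The helper names parse_command and run_command and the file names in the usage example are hypothetical.

```python
from typing import List, Optional, Sequence


def parse_command(
    opp_files: Sequence[str], output: str, type_filter: Sequence[str] = ()
) -> List[str]:
    """Build an 'auto-tuner parse' command line."""
    cmd = ["auto-tuner", "parse", *opp_files, "-o", output]
    if type_filter:
        cmd += ["--type-filter", *type_filter]
    return cmd


def run_command(
    config_file: str,
    search_space: str,
    test_limit: Optional[int] = None,
    stop_after: Optional[int] = None,
    results_log: Optional[str] = None,
) -> List[str]:
    """Build an 'auto-tuner run' command line with common optional limits."""
    cmd = ["auto-tuner", "run", config_file, "--search_space", search_space]
    if test_limit is not None:
        cmd += ["--test-limit", str(test_limit)]
    if stop_after is not None:
        cmd += ["--stop-after", str(stop_after)]
    if results_log is not None:
        cmd += ["--results-log", results_log]
    return cmd


print(" ".join(parse_command(["opp1.yaml", "opp2.yaml"], "search_space.yaml", ["loop"])))
print(" ".join(run_command("coremark.ini", "search_space.yaml", test_limit=50)))
```

The resulting lists can be passed to subprocess.run(cmd, check=True) to execute the instructions in sequence.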

4.2.4.2 Configuration File
                 You need to modify the configuration file, including the system environment
                 variable, compilation information, and running information. For details, see the
                 examples in the Bisheng software package directory /lib/autotuner/config.
                 The following is an example of the configuration file for coremark tuning:
                 # variables that can be shared in all the sections below
                 [DEFAULT] # optional
                 # Home = /path/to/your/home

                 # change your environment variables
                 [Environment Setting] # optional
                 # prepend a list of paths into the PATH in order.
                 # PATH = /path/to/bin
                 # you can also set other environment variables here too.

                 [Compiling Setting] # required
                 # NOTE: ConfigFilePath is set to the path to the current config file automatically by default.
                 CompileDir = %(ConfigFilePath)s/../examples/coremark/

                 # Specify where autotuner will generate the compilation config (LLVM input file).
                 # This will be passed to the compiler with -auto-tuning-input.
                 LLVMInputFile = %(CompileDir)s/input.yaml

                 BinPath = %(ConfigFilePath)s/../../../bin/
                 CompileCommand = %(BinPath)s/clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 -g core_list_join.c core_main.c core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -o coremark -mllvm -auto-tuning-input=%(LLVMInputFile)s

                 RunDir = %(CompileDir)s
                 RunCommand = ./coremark 0x0 0x0 0x66 300000 # run 300000 iterations for coremark

                 # OppDir and OppCompileCommand are optional, do not have to specify this if not using auto_run sub-command.
                 # Specify where autotuner will parse tuning opportunity files from.
                 # This should be set to where the compiler generate tuning opportunity files with -auto-tuning-opp.
                 OppDir = %(CompileDir)s/opp

                 # both -auto-tuning-input and -mllvm -auto-tuning-opp=opp need to be used in the OppCompileCommand directly or indirectly.
                 # -auto-tuning-input is also needed here because auto_run can invoke multiple stages of tuning runs. The later stage needs to take the previous stage's best config to generate tuning opportunities.
                 OppCompileCommand = %(CompileCommand)s -mllvm -auto-tuning-opp=%(OppDir)s
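The %(Name)s references in the configuration file use the same syntax as the interpolation of Python's configparser module. The following sketch shows how such references resolve, assuming the Autotuner resolves them in the same way. ConfigFilePath is set by hand to a hypothetical path here because the sketch does not run inside the Autotuner, and resolve is an illustrative helper.

```python
import configparser
import posixpath

# Reduced configuration; the ConfigFilePath value is a hypothetical example.
EXAMPLE = """\
[DEFAULT]
ConfigFilePath = /opt/example/config

[Compiling Setting]
CompileDir = %(ConfigFilePath)s/../examples/coremark/
LLVMInputFile = %(CompileDir)s/input.yaml
"""


def resolve(text: str, section: str, option: str) -> str:
    """Return an option with all %(Name)s references expanded and the path normalized."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)
    return posixpath.normpath(parser.get(section, option))


print(resolve(EXAMPLE, "Compiling Setting", "LLVMInputFile"))
```

Values defined in [DEFAULT] are visible in every section, which is why CompileDir can refer to ConfigFilePath and LLVMInputFile can in turn refer to CompileDir.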

4.2.4.3 Tuners
                 A tuner is an instance used to define specific tuning behavior, including
                 initialization, compilation, running, and testing. The behavior needs to be defined
                 in different ways depending on the specific tuning task objectives. Therefore, we
                 have multiple tuners for different objectives. You can find the sample file of the
                 customized tuner in the Bisheng software package directory /lib/autotuner/
                 plugin/.

                 ●      Create a customized tuner.
                        You can write a Python file that inherits from the parent class
                        CustomTunerBase and overrides methods as required to create a customized
                        tuner. To register a customized tuner, the Python file name must end with
                        the suffix _tuner.py (for example, xxx_tuner.py), and the file must be
                        placed in the tuner plug-in directory.
                 ●      Use your own tuner plug-in.
                        If you need to use your own tuner plug-in when running the auto-tuner
                        instruction, use the following option to specify the plug-in directory where the
                        user-defined tuner is located:
                        --plugin-dir <dir>

                 ●      Select the tuner you want to use.
                        --tuner <tuner> (or -tr <tuner>)

                        If you do not specify the tuner to be used, SimpleTuner is used by default.
                 ●      Check all tuners.
                        If you want to check all tuners, run the following instruction to list all tuners:
                        --list-tuners (or -lr)
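The naming rule for customized tuners can be checked before you pass a directory to --plugin-dir. The following Python sketch lists the files in a directory whose names end with _tuner.py. It only illustrates the naming convention; it is not the loader used by auto-tuner, and find_tuner_plugins is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def find_tuner_plugins(plugin_dir: str) -> List[str]:
    """Return the file names in plugin_dir that follow the xxx_tuner.py convention."""
    return sorted(
        path.name
        for path in Path(plugin_dir).glob("*_tuner.py")
        if path.is_file()
    )


if __name__ == "__main__":
    # Lists candidate tuner plug-ins in the current directory.
    print(find_tuner_plugins("."))
```

A file named coremark_tuner.py is listed, whereas helper.py or tuner.py is not, because neither ends with _tuner.py.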

                 The following is an example of a customized tuner for coremark tuning.
                 import os

                 from opentuner import Result
                 from opentuner.search.objective import MinimizeCycle
                 from autotuner.tuners.tunerbase import CustomTunerBase


                 class Tuner(CustomTunerBase):

                     # The run method runs opentuner under the given configuration
                     # and returns the calculated performance under this configuration
                     def run(self, desired_result, input, limit):
                         """
                         Compile and run a given configuration then
                         return performance
                         """
                         # Default metric if the run fails or cannot be validated
                         cycles = float('inf')

                         # Run the executable with a 120-second limit
                         run_result = self.call_program(self.run_cmd, cwd=self.run_dir, limit=120)

                         # Check whether the program was compiled and run successfully
                         if run_result['returncode'] == 0:
                             std = run_result['stdout']
                             if "Correct operation validated." in std:
                                 # The third line of the coremark output holds the tick count
                                 cycles_line = std.strip().splitlines()[2]
                                 cycles = int(cycles_line.replace('Total ticks     :', ''))
                             else:
                                 # Save the output of the failed validation for later inspection
                                 if not os.path.isdir('errors_log'):
                                     os.mkdir('errors_log')

                                 with open("errors_log/errors_" + str(desired_result.configuration.id) + ".log", 'w') as file:
                                     file.write(std)
                                 print('coremark errors detected')
                         else:
                             self._print_errors(self.run_cmd, run_result)

                         return Result(cycle=cycles, time=run_result['time'])

                     def objective(self):
                         """
                         Override the default objective MinimizeTime
                         """
                         return MinimizeCycle()
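The example tuner reads the tick count from a fixed line position. If the output layout of your benchmark build differs, a pattern-based parser is more tolerant. The following Python sketch is an alternative to the line-index approach, not the method used in the shipped tuner. The sample output is a shortened, hypothetical excerpt, and parse_total_ticks is an illustrative helper.

```python
import re
from typing import Optional

# Matches a line such as "Total ticks      : 14230" regardless of the padding.
_TICKS = re.compile(r"^Total ticks\s*:\s*(\d+)\s*$", re.MULTILINE)


def parse_total_ticks(stdout: str) -> Optional[int]:
    """Return the tick count of a validated coremark run, or None otherwise."""
    if "Correct operation validated." not in stdout:
        return None
    match = _TICKS.search(stdout)
    return int(match.group(1)) if match else None


# Hypothetical, shortened output for illustration only.
SAMPLE = """\
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 14230
Total time (secs): 14.230000
Correct operation validated. See README.md for run and reporting rules.
"""

print(parse_total_ticks(SAMPLE))
```

Inside run(), a None result can be treated like the failure branch, so that cycles keeps its initial value float('inf').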

                 To automatically tune coremark, you need to run the executable file, parse the
                 stdout result, and use cycle as the metric. Therefore, run() and objective() need
                 to be overridden in the subclass. For more detailed examples, see the
                 scripts in the release package directory plugin/. Currently, the following metrics
                 are supported:
                 ●      time (required)
                 ●      cycle (optional)
                 ●      rate (optional)
                 The metrics must be passed in a Result object, which is the return value of the
                 run() function. For example:
                 return Result(rate=rate, time=run_result['time'])

                 The tuning objectives corresponding to the three metrics are as follows:
                 ●      MinimizeTime()
                 ●      MinimizeCycle()
                 ●      MaximizeRate()
                 For example, if MinimizeTime() is used as the tuning objective, the smaller the
                 Result.time value obtained after the run() function is executed in each iteration,
                 the better the compilation configuration used in this iteration.
                 If MaximizeRate() is used as the tuning objective, the greater the Result.rate
                 value obtained after the run() function is executed in each iteration, the better the
                 compilation configuration used in this iteration.
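The relationship between metrics and objectives can be sketched without the tuning framework. In the following Python example, each iteration is a plain dictionary and the objective decides which metric is compared and in which direction. The real comparison is performed inside the tuner; OBJECTIVES, best_iteration, and the sample values are illustrative.

```python
from typing import Callable, Dict, List, Tuple

# Objective name -> (metric to compare, function that picks the better value).
OBJECTIVES: Dict[str, Tuple[str, Callable]] = {
    "MinimizeTime": ("time", min),
    "MinimizeCycle": ("cycle", min),
    "MaximizeRate": ("rate", max),
}


def best_iteration(results: List[Dict[str, float]], objective: str) -> Dict[str, float]:
    """Return the iteration result that is best under the given objective."""
    metric, pick = OBJECTIVES[objective]
    return pick(results, key=lambda result: result[metric])


# Made-up results of three iterations.
runs = [
    {"time": 12.0, "cycle": 14230, "rate": 21000.0},
    {"time": 11.5, "cycle": 13900, "rate": 21700.0},
    {"time": 12.4, "cycle": 14610, "rate": 20400.0},
]
print(best_iteration(runs, "MinimizeCycle"))
print(best_iteration(runs, "MaximizeRate"))
```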

4.2.4.4 Search Space File
                 The search space file is a necessary parameter of the run instruction. It defines the
                 detailed search space (such as the code regions and parameters) for the tuning
                 task. The file can be generated from the tuning opportunity list generated by the
                 compiler using the parse instruction.


                         NOTE

                        To specify a search space, use --search-space or -ss.
                        Example: -ss SEARCH_SPACE_FILE

                 The following is an example of a search space file in YAML format:
                 code_region:
                   code_region_type: loop
                   debug_loc:
                     column: 13
                     file_name: core_list_join.c
                     line: 453
                   func_name: core_list_init
                   name: while.cond7.i.outer
                   pass_name: loop-unroll
                 params:
                   PeelCount:
                     type: enum
                     value: [0,1]
                   UnrollCount:
                     type: enum
                     value: [0,1,2,4,8]
                   VectorizationInterleave:
                     type: enum
                     value: [1,2,4]
                 tuning_id: 1
                 ---
                 code_region:
                   code_region_type: loop
                   debug_loc:
                     column: 13
                     file_name: core_list_join.c
                     line: 443
                   func_name: core_list_init
                   name: for.body.i
                   pass_name: loop-vectorize
                 params:
                   PeelCount:
                     type: enum
                     value: [0,1]
                   UnrollCount:
                     type: enum
                     value: [0,1,2,4,8]
                   VectorizationInterleave:
                     type: enum
                     value: [1,2,4]
                 tuning_id: 2

                 The search space file is very similar to the search configuration file, except that
                 each specific code region has its own set of parameters.
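                 To illustrate what "a set of parameters per code region" means for the size of
                 the tuning task, the following sketch mirrors the two code regions of the example
                 file as Python dictionaries and counts the possible combinations. The dictionary
                 layout and the function name region_size are assumptions made for this
                 illustration; the count is a plain product of the enum sizes and is not
                 necessarily the estimate that the tool itself reports.

```python
from math import prod

# The two code regions of the example file, written as Python dictionaries.
# Only the fields needed for counting are kept.
search_space = [
    {
        "tuning_id": 1,
        "name": "while.cond7.i.outer",
        "params": {
            "PeelCount": [0, 1],
            "UnrollCount": [0, 1, 2, 4, 8],
            "VectorizationInterleave": [1, 2, 4],
        },
    },
    {
        "tuning_id": 2,
        "name": "for.body.i",
        "params": {
            "PeelCount": [0, 1],
            "UnrollCount": [0, 1, 2, 4, 8],
            "VectorizationInterleave": [1, 2, 4],
        },
    },
]


def region_size(region):
    """Number of parameter combinations for a single code region."""
    return prod(len(values) for values in region["params"].values())


sizes = {region["tuning_id"]: region_size(region) for region in search_space}
total = prod(sizes.values())  # regions are tuned jointly, so the sizes multiply
print(sizes, total)
```

                 Because every region multiplies the number of combinations, limiting the
                 search space to the regions of interest keeps the tuning task manageable.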

4.2.4.5 Algorithm
                 You can specify a search algorithm to run automatic tuning.

                 For example, when you use automatic tuning for debugging, you can select the
                 SimpleTraverse algorithm, which traverses all parameter values and changes
                 only one parameter value at a time.

                 ●      List all algorithms.
                        --list-techniques


                 ●      Use a specific algorithm (for example, SimpleTraverse).
                        --technique SimpleTraverse
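                 The idea of changing only one parameter value at a time can be sketched as
                 follows. This is one possible reading of the description of SimpleTraverse above,
                 not the algorithm's actual implementation: the function name one_at_a_time and
                 the choice of the first value of each parameter as the baseline are assumptions.

```python
def one_at_a_time(params):
    """Yield a baseline configuration, then configurations that each differ
    from the baseline in exactly one parameter value."""
    # Assumption: the first listed value of every parameter forms the baseline.
    baseline = {name: values[0] for name, values in params.items()}
    yield dict(baseline)
    for name, values in params.items():
        for value in values[1:]:
            config = dict(baseline)
            config[name] = value
            yield config


# Parameter values taken from the search space file example.
params = {
    "PeelCount": [0, 1],
    "UnrollCount": [0, 1, 2, 4, 8],
    "VectorizationInterleave": [1, 2, 4],
}
configs = list(one_at_a_time(params))
for config in configs:
    print(config)
```

                 Under this reading, the number of configurations grows with the sum of the
                 value counts instead of their product, which keeps a debugging run short.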

4.2.4.6 Run Example
                 Run the following command as an example:
                 auto-tuner run config/coremark_sample.ini --plugin-dir ./plugin-dir -tr coremark_tuner \
                     --results-log coremark.log --results-log-details details.log \
                     --stop-after 3600 --time-after-convergence 600 -ss search_space.yaml

                 The parameters are described as follows:

                 ●      coremark_sample.ini: tuning configuration file
                 ●      --plugin-dir ./plugin-dir: defines the customized plug-in directory.
                 ●      coremark_tuner: specifies the customized tuner stored in the plug-in
                        directory ./plugin-dir.
                 ●      --results-log coremark.log: records the performance information of the
                        optimal configuration found in each iteration.
                 ●      --results-log-details details.log: records performance information about
                        each iteration.
                 ●      --stop-after 3600: The tuning stops after 3600 seconds.
                 ●      --time-after-convergence 600: The tuning stops if no better configuration is
                        found within 600 seconds after the last improvement.
                 ●      -ss search_space.yaml: uses search_space.yaml as the tuning space file.
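                 How the two stop options in the list above interact can be sketched as follows.
                 This is a simplified model based only on the option descriptions, not the
                 tool's internal logic; the function should_stop and its arguments are
                 hypothetical, and all times are in seconds.

```python
def should_stop(now, start, last_improvement,
                stop_after=None, time_after_convergence=None):
    """Return True when either stop condition is met.

    now, start and last_improvement are timestamps in seconds.
    """
    # --stop-after: overall time budget of the tuning run.
    if stop_after is not None and now - start >= stop_after:
        return True
    # --time-after-convergence: no better configuration for this long.
    if (time_after_convergence is not None
            and now - last_improvement >= time_after_convergence):
        return True
    return False


# With --stop-after 3600 --time-after-convergence 600:
print(should_stop(1000, 0, 800, 3600, 600))   # still improving recently
print(should_stop(1500, 0, 800, 3600, 600))   # 700 s without improvement
print(should_stop(3600, 0, 3500, 3600, 600))  # overall budget used up
```

                 In this model, whichever condition is met first ends the run, so the
                 convergence option can end a run well before the overall time limit.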

                 After the tuning is complete, the optimal configuration is generated as
                 opt_config.yaml. You can use the -o option to customize the name of the optimal
                 configuration file.

                 To make this configuration file take effect and generate the optimal binary file,
                 add the Bisheng compiler option -mllvm -auto-tuning-input=opt_config.yaml to
                 the compile command.

                 For example, to compile the coremark, run the following command:
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 -g core_list_join.c core_main.c \
                     core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -o coremark \
                     -mllvm -auto-tuning-input=opt_config.yaml
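                 If the final build is scripted, the compile command above can be assembled
                 programmatically. The following sketch only builds and prints the argument list
                 and does not invoke the compiler; the helper name tuned_compile_command and
                 the shortened list of sources and flags are assumptions made for illustration.

```python
import shlex


def tuned_compile_command(sources, output, tuning_input, extra_flags=()):
    """Return a clang argument list that applies a tuned configuration file."""
    return [
        "clang",
        *extra_flags,
        *sources,
        "-o", output,
        # -mllvm forwards the following option to LLVM.
        "-mllvm", f"-auto-tuning-input={tuning_input}",
    ]


command = tuned_compile_command(
    sources=["core_list_join.c", "core_main.c"],  # shortened source list
    output="coremark",
    tuning_input="opt_config.yaml",
    extra_flags=["-O2", "-g"],
)
print(shlex.join(command))
```

                 Passing the list to subprocess.run() would start the build; keeping the
                 configuration file name as a parameter makes it easy to switch between
                 different tuning results.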

4.2.5 Auto-run Instruction

4.2.5.1 Usage of the Auto-run Instruction
                 The auto-run instruction is similar to the run instruction, but it automatically
                 generates a search space instead of transferring the search space file through the
                 command line.

                         NOTE

                        This function requires some additional settings in the configuration file, such as
                        config/coremark_sample.ini.

                 The format of the auto-run instruction is as follows:


                 auto-tuner auto_run <config_file>

                 Mandatory parameter:
                 ●      config_file: tuning configuration file, which is used to configure the
                        compilation and running methods and related paths.
                 Common optional parameter:
                 ●      --stage-order: specifies the order of the tuning phases. The default
                        order is module -> function -> loop. For example, use --stage-order
                        function loop to perform function-level tuning first and then
                        finer-grained loop-level tuning.
                 The complete argument list is as follows:
                 positional arguments:
                  config_file      The tuning config file.

                 optional arguments:
                  -h, --help          show this help message and exit
                  --machine-class MACHINE_CLASS
                                  name of the machine class being run on
                  --parallel-compile present if compiling can be done in parallel
                  --test-limit TEST_LIMIT
                                  stop tuning after given tests count
                  --stop-after STOP_AFTER
                                  stop tuning after given seconds
                  --parallelism PARALLELISM
                                  how many tests to support at once
                  --pipelining PIPELINING
                                  how long a delay (in generations) before results are
                                  available
                  --bail-threshold BAIL_THRESHOLD
                                  abort if no requests have been made in X generations
                  --no-dups            don't print out warnings for duplicate requests
                  --seed-configuration FILENAME
                                  Start search at a given configuration. Can be
                                  specified multiple times. Configurations are loaded
                                  with ConfigurationManipulator.load_from_file() and
                                  file format is detected from extension.
                  --results-log RESULTS_LOG
                                  file to store log of the best configuration times
                  --results-log-details RESULTS_LOG_DETAILS
                                  file to store log of the non-best configuration times
                  --quiet            print less information
                  --display-frequency DISPLAY_FREQUENCY
                                  how often for DisplayPlugin to print
                  --technique TECHNIQUE, -t TECHNIQUE
                                  which technique to use
                  --list-techniques, -lt
                                  list techniques available and exit
                  --generate-bandit-technique, -gbt
                                  randomly generate a bandit to use
                  --label LABEL         name for the TuningRun
                  --print-search-space-size
                                  Print out the estimated size of the search space and
                                  exit
                  --database DATABASE
                                  database to store tuning results in, see:
                                  http://docs.sqlalchemy.org/en/rel_0_8/core/engines.html#database-urls
                  --print-params, -pp show parameters of the configuration being tuned
                  --time-after-convergence TIME, -tac TIME
                                  stop tuning if no new best results after given
                                  seconds
                  -o DIR, --output DIR
                                  write optimal yaml config into the given directory
                  --parse-format [{xml,yaml}]
                                  choose the format of LLVM auto-tuning-input/opp
                                  (default: yaml)
                  --stage-order stage [stage ...]
                                  specify stage order of auto_run. each stage is a code
                                  region type
                  -nf Name [Name ...], --name-filter Name [Name ...]
                                    to filter code regions by names when generating search
                                    space
                  --func-name-filter Name [Name ...]
                                    to filter code regions by function names when
                                    generating search space
                  --file-name-filter Name [Name ...]
                                    to filter code regions by file names when generating
                                    search space
                  -scf SEARCH_CONFIG_FILE, --search-config-file SEARCH_CONFIG_FILE
                                    The Search space config file
                  --plugin-dir DIR        specify the dir to load customized tuner scripts
                  -tr TUNER, --tuner TUNER
                                    Select which tuner to use
                  -lr, --list-tuners List all available tuners
                  --add-llvm-inputs ADD_LLVM_INPUTS [ADD_LLVM_INPUTS ...]
                                    add existing llvm configuration input files as
                                    constants in addition to the llvm configurations
                                    generated in each iteration of the tuning run

                 The auto-run instruction also automatically performs code region tuning based on
                 different granularities. That is, the auto-run instruction executes three tuning tasks
                 at different code region levels in sequence. The working mode is as follows:

                 In each phase, the optimal configuration found in the previous phase is used as
                 a constant configuration, and the tuning task is executed at a finer code region
                 level with the corresponding tuning parameters.
                 When each tuning phase is complete, the optimal configuration file for that
                 phase is generated for the compiler to use. Similar to the run instruction,
                 the optimal configuration file generated by this instruction takes effect when
                 you add the Bisheng compiler option -mllvm -auto-tuning-input=<file path>.

                         NOTE

                        All command line options of the run subcommand are applied to each of the
                        three tuning runs of auto_run in turn. For example, if you use the
                        --stop-after 10 option to stop tuning after 10 seconds, the auto-run
                        instruction stops after 30 seconds, because each of the three phases runs
                        for 10 seconds.
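                 The phase-by-phase working mode and the effect of per-phase options can be
                 sketched as follows. This is a simplified model of the description above, not the
                 Autotuner's code: staged_tuning, fake_tune_stage and the dictionary-based
                 configurations are hypothetical, and the tuning of a single phase is replaced by
                 a placeholder.

```python
def staged_tuning(stages, tune_stage, stop_after=None):
    """Run one tuning task per stage, carrying the best configuration forward.

    tune_stage(stage, constant_config, stop_after) must return the best
    settings found for that stage as a dictionary.
    """
    constant_config = {}
    outputs = {}
    for stage in stages:
        best = tune_stage(stage, constant_config, stop_after)
        # The best settings of this stage become constants for later stages.
        constant_config = {**constant_config, **best}
        outputs[stage + ".yaml"] = dict(constant_config)
    # Each stage gets the full --stop-after budget, so the budgets add up.
    worst_case_seconds = None if stop_after is None else stop_after * len(stages)
    return outputs, worst_case_seconds


def fake_tune_stage(stage, constant_config, stop_after):
    """Placeholder that pretends every stage finds exactly one best setting."""
    return {stage: "best-" + stage}


outputs, worst_case = staged_tuning(
    ["module", "function", "loop"], fake_tune_stage, stop_after=10)
print(outputs["loop.yaml"])
print(worst_case)
```

                 In this model, the configuration written for the last phase contains the
                 settings of all earlier phases, and the total time limit is the per-phase
                 limit multiplied by the number of phases.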

4.2.5.2 Auto-run Example
                 Run the following command as an example:
                 auto-tuner auto_run config/coremark_sample.ini -tr coremark_tuner --results-log coremark.log \
                     --results-log-details details.log --time-after-convergence 600

                 The auto_run instruction is similar to the run instruction. The difference is that the
                 auto_run instruction does not require a search space file. As with the parse
                 instruction, you can specify filters to restrict the generated search space.
                 In this example, the optimal configuration files module.yaml, function.yaml, and
                 loop.yaml corresponding to the three tuning phases are generated by default.


                 Select the optimal configuration file for compilation as required. Generally, you are
                 advised to use the configuration file of the last tuning phase, because it contains all
                 the configuration information of the previous tuning phases. For example:
                 clang -Ilinux64 -I. -DFLAGS_STR=\"" -lrt"\" -DITERATIONS=300000 -g core_list_join.c core_main.c \
                     core_matrix.c core_state.c core_util.c linux64/core_portme.c -O2 -o coremark \
                     -mllvm -auto-tuning-input=loop.yaml
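                 The filters that restrict the generated search space (by code region name,
                 function name, or file name) can be pictured with the following sketch. It
                 assumes exact name matching, which may differ from the tool's behavior; the
                 function filter_regions is hypothetical, and the third code region is made up
                 so that the filters have something to exclude.

```python
def filter_regions(regions, names=None, func_names=None, file_names=None):
    """Keep only the code regions that match every filter that is given.

    Assumption: a filter matches when the field equals one of the listed names.
    """
    def keep(region):
        return ((names is None or region["name"] in names)
                and (func_names is None or region["func_name"] in func_names)
                and (file_names is None or region["file_name"] in file_names))

    return [region for region in regions if keep(region)]


regions = [
    # The two code regions from the search space file example.
    {"name": "while.cond7.i.outer", "func_name": "core_list_init",
     "file_name": "core_list_join.c"},
    {"name": "for.body.i", "func_name": "core_list_init",
     "file_name": "core_list_join.c"},
    # Made-up region, present only for illustration.
    {"name": "for.body", "func_name": "other_func", "file_name": "other.c"},
]

by_function = filter_regions(regions, func_names={"core_list_init"})
by_name = filter_regions(regions, names={"for.body.i"})
print(len(by_function), len(by_name))
```

                 Combining several filters narrows the search space further, because a code
                 region must satisfy all of them to be kept.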


5 Appendix

                 5.1 Feedback
                 5.2 Change History

5.1 Feedback
                 If you encounter any problem and need technical support, send the problem
                 information to the Kunpeng compiler forum.

5.2 Change History
                  Date                     Change History

                  2021-06-22               This is the fifth official release. The update is as
                                           follows:
                                           Updated the description of using the Autotuner.

                  2020-12-12               This is the fourth official release. The update is as
                                           follows:
                                           Added the description of the llvm-autotune tool.

                  2020-11-26               This is the third official release. The update is as
                                           follows:
                                           Added the parameter description of the instructions
                                           in Chinese.
                                           Added the working mode diagram of the auto-run
                                           instruction.

                  2020-10-29               This is the second official release. The update is as
                                           follows:
                                           Updated the Autotuner tuning flowchart.

                  2020-09-28               This is the first official release.
