Configuring LLVM

LLVM dynamic compilation generates customized machine code for each query to replace the original general-purpose functions. Query performance improves because the generated code has fewer redundant conditional checks and virtual function calls, and better data locality during query execution.

LLVM takes extra time to generate the intermediate representation (IR) and compile it into machine code. Therefore, if the data volume is small or the query itself runs quickly, the compilation overhead can outweigh the benefit and performance may deteriorate.

LLVM Application Scenarios and Constraints

Applicable Scenarios

Non-Applicable Scenarios

Other Factors Impacting LLVM Performance

The result of LLVM optimization depends not only on operations and computation in the database, but also on the hardware environment.

Recommended Usage of LLVM

LLVM is enabled in the database kernel by default, and users can configure it based on the analysis above. The overall suggestions are as follows:

  1. Set work_mem to an appropriate value, as large as available resources allow. If a large amount of data is still spilled to disk, you are advised to disable LLVM dynamic compilation and optimization by setting enable_codegen to off.
  2. Set an appropriate value for codegen_cost_threshold (the default value is 10000) so that LLVM dynamic compilation and optimization is not used when the data volume is small. If database performance still deteriorates because of LLVM dynamic compilation and optimization after the value is set, increase the value.
  3. If a large number of C functions are invoked, you are advised to disable LLVM dynamic compilation and optimization.
  4. An IN expression can be followed by at most 10 constants. Otherwise, LLVM compilation and optimization cannot be performed.

    If resources are sufficient, the performance gain from LLVM grows as the data volume increases.
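
The suggestions above can be sketched as session-level settings. This is a minimal example, assuming the parameters can be changed with SET in the current session; the values 512MB and 50000 are illustrative assumptions, not tuned recommendations.

```sql
-- Check the current settings before changing anything.
SHOW enable_codegen;
SHOW codegen_cost_threshold;
SHOW work_mem;

-- Suggestion 1: give operators more memory so less data is spilled to disk.
-- '512MB' is an illustrative value; size it to the resources available.
SET work_mem = '512MB';

-- Suggestion 2: raise the threshold so that small queries skip LLVM compilation.
-- 50000 is an illustrative value above the default of 10000.
SET codegen_cost_threshold = 50000;

-- Suggestions 1 and 3: disable LLVM when much data is still spilled to disk
-- or when the workload invokes a large number of C functions.
SET enable_codegen = off;
```

Settings changed with SET apply only to the current session, which makes it possible to compare the running time of the same query with enable_codegen on and off before changing the configuration more widely.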