Non-linear Arithmetic and Geometry for Program Optimization and Compiling

Given the ever evolving processor architectures, a strong research effort is needed in order to improve the program compiling process. The new processors appearing on today's market have such hardware capabilities that development and code generation softwares become more and more sollicitated to exploit these capabilities. One can reasonably expect the intensification of this situation over the next few years, and the following problem are relevant:

  • Memory accesses: the processors clock frequency increase exceeds by far the memory clock frequency increase, leading to a more and more critical bottleneck. Code generation without any attention paid to this problem, where data references are organized exclusively according to the logic of the algorithm, should not be allowed anymore.
  • Instruction parallelism: new generation processors (superscalar, VLIW, EPIC) allow more and more intruction parallelism even increasing the memory bottleneck. Moreover the processor is divided into clusters of functional units in order to feed these many functional units. Intra-processor communication management has then to be considered.
  • Power consumption: embedded systems have a limited amount of electrical power. Its consumption must be directed by minimizing the use of instructions with high energy needs (I/O instructions for example).
  • The amount of memory: it is also very limited in embedded systems for space and cost reasons. It is necessary to minimize the total amount of memory used by program instructions and data.
  • Analysis accuracy: nowadays compilers are generating their executable code from a source analysis process which is still very restricted. For example, dependency analysis between instructions is not precise and does not handle all data structures (pointers, data references expressed as complex functions, ...). It is necessary to implement symbolic methods in compilers, dedicated to the consideration of more complex functions.

Most of these problems can be modeled by a system of parameterized equations or inequations. Such a system is then considered in order to find a solution, which can be optimal relatively to an objective function, or in order to characterize or count the solutions, or in order to find parameters values for which the system has at least one solution.

In the restricted case of linear parameterized equations or inequations, most of these problems are today well considered through a geometrical model called the polytope model.

Between the control structures of programs, nested loops, or loop nests, involve some most expensive and interesting parts of programs for several reasons: large number of computed instructions and accessed data, important execution time and parallelism degree. Many works dealing with automatic parallelisation and optimization have focused on these structures. Several mathematic methods have been proposed dealing with loop nests whose indices bounds and data references are expressed as affine functions of the loop indices. Some other works were concerned with iterative structures with dynamic behaviour and speculative execution.

Intensive use of mathematical methods on system of integer linear inequalities has to be made for precise static analysis of these structures. The today well-known polytope model of loop nests translates analysis and optimization problems of programs into geometrical and arithmetical problems using polyhedra transformations. Many features have been implemented in the polyhedral library Polylib which was initiated at IRISA. Some of its recent extensions have been proposed by our team.

The kind of mathematical problem we are able to consider at this time are characterized by affine parameterized systems of inequalities. But a contribution to the objectives described at the beginning of this document implies a major extension in the kind of considered equations to the set of multi-variate polynomials.

A multi-variate polynomial model a wide application field to program analysis and optimisation. Between these applications, some of them are today clearly identified. The following can be mentioned:

  • dedicated memory access optimization to a given architecture: a memory hierarchy is characterized by several parameters such as the number of hierarchy levels, level sizes, cache associativities, cache line sizes, access latencies. Modeling mathematically memory data loads results in non-linear equations. For example, the index of a cache line where an N*N array element a(i,j) is stored is expressed as ((Ni+j) div t) mod nc, where nc denotes the number of cache lines and t the number of data per line.
  • determination of a linear iteration schedule: loop nest parallelization in the polytope model considers iteration schedules defined by affine functions on the loop indices and associating execution instants to each iteration. For an iteration I=(i1,...,id), its execution instant is given by a linear function of the indices. Conversely, finding a schedule according to some available hardware ressources translates to a polynomial equation: finding values of the coefficients in the linear function such that the number of simultaneously computed iterations, or the number of calculation steps, is bounded according to a maximum value.
  • extension of the set of considered program structures: the polytope model only handles loop nests whose indices are bounded by affine functions of the outer loop indices, and whose array element references are also expressed as affine functions on the indices. These affine functions can be parameterized: parameters occur in their constant terms only. An extension to polynomial functions would allow analysis tools developement dedicated to a larger set of programs where bounds and indices are expressed as more general functions, and where parameters can occur as coefficients. Notice that polynomials with variable exponents are excluded at this time, since they are not associated to any known optimization problem. Moreover, they are relevant to a quite larger mathematical domain.

It is important to notice that the characteristics of the modeled problem induce particular polynomials:

  • only rational coefficients occur in the considered polynomials,
  • whose values are only integers.

Hence our objectives can be summarized to considering equation or inequation systems of multi-variate polynomials in order to determine:

  • one integer solution, or the integer solution of an optimization problem,
  • the number of integer solutions,
  • minimum or maximum integer values of polynomials over a domain.

Several investigation directions have to be explored. These can be distinguished by the type of the approach: exact or approximate methods.

Exact methods

General algebraic methods

Many works are done on polynomial equalities and inequalities using different approaches based on more or less recent mathematical results. The first realist methods and software implementations have been made possible due to Buchberger who discovered Gröbner bases in 1965. Several improvements have been brought since then in order to provide efficient software implementations.

Our objective is to evaluate the opportunity of applying algebraic methods to our polynomial integer problems: computation of the real or complex solutions and selection of the integer ones, adaptation of the approach to integer numbers and to the kind of considered polynomials, ...

Ehrhart polynomial theory inversion

Most of the polynomial problems issued from our program optimization objectives take place in the polytope model. For example, symbolic evaluation of memory or processor ressources translates into Ehrhart polynomials. But the program analysis process should go further by using such a symbolic evaluation as an entry to an additional analysis step. It should also be used as a parameter conditioning some program transformations.

Since Ehrhart polynomials represent the exact number of integer points contained within a parameterized polytope or a union of parameterized polytopes, the approach we would like to explore considers any polynomial as the Ehrhart polynomial of a parameterized polytope or a union of parameterized polytopes. Hence polynomial roots finding would translate into finding an occurence of a polytope which does not contain any integer point. In the same way, verification of a polynomial inequality would translate into identifying polytope occurences associated to integer volumes that are greater (or less) than a given value.

Handling systems composed with such polynomials would consider simultaneously several polytopes. The involved mathematical objectives are the followings:

  • from a given multi-variate polynomial, find a parameterized polytope or a union of parameterized polytope whose integer volume equals the polynomial,
  • translate polynomial equation and inequations into geometrical properties,
  • propose a geometrical model representing the simultaneous verification of several polynomial constraints.
Approximating methods - Symbolic Bernstein Expansion

Polynomial equations are curves that can be approximated by a sequence of less complex geometrical objects. Approximating such curves by line segment sequences would allow to transform the initial non-linear problem into a linear one. Since we are only interested by the set of integer points bounded with such curves, even exact results can be expected: approximation errors should be controlled such that only sets containing no integer points can be lost. This approach is close to the one proposed by Ehrhart for some particular systems with two variables and one parameter. Approximation theory proposes many methods that will have to be evaluated and adapted to our problems.

Women

We propose an extension of the theory of Bernstein expansion to handle parameterized multivariate polynomial expressions. Bernstein expansion allows bounding the range of a multivariate polynomial considered over a box. Numerical applications of this theory have been proposed to the resolution of system of strict polynomial inequalities. It has been shown that Bernstein expansion is generally more accurate than classic interval methods. Moreover, Stahl has shown that for sufficiently small boxes, the exact range is obtained.

We generalize Bernstein expansion by considering parameterized multivariate polynomials for which ranges are bound by polynomials over the parameters. Sufficient conditions over the parameters for strict parameterized inequalities solutions can therefore be constructed.

Bernstein polynomials are particular polynomials that form a basis for the space of polynomials. Hence any polynomial can be expressed into this basis through coefficients, the Bernstein coefficients, that have interesting properties and that can be computed through a direct formula. Due to the Bernstein convex hull property, the value of the polynomial is then bounded by the values of the minimum and maximum Bernstein coefficients. The direct formula allows symbolic computation of these Bernstein coefficients giving a supplementary interest to the use of this theory. Another main interesting consequence is that the involved computations have quite low complexity. Moreover an appropriate instrumentation of this model allows automatic and inexpensive resolutions of complex program analysis issues.

Related publications

Ph. Clauss and I. Tchoupaeva A Symbolic Approach To Bernstein Expansion (in russian), in Programmirovanie (Programming and Computer Software), Vol. 30, No. 3, 2004, Russian Academy of Sciences, Nauka, Moscow.

Ph. Clauss and Irina Tchoupaeva, A Symbolic Approach to Bernstein Expansion for Program Analysis and Optimization, 13th International Conference on Compiler Construction, CC 2004, LNCS 2985, pp 120-133, Springer, April 2004, Barcelona, Spain.


 


 

Program Compilation & Optimization Resources

archi-compil.org v 4_3