Channel: What are the ways compilers recognize complex patterns? - Programming Language Design and Implementation Stack Exchange

Answer by Moonchild for What are the ways compilers recognize complex patterns?

canonicalisation (cf. my other answer) is indeed relevant. it is no panacea: we cannot 'reduce any function into some "canonical form"' due to godel, rice, et al, and even in the cases where that is possible, it is often computationally infeasible. but in general, if we have an optimising rewrite x -> y, and x' is an alternate form of x which other transformations will canonicalise into x, then there is no need to also express the rewrite x' -> y
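a minimal sketch of that last point, under assumptions of my own: expressions are toy tuples of the form (op, lhs, rhs), and the only canonicalisation is 'constants go on the right of commutative operators'. the names canonicalise, simplify and optimise are hypothetical and not taken from any real compiler. the rewrite 'x + 0 -> x' is written once, for the canonical form, and '0 + x' is handled for free:

```python
# Toy expression representation (an assumption for illustration):
# ints are constants, strings are variables, tuples are (op, lhs, rhs).
COMMUTATIVE = {"+", "*", "&", "|"}


def canonicalise(e):
    """Recursively move constants to the right of commutative operators."""
    if not isinstance(e, tuple):
        return e
    op, a, b = e
    a, b = canonicalise(a), canonicalise(b)
    if op in COMMUTATIVE and isinstance(a, int) and not isinstance(b, int):
        a, b = b, a
    return (op, a, b)


def simplify(e):
    """One rewrite, x + 0 -> x, expressed only for the canonical form."""
    if not isinstance(e, tuple):
        return e
    op, a, b = e
    a, b = simplify(a), simplify(b)
    if op == "+" and isinstance(b, int) and b == 0:
        return a
    return (op, a, b)


def optimise(e):
    """Canonicalise first, so the single rewrite also covers 0 + x."""
    return simplify(canonicalise(e))
```

without the canonicalisation pass, simplify would need a second, mirrored rule for '0 + x', and the same again for every other rewrite on a commutative operator.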

there is also a degree to which idioms make their way into the cultural canon, so there will be a particular way of writing a function (like popcnt, as you showed) which it's understood contemporary compilers will reduce to a single instruction if possible. it's debatable to what extent this is a good idea, but it is a thing

abstract interpretation can be helpful. in general, an optimising rewrite may be predicated not just on the syntactic structure of an expression, but also on facts that are known about individual terms. consider a much simpler example, in the form of the following two expressions (& here means bitwise logical and):

    if x < 1 then x < 4 else ...
    let x = y & 3 in x < 4

in both of these cases, the subexpression 'x < 4' can be reduced to 'true'. we could express this with a pair of rewrites: first 'if x < P then C[x < Q] else ...' -> 'if x < P then C[true] else ...' if P <= Q; second '(x & P) < Q' -> 'true' if P < Q. you can see how this would blow up quickly, though (and i already had to cheat in order to allow x < Q to be a subexpression in the conditional case)

on the other hand, we could track upper and lower bounds for every term (the abstract domain of intervals). these are broadly useful facts to know. both bitwise and and if can create and propagate those facts as appropriate, and they can be consumed by a single rewrite: 'x < P' -> 'true' if upperbound(x) < P
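a rough sketch of the interval version, assuming unsigned 32-bit terms and closed intervals. Interval, and_const, assume_less_than and fold_less_than are names made up for this example; a real analysis would also handle empty intervals, joins at control-flow merges, and so on:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: int  # inclusive lower bound
    hi: int  # inclusive upper bound


# Assumption: every term is an unsigned 32-bit value unless we know better.
U32 = Interval(0, 2**32 - 1)


def and_const(x: Interval, mask: int) -> Interval:
    """Producer: for non-negative x and mask, x & mask lies in [0, min(x.hi, mask)]."""
    return Interval(0, min(x.hi, mask))


def assume_less_than(x: Interval, p: int) -> Interval:
    """Producer: refine x inside the 'then' branch of `if x < p`."""
    return Interval(x.lo, min(x.hi, p - 1))


def fold_less_than(x: Interval, p: int):
    """Consumer: the single rewrite 'x < p' -> true if upperbound(x) < p.

    Returns True when the comparison is known to hold, None otherwise.
    """
    return True if x.hi < p else None


# let x = y & 3 in x < 4
print(fold_less_than(and_const(U32, 3), 4))
# if x < 1 then x < 4 else ...
print(fold_less_than(assume_less_than(U32, 1), 4))
```

the two producers know nothing about comparisons and the consumer knows nothing about '&' or 'if'; they only communicate through the bounds, which is what keeps the number of rules from blowing up.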

take abstract interpretation a bit further and you get symbolic abstract domains, which know about relationships between terms. this blogpost suggests that msvc is doing something similar in order to recognise byteswaps and bitreverses. rather than recognise a specific code sequence (like with popcnt), it tries to track the general case when one term is known to be a permutation of the bits in another term. in the case when a term is the bswap permutation of another's bits, it can be directly replaced with a bswap instruction; notably, this requires absolutely no inspection into the structure of the concrete source term, only the abstract state. although this cannot work perfectly in every case (see again godel et al), it will in practice be quite insensitive to the way you happen to write your bswap

(i say 'suggests' because the blogpost is not entirely clear—it could be canonicalising to concrete 'bit permute' instructions and matching on that instead. i could see an argument for going both ways; both would work)
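to make the 'bit permutation' idea concrete, here is a toy sketch. it is not msvc's implementation (i don't know what that looks like); it is a hypothetical abstract domain where each 32-bit term records, per bit, which bit of a single source term it came from, or that it is known zero, or that nothing is known. all names here are invented for the example:

```python
WIDTH = 32
ZERO = None     # this bit is known to be 0
UNKNOWN = "?"   # nothing is known about this bit


def source():
    """Abstract value of the input term x: bit i is bit i of x."""
    return tuple(range(WIDTH))


def shl(v, n):
    """Left shift by a constant: low n bits become zero."""
    return (ZERO,) * n + v[:WIDTH - n]


def shr(v, n):
    """Logical right shift by a constant: high n bits become zero."""
    return v[n:] + (ZERO,) * n


def and_const(v, mask):
    """AND with a constant: cleared mask bits force a known zero."""
    return tuple(b if (mask >> i) & 1 else ZERO for i, b in enumerate(v))


def or_(a, b):
    """OR of two tracked values; gives up on bits it cannot describe."""
    out = []
    for p, q in zip(a, b):
        if p is ZERO:
            out.append(q)
        elif q is ZERO or p == q:
            out.append(p)
        else:
            out.append(UNKNOWN)
    return tuple(out)


# The permutation a 32-bit byte swap performs: byte k comes from byte 3 - k.
BSWAP32 = tuple((3 - i // 8) * 8 + i % 8 for i in range(WIDTH))


def is_bswap(v):
    """Match on the abstract state only, never on how the term was written."""
    return v == BSWAP32


x = source()

# Spelling 1: move each byte into place individually.
by_bytes = or_(or_(shr(x, 24), and_const(shr(x, 8), 0xFF00)),
               or_(and_const(shl(x, 8), 0xFF0000), shl(x, 24)))

# Spelling 2: swap the 16-bit halves, then swap bytes within each half.
y = or_(shl(x, 16), shr(x, 16))
by_halves = or_(shl(and_const(y, 0x00FF00FF), 8),
                and_const(shr(y, 8), 0x00FF00FF))

print(is_bswap(by_bytes), is_bswap(by_halves))
```

both spellings end up in the same abstract state, so one check covers them, which is the insensitivity to source form described above.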

