2015-08-23
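If you work with programs at all, understanding compiler infrastructure will pay off, whether you are analyzing program efficiency or simulating a new processor or operating system. After this introduction, even if you only half understand compilers, you can start using LLVM to do interesting work.
LLVM is a really nice, hackable, ahead-of-time compiler for systems languages such as C and C++.
Of course, because LLVM is so powerful, you will hear about many other features (it can be a JIT; it supports a whole range of non-C-like languages; it is a new way to distribute apps on the App Store; and so on). These are all true, but for this article the definition above is the one that matters.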
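Here are a few things that set LLVM apart:
Yes, LLVM is a cool compiler, but if you don't do compiler research, why should you care?
Answer: if you work with programs at all, understanding compiler infrastructure will pay off, and in my experience it is enormously useful. With it you can analyze how often a program does something; transform a program so it works better with your system; or simulate a new processor architecture or operating system with only modest changes, without taping out a chip or writing a kernel. Compilers matter to computer science researchers far more than they tend to think. I suggest reaching for LLVM first, before hacking on any of the following tools (unless you have a really good reason):
Even when a compiler is not a perfect fit for your task, it can save you about nine-tenths of the effort compared with a source-to-source translation.
Here are some research projects that make clever use of LLVM without being compiler research: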
This bears repeating: LLVM is not only for implementing compiler optimizations!
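The major components of LLVM's architecture (which is really the architecture of every modern compiler) are:
Front end, passes, back end
Let's look at each in turn: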
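Most compilers today use this architecture, but LLVM stands out in one respect: the program stays in the same intermediate representation (IR) throughout the whole process. In other compilers, each pass may produce code in its own format. This is a big advantage for hackers. We don't have to worry about exactly where our change should go; anywhere between the front end and the back end is good enough.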
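To make the single-IR idea concrete, here is a toy model in plain C++. It is not LLVM's API: ToyIR, ToyPass, toy_frontend, run_passes, toy_backend, and add_to_mul are all names made up for this sketch, and the "IR" is just a list of strings. The only point is that every pass reads and writes the same representation, so a new pass can be slotted in anywhere between the front end and the back end.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for an IR: one textual "instruction" per element.
using ToyIR = std::vector<std::string>;

// A toy pass may rewrite the IR in place and reports whether it changed anything.
using ToyPass = std::function<bool(ToyIR&)>;

// Hypothetical front end: every whitespace-separated word becomes an instruction.
ToyIR toy_frontend(const std::string& source) {
    ToyIR ir;
    std::istringstream words(source);
    std::string word;
    while (words >> word) ir.push_back(word);
    return ir;
}

// Runs the passes in order over the same IR object and counts how many changed it.
int run_passes(ToyIR& ir, const std::vector<ToyPass>& passes) {
    int changed = 0;
    for (const auto& pass : passes) {
        if (pass(ir)) ++changed;
    }
    return changed;
}

// Hypothetical back end: emits one instruction per line.
std::string toy_backend(const ToyIR& ir) {
    std::string out;
    for (const auto& inst : ir) {
        out += inst;
        out += '\n';
    }
    return out;
}

// Example pass: turns every "add" into "mul".
bool add_to_mul(ToyIR& ir) {
    bool changed = false;
    for (auto& inst : ir) {
        if (inst == "add") {
            inst = "mul";
            changed = true;
        }
    }
    return changed;
}
```

Because each pass takes and returns the same ToyIR, reordering the pass list or adding a new pass needs no format conversion, which is the property the paragraph above attributes to LLVM.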
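Let's get started.
First, you need to install LLVM. Linux distributions usually provide LLVM and Clang packages that you can use as they are. Still, check that the version on your machine ships all the headers you will need. On OS X, the LLVM that comes with Xcode is not that complete. Fortunately, building LLVM from source with CMake is not hard. Usually you only need to build LLVM itself, because the Clang your system provides is good enough (as long as the versions match; if they don't, you can build Clang yourself too).
For OS X specifically, Brandon Holt has a good set of instructions. You can also install LLVM with Homebrew.
You will need to get friendly with the documentation. Here are some links I have found worth a look: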
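isa, cast, and dyn_cast, which you will run into everywhere.
Doing productive research with LLVM usually means writing custom passes. This section guides you through building and running a simple pass that transforms your program.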
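I have put together a template repository containing a useless LLVM pass. I recommend starting from this template, because getting the build configuration right from scratch can be quite painful.
First, clone the llvm-pass-skeleton repository from GitHub: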
    $ git clone git@github.com:sampsyo/llvm-pass-skeleton.git
The real work happens in skeleton/Skeleton.cpp. Open it up. Here is the business logic:
    virtual bool runOnFunction(Function &F) {
      errs() << "I saw a function called " << F.getName() << "!\n";
      return false;
    }
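There are several kinds of LLVM pass. The one we are using is called a function pass, which is a good place to start. As you would expect, LLVM invokes this method for each function in the program being compiled. For now, all it does is print the function's name.
Details:
Build the pass with CMake: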
    $ cd llvm-pass-skeleton
    $ mkdir build
    $ cd build
    $ cmake ..  # Generate the Makefile.
    $ make  # Actually build the pass.
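If LLVM is not installed globally, you need to tell CMake where to find it. Set the environment variable LLVM_DIR to the path to share/llvm/cmake/. For example, here is how that looks for an LLVM installed through Homebrew: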
    $ LLVM_DIR=/usr/local/opt/llvm/share/llvm/cmake cmake ..
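Building the pass produces a shared library. You can find it at build/skeleton/libSkeletonPass.so or a similar path, depending on your platform. In the next step, we load this library to run the pass on real code.
To run your new pass, compile some C code with clang and add some odd-looking flags that point at the shared library you just built: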
    $ clang -Xclang -load -Xclang build/skeleton/libSkeletonPass.* something.c
    I saw a function called main!
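That -Xclang -load -Xclang path/to/lib.so sequence is all you need to load and activate your pass in Clang. So when you work on a larger project, you can add those arguments to the CFLAGS in your Makefile or to the equivalent place in your build system.
(You can also run passes one at a time, independently of invoking clang. That route uses LLVM's opt command. It is the approach sanctioned by the official documentation, but I won't go into it here.)
Congratulations, you have just hacked a compiler! Next, we will extend this hello-world pass to do something more interesting.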
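To work with programs in LLVM, you need to know a little about how the intermediate representation is organized.
Module, Function, BasicBlock, Instruction
A Module contains Functions, which contain BasicBlocks, which in turn are made up of Instructions. Everything except Module descends from Value.
First, the most important component of an LLVM program:
Most things in LLVM, including Function, BasicBlock, and Instruction, are C++ classes that inherit from a base class called Value. A Value is any kind of data that can be used in a computation, such as a number or a memory address. Global variables and constants (also called literals or immediates, like 5) are Values too.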
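The containment and inheritance relationships described above can be sketched with a few toy structs. This is only a model: the Toy* names are invented for illustration and carry none of LLVM's real members. It shows that modules hold functions, functions hold blocks, blocks hold instructions, and that everything except the module shares one base class.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Toy stand-ins, not LLVM's classes: a common base plus the containers.
struct ToyValue { std::string name; };
struct ToyInstruction : ToyValue {};
struct ToyBasicBlock : ToyValue { std::vector<ToyInstruction> instructions; };
struct ToyFunction : ToyValue { std::vector<ToyBasicBlock> blocks; };
struct ToyModule { std::vector<ToyFunction> functions; };  // deliberately not a ToyValue

// Everything but the module descends from the value base class.
static_assert(std::is_base_of<ToyValue, ToyFunction>::value, "functions are values");
static_assert(std::is_base_of<ToyValue, ToyBasicBlock>::value, "blocks are values");
static_assert(!std::is_base_of<ToyValue, ToyModule>::value, "modules are not values");

// Walks module -> function -> block and counts the instructions found.
std::size_t count_instructions(const ToyModule& mod) {
    std::size_t total = 0;
    for (const auto& function : mod.functions) {
        for (const auto& block : function.blocks) {
            total += block.instructions.size();
        }
    }
    return total;
}
```

The nested loops in count_instructions have the same shape as a traversal over real IR containers, only without any LLVM headers.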
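Here is an example of an Instruction in the human-readable text form of LLVM IR:
    %5 = add i32 %4, 2
This instruction adds two 32-bit integers (as indicated by the type i32). It adds the number in register 4 (written %4) to the literal 2 (written 2) and places the result in register 5. This is why I say LLVM IR reads like RISC machine code: we even use the same terminology, such as register, except that LLVM has infinitely many registers.
Inside the compiler, this instruction is represented as an instance of the Instruction C++ class. The object has an opcode indicating that it is an addition, a type, and a list of operands, each of which points to another Value object. In our example, the operands are a Constant object representing the integer 2 and an Instruction object corresponding to register %4. (Because LLVM IR is in static single assignment form, registers and instructions are actually one and the same; the register numbers are an artifact of the textual representation.)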
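As a rough illustration of "an instruction is a value whose operands point at other values", here is a toy model of the add above. The class names and the eval method are inventions of this sketch (LLVM's Instruction class has no such evaluator); they exist only so the operand graph can be exercised.

```cpp
#include <cstdint>
#include <vector>

// Toy base class: anything that can be used as an operand.
struct ToyValue {
    virtual ~ToyValue() = default;
    virtual std::int32_t eval() const = 0;  // sketch-only helper, not an LLVM concept
};

// A literal such as 2.
struct ToyConstant : ToyValue {
    std::int32_t value;
    explicit ToyConstant(std::int32_t v) : value(v) {}
    std::int32_t eval() const override { return value; }
};

enum class ToyOpcode { Add, Mul };

// An instruction has an opcode and operands that point at other values.
// In SSA form the instruction itself is the "register" holding its result.
struct ToyBinaryInst : ToyValue {
    ToyOpcode opcode;
    std::vector<const ToyValue*> operands;
    ToyBinaryInst(ToyOpcode op, const ToyValue* lhs, const ToyValue* rhs)
        : opcode(op), operands{lhs, rhs} {}
    std::int32_t eval() const override {
        const std::int32_t a = operands[0]->eval();
        const std::int32_t b = operands[1]->eval();
        return opcode == ToyOpcode::Add ? a + b : a * b;
    }
};

// Builds the toy equivalent of "%5 = add i32 %4, 2" with %4 producing 10.
std::int32_t eval_example() {
    ToyConstant ten(10), zero(0), two(2);
    ToyBinaryInst reg4(ToyOpcode::Add, &ten, &zero);  // stands in for %4
    ToyBinaryInst reg5(ToyOpcode::Add, &reg4, &two);  // %5 = add i32 %4, 2
    return reg5.eval();
}
```

Note that reg5 holds a pointer to reg4 directly: there is no separate register object, which mirrors the remark that registers and instructions are the same thing.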
Also, if you want to see the LLVM IR for your own program, you can ask Clang directly:
    $ clang -emit-llvm -S -o - something.c
Let's get back to the LLVM pass we are working on. We can inspect all of the important IR objects with one universal, convenient method: dump(). It prints a human-readable representation of an IR object. Since our pass works on Functions, let's use it to iterate over every BasicBlock in the Function, and then over every Instruction in each BasicBlock.
Here is the code. You can get it by checking out the containers branch of the llvm-pass-skeleton repository.
    errs() << "Function body:\n";
    F.dump();
    for (auto &B : F) {
      errs() << "Basic block:\n";
      B.dump();
      for (auto &I : B) {
        errs() << "Instruction: ";
        I.dump();
      }
    }
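C++11's auto type and range-based for syntax make it convenient to walk the hierarchy in LLVM IR.
If you rebuild the pass and run a program through it, you should see the various pieces of the IR printed out as we traverse them.
The real magic of LLVM shows up when you look for patterns in the program and selectively change them. Here is a simple example: replace the first binary operator in each function (such as + or -) with a multiply. Sounds useful, right?
Here is the code. This version, along with an example program to try it on, is in the mutate branch of the llvm-pass-skeleton repository.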
    for (auto &B : F) {
      for (auto &I : B) {
        if (auto *op = dyn_cast<BinaryOperator>(&I)) {
          // Insert at the point where the instruction `op` appears.
          IRBuilder<> builder(op);
          // Make a multiply with the same operands as `op`.
          Value *lhs = op->getOperand(0);
          Value *rhs = op->getOperand(1);
          Value *mul = builder.CreateMul(lhs, rhs);
          // Everywhere the old instruction was used as an operand, use our
          // new multiply instruction instead. replaceAllUsesWith rewrites
          // every use safely, even though the use list changes as it goes.
          op->replaceAllUsesWith(mul);
          // We modified the code.
          return true;
        }
      }
    }
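Details:
The dyn_cast<T>(p) construct is an LLVM-specific type introspection utility. It relies on some conventions of the LLVM codebase to make dynamic type tests efficient, because compilers use them all the time. Specifically, this construct returns a null pointer if I is not a BinaryOperator, so it is a good fit for special-casing like this.
Now let's compile a program like this one (example.c in the repository):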
    #include <stdio.h>
    int main(int argc, const char** argv) {
      int num;
      scanf("%i", &num);
      printf("%i\n", num + 2);
      return 0;
    }
Compiled with an ordinary compiler, this program does exactly what the code says; with our plugin, it doubles the input instead of adding 2.
    $ cc example.c
    $ ./a.out
    10
    12
    $ clang -Xclang -load -Xclang build/skeleton/libSkeletonPass.so example.c
    $ ./a.out
    10
    20
Like magic!
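The test-and-cast idiom used in the pass (if the cast yields a non-null pointer, handle the special case) can be imitated outside LLVM with standard C++. The sketch below uses dynamic_cast purely as a stand-in: LLVM's dyn_cast follows its own codebase conventions instead, and the Toy* types and mutate_first_binary_op are hypothetical names for illustration.

```cpp
#include <memory>
#include <vector>

// Toy instruction hierarchy; the virtual destructor makes dynamic_cast usable.
struct ToyInst {
    virtual ~ToyInst() = default;
};
struct ToyBinaryOp : ToyInst {
    char op;  // '+', '-', '*', ...
    explicit ToyBinaryOp(char o) : op(o) {}
};
struct ToyCall : ToyInst {};

// Turns the first binary operator in the block into a multiply.
// Returns true if the block was modified, like the pass above.
bool mutate_first_binary_op(std::vector<std::unique_ptr<ToyInst>>& block) {
    for (auto& inst : block) {
        // The cast yields nullptr for anything that is not a ToyBinaryOp,
        // so the body runs only for the instructions we care about.
        if (auto* bin = dynamic_cast<ToyBinaryOp*>(inst.get())) {
            bin->op = '*';
            return true;
        }
    }
    return false;
}
```

Declaring the pointer inside the if condition keeps it scoped to the branch where it is known to be non-null, which is why the idiom reads so compactly in the pass.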
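When you need to instrument code to do something nontrivial, generating the LLVM instructions with IRBuilder can get painful. Instead, you probably want to write your run-time behavior in C and link it with the program you are compiling. This section shows how to write a runtime library that logs the result of every binary operation rather than silently changing the values.
Here is the code for the LLVM pass. You can also find it in the rtlib branch of the llvm-pass-skeleton repository.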
    // Get the function to call from our runtime library.
    LLVMContext &Ctx = F.getContext();
    Constant *logFunc = F.getParent()->getOrInsertFunction(
      "logop", Type::getVoidTy(Ctx), Type::getInt32Ty(Ctx), NULL
    );
    for (auto &B : F) {
      for (auto &I : B) {
        if (auto *op = dyn_cast<BinaryOperator>(&I)) {
          // Insert *after* `op`.
          IRBuilder<> builder(op);
          builder.SetInsertPoint(&B, ++builder.GetInsertPoint());
          // Insert a call to our function.
          Value *args[] = {op};
          builder.CreateCall(logFunc, args);
          return true;
        }
      }
    }
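The tools you need are Module::getOrInsertFunction and IRBuilder::CreateCall. The former adds a declaration for your runtime function logop (analogous to declaring void logop(int i); in a C program without providing a body). The corresponding function body lives in the runtime library that defines logop (rtlib.c in the repository):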
    #include <stdio.h>
    void logop(int i) {
      printf("computed: %i\n", i);
    }
To run an instrumented program, link it with your runtime library:
    $ cc -c rtlib.c
    $ clang -Xclang -load -Xclang build/skeleton/libSkeletonPass.so -c example.c
    $ cc example.o rtlib.o
    $ ./a.out
    12
    computed: 14
    14
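If you like, you can also stitch the program and the runtime library together before compiling to machine code. The llvm-link tool, which you can think of as a rough IR-level equivalent of ld, can help with that.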
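It can help to picture what the instrumented program effectively does. The sketch below is a hand-written approximation, not output produced by the pass: add_two_instrumented is a hypothetical function that mirrors example.c's num + 2 with a call to logop placed right after the addition, and logop is defined inline here rather than in a separate rtlib.c.

```cpp
#include <cstdio>

// Plays the role of the runtime library's logop().
// extern "C" keeps the symbol name unmangled, matching a C runtime library.
extern "C" void logop(int i) {
    std::printf("computed: %i\n", i);
}

// Roughly what `num + 2` behaves like after instrumentation:
// the binary operation runs, then its result is passed to logop.
int add_two_instrumented(int num) {
    int sum = num + 2;
    logop(sum);
    return sum;
}
```

Calling add_two_instrumented(12) logs the computed value and returns 14, matching the session shown earlier where an input of 12 yields "computed: 14" followed by 14.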
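Most projects eventually need to interact with the programmer. You will want annotations: some way to pass extra information from the program to your LLVM pass. There are several ways to build an annotation system:
For example, calls such as __enable_instrumentation() and __disable_instrumentation() let the program confine your code rewriting to specific regions.
The __attribute__((annotate("foo"))) syntax emits metadata with an arbitrary string that you can process in your pass. Brandon Holt (him again) has an article covering the background of this technique. If you want to mark expressions rather than declarations, the undocumented and unfortunately limited __builtin_annotation(e, "foo") intrinsic might work.
I hope to expand on these techniques in a future post.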
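Here is one way the marker-function idea might look on the program side, as a sketch under assumptions. The two marker names come from the text above; everything else (empty bodies in the same file, the sum_of_squares function, and a pass that recognizes the calls by name) is assumed for illustration, and the pass side is not shown.

```cpp
// Marker functions: they do nothing at run time. A pass could look for
// calls to these names and only rewrite the code between them.
// extern "C" keeps the names unmangled so they are easy to match.
extern "C" void __enable_instrumentation() {}
extern "C" void __disable_instrumentation() {}

// Hypothetical program code that opts one region into instrumentation.
int sum_of_squares(int n) {
    __enable_instrumentation();   // region of interest starts here
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i * i;
    }
    __disable_instrumentation();  // region of interest ends here
    return total;
}
```

Since the markers are ordinary calls, the program still compiles and behaves the same with any compiler. In a real setup you would likely only declare them in a header, so that an optimizer cannot inline the empty bodies away before your pass sees the calls.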
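LLVM is enormous. Here are some topics I did not cover:
I hope I have given you enough background to build something great. Go explore and build! And if this article helped you, please let me know.
Thanks to the architecture and systems groups at UW, who sat through this material and asked many excellent questions.
Thanks also to the following readers: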
Original: http://adriansampson.net/blog/llvm.html, by Adrian Sampson
Translation: http://geek.csdn.net/news/detail/37785, translated by 张洵恺