Last week, I attended a workshop on GCC Internals held at IIT Bombay. It was a five-day workshop, from 29th June to 3rd July.
Amit Shah, Gopal Tiwari and I went from Red Hat. We booked a cab and left for Mumbai on Friday, 28th June, in the afternoon. The journey from Pune to Mumbai was awesome, with nice scenery and rain on the way. We reached IIT Bombay in the evening and had a guest house booked for us. The campus was really big, great and full of nature: trees, leopards, snakes, oh and even pythons :P
We were supposed to do some setup on our laptops before attending the workshop, as mentioned at http://www.cse.iitb.ac.in/grc/gcc-workshop-13/index.php?page=laptop .
I woke up early in the morning and, after having breakfast, reached the venue, the FC Kohli auditorium. With around 40 participants, the workshop started with a short introduction by Prof. Uday Khedker to the motivation, objectives and goals behind it. Then all the participants and TAs (Teaching Assistants) introduced themselves. It was good to hear from different people. Participants came from different backgrounds: teachers, students and the corporate world.
Now the actual workshop started. Participants were paired into groups of two for the lab assignments. Each topic was first covered in a presentation, followed by a lab session.
It started with An Overview of Compilation and GCC, in which Prof. Khedker discussed the structure of a simple compiler, the GNU tool chain, and how the parse tree and abstract syntax tree are created for a program.
After this we learnt about Gray Box Probing of GCC, i.e. examining the input and output of various components and modules of GCC. We learnt about the basic transformations GCC performs from one language to another.
We learnt how data types, functions, pointers, arrays, classes, virtual/non-virtual functions, inheritance, etc. are represented in GIMPLE. Another interesting thing was the optimizations possible on GIMPLE, like constant propagation, copy propagation, loop unrolling and dead code elimination. Many optimizations can be applied to a program (check out gcc -c --help=optimizers).
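To make three of these optimizations a bit more concrete, here is a minimal sketch in C. It is only an illustration and not taken from the workshop material; the file name opt_demo.c and all function names are made up. Each function is written the "naive" way, and the comment describes what the optimizer is allowed to do with it.

```c
/* opt_demo.c: hypothetical file name, illustration only. */

/* Constant propagation: x is known to be 4, so x * 2 can be
 * replaced by 8 and the whole function can reduce to "return 9". */
int constant_propagation(void)
{
    int x = 4;
    int y = x * 2;
    return y + 1;
}

/* Copy propagation: b is only a copy of a, so every use of b
 * can be replaced by a, after which b itself is no longer needed. */
int copy_propagation(int a)
{
    int b = a;
    int c = b + b;
    return c;
}

/* Dead code elimination: unused is computed but never read,
 * so the multiplication can be removed without changing the result. */
int dead_code_elimination(int a)
{
    int unused = a * 100;
    return a + 1;
}
```

Compiling this with something like gcc -c -O2 -fdump-tree-optimized opt_demo.c should leave a GIMPLE dump next to the source (the exact dump file name differs between GCC versions), where the simplified function bodies can be compared with the originals.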
Here, I came to know that there are a total of 163 GIMPLE and 89 RTL passes in GCC 4.7.2, among which 215 are unique passes. It is really amazing to learn what GCC does internally when I compile my simple “Hello World” program.
Post lunch there were more presentations on GIMPLE IR. After that, we did the assignments on Gray Box Probing of GCC and on GIMPLE.
There was a presentation on GCC Control Flow and Plugins. We learnt about
- Plugins, i.e. modules that can be added, removed and maintained independently in GCC
- Both static and dynamic plugins are possible in GCC
- How control flow happens in intraprocedural and interprocedural passes
- Link Time Optimization (LTO), which enables interprocedural optimizations across different files (use the -flto option during compilation)
- Types of LTO possible, i.e. partitioned and non-partitioned
After this we did a lab assignment on Link Time Optimization, which involved compiling a simple C program using -flto with and without partition mode and analyzing the different dumps produced.
Post lunch there was a presentation on GCC configuration and building. Here, we learnt about
- Code organization of GCC
- GCC build dependencies
- Configuring and building GCC
- Machine description
- Building GCC as a cross compiler
There was a presentation on Register Transfer Language (RTL). Here we learnt about
- Why GCC uses RTL, e.g. it supports low-level optimizations, instruction scheduling, register allocation, etc.
- How to analyze the RTL dumps generated during RTL passes
- Internal view of RTL, i.e. the types of RTL objects and the internal representation of RTL expressions
- Manipulating RTL Intermediate Representation
Then there were presentations on Machine Descriptions, the SPIM Machine Description and the Retargetability Model of GCC. Here, we learnt about
- Organization of the Machine Description (MD) in GCC and the essential constructs of MD
- Cross building GCC for SPIM
- SPIM machine description and changes in MD
- How basic operations like move, add, compare, procedure call, etc. are performed in SPIM
- How retargetability is achieved in GCC
After the presentation, we did a lab assignment on the SPIM machine description, which included building GCC for SPIM, adding a machine description for the left shift operator and analyzing the assembly instruction generated for it.
There was a presentation on Parallelization and Vectorization in GCC by Prof. Amitabha Sanyal. He taught about
- Parallelism, i.e. executing the same operation on multiple operands, and vectorization, i.e. performing N operations together using vector registers
- Opportunities for parallelization and vectorization inside a program, e.g. instructions inside for loops
- Data dependence, i.e. anti dependence (Write after Read), flow dependence (Read after Write) and output dependence (Write after Write)
- Transformations that enhance vectorization and parallelization, like loop interchange, loop distribution, loop fusion, etc.
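The three kinds of data dependence and one of the transformations (loop interchange) can be sketched with a few small C loops. This is only an illustration under my own assumptions, not code from the workshop: the array size N, the matrix dimensions and all function names are made up.

```c
#include <stddef.h>

#define N 8      /* hypothetical array length */
#define ROWS 4   /* hypothetical matrix dimensions */
#define COLS 4

/* Flow dependence (Read after Write): iteration i reads a[i - 1],
 * which iteration i - 1 has just written, so the order matters. */
void flow_dependence(int a[N])
{
    for (size_t i = 1; i < N; i++)
        a[i] = a[i - 1] + 1;
}

/* Anti dependence (Write after Read): iteration i reads a[i + 1],
 * which iteration i + 1 overwrites afterwards. */
void anti_dependence(int a[N])
{
    for (size_t i = 0; i + 1 < N; i++)
        a[i] = a[i + 1] + 1;
}

/* Output dependence (Write after Write): every iteration writes the
 * same location, and the final value must come from the last one. */
void output_dependence(const int a[N], int *last)
{
    for (size_t i = 0; i < N; i++)
        *last = a[i];
}

/* No loop-carried dependence: each iteration touches only index i,
 * so this loop is a natural candidate for vectorization. */
void independent(const int a[N], const int b[N], int c[N])
{
    for (size_t i = 0; i < N; i++)
        c[i] = a[i] + b[i];
}

/* Before loop interchange: the inner loop walks down a column,
 * jumping COLS elements in memory on every step. */
long sum_before_interchange(int m[ROWS][COLS])
{
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += m[i][j];
    return sum;
}

/* After loop interchange: the inner loop walks along a row, so
 * consecutive iterations access consecutive memory locations. */
long sum_after_interchange(int m[ROWS][COLS])
{
    long sum = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += m[i][j];
    return sum;
}
```

Both sum functions return the same result; only the order in which memory is visited changes. On recent GCC releases, compiling with something like gcc -O3 -fopt-info-vec -c should report which of these loops were actually vectorized.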
This was the last day of the workshop, which started with a summary presentation of what we had learnt in the previous 4 days. Then there was a feedback session where a few people gave feedback on the workshop.
In my opinion the workshop was very good overall; everything went well from start to end. The Teaching Assistants, along with Prof. Khedker, played the main role in the success of this workshop. They helped participants with whatever was needed, like problems with lab assignments, arranging laptops for participants who didn’t have one, uploading assignments to the website on time and other related stuff.
Thanks to Red Hat for sponsoring this workshop and travel. I am happy and confident that I know much more about what GCC does internally than I did earlier.
Here are a few pictures I took: http://www.flickr.com/photos/49657487@N07/sets/72157634512542865/