Last Sunday, Bastian arrived in Dresden for a code sprint to implement compilation of Fluid templates to PHP code. This post first explains how we worked and then shows what matters most: the results.
On the first evening, we mainly set up our environment and created a reliable profiling setup, next to spending a nice evening in Dresden, of course.
On Monday morning, we used our newly created benchmark framework to take some baseline measurements, which you can see below. Our performance measurement microframework is based on XHProf and adds a small GUI on top of it, which shows the different runs and optimizations at a single glance.
We mainly tested the following setups:
- many objects
- many forms
- high nesting level
- many partials
After the baseline measurements, we started implementing the compilation phase, and after a few hours we succeeded in rendering our first compiled template! From then on, we were able to work iteratively: checking why a particular rendering is slow, identifying the bottlenecks, and testing whether our improvements helped.
Numbers count most, and that’s why we want to share them with you. All numbers were measured on my MacBook Pro with a 2.4 GHz Intel Core 2 Duo, 4 GB RAM and a normal hard disk. Feel free to test Fluid yourself and run your own benchmarks.
The columns of the result table mean the following:
- Instantiations: number of object creations
- Runtime: complete rendering time in seconds
- Memory: peak memory usage in MB
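To make the three columns concrete, here is a minimal sketch of how such metrics could be collected for one rendering run. It is written in Python purely for illustration and is not our XHProf-based microframework; the `Node` class, the `render` function and the `measure` helper are hypothetical names invented for this example.

```python
import time
import tracemalloc


class Node:
    """Hypothetical stand-in for any object the renderer creates."""

    created = 0  # class-wide instantiation counter

    def __init__(self, value):
        Node.created += 1
        self.value = value


def render(items):
    # Hypothetical renderer: wraps every item in a Node, then joins the output.
    return "".join(str(Node(item).value) for item in items)


def measure(func, *args):
    """Run func once and return (result, metrics) with the three columns."""
    Node.created = 0
    tracemalloc.start()
    started = time.perf_counter()
    result = func(*args)
    runtime = time.perf_counter() - started
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    metrics = {
        "instantiations": Node.created,           # number of object creations
        "runtime_s": runtime,                     # rendering time in seconds
        "memory_mb": peak_bytes / (1024 * 1024),  # peak traced memory in MB
    }
    return result, metrics


output, metrics = measure(render, range(500))
print(metrics)
```

Runtime and memory vary from machine to machine; only the instantiation count is deterministic for a given template and data set.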
The rows mean the following:
- Before: before the optimization
- BuildCache: the first run, when the template is not yet cached and a cached representation is built.
- Cached: all runs after the first one, which are served from the cached template.
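The difference between the BuildCache and Cached rows can be sketched as follows. This is a simplified Python illustration, not Fluid's actual cache: the in-memory dictionary, the `{name}` placeholder syntax and the `compile_template` helper are assumptions made for the example.

```python
compiled_cache = {}    # template source -> compiled render function
stats = {"builds": 0}  # how often a template had to be compiled


def compile_template(source):
    # Hypothetical "compilation": split around the {name} placeholder once,
    # so later renderings only need to concatenate strings.
    stats["builds"] += 1
    before, _, after = source.partition("{name}")
    return lambda name: before + name + after


def render(source, name):
    renderer = compiled_cache.get(source)
    if renderer is None:
        # BuildCache run: no cached representation yet, so build and store one.
        renderer = compile_template(source)
        compiled_cache[source] = renderer
    # Cached runs skip the branch above and reuse the stored renderer.
    return renderer(name)


first = render("Hello {name}!", "Bastian")   # builds the cache
second = render("Hello {name}!", "Dresden")  # served from the cache
print(first, second, stats)
```

The first call pays for both the compilation and the rendering, which is why the BuildCache row should be read separately from the Cached row.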
Testing setup: <f:for> loop over 500 objects.
Testing setup: <f:for> loop over 5000 objects.
recursive list – nesting level 6
Testing setup: six nested <f:for> loops, yielding around 5 500 object accesses
recursive list – nesting level 7
Testing setup: seven nested <f:for> loops, yielding around 22 000 object accesses
Testing setup: rendering a single partial 1000 times
Improving Performance Based On Clean Design
The above performance improvements are absolutely transparent to the developer or the template designer; everybody using Fluid will benefit from these improvements.
Well, that’s not 100% true: there is one single change that is not backwards-compatible. However, the affected code was not even part of the public API, and we believe that the performance improvements are absolutely worth it!
We designed Fluid in a clean and object-oriented way, mainly by using an object-based syntax tree as the internal representation of a template. This is what made the optimizations described here possible without changing the public API or the behavior of the parser.
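To illustrate why an object-based syntax tree lends itself to compilation, here is a small sketch in which every node can either be evaluated directly or emit source code that does the same work. It uses Python instead of PHP and is not Fluid's implementation; the node classes `Text`, `Variable` and `ForLoop` and the `compile_template` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Text:
    content: str

    def evaluate(self, variables):
        return self.content

    def to_source(self):
        return repr(self.content)


@dataclass
class Variable:
    name: str

    def evaluate(self, variables):
        return str(variables[self.name])

    def to_source(self):
        return f"str(variables[{self.name!r}])"


@dataclass
class ForLoop:
    each: str       # name of the list variable to iterate over
    alias: str      # name the current item is bound to
    children: list  # child nodes rendered once per item

    def evaluate(self, variables):
        return "".join(
            child.evaluate({**variables, self.alias: item})
            for item in variables[self.each]
            for child in self.children
        )

    def to_source(self):
        body = " + ".join(c.to_source() for c in self.children) or "''"
        # The lambda gives the loop body its own `variables` with the alias set.
        return (
            f"''.join((lambda variables: {body})"
            f"({{**variables, {self.alias!r}: item}}) "
            f"for item in variables[{self.each!r}])"
        )


def compile_template(nodes):
    """Turn the syntax tree into source code once and return a plain function."""
    expression = " + ".join(node.to_source() for node in nodes) or "''"
    source = f"def render(variables):\n    return {expression}\n"
    namespace = {}
    exec(source, namespace)
    return namespace["render"], source


tree = [
    Text("<ul>"),
    ForLoop("objects", "object", [Text("<li>"), Variable("object"), Text("</li>")]),
    Text("</ul>"),
]
data = {"objects": ["a", "b"]}

interpreted = "".join(node.evaluate(data) for node in tree)
render, source = compile_template(tree)
compiled = render(data)
print(source)
print(compiled)
```

Both paths produce the same output, but the compiled function no longer needs the node objects at render time. Because the tree is an internal detail, such a step can be added without touching the template syntax.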
This shows very nicely that performance optimizations based on a clean architecture and design are perfectly possible – and we’ll definitely do such a performance code sprint for FLOW3 at some point.
Pair programming in Dresden with Bastian was awesome, and at some point we will definitely repeat it. Aside from programming, we also had a nice time in Dresden: we went climbing in Saxon Switzerland, relaxed, discussed and had fun.
The whole optimization work has been financed by AOE Media, so we’d like to thank them very much for their generosity and support! I cannot express in words how much your sponsoring helped. It really made these optimizations possible.
I also hope that some other agencies follow your lead, financing other core developers when they need improvements on various parts of the core.
Thanks for Reading