Using Nginx to Migrate Legacy Systems

tl;dr: Using Nginx to migrate legacy systems is very powerful. By clever combinations of rewrite rules and reverse proxies, one can implement seamless transitions!

We demonstrate the above by replacing an existing CMS with Neos CMS, but similar ideas can be applied to many kinds of systems.

The Problem - Adopting a new CMS incrementally

For one of our customers, it became evident that they could not implement their requirements with their previous CMS – and after we showed them Neos CMS, it quickly became apparent that they would profit a lot from its functionality, especially flexible content modelling, redirect management, and SEO features.

However, the customer had quite a lot of content in the old CMS, as well as special functionality like a shopping basket and various kinds of forms – so we concluded that replacing everything in a big bang would be too expensive.

For us, the core question was: How can we unlock the potential of Neos CMS for the customer, in a very cost-effective way?

On the plus side, the customer had recently adopted a new design which they were very happy with, so we would not need to touch any of that.

Solution Idea: Deliver Neos Content first, then fall back to the old CMS

At Neos Conference 2018, I attended a talk by Sebastian Heuer and Arne Blankerts about escaping legacy hell, in which they introduced the pattern of cleverly configured Nginx reverse proxies. Basically, the idea is to ask the new system first; when the new system returns a 404, intercept that at the Nginx layer and ask the old system for the same URL instead.

This way, we were able to re-build the template in Neos and migrate incrementally: we took a page whose structure did not fit the old system, and re-built that page in the new one.

Nginx Configuration to migrate legacy systems

Let's zoom in a bit on the required Nginx configuration - which, at the core, boils down to the following settings:

# default for Neos CMS; redirect all pages which do not exist on disk to index.php
location / {
    try_files $uri $uri/ /index.php?$args;
}

# default config - passing PHP calls to PHP-FPM via FastCGI
location ~* \.php$ {
    fastcgi_pass unix:/tmp/php7-fpm.sock;
    # ... quite some more fastcgi parameters here

    # when PHP delivers an HTTP error, do not send this to the end-user directly,
    # but instead evaluate the error_page directive.
    fastcgi_intercept_errors on;

    # redirect to the named @legacy location in case of a 404 from the new system.
    error_page 404 = @legacy;

    # also allow this if it happens multiple levels deep.
    recursive_error_pages on;
}

# definition for the legacy backend
location @legacy {
    # ... (see below)
}

Now, I had some trouble defining the legacy backend correctly: with my first attempts at defining proxy_pass, the URL /index.php was always requested on the legacy system. A bit of debugging and reading the docs revealed what was going on:

  1. The try_files directive works like an internal redirect, effectively rewriting Nginx-internal variables such as $uri to /index.php.
  2. Then, the PHP location block is executed – and when PHP returned a 404 error, the current request (which was by now a request to /index.php) was forwarded to the legacy system.
  3. The legacy system (of course) could not deal with requests to /index.php and returned an error.
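To make this internal rewriting visible while debugging, one can temporarily expose both URIs as response headers – a minimal sketch, where the X-Debug-* header names are ones I made up for illustration:

```nginx
location ~* \.php$ {
    # expose the Nginx-internal URI (rewritten by try_files) and the original
    # request URI, so the internal redirect can be observed with curl -I.
    # X-Debug-* are hypothetical header names, for debugging only.
    add_header X-Debug-Internal-Uri $uri always;
    add_header X-Debug-Request-Uri $request_uri always;

    # ... the fastcgi configuration from above
}
```

For a request to /some-page that does not exist on disk, the first header shows /index.php while the second still shows /some-page – exactly the discrepancy described above.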

So, the question was: How can we use the original URL (before starting to rewrite) and pass that to the legacy upstream system?

The solution is quite easy, but it took a while to figure out. The core idea is to use variables inside proxy_pass to build up the target URL dynamically:

location @legacy {
    # we use the ORIGINAL request URI (including all arguments) here.
    # To quote the docs: when variables are used in proxy_pass, the URL
    # is passed to the server as is, replacing the original request URI.
    # (legacy.example.com is a placeholder for the real legacy host.)
    proxy_pass https://legacy.example.com$request_uri;

    # we need to define a resolver, as we use variables and a full domain name in proxy_pass
    # (the resolver address below is a placeholder).
    resolver 8.8.8.8;

    # for the dynamic parts to work, we need to rewrite Location headers to be relative again,
    # because we use a dynamic $request_uri in proxy_pass - and thus the default rewriting
    # of proxy_redirect does not happen anymore.
    proxy_redirect https://legacy.example.com/ /;
    proxy_redirect http://legacy.example.com/ /;
}

Add a Fast-Path for static assets

In our setup above, dynamic pages are handled by PHP-FPM, and all static files (like images, CSS, ...) are delivered directly by the Nginx instance which also does the proxying. Thus, we want to avoid asking our new backend for every static asset, only to get a 404 back, and then asking the legacy system (which has the asset).

To fix this, a rather simple additional rule is needed for handling static file types directly:

location ~* \.(?:css|png|js|woff|woff2|svg|jpg|jpeg|ttf|pdf)$ {
    # if these static files do not exist on disk, directly ask the legacy backend.
    error_page 404 = @legacy;
}
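As a side note, the same fallback for static files can also be expressed with try_files jumping to the named location directly – a sketch of an equivalent alternative:

```nginx
location ~* \.(?:css|png|js|woff|woff2|svg|jpg|jpeg|ttf|pdf)$ {
    # serve the file from disk if it exists; otherwise jump straight
    # to the @legacy named location, without a detour via error_page.
    try_files $uri @legacy;
}
```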

Adding Caching

Finally, to further improve response times, we added a simple cache which caches all legacy-upstream resources for 10 minutes, by using the rules below:

proxy_cache_path /tmp/nginx-cache levels=1:2 keys_zone=legacy:50m inactive=24h max_size=1g;

location @legacy {
    # ... the config from above, and additionally the following lines:
    proxy_cache legacy;
    proxy_cache_background_update on;
    proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
    proxy_cache_valid 200 302 10m;
}
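To verify that the cache actually kicks in, it can help to expose the cache status in a response header – a sketch, where X-Cache-Status is a header name chosen for illustration:

```nginx
location @legacy {
    # $upstream_cache_status is e.g. HIT, MISS, EXPIRED, STALE, UPDATING or BYPASS;
    # exposing it makes the 10-minute caching behaviour observable via curl -I.
    add_header X-Cache-Status $upstream_cache_status always;

    # ... the proxy_cache configuration from above
}
```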


In this post, we have shown how Nginx can be used to direct requests dynamically between a new and a legacy system – which can be a powerful and cost-effective way to migrate away from a legacy system. Let's finish this post with some properties of this approach, to ease the decision whether it is worth it.

  • This approach (for websites) can only be used if the old and the new system share the same visual look and feel, i.e. the HTML markup can be re-used.
  • It works best for systems which are well-segmented, so there are not lots of interconnections between the "old" and the "new" system. It is a lot harder if both systems need to access the same underlying data store.
  • The approach does not contain a built-in way to access shared data; i.e. a page is always rendered purely in the new or purely in the old system. This makes it a bit harder to deal with global data like a search index, menus, or sitemap.xml (which have to be taken care of differently).
  • The approach works best for read-heavy systems.
  • This approach allows using many powerful Neos features, like redirect management, also for content in the old system (as redirects are returned first, from the new system).
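For global resources like sitemap.xml mentioned above, one option is to pin them explicitly to one of the two systems with an exact-match location – a sketch under the assumption that the legacy system should own the sitemap (legacy.example.com is a placeholder):

```nginx
# always serve the sitemap from the legacy system, bypassing the fallback logic.
location = /sitemap.xml {
    proxy_pass https://legacy.example.com/sitemap.xml;
}
```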

Looking forward to your feedback on Twitter (@skurfuerst)!
All the best,