Essix Reloaded – part 1: test setup

Introduction

You might have missed that new kid on the block, Essix (pronounced: S6), an essential simple secure stable scalable stateless server for HTTP resources. It’s my take on a just-what-you-need approach to the things-you-need-in-every-project; a resilient barebone backbone for whatever you find is still missing on the web. Maybe you missed it because it was lacking some rubber-hits-the-road existential litmus testing. Then read on to see how that got fixed.

The test case presented itself to me in the form of a petition running in the Netherlands about all the damage and suffering caused by decades of greedy drilling for natural gas in one relatively peripheral part of the country. Each time the subject was brought up in one of the nightly talk shows on TV, some 10 to 20 thousand more people signed the petition, but each time, that was despite their website crashing severely under the sudden load. So I rebuilt that petition page in Essix, put it on the rack, and tightened the thumbscrews.

In this first part, I describe:

  • how the test site was built
  • problems encountered
  • decisions taken, and their motivation
  • features added to the underlying packages
  • the composition of the actual load test
  • how to pull up a computing environment to run the tests

A second part will follow, describing the test results.

Test setup

To set up the test site, I installed Essix,

$ go get -u github.com/wscherphof/essix

initialised a new app,

$ essix init github.com/wscherphof/petities

took the HTML from petities.nl, stripped it down to its bare bones, solidified it in a template, defined some messages, designed a data model, defined some routes & their handlers, and styled the thing up.

Since this project is also arguably the first serious test drive of Essix altogether, I ran into some things that are worth mentioning here:

Backend

Multi-language entity fields

Though multi-language “messages” were an integral part of Essix already, using the message type for fields in the entity data model is new, e.g.:

groningen := model.InitPetition("groningen")
groningen.Caption = msg.New().
 Set("nl", "Laat Groningen niet zakken").
 Set("en", "Don’t let Groningen down")

Counting

The representation of a specific petition includes a statement on the current number of signatures for that petition. To keep that statement fresh, we could update the signature count for a petition with each new signature. Then again, under a high signing load, that counter will be hot like hell. If there must be one main performance bottleneck, it really shouldn’t be this trivial counter thing.

One part of the fix is to only update an in-memory variable on each signature’s confirmation, and to spin off a parallel goroutine that persists the new counts in the entity model at a regular interval. Since that will lead to inconsistencies when stopping application service instances, there’s an accompanying synchronisation resource, which enables scheduling (e.g. nightly) of the relatively costly operation of actually counting all confirmed signatures for each petition.
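To make the idea concrete, here is a minimal sketch of such an in-memory counter with a periodic flush. It is not the actual Essix or petities code: the Counter type, its methods, and the persist callback are hypothetical names, and the sketch assumes that increments (rather than absolute counts) are handed to the database, so that several service instances could flush side by side.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Counter collects signature confirmations in memory. All names in this
// sketch are hypothetical; none of them are taken from Essix.
type Counter struct {
	mu      sync.Mutex
	pending map[string]int // petition ID -> confirmations not yet persisted
}

func NewCounter() *Counter {
	return &Counter{pending: map[string]int{}}
}

// Add registers one confirmed signature. It only touches memory.
func (c *Counter) Add(petition string) {
	c.mu.Lock()
	c.pending[petition]++
	c.mu.Unlock()
}

// Flush hands all pending increments to persist and resets them.
func (c *Counter) Flush(persist func(petition string, delta int)) {
	c.mu.Lock()
	batch := c.pending
	c.pending = map[string]int{}
	c.mu.Unlock()
	for petition, delta := range batch {
		persist(petition, delta)
	}
}

// Run flushes at every tick, and once more when stop is closed.
func (c *Counter) Run(interval time.Duration, stop <-chan struct{}, persist func(string, int)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			c.Flush(persist)
		case <-stop:
			c.Flush(persist)
			return
		}
	}
}

func main() {
	c := NewCounter()
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		c.Run(10*time.Millisecond, stop, func(petition string, delta int) {
			// A real persist function would update the petition record.
			fmt.Printf("persist %s +%d\n", petition, delta)
		})
	}()
	for i := 0; i < 1000; i++ {
		c.Add("groningen")
	}
	close(stop)
	<-done
}
```

Whatever is still pending when an instance is killed without a final flush is lost, which is exactly the kind of drift the scheduled recount is there to repair.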

Multiple tables for same entity type

Another fix for the counting problem is the new ability to use different database tables for one and the same entity type; in this case: one separate table of signatures for each specific petition record. The signatures table is registered when a new petition record is saved.

Confirmation emails

Gotcha! 😀 This is all copy & paste from basic Essix 😀 👊 💫

Frontend

Progressive enhancement

Did anyone mention it had to be resilient? The core functionality stands in just plain HTML + HTTP:

[Screenshot: Look mama, no styles!]

Admittedly, with support for the CSS style sheet, things improve considerably. Even without a single line of JavaScript:

[Screenshot: Look mama, no JavaScript!]

There is some JavaScript involved though, mainly about the signature form. The tricky thing there is that the signature form isn’t really part of the petition resource.

Resource-in-a-resource

A GET request to the /signature resource renders the form that can POST the data to sign a petition. That same form is shown as part of the representation of a /petition resource, as seen above. One can of course cheat a bit, and just include the signature form as an integral part of the petition page. A major downside then is that submitting the form will result in a refresh of the entire page.

Luckily, HTML provides a nice solution for this out of the box: with the iframe element, we position an inline frame on the /petition page, which issues a separate request for the /signature resource, and shows the resulting form. When the form is posted, the /petition page remains untouched, while the iframe renders the /signature response:

[Screenshot: iframe on /petition page, showing POST /signature response]

As a fallback for (ancient) browsers that don’t support the iframe element, the signature form is included inline. In that case, if JavaScript is available, we prevent the full page refresh by intercepting the form submit, sending the POST request through Ajax, plucking the body contents out of the response, and putting that in the place of the form element.

Note that the wildly popular Ajax solution is not the first thing we turn to, but some kind of a last resort – we won’t depend on JavaScript if we don’t have to.

This whole package is sitting on the /petition template. There’s also some dynamic resizing of the iframe’s height going on, to enable its sticky positioning.

CSS prefixes

In the stylesheet, we only declare standard CSS properties and values. On build, Essix adds any needed vendor prefixes using autoprefixer.

Remote debugging

To see what works and what doesn’t on mobile, I use Xcode’s Simulator + Safari Web Inspector for iOS:

[Screenshot: iOS Simulator]

And Android Studio’s Emulator + Chrome DevTools for Android:

[Screenshot: Android Emulator]

Load test

One or two extra things need to be taken care of, before the application is prepared for testing:

Provisioning

The /provision resource is used to initially populate the database with the petition and a considerable number of signatures, in a reasonable amount of time. Before each test run, we hit /provision to delete the signatures from the previous run.

Confirmation

On a successful submit of the signature form, the signature is saved in the database, and an email is sent, asking to confirm the signature. For a successful confirmation, a token value is needed, which is part of the link in the email, and tested against the token saved with the signature in the database. This is to invalidate signatures submitted with other people’s email addresses.

The load test should include confirmation to be comparable with a real-world load, but it won’t try to manage several thousand email accounts to receive the confirmation tokens. Instead, just for the purpose of load testing, the confirmation token is returned in the signature response, but only if the GO_ENV environment variable is set to “test”.

Though the emails from the test-generated signatures aren’t read, the server will send the messages out to the mail server, so the work the server does for each signature in test isn’t any less than in production.
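The confirmation mechanics described above could look roughly like the sketch below. The function names are made up for illustration and are not the petities code; the sketch assumes a random 128-bit token and a constant-time comparison, neither of which is confirmed by the actual implementation.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"os"
)

// newToken returns a random 128-bit token, hex encoded. It would be saved
// with the signature and included in the confirmation link.
func newToken() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// tokenMatches compares the stored and the supplied token in constant time.
func tokenMatches(stored, supplied string) bool {
	return subtle.ConstantTimeCompare([]byte(stored), []byte(supplied)) == 1
}

// exposeToken reports whether the token may be echoed in the signature
// response. Only the load test environment gets to see it.
func exposeToken(goEnv string) bool {
	return goEnv == "test"
}

func main() {
	token, err := newToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("own token confirms:", tokenMatches(token, token))
	fmt.Println("guessed token confirms:", tokenMatches(token, "guess"))
	fmt.Println("token echoed in response:", exposeToken(os.Getenv("GO_ENV")))
}
```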

Rate limit

In real life, we’d need to set a rate limit on posting a signature, to prevent bots from loading our database with bogus data. Luckily, in Essix, that’s just a matter of passing the handler function to ratelimit.Handle(). For the load test, the rate limit is bypassed by setting the RATELIMIT environment variable, which normally sets the default timeout in seconds, to “0”.
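For readers who haven’t seen the inside of a rate limiter, the following is a generic sketch of the principle, with a window of 0 seconds switching the limit off. It is emphatically not the implementation behind ratelimit.Handle(): the Limiter type, the per-client key, and the fallback of 60 seconds are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"sync"
	"time"
)

// Limiter allows one request per client per window. Hypothetical type,
// only meant to illustrate the principle.
type Limiter struct {
	mu     sync.Mutex
	window time.Duration
	last   map[string]time.Time // client key -> time of last allowed request
}

// NewLimiter parses the window from a number of seconds; "0" disables it.
func NewLimiter(seconds string) (*Limiter, error) {
	n, err := strconv.Atoi(seconds)
	if err != nil || n < 0 {
		return nil, fmt.Errorf("invalid rate limit %q", seconds)
	}
	return &Limiter{
		window: time.Duration(n) * time.Second,
		last:   map[string]time.Time{},
	}, nil
}

// Allow reports whether the client identified by key may proceed now.
func (l *Limiter) Allow(key string) bool {
	if l.window == 0 {
		return true // limit bypassed, as in the load test
	}
	now := time.Now()
	l.mu.Lock()
	defer l.mu.Unlock()
	if prev, ok := l.last[key]; ok && now.Sub(prev) < l.window {
		return false
	}
	l.last[key] = now
	return true
}

func main() {
	value := os.Getenv("RATELIMIT")
	if value == "" {
		value = "60" // hypothetical default for this sketch
	}
	limiter, err := NewLimiter(value)
	if err != nil {
		panic(err)
	}
	fmt.Println("first request allowed:", limiter.Allow("client-1"))
	fmt.Println("second request allowed:", limiter.Allow("client-1"))
}
```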

Test definition

The aim is to see how many new signatures we can support without service interruption, and what configuration supports the highest load. To sign a petition, a user would: load the petition page, fill in the signature form, check their email, click the link to load the confirmation form, and submit their confirmation. The test should run many parallel request sequences of:

  1. GET /petition
  2. GET /signature
  3. POST /signature
  4. GET /signature/confirm
  5. PUT /signature/confirm

This setup is configured in an Apache JMeter test file.
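The actual test runs in JMeter, but the shape of the request sequence is easy to express in a few lines of Go as well. The sketch below is only an illustration: it runs the five steps against a local stub server that answers 200 to everything, and it leaves out form data, form tokens, and think times. The step type and both function names are made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// step is one request in the signing scenario.
type step struct{ method, path string }

var sequence = []step{
	{http.MethodGet, "/petition"},
	{http.MethodGet, "/signature"},
	{http.MethodPost, "/signature"},
	{http.MethodGet, "/signature/confirm"},
	{http.MethodPut, "/signature/confirm"},
}

// runSequence issues the steps in order, as one simulated user would.
func runSequence(client *http.Client, baseURL string) error {
	for _, s := range sequence {
		req, err := http.NewRequest(s.method, baseURL+s.path, nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode >= 400 {
			return fmt.Errorf("%s %s: %s", s.method, s.path, resp.Status)
		}
	}
	return nil
}

// runAgainstStub runs the sequence for a number of parallel users against
// a local stub server, and returns the number of requests it served.
func runAgainstStub(users int) (int, error) {
	var mu sync.Mutex
	served := 0
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		served++
		mu.Unlock()
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	var wg sync.WaitGroup
	errs := make(chan error, users)
	for i := 0; i < users; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			errs <- runSequence(srv.Client(), srv.URL)
		}()
	}
	wg.Wait()
	close(errs)

	mu.Lock()
	defer mu.Unlock()
	for err := range errs {
		if err != nil {
			return served, err
		}
	}
	return served, nil
}

func main() {
	requests, err := runAgainstStub(10)
	if err != nil {
		panic(err)
	}
	fmt.Println("requests served:", requests)
}
```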

Think times

For a realistic scenario, variable delays are added in the test script:

  • 5 – 45 seconds between seeing the petition form & submitting the signature
  • 15 seconds – 1 minute between submitting the signature & navigating to the confirmation form
  • 1 – 5 seconds between seeing the confirmation form & submitting it

Though we can disable the delays to put the server under a constant load, which might seem like a way to “really see what it can do”, it arguably doesn’t bring any actual insights, since such a scenario will never occur in practice. It might make sense to play a bit with the limits of the various delays, and maybe also cater for the scenarios where people do visit the petition page, but don’t sign it, or people do sign the petition, but don’t confirm their signature. All that is for maybe later; I’ve only tested with confirmed signatures, and with the delays set as above.
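As an illustration of what such variable delays amount to, here is a small sketch that picks a think time from each of the three ranges listed above. It assumes a uniform distribution within each range, which may or may not be what the JMeter timers in the test file use; the thinkRange type and the variable names are made up.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// thinkRange is a closed interval of possible delays.
type thinkRange struct{ min, max time.Duration }

// The three ranges from the test script.
var (
	beforeSigning    = thinkRange{5 * time.Second, 45 * time.Second}
	beforeConfirmGet = thinkRange{15 * time.Second, time.Minute}
	beforeConfirmPut = thinkRange{1 * time.Second, 5 * time.Second}
)

// pick returns a uniformly distributed delay within the range.
func (r thinkRange) pick() time.Duration {
	if r.max <= r.min {
		return r.min
	}
	return r.min + time.Duration(rand.Int63n(int64(r.max-r.min)+1))
}

func main() {
	// A simulated user would sleep for each of these between requests.
	for _, r := range []thinkRange{beforeSigning, beforeConfirmGet, beforeConfirmPut} {
		fmt.Println(r.pick())
	}
}
```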

Tokens

Essix refuses to process PUT, POST, PATCH, or DELETE requests that don’t carry a valid encrypted form token, as a protection against cross-site request forgery (CSRF). The form tokens also carry the input for the rate limiting function. Since signing a petition is done without logging into an application account, CSRF is not a real risk in the case at hand. One might argue we should skip the form token tests to gain performance. On the other hand, providing the ability to bypass token testing could open up large security holes in other applications. Since a quick test proved the performance impact of token computing to be negligible, I quickly decided to keep the platform secure, and just deal with the tokens.
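To give an idea of what computing such a token involves, the sketch below seals a small payload with AES-GCM and opens it again. This is an assumption-laden illustration, not the Essix token format: the choice of AES-GCM, the seal and open function names, and the “issued” timestamp payload are all mine.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM sets up AES-GCM; the key must be 16, 24 or 32 bytes long.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts and authenticates payload, returning a URL-safe token.
func seal(key []byte, payload string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// The nonce is prepended so that open can find it again.
	sealed := gcm.Seal(nonce, nonce, []byte(payload), nil)
	return base64.URLEncoding.EncodeToString(sealed), nil
}

// open returns the payload, or an error if the token is malformed or forged.
func open(key []byte, token string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.URLEncoding.DecodeString(token)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("token too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	// Hypothetical payload: the time the form was issued.
	token, err := seal(key, "issued=2017-05-15T21:10:05Z")
	if err != nil {
		panic(err)
	}
	payload, err := open(key, token)
	fmt.Println(payload, err)

	_, err = open(key, "forged")
	fmt.Println("forged token rejected:", err != nil)
}
```

Sealing and opening a token of this size is cheap compared to a database write, which fits the observation that the token work hardly shows up in the measurements.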

Environment

1. nodes

To set up a multi-node test environment for the application to run in, we start off with three 2GB ($20/month, $0.03/h) droplets on DigitalOcean:

$ export DIGITALOCEAN_ACCESS_TOKEN="945g4976gfg497456g4976g3t47634g9478gf480g408fg420f8g2408g08g4204"
$ export DIGITALOCEAN_SIZE="2gb"
$ essix nodes -d digitalocean -F -m 1 -w 2 create petities

Here, ‘petities’ can be replaced with whatever name you want to give the swarm. Oh, and if you haven’t installed Essix, it’s:

$ go get -u github.com/wscherphof/essix

2. r

Install the database cluster (two servers per node):

$ essix r -n 2 create petities

3. cert

Clone the petities repo, cd to it, and generate an SSL certificate:

$ essix cert petities.yourdomain.org you@email.com

4. build

Build your image & push it to the Docker Hub. (Note: since the image does include your server certificate, you’d eventually want to make it private on Docker Hub, or set up a registry of your own)

$ docker login
$ essix build you 0.1

5. run

Run the application service, setting the environment variables:

$ essix -e DOMAIN=petities.yourdomain.org \
-e DB_POOL_INITIAL=100 -e DB_POOL_MAX=1000 \
-e DB_SHARDS=1 -e DB_REPLICAS=3 \
-e RATELIMIT=0 -e GO_ENV=test \
run you 0.1 petities

6. config

Load the email config in the database:

[Screenshot: RethinkDB’s web admin]

That’s:

r.db('essix').table('config').get('email').update({
 EmailAddress: 'elj.asegr@vjsvkajv.com', PWD: '8t763w4c87tcw39',
 PortNumber: '587', SmtpServer: 'smtp.gmail.com'
})

7. scale

Restart the application to load the updated config from the database, by first scaling it to 0 replicas, then to e.g. 18, which is 6 on each of the three nodes:

$ essix -r 0 run you 0.1 petities
$ essix -r 18 run you 0.1 petities

Or:

$ docker-machine ssh petities-manager-1 docker service scale petities=0
$ docker-machine ssh petities-manager-1 docker service scale petities=18

8. /provision

Now browse to /provision to generate the petition record plus a number of signatures. The real Groningen one has around 200K signatures; loading that number should take a minute or two. Monitor the rate of writes on the RethinkDB Dashboard to see when it’s done.

9. load

Use Apache JMeter to open the petities.jmx file in the root of the repo:

[Screenshot: The test plan in JMeter]

Outtroduction

That’s all for now on the delicate details. The next part of this post will discuss the test outcomes. It might take a while, because things currently seem to indicate there’s no getting around setting up a distributed load generating solution as well 🙂
