End Point Corporation
Utah Open Source Conference, May 3–5, 2012
Some optimization techniques
lead to a cleaner architecture.
Others make things more complicated,
so maybe they should be delayed.
479popcorn.com before tuning
479popcorn.com after tuning
If nothing is faster than nothing,
ideally we’d like to serve nothing at all.
Cache the entire page and every asset on it
entirely in the browser
for as long as the user is using the site.
The browser won’t make
any more requests to the server
except for things like the Google Analytics tracking code.
And that’s fast and not our scalability problem.
Probably your site has some dynamic components.
Can you put them into cookies or HTML5 localStorage?
If so, the page is still fully cacheable.
Still a win to cache the entire page
only for unauthenticated users.
May cover a lot of traffic.
It’s worth the work to cache the whole page. Why?
If you can cache the whole page,
everything lower in the stack
is included — job done.
If only a small part of the page must be dynamic,
maybe fetch only that via Ajax so
everything else is statically cached.
Load common libraries such as jQuery
from Google’s CDN or others.
Many HTTP requests for embedded assets are slow
due to TCP round trips,
even with HTTP keepalive.
Worse on mobile networks.
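A back-of-the-envelope sketch of why round trips dominate. The asset count, the round-trip times, and the limit of 6 parallel connections are illustrative assumptions, not measurements; the model ignores bandwidth, handshakes, and TCP slow start, so treat the result as a rough lower bound.

```python
import math

def estimated_fetch_seconds(num_assets, rtt_seconds, parallel_connections=6):
    """Rough lower bound on time spent waiting for embedded assets.

    Assumes each asset costs at least one round trip and that the browser
    fetches at most `parallel_connections` assets at a time.
    """
    rounds = math.ceil(num_assets / parallel_connections)
    return rounds * rtt_seconds

# Hypothetical page with 60 embedded assets on two kinds of network.
for label, rtt in [("broadband, 50 ms RTT", 0.05), ("mobile, 300 ms RTT", 0.3)]:
    print(label, round(estimated_fetch_seconds(60, rtt), 2), "seconds")
```

Fewer, larger assets reduce the number of rounds, which is where the savings come from.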
Even if you’re only using Apache,
some simple tuning can help a lot.
KeepAlive On
KeepAliveTimeout 2
Enable mod_deflate and set:
If using a caching reverse proxy, you may be better off
instead letting the cache store a single uncompressed copy
and gzip it to the clients (or not) as needed.
On modern CPUs gzip is really fast.
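A minimal sketch of the "store one uncompressed copy, gzip as needed" idea, written as a plain Python function rather than any particular proxy's configuration. The function name is hypothetical, and the Accept-Encoding parsing is simplified: it ignores quality values such as `gzip;q=0`.

```python
import gzip

def encode_for_client(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers) built from one uncompressed cached copy."""
    # Simplified parsing: take the token before any ";q=..." parameter.
    encodings = [token.split(";")[0].strip().lower()
                 for token in accept_encoding.split(",")]
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in encodings:
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers

cached_page = b"<html>" + b"popcorn " * 500 + b"</html>"
payload, headers = encode_for_client(cached_page, "gzip, deflate")
print(len(cached_page), "bytes uncompressed,", len(payload), "bytes gzipped")
```

The cache holds a single entry per URL, and the compression cost is paid per response instead of per stored variant.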
Mark static assets as cacheable by both
the reverse proxy and the user’s browser.
Cache-Control: max-age=7200
Expires: Wed, 02 May 2012 06:06:18 GMT
ExpiresActive On
ExpiresDefault "access plus 2 hours"
This runs contrary to the usual advice to cache for a month or even a year.
Consider URL changes, mistakes, tradeoffs, and new vs. old app versions.
Compromise of 1–8 hours seems good.
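If the headers come from application code instead of mod_expires, the same pair of Cache-Control and Expires headers can be computed as in this sketch. The function name and the `now` parameter are illustrative assumptions; only standard-library calls are used.

```python
import time
from email.utils import formatdate

def caching_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires headers for a cacheable response."""
    now = time.time() if now is None else now
    return {
        "Cache-Control": f"max-age={max_age_seconds}",
        # usegmt=True produces the HTTP date format ending in "GMT".
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }

# Two hours, matching the "access plus 2 hours" setting.
for name, value in caching_headers(2 * 60 * 60).items():
    print(f"{name}: {value}")
```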
Apache generates unique ETags per server.
That’s ok if you have only one server. :)
Worth keeping if done right.
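One way to get ETags that agree across servers is to derive them from the content instead of per-server file attributes. This is a sketch of that alternative, not a description of what Apache does; the function names are hypothetical, and weak validators (`W/"..."`) are not handled.

```python
import hashlib

def content_etag(body: bytes) -> str:
    """ETag derived only from the bytes served, so every server agrees."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def is_not_modified(if_none_match: str, etag: str) -> bool:
    """True when the client's If-None-Match header matches the current ETag."""
    candidates = [token.strip() for token in if_none_match.split(",")]
    return "*" in candidates or etag in candidates

stylesheet = b"body { margin: 0 }"
etag = content_etag(stylesheet)
# A matching request can be answered with 304 Not Modified and no body.
print(etag, is_not_modified(etag, etag))
```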
Caching in memcached or files (local or NFS):
Most web frameworks have conventions for these things.
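For illustration only, a file-based cache can be as small as the sketch below. The function name, key scheme, and directory layout are assumptions; a real framework's caching convention should be preferred where one exists.

```python
import hashlib
import os
import tempfile
import time

def cached_bytes(key, compute, ttl_seconds, cache_dir):
    """Return the cached value for `key`, recomputing it when missing or stale."""
    path = os.path.join(cache_dir, hashlib.sha256(key.encode()).hexdigest())
    try:
        if time.time() - os.path.getmtime(path) < ttl_seconds:
            with open(path, "rb") as handle:
                return handle.read()
    except OSError:
        pass  # No cache file yet, or it vanished; fall through and rebuild.
    value = compute()
    # Write to a temporary file and rename, so readers never see a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(value)
    os.replace(tmp_path, path)
    return value

with tempfile.TemporaryDirectory() as cache_dir:
    page = cached_bytes("front-page", lambda: b"<html>expensive</html>", 300, cache_dir)
    print(len(page), "bytes")
```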
Which pages does everyone go through?
Which functions are busiest?
But also tons of fast queries
that collectively bog down the system.
Use query log analysis tools.
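Dedicated log analysis tools do this far more thoroughly, but the core idea fits in a few lines: group queries by shape and rank them by total time, not by individual duration. The log entries, table names, and the crude normalization below are all made up for illustration.

```python
import re
from collections import defaultdict

def normalize(sql):
    """Collapse literals so queries that differ only in values group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()

def total_time_by_query(entries):
    """entries: iterable of (sql, duration_ms). Returns rows sorted by total time."""
    totals = defaultdict(lambda: [0, 0.0])
    for sql, duration_ms in entries:
        bucket = totals[normalize(sql)]
        bucket[0] += 1
        bucket[1] += duration_ms
    rows = [(query, count, total) for query, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

# Hypothetical log: one slow query and a thousand fast ones.
log = [("SELECT count(*) FROM orders", 900.0)]
log += [(f"SELECT * FROM cart WHERE id = {i}", 2.0) for i in range(1000)]
for query, count, total_ms in total_time_by_query(log):
    print(f"{total_ms:8.1f} ms  {count:5d} calls  {query}")
```

In this made-up log the 2 ms query costs more in total than the 900 ms one, which is the pattern to look for.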
… for the database speed not to matter much?
Cache at the highest layer of the stack possible,
and work down as needed.
if you never have to tune your database
because it is so rarely hit.
if you rarely use server-side sessions.
if your app server is bored.
if your main web server
is bored because a CDN
handled most traffic.
if the CDN is bored once visitors
have warmed their caches.
if their cache comes pre-warmed
with the same jQuery from Google that
other sites use.
Things will always be changing.
Measure and do what you can.
Small improvements
executed over time add up.
Big plans never executed
are a total waste.
Firewalls can run out of space in their state tables!
Test forward & reverse DNS resolution on each host.
Logging shouldn’t use DNS, but maybe it is!
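A small sketch for checking forward and reverse resolution of a host name, using only the standard `socket` module. The function name and the injectable `forward`/`reverse` parameters are assumptions added so the check can be exercised without a network; run it on each host, since results depend on that host's resolver configuration.

```python
import socket

def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an address and back, and report whether they agree."""
    try:
        address = forward(hostname)
        reverse_name = reverse(address)[0]
    except OSError as error:  # socket.gaierror and socket.herror are OSErrors
        return {"hostname": hostname, "ok": False, "error": str(error)}
    same = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return {"hostname": hostname, "address": address,
            "reverse_name": reverse_name, "ok": same}

if __name__ == "__main__":
    # Replace with the host names your servers actually use.
    print(check_dns(socket.gethostname()))
```

A mismatch is not always an error (many hosts share addresses), but a slow or failing lookup here is worth investigating.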
Investigate your hunches.
The “impossible” often isn’t,
both for good and bad.