Dragging old web apps into modernity

Jon Jensen
End Point Corporation

Perl Dancer conference
Hancock, New York
8 October 2014

The good old days

What did web application server hosting look like when Interchange was young, 15 years ago?

Hardware then

Around year 2000, a Verio dedicated server with:

  • 600 MHz Pentium III CPU
  • 512 MB RAM
  • 2 × 9 GB SCSI HD in RAID 1
  • 40 GB network transfer per month
  • 2-year contract term

$900/month

(It would’ve been cheaper with a young startup called Rackspace.)

Hardware now

2014, a Linode cloud instance with:

  • 2.8 GHz Intel Xeon CPU, 1 core
  • 1 GB RAM
  • 24 GB SSD storage
  • 2 TB network transfer per month
  • monthly or hourly contract term

$10/month
= $7.24/month inflation-adjusted to year 2000
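
As a rough check on that comparison, the arithmetic can be worked through as below. The only inputs are the prices quoted on these slides; the variable names are made up for illustration, and 2 TB is assumed to mean 2000 GB.

```python
# Prices quoted on the slides; everything else is derived from them.
verio_2000_usd = 900.00      # per month, in year-2000 dollars
linode_2014_usd = 10.00      # per month, in 2014 dollars
linode_in_2000_usd = 7.24    # the slide's inflation-adjusted figure

# Deflator implied by the slide (2014 dollars -> 2000 dollars).
implied_deflator = linode_in_2000_usd / linode_2014_usd

# How many times cheaper the 2014 instance is, in constant dollars.
price_ratio = verio_2000_usd / linode_in_2000_usd

# Monthly transfer allowance, assuming 2 TB = 2000 GB.
transfer_ratio = 2000 / 40

print(f"implied deflator: {implied_deflator:.3f}")
print(f"price: about {price_ratio:.0f}x cheaper")
print(f"transfer allowance: {transfer_ratio:.0f}x larger")
```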

What else has changed?

IP addresses then

Pay for blocks of IPv4 addresses. Only.

That Verio server included 32 IP addresses.

IP addresses now

Pay for individual IPv4 addresses. Running out.

The Linode VPS includes 1 IPv4 + 1 IPv6 address.

IPv6 is real and adoption is growing.
Google reports 4.5% of all its traffic is now native IPv6.

SSL/TLS then

HTTPS expensive and rare:

  • RSA patent
  • Expensive certificates
  • Use scarce CPU power
  • 1 IP address per HTTPS site

SSL/TLS now

HTTPS cheap and commonplace:

  • Elliptic curve crypto easy on faster CPUs
  • Cheap DV certs
  • Even EV certs cheaper than most certs were in old days
  • SNI for multiple HTTPS sites per IP address
  • Can now do SPDY and soon HTTP/2

JavaScript then

Used as a rare and unreliable garnish.

Websites had to work with JavaScript disabled.

JavaScript now

Essential.

A year ago, starting with Firefox 23,
Mozilla removed the user-facing option to disable JavaScript.

Cookies then

Optional.

Cookies now

Required.
With bonus annoying policy notifications required by some governments.

HTML5 Storage (aka localStorage) now also widely used.

Web servers then

Apache for most everything.
One child process per connection.

Custom compiled CGI binary to connect Apache to Interchange.

(And good old TUX.)

Web servers now

Apache still alive and well.

nginx growing fast. Event-driven, no forking
or separate processes per connection.

Proxy to heavy-weight Apache or Plack servers.
Use HTTP as the universal glue protocol.

More central to the focus of this conference …

What did web application servers look like back then?

Monolithic

  • App server handled all pages
  • Session state everywhere
  • Every page dynamic
  • Inconsistent separation of concerns (not MVC)
  • Minimal JavaScript
  • No Ajax
  • Separate from rest of ecosystem (CPAN)

What’s wrong with that?

Some things were suited to the times.

Other things we had not yet learned how to do better.

Old school

When MiniVend and Interchange were created,
nothing else met all those needs.

It predates CPAN.

MiniVend innovations

  • App server daemon
  • Sessions (with or without cookies)
  • Page and email templating
  • Database abstraction
  • Database seeding
  • Order routing
  • Shipping and tax calculation
  • Form validation
  • Payment gateway integration
  • Safe credit card handling
  • Search
  • Customizable filters
  • User accounts
  • Later, admin back-end

Why change?

Giants with shoulders to stand on

In the 2000s, many high-quality CPAN modules started to cover most of those areas.

Except ecommerce, for which Interchange 6 fills a gap.

Building on Dancer and other CPAN modules makes us part of a bigger community, with more activity and better quality.

Less cohesive, but that’s the tradeoff.

Big bang rewrites are risky

Websites composed of many applications

Now people are mixing off-the-shelf open source and other applications to make a single website.

Scale culturally

People are not willing to reinvent wheels in a single framework and language, but instead expect to mix a CMS, ecommerce component, Q&A or forums or comments features, search and faceting functions, and maybe use separate back-end ERP systems.

Adapt, survive, and thrive

Perl has a serious reputation problem.

Help end all-or-nothing thinking by being open to other languages and frameworks.

It’s easier to be open if we don’t feel we have to throw away everything we have to try something new.

What we are doing:
one site as an example

In the beginning

Pure Interchange 4/5 frontend + admin.

All customer-facing content served via homegrown CMS
in Interchange using custom database fields.

Entire web space handled by Interchange.

Usual ecommerce functions.

Stage 1

Put code in Subversion, and later Git.

Even production is just a Git checkout.
Catches Interchange admin local changes.

Set up independent development environments.

This was an early place where DevCamps was used and refined.

Stage 2

Started using regular Perl modules for new development.

  • Reduce parsing confusion
  • Enable compile-time syntax checking
  • Avoid some ITL performance problems

Stage 3

Added new subsections of the site on different technologies, such as WordPress for the blog, which lived on the same server and used Apache configuration for part of the URL space.

Better: reverse proxy to a different server.

Stage 4

Interchange-based CMS was too inflexible, expensive to change.

Moved customer-facing content to PHP frontend.

Managed by in-house developers who work with us,
and keep code in the same Git repository.

“Git is my CMS.”

Stage 5

Set up customer-facing nginx proxy servers that cache objects from Apache (images, CSS, JavaScript) and Interchange (fully-cacheable pages).

Set default cache lifetime to 2 hours, but can override by directory and leave Interchange uncached.
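
The caching policy on this slide can be modeled in a few lines. This is only a sketch of the decision logic, not the nginx configuration actually used; the directory names and the longer image lifetime are hypothetical, and only the 2-hour default comes from the slide.

```python
from typing import Optional

DEFAULT_TTL_SECONDS = 2 * 60 * 60  # the 2-hour default from the slide

# Hypothetical per-directory overrides; None means "do not cache".
TTL_OVERRIDES = {
    "/images/": 24 * 60 * 60,  # assumed: static images may live longer
    "/cart/": None,            # assumed: dynamic Interchange pages stay uncached
}

def cache_ttl(path: str) -> Optional[int]:
    """Return the cache lifetime in seconds for a URL path, or None for no caching."""
    matches = [prefix for prefix in TTL_OVERRIDES if path.startswith(prefix)]
    if not matches:
        return DEFAULT_TTL_SECONDS
    # The longest matching prefix wins, so deeper directories can override.
    return TTL_OVERRIDES[max(matches, key=len)]

def cache_control_header(path: str) -> str:
    """Build a Cache-Control value that a proxy or CDN could honor."""
    ttl = cache_ttl(path)
    return "no-store" if ttl is None else f"public, max-age={ttl}"
```

For example, `cache_control_header("/css/site.css")` falls through to the 2-hour default, while anything under the assumed `/cart/` prefix is never cached.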

Adding a CDN in front is then easy and caches at the edge.

Stage 6

Moved cart, checkout, and receipt pages to JavaScript frontend talking to Perl web service living in Interchange 5.

Now uses HTML5 Storage instead of cookies.

Much faster feel for users. Easier to adapt next steps to choices made.

Keep backend ecommerce and database away from PHP. Don’t need more cooks in the kitchen.

New web services

countries


GET https://the.site/service/countries/en
{
   "timestamp" : "2014-10-08 04:26:04+0000",
   "locale" : "en",
   "countries" : [
      {
         "states" : [
            {
               "name" : "Alabama",
               "shippable" : true,
               "id" : "AL"
            },
            {
               "name" : "Alaska",
               "shippable" : true,
               "id" : "AK"
            },
            // etc.
         ],
         "name" : "USA",
         "shippable" : true,
         "id" : "US"
      },
      {
         "states" : [
            {
               "name" : "Alberta",
               "shippable" : true,
               "id" : "AB"
            },
            {
               "name" : "British Columbia",
               "shippable" : true,
               "id" : "BC"
            },
            // etc.
         ],
         "name" : "Canada",
         "shippable" : true,
         "id" : "CA"
      },
      {
         "name" : "France",
         "shippable" : true,
         "id" : "FR"
      },
      {
         "name" : "Germany",
         "shippable" : true,
         "id" : "DE"
      },
      // etc.
      {
         "name" : "Zimbabwe",
         "shippable" : false,
         "id" : "ZW"
      }
   ]
}
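
A client of this service might use the response roughly as follows. This is a sketch, not the site's actual frontend code: the sample is a trimmed copy of the response above with the "etc." entries left out, and the function names are invented for illustration.

```python
import json

# Trimmed copy of the response shown above.
sample = json.loads("""
{
  "locale": "en",
  "countries": [
    {"id": "US", "name": "USA", "shippable": true,
     "states": [{"id": "AL", "name": "Alabama", "shippable": true},
                {"id": "AK", "name": "Alaska", "shippable": true}]},
    {"id": "FR", "name": "France", "shippable": true},
    {"id": "ZW", "name": "Zimbabwe", "shippable": false}
  ]
}
""")

def shippable_countries(payload):
    """Return (id, name) pairs for countries marked shippable."""
    return [(c["id"], c["name"]) for c in payload["countries"] if c["shippable"]]

def states_for(payload, country_id):
    """Return the state list for a country, or [] when it has none."""
    for country in payload["countries"]:
        if country["id"] == country_id:
            return country.get("states", [])
    return []
```

Note that "states" is present only for some countries, so the lookup has to tolerate its absence.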

products


GET https://the.site/service/products/en/USD
{
   "currency" : "USD",
   "timestamp" : "2014-10-08 04:35:30+0000",
   "locale" : "en",
   "processing" : {
      "booklet" : "0",
      "maximum" : "4"
   },
   "products" : {
      "00238" : {
         "now" : {
            "effective_from" : "2013-12-26",
            "effective_until" : "2015-01-01",
            "add_tax" : 0,
            "downloadable" : false,
            "deliverable" : true,
            "units_per_gift_wrap" : 10,
            "revision" : "10",
            "price" : 389,
            "title" : "Super Excellent MegaProduct"
         }
      },
      // etc.
   }
}

shipping


GET https://the.site/service/shipping/en/USD/2:00007,1:00014
{
   "locale" : "en",
   "currency" : "USD",
   "order_date" : null,
   "timestamp" : "2014-10-08 04:41:16+0000",
   "shipping" : {
      "order_date_is_holiday" : null,
      "countries" : [
         {
            "shipmodes" : [
               {
                  "cost" : 10,
                  "estimated_delivery_date" : "Thursday, October 16",
                  "title" : "UPS Ground (7-10 days)",
                  "code" : "UPSG"
               },
               {
                  "cost" : 14,
                  "estimated_delivery_date" : "Friday, October 10",
                  "title" : "UPS 2nd Day Air",
                  "code" : "UPS2DA"
               }
            ],
            "processed" : "Wednesday, October 8",
            "title" : "Continental US",
            "code" : "US"
         },
         // etc.
      ],
      "order_date_formatted" : "Tuesday, October 7 at 22:41"
   },
   "cart" : [
      {
         "sku" : "00007",
         "quantity" : 2
      },
      {
         "sku" : "00014",
         "quantity" : 1
      }
   ]
}
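
The last URL segment appears to encode the cart as comma-separated quantity:sku pairs, judging by the "cart" array echoed back in the response. A minimal sketch of that mapping, with hypothetical function names:

```python
def parse_cart(spec: str) -> list:
    """Parse 'quantity:sku,quantity:sku' into the cart structure in the response."""
    cart = []
    for item in spec.split(","):
        quantity, sku = item.split(":", 1)
        # SKUs stay strings so leading zeros such as "00007" survive.
        cart.append({"sku": sku, "quantity": int(quantity)})
    return cart

def shipping_path(locale: str, currency: str, cart: list) -> str:
    """Build the request path for a cart, the inverse of parse_cart."""
    items = ",".join(f"{line['quantity']}:{line['sku']}" for line in cart)
    return f"/service/shipping/{locale}/{currency}/{items}"
```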

order


POST https://the.site/service/order
{
   "address1" : "123 Main St.",
   "address2" : "",
   "affiliate" : null,
   "attempt_count" : "1",
   "b_address1" : "123 Main St.",
   "b_address2" : "",
   "b_city" : "Sacramento",
   "b_company" : "",
   "b_country" : "US",
   "b_fname" : "Beloved",
   "b_lname" : "Customer",
   "b_phone" : "123-456-7890",
   "b_state" : "CA",
   "b_zip" : "95834",
   "campaign" : "",
   "city" : "Sacramento",
   "company" : "",
   "country" : "US",
   "credit_card_csc" : "❤❤❤",
   "credit_card_expiration_month" : "❤❤",
   "credit_card_expiration_year" : "❤❤",
   "credit_card_number" : "411111**1111",
   "delivery_mode" : "downloadable",
   "discount_amount" : "0",
   "discount_group" : null,
   "discount_handling" : "0",
   "email" : "someone@gmail.com",
   "fname" : "Beloved",
   "handling" : "5",
   "lname" : "Customer",
   "mail_list" : false,
   "payment_method" : "credit_card",
   "salestax" : 0,
   "shipmode" : "None (downloadable)",
   "shipping" : "0",
   "state" : "CA",
   "subtotal" : "518",
   "total_cost" : "523",
   "zip" : "95834",
   "user_agent" : "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36",
   "orderlines" : [
      {
         "add_tax" : 0,
         "sku" : "00009",
         "line_number" : 1,
         "subtotal" : 436,
         "discount_amount" : 0,
         "salestax" : 0,
         "quantity" : 4,
         "description" : "A Really Awesome Product",
         "cutoff_choice" : null,
         "price" : 109
      },
      // etc.
   ]
}
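
Because the frontend computes the totals, the service has reason to recheck them. The sketch below shows one way to do that, using values from the sample order above. It assumes that total_cost is subtotal plus shipping, handling, and sales tax, minus the discount, and that a line subtotal is price times quantity; neither rule is stated on the slides, although both hold for the sample.

```python
from decimal import Decimal

def order_total_problems(order: dict) -> list:
    """Return a list of arithmetic inconsistencies found in an order payload."""
    problems = []

    def money(key):
        # Amounts arrive as strings, numbers, or null; treat null as zero.
        return Decimal(str(order.get(key) or 0))

    expected = (money("subtotal") + money("shipping") + money("handling")
                + money("salestax") - money("discount_amount"))
    if expected != money("total_cost"):
        problems.append(f"total_cost {order.get('total_cost')} != computed {expected}")

    for line in order.get("orderlines", []):
        line_expected = Decimal(str(line["price"])) * line["quantity"]
        if line_expected != Decimal(str(line["subtotal"])):
            problems.append(f"line {line['line_number']}: subtotal mismatch")
    return problems

# Fields copied from the sample order; the elided order lines are left out.
sample_order = {
    "subtotal": "518", "shipping": "0", "handling": "5", "salestax": 0,
    "discount_amount": "0", "total_cost": "523",
    "orderlines": [
        {"line_number": 1, "price": 109, "quantity": 4, "subtotal": 436},
    ],
}
```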

Other web services

  • amazon_payments
  • paypal_express
  • receipt_feedback

Automated testing

With web service-based ordering, it is much easier to automate testing of dozens of kinds of orders:

  • carts composed of various products
  • shipping methods
  • payment types
  • discounts
  • affiliates
  • delivery destinations
  • downloadable vs. shipped
  • invalid order submissions

Continuous integration.
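
One way to get to dozens of order variations is to cross the dimensions listed above. The sketch below is illustrative only: the shipmode codes come from the shipping response, but treating the other web service names as payment_method values and the choice of destination countries are assumptions.

```python
from itertools import product

# Assumed values; only "credit_card" appears as a payment_method on the slides.
PAYMENT_METHODS = ["credit_card", "amazon_payments", "paypal_express"]
SHIPMODES = ["UPSG", "UPS2DA", "None (downloadable)"]
COUNTRIES = ["US", "CA", "FR"]

def order_scenarios():
    """Yield one partial order payload per combination of the dimensions."""
    for payment, shipmode, country in product(PAYMENT_METHODS, SHIPMODES, COUNTRIES):
        yield {
            "payment_method": payment,
            "shipmode": shipmode,
            "country": country,
        }

scenarios = list(order_scenarios())
print(len(scenarios), "scenarios")
```

Each scenario would then be merged into a complete order payload and posted to a test instance of the order service.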

Stage 7

Stop storing encrypted credit cards.

Interchange, PGP, PCI DSS

Interchange’s PGP encryption and careful handling of form submissions were good.

PCI DSS still allows it, but increases compliance cost:

  • mixed server functions
  • key custodians
  • key rotation
  • regular data purging

Throw it away

Much easier to never store the credit card numbers on your own servers at all.

  • Throw it away
  • Use tokenization
  • Use browser-based payment gateway integration

Avoid public relations disaster in case of a breach.

But no silver bullet

Even with no card numbers stored on the server, attackers can compromise the checkout page and send card numbers to their own server.

Stage 8 (underway)

Move Interchange 5 admin to separate internal-only website on its own subdomain.

Beneficial for security, but also because Interchange 5 admin expects to own the URL space for its /process route, actionmaps, etc.

Stage 9 (underway)

Moving order web services backend to Dancer web services.

Stage 10

Separate server for each service, to keep them independent, limit dependencies, and make changes as easy as possible.

Stage ∞

Ongoing database cleanup. We kept the database pretty clean:

  • Removed many unused columns and tables.
  • Added check constraints, referential integrity.
  • Switched to UTF-8 encoding.
  • Used separate database for new PHP front-end.

More stage ∞

Ongoing dead code removal. We do that pretty well too.

Mistakes were made

What should we have done differently, in hindsight?

Switch earlier to Dancer

We should have introduced Dancer or another modern Perl web service earlier, and proxied to it.

  • Stop adding to the legacy code! Write new code on the new framework.
  • Reduce entanglements:
    • Interchange 5
    • old versions of CPAN modules
    • old custom code
  • Force thinking in web service architecture.
  • Would get to use newer Perl and CPAN modules.
  • Improve testability.

Riskier database cleanup

transactions table columns predating Standard Demo of 2004:

nitems     | VARCHAR(9)
subtotal   | VARCHAR(12)
shipping   | VARCHAR(12)
handling   | VARCHAR(12)
salestax   | VARCHAR(12)
total_cost | VARCHAR(16)
order_date | VARCHAR(32)
archived   | VARCHAR(1)
deleted    | VARCHAR(1)
complete   | VARCHAR(1)

… Database cleanup …

Change to SQL-friendly types:

nitems     | INTEGER
subtotal   | DECIMAL(12,2)
shipping   | DECIMAL(12,2)
handling   | DECIMAL(12,2)
salestax   | DECIMAL(12,2)
total_cost | DECIMAL(12,2)
order_date | TIMESTAMP WITH TIME ZONE
archived   | BOOLEAN
deleted    | BOOLEAN
complete   | BOOLEAN
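
The risk in that change is that some legacy VARCHAR values will not convert. A dry run over the existing rows can find them first. The sketch below assumes flags were stored as "1" for true and "" or "0" for false, which the slides do not state, and it leaves out order_date because its legacy format is not shown.

```python
from decimal import Decimal, InvalidOperation

MONEY_COLUMNS = ("subtotal", "shipping", "handling", "salestax", "total_cost")
FLAG_COLUMNS = ("archived", "deleted", "complete")
CENT = Decimal("0.01")

def convert_row(row: dict):
    """Return (converted, errors) for one legacy transactions row."""
    converted, errors = {}, []
    try:
        converted["nitems"] = int(row["nitems"])
    except (TypeError, ValueError):
        errors.append("nitems")
    for col in MONEY_COLUMNS:
        try:
            value = Decimal(row[col].strip())
            # Reject NaN/Infinity and anything that would lose precision.
            if not value.is_finite() or value != value.quantize(CENT):
                raise InvalidOperation
            converted[col] = value.quantize(CENT)
        except (AttributeError, InvalidOperation):
            errors.append(col)
    for col in FLAG_COLUMNS:
        flag = (row[col] or "").strip()
        if flag in ("", "0", "1"):   # assumed legacy encoding of booleans
            converted[col] = flag == "1"
        else:
            errors.append(col)
    return converted, errors
```

Rows that come back with a non-empty error list are the ones to inspect by hand before altering the column types.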

Should have moved to UTF-8 sooner

Huge scope

  • Web server
  • Static HTML pages
  • JavaScript files
  • Web app configuration
  • Template pages
  • Perl code generating HTML
  • Payment gateways
  • Shipping integration (ODBC to UPS WorldShip)
  • Email newsletter service
  • Email receipts
  • Generated PDF
  • Database content
  • Database declared encoding

UTF-8 repertoire blindness

  • Easy: western Latin
  • Mostly easy: eastern European Latin
  • Sometimes ok: Turkish, Cyrillic
  • Often broken: Chinese, Japanese, Korean, right-to-left scripts such as Hebrew and Arabic

Database UTF-8 conversion

We should have upgraded PostgreSQL sooner
and declared UTF-8 encoding.

But then validation kicks in.

Terrifying possibility of data rejection!
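
To see how much data would be rejected, the stored bytes can be checked before the conversion. This is a sketch under assumptions: rows are taken to be (id, raw bytes) pairs already read from the old database, and the Windows-1252 fallback is a guess at the legacy encoding, not something stated in the talk.

```python
def invalid_utf8_rows(rows):
    """Yield (row_id, raw_bytes) for values that are not valid UTF-8."""
    for row_id, raw in rows:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            yield row_id, raw

def repair_guess(raw: bytes) -> str:
    """Decode as UTF-8 if possible, otherwise assume Windows-1252."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1252", errors="replace")

rows = [
    (1, "Zürich".encode("utf-8")),
    (2, "Zürich".encode("latin-1")),  # legacy single-byte encoding
    (3, b"plain ascii"),
]
print(list(invalid_utf8_rows(rows)))
```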

Unit test end-to-end UTF-8

  • Test data should include all sorts of character sets.
  • Exercise path from browser all the way through to database & 3rd-party integrations.
  • Handle everything or else scrub/transform appropriately.

“Chinese won’t matter!” … It often ends up mattering.
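
A test data set along those lines might look like the sketch below. The sample strings are arbitrary city names chosen to cover the repertoires listed earlier, and round_trip is only a stand-in: in a real test it would submit the value through the site and read it back from the database or a third-party system.

```python
import json
import unicodedata

SAMPLES = {
    "western Latin": "Café Zürich",
    "eastern European Latin": "Łódź",
    "Turkish": "İstanbul",
    "Cyrillic": "Москва",
    "Chinese": "北京",
    "Japanese": "東京",
    "Korean": "서울",
    "Hebrew": "ירושלים",
    "Arabic": "القاهرة",
}

def round_trip(text: str) -> str:
    """Stand-in for browser -> web service -> database -> back again."""
    wire = json.dumps({"city": text}, ensure_ascii=False).encode("utf-8")
    return json.loads(wire.decode("utf-8"))["city"]

def failures():
    """Return the names of samples that did not survive the round trip."""
    def nfc(s):
        # Compare in one normalization form so equivalent strings match.
        return unicodedata.normalize("NFC", s)
    return [name for name, text in SAMPLES.items()
            if nfc(round_trip(text)) != nfc(text)]

print(failures())
```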

Separate servers

PCI DSS prefers limiting servers to one function each.

That makes virtualization the affordable way to go for small functions.

  • Web server
  • App server
  • Database server
  • Log server
  • Separate admin servers with limited access

Little servers = easier migrations

Migrating is a pain. But you get a lot in return:

  • Newer hardware
  • Newer OS
  • Newer web server and OpenSSL
  • Newer perl
  • Newer app server and modules
  • Newer database

Development environments

Should be as close to identical to production as possible.

Mostly similar, but don’t have nginx caching in front of camps yet.

Camp system not keeping up. Need to replatform.

Looking at using Docker + Fig + new camp tools.

Live and learn

Newer is better, or not

Things aren’t necessarily better just because they’re newer.

But many newer things are better, having been refined based on lessons of past mistakes, and improved hardware and software available today.

Technical debt
venture capitalists

Sometimes we have to take on technical debt to get a project off the ground, to survive. That’s ok.

Software maintenance

Software requires maintenance similar to houses, cars, airplanes.

It’s a judgement call how much or little to invest, how often, but “none” is not a wise way to respect your investment.

Refactoring is hard work

We must take on the hard task of refactoring what we have, and improve it piece by piece.

If we shirk that duty, our code will atrophy and rot and become increasingly difficult to grow, fix bugs in, or bring new developers into.

Own your rot

Often we blame technology for such software rot.

No framework or programming language can prevent bit-rot.

New technologies’ advantage: haven’t been around long enough for code built on them to decay yet.

Build on others’ work

We can’t do it all ourselves:

  • Character set encodings
  • Basic HTML, HTML5, CSS
  • JavaScript, Flash, Java applets
  • Image formats
  • Browser debugging
  • Web and proxy servers, CDNs
  • HTTP, TLS
  • Operating system, networking
  • Application server
  • Server-side programming languages
  • Automated testing
  • Databases, SQL, and search engines
  • 3rd-party integration
  • Version control systems
  • Development workflows
  • Security considerations

The Internet is made of meat

We are that meat.

Questions?

Twitter: @jonjensen0

Email: jon@endpointdev.com