Opportunities at Scale — Trading Problem Sets
This isn’t a finished essay by any means. It’s an in-progress record of the problem I’m dancing with, and the paradigms I’m cuddling with. But first some history.
At Flextrip (a travel startup, 2011–2012) we were tiny but already playing around with Chef and Chef clones (Rubber: https://github.com/rubber/rubber) as a way of automating our deployments — talk about premature optimization. I also played around with Vagrant as a way of reducing the disconnect between Prod and Dev environments during freelance work, but the word DevOps never touched my lips.
When I worked on Data Science at SITA (2013), we were a 60-year-old, airline-owned tech services company that offered mostly enterprise data warehouse appliances for airport operations, reservation systems (still half on TPF mainframes) for airlines, and enterprise service buses for the entire aerospace industry. The data and the traffic were at scale (petabytes of data and 10 Gbps of message throughput), the lingua franca was Java, and Spring MVC was the hot new thing for a lot of our products. But we were also working with the new startup-inspired open source projects and paradigms like Cassandra, Hazelcast, and NoSQL, and we were embarking on the development of data-streaming, event-driven architectures. Ultimately, though, the tooling for data science had not reached the level of ease and maturity it has today. I was using SciRuby and R to build streaming ingestion that fed a Monte Carlo simulation, which exported coefficients to a curve-fit model estimating passenger volumes and locations in an airport. MLaaS was still a ways off, we weren't talking about FaaS or serverless, and the closest we came to Docker was designing VM deployment architectures.
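To make the shape of that pipeline concrete, here is a minimal sketch in Python of the simulate-then-fit idea: a Monte Carlo simulation of hourly passenger counts, summarized by a linearized curve fit whose coefficients are what a downstream model would consume. Everything here is illustrative — the Gaussian demand bump, the function names, and the numbers are my assumptions, not the actual SITA model (which was built in SciRuby and R, not Python).

```python
import math
import random

random.seed(7)

def simulate_counts(hours, trials=500):
    """Monte Carlo estimate of mean hourly passenger counts.
    The 'true' demand curve is a Gaussian bump peaking at 9am
    (an assumption for illustration only)."""
    means = []
    for h in hours:
        rate = 100.0 * math.exp(-((h - 9.0) ** 2) / 8.0)
        samples = [max(random.gauss(rate, math.sqrt(rate)), 1.0)
                   for _ in range(trials)]
        means.append(sum(samples) / trials)
    return means

def fit_gaussian_bump(hours, means, peak_hour=9.0):
    """Least-squares fit of log(count) = a + b * (h - peak)^2,
    i.e. a Gaussian curve fit linearized by taking logs.
    Returns the exported coefficients: (peak_rate, width)."""
    xs = [(h - peak_hour) ** 2 for h in hours]
    ys = [math.log(m) for m in means]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return math.exp(a), -1.0 / b

hours = list(range(6, 13))
means = simulate_counts(hours)
peak_rate, width = fit_gaussian_bump(hours, means)
# these two coefficients are the model artifact handed downstream
print(round(peak_rate), round(width, 1))
```

The recovered coefficients land close to the simulation's true parameters (peak rate 100, width 8), which is the whole point: the heavy simulation runs offline, and only a tiny fitted summary is exported.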
At HP Travel (2014–2015) the scale was even bigger, but we were mostly supporting legacy appliances (emulated mainframe software) hosted in a third-party on-prem data center. Pieces of modern architecture were beginning to show up in applications all across the org, but these were being designed with the REST/MVC paradigm, and very little attention was being paid to the performance, cost, and organizational efficiencies on offer from public cloud vendors. One area where we did begin to move towards today's paradigms was deploying Kafka as a messaging system to provide as close to an immediately consistent primary/replica architecture as we could for airline Availability servers (the database that tells selling systems what price levels different products should be sold at to maximize revenue on a flight).
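For flavor, a replication feed like that boils down to a Kafka topic tuned for durability over latency. This is a hypothetical sketch of what topic creation might look like — the topic name, partition count, and sizing are my illustration, not the actual HP configuration:

```shell
# Hypothetical topic carrying availability updates from the primary
# to replica servers; replication-factor and min.insync.replicas
# trade a little latency for durable, near-consistent delivery.
kafka-topics.sh --create \
  --topic availability-replication \
  --partitions 12 \
  --replication-factor 3 \
  --config min.insync.replicas=2 \
  --bootstrap-server localhost:9092
```

With `min.insync.replicas=2` and producers using `acks=all`, a write is only acknowledged once a majority of replicas have it, which is what pushes the replica side toward "as close to immediately consistent" as the text describes.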
At Airhelp our architecture choices were driven primarily by DevOps culture, but it was still a major struggle to get everyone on board with the benefits of Docker, Docker orchestration tools (like Kubernetes), and serverless / FaaS technologies.