The Open Commerce Conference - Premature Optimisation: The Root of All Evil


Transcript of The Open Commerce Conference - Premature Optimisation: The Root of All Evil

PREMATURE OPTIMIZATION

The Root of ALL Evil

@akitaonrails

Hi, I am Fabio Akita, better known as @akitaonrails on social networks. I have been a programmer for the last 20 years doing all sorts of projects.

In 2011, I co-founded a company named Codeminer 42. We have offices in 6 different cities in the country, with almost 60 developers. We do offshore outsourced software development for medium to big clients in Brazil and also in the USA.

So, “Premature optimization”. If you graduated in computer science you probably heard about this many times.

“Premature Optimization is the Root of All Evil”

- Don Knuth

It comes from the famous professor Donald Knuth, author of the legendary “The Art of Computer Programming”, one of the most difficult mathematical works to understand, and also the author of useful tools such as TeX. Among the many things he said, one of the most famous is the following, and I quote: "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

This is even more important nowadays, with so many new technologies. We are at a crossroads that will define the direction our market takes for the next decade. And we’ve been at this crossroads for the last 5 years already; it’s taking time and there are lots of EXPERIMENTS along the way. We have been guinea pigs for big corporations.

Performance & Scalability

Many of us programmers are constantly concerned about “performance” or “scalability”, or both. We tend to fool ourselves into believing that we are making rational decisions based on objective criteria such as these, when that’s usually not the case.

Take a real example I had last year. A client of mine called out of desperation. His freelance developer had bailed right after deploying their e-commerce site. The developer said he had chosen the best tools available: Node.js and Angular.js. But the client was upset because, although they had apparently done everything right, Google was unable to index any of the website’s products. And the complaint was not that the products didn’t show up on page 1 or page 2; they were nowhere to be found on page 10, or page 500. Nowhere.

SPA e-commerce

I checked it out and I found out why: it was all implemented as an SPA.

SPA e-commerce http://www.store.com/#!/products/item-001

http://www.store.com/?_escaped_fragment_=/products/item-001

If you don’t know, SPA stands for Single Page Application: a web application that serves just one HTML page. Then it asks to load a shit ton of JavaScript, and only after that loads does it start rendering the actual content that the user sees. Before it loads, the HTML usually has no content. So the Google crawler sees nothing unless the JavaScript loads.

Anything after the hashbang is not a real URL; it only makes sense to the client-side JavaScript application. Google will even be friendly enough to try to convert the hashbang URL into that other alternative, so you have the chance to respond to that query and serve a static version of the aforementioned page.

So in the end, building an SPA requires you to do double the work or use a jury-rigged solution, such as a 3rd party proxy service that will load your SPA, render the JavaScript bits and cache a static version of your pages. A whole lot of work for something that should’ve been so simple.

The main lesson learned: unless you’re doing a Spotify-like interface or anything behind an authenticated section, such as a user dashboard, DO NOT DO IT AS AN SPA, particularly if it’s content. Content is better off static, especially if you intend users to find your content through search engines.
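To make the URL rewrite on the slide concrete, here is a minimal sketch of how a hashbang URL maps to its `_escaped_fragment_` form. The function name `escaped_fragment_url` is made up for this illustration, and the escaping is a simplification rather than a faithful copy of any crawler's behaviour.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escaped_fragment_url(url: str) -> str:
    """Map a '#!' (hashbang) URL to the '?_escaped_fragment_=' form.

    Hypothetical helper for illustration only; the escaping rules here
    are simplified.
    """
    parts = urlsplit(url)
    # Only fragments that start with '!' are hashbang URLs.
    if not parts.fragment.startswith("!"):
        return url
    # Everything after '#!' is the client-side route; move it into the query.
    state = quote(parts.fragment[1:], safe="/=")
    separator = "&" if parts.query else ""
    query = f"{parts.query}{separator}_escaped_fragment_={state}"
    # The fragment is dropped, so the whole route now reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(escaped_fragment_url("http://www.store.com/#!/products/item-001"))
# http://www.store.com/?_escaped_fragment_=/products/item-001
```

The point of the rewrite is that a fragment never leaves the browser, while a query string does, so the server finally gets a chance to answer with static HTML.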

Now, why do programmers make such misguided choices? Of course, partly because they don’t have a lot of experience, but mostly because they fall for the oldest trick in the world: following flashy speeches with lots of flair. It’s almost like a religion: blindly following false prophets while still thinking that they are making rational decisions based on objective criteria, such as performance and scalability.

CONTEXT

What many people ignore is the context. The technologies and practices being pitched: where are they actually used, and why were they chosen? Why did Google, Facebook or Apple build them in the first place?

Client #1

I’d like to give real-life examples. I can’t show my clients’ real names and real data at the same time, so let’s refer to this first example as just Client 1.

So, Client 1 is a medium-sized Brazilian white-label e-commerce platform that we built from scratch for them. As New Relic shows, in a normal week they have this throughput behaviour and an average of, let’s say, 400 rpm (requests per minute). This is nowhere near Shopify’s level of traffic, but it’s also not too shabby. But more importantly, how much does it cost to sustain this level of throughput?

This client is deployed on Heroku, and this is the list of most of the services they use: 12 web dynos, 3 worker dynos, database, Memcache, Redis, SendGrid, etc. And this is the breakdown of the monthly cost of each service.

$4,000 ($7M/yr revenue - 0.68%)

They spend close to 4k a month. Is this a lot? Depends. This is a company that has a gross revenue of around 7M a year! So this IT cost represents less than 1%. Definitely not a lot. Could be better, but again, not too shabby.

Client #2

Now, on to Client 2. This is a very different kind of client.

We hear a lot about microservices, and this is one example of a big company that, because of the size of its software solutions, had to go all in on microservices. They have no fewer than 90 of them. Now, keep in mind that this client has dozens of developers onsite in the USA, and a lot more in Europe and South America, including some from my company. They had to divide their solution into more manageable pieces. Now, how much does all this cost?

$100,000+ ($800M/yr revenue - 0.15%)

I didn’t have access to their entire IT expenses, but from just the 90+ services I had access to, I can estimate that they spend no less than 100k a month! At the very least (and I am sure it’s actually a lot more). And this is just for Heroku’s services. I am not even talking about internal IT such as ERPs, and I am also not talking about the programmers, managers, POs, designers and so forth on their payroll, or the real estate cost of the buildings that house all those people.

So, is this a lot? Definitely, but is it unreasonable? Not for a company that makes 800M a year! Now the IT cost is a fraction of a percent. Next to 800M, 100k almost feels like small change.
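The percentages for both clients come from the same small calculation: twelve months of hosting divided by yearly revenue. A minimal sketch, using only the rounded figures quoted in the talk (the function name `it_cost_ratio` is just a label chosen for this example):

```python
def it_cost_ratio(monthly_cost_usd: float, yearly_revenue_usd: float) -> float:
    """Return yearly hosting cost as a percentage of yearly revenue."""
    yearly_cost = monthly_cost_usd * 12
    return yearly_cost * 100 / yearly_revenue_usd


# Rounded figures from the talk: ~4k/month on 7M/year, ~100k/month on 800M/year.
client_1 = it_cost_ratio(4_000, 7_000_000)
client_2 = it_cost_ratio(100_000, 800_000_000)

print(f"Client 1: {client_1:.2f}%")  # Client 1: 0.69%
print(f"Client 2: {client_2:.2f}%")  # Client 2: 0.15%
```

With these rounded inputs Client 1 comes out at about 0.686%, which the slide shows as 0.68%. Either way the conclusion holds: a 25x difference in the monthly bill is still a smaller share of revenue for the larger company.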

• Minimum IT Cost:

• USD 1,000 - Cloud Services

• USD 3,000 - 1 Developer

• USD 4,000 - 1 “Responsible” (Manager, Marketing, etc)

• Total: USD 8,000/month (USD 96,000/year)

• Minimum Business Requirement:

• Revenue: > USD 2M / year

• IT Cost / Revenue ratio: < 5%

So, it always depends. I came up with a simple checklist for small companies that want to invest in technology. What is the bare minimum any small company should expect to spend? Doing the reverse math, what kinds of companies should spend that much on IT? This is the bare minimum; if you make less than that, don’t do it.

And if you’re a developer, keep this in mind when you suggest rewriting everything to save on those 1k bucks. Even if you’re such a freaking genius that you can make the entire 1k disappear, is it worth it? No! It’s cheaper to let the developer go.
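The "reverse math" can be written out in a few lines. This is only a sketch of the arithmetic behind the checklist; the dictionary keys and constant names are labels invented for this example, while the amounts and the 5% ceiling are the ones from the slide.

```python
# Monthly amounts in USD, taken from the checklist.
MONTHLY_COSTS_USD = {
    "cloud_services": 1_000,
    "developer": 3_000,
    "responsible": 4_000,
}
MAX_IT_RATIO_PERCENT = 5  # IT cost should stay below 5% of revenue

monthly_total = sum(MONTHLY_COSTS_USD.values())
yearly_total = monthly_total * 12

# Reverse math: the smallest revenue for which yearly_total is 5% of revenue.
min_yearly_revenue = yearly_total * 100 / MAX_IT_RATIO_PERCENT

print(monthly_total)       # 8000
print(yearly_total)        # 96000
print(min_yearly_revenue)  # 1920000.0
```

The result, USD 1.92M a year, is what the checklist rounds up to "more than USD 2M".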

The priority is not just to lower costs. It’s always good to save a few bucks, but the bulk of your efforts should go into adding features or improving the product in order to generate more revenue. I am not saying that you’re allowed to write bad quality code, though; I will come back to that later.

But on this idea of generating revenue, you may have this genius idea that your e-commerce could use some real-time chat or push notifications so your users engage more and buy more, for example. Genius idea, and what do you think you should do? Node.js, right? DO NOT DO IT UNLESS REALLY NECESSARY! You don’t want the headache, believe me. Unless you’re WhatsApp, Messenger or Telegram, don’t build your messaging engine in-house. Use any of the many existing WebSockets/message broadcast providers such as Pusher, PubNub, etc.

Pusher, for example, will allow 500 concurrent users, persistently connected through WebSockets, and you can broadcast over 1M messages a day to all of them, for 49 bucks a month. A good team can beat it, of course. But bear in mind that AWS alone will cost at least half of that. Add to that the hundreds of man-hours necessary to not just build but harden the solution and keep it online reliably, 24/7, and you will realize that 49 bucks a month is nothing.

SaaS > IaaS (avoid doing yourself)

Do not reinvent the wheel. Almost everything you may need to build a scalable, fast web application is already available: job queues, databases, caches, monitoring tools, email solutions, etc. And they are all maintained by people who are way more experienced in each of those solutions than you can ever be. So don’t try.

y = x * 320

y = (x << 8) + (x << 6)

The problem that we programmers have is that we always want to outsmart ourselves, and more often than not we end up screwing up badly. We think that by shaving off a few microseconds here and there, we will end up with a much faster solution that makes us feel that much smarter. But this is the opposite of being smart.
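For anyone wondering why the two lines on the slide are the same thing: 320 is 256 + 64, that is 2^8 + 2^6, so the two shifts add up to a multiplication by 320. A small sketch (the function names are invented for this example) that checks the equivalence:

```python
def times_320_readable(x: int) -> int:
    """The version anyone can read and maintain."""
    return x * 320


def times_320_shifted(x: int) -> int:
    """The 'clever' version: 320 = 256 + 64 = 2**8 + 2**6."""
    return (x << 8) + (x << 6)


# Both versions agree on every value tried here.
assert all(times_320_readable(x) == times_320_shifted(x) for x in range(10_000))
print(times_320_readable(7), times_320_shifted(7))  # 2240 2240
```

Same result, but only one of them tells the next programmer what the intent was.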

• NO Test Suite with below 70% coverage

• NO CodeClimate below 3.0 rating

• NO source code file with hundreds of LOC

• NO COPY AND PASTE EVERYWHERE!!!

• NO functions with more than 1 page down

• NO table with dozens of fields

• NO hours to clone, setup and deploy!!

Instead of thinking about low-level hacks, big bang rewrites, extreme tech shifts, or whatnot, remember that the basics are way more valuable and often ignored. We are strange like this. But let me pinpoint a few items that are real priorities.

This is the Code Climate page for Spree, as an example. You have the means to diagnose your heart, your diabetes, most health conditions, and there is no excuse to keep your valuable code locked away under a veil of mystery that only your guru programmer knows how to unlock. The company must have the code exposed and transparent all the time: what modifications made it better, what modifications made it worse, when it became worse, what exactly made it worse. All valuable information for all programmers to make the code healthy again.

Spree: 68k LOC 40k are Specs! (60%)

Spree is an OSS project with around 68k LOC, 60% of which are just tests!

Magento2: 300k LOC 127k are Specs! (< 30%)

In comparison, Magento 2 (which is already a big rewrite from version 1) has an astounding 300k LOC and less than 30% are tests. This is so bad.

For example, Magento does have a way larger user base, but it also has an unmanageable pile of more than a thousand reported issues. The majority of users rarely engage in the development issue threads, so it’s probably much worse. And any fix is a liability: because there are no reliable tests, they can’t know for sure whether a fix solved the problem without introducing new bugs. There is no reliable way to know whether new features introduced regressions, and so forth. It’s a cascade of problems all the time. Which is why having tests was the #1 thing in my list of priorities a few slides back.

Maintainability > Performance

Instead of constantly obsessing over microseconds in small performance gains, always prioritise maintainability. If you need to make your code run a bit slower in order to make it more understandable, choosing the path of maintainability should be a no-brainer. It doesn’t matter if you write the fastest code if no one else can add value to it later, including yourself after you forget what you’ve done. Bad code will always attract more bad code until it gets to a point where there is no other option but to rewrite everything too soon. And then you waste time, you can’t serve your active customers, you lose time to market, and sooner or later a competitor will take your place, and you’re out of business. So much for your microseconds-faster code.

Programmers are the worst when it comes to prioritizing.

Let’s get back to when clients call me, in despair. They tell me, “My system is super slow, my customers are complaining, orders are not being completed and we are losing sales. My programmers tell me we need to trash the legacy system and rewrite everything in language X because they say it’s going to be 5x as fast.” This is BS, I say. Give me a week to figure this out for you, but rest assured that rewriting everything is the last resort.

And usually, after a few days, I am able to fix most, if not all, of their problems. What do I do? Am I some sort of magician? A conjurer with mystic black magic powers? Of course not.

First things first: if they don’t have it already, I ask them to install New Relic RPM. It’s super easy; even the worst programmers can do it. And I ask them to let it run for a few days. Once I get back, I start with this ranking. Every ranking in real life has this exact shape: a power law distribution, or Pareto’s law as you may have heard it called. 80% of the problems are usually caused by 20% (or less) of the code. New Relic can show me what those 20% are, and if I prioritise fixing just that, I can usually solve almost all of the most pressing problems.
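The ranking idea itself needs no special tool to understand. The sketch below is not New Relic's API; it is a plain illustration of sorting endpoints by total time spent and picking the few that account for 80% of it. The endpoint names and the millisecond totals are invented for the example, not measurements from any client.

```python
from typing import Dict, List, Tuple


def top_offenders(total_time_ms: Dict[str, float], share: float = 0.8) -> List[Tuple[str, float]]:
    """Return the slowest entries that together cover `share` of all time spent."""
    ranked = sorted(total_time_ms.items(), key=lambda item: item[1], reverse=True)
    threshold = share * sum(total_time_ms.values())
    picked: List[Tuple[str, float]] = []
    running = 0.0
    for name, spent in ranked:
        if running >= threshold:
            break  # the entries picked so far already cover the target share
        picked.append((name, spent))
        running += spent
    return picked


# Hypothetical totals per endpoint over a few days (made-up numbers).
sample = {
    "ProductsController#index": 5200.0,
    "CheckoutController#create": 2600.0,
    "SearchController#show": 900.0,
    "CartController#update": 500.0,
    "HomeController#index": 400.0,
    "AccountController#show": 250.0,
    "PagesController#about": 100.0,
    "HealthController#ping": 50.0,
}

for name, spent in top_offenders(sample):
    print(f"{name}: {spent:.0f} ms")
```

With this made-up data, three of the eight endpoints account for more than 80% of the time, which is exactly the kind of short list worth fixing first.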

• SQL N+1 Queries

• Too much SQL

• Too much SQL LIKE instead of Elastic

• Lack of proper SQL indexes

• Too much unused code that was not removed

• No CDN or proper HTTP Cache invalidation headers

• Too much synchronous work that should be async jobs

And those 20% are usually a few tweaks, a few lines of code, and sometimes it’s mostly removing code instead of adding it. Let’s see the most common errors we find in most projects all the time.
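The first item on that list, the N+1 query, is easy to show in isolation. This is a minimal sketch using an in-memory SQLite database; the `orders` and `items` tables, the `CountingDB` wrapper and both function names are invented for the example and stand in for whatever ORM and schema a real project uses.

```python
import sqlite3


class CountingDB:
    """Tiny wrapper that counts how many queries are sent to the database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.queries = 0

    def run(self, sql: str, params: tuple = ()) -> list:
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
    INSERT INTO orders (id, customer) VALUES (1, 'ana'), (2, 'bob'), (3, 'carla');
    INSERT INTO items (order_id, name)
        VALUES (1, 'book'), (1, 'pen'), (2, 'lamp'), (3, 'mug');
""")


def items_n_plus_one(db: CountingDB) -> dict:
    """One query for the orders, then one more query per order (N+1)."""
    result = {}
    for order_id, customer in db.run("SELECT id, customer FROM orders ORDER BY id"):
        rows = db.run("SELECT name FROM items WHERE order_id = ? ORDER BY id", (order_id,))
        result[customer] = [name for (name,) in rows]
    return result


def items_single_query(db: CountingDB) -> dict:
    """The same data fetched with a single JOIN."""
    result = {}
    rows = db.run(
        "SELECT o.customer, i.name FROM orders o "
        "JOIN items i ON i.order_id = o.id ORDER BY o.id, i.id"
    )
    for customer, name in rows:
        result.setdefault(customer, []).append(name)
    return result


slow, fast = CountingDB(conn), CountingDB(conn)
assert items_n_plus_one(slow) == items_single_query(fast)
print(slow.queries, fast.queries)  # 4 1
```

With three orders the difference is 4 queries against 1; with a few thousand orders per page it is the difference between a slow page and a fast one, and the fix is a handful of lines.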

No Metrics No Optimization

The point is, whenever you find yourself having problems in your production environment, such as slow performance or failure to scale, you don’t want to jump to conclusions without proper metrics to guide you first. If your programmer does not show you proper metrics, any suggestion for optimisation is more often than not a waste of time.

In the same way, you don’t want to be on a plane where the pilot forgoes instrumentation and wants to land by instinct. I believe you wouldn’t like it. Don’t do it in your own airplane.

“Make it Work, Make it Correct, Make it Fast, Make it Cheap”

- Alan Kay

Alan Kay, another of the legendary programmers of our time, said it better.

Unless you’re a unicorn. If you have “infinite” amounts of VC cash to spend, you can disregard everything I said so far.

Let’s take Google, for example. Back in 2005, 2006, at the dawn of the Ajax years, they released this interesting web framework called "GWT".

What happens a couple of years later? They get tired of it. Their business doesn’t rely on it to move forward. google.com does not rely on it, so from their perspective it’s a toy project, and they can just trash it. Whenever you see something being handed over to the care of a committee, you can rest assured it won’t go anywhere, especially forward.

And what happens to GWT users? Google has an answer: move everything to this fabulous new thing: Angular. It’s in javascript, it’s awesome.

Until it’s not awesome anymore: the core team gets bored, and maintaining old code becomes a hassle. So let’s scrap everything and start over. But we are so bored, we will keep the name, add “2.0” to it and call it a day. But is it compatible with 1.0? Nope, probably not. But what about the hundreds of projects using 1.0? Oh, screw them, make them move to 2.0.

You != Unicorn (not Facebook, Google, Amazon, etc)

Again, you are not a Unicorn. Unicorns can make big experiments and scrap everything on a whim. If you’re a small or medium company, or a company that really depends on the underlying tech, you should be careful. I am not saying that you should never use any of these new technologies, just that you shouldn’t throw everything away and bet it all on them.

"The shoemaker's son always goes barefoot"

If you want to figure out what to trust, keep this in mind: is the tech being offered vital to the company that is pitching it? For example, SendGrid makes a living out of sending millions of emails, reliably and fast. If this tech fails, they’re out of business. You can have a high degree of trust in a pitch about email tech coming from SendGrid.

You’re in the middle of a war between the unicorns, who are trying to increase their vanity metrics by amassing more followers and securing mindshare for the next decade.

So, in summary, this is what we need to have in mind.

• Increase Revenue > Lower Costs

• Maintainability > Performance

• PRIORITIES!!

• No Metrics, No Optimization

• You != Unicorn

PREMATURE OPTIMIZATION is the Root of all Evil

THANKS! www.codeminer42.com

@akitaonrails

You don’t want to become obsolete, but you don’t want to be too extreme either. There is a balance in between: sorting out your priorities, getting proper metrics, and coming to rational decisions based on real technical criteria, one step at a time, with technologies that will really help you out. I hope I was able to shed some light on how most programmers’ decision-making process works and how to avoid most of the traps. Thank you.