MacBook Pro 2012 and 2015 Performance Benchmarks

Apple tends to make small improvements with each iteration of its laptop lines. Between my personal mid-2012 MacBook Pro with Retina and my new(er) mid-2015 work MacBook Pro with Retina, I didn't think there was much of a difference. Visually they are identical, and most of the tech specs line up as well, so I assumed they were equivalent in terms of performance.

tl;dr I was wrong.

Looking at the system information and at the laptops themselves, the two appear quite similar.

From the specs I assumed my personal computer would have the better performance (odd, given its age), although my work computer does seem to handle Docker better. To determine which was more performant, I decided to use Geekbench to run benchmarks and compare.

Geekbench takes two benchmarks:

  • CPU (reported as single-core and multi-core scores)
  • Compute (OpenCL)

Here’s how the two computers compare:

Score Type        2012    2015
Single-Core CPU   3447    3914
Multi-Core CPU    11631   13660
OpenCL            6499    27007

Both CPU scores are fairly close, but the massive difference is in the OpenCL compute score, which covers the CPU, graphics processor, hardware accelerators, etc. The newer 2015 MacBook Pro is dramatically more powerful. This makes me wonder how much more powerful the newest MacBook Pros are.
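The relative gains are easy to quantify from the table above. A quick calculation (scores copied from the table):

```python
# Geekbench scores from the table: (2012 score, 2015 score)
scores = {
    "Single-Core CPU": (3447, 3914),
    "Multi-Core CPU": (11631, 13660),
    "OpenCL": (6499, 27007),
}

for name, (mb2012, mb2015) in scores.items():
    gain = (mb2015 - mb2012) / mb2012 * 100
    print(f"{name}: {gain:.0f}% improvement")
```

The CPU scores improve by roughly 14% and 17%, while the OpenCL score jumps by over 300% – which is the whole story of the comparison in three numbers.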

These benchmarks highlight that as consumers (and developers) we shouldn't rely only on big feature advances when considering an upgrade, especially with Apple devices, where big feature advances (like the Touch Bar) are few and far between. Instead we should look at the performance gains over time in less obvious areas, like compute power, to make our decisions. (I think I've convinced myself!)

Humans and Machines: Getting The Model Wrong

It seems like one of the more prominent and perpetual debates within the software testing community is the delineation between what the computer and human can and should do. Stated another way, this question becomes “what parts of testing fall to the human to design, run and evaluate and what parts fall to the computer?” My experience suggests the debate comes from the overuse and misuse of the term Test Automation (which in turn has given rise to the testing vs. checking distinction). Yet if we think about it, this debate is not just one within the specialty of software testing, it’s a problem the whole software industry constantly faces (and to a greater extent the entire economy) about the value humans and machines provide. While the concerns causing this debate may be valid, whenever we hear this rhetoric we need to challenge its premise.

In his book Zero to One, Peter Thiel, a prominent investor and entrepreneur who co-founded PayPal and Palantir Technologies, argues most of the software industry (and in particular Silicon Valley) has gotten this model wrong. Computers don't replace humans, they extend us, allowing us to do things faster; combined with the intelligence and intuition of a human mind, that creates an awesome hybrid.

Peter Thiel and Elon Musk at PayPal

He shares an example from PayPal: 1

Early in the business, PayPal had to combat fraudulent charges that were seriously affecting the company's profitability (and reputation). They were losing millions of dollars per month. His co-founder Max Levchin assembled a team of mathematicians to study the fraudulent transfers and write complex software to identify and cancel bogus transactions.

But it quickly became clear that this approach wouldn't work either: after an hour or two, the thieves would catch on and change their tactics. We were dealing with an adaptive enemy, and our software couldn't adapt in response.

The fraudsters’ adaptive evasions fooled our automatic detection algorithms, but we found that they didn’t fool our human analysts as easily. So Max and his engineers rewrote the software to take a hybrid approach: the computer would flag the most suspicious transactions on a well-designed user interface, and human operators would make the final judgment as to their legitimacy.

Thiel says he eventually realized the premise that computers are substitutes for humans was wrong. People can substitute for one another – that's what globalization is all about. People compete for the same resources, like jobs and money, but computers are not rivals, they are tools. (In fact, long-term research on the impact of robots on labor and productivity seems to agree.) Machines will never want the next great gadget or a beachfront villa for their next vacation – just more electricity (and they're not even smart enough to know they want that). People are good at making plans and decisions but bad at dealing with enormous sets of data. Computers struggle to make basic judgments that are easy for humans but can churn through big data sets quickly.
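The flag-and-review split Thiel describes can be sketched in a few lines. This is a toy illustration of the pattern, not PayPal's system – the scoring rules, threshold, and field names are all invented:

```python
# Hypothetical sketch of the hybrid "computer flags, human decides" pattern.
# The heuristics below are invented for illustration; a real system would
# use a trained model and far richer features.

def risk_score(txn):
    """Assign a naive suspicion score to a transaction."""
    score = 0.0
    if txn["amount"] > 1000:
        score += 0.5  # unusually large transfer
    if txn["country"] != txn["card_country"]:
        score += 0.4  # transaction origin doesn't match the card
    return score

def triage(transactions, threshold=0.5):
    """Computer pre-sorts; anything suspicious goes to a human analyst."""
    auto_ok, needs_review = [], []
    for txn in transactions:
        if risk_score(txn) >= threshold:
            needs_review.append(txn)  # human operators make the final call
        else:
            auto_ok.append(txn)
    return auto_ok, needs_review
```

The machine does what it's good at (scanning every transaction, fast), and the human does what the machine can't (the final judgment call on the ambiguous cases).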

Substitution seems to be the first thing people (writers, reporters, developers, managers) focus on. Depending on where you sit in an organization, substitution is either the thing you'd like to see (reduced costs, whether through time savings or headcount reduction) or the thing you dread the most (being replaced entirely or having your work reduced). Technology articles consistently focus on substitution: how to automate this and that, or how cars are learning to drive themselves so that soon we'll no longer need taxi or truck drivers.

Why then do so many people miss the distinction between substitution and complementarity, including so many in our field?


And nothing else funny happened

I was recently talking with someone about their testing strategy and process when I noticed they were trying to build overly detailed test scripts (procedures). It didn't take them long to realize that specifying to such detail often left them bored (the writing became redundant), so each test became less and less detailed. I offered to take a look at their tests to see if I could help improve things. As I reviewed what I consider to be their "typical" scripted tests, each line having a step and an expected result, I started thinking of something Pete Walen once said:

…and nothing else funny happened.

Peter Walen, STPCon

Pete Walen dropped this gem at STPCon in San Diego a few years ago during Matt Heusser’s talk Where do Bugs Come From? Matt had just shown the Moonwalking Bear video:

I'd guesstimate at least half of the packed room hadn't seen it. The discussion turned to scripted testing and the inherent problem of inattentional blindness. Pete shared a past experience where he told some testers he was working with to include the phrase in the expected results of their scripted tests. It's a kind of heuristic to help someone remember that, even while focusing on one thing at depth, they need to stay aware of the potentially many other things going on, so that "nothing else funny happens".
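Pete's phrase can even be baked into a script format mechanically. A toy sketch – the steps and the tuple format are invented for illustration:

```python
# Hypothetical sketch: append Walen's phrase to every expected result
# so the tester is nudged to watch for anything unexpected, too.
SENTINEL = "...and nothing else funny happened."

steps = [
    ("Enter a valid username and password", "User is logged in"),
    ("Click 'Export'", "A CSV file downloads"),
]

script = [(action, f"{expected} {SENTINEL}") for action, expected in steps]

for action, expected in script:
    print(f"STEP: {action}\n  EXPECT: {expected}")
```

The sentinel doesn't make a scripted test exploratory, but it's a cheap reminder that the expected result is never the whole picture.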

It was such a simple, powerful idea it continues to stick with me, especially when I see someone trying to specify “most everything” in their scripted tests.

As it happens, this post coincides with STPCon 2015, back in San Diego right now. The lineup includes a lot of awesome speakers (Matt Heusser, Michael Larsen, and Smita Mishra), but I'm particularly intrigued by Andy Tinkham's High Volume Automation in Practice and Dave Haeffner's Using Selenium Successfully. I wonder, is anyone live blogging?

First principle reasoning

When I was young I remember wanting to be an awesome football player like Joe Montana, or an FBI agent working on the X-Files like Fox Mulder. These days I want to have the skills to identify and solve problems like Elon Musk.

Musk is an interesting person. He's created and sold numerous companies, and with the profits he's built a rocket-building / space-exploration company that is now the first private company to make it to space. He's also built an American electric car company. While all these things make Musk interesting on the surface, it's his approach that makes him enviable.

In his TED talk Musk credits his training in physics with his ability to see and understand difficult problems. He says physics is about discovering new things that might seem counterintuitive, and that physics' first principles provide a good framework for thinking. The video is a good conversation between Elon Musk and TED curator Chris Anderson, and I recommend watching it. Musk mentions first-principles reasoning at about 19:37:

According to Wikipedia, reasoning from first principles in physics means you start directly at the lowest level – at the laws. Musk offers a slightly easier framing: first-principles thinking is reasoning from the ground up, as opposed to reasoning by analogy, which is copying what other people do with slight variations.

Musk further elaborates on first-principles reasoning in this video. To summarize: you look at the fundamentals of things, make sense of them, construct your reasoning and conclusion, and (if possible) compare that to the current understanding. Part of that process involves questioning conclusions, asking whether or not something could be true. It sounds like Musk is constantly modeling, learning, testing, and re-modeling.

In thinking about my job, there seems to be more reasoning by analogy than perhaps there should be (or at least it's obvious to someone new). Whenever one of my developers or I ask why something is the way it is, why some conclusion has been reached, the typical response is "that's how it's done". If I ask my test team why they do something a certain way, it's always "that's how we've always done it", and there seems to be no desire (at least I haven't seen it yet) to know whether something makes sense or is based on a real understanding of the problem. Perhaps there should be more modeling, learning, and testing?

We all do some reasoning by analogy; in many ways it's a much simpler way to communicate and learn. But for many of us in the software engineering fields (testers and developers), perhaps we confuse the two reasoning methods? So how do we determine when we need to use first principles and when it's OK to reason by analogy? That's the million-dollar question. I think we do like Musk: create a model, ask questions to help us learn, test, and when we aren't satisfied with the answer, reason from the ground up.

Building context like Thomas Jefferson

During a recent trip to Washington DC I got to view the library Thomas Jefferson sold to the Federal Government in 1815 for $24,000. The sale contained some 6,487 books which are now part of the Library of Congress.

Jefferson built his library over the course of his life, collecting books from every place he visited, in every category known to man. Anything he needed or wanted to learn came from books. In fact, Jefferson had so many books he had to come up with categories to organize them:

  • Memory
  • Reason
  • Imagination
  • History
  • Philosophy
  • Fine Arts

What does this have to do with testing? Testing, like most things, is about learning. In the context-driven school of testing (yes, I used the word school), good practices come from context, and the way we place things in context is to have a broad base of understanding – just like Jefferson had (I'm assuming). When he dealt with problems in his private or public life, his wide-ranging education allowed him to frame problems and solutions.

Today we don't have to learn just by reading. Although books remain our largest source of information, we now have many forms of communication for learning to be better testers (I'm not talking about formal education) that haven't existed for very long: virtual conferencing, conferences, blogs, etc. We (myself included) don't really have an excuse for not using them!