Jul 27, 2012

Rapid Testing Intensive 2012: Day 4

9:02 AM – James starts us off on Day 4. We are going to look at the status of the test project in terms of what we need to accomplish and look for the holes. This is a typical rapid testing management maneuver. James is showing a graph and reiterates he doesn’t believe in fake metrics. The pink bands represent off hours and the clear bands represent on hours. At the beginning there is a very big jump in the number of bugs and then it flattens out.

9:08 AM – Turns out Paul Holland is going slow with checking the bugs – he claims it’s because there are only 3 people checking the bugs while 100 are reporting them. James wants to go through the bugs and check the risk areas to get general impressions, so one of our activities together today or tomorrow might be to place risk measures on each bug. The graph may make it onto the front of the bug report. Dwayne says he isn’t sure of the value of the graph and James says he also isn’t sure, but he doesn’t need to know the value because he thinks it will provoke the interest of the reader – in this case eBay.

9:12 AM – In rapid testing we don’t put up graphs just to give the impression we want to give, which is why James will filter out all duplicates and clean out the rest of the noise that could mislead readers. The graph could give the general impression of the industriousness of the group over the four days we were here. Keep that skepticism in mind before considering showing metrics like this. You should always have someone doing bug triage, otherwise you get a lot of noise in your reports and nothing gets corrected – no pressure. If you don’t have a big team and can’t dedicate someone, you can triage one bug at a time at the time of reporting.

9:22 AM – If you don’t do bug triage then you get a lot of complaints from developers and managers about the bugs, even though they’ve never looked at them. It takes time but it’s worth it. James says at Borland they could do 20 bugs an hour and they determined that out of the 800 bugs about 400 were legit bugs. You’ve got to maintain the quality of the list. After the first triage you get a much better feedback loop from that information. After the scrubbing we will want to see what eBay’s final decisions are about the bugs. How do they rate the bugs we’ve created, what do they think of the bugs we’ve reported, how many do they end up fixing? That’s the big thing.

9:26 AM – To clarify test strategy James starts making a test report. He pulls up the Test Report for eBay Motors that he and Jon are working on building for this Intensive. This is going to be a professional and comprehensive test report because James is aware of the multiple constituencies for this report. eBay Motors has a number of groups who may look at the report. Apparently Jon had to pay for his trip here, and maybe other groups in eBay will want to use this event next year and perhaps pay for Jon and his family’s trip. Jon wants to get away from test cases and automation as a first path at eBay. One constituency is eBay Motors, another is eBay’s other groups, another is us – because we will get a professional artifact to go on our CV. James is going to have to edit extensively because there are so many people reporting artifacts and there are so many overlaps, but he’s going to get everyone who contributed into the report.

9:35 AM – There are different sections of the report for different constituencies, which makes it comprehensive. Some parts of the report will be comprehensive and others will not. James forces himself to think: what do I need for a final, final report, and what holes do I have? That can focus him. It points out what is not getting done. It’s called the forward-backward method, from chapter 2 of a book called How to Read and Do Proofs.

9:40 AM – Remember the three levels from yesterday? The same thing goes into the test report. It’s a challenge to identify all of the testing that has been done, especially from James’s point of view, because there are so many groups doing things outside of James and Jon’s view.

9:45 AM – Jon is showing his screen which is a to do list – he calls it a punch list, apparently it’s a home building term? James is reading out of his report from the risk area section. Karen says she uses her low tech testing dashboard and she can use that information to contribute to the lean areas of the report.

9:52 AM – James took a poetry class to meet women when he was 24 and it turns out only middle aged women take that class. (The entire class started laughing.) The upside from taking the class was he learned poetry and he learned consensus can happen, which brings people together. The risk areas of the test report are phrased in terms of a question – for this specific list. James asks the class why he changed from a statement to a question. He is trying to name the problem he is interested in without making a statement that it’s actually a problem. To make it less confusing James turned the risk areas into questions.

10:15 AM – We are looking at and talking about the test coverage outline of things we accomplished during our sessions, which James put together and will include in the test report. James films all the testing he does professionally, and he takes notes, which creates time stamps so it’s easy to refer back. He also keeps session notes, which are crude but can help you locate the relative area when referring back to your videos.

10:24 AM – Both part 1 and part 2 of our reports are available on the internal site, which I’ll try and download and post online later. This is one of the reasons James went offline yesterday so we could get feedback on our reports. You can use screen capture tools that take regular, automatic screen shots or a video capture tool to watch you test.

10:30 AM – James is showing us a video of how he records his testing. He’s got a small tripod for his camera, the camera is placed under his left shoulder, the screen is zoomed in, and you can see he is using some log to record the inputs. With a detailed recording you can have confidence in what you tested. Scripts aren’t the answer because no one really follows scripts, which invalidates the script – they didn’t set up and follow an oracle.

10:40 AM – Rapid testing is based on the idea of skill, testing credibility and trust, and without that all these things are empty documents. You can’t stand behind a report you created using other people’s work unless you’ve done the testing work. This is the reason why Jon and James have to examine the work we’ve done thus far before they can include our work in a testing report. Most of us don’t have the reputation with James and Jon where they can accept the work without question.

10:48 AM – Break time. Jon and James are going to try to make their punch list a bit bigger.

11:06 AM – We are back and apparently are going to do some triage with Jon leading. I found the camera that James uses to record his meetings: Samsung HMX-Q10 camcorder. We are looking at one of the bugs from JIRA and trying to figure out how many of the steps are relevant. Jon is editing the bug. It looks like this particular bug is not actually a bug because it’s mis-categorized, however Jon is taking notes in a separate document.

11:30 AM – Still discussing this one particular bug.

11:40 AM – How can you make the bug triage meetings go faster? We are still having a conversation like this. But as you develop as a team and as the project proceeds conversations like this go away because people understand the process. Unfortunately when you add or change team members you have to bring them up to speed. The culture can perpetuate itself as long as the project and people stay together. Process does not improve when someone writes a document, it improves when everyone adjusts what they say or are about to say based on something they learn.

11:47 AM – We have moved on to another bug. Except now James is questioning why we moved on. James and Jon have reproduced the bug and, as Andrew is pointing out, we may have a data consistency error with the criteria for vehicle compatibility in eBay Motors. James mentions black flagging: when you see this kind of bug, reporting it is not enough, because if you report it the developer will fix just that one bug. As James puts it, we want its whole family hunted down. According to Jon, black flag is a racing term that can mean get all the cars off the road because a car is causing damage that can affect other cars on the road. This is the type of bug you want to have a meeting on to understand its consequences.

12:02 PM – With eBay we should pay attention to the URL to check, for searches, whether the URLs are similar or different because the system could be passing different variables despite the same interface selections. James says he uses Burp and other proxies to record this type of information.
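A minimal sketch of that URL check, assuming you have copied two search URLs by hand. The function name and the example.com URLs are made up for illustration and are not eBay's real parameters. It lists every query parameter whose values differ between the two URLs.

```python
from urllib.parse import urlsplit, parse_qs


def diff_query_params(url_a, url_b):
    """Return {param: (values_in_a, values_in_b)} for each parameter that differs."""
    params_a = parse_qs(urlsplit(url_a).query, keep_blank_values=True)
    params_b = parse_qs(urlsplit(url_b).query, keep_blank_values=True)
    differences = {}
    for name in sorted(set(params_a) | set(params_b)):
        if params_a.get(name) != params_b.get(name):
            differences[name] = (params_a.get(name), params_b.get(name))
    return differences


# Hypothetical URLs: same selections in the interface, but a different query string.
first = "https://example.com/search?make=Honda&model=Civic&year=2010"
second = "https://example.com/search?make=Honda&model=Civic&year=2010&sort=12"

for name, (in_first, in_second) in diff_query_params(first, second).items():
    print(name, in_first, in_second)
```

A parameter that appears in only one URL shows up with None on the other side, which is the kind of hidden difference worth a closer look.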

12:05 PM – Andrew Prentice (part of team TRON) recommended we talk through the bug list and the real value “as we come together as a family and have family time” and agree on things. We are taking a group photo and then lunch time now!

1:15 PM – We are postponing Matt Johnston of uTest and James is talking about deep testing in rapid software testing. Apparently Michael Bolton’s “daughter” found a bug in this game: http://www.horse-games.org/Horse_Lunge_Game.html and James is going to talk about state based testing in this game. We get to think about what state based testing might be – we don’t have to know what it is exactly.

1:23 PM – What is x based testing? Put anything in for x – for example state based testing. It is any testing organized around a model of x. You change the state of something whenever you test it, but that’s only state related testing; state based testing means focusing on the states on purpose. We need to know what the states are. There will always be questions about what the states are and you need to make a practical decision on what they are. State based testing is deep coverage testing.
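To make "focusing on the states on purpose" concrete, here is a rough sketch. The states and events are invented for illustration and are not taken from the horse game or from James's material. It writes the model down as a transition table and derives one test per transition: the shortest sequence of events that reaches the source state, followed by the event under test.

```python
from collections import deque

# Hypothetical state model: (current state, event) -> next state.
TRANSITIONS = {
    ("standing", "cue_walk"): "walking",
    ("walking", "cue_trot"): "trotting",
    ("trotting", "cue_walk"): "walking",
    ("walking", "cue_halt"): "standing",
    ("trotting", "cue_halt"): "standing",
}


def shortest_paths(start):
    """Breadth-first search: shortest event sequence that reaches each state."""
    paths = {start: []}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for (source, event), target in TRANSITIONS.items():
            if source == state and target not in paths:
                paths[target] = paths[state] + [event]
                queue.append(target)
    return paths


def transition_tests(start):
    """One test per reachable transition: (events to perform, expected end state)."""
    paths = shortest_paths(start)
    return [
        (paths[source] + [event], target)
        for (source, event), target in TRANSITIONS.items()
        if source in paths
    ]


for events, expected in transition_tests("standing"):
    print(" -> ".join(events), "should end in", expected)
```

Deciding what goes into the table is the practical decision about what the states are; the code only makes sure no listed transition is skipped.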

1:32 PM – (Combination testing slides.) What if you have a lot of variables, they interact, and you want to test them systematically? It’s called combinatorial testing. The first step is to identify the variables that might interact in a way you need to worry about. Remember all testing is based on models. Actually the first step, or step 0, is to learn enough about the product to identify interacting variables – survey the product, interview people, exploratory testing, etc.

1:38 PM – James says testers need to understand Cartesian products. Testing something that has no risk is called inexpensive testing or free time and you do that because your idea of a model or risk might be wrong. We talked about Ashby’s Law of Requisite Variety and galumphing which all fit into combinatorial testing by helping us pay attention to strange and subtle outputs. Combinatorial testing goes hand in hand with tools. In combinatorial testing you use test cases and not test activities because they are mainly the same but slightly varied. This is one of the rare times you can count metrics because in combinatorial testing they are comparable.
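For anyone who hasn’t met Cartesian products: a small sketch using Python’s itertools.product. The variable names and values are hypothetical filter settings, not the ones from the slides. Every value of every variable is combined with every value of the others, so the number of test cases is the product of the number of values per variable.

```python
from itertools import product

# Hypothetical variables and values, chosen only for illustration.
variables = {
    "make": ["Honda", "Ford", "Toyota"],
    "year": ["2008", "2010", "2012"],
    "body": ["sedan", "coupe", "truck"],
}


def all_combinations(variables):
    """Build one test case (a dict) per element of the Cartesian product."""
    names = list(variables)
    return [dict(zip(names, values)) for values in product(*variables.values())]


cases = all_combinations(variables)
print(len(cases), "test cases")  # 3 * 3 * 3 = 27
print(cases[0])
```

Because every case has the same shape and differs only in its values, the cases are comparable, which is why counting them means something here.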

1:52 PM – A derangement in combinatorics is a dis-ordering of a set to make sure the set isn’t in its natural state; more precisely, it is a rearrangement in which no element stays in its original position. I’ve found it here on WolframAlpha. James is talking about gray code – arranging combinations so that only one thing changes between each test case. In fact one of the participants, Leslie, pointed out we do this in the dice game. It’s a focusing concept to reduce chaos.
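Both ideas fit in a few lines. This is a sketch using the standard binary reflected gray code, with hypothetical on/off option names; it is not something shown in class. Each test case differs from the previous one by exactly one option. A small derangement check is included as well.

```python
def gray_codes(bits):
    """Binary reflected gray code: consecutive values differ in exactly one bit."""
    return [i ^ (i >> 1) for i in range(2 ** bits)]


def as_settings(code, names):
    """Map bit k of the code to an on/off value for names[k]."""
    return {name: bool((code >> k) & 1) for k, name in enumerate(names)}


def is_derangement(original, shuffled):
    """True if shuffled is a reordering of original with no element left in place."""
    same_items = sorted(original) == sorted(shuffled)
    return same_items and all(a != b for a, b in zip(original, shuffled))


# Hypothetical option names, for illustration only.
options = ["autosave", "spellcheck", "ruler"]
for code in gray_codes(len(options)):
    print(as_settings(code, options))

print(is_derangement([1, 2, 3], [2, 3, 1]))  # no element kept its position
```

Walking the settings in this order means that when a test case fails, only one option changed since the last passing case.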

2:10 PM – A de Bruijn sequence packs combinations into a sequence and James is again showing a slide. You don’t use de Bruijn sequences or gray code very often but they are tools for combinatorial testing. James is now talking about pairwise testing where there is a slide with 27 combinations which you can reduce to 9 test cases. Another slide shows a Microsoft Word Options Panel with 12,288 tests and using an ALLPAIRS tool to reduce to 10 tests – but those 10 tests may not include some important things like the defaults, all on all off (the Christmas tree heuristic), popular settings, etc.
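Here is one way the 27-to-9 reduction can work, assuming three variables with three values each. This sketch uses a Latin square construction rather than whatever algorithm the ALLPAIRS tool uses. It picks the third value as (a + b) mod 3 and then checks that every pair of values across every two variables still appears in at least one test.

```python
from itertools import combinations, product

LEVELS = 3  # each of the three variables takes the values 0, 1, 2


def pairwise_suite():
    """Nine tests: all (a, b) pairs, with the third value chosen as (a + b) mod 3."""
    return [(a, b, (a + b) % LEVELS) for a, b in product(range(LEVELS), repeat=2)]


def uncovered_pairs(suite, factors=3):
    """Return the value pairs that no test in the suite covers."""
    needed = {
        ((i, x), (j, y))
        for i, j in combinations(range(factors), 2)
        for x, y in product(range(LEVELS), repeat=2)
    }
    for test in suite:
        for i, j in combinations(range(factors), 2):
            needed.discard(((i, test[i]), (j, test[j])))
    return needed


suite = pairwise_suite()
print(len(suite), "tests instead of", LEVELS ** 3)
print("uncovered pairs:", len(uncovered_pairs(suite)))
```

The coverage check is the useful part to keep around: it works on any suite, so you can add the defaults or popular settings by hand and confirm the pairs are still covered.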

2:20 PM – We have Matt Johnston from uTest on the phone. We are talking about the differences between beta testing and crowd sourcing. How do you know when people who say they’ve tested something have actually tested something? uTest will suspend users who falsify work, it can affect their rating, and the uTest system is built to monitor those kinds of patterns. People are paid for approved bugs, reports, etc. Customers pay per cycle (I’ve blogged about this earlier) or monthly. James said he was negative and doubtful of uTest and now is coming around – he likes uTest for the fact people with no experience can come and get experience. Matt mentions, which I think is the best reason for uTest, you get variety in the things you test.

2:40 PM – If someone wants to get into testing, they can sign up for uTest, fill out a tester profile, go to the forums, they will get invited to the sand box (which is unpaid) to try it out. James said he might sign up except he’s worried that his reputation would take a hit from it. He also says he can demonstrate to European people who want to get into testing that they don’t have to become certified to get into testing, they can go home, join uTest and start testing.

2:45 PM – Break time.

3:00 PM – Finally a session! We are going to do a few searches on My Vehicles that return few results and a few that return a lot of results. This is an informal combinatorial testing session on factors of the left filter and we are going to file our session report in JIRA.

4:18 PM – Session time is up even though I’m not quite done working on a problem. James is thanking everyone now for coming because a few people are leaving before the final session tomorrow. It took a lot of James’ friends to bail him out and help get this Intensive done – Scott you aren’t getting paid. It takes a lot of people to keep up with the onliners – we had James and Jon Bach, Karen Johnson, Paul Holland and Rob Sabourin with Scott Barber and Michael Bolton, etc. online. Jon is very happy to have everyone. James wants feedback (for his wife to review) on whether we felt this was a good thing for people. An email will be sent out asking for that feedback.

4:25 PM – Debrief on our sessions based on our groups. Jon and James came up with a combinatorial testing charter for the session we completed. It seemed like a good idea to them and then through our onsite people it turned into something else. We did some combinatorial testing, but Andrew, Mark and Thomas switched into a privacy testing mode off an inspiration from Andrew, and Dwayne and I focused on filters once we started seeing problems.

5:11 PM – Done!

Photos from the event have been posted on Flickr.

Subscribe to Shattered Illusion by Chris Kenst